Senior DevOps Engineer for Distributed AI Infrastructure
About the Role
Intetics Inc., a global technology company providing custom software application development, distributed professional teams, software product quality assessment, and "all-things-digital" solutions, is seeking a highly skilled and experienced Senior DevOps Engineer to join our team on a full-time basis. This remote position offers the opportunity to work on cutting-edge technology in a fast-paced environment.
What You'll Do
- Build, operate, and improve the infrastructure powering Parasail's distributed inference platform.
- Own reliability, scalability, and operational excellence across AWS-based control planes and our multi-provider GPU fleet.
- Design and maintain the networking layer connecting control planes, Kubernetes clusters, and geographically distributed GPU hosts.
- Operate and improve Kubernetes-based inference orchestration, primarily on EKS.
- Manage deployments and infrastructure changes using Helm, FluxCD, and Terraform.
- Improve observability across the platform using metrics, logs, traces, dashboards, and alerting built on Prometheus, Grafana, Loki, Jaeger, and OpenTelemetry.
- Tune alerts, improve runbooks, and strengthen operational readiness as the system scales.
- Respond to production issues, perform root cause analysis, and implement durable fixes.
Requirements
- 5+ years of experience in SRE, DevOps, platform engineering, or infrastructure engineering.
- Strong production experience with networking and Kubernetes.
- Experience operating AWS infrastructure in production, especially EKS.
- Strong hands-on experience managing Linux hosts, clusters, and distributed systems in environments that are not fully abstracted by a major cloud provider.
- Experience with Prometheus, Grafana, Loki, Jaeger, and OpenTelemetry.
- Experience with deployment and GitOps workflows using tools such as Helm and FluxCD.
- Experience with infrastructure as code, ideally Terraform.
- Familiarity with alert tuning, runbook development, and practical incident management in production systems.
Nice to Have
- Experience with AI inference, ML infrastructure, or adjacent high-performance distributed systems.
- Experience operating heterogeneous GPU fleets, bare-metal infrastructure, or multi-provider compute environments.
- Experience using AI tools productively in engineering workflows.
What We Offer
- Competitive salary and relocation support.
- Flexible working hours and remote work opportunities.
- Dynamic and innovative work environment.
- Professional development and growth opportunities.
- Access to cutting-edge technologies and tools.
This Senior DevOps Engineer role at Intetics offers a unique opportunity to work on cutting-edge AI infrastructure in a fully remote setting, with competitive salary and relocation support.