
Senior MLOps Engineer - Training & Inference Optimization

$140K–$180K/year

About the Role

We are seeking a Senior MLOps Engineer to join our team remotely and lead training and inference optimization. In this role, you will architect the infrastructure that powers our next-generation AI models, ensuring they are state-of-the-art and production-ready. You will work with cutting-edge technologies and collaborate with a multicultural team dedicated to pushing the boundaries of quantum computing and artificial intelligence.

What You'll Do

  • Architect and maintain scalable distributed training pipelines using NVIDIA NeMo, optimizing GPU utilization and implementing automated fault tolerance.
  • Lead the deployment of large language models (LLMs) using vLLM, TensorRT-LLM, or SGLang, tuning each deployment to maximize throughput.
  • Utilize SLURM, Flyte, Ray, or SkyPilot for workload orchestration across diverse cloud providers.
  • Standardize model tracking and versioning using MLflow, ensuring reproducible training runs.
  • Conduct deep-dive profiling and bottleneck analysis across the full stack, from CUDA kernels to Python-level orchestration.
  • Monitor and optimize GPU expenditures through intelligent scaling policies.
  • Drive the engineering roadmap, perform rigorous code reviews, and mentor junior engineers.

Requirements

  • 5+ years of experience in MLOps, DevOps, or Software Engineering, with at least 2 years focused on LLM infrastructure.
  • Expert-level proficiency in PyTorch and the NVIDIA stack (CUDA, NCCL, Triton).
  • Hands-on experience with NVIDIA NeMo or Megatron-Bridge for distributed training.
  • Proven experience with SLURM, Flyte, Ray, or SkyPilot for cluster management.
  • Deep expertise in Kubernetes and K8s operators.
  • Mastery of Python and a functional understanding of C++ or Rust.
  • Familiarity with high-performance networking and NVIDIA H200/B200 architectures.

Nice to Have

  • Active contributions to relevant open-source projects.
  • Experience with model compression techniques.
  • Expertise in ML observability stacks like Prometheus and Grafana.

What We Offer

  • Comprehensive relocation package to help you settle into your new role.
  • Visa sponsorship for international candidates.
  • Competitive salary with performance bonuses.
  • Language courses to help you adapt to your new environment.
  • A multicultural and inclusive workplace that values diversity.
  • Opportunities for professional growth and development.
  • Work alongside world-leading experts in AI and quantum computing.


All job postings are gathered automatically by algorithms. We do not review or verify listings, so be careful when applying, and do not sign in with iCloud or Google services.