Mid-Senior Inference Runtime Engineer - AI Optimization
About the Role
We're hiring a Mid-Senior Inference Runtime Engineer to join Inferact. This is a remote position open to candidates anywhere in the United States. As an Inference Runtime Engineer, you will optimize how large language models (LLMs) and diffusion models are served, making AI inference cheaper and faster.
What You'll Do
- Develop and optimize the vLLM inference engine to enhance performance across diverse hardware architectures.
- Implement cutting-edge techniques for model execution, focusing on mixture-of-experts and multimodal architectures.
- Collaborate with a team of experts to push the boundaries of AI inference capabilities.
- Contribute to the development of performant and maintainable code within complex ML codebases.
- Engage in debugging and troubleshooting to ensure the reliability of inference systems.
Requirements
- Bachelor's degree in computer science, engineering, or a related field.
- Strong programming skills in Python, with experience in PyTorch internals.
- Deep understanding of transformer architectures and their variants.
- Experience with LLM inference systems such as vLLM, TensorRT-LLM, or SGLang.
- Ability to read and implement model architectures from research papers.
Nice to Have
- Familiarity with KV-cache memory management and hybrid model serving.
- Experience with RL frameworks for LLMs and multimodal inference.
- Contributions to open-source ML projects.
What We Offer
- Annual salary range of $200,000 - $400,000, commensurate with experience.
- Equity options to share in the company's success.
- Comprehensive health, dental, and vision benefits.
- 401(k) company match to support your financial future.
- Visa sponsorship available on a case-by-case basis.