Systems Engineer - AI Infrastructure (Remote)
About the Role
We’re hiring a Systems Engineer - AI Infrastructure to join a well-funded AI hardware startup building next-generation inference systems. The role is remote, and you will contribute to projects that deliver over 10x performance gains compared to current solutions. You will be part of the Supercomputing team, which builds and scales cluster-level AI compute systems.
What You'll Do
- Build low-level control-plane software for system bring-up and management.
- Develop services that interact with hardware, firmware, and operating systems.
- Create telemetry, logging, and tracing systems to enhance system observability.
- Optimize performance across PCIe, memory, networking, and kernel layers.
- Collaborate with cross-functional teams to ensure seamless integration of systems.
Requirements
- Strong proficiency in C/C++ or Rust programming languages.
- Deep knowledge of Linux systems and experience working closely with hardware (drivers, DMA, memory, etc.).
- Strong debugging and observability skills.
- Willingness to relocate to the United States if necessary.
- Experience in developing high-performance computing systems is a plus.
Nice to Have
- Familiarity with AI frameworks and libraries.
- Experience with cloud computing platforms.
- Knowledge of containerization technologies like Docker or Kubernetes.
What We Offer
- Competitive salary range of $120,000 to $150,000 per year.
- Comprehensive relocation packages to assist with your move.
- Flexible remote work options to maintain a healthy work-life balance.
- Opportunities for professional development and career advancement.
- A dynamic and innovative work environment at the forefront of AI technology.
This Systems Engineer role offers a unique opportunity to work remotely with a leading AI hardware startup, competitive salary, and relocation support.
Who Will Succeed Here
- Proficient in C/C++ or Rust, with hands-on experience developing and optimizing software for AI hardware to achieve efficient resource utilization and performance gains.
- Self-motivated and disciplined; thrives in a remote work environment by managing time and prioritizing tasks to meet project deadlines and deliverables.
- Analytical, with strong debugging skills; able to use telemetry data to identify performance bottlenecks and implement solutions that improve system efficiency.