About the Role
We are seeking a Senior DevOps Engineer to enhance our high-performance computing (HPC) services and collaborate closely with the scientific community to optimize research computing. This remote position lets you work from any location in Georgia and contribute to innovative computational solutions in a dynamic environment.
What You'll Do
- Design, implement, and maintain robust platform infrastructure using Infrastructure as Code tools such as Terraform.
- Develop, deliver, and operate research computing services and applications.
- Apply Site Reliability Engineering principles to manage HPC service deployment, monitoring, and incident response.
- Solve complex technical problems related to HPC services and user applications.
- Manage large-scale HPC, HTC, or BC computing environments for optimal performance.
- Collaborate with scientific users to tailor HPC resources to research needs.
- Automate deployment processes to ensure consistency across HPC infrastructure.
- Maintain and administer large-scale cluster and server computing software such as Slurm, LSF, or Grid Engine.
- Develop and maintain monitoring dashboards using tools like Grafana and Prometheus.
- Work within a DevOps team environment following agile methodologies.
- Operate and utilize virtualized private cloud resources such as OpenStack.
- Administer large-scale parallel filesystems including Weka, GPFS, or Lustre.
- Use configuration management tools like Ansible, Salt, or Puppet to manage IT operations.
- Develop scripts and tools for HPC and DevOps platform operations using Bash and Python.
Requirements
- 3+ years of experience with DevOps processes and automation using Infrastructure as Code tools such as Terraform.
- Hands-on experience operating or engineering large-scale HPC or similar computing environments.
- Proven expertise in Linux system administration including TCP/IP networking and storage subsystems.
- Experience administering large-scale cluster management software such as Slurm, LSF, or Grid Engine.
- Knowledge of configuration management tools like Ansible, Salt, or Puppet.
- Experience working in agile DevOps teams.
- Ability to develop and maintain monitoring dashboards using tools such as Grafana and Prometheus.
- Experience with scripting languages such as Bash and Python for automation and tool development.
- Strong experience managing virtualized private cloud environments like OpenStack.
- Scientific degree or equivalent experience in computationally intensive scientific data analysis.
- Proven ability to manage relationships with third-party suppliers.
- Upper-intermediate proficiency in English (B2+).
Nice to Have
- Experience with container technologies such as LXD, Singularity, Docker, or Kubernetes.
- Operation and configuration experience with public cloud platforms like AWS, Azure, or Google Cloud Platform.
- Experience with HashiCorp tools such as Vault, Consul, and Nomad.
- Development experience with programming languages such as Java, C++, Python, Ruby, or Perl.
- Experience with parallel filesystems like Weka, GPFS, or Lustre.
What We Offer
- Remote setup with flexibility to work from any location in Georgia.
- Opportunity to work abroad for up to two months per year.
- Relocation opportunities within offices in 55+ countries.
- Corporate and social events.
- Leadership development, career advising, soft skills, and well-being programs.
- Certifications (Google Cloud Platform, Azure, AWS).
- Unlimited access to LinkedIn Learning and getAbstract.
- Free English classes with certified teachers.
- Participation in the Employee Stock Purchase Plan.
- Monetary bonuses through the referral program.
- Comprehensive medical & family care package.
- Five trust days per year.
- Benefits package (sports activities, a variety of stores and services).
EPAM Georgia is a team of innovators united by a passion for technology. The dynamic and inclusive culture we embrace helps positively impact our communities, clients, and employees. Here you will collaborate with multi-national teams, contribute to numerous cutting-edge projects, deliver the most creative solutions, and have an opportunity to learn. Our people are at the heart of our success, and we are proud to provide talents with a solid ground to develop and grow.
Why Choose Us
Best Place to Work 2024; Sitecore's Partner Experience Awards 2024.