Senior Site Reliability Engineer - Remote
About the Role
We are looking for a Senior Site Reliability Engineer to join our Central Infrastructure Team at Keyrock. This remote position focuses on AWS, Kubernetes, and modern DevSecOps best practices. As a Senior Site Reliability Engineer, you will play a crucial role in designing, implementing, and maintaining highly scalable and resilient cloud infrastructure to support our trading operations.
What You'll Do
- Design, deploy, and maintain scalable and resilient infrastructure on AWS using Infrastructure-as-Code (IaC).
- Manage and optimize Kubernetes clusters for containerized applications, ensuring high availability and security.
- Implement and manage CI/CD pipelines for efficient deployment, testing, and monitoring of applications.
- Develop comprehensive monitoring solutions using Prometheus, Grafana, the LGTM stack, or similar tools to improve system reliability.
- Apply best practices for cloud security, IAM policies, and compliance frameworks (SOC2, ISO 27001, etc.).
- Troubleshoot issues, perform root cause analysis, and implement fixes to optimize performance.
- Utilize Terraform, Ansible, or similar tools to automate infrastructure provisioning and configuration management.
- Work closely with software engineering, architecture, and security teams to promote DevOps culture and best practices.
- Design failover and backup strategies to ensure business continuity in the event of failures.
Requirements
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- 5+ years of experience in cloud infrastructure, SRE, or DevOps roles.
- AWS Certified SysOps Administrator – Associate is desired.
- Strong expertise in AWS (EC2, S3, Lambda, RDS, VPC, IAM, etc.).
- Hands-on experience with Kubernetes (EKS, K3s, or self-managed clusters).
- Proficiency in scripting and automation using Python, Bash, or similar.
- Experience with Infrastructure as Code (Terraform, CloudFormation, or Ansible).
- Familiarity with monitoring, logging, and observability tools (Prometheus, Grafana, Datadog, etc.).
Nice to Have
- Interest in, or exposure to, trading or related domains.
- Familiarity with serverless architectures and event-driven computing.
- Familiarity with Rust compilation processes and techniques.
What We Offer
- A competitive salary package, with benefits that vary by method of engagement (employee vs. contractor).
- Autonomy in your time management thanks to flexible working hours and the opportunity to work remotely.
- The freedom to create your own entrepreneurial experience by being part of a team of people in search of excellence.
- A Continuing Professional Development plan with a learning and certification path aligned with both team objectives and your areas of interest.
This Senior Site Reliability Engineer role at Keyrock offers a competitive salary and the opportunity to work remotely in a dynamic and innovative environment.