AI SCORE 8.5 / 10

Site Reliability Engineer - Remote Opportunity

$90K–$120K/year

Linux • Unix • Docker • Kubernetes • Java • Python • Golang • PostgreSQL • MySQL • AWS • Azure • Google Cloud • Prometheus • Grafana • ELK

About the Role

Join our Global Services Engineering Team as a Site Reliability Engineer and take your career to the next level. This remote Site Reliability Engineer position offers you the chance to work on innovative solutions that empower our Global Services teams to deliver exceptional value to our customers. You will play a crucial role in ensuring system availability, reliability, scalability, and performance.

What You'll Do

Design, build, and maintain highly reliable, scalable, and performant systems.
Define, measure, and monitor key service reliability metrics and SLAs.
Develop and improve monitoring, alerting, and incident response processes.
Proactively identify performance bottlenecks and reliability risks, driving long-term solutions.
Investigate, troubleshoot, and resolve complex production issues across distributed systems.
Automate operational tasks to reduce manual effort and improve system efficiency.
Participate in on-call rotations and lead incident management and post-incident reviews.
Collaborate with engineering teams to influence system architecture and reliability best practices.
Continuously improve deployment, release, and rollback processes to minimize risk and downtime.
Enhance and maintain CI/CD pipelines and other tooling as required.

Requirements

At least 3 years in an SRE role with a strong understanding of Linux/Unix systems and networking fundamentals.
Experience with distributed systems and microservices architectures.
Strong understanding of security and compliance considerations in production environments.
Proficiency in orchestration and containerization technologies such as Docker and Kubernetes.
Good working knowledge of Java, Python, or Golang, along with familiarity with common development practices and methodologies.
Hands-on experience with databases, especially PostgreSQL and MySQL.
Good hands-on experience with cloud platforms, such as AWS, Azure, or Google Cloud.
Experience with monitoring and observability applications such as Prometheus, Grafana, and ELK.

Nice to Have

Familiarity with incident management tools.
Experience in a fast-paced environment.
Knowledge of DevOps practices.

What We Offer

Flexible working patterns to suit your lifestyle.
Comprehensive health and wellness benefits.
Opportunities for professional growth and development.
A collaborative and innovative work environment.
Support for your work-life balance.

Why This Job (8.5 of 10)

This Site Reliability Engineer role at Akamai offers a unique opportunity to work remotely while enhancing system performance in a collaborative environment.

Who Will Succeed Here

→ Proficient in container orchestration with Kubernetes and Docker, enabling seamless deployment and scaling of microservices in a cloud environment.

→ Strong analytical mindset with a focus on monitoring and troubleshooting system performance, leveraging tools like Prometheus and Grafana for observability.

→ Hands-on experience with both relational databases (PostgreSQL, MySQL) and cloud services (AWS), ensuring robust data management and high availability.

Learning Resources

→ Linux Command Line Basics (guide)

→ Docker Mastery: with Kubernetes +Swarm from a Docker Captain (course)

→ Introduction to Site Reliability Engineering (course)

Career Path

Site Reliability Engineer (Now) → Senior Site Reliability Engineer (1-2 years) → Site Reliability Engineering Manager (3-5 years)

Market Overview

Market Size 2024: $32B
Annual Growth: 15.2%
AI Adoption in DevOps: 40%
Investment in Cloud Infrastructure: +25%
Labour Demand for SREs: +20%
Avg Salary for SREs: $115K

Skills & Requirements

Required

Linux, Unix, Docker

Growing in Demand

Terraform, Prometheus, Kustomize

Declining

Bash Scripting, Apache HTTP Server

Domain Trends

Increased Automation in Operations

By 2025, 70% of organizations will automate operational processes, reducing manual intervention by 50%.

Shift to Multi-Cloud Strategies

Over 60% of enterprises are adopting multi-cloud strategies, leading to a 30% increase in demand for cloud-native skills.

Focus on Security in SRE Practices

Security-focused SRE practices are being adopted by 55% of companies, highlighting the need for skills in security automation and compliance.

All job postings are automatically gathered by algorithms. We do not review or verify listings, so be careful when applying and do not sign in with iCloud or Google services.