Underdog Sports, 17.02.26
AI SCORE 8.5

Remote Senior Site Reliability Engineer - Infrastructure

$140K–$180K/year

About the Role

We are looking for a Remote Senior Site Reliability Engineer to join our team at Underdog Sports. In this role, you will play a critical part in ensuring the reliability and scalability of our infrastructure as we continue to grow. As a founding member of the SRE team, you will help define our approach to operational excellence and reliability. This position offers a unique opportunity to make a significant impact from day one.

What You'll Do

  • Own and maintain the incident response process, defining procedures, tools, and best practices.
  • Guide teams in establishing and monitoring Service Level Objectives (SLOs), including setting up alerts and reporting systems.
  • Lead capacity planning initiatives, focusing on scalability and performance during peak traffic and game-day spikes.
  • Collaborate closely with platform, infrastructure, and product teams to enhance system reliability and developer experience.
  • Identify high-leverage reliability challenges and shape our incident response strategy.

Requirements

  • 5+ years of experience in Site Reliability Engineering or a related field.
  • Strong understanding of incident response processes and best practices.
  • Experience with monitoring and alerting tools.
  • Proficiency in cloud infrastructure (AWS, GCP, or Azure).
  • Excellent problem-solving skills and a proactive approach to challenges.

Nice to Have

  • Familiarity with containers and container orchestration (Docker, Kubernetes).
  • Experience in capacity planning and performance tuning.
  • Knowledge of programming/scripting languages (Python, Go, etc.).

What We Offer

  • Competitive salary and performance-based bonuses.
  • Flexible remote work environment.
  • Health, dental, and vision insurance.
  • Generous paid time off and holiday schedule.
  • Opportunities for professional development and growth.

Why This Job (8.5 of 10)

This Remote Senior Site Reliability Engineer role at Underdog Sports offers a unique opportunity to shape the company's reliability practices while enjoying competitive pay and flexible work arrangements.

Who Will Succeed Here

Proficient in managing cloud infrastructure on AWS, GCP, or Azure, with hands-on experience in deploying and maintaining scalable applications in Kubernetes and Docker environments.

Strong analytical mindset with a proven track record in incident response, demonstrating the ability to quickly diagnose and resolve complex system outages while implementing effective SLO monitoring strategies.

Self-motivated and comfortable working in a fully remote environment, exhibiting excellent time management skills to balance multiple priorities and deliver operational excellence without direct supervision.

Learning Resources

Incident Response Guide (guide)

Career Path

  • Remote Senior Site Reliability Engineer - Infrastructure (Now)
  • Lead Site Reliability Engineer (1-2 years)
  • Director of Site Reliability Engineering (3-5 years)

Market Overview

Market Size 2024: $8.5B
Annual Growth: 12.3%
AI Adoption: 45%
Investment: +200%
Labour Demand: +30%
Avg Salary: $135K

Skills & Requirements

Required: Incident Response, SLO Monitoring, Cloud Infrastructure
Growing in Demand: Chaos Engineering, Observability Tools (e.g., Prometheus, Grafana), Automated Incident Management
Declining: Traditional ITIL Incident Management, On-Premise Infrastructure Management

Domain Trends

Increased Automation in Incident Response
By 2025, 60% of organizations will automate incident response processes to improve efficiency and reduce human error.
Shift to Cloud-Native Incident Management
80% of companies are adopting cloud-native technologies, leading to a 25% increase in demand for SREs skilled in cloud incident response.
Rise of AI-Driven Monitoring Solutions
AI-driven monitoring solutions are projected to grow by 35% in the next two years, enhancing real-time incident detection and response.

All job postings are gathered automatically by algorithms. We do not review or verify listings, so be careful when applying, and do not sign in with iCloud or Google services.