Remote Staff Site Reliability Engineer - Cloud Infrastructure
About the Role
We are seeking a Remote Staff Site Reliability Engineer to join our team at Achievers. In this role, you will manage and advance our global infrastructure, keeping our systems reliable and scalable. You will apply your technical expertise to architect our GCP/GKE environment and lead the integration of AI-driven workflows.
What You'll Do
- Lead the design and ongoing evolution of our global, high-availability infrastructure on Google Cloud Platform (GCP) and Kubernetes (GKE).
- Implement AI-integrated workflows, such as Slack or Teams bots for incident triage and automated PR generation.
- Collaborate with Product, Engineering, and Leadership teams to manage complex changes and define the long-term reliability roadmap.
- Establish best practices for Terraform and CI/CD pipelines, empowering development teams to deploy code rapidly and securely.
- Lead initiatives in disaster recovery, multi-region networking, and the design of zero-trust security architectures.
- Guide design reviews and promote best practices, enhancing the technical skills of the entire SRE organization.
Requirements
- 15 years of systems engineering experience, with in-depth knowledge of the Linux kernel, network protocols (TCP/IP, BGP, DNS), and cloud-native architecture.
- Hands-on experience in architecting and managing production workloads on Google Cloud Platform and GKE.
- Practical experience or a strong vision for integrating AI tools and LLMs to automate SRE tasks.
- Advanced skills in Python or Go for developing internal tools and automation frameworks.
- Expert understanding of observability frameworks (New Relic, Prometheus, Grafana).
- Deep knowledge of managing relational and document databases (MySQL, MongoDB) at scale.
Nice to Have
- Hands-on experience with Service Mesh (Istio) and advanced GCP Networking features.
- A proven history of migrating legacy automation systems to modern, AI-augmented CI/CD workflows.
What We Offer
- Competitive salary range of $124,000 - $170,000 based on experience and skills.
- Health Benefits and Life Insurance Coverage starting on your first day.
- Parental Leave Top-up and employer-matched RRSP contributions.
- Flexible Vacation policy to recharge and bring your best self to work.
- Employee and Family Assistance Program offering mental health, legal, and financial counselling.
- Supported professional development and career growth opportunities.
- Hybrid flexibility with time in our Liberty Village, Toronto office.
- Regular events designed to build connection, belonging, and well-being.
This role offers the opportunity to lead cloud infrastructure initiatives in a supportive and innovative environment, backed by the benefits listed above.
Who Will Succeed Here
- Proficiency in Google Cloud Platform and Kubernetes, with hands-on experience deploying and managing containerized applications on GKE for high availability and scalability.
- Strong proficiency in automation and infrastructure as code using Terraform, along with a deep understanding of CI/CD pipelines for rapid deployment and integration of applications.
- A proactive mindset with a strong focus on observability, including the ability to implement monitoring with tools like Prometheus and Grafana to ensure system reliability and performance.