Remote Site Reliability Engineer II - SaaS Platform
About the Role
We're hiring a Remote Site Reliability Engineer II to join our team at Restaurant365. In this role, you will support and enhance our cloud-based SaaS platform for the restaurant industry, working with modern cloud technologies and collaborating with cross-functional teams to ensure the platform's reliability and performance.
What You'll Do
- Respond to production incidents, perform triage and troubleshooting, and contribute to post-incident analysis.
- Identify and automate manual processes to improve efficiency and reduce risk.
- Enhance and evolve monitoring tools and platforms to improve observability.
- Promote and apply best practices for reliability, scalability, and performance across engineering.
- Implement and support cloud automation using Terraform, Ansible, or CloudFormation.
- Participate in the on-call rotation, providing 24x7 incident support and contributing to root cause analysis.
- Research and remediate vulnerabilities in coordination with security teams.
- Maintain documentation of infrastructure, monitoring, runbooks, and incident response procedures.
Requirements
- BS in Computer Science, Information Systems, or related field (or equivalent experience).
- 2–4 years of experience in site reliability engineering, DevOps, or cloud operations.
- Experience with cloud platforms (Azure or AWS), including services such as AKS, ECS/EKS, Functions/Lambda, S3, and Blob storage.
- Proficiency with infrastructure-as-code and automation (Terraform, Ansible, YAML, Python, Bash, PowerShell).
- Strong Linux engineering skills; working knowledge of Windows administration.
- Experience supporting production environments and participating in on-call rotations.
- Familiarity with web servers and middleware (Nginx, Apache Tomcat).
- Strong written, oral, and interpersonal communication skills.
Nice to Have
- Experience with monitoring tools (Prometheus, Grafana, ELK, Site24x7, Nagios).
- Knowledge of performance analysis and system vulnerability remediation.
- Cloud certification (AWS or Azure).
- Familiarity with restaurant industry SaaS platforms and customer-facing applications.
What We Offer
- Salary range of $98,583-$138,016 annually.
- Comprehensive medical benefits, 100% paid for the employee.
- 401k + matching.
- Unlimited PTO + Company holidays.
- Wellness initiatives.
- Equitable pay practices.
Join us as a Remote Site Reliability Engineer II and contribute to our mission of delivering a best-in-class SaaS solution for the restaurant industry. Apply now!
Who Will Succeed Here
- You are proficient in AWS and Azure, with hands-on experience deploying and managing applications and services with Terraform and Ansible to ensure high availability and scalability.
- You are self-motivated and disciplined, thrive in a remote work environment, and can manage your time effectively while collaborating with cross-functional teams across time zones.
- You bring a strong problem-solving mindset and 3-5 years of experience in Site Reliability Engineering, including implementing monitoring tools and CI/CD pipelines to improve system reliability and performance.