Sr. Reliability Operations Engineer (Mexico)

Serve Robotics

Mexico | Remote | Permanent

Posted: April 6, 2026

Quick Summary

The role leads regional incident response, escalations, and Tier 2 support for Serve Robotics' robotic and cloud systems, and builds the runbooks, automations, and operational processes that keep robotic deliveries running reliably in urban environments.

Job Description

At Serve Robotics, we’re reimagining how things move in cities. Our personable sidewalk robot is our vision for the future. It’s designed to take deliveries away from congested streets, make deliveries available to more people, and benefit local businesses.

The Serve fleet has been delighting merchants, customers, and pedestrians along the way in Los Angeles, Miami, Dallas, Atlanta and Chicago while doing commercial deliveries. We’re looking for talented individuals who will grow robotic deliveries from surprising novelty to efficient ubiquity.

Who We Are

We are tech industry veterans in software, hardware, and design who are pooling our skills to build the future we want to live in. We are solving real-world problems leveraging robotics, machine learning and computer vision, among other disciplines, with a mindful eye towards the end-to-end user experience. Our team is agile, diverse, and driven. We believe that the best way to solve complicated dynamic problems is collaboratively and respectfully.

The Senior Reliability Operations Engineer leads operational reliability for a region, owning incident response, escalations, and Tier 2 support for robotic and cloud systems. The role drives the creation and improvement of runbooks, automations, and operational processes while coordinating closely with product engineering and SREs. As the regional incident lead, the engineer ensures issues are resolved efficiently and communicated clearly to all stakeholders.

Responsibilities

• Serve as the primary incident lead during your region’s daytime hours, coordinating technical investigations, centralizing communication, and engaging the appropriate engineering and SRE teams when escalation is required.

• Respond to escalations from Tier 1 support, using runbooks, metrics, logs, and system diagnostics to investigate and remediate issues or determine when escalation to Tier 3 is necessary.

• Develop and update runbooks, workflows, and operational documentation to ensure consistent and reliable responses to recurring issues, collaborating with product teams to expand coverage over time.

• Write, maintain, and enhance automation scripts and tools that streamline common remediation steps, improve response times, and reduce manual operational overhead.

• Use metrics, logs, and tracing tools (Grafana/Prometheus, GCP Monitoring, OpenTelemetry) to proactively identify problems, validate system behavior, and support continuous improvement of detection mechanisms.

• Act as the central point of communication during active incidents, ensuring timely updates and clear routing to the correct product engineering and SRE stakeholders.

• Collaborate with reliability and product teams to share insights, recommend improvements, and help refine processes that enhance the stability and operability of our systems.

• Participate in a shared weekend on-call rotation to help maintain operational coverage for production systems, responding to incidents and escalations as needed and coordinating with engineering teams when issues arise.

• Help establish operational best practices, refine workflows, and prepare the foundation for a broader reliability operations function.

Qualifications

• Bachelor’s degree in Computer Science, Information Technology, Engineering, or equivalent practical experience.

• 5+ years of professional experience in Reliability Operations, Site Reliability Engineering, DevOps, IT Operations, or a related technical support function.

• Demonstrated experience owning or participating in Tier 2 or Tier 3 technical investigations, including triage, log analysis, and structured escalation.

• Experience supporting distributed systems, cloud-hosted services, or production operational environments.

• Hands-on experience participating in incident response processes.

• Strong proficiency with Linux, including navigating systems, reviewing logs, and performing diagnostics.

• Experience writing, executing, and maintaining runbooks, automations, and operational workflows.

• Ability to interpret metrics, logs, and traces using tools such as Grafana/Prometheus, Google Cloud Monitoring, and OpenTelemetry.

• Familiarity with modern cloud environments, preferably Google Cloud Platform (GCP), including basic debugging, permissions, and service-level triage.

• Ability to investigate and remediate issues following documented procedures, escalating effectively when needed.

• Understanding of CI/CD pipelines, deployed application behavior, and operational dependencies across microservices.

• Proficiency with Jira or similar platforms for ticketing and structured incident tracking.

• Exceptional communication skills, especially during high-pressure incidents where clear, concise updates are critical.

• Calm and methodical approach to troubleshooting, prioritization, and decision-making.

• Strong collaboration skills when coordinating with product engineering, SRE, and global support teams.

• High level of ownership, reliability, and accountability when handling operational

What Makes You Stand Out

• Experience acting as an incident commander or primary incident response lead for high-severity events.

• Hands-on experience with robot fleets, IoT devices, or edge systems operations.

• Experience building lightweight tools, scripts, or internal automations to increase operational efficiency.

• Familiarity with incident management tools such as PagerDuty, OpsGenie, Jira Service Management, or Grafana IRM.

• Background creating or improving operational documentation, runbooks, or support processes at scale.

• Ability to coach and mentor others, and to uplift operational maturity within a region or team.

• Strong networking fundamentals, including experience diagnosing connectivity issues across distributed systems. Familiarity with Tailscale or similar zero-trust networking tools is a major plus.

Additional Information

• As part of maintaining continuous operational coverage, this role also participates in a rotating weekend on-call schedule shared across the Reliability Operations team.
