Site Reliability Engineer
Qode
Posted: April 16, 2026
Quick Summary
Work as a Site Reliability Engineer in Pune, India, to manage and maintain multi-cloud infrastructure for our global clients, focusing on AWS and Azure.
Job Description
Location: Pune, India
Workplace Type: Onsite
Shift: US Shift
About the Role
We are seeking an experienced Site Reliability Engineer to join our dynamic team in Pune. In this role, you will be instrumental in managing our multi-cloud infrastructure, focusing on AWS and Azure. You will be responsible for setting up and maintaining the infrastructure to support our cloud migration and future division expansion. This position offers a unique opportunity to work in a global environment, collaborate with Automotive and corporate IT teams, learn new skills, and shape the future direction of our infrastructure. The ideal candidate will have a strong background in cloud computing, infrastructure as code, and automation, with a proactive approach to problem-solving and performance optimization. You will be part of the Tech Ops / SRE Team, which operates in a sharing and learning culture to maintain continuous access to our products.
Key Responsibilities
• Gather and analyze metrics from operating systems and applications to assist in performance tuning and fault finding.
• Partner with development teams to improve services through rigorous testing and release procedures.
• Participate in system design consulting, platform management, and capacity planning.
• Create sustainable systems and services through automation.
• Balance feature development speed and reliability with well-defined service-level objectives.
• Manage day-to-day operations of AWS/Azure Infrastructure.
• Build and document automation processes for Infrastructure as a Service and Infrastructure as Code.
• Manage backup and patch management processes.
• Support architecture planning, migration, and installation for new projects.
• Lead the structural/architectural design of platforms, middleware, databases, and backups according to system requirements.
• Conduct technology capacity planning by reviewing current and future requirements.
• Develop and implement disaster recovery plans, including backup and recovery procedures.
• Manage day-to-day operations by troubleshooting issues, conducting root cause analysis (RCA), and developing fixes.
• Plan for and manage upgrades, migrations, maintenance, backups, installations, and configurations.
• Review technical performance and implement changes that improve efficiency and fine-tune performance.
• Develop shift rosters to ensure no disruption in the tower.
• Create and update SOPs, Data Responsibility Matrices, operations manuals, and daily test plans.
• Provide weekly status reports to client leadership and internal stakeholders.
• Leverage technology to develop Service Improvement Plans (SIP) through automation.
Required Skills & Qualifications
• Bachelor’s degree (or equivalent) in computer science or a related discipline, plus at least 7 years of experience.
• Strong understanding and hands-on experience with EKS, including configuring, deploying, maintaining, troubleshooting, upgrading, and monitoring EKS on AWS.
• Hands-on experience with CI/CD pipelines and DevOps tooling, including Git-based version control (GitLab preferred), pipeline design and maintenance, automated builds, testing, and deployments for cloud-native and containerized workloads.
• Hands-on experience with Linux servers, AD, LDAP, DNS, network storage, and AWS Compute services (EC2, FSx, Managed AD, Route 53, etc.).
• Ability to script and automate with tools and languages such as PowerShell, Python, Ansible, Terraform, and Bash.
• Familiarity with ITSM processes such as Incident, Problem, and Change Management (ServiceNow preferred).
• Proactive approach to identifying problems, performance bottlenecks, and areas for improvement.
• Strong interpersonal skills, analytical and problem-solving ability, along with strong written and verbal communication.
• Ability to communicate ideas in both technical and non-technical ways.
• A strong capacity for teamwork and a sense of ownership, with the ability to work independently and be self-driven.
• Experience in cloud infrastructure consulting.