MisuJob - AI Job Search Platform

Jobs

Browse 198+ jobs updated daily

Latest Job Openings

CRI-Lagunilla de Heredia-Ultra permanent
Finance, Oracle Fusion Cloud ERP, Application Support, Service Management, IT Operations

ABOUT US: LSEG (London Stock Exchange Group) is more than a diversified global financial markets infrastructure and data business. We are dedicated, open-access partners with a commitment to excellenc...

January 21, 2026
Hyderabad permanent
Service Level Management, Reliability Framework, Service Level Indicators, Service Level Objectives, Error Budget Management, Reliability Governance, Business Impact Correlation, Incident Management, Learning Culture, Blameless Incident Response, Incident Command, Predictive Issue Prevention

Position Title: Reliability Engineering Lead Location: Hyderabad Role Description (Process-First Responsibilities) 1. Service Level Management & Reliability Framework Process Owner: SLO-driven reliabi...

January 20, 2026
New York, New York, United States permanent
Cloud, Kubernetes, AWS, Production Code, On-call Experience, Monitoring Systems, Debugging, Reliability, Systems Thinking, Auto Scaling

ABOUT US: Modal provides the infrastructure foundation for AI teams. With instant GPU access, sub-second container startups, and native storage, Modal makes it simple to train models, run batch jobs,...

January 19, 2026
Singapore, Singapore, Singapore Hybrid permanent
Cloud, Docker, Kubernetes, Terraform, PostgreSQL, Monitoring, Alerting, CI/CD, Automation

Visa is a world leader in payments technology, facilitating transactions between consumers, merchants, financial institutions and government entities across more than 200 countries and territories, de...

December 29, 2025
Highlands Ranch, CO, United States Hybrid permanent
Leadership, Team Management, Incident Management, Problem Management, Change Management, Process Development, Automation, Technology Adoption, Security, Metrics-based KPIs

Visa is a world leader in payments technology, facilitating transactions between consumers, merchants, financial institutions and government entities across more than 200 countries and territories, de...

January 8, 2026
Foster City, CA, United States Hybrid permanent
Python, Java, Go, Terraform, Ansible, Jenkins, git, Prometheus, Grafana, ELK

Visa is a world leader in payments and technology, with over 259 billion payments transactions flowing safely between consumers, merchants, financial institutions, and government entities in more than...

January 14, 2026
Barcelona, Spain (Spain) Remote permanent
Leadership, Technical Direction, Incident Response, Monitoring & Alerting, Automation, Reliability Best Practices, Cross-Functional Collaboration, System Optimization, Observability, Strategic Planning

Get to know Okta Okta is The World’s Identity Company. We free everyone to safely use any technology, anywhere, on any device or app. Our flexible and neutral products, Okta Platform and Auth0 Platfo...

January 13, 2026
Albuquerque, NM (New Mexico (5201 Hawking Drive SE, Albuquerque, NM 87106)) permanent
Reliability Engineering, Failure Modes and Effects Analysis (FMEA), Fault Tree Analysis (FTA), Data Analysis/automation, Python, git, LaTeX, Bash/Shell/Batch scripting, Docker, GitHub

Company Overview Kairos Power is a new nuclear energy technology and engineering company whose mission is to enable the world’s transition to clean energy, with the ultimate goal to dramatically impr...

January 6, 2026
Chicago, IL (Chicago) Remote permanent
Site Reliability Engineering, GCP, AWS, Kubernetes, Terraform, CI/CD, Monitoring, Alerting, SLIs, SLOs

About Attain Klover’s engineering team powers one of the fastest-growing fintech platforms in the U.S., supporting over one million active users each month. Our systems process and move more than $1....

January 6, 2026
Remote US (Remote Canada, Remote US) Remote permanent
Python, Docker, Kubernetes, Terraform, Ansible, GitLab CI, Jenkins, PostgreSQL, MySQL, AWS

Affirm is reinventing credit to make it more honest and friendly, giving consumers the flexibility to buy now and pay later without any hidden fees or compounding interest...

December 23, 2025
Remote - US (Remote US) Remote permanent
Terraform, AWS, CI/CD, Docker, Python, Go, Observability tools, Communication, Collaboration

We are Bugcrowd. Since 2012, we’ve been empowering organizations to take back control and stay ahead of threat actors by uniting the collective ingenuity and expertise of our customers and trusted all...

December 19, 2025
Redwood City, US, California Hybrid permanent
Java, Spring Framework, Python, Kubernetes, Cloud Platforms, Reliability Engineering, Observability, Incident Management

We're Celonis, the global leader in Process Intelligence technology and one of the world's fastest-growing SaaS firms. We believe there is a massive opportunity to unlock productivity by placing AI, d...

December 16, 2025
Dublin, IE (Dublin) Remote permanent
Security, AI/automation, DevOps, IncidentResponse, Monitoring, Scalable, Leadership, AWS, Kubernetes, Terraform, Ansible, Docker, GitLab CI, Jenkins, Secure Reliability

At Klaviyo, we value the unique backgrounds, experiences and perspectives each Klaviyo (we call ourselves Klaviyos) brings to our workplace each and every day. We believe everyone deserves a fair shot...

December 12, 2025
Remote - Ireland; Remote - Spain; Remote - United Kingdom (Remote - Ireland) Remote permanent
Reliability Engineering, Software Architecture, Kubernetes, AWS, Terraform, Observability, IncidentResponse, Disaster Recovery Management, Capacity planning, Talent mentorship

Who we are At Twilio, we’re shaping the future of communications, all from the comfort of our homes. We deliver innovative solutions to hundreds of thousands of businesses and empower millions of dev...

December 12, 2025
Remote work permanent
AWS, DevSecOps, AWS CDK, Multi-Region, Network Connectivity, Direct Connect, Transit Gateway, AWS VPC, VPN, Disaster Recovery, Cloud Migration, Infrastructure as Code

Bold Our company was founded in May 2019 by a team of incredible people with unique experience; the founding group is made up of the creators of PayU Latam and other...

February 5, 2026
Remote work permanent
AWS, DevSecOps, Multi-Region Architecture, Networking, Cloud Infrastructure, IaC, AWS CDK, Disaster Recovery, High Availability, Security, Automation

Bold Our company was founded in May 2019 by a team of incredible people with unique experience; the founding group is made up of the creators of PayU Latam and other...

January 30, 2026
Pune, India (India) permanent
Docker, Kubernetes, CI/CD, IaC, Observability, Chaos Testing, AI/automation, Leadership, Incident Management

Veeam, the #1 global market leader in data resilience, believes businesses should control all their data whenever and wherever they need it. Veeam provides data resilience through data backup, data re...

December 11, 2025
Remote US Remote permanent
ProgrammingLanguages_C, Backend Development, DataEngineering, SiteReliabilityEngineering, Observability, Incident Management, Change Management, Capacity planning, AI/automation, Talent Development

Affirm is reinventing credit to make it more honest and friendly, giving consumers the flexibility to buy now and pay later without any hidden fees or compounding interest...

December 11, 2025
Remote Canada Remote permanent
Python, Kotlin, AWS, MySQL, Kubernetes, Reliability Engineering, Distributed Systems, Backend Development, Observability

Affirm is reinventing credit to make it more honest and friendly, giving consumers the flexibility to buy now and pay later without any hidden fees or compounding interest. Site Reliability Engineeri...

December 11, 2025
Bellevue, Washington (Bellevue) Remote permanent
Technical leadership, People Management, Agile Project Methods, DevOps, Load Infrastructure Platforms, Cloud Services, Kubernetes, Observability, PeopleDevelopment

Get to know Okta Okta is The World’s Identity Company. We free everyone to safely use any technology, anywhere, on any device or app. Our flexible and neutral products, Okta Platform and Auth0 Platfo...

December 11, 2025
UK - HQ - London Remote permanent
AI/automation, DevOps, Security, Developer Experience, Infrastructure-as-product, Reliability, Scalable, Cloud, Docker, Kubernetes, Terraform, Ansible, CI/CD, Automation

About Euc We’re making good health last a lifetime More than 1 billion people globally live with obesity, a significant leading indicator of many preventable chronic diseases such as diabetes and he...

December 11, 2025
Paris, France (Paris) Remote permanent
SiteReliabilityEngineering, ProductionEngineering, AI Search, CloudPlatforms, Loadbalancing, BackupRestoreSystems, FinOps, Leadership, TeamMentorship

At Algolia, we’re proud to be a pioneer and market leader in AI Search, empowering 17,000+ businesses to deliver blazing-fast, predictive search and browse experiences at internet scale. Every week, w...

December 10, 2025
Dallas, Texas (DFW1) permanent
Leadership, Team Management, Troubleshooting, Issue Resolution, Process Improvement, Technical communication, Linux Administration, Automotive Parts Experience, Service Management Knowledge

Who we are Aurora’s mission is to deliver the benefits of self-driving technology safely, quickly, and broadly. The Aurora Driver will create a new era in mobility and logistics, one that will bring...

December 10, 2025
Location not specified Remote
Containerization, Technical Automation, Observability, CI/CD, Infrastructure as Code (IaC), AIOps Services, Kubernetes, Terraform, Docker, SAP BTP, AWS, Azure, Helm, Logging, Metrics

We help the world run better At SAP, we keep it simple: you bring your best to us, and we'll bring out the best in you. We're builders touching over 20 industries and 80% of global commerce, and we ne...

December 9, 2025
Location not specified Remote
Linux, Python, Java, Shell, Talend ETL, Postgres, Snowflake, DB2, Sybase, MSSQL Server, MongoDB, DevOps

We're seeking someone to join our Data Protection Fleet as a Site Reliability Engineering (SRE) Specialist in Cyber to help drive performance, reliability, enhanced observability and efficiency for th...

December 9, 2025
Mexico City, Mexico City, Mexico (Mexico City) Hybrid permanent
Data Modeling, Data Pipelines, Ontologies, Metric Definition, Observability, data documentation, Database reliability

About Crunchyroll Founded by fans, Crunchyroll delivers the art and culture of anime to a passionate community. We super-serve over 100 million anime and manga fans across 200+ countries and territor...

December 9, 2025
Location not specified Remote
AWS, PostgreSQL, AmazonRDS, Docker, Kubernetes, Terraform, Python, CI/CD, Observability

About the opportunity We are seeking a Senior Site Reliability Engineer to join the Database Platform Team within the Platform Engineering Domain. Platform Engineering's mission is to provide truste...

December 8, 2025
Remote Remote permanent
Containerization, Infrastructure as Code (IaC), Project Management, IncidentResponse, Service Level Reporting, Automation, CrossFunctionalCommunication, Team Management, Hands-on Experience, Troubleshooting, Cloud Computing, Networking, Linux Systems Administration, Infrastructure as Code, Independence

The Wikimedia Foundation is looking for an Engineering Manager to join our SRE team, reporting to the Director of Site Reliability Engineering. As Engineering Manager, you will be responsible for supp...

December 8, 2025
Remote Remote permanent
Containerization, Infrastructure as Code (IaC), Project Management, IncidentResponse, Service Level Reporting, Team Management, Software Engineering, Reliability Engineering, Cloud Computing, Linux Systems Administration, Infrastructure as Code, Incident Response, Service Level Management, Automation, Independence

The Wikimedia Foundation is looking for an Engineering Manager to join our SRE team, reporting to the Director of Site Reliability Engineering. As Engineering Manager, you will be responsible for supp...

December 8, 2025
Remote Remote permanent
Team Management, Software Engineering, Reliability Engineering, Cloud Computing (AWS), Project Management, IncidentResponse, Service Level Management, Automation, Travel

The Wikimedia Foundation is looking for an Engineering Manager to join our SRE team, reporting to the Director of Site Reliability Engineering. As Engineering Manager, you will be responsible for supp...

December 8, 2025
Remote Remote permanent
Linux Administration, Containerization, Terraform, Kubernetes, Project Management, IncidentResponse, Cloud Computing (AWS), Automation

The Wikimedia Foundation is looking for an Engineering Manager to join our SRE team, reporting to the Director of Site Reliability Engineering. As Engineering Manager, you will be responsible for supp...

December 8, 2025
Remote Remote permanent
Containerization, Infrastructure as Code (IaC), Project Management, IncidentResponse, Service Level Management, Team Management, team_management, hands_on_experience, complex_systems_analysis, project_management, cloud_computing, networking, linux_administration, infrastructure_as_code, automation

The Wikimedia Foundation is looking for an Engineering Manager to join our SRE team, reporting to the Director of Site Reliability Engineering. As Engineering Manager, you will be responsible for supp...

December 8, 2025
United States, Remote Remote permanent
Reliability Engineering, AWS, Azure, DevOps, Leadership, Process Optimization, Observability, Kubernetes, GitOps deployment pipelines, Jenkins, GitHub Actions, Ansible, Java

Are you driven to be an innovative and dynamic Manager of Site Reliability Engineering, and looking to join a team where open collaboration, customer focus, and a commitment to excellence are core val...

December 5, 2025
San Ramon, California, United States (Beverly Hills) permanent
Python, Monitoring, AI/automation, Incident Management, Observability, DevOps, New Relic, PagerDuty, Kubernetes, MS SQL

WHY JOIN ALO? Mindful movement. It’s at the core of why we do what we do at ALO—it’s our calling. Because mindful movement in the studio leads to better living. It changes who yogis are off the mat, ...

December 3, 2025
Singapore HQ (Singapore) permanent
Reliability Engineering, Technical Operations, Predictive Maintenance, Cloud Infrastructure, Sustainability, Predictive Analytics, Datacentre Operations, Infrastructure, Cross-functional Collaboration, Sustaining engineering

Want to be a part of Asia Pacific & Middle East's (APME) largest, most innovative, and rapidly growing data centre company? AirTrunk is a technology company with a powerful purpose - to scale and...

December 2, 2025
Novi Sad, South Bačka, Serbia, EMEA (SRB - Novi Sad) Remote permanent
AWS, Azure, Kubernetes, CI/CD, Terraform, FastAPI, Grafana, Leadership, Talent Management

From Fivetran’s founding until now, our mission has remained the same: to make access to data as simple and reliable as electricity. With Fivetran, customer data arrives in their warehouses, canonical...

December 2, 2025
Office Location or Remote - USA Remote permanent
Observability Tools, Enterprise Monitoring Tools, Incident Management, Reliability Engineering, Leadership, Automation, Team Management, Strategy, Implementation, Observability, Monitoring, Unified Platform, APM, SLIs

The Senior Manager, Site Reliability Engineering (SRE) will lead the SRE organization to deliver reliable, scalable, and resilient platforms and services. This role will own the strategy, implementati...

December 2, 2025
Home Based - APAC; Home based - EMEA (Home Based - Asia Pacific, Home Based - EMEA) Remote permanent
Linux, Unit operations, Infrastructure as Code (IaC), Kubernetes, GitOps, Agile, TeamLeadership, Service Level Agreements, Mentoring, Stakeholder Management

Canonical is a leading provider of open-source software and operating systems for global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiat...

December 1, 2025
São Paulo, Brazil (C6 Bank) permanent
AWS, Kubernetes, Terraform, Python, Ansible, CI/CD, DevOps, Shell, PowerShell, Monitoring

Our Site Reliability Engineering area: Our area is responsible for the availability of its verticals, such as performance, performance and financial efficiency, change execution, plan...

November 26, 2025
Dublin, Ireland Remote permanent
Java, Kotlin, Go, Python, Distributed Systems, Systems, Observability, Incident Management, Docker, Kubernetes, Terraform, GitLab CI, Scrum, Cloud architectures, Networking

Toast is driven by building the platform that helps restaurants adapt, take control, and focus on what they do best: creating experiences their guests love. Tremendous business growth has spurred a ne...

November 19, 2025
Colombo, Western Province, Sri Lanka permanent
Leadership, Team Management, Cloud Infrastructure, Reliability, Scalability, Performance Optimization, System Monitoring, Incident Response, Automation, Communication

Cloud Solutions International Pvt Ltd is seeking a highly skilled and motivated Site Reliability Engineering Manager to join our dynamic Site Reliability Engineering department. As the Site Reliabilit...

January 30, 2026
Helsinki Remote permanent
Leadership, People Management, SRE Team Leadership, Cloud Strategy, Infrastructure Reliability, Automation, operational efficiencies, Agile Development Process, Hands-On Engineering

We’re looking for an Engineering Manager to lead the SRE team within our Infrastructure & Production group. Site Reliability Engineering (SRE) is responsible for our infrastructure development and...

November 13, 2025

Site Reliability Engineering (SRE) Consultant (m/f/d)

Company name visible to EXPERT members

Remote Remote
SRE, Observability, Monitoring, Release Management, Technical Automation, Kubernetes, DevOps, CI/CD, Infrastructure-as-Code, Linux

Project description: SAP for Me is SAP's strategic customer portal, serving as a single, digital entry point for a customer's entire SAP relationship. It provides a personalized and trans...

November 5, 2025
Chicago, IL (Chicago) Remote permanent
Kubernetes, AWS, GCP, Terraform, CI/CD, Observability, Automation, Helm, PostgreSQL, BigQuery, Spanner, MySQL

About Attain Built for consumers and companies, alike. In a world driven by data, we believe consumers and businesses can coexist. Our founders had a vision to empower consumers to leverage their gr...

November 3, 2025
North Bethesda, MD (Lexington, KY, North Bethesda, MD, Waltham, MA) Remote permanent
ProgrammingLanguages_C, CloudPlatforms, CI/CD, Kubernetes, AWS, Observability, Monitoring, DevOps, Leadership

Xometry (NASDAQ: XMTR) powers the industries of today and tomorrow by connecting the people with big ideas to the manufacturers who can bring them to life. Xometry’s digital marketplace gives manufact...

October 30, 2025
Waltham, MA (Lexington, KY, North Bethesda, MD, Waltham, MA) Remote permanent
Docker, Kubernetes, CI/CD, AWS, Observability, Software-Development, SRE, Strategic direction, Cost-effective systems, Secure systems, Fast systems, Reliable systems, Operational rigor, Operational efficiency, Engineering velocity

Xometry (NASDAQ: XMTR) powers the industries of today and tomorrow by connecting the people with big ideas to the manufacturers who can bring them to life. Xometry’s digital marketplace gives manufact...

October 30, 2025
Toronto, Canada (Hybrid) (Toronto) Remote permanent
Incident Management, TeamLeadership, Strategic_Planning, Operational excellence, Service Level Management, Automation, Leadership, Mentorship, On-call Practices, Runbook Reviews, Game Days, Incident Retrospectives, Technical Strategy, Observability, Service Level Objectives

About Tubi: Boldly built for every fandom, Tubi is a free streaming service that entertains over 100 million monthly active users. Tubi offers the world's largest collection of Hollywood movies and T...

October 16, 2025
Bengaluru, Karnataka, India (APAC) (Bengaluru) Remote permanent
Kubernetes, Infrastructure as Code (IaC), Automation, Component Ownership, Architecture & Design, Reliability Engineering, Observability, Incident Management, Technical leadership

The Aviatrix SRE team is a small but highly skilled global group of Systems Engineers/SREs dedicated to ensuring the reliability, availability, and performance of Aviatrix’s critical systems and servi...

September 29, 2025