Director, Site Reliability & Operations
Confidential
Posted: April 8, 2026
Job Description
Switchfly is hiring a Director of Site Reliability & Operations to own the operational backbone of a platform that processes travel and loyalty transactions for some of the world’s largest airlines and financial institutions. This is not a role for someone who manages by dashboard and delegates by email. We need a technically engaged leader with a full tool belt — someone who understands our platform deeply enough to participate in the hard conversations, and who owns outcomes rather than activities.
This role enables 50+ developers to ship secure, PCI-compliant releases at least weekly — supporting the DevOps culture that makes that pace sustainable. We need a leader who supports delivery velocity, spends the reliability and change budget strategically, and drives toward faster, safer delivery. Security and compliance set the floor; velocity is the ambition. And as AI reshapes how software is built and operated, we expect this leader to help us embrace it thoughtfully — not lock it out. You will lead a team of SRE, DevOps, DBA, network, security, and corporate IT professionals, working in close partnership with engineering leadership across a PCI Level 1-compliant, 24x7 enterprise platform.
Responsibilities
Own site reliability and availability for our cloud-hosted platform — 24x7 uptime, monitoring, alerting, anomaly detection, and incident response programs
Drive security outcomes across the platform — tracking findings from SonarQube, penetration tests, and vulnerability tools, and acting as a business stakeholder to get remediation work scoped, prioritized, and into the engineering delivery pipeline
Own PCI Level 1 compliance currency — when standards evolve, you understand the requirement, translate it into engineering terms, and drive adoption; you don’t just surface the factoid
Participate as a business stakeholder in SAFe planning — bringing security, compliance, and reliability work into the engineering delivery pipeline alongside feature development
Lead infrastructure patching and maintenance — OS, database, and system-level currency within our AWS environment, coordinating monthly maintenance windows and CI-driven image refresh cadence
Manage and develop an internationally distributed team across SRE, DevOps, DBA, network, security, and corporate IT functions
Own the AWS cost and capacity budget — monitoring spend, optimizing resource utilization, and making strategic tradeoff decisions in partnership with engineering leadership
Partner with engineering directors to define the boundary between infrastructure and application-layer security, and ensure nothing falls between the cracks
Own vendor outcomes across our cloud and tooling ecosystem — holding partners accountable and ensuring contracts reflect our operational needs
Guide personal and career development of your people
Foster a culture where reliability and security are shared team values, not external mandates
About You
You have a full tool belt — technically engaged, platform-curious, and willing to log into systems, participate in firefights, and develop genuine understanding of what you’re operating
7+ years in SRE, DevOps, cloud infrastructure, or security engineering, with 4+ years leading technical teams
Deep AWS experience across compute, networking, storage, and managed services in a production enterprise environment
Hands-on familiarity with the security and compliance discipline — you’ve operated in PCI, SOC 2, or equivalent regulated environments and understand what compliance actually requires versus what it looks like on paper
You operate as a business stakeholder, not just a technical function — comfortable working within SAFe or similar delivery frameworks to get security and reliability work into the roadmap alongside feature development
You manage through credibility and technical engagement, not just title — your team respects you because you understand their work
You are inquisitive, direct, and outcome-oriented — you form opinions, communicate them clearly, and own what happens next
Your colleagues are inspired to follow your lead
About the Environment
Our infrastructure is AWS-native; we have no on-premises footprint. The ecosystem includes Cloudflare, Splunk, AppDynamics, Grafana, Opsgenie, Snowflake, AWS CloudHSM, GitLab, and Jenkins hosted on EC2, alongside standard AWS managed services. PostgreSQL is our primary database. The platform runs Java/Spring Boot and Python backends serving Ember.js and React frontends. Familiarity with any part of this stack accelerates your effectiveness; deep expertise across all of it is not expected.
Corporate IT supports a primarily remote, technically self-sufficient workforce with a small number of call center, sales, and executive users requiring additional support. The function runs on Okta and Microsoft 365.
Company Perks & Benefits
At Switchfly, we believe in giving people the flexibility, support, and benefits they need to do their best work.
Discretionary Time Off (DTO) – Take time off when you need it. We trust our employees to manage their time responsibly while meeting business needs.
15 Company-Paid Holidays – Including a company-wide break from Christmas Eve through New Year’s Day.
Comprehensive Benefits Package – Switchfly offers a full suite of health benefits, with the company covering an average of 87% of employee premiums.
401(k) with Company Match – We support long-term financial wellness with a competitive retirement plan.
Switchfly Core Values to consider for this position:
- Adaptability & Calculated Risk
Our culture of learning is powered by an iterative mindset, a shared desire for high performance, and a willingness to take risks and push through limits. Our data-driven approach fosters new ideas and continuous learning, but also enables us to be flexible and adjust, learning from success and failure.
- Ownership and Accountability
Taking professional responsibility reflects an approach that understands the stakes, and how success requires everyone to be truly accountable. We’re visionaries, not mercenaries, and from the CEO to our newest colleague, we own what we do, because the buck stops with each of us.
At Switchfly, we don’t just accept difference — we celebrate it, we support it, and we thrive on it for the benefit of our employees, our products, and our community. At this time, we're unable to provide visa sponsorship for this role.