MisuJob - AI Job Search Platform

Senior DevOps Engineer

Gridware

San Francisco, CA | Hybrid | Permanent

Posted: May 7, 2026

Quick Summary

Own the AWS, Kubernetes (EKS), CI/CD, and observability infrastructure behind Gridware's Active Grid Response platform, which ingests field-device events to support proactive maintenance and fault mitigation.

Job Description

About Gridware
Gridware is a San Francisco-based technology company dedicated to protecting and enhancing the electrical grid. We pioneered a groundbreaking new class of grid management called active grid response (AGR), focused on monitoring the electrical, physical, and environmental aspects of the grid that affect reliability and safety. Gridware’s advanced Active Grid Response platform uses high-precision sensors to detect potential issues early, enabling proactive maintenance and fault mitigation. This comprehensive approach helps improve safety, reduce outages, and ensure the grid operates efficiently. The company is backed by climate-tech and Silicon Valley investors. For more information, please visit www.Gridware.io.

Role Description

We’re scaling the deployment of critical infrastructure monitoring devices to detect real-world fault events that lead to wildfires. The platform you’ll build and operate ingests millions of events per day from devices in the field, powers customer-facing dashboards and alerting, and supports the data science work that turns raw signals into grid intelligence.

You will own AWS infrastructure, Kubernetes (EKS), CI/CD, and observability end-to-end, partnering with our Cloud Security team to keep the platform safe and compliant, and with backend, firmware, and data teams to keep them shipping fast. As an early member of the DevOps team, you’ll have a direct hand in shaping how Gridware builds, deploys, and runs production systems for years to come.


Responsibilities:
• Design, build, and maintain scalable, secure, and highly available infrastructure on AWS (EKS, EC2, RDS / Aurora Postgres, MSK, S3, VPC, IAM).
• Manage and optimize Kubernetes clusters (EKS) across multiple environments, and deploy applications using Argo CD with GitOps best practices.
• Implement and maintain CI/CD pipelines using GitHub Actions, including reusable workflows, build/push/scan flows for ECR, and frontend deployment pipelines.
• Operate and tune Kafka-based event streaming on Amazon MSK for high-throughput, low-latency device data pipelines.
• Define and manage Infrastructure as Code with Terraform and Terragrunt, with reusable modules, sensible environment separation, and review-friendly plans.
• Manage identity and access across platforms with Auth0 / EntraID integrations, IAM roles for service accounts (IRSA), and short-lived credentials.
• Build and maintain observability with Grafana, Loki, Prometheus / Mimir, and related tooling so on-call engineers can quickly find and fix issues.
• Monitor and optimize infrastructure cost across environments, partnering with engineering teams on right-sizing, capacity planning, and waste reduction.
• Partner with our Cloud Security team to enforce security standards, integrate with SIEM tooling, and respond to vulnerabilities and incidents.
• Debug complex production issues across infrastructure, deployment, and networking layers, and turn the lessons learned into automation and runbooks.


Required Skills:
• 5+ years in DevOps, SRE, or Platform Engineering with production experience operating AWS infrastructure.
• Deep hands-on experience administering Kubernetes (EKS or equivalent) and deploying via GitOps (Argo CD or Flux).
• Proficiency with Infrastructure as Code using Terraform; comfort with Terragrunt or a similar wrapper.
• Hands-on experience designing and maintaining CI/CD pipelines, preferably with GitHub Actions and reusable workflows.
• Production experience operating distributed systems such as Kafka (MSK).
• Strong understanding of networking, DNS, TLS, and security best practices, including IdP-driven access control (Auth0, EntraID, or similar).
• Solid experience with monitoring and logging stacks such as Grafana, Loki, Prometheus, Mimir, or equivalents.
• Ability to debug complex production issues across infrastructure, deployment, and networking layers.
• Comfortable working in Linux environments with strong scripting skills (Python or Bash preferred for automation).
• Knowledge of version control workflows, automated testing, and release management.


Bonus Skills:
• Experience operating Apollo Router / GraphQL federation gateways in production.
• Experience operating Argo Workflows or similar Kubernetes-native job / pipeline runners in production.
• Familiarity with Databricks or MLOps pipelines for data and model deployment.
• Experience designing, operating, and exercising Disaster Recovery (DR) environments, including cross-region replication, backups, and tested failover runbooks.
• Experience with Tailscale or other zero-trust networking tools.
• Experience supporting IoT / embedded fleets at scale, including secure device-to-cloud connectivity.
• Experience in high-growth startup environments where you must wear many hats.


This describes the ideal candidate; many of us have picked up this expertise along the way. Even if you meet only part of this list, we encourage you to apply!

Benefits
Health, Dental & Vision (Gold and Platinum, with some providers' plans fully covered)
Paid parental leave
Alternating day off (every other Monday)
“Off the Grid”, a two-week paid break each year for all employees
Commuter allowance
Company-paid training
