ARCHIVED
This job listing has been archived and is no longer accepting applications.

ML Infra Engineer - Platform

Physical Intelligence

San Francisco, California, United States (permanent)

Posted: January 28, 2026

Quick Summary

We are looking for an ML Infra Engineer to join our Infrastructure team, which builds and operates the platform used to train, evaluate, and deploy AI models on physical robots.

Job Description

Who We Are

Physical Intelligence is bringing general-purpose AI into the physical world. We are a team of engineers, scientists, roboticists, and company builders developing foundation models and learning algorithms to power the robots of today and the physically-actuated devices of the future.

The Team

The Infrastructure team builds and operates the backbone of everything PI does: from training state-of-the-art VLA models, to orchestrating large-scale simulation, to reliably deploying intelligence across fleets of physical robots. The team works closely with researchers, robotics runtime, product, and platform engineers to ensure infrastructure scales from prototype to production-grade deployments.

In This Role You Will

• Own core cloud platform infrastructure: You will design, build, and operate CPU compute platforms (such as Anyscale and similar systems), including cluster lifecycle management, capacity planning, quotas, and scheduling primitives. A key goal is making it straightforward and predictable to bring up new clusters and environments as needs evolve.

• Build and scale platform systems: You will operate and evolve Kubernetes clusters and service deployment patterns, and help build a scalable microservice platform for internal systems such as evaluation services, operational tooling, and internal APIs. This includes supporting safe rollouts, upgrades, and rollback strategies.

• Own workflow orchestration infrastructure: You will take platform-level ownership of async and multi-stage workflows, ensuring they are reliable, observable, and easy to extend. These workflows power large-scale evaluation, data processing, and long-running infrastructure tasks.

• Drive observability and cost-aware infrastructure: You will treat logging, metrics, tracing, and alerting as first-class platform primitives, and build systems that surface reliability and performance issues early. You will also help improve cost visibility and enable cost-aware decision-making at the infrastructure level.

• Harden cloud foundations: You will own cloud-first infrastructure with multi-cloud considerations, designing networking, DNS, quotas, and cloud primitives that behave predictably. A major part of this work is reducing infra churn by standardizing patterns, abstractions, and interfaces.

• Improve developer experience: You will build clear, documented interfaces for using platform infrastructure, reducing the gap between “I need infra” and “I can run my workload.” This includes supporting consistent local vs. remote development workflows and improving self-serve infrastructure usage.

• Collaborate and lead through ownership: You will work closely with researchers and infra peers to understand requirements and constraints, translate fast-moving needs into reusable infrastructure, and own systems end-to-end, from design through operation.

What We Hope You'll Bring

• Deep experience with cloud platforms (GCP, AWS) and distributed systems: compute orchestration, networking, autoscaling, service meshes, load balancing.

• Ability to reason about system bottlenecks, performance tuning, and cost optimizations across compute, networking, and storage.

• Comfort with Kubernetes, cluster-level reliability, and service-oriented architectures.

• Solid intuition around scalability, performance, and failure modes.

• Experience with infrastructure-as-code (e.g. Terraform), containerization, and modern platform engineering practices.

• Familiarity with logging, metrics, tracing, incident response, SLOs, and debugging complex distributed systems.

• Ability to take full ownership of systems and operate them in production.

• Strong cross-functional communication and ownership mindset.

• 2-5 years of experience in fast-moving or early-stage environments where ambiguity is normal, with a demonstrated growth trajectory.

Bonus Points If You Have

• Experience with large-scale ML training, evaluation, or simulation infrastructure.

• Familiarity with workflow orchestration systems (e.g. Temporal).

• Experience with secrets management systems (e.g., Doppler).

• Experience designing shared compute platforms or quota systems.

• Background in observability, cost optimization, or internal platform tooling.

• Exposure to robotics, simulation, or real-time systems.
