ARCHIVED
This job listing has been archived and is no longer accepting applications.
Staff Machine Learning Infrastructure Engineer

Dyna Robotics

Redwood City, California, United States (permanent)

Posted: January 27, 2026

Quick Summary

Dyna Robotics is a leading provider of general-purpose robots powered by proprietary AI foundation models that generalize and self-improve across varied environments with commercial-grade performance.

Job Description

Company Overview:

Dyna Robotics makes general-purpose robots powered by a proprietary embodied AI foundation model that generalizes and self-improves across varied environments with commercial-grade performance. Dyna's robots have been deployed at customers across multiple industries. Its frontier model has the top generalization and performance in the industry.

Dyna Robotics was founded by repeat founders Lindon Gao and York Yang, who sold Caper AI for $350 million, and former DeepMind research scientist Jason Ma. The company has raised over $140M, backed by top investors, including CRV and First Round. We're positioned to redefine the landscape of robotic automation. Join us to shape the next frontier of AI-driven robotics!

Learn more at dyna.co

Position Overview:

We are seeking an experienced Machine Learning Infrastructure Engineer to join our team and help scale our ML training platform. In this role, you will be responsible for designing, implementing, and maintaining large-scale ML infrastructure to accelerate model iteration and improve training performance across an expanding GPU ecosystem. You will work on cutting-edge high-performance computing systems, optimizing distributed training environments, and ensuring system reliability as we scale.

Key Responsibilities:

• Infrastructure Design & Scalability:

• Architect and implement large-scale ML training pipelines that leverage parallel GPU processing on platforms like GCP or AWS.

• Enhance our existing infrastructure to fully exploit parallelism and design for future expansion, ensuring that our system is ready to support growth.

• High-Performance ML Computing & Distributed Systems:

• Manage and optimize high-performance computing resources.

• Develop robust distributed computing solutions, addressing challenges like race conditions, memory optimization, and resource allocation.

• Optimize model training with techniques such as mixed precision, ZeRO, and LoRA.

• Job Scheduling & Reliability:

• Design systems for job rescheduling, automated retries, and failure recovery to maximize uptime and training efficiency.

• Implement intelligent job queuing mechanisms to optimize training workloads and resource utilization.

• Storage & Data Handling:

• Evaluate tradeoffs between local and networked storage solutions, and implement the options that improve data throughput and access.

• Develop strategies for caching training data to optimize performance.

• Collaboration & Continuous Improvement:

• Work closely with ML researchers and data scientists to understand training requirements and bottlenecks.

• Continuously monitor system performance, identify areas for improvement, and implement best practices to enhance scalability and reliability.

Required Qualifications:

• Bachelor’s degree or higher in Computer Science or a related field.

• At least 7 years of professional experience in the software industry, with a minimum of 2 years in a tech lead role.

• Proven experience with high-performance computing environments and distributed systems.

• Demonstrated ability to scale ML training systems and optimize resource utilization.

• Hands-on experience with job scheduling systems and managing cloud GPU environments (GCP, AWS, etc.).

• Deep understanding of distributed computing concepts, including race conditions, memory optimization, and parallel processing.

• Hands-on experience in ML model tuning for performance.

• Experience with common ML training and inference tools such as PyTorch, TensorRT, Triton, and Accelerate.

• Strong analytical and problem-solving skills with the ability to troubleshoot complex system issues.

• Excellent communication skills to collaborate effectively with cross-functional teams.

Preferred Qualifications:

• Experience with container orchestration tools (e.g., Kubernetes) and infrastructure-as-code frameworks.

If you're passionate about building scalable ML systems and optimizing high-performance computing infrastructures, we'd love to hear from you.
