ML Ops Engineer (Boston, MA)
Foundation Llm Technologies
Posted: May 14, 2026
Quick Summary
An experienced ML Ops Engineer is needed to architect, build, and maintain end-to-end ML pipelines on Google Cloud and AWS, including logging, monitoring, and alerting for model performance and data drift, as well as automating CI/CD using GitHub Actions or equivalent.
Job Description
Requirements:
• Architect, build, and operate end-to-end ML pipelines for training, validation, and deployment on Google Cloud and AWS.
• Define, instrument, and maintain logging, monitoring, and alerting for model performance and data drift.
• Automate CI/CD for ML artifacts and infrastructure using GitHub Actions or equivalent.
• Collaborate with cross-functional teams, including frontend engineers, backend engineers, research engineers, and infrastructure engineers.
• Write clean, well-documented, fast, and maintainable code.
• Help ensure our systems have high availability and performance.
About Us:
We are an MIT-born, venture-backed Silicon Valley startup building a real-life 'Jarvis'—an AI Copilot for design and manufacturing. Our goal is to utilize advanced AI, physics simulation, and computer graphics to reduce costs and improve engineering productivity across all steps of the design and manufacturing process.
What we're looking for:
• BS in Computer Science or a related field.
• 5+ years of experience as an AI/ML Ops, DevOps, or Infrastructure Engineer, or in an equivalent role.
• Expert-level Python and TypeScript skills.
• Experience with Docker, Kubernetes, Terraform, Google Cloud and AWS.
• Deep understanding of machine learning models, including LLMs.
• Experience designing and maintaining CI/CD pipelines to fine-tune or train ML models.
• Excellent written and verbal communication skills.
Bonus Points:
• Experience in computer graphics or physics-based simulation.
• Background in setting up Prometheus/Grafana, ELK, or similar monitoring stacks.
• Experience with Vertex AI.
• Experience working with custom Domain-Specific Languages.
Our tech stack:
• Google Cloud, AWS
• Python, TypeScript
• Protobuf, gRPC
• Next.js, React.js
• GitHub Actions
• Docker, Kubernetes, Spinnaker
• PostgreSQL