Senior MLOps Engineer
IFM
Posted: November 3, 2025
Quick Summary
We are seeking a Senior MLOps Engineer to join our team in Abu Dhabi, UAE, to help build and deploy large-scale AI systems.
Job Description
About the Institute of Foundation Models (IFM)
The Institute of Foundation Models is a dedicated research lab for building, understanding, deploying, and risk-managing large-scale AI systems. We drive innovation in foundation models and their operationalization, empowering research, education, and industry adoption through scalable infrastructure and real-world applications.
As part of our engineering team, you will operate at the intersection of machine learning and systems design — building the cloud, orchestration, and deployment layers that power the next generation of intelligent applications at MBZUAI. You’ll work alongside world-class AI researchers and engineers to productionize LLMs, voice models, and multimodal systems at scale.
The Role
As a Senior MLOps Engineer, you will design, build, and maintain robust machine learning (ML) infrastructure across training, inference, and deployment pipelines. You will take ownership of the model lifecycle — from data ingestion to real-time serving — and ensure our LLM and speech models are deployed efficiently, securely, and reproducibly in Kubernetes-based environments.
This position requires deep hands-on experience with Kubernetes (EKS), Helm, AWS cloud infrastructure, and modern MLOps toolchains (e.g., vLLM, SGLang, OpenWebUI, Weights & Biases, MLflow). Familiarity with speech/voice AI frameworks like ElevenLabs, Whisper, and RVC is also valuable.
Key Responsibilities:
• Design and manage scalable ML infrastructure on AWS using EKS, EC2, RDS, S3, and IAM-based access control.
• Build and maintain Kubernetes deployments for LLM and TTS inference using Helm, ArgoCD, and Prometheus/Grafana monitoring.
• Implement and optimize model serving pipelines using vLLM, SGLang, TensorRT, or similar frameworks for high-throughput inference.
• Develop CI/CD and MLOps automation for data versioning, model validation, and deployment (GitHub Actions, Jenkins, or AWS CodePipeline).
• Integrate OpenWebUI, Gradio, or similar UIs for user-facing model demos and internal evaluation tools.
• Collaborate with ML researchers to productize models — including TTS (e.g., ElevenLabs API), ASR (Whisper), and LLM-based chat systems.
• Ensure observability, cost optimization, and reliability of cloud resources across multiple environments.
• Contribute to internal tools for dataset curation, model monitoring, and retraining pipelines.
• Maintain infrastructure-as-code using Terraform and Helm charts for reproducibility and governance.
• Support real-time multimodal workloads (voice, text, vision) across inference clusters.
Qualifications:
• 4+ years of experience in MLOps, DevOps, or Cloud Infrastructure Engineering for ML systems.
• Strong proficiency in Kubernetes, Helm, and container orchestration.
• Experience deploying ML models via vLLM, SGLang, TensorRT, or Ray Serve.
• Proficiency with AWS services (EKS, EC2, S3, RDS, CloudWatch, IAM).
• Solid experience with Python, Docker, Git, and CI/CD pipelines.
• Strong understanding of model lifecycle management, data pipelines, and observability tools (Grafana, Prometheus, Loki).
• Excellent collaboration skills with ML researchers and software engineers.
Preferred Experience:
• Extensive experience with vLLM, Kubernetes, ElevenLabs, Whisper, Gradio/OpenWebUI, or custom TTS/ASR model hosting.
• Familiarity with multi-GPU scheduling, NCCL optimization, and HPC cluster integration.
• Knowledge of security, cost management, and network policy in multi-tenant Kubernetes clusters and Cloudflare-fronted systems.
• Prior work in LLM deployment, fine-tuning pipelines, or foundation model research.
• Exposure to data governance and responsible AI operations in research or enterprise settings.