Senior MLOps Engineer
IFM
Posted: November 3, 2025
Quick Summary
We are seeking a Senior MLOps Engineer to join our team in Abu Dhabi, UAE, to help build and deploy large-scale AI systems.
Job Description
About the Institute of Foundation Models (IFM)
The Institute of Foundation Models is a dedicated research lab for building, understanding, deploying, and risk-managing large-scale AI systems. We drive innovation in foundation models and their operationalization, empowering research, education, and industry adoption through scalable infrastructure and real-world applications.
As part of our engineering team, you will operate at the intersection of machine learning and systems design — building the cloud, orchestration, and deployment layers that power the next generation of intelligent applications at MBZUAI. You’ll work alongside world-class AI researchers and engineers to productionize LLMs, voice models, and multimodal systems at scale.
The Role
As a Senior MLOps Engineer, you will design, build, and maintain robust machine learning (ML) infrastructure across training, inference, and deployment pipelines. You will take ownership of the model lifecycle — from data ingestion to real-time serving — and ensure our LLM and speech models are deployed efficiently, securely, and reproducibly in Kubernetes-based environments.
This position requires deep hands-on experience with Kubernetes (EKS), Helm, AWS cloud infrastructure, and modern MLOps toolchains (e.g., vLLM, SGLang, OpenWebUI, Weights & Biases, MLflow). Familiarity with speech/voice AI frameworks like ElevenLabs, Whisper, and RVC is also valuable.
Key Responsibilities:
• Design and manage scalable ML infrastructure on AWS using EKS, EC2, RDS, S3, and IAM-based access control.
• Build and maintain Kubernetes deployments for LLM and TTS inference using Helm, ArgoCD, and Prometheus/Grafana monitoring.
• Implement and optimize model serving pipelines using vLLM, SGLang, TensorRT, or similar frameworks for high-throughput inference.
• Develop CI/CD and MLOps automation for data versioning, model validation, and deployment (GitHub Actions, Jenkins, or AWS CodePipeline).
• Integrate OpenWebUI, Gradio, or similar UIs for user-facing model demos and internal evaluation tools.
• Collaborate with ML researchers to productize models — including TTS (e.g., ElevenLabs API), ASR (Whisper), and LLM-based chat systems.
• Ensure observability, cost optimization, and reliability of cloud resources across multiple environments.
• Contribute to internal tools for dataset curation, model monitoring, and retraining pipelines.
• Maintain infrastructure-as-code using Terraform and Helm charts for reproducibility and governance.
• Support real-time multimodal workloads (voice, text, vision) across inference clusters.
Qualifications:
• 4+ years of experience in MLOps, DevOps, or Cloud Infrastructure Engineering for ML systems.
• Strong proficiency in Kubernetes, Helm, and container orchestration.
• Experience deploying ML models via vLLM, SGLang, TensorRT, or Ray Serve.
• Proficiency with AWS services (EKS, EC2, S3, RDS, CloudWatch, IAM).
• Solid experience with Python, Docker, Git, and CI/CD pipelines.
• Strong understanding of model lifecycle management, data pipelines, and observability tools (Grafana, Prometheus, Loki).
• Excellent collaboration skills with ML researchers and software engineers.
Preferred Experience:
• Extensive experience with vLLM, Kubernetes, ElevenLabs, Whisper, Gradio/OpenWebUI, or custom TTS/ASR model hosting.
• Familiarity with multi-GPU scheduling, NCCL optimization, and HPC cluster integration.
• Knowledge of security, cost management, and network policy in multi-tenant Kubernetes clusters and Cloudflare-fronted systems.
• Prior work in LLM deployment, fine-tuning pipelines, or foundation model research.
• Exposure to data governance and responsible AI operations in research or enterprise settings.