DevOps & ML Ops Engineer | Spain (BCN, Madrid, Malaga or Palma)
Confidential
Posted: January 30, 2026
Quick Summary
Develop and maintain scalable, stable services that deliver machine learning models to end users with guaranteed uptime.
Job Description
The DevOps & ML Ops Engineer will be responsible for developing and maintaining scalable, stable services that deliver machine learning models to end users with guaranteed uptime. The primary focus will be on the infrastructure, deployment, and continuous integration/continuous delivery (CI/CD) processes for our ML services.
RESPONSIBILITIES:
• Manage resource allocation and workload scheduling for multiple ML services, ensuring efficient utilization of CPU/GPU resources and creating reliable queues based on service priorities.
• Maintain VM environments, manage OS updates, and keep the VM inventory up to date.
• Work alongside the Dev and QA teams to detect hot spots in our applications and put preventative measures in place before they become live issues.
• Troubleshoot system configurations and provide solutions.
• Plan, execute, and test disaster recovery.
• Monitor and examine all application, performance, event, and system logs to assist in troubleshooting.
• File all IT/colocation tickets, ensure requests are fulfilled, and escalate to the right person if necessary.
• Design, develop, and maintain the infrastructure required for deploying and scaling machine learning services.
• Implement and manage the CI/CD pipelines to ensure seamless and efficient deployment of ML models.
• Collaborate with data scientists, ML researchers, and language experts to understand the requirements for deploying ML models and provide necessary infrastructure support.
• Automate and streamline the build, test, and deployment processes to enhance efficiency and reduce time-to-market.
• Monitor and optimize the performance, availability, and scalability of production ML systems.
• Develop and maintain robust monitoring, logging, and alerting systems to proactively identify and address issues.
• Implement security best practices to protect sensitive data and ensure compliance with relevant regulations.
• Stay up to date with industry trends and emerging technologies related to ML Ops and DevOps, and propose innovative solutions to improve our ML service delivery.