ML Infrastructure/Platform Engineer
Anthelioncap
Posted: April 10, 2026
Quick Summary
Anthelion is hiring an ML Infrastructure/Platform Engineer to help build the proprietary AI and data platform that powers its investment lifecycle.
Job Description
About Anthelion
Anthelion is a next-generation investment firm building a proprietary AI and data platform that powers our investment lifecycle from underwriting to portfolio management. The platform integrates structured and unstructured data, advanced analytics, and automated workflows to drive superior, risk-adjusted returns in private credit and structured finance.
We are engineers and investors working together to redefine how institutional investment decisions are made — faster, smarter, and more transparent.
The Role
We are looking for an ML Infrastructure/Platform Engineer to work on the foundational systems that power our data science and AI platform.
You will work across the infrastructure layer beneath our ML and AI workflows: data pipelines, orchestration, compute provisioning, model serving, and observability. You will also play a key role in operationalizing our agentic AI platform, ensuring agents are hosted, monitored, and integrated into production-grade systems.
What You’ll Do
Data Pipelines & Orchestration
• Design, build, and maintain production data pipelines that ingest, transform, and deliver structured and unstructured data to downstream ML workflows.
• Own and extend our Prefect-based orchestration layer, including flow scheduling, error handling, retry logic, and human-in-the-loop (HITL) suspend/resume patterns.
• Build and maintain feature stores, data contracts, and promotion workflows that ensure data quality and traceability from raw ingestion through model consumption.
• Collaborate with data scientists to operationalize experimental workflows into reliable, repeatable pipelines.
ML/AI Infrastructure & Deployment
• Build and maintain scalable infrastructure for model training, retraining, and inference (batch and real-time), including GPU compute provisioning and container orchestration.
• Implement and manage model serving infrastructure — including containerized endpoints, API gateways, and self-serve deployment frameworks for the data science team.
• Deploy and manage monitoring systems that track model health, data drift, prediction consumption, and pipeline reliability.
• Ensure all deployed systems are highly available, resilient, and well-documented with clear data lineage and runbooks.
Agentic AI Platform & Tooling
• Support the buildout and operationalization of agentic AI workflows, including agent hosting, lifecycle management, and integration with Model Context Protocol (MCP) servers.
• Build shared tooling and infrastructure that enables data scientists to develop, test, and deploy agents with minimal friction.
• Design and implement evaluation frameworks and quality standards for AI agents, including automated benchmarking, regression testing, and production-readiness criteria.
• Ensure observability and reliability across agent execution environments, including logging, tracing, and performance monitoring.
DevOps & Platform Engineering
• Deploy, configure, and maintain shared AI platform services (e.g., observability tools, memory layers, evaluation platforms) as containerized workloads on Azure — including end-to-end ownership of networking, access, and connectivity between services.
• Manage cloud infrastructure (Azure) including container registries, managed identities, Key Vault secrets, storage backends, and virtual network configurations.
• Maintain CI/CD pipelines, branch protection policies, and release management workflows across data science repositories.
• Continuously evaluate and adopt tools and technologies that improve platform reliability, developer experience, and team velocity.
What We’re Looking For
Required
• 3+ years of experience in data engineering, MLOps, or ML infrastructure roles — with a clear track record of building and maintaining production data and ML pipelines.
• Strong proficiency in Python and SQL, with hands-on experience building ETL/ELT pipelines and data transformation workflows.
• Experience with workflow orchestration tools (Prefect, Airflow, Dagster, or similar) in production environments.
• Solid understanding of containerization and cloud infrastructure — Docker, Kubernetes, and at least one major cloud provider (Azure preferred).
• Hands-on experience deploying and operating containerized services in cloud environments, including configuring networking, load balancing, and service-to-service connectivity.
• Experience with model serving and deployment patterns (batch inference, real-time APIs, feature stores).
• Familiarity with monitoring and observability tooling for pipelines and deployed models (data drift detection, health metrics, alerting).
• Strong documentation habits and the ability to communicate technical architecture clearly to diverse stakeholders.
Preferred
• Experience with Azure services: Container Apps, ACI, ACR, Blob Storage, Key Vault, Managed Identities, VNets.
• Familiarity with Prefect (especially cloud-managed work pools, result backends, and HITL patterns).
• Experience with dbt, Snowflake, or similar data transformation and warehousing tools.
• Exposure to LLM serving infrastructure and agentic workflow frameworks (e.g., MCP, LangChain, or similar).
• Experience standing up and maintaining third-party AI/ML platform tools (e.g., Langfuse, MLflow, or similar observability and evaluation platforms).
• Experience managing internal Python package distribution (private PyPI, Artifactory, or similar).
• Familiarity with Git-based release management, branch protection, and CI/CD for data science repos.
Why Join Anthelion
• Build at the frontier of AI, data, and finance — where infrastructure directly shapes institutional investment decisions.
• Work on greenfield architecture with high autonomy and technical depth.
• Collaborate with a multidisciplinary team of data scientists, engineers, and investors.
• Culture grounded in technical excellence, transparency, and measurable impact.
Benefits
• Comprehensive health, dental, and vision insurance.
• Retirement savings plan with company match.
• Hybrid/flexible work arrangements and a supportive work environment.
Culture
• Demonstrates a strong bias for action and executes quickly with limited guidance.
• Takes full ownership of outcomes and drives problems to resolution.
• Approaches challenges with a solutions-first mindset and delivers measurable results.
• Maintains composure under pressure while keeping momentum and focus.
• Simplifies complex issues into clear, actionable steps that move the work forward.
Salary Range: $140,000 to $200,000 per year