Data Infrastructure Engineer (Batch / Lake / Streaming) - LATAM Remote: Colombia / Costa Rica - Full-time
Confidential
Posted: January 30, 2026
Job Description
- This position is open to candidates located in Colombia or Costa Rica only -
About the role
We are looking for Data Infrastructure Engineers to work on large-scale data platforms for one of our enterprise clients. Depending on your background and interests, you may focus on Batch Infrastructure, Data Lake / Lakehouse Infrastructure, or Streaming Infrastructure.
This role is ideal for engineers who enjoy operating and evolving data platforms in production, improving reliability, automation, and performance across distributed systems.
What You’ll Do
Operate and support production-grade data infrastructure across batch, lakehouse, or streaming environments
Improve reliability through automation, monitoring, observability, and incident response
Build, maintain, and optimize data pipelines and platform tooling
Support upgrades, migrations, and platform evolution initiatives
Collaborate closely with data engineers, analytics teams, and platform stakeholders
Handle keep-the-lights-on (KTLO) work, including maintenance, troubleshooting, and documentation
Apply infrastructure best practices using cloud-native and Infrastructure-as-Code (IaC) approaches, as sketched below
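For a concrete flavor of the IaC side of the role, here is a minimal sketch using Pulumi's Python SDK; the bucket name and stack output are hypothetical, and any of the IaC tools listed under Nice to Have would serve equally well:

```python
import pulumi
import pulumi_aws as aws

# Hypothetical bucket for pipeline artifacts; assumes the pulumi_aws
# provider is installed and AWS credentials are configured.
artifacts = aws.s3.Bucket(
    "pipeline-artifacts",
    versioning=aws.s3.BucketVersioningArgs(enabled=True),  # keep object history
)

# Expose the bucket id as a stack output so other stacks and pipelines
# can reference it without hardcoding names.
pulumi.export("artifacts_bucket", artifacts.id)
```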
Role Variants
Depending on your experience, you may work primarily in one of these areas; a short illustrative code sketch follows each one:
Batch Infrastructure
Workflow orchestration and scheduling
Batch pipelines and dependency management
Tools such as Airflow, Luigi, Temporal, Dagster
Python-based automation and platform tooling
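For illustration, a minimal Airflow DAG of the kind this variant operates day to day; the DAG id, schedule, and task bodies are hypothetical placeholders (assumes Airflow 2.4+):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Placeholder: in practice this would pull data from a source system.
    print("extracting...")


def load():
    # Placeholder: in practice this would write to the warehouse or lake.
    print("loading...")


with DAG(
    dag_id="nightly_ingest",          # hypothetical pipeline name
    start_date=datetime(2026, 1, 1),
    schedule="@daily",                # one run per day
    catchup=False,                    # do not backfill missed intervals
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> load_task         # load runs only after extract succeeds
```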
Lake / Lakehouse Infrastructure
Data lake and lakehouse platforms
Distributed processing with Spark / Databricks
Technologies such as Iceberg, Delta Lake, Snowflake, BigQuery
Performance tuning, governance, and data reliability
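As a sketch of routine lakehouse reliability work, the PySpark job below deduplicates a raw Delta table and rewrites it partitioned for query performance; the paths, column names, and the choice of Delta over Iceberg are hypothetical:

```python
from pyspark.sql import SparkSession

# Assumes a Spark cluster with the Delta Lake connector configured.
spark = SparkSession.builder.appName("events-dedup-compaction").getOrCreate()

# Read the raw table (hypothetical path).
events = spark.read.format("delta").load("s3://example-lake/raw/events")

# Deduplicate on the event key, a common data-reliability task.
deduped = events.dropDuplicates(["event_id"])

# Rewrite partitioned by date so downstream queries can prune partitions.
(
    deduped.write.format("delta")
    .mode("overwrite")
    .partitionBy("event_date")
    .save("s3://example-lake/curated/events")
)

spark.stop()
```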
Streaming Infrastructure
Real-time or near-real-time data pipelines
Event-driven architectures and streaming systems
Technologies such as Kafka, Flink, Spark Streaming, Kinesis
Low-latency processing and streaming observability
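And for the streaming variant, a minimal consumer sketch using the kafka-python client; the topic, broker address, and group id are hypothetical:

```python
from kafka import KafkaConsumer

# Assumes the kafka-python package and a reachable broker.
consumer = KafkaConsumer(
    "orders",                           # hypothetical topic
    bootstrap_servers="localhost:9092",
    group_id="orders-monitor",          # consumer group for offset tracking
    auto_offset_reset="earliest",       # start from the oldest unread message
    enable_auto_commit=True,
)

for message in consumer:
    # Partition/offset metadata like this feeds consumer-lag dashboards,
    # a staple of streaming observability.
    print(f"partition={message.partition} offset={message.offset} value={message.value!r}")
```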
Required Qualifications
Strong experience with Python or Java
Hands-on experience operating distributed systems in production
Experience working with cloud platforms (AWS, GCP, or Azure)
Familiarity with data infrastructure or data platform concepts
Comfortable working on reliability, maintenance, and operational tasks
Ability to collaborate with cross-functional and client-facing teams
Advanced English proficiency to work directly with clients
Nice to Have
Experience with orchestration tools (Airflow, Dagster, Temporal, Luigi)
Experience with Spark, Databricks, or lakehouse technologies
Experience with Kafka, Flink, or streaming platforms
Infrastructure as Code (Terraform, CloudFormation)
Kubernetes and containerized workloads
Observability tooling (Prometheus, Grafana, Datadog, etc.)