Senior DevOps Engineer - Highload, Cloud & Data-Intensive Systems (EU / Remote)
Alex Staff Agency
Posted: February 18, 2026
Quick Summary
Develop and maintain high-load, cloud, and data-intensive systems, with a focus on stability, observability, and automation.
Job Description
About the project
The team develops and maintains distributed services for analytics, APIs, and transaction monitoring. The systems process very large volumes of data: terabytes of storage, trillions of records, and a continuously growing load.
Infrastructure:
~100 servers (bare metal + VPS)
active use of IaC
Kubernetes clusters in production
focus on stability, observability, and automation
The project is long-term — not a hype startup, but a mature product with real users.
What the work looks like
This is a hands-on role with a clear time allocation:
60% — operations and incidents (including helping teams)
20% — infrastructure automation
20% — prototyping, improvements, technical initiatives
The role includes on-call duty, but after-hours incidents normally happen only 2–3 times a year, not every week.
Responsibilities
Operation of production services and infrastructure (server provisioning/decommissioning, updates, replacements, performance troubleshooting)
Support and development of Infrastructure as Code (Terraform / Ansible: modules, roles, standards, reviews)
Monitoring, alerting, backups, and regular recovery checks
Development of service and infrastructure automation
Development of CI/CD and release procedures
Incident diagnosis and resolution, support for product teams
Traffic analytics, bot and attack protection tools
Responsibility for 24/7 platform stability
Requirements:
What’s important
4+ years of experience operating Linux/Ubuntu infrastructure and production services
Strong understanding of networking and troubleshooting
Kubernetes (cluster operations), Rancher, Docker / containerd
Hands-on experience with Ansible and Terraform
Monitoring: Prometheus / Thanos / Telegraf / Grafana / Sentry
CI/CD: Jenkins
Automation: Bash, Python
Experience working with LVM
Nice to have
Experience working with blockchain nodes
Diagnosis and tuning of ClickHouse and MongoDB in high-load clusters
Providers: Hetzner / OVHcloud
Cloudflare (edge, DDoS), experience with AWS
Handling abuse tickets with hosting providers
Technology stack
VPN: WireGuard, OpenVPN
Databases: ClickHouse, MongoDB, Redis, PostgreSQL
Applications: Node.js (pm2), php-fpm, Lua, Tarantool
Supporting services: Go (Operator SDK), Ruby, Node.js, PHP
Benefits:
5,000 – 8,000 € net
Format: office / hybrid / remote
Location: Spain (Barcelona and suburbs) or remote (CET ±2)
Full-time
Opportunity to genuinely influence architecture and processes
Mature engineering team and reasonable expectations