Senior Technical Consultant, Observability
Thinkahead
Posted: May 20, 2026
Quick Summary
We are looking for a Senior Technical Consultant to join our team, where you will be responsible for developing and implementing Observability solutions for our clients. The ideal candidate should have experience in cloud infrastructure, automation, and analytics, and be able to work with teams in various locations.
Job Description
AHEAD builds platforms for digital business. By weaving together advances in cloud infrastructure, automation and analytics, and software delivery, we help enterprises deliver on the promise of digital transformation.
At AHEAD, we prioritize creating a culture of belonging, where all perspectives and voices are represented, valued, respected, and heard. We create spaces to empower everyone to speak up, make change, and drive the culture at AHEAD.
We are an equal opportunity employer, and do not discriminate based on an individual's race, national origin, color, gender, gender identity, gender expression, sexual orientation, religion, age, disability, marital status, or any other protected characteristic under applicable law, whether actual or perceived.
We embrace all candidates that will contribute to the diversification and enrichment of ideas and perspectives at AHEAD.
AHEAD is looking for a Senior Technical Consultant to help clients modernize how they observe, operate, and improve complex digital platforms. This is a hybrid consultant-architect role for someone who can lead discovery, shape strategy, design practical solutions, and help drive delivery from plan through execution.
The right candidate brings strong observability depth, a vendor-agnostic and prescriptive approach, and the ability to connect application performance, user experience, and business outcomes. You should be comfortable working across open-source and commercial observability ecosystems, guiding clients through current-state assessments, future-state roadmaps, and implementation programs that improve resilience, operational maturity, and developer experience.
Roles and Responsibilities:
• Lead client discovery sessions, assessments, and workshops focused on observability, telemetry, reliability, and operational maturity.
• Define target-state observability architectures and roadmaps aligned to cloud, platform engineering, SRE, and AIOps initiatives.
• Design and guide implementation of solutions across metrics, logs, traces, dashboards, alerting, and incident workflows.
• Help clients adopt modern telemetry practices using OpenTelemetry and related open-source, cloud-native, and enterprise toolsets.
• Build or refine dashboards, alerts, service views, and operational integrations that improve visibility, context, and signal quality.
• Improve alert quality, incident triage, escalation paths, runbooks, and post-incident feedback loops by connecting telemetry to actionable operational workflows.
• Partner with engineering, platform, operations, and leadership stakeholders to align observability investments to business priorities.
• Translate observability capabilities into measurable outcomes such as improved service reliability, faster incident detection and resolution, reduced alert noise, improved user experience, and stronger platform adoption.
• Advise clients on telemetry governance, data quality, retention, cardinality, access, and cost optimization to improve signal value while reducing operational and platform waste.
• Contribute to implementation planning, solution governance, documentation, enablement, and operational handoff.
• Support adoption of modern integration patterns across observability, ITSM, incident management, collaboration, workflow, and AI-assisted operations platforms.
• Provide hands-on technical leadership where needed, including configuration guidance, implementation oversight, design validation, troubleshooting support, and quality review of delivery outputs.
• Mentor delivery teams and help build reusable patterns, accelerators, and best practices within the practice.
Experience:
• 6+ years of experience in consulting, engineering, SRE, platform engineering, or operations with strong observability responsibility.
• Hands-on experience with modern observability concepts across metrics, logs, traces, alerting, service health, and incident response.
• Experience with OpenTelemetry and familiarity with telemetry pipeline design, instrumentation patterns, and collection architecture.
• Working knowledge of open-source observability tools such as Prometheus, Grafana, Loki, Tempo, Mimir, Jaeger, or Elastic.
• Experience with one or more enterprise observability platforms such as Datadog, Dynatrace, Splunk, New Relic, Elastic, LogicMonitor, Honeycomb, or Chronosphere.
• Strong understanding of Kubernetes, containers, cloud-native architectures, and at least one major public cloud platform.
• Experience with Terraform, Helm, CI/CD pipelines, Ansible, or related automation and platform tooling.
• Solid understanding of distributed systems, modern application architectures, and operational best practices across SRE, DevOps, or ITSM environments.
• Experience defining service health models, SLIs, SLOs, alerting strategies, or reliability measurement frameworks that connect technical telemetry to operational and business outcomes.
• Ability to lead client-facing conversations, structure ambiguous problems, and translate strategy into executable workstreams.
• Demonstrated ability to operate in ambiguous client environments, identify the practical path forward, and communicate tradeoffs clearly across engineering, operations, and leadership audiences.
• Strong written and verbal communication skills with both technical and executive audiences.
• A continuous learning mindset and a collaborative approach to delivery and practice building.
Nice to have:
• Experience operationalizing error budgets, burn-rate alerting, production readiness reviews, or reliability governance practices.
• Experience integrating observability with ServiceNow, incident management platforms, CMDB, or collaboration tools.
• Experience with telemetry cost optimization, sampling strategies, retention policies, tagging standards, cardinality management, or observability governance.
• Familiarity with platform engineering, internal developer platforms, service catalogs, golden paths, or self-service observability patterns.
• Exposure to AIOps, anomaly detection, event correlation, automated remediation workflows, or MCP-enabled integration patterns for AI-assisted operations.
• Familiarity with microservices, service mesh technologies, end-user monitoring, or digital experience monitoring concepts.
• Generalist coding or scripting experience in languages such as Python, Java, Go, JavaScript, or .NET.
Why AHEAD:
Through our daily work and internal groups like Moving Women AHEAD and RISE AHEAD, we value and benefit from diversity of people, ideas, experience, and everything in between.
We fuel growth by stacking our office with top-notch technologies in a multi-million-dollar lab, encouraging cross-department training and development, and sponsoring certifications and credentials for continued learning.
India Employment Benefits include:
• Comprehensive health insurance coverage for employees, with options to extend coverage to dependents
• Paid time off and company holidays, along with additional leave benefits as per policy
• Flexible work arrangements, supporting work-life balance
• Learning and development opportunities to support continuous growth and upskilling
• Employee wellness initiatives and programs focused on physical and mental well-being
• Retirement and statutory benefits in line with India regulations
• Inclusive and people-first culture, with a strong focus on collaboration and ownership