Principal AI Evaluation Engineer
Workatbackbase
Posted: December 12, 2025
Job Description
About Backbase
As a Principal AI Evaluation Engineer, you will lead the evaluation efforts in our AI-powered SDLC team. You will own the evaluation strategy for AI assistants and agentic workflows, ensuring they are reliable, observable, and protected by strong guardrails. Beyond hands-on work, you will mentor engineers, lead triage and reporting, and make evaluation a cornerstone of release decisions.
Meet the job
• Define and lead the evaluation strategy and roadmap for the AI-powered SDLC core product.
• Build and oversee evaluation pipelines and guardrails.
• Build and maintain evaluation datasets (synthetic and real project data) to benchmark AI behavior.
• Analyze evaluation results, identify gaps, and produce clear, actionable reports for engineering and product stakeholders.
• Build a culture of innovation and excellence, encouraging continuous improvement and adoption of best practices in AI evaluation and deployment.
• Collaborate with cross-functional teams to integrate evaluation insights into development.
How about you?
• Strong understanding of software engineering principles and the software development lifecycle (SDLC).
• Hands-on experience with test design, test management, observability, and data analysis.
• Proficiency in Python (or another scripting language) for automating evaluations.
• Familiarity with AI agent evaluation methods (faithfulness, answer relevancy, contextual accuracy, tool correctness).
• Excellent analytical and problem-solving skills.
• Strong communication and collaboration abilities, able to work with cross-functional teams and stakeholders.
• Demonstrated ability to mentor engineering talent, fostering collaboration and technical excellence.
• (Nice to have) Experience with evaluation frameworks, RAG systems, or agentic workflows.