ARCHIVED
This job listing has been archived and is no longer accepting applications.

AI Software Engineer - Model Evaluation (f/m/d)

Aleph Alpha

Heidelberg, Baden-Württemberg, Germany | Remote | Permanent

Posted: February 26, 2026

Job Description

Our Mission

Aleph Alpha is one of the few companies in Europe doing serious foundation model pre-training. Our customers - in finance, manufacturing, public administration - need models that understand German, meet European regulatory requirements, and work reliably in high-stakes settings. We're building that in Heidelberg.

We're growing our pre-training team and hiring someone to own evaluation: defining what "better" means for our models, building the systems that measure it, and ensuring our training team has the signal it needs to iterate confidently.

The Role

As a Senior AI Engineer in Pre-training Evaluation, you will work across the full stack of evaluation - from methodology design to implementation to analysis. Some weeks you'll be deep in benchmark curation, understanding what a given eval actually measures and whether it predicts downstream performance. Other weeks you'll be optimising pipeline throughput or building dashboards that surface training signals.

We are looking for someone who combines significant research experience (in industry or academia) with strong engineering skills.

Your work sits at high leverage: the evaluations you design and build determine which training runs we pursue, which data mixtures we prioritise, and how we allocate compute. You'll have direct influence on the models we ship.

This role is for Aleph Alpha Research.

Your Responsibilities

• Own benchmarks end-to-end: Select, implement, and maintain the evaluation suite used during pre-training - from dataset curation to scoring infrastructure to result analysis.

• Build evaluation infrastructure: Develop and optimise the pipelines that run evaluations against training checkpoints, ensuring speed, reliability, and reproducibility.

• Design aggregation and reporting: Define how benchmark results translate into training decisions, and build the tooling that makes results interpretable.

• Close capability gaps: Work with product and post-training teams to identify where our models fall short, then create or integrate benchmarks that measure progress.

• Own German evaluation: Ensure rigorous assessment of German language capabilities - this is core to our value proposition, not an afterthought.

• Correlate signals: Establish which pre-training metrics actually predict downstream and system-level performance.

Your Profile

Basic Qualifications

• Experience with LLM evaluation, benchmark design, evaluation dataset curation, and experimental design.

• Familiarity with statistical methods for evaluation and experiment design.

• Track record of shipping impactful technical work - whether that's research, infrastructure, or both.

• Strong Python skills and comfort with ML tooling (PyTorch, evaluation frameworks, distributed systems).

• Ability to reason about what an evaluation measures and whether it matters - not just run benchmarks, but understand them.

• Ownership mentality: you see problems through from diagnosis to solution to deployment.

• Willingness to relocate to Heidelberg or travel regularly (potentially weekly).

Preferred Qualifications

• Understanding of foundation model training - how data, scale, and architecture affect capabilities.

• Experience with large-scale data processing or ML infrastructure.

• German language proficiency (helpful for evaluating German capabilities, not required).

• PhD in machine learning, NLP, statistics, or a related field (valued but not required - we care about what you can do).

Why This Role

This is a greenfield opportunity. We're standing up evaluation as a dedicated function within pre-training - small team, high ownership, embedded where decisions happen. You won't be running evals for other teams or maintaining legacy systems. You'll be shaping how Aleph Alpha measures progress on foundation models, with direct impact on what we build.
