ARCHIVED
This job listing has been archived and is no longer accepting applications.
MisuJob - AI Job Search Platform MisuJob

Research Scientist, Frontier, Zurich

Deepmind

Zurich, Switzerland permanent

Posted: January 16, 2026

Interested in this position?

Create a free account to apply with AI-powered matching

Job Description

Snapshot

At Google DeepMind, we foster an environment where ambitious, long-term research flourishes. Our team is tackling one of the hardest problems in modern AI: Post-training Frontier models. Unlike smaller models that can rely on distillation, our frontier models require novel training signals to advance the state of the art. We are defining the horizontal recipes—from revamping RL prompts to advancing Reward Models (RM) —that allow these models to "think" better, reason deeper, and align more closely with human intent. We believe that mastering the feedback loop between user signals and model behavior is the key to breaking through current performance plateaus.

About Us

Artificial Intelligence could be one of humanity’s most useful inventions. At Google DeepMind, we’re a team of scientists, engineers, machine learning experts and more, working together to advance the state of the art in artificial intelligence. We use our technologies for widespread public benefit and scientific discovery, and collaborate with others on critical challenges, ensuring safety and ethics are the highest priority.

The Role

We are seeking a Research Scientist or Engineer to lead the development of next-generation post-training recipes for Gemini. In this role, you will move beyond standard tuning; you will architect the Reward Modeling and Reinforcement Learning strategies that define how our most capable models learn. You will focus specifically on "hard" capabilities—such as improving chain-of-thought reasoning and complex instruction following—where synthetic data and distillation fall short. You will work horizontally to ensure these recipes scale across text, audio, and multimodal domains, establishing the gold standard for how Gemini evolves.

Key responsibilities:

• Frontier Recipe Development: Design and validate novel post-training pipelines (SFT, RLHF, RLAIF) specifically for frontier-class models where no "teacher" model exists.

• Advance Reward Modeling: Lead research into next-gen Reward Models, including investigating new architectures, reducing reward hacking, and improving signal-to-noise ratios in preference data.

• Unlock "Thinking" Capabilities: innovative methods to improve the model's internal reasoning (chain-of-thought), focusing on correctness, logic, and self-correction in multi-step tasks.

• Revamp RL Paradigms: critically re-evaluate and optimize RL prompts and feedback mechanisms to extract maximum performance from the underlying base models.

• Solve the "Flywheel" Challenge: create robust mechanisms to turn user signals and interactions into training data that continuously improves the model without introducing regression or bias.

Horizontal Impact: collaborate across teams to apply these advanced recipes to various model sizes and modalities (e.g., Audio), ensuring consistent high-quality behavior.

About You

In order to set you up for success as a Research Scientist at Google DeepMind, we look for the following skills and experience:

• PhD in machine learning, artificial intelligence, or computer science (or equivalent practical experience).

• Strong background in Large Language Models (LLMs), Reinforcement Learning (RL), or preference learning.

• Research interest in aligning AI systems with human feedback and utility.

• Familiarity with experiment design and analyzing large-scale user data.

• Strong coding and communication skills.

Preferred requirements

• Experience with RLHF (Reinforcement Learning from Human Feedback) or DPO (Direct Preference Optimization).

• Experience building or improving reward models and conducting human evaluation studies.

• A proven track record of publications in top-tier conferences (e.g., NeurIPS, ICML, ICLR).

• Experience with Chain-of-Thought (CoT) reasoning research or process-based supervision.

• Deep understanding and experience training models from scratch or using self-play/self-improvement techniques.

At Google DeepMind, we value diversity of experience, knowledge, backgrounds and perspectives and harness these qualities to create extraordinary impact. We are committed to equal employment opportunity regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, pregnancy, or related condition (including breastfeeding) or any other basis as protected by applicable law. If you have a disability or additional need that requires accommodation, please do not hesitate to let us know.

Why Apply Through MisuJob?

AI-Powered Job Matching: MisuJob uses advanced artificial intelligence to analyze your skills, experience, and career goals. Our matching algorithm compares your profile against thousands of job requirements to find positions where you have the highest chance of success. This saves you hours of manual job searching and ensures you only see relevant opportunities.

One-Click Applications: Once you create your profile, applying to jobs is effortless. Your resume and cover letter are automatically tailored to highlight the most relevant experience for each position. You can apply to multiple jobs in minutes, not hours.

Career Intelligence: Beyond job matching, MisuJob provides valuable career insights. See how your skills compare to market demands, identify skill gaps to address, and understand salary benchmarks for your experience level. Make data-driven decisions about your career path.

Frequently Asked Questions

How do I apply for this position?

Click the "Register to Apply" button above to create a free MisuJob account. Once registered, you can apply with one click and track your application status in your dashboard.

Is MisuJob free for job seekers?

Yes, MisuJob is completely free for job seekers. Create your profile, get matched with jobs, and apply without any cost. We help you find your dream job without any hidden fees.

How does AI matching work?

Our AI analyzes your resume, skills, and experience to understand your professional profile. It then compares this against job requirements using natural language processing to calculate a match percentage. Higher matches mean better fit for the role.

Can I apply to jobs in other countries?

Absolutely. MisuJob features jobs from companies worldwide, including remote positions. Filter by location or look for remote opportunities to find jobs that match your preferences.

Ready to Apply?

Join thousands of job seekers using MisuJob's AI to find and apply to their dream jobs automatically.

Register to Apply