ARCHIVED
This job listing has been archived and is no longer accepting applications.

Machine Learning Engineer, Reinforcement Learning & Reward Modeling

Wayve

Vancouver, Canada | Remote | Permanent

Posted: December 11, 2025

Job Description

At Wayve we're committed to creating a diverse, fair and respectful culture that is inclusive of everyone based on their unique skills and perspectives, and regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, veteran status, pregnancy or related condition (including breastfeeding) or any other basis as protected by applicable law.

About us

Founded in 2017, Wayve is the leading developer of Embodied AI technology. Our advanced AI software and foundation models enable vehicles to perceive, understand, and navigate any complex environment, enhancing the usability and safety of automated driving systems.

Our vision is to create autonomy that propels the world forward. Our intelligent, mapless, and hardware-agnostic AI products are designed for automakers, accelerating the transition from assisted to automated driving.

In our fast-paced environment, big problems ignite us: we embrace uncertainty, leaning into complex challenges to unlock groundbreaking solutions. We aim high and stay humble in our pursuit of excellence, constantly learning and evolving as we pave the way for a smarter, safer future.

At Wayve, your contributions matter. We value diversity, embrace new perspectives, and foster an inclusive work environment; we back each other to deliver impact.

Make Wayve the experience that defines your career!

The role

We’re looking for a Machine Learning Engineer with strong experience in reinforcement learning (RL), reward modeling, and large-scale ML systems to advance how we train, evaluate, and deploy embodied AI behaviors. This role sits at the intersection of ML engineering, applied RL research, and ML systems, working on the frameworks that guide how our autonomous agents learn from data, simulation, and real-world experience.

As an MLE on the Accelerated Learning Loop team, you will:

• Design and optimise end-to-end pipelines for training reward models and RL agents, ensuring they are reproducible and high-throughput.

• Develop tooling for data processing, annotation, and inference within RL workflows.

• Build, refine, and deploy reward models that encode safe, interpretable, and effective driving behaviours.

• Integrate reward models with diverse data sources: real-world trajectories, simulation, and synthetic datasets.

• Conduct ablations, hyperparameter explorations, and controlled studies to analyse how reward structures, data composition, and training dynamics affect policy performance.

• Diagnose failure modes, investigate emergent behaviours, and iterate on reward objectives to improve reliability.

• Work closely with RL scientists to translate research ideas into scalable engineering solutions.

• Partner with evaluation teams to integrate reward and RL models into offline/online testing suites and simulation frameworks.

• Establish best practices around code quality, reproducibility, and deployment readiness.

• Build internal tools and visualisations that enable faster debugging, deeper insights, and more efficient iteration across the RL and reward modeling stack.

This role is ideal for someone who enjoys building systems and running fast, grounded experiments, and who is motivated by delivering real impact on the behaviour of embodied AI systems in the real world.

Must-haves

• Experience applying reinforcement learning techniques, including offline RL, reward modeling, RLHF-style approaches, or similar

• Proficiency in Python and modern ML frameworks (e.g., PyTorch, JAX, Ray, or equivalent)

• Experience building ML pipelines or large-scale training workflows in production or research environments

• Strong understanding of simulation environments and/or real-world behavioural data

• Ability to design and run experiments, analyse results, and turn insights into actionable improvements

• Strong problem-solving skills and the ability to work effectively in cross-functional teams

Nice-to-haves

• Experience contributing to research (e.g., publications at NeurIPS, ICLR, CoRL, CVPR)

• Understanding of self-driving technologies, sensor data, or real-time decision-making algorithms

• Experience with distributed training systems and cloud compute environments (Azure, AWS, GCP)

• Exposure to large-scale simulation, embodied AI, or robotics systems

What we offer you

• Attractive compensation with salary and equity

• Immersion in a team of world-class researchers, engineers and entrepreneurs

• A unique position to shape the future of autonomy and tackle the biggest challenge of our time

• Bespoke learning and development opportunities

• Relocation support with visa sponsorship

• Flexible working hours - we trust you to do your job well, at times that suit you and your team

• Benefits such as an onsite chef, workplace nursery scheme, private health insurance, therapy, daily yoga, onsite bar, large social budgets, unlimited L&D requests, enhanced parental leave, and more!

This is a full-time role based in our office in Vancouver. At Wayve we want the best of all worlds so we operate a hybrid working policy that combines time together in our offices and workshops to fuel innovation, culture, relationships and learning, and time spent working from home.

We understand that everyone has a unique set of skills and experiences and that not everyone will meet all of the requirements listed above. If you’re passionate about self-driving cars and think you have what it takes to make a positive impact on the world, we encourage you to apply.

For more information visit Careers at Wayve.

To learn more about what drives us, visit Values at Wayve.

DISCLAIMER: We will not ask about marriage or pregnancy, care responsibilities or disabilities in any of our job adverts or interviews. However, we do look to capture information about care responsibilities and disabilities, among other diversity information, as part of an optional DEI Monitoring form. This helps us identify areas of improvement in our hiring process and ensure that the process is inclusive and non-discriminatory.
