MisuJob - AI Job Search Platform

Member of Technical Staff (Infrastructure): World Models

Moonvalley AI

Remote, permanent

Posted: April 8, 2026

Quick Summary

Collaborate with researchers and engineers to understand workload requirements and translate them into infrastructure decisions, improve scheduling and resource allocation for inference and training coexistence on shared GPU clusters, own GPU utilization and cost as first-class metrics, and build automated tooling and observability to reduce friction for the AI team.

Job Description

What You'll Do

• Collaborate with researchers and engineers to understand workload requirements and translate them into infrastructure decisions, rather than just running what you're given.

• Design and improve scheduling and resource allocation for inference and training coexistence on shared GPU clusters.

• Build, operate, and scale GPU infrastructure across clusters of thousands of GPUs.

• Own GPU utilization and cost as first-class metrics.

• Build automated tooling and observability that reduces friction for the AI team.

• Participate in on-call rotation and drive reliability improvements.

• Serve as the primary point of contact for GPU providers, managing relationships and coordinating infrastructure needs.

Over time, you'll take on broader ownership: setting scheduling policy, driving architecture decisions for compute and storage systems, and identifying when infrastructure no longer fits evolving workloads.

What We're Looking For

• Deep Systems Foundation: Linux-native. Understands how machines work, can debug at the kernel level. Deep understanding of networking and storage stacks.

• Cluster Engineering: Experience operating and scaling GPU infrastructure (hundreds to thousands of GPUs), Kubernetes, Slurm, and distributed storage systems.

• Distributed Systems Fundamentals: Experience designing, building, and operating distributed systems at scale.

• Production Discipline: Track record of running critical infrastructure reliably — monitoring, incident response, and automation that reduces toil.

• ML Familiarity: Enough understanding of training and inference workloads to collaborate with researchers and make sound infrastructure decisions.

• Bonus: Resource-Constrained Thinking. Experience in environments where allocation, scheduling, and prioritization of scarce resources were the core problem (e.g. HPC, trading, large-scale ML platforms).

Challenges You'll Tackle

• Balancing latency-sensitive inference against long-running training workloads

• Operating under tight GPU constraints with constantly shifting demand

• Adapting infrastructure to rapidly evolving model architectures and scale

• Making tradeoffs between cost, utilization, and reliability at scale

Traits of the Ideal Candidate

• High ownership: Owns problems end-to-end, sets priorities, and escalates early.

• Challenges systems: Questions what exists and drives improvements when it no longer fits.

• Learns fast: Quickly builds context to make sound infrastructure decisions.

• Thinks in systems: Understands dependencies, validates assumptions, and catches issues early.

• Raises the bar: Shares context, surfaces risks, and helps the team move faster.

What we offer (compensation & benefits)

• Competitive salary and equity

• Private health coverage

• Pension contribution (UK, Canada, US)

• Unlimited paid vacation

• Fully-distributed, async-first culture

• Hardware setup of your choice

• Stipends for phone, internet, and meals

On our team, we approach our work with a dedication similar to that of Olympic athletes. Expect occasional late nights and weekends dedicated to our mission. We understand this level of commitment may not suit everyone, and we communicate this expectation openly.

If you're motivated by deeply technical problems, a seemingly never-ending uphill battle, and the opportunity to build (and own) a generational technology company, we can give you what you're looking for.

All business roles at Moonvalley are hybrid positions by default, with some fully remote depending on the job scope. We meet a few times every year, usually in London, UK or North America (LA, Toronto) as a company.

If you're excited about the opportunity to work on cutting-edge AI technology and help shape the future of media and entertainment, we encourage you to apply. We look forward to hearing from you!

The statements contained in this job description reflect general details as necessary to describe the principal functions of this job, the level of knowledge and skill typically required and the scope of responsibility. It should not be considered an all-inclusive listing of work requirements. Individuals may perform other duties as assigned, including work in other functional areas to cover absences, to equalize peak work periods, or to otherwise balance organizational work

Moonvalley AI is proud to be an equal opportunity employer. We are committed to providing accommodations. If you require accommodation, we will work with you to meet your needs.

Please be assured that we'll treat any information you share with us with the utmost care, use it only for recruitment purposes, and never sell it to other companies for marketing purposes. Please review our privacy policy and job applicant privacy policy located here for further information.
