Member of Technical Staff - Efficient ML
Embedding Vc
Posted: January 15, 2026
Quick Summary
We are looking for a skilled engineer to join our technical staff and work on efficient machine learning, including training efficiency, inference optimization, and infrastructure development.
Job Description
Introducing Moonlake, AI for creating world simulations.
Scope of Work
Training efficiency
• Dataloaders, kernel fusion, activation rematerialization, gradient checkpointing.
• FSDP/ZeRO/tensor+pipeline parallel; NCCL tuning.
GPU + kernel performance
• Nsight profiling, Triton/CUDA kernels, fused ops.
• Flash-attention–style speedups, sequence packing, KV-cache tricks.
Inference optimization
• Low-latency serving, continuous batching, speculative decoding.
• Quantization (GPTQ/AWQ), distillation, pruning.
Infra + reliability
• SLURM/K8s multi-node jobs, checkpoint hygiene.
• Determinism, env pinning, GPU failure handling.
We are committed to being an on-site, in-person team, currently based in San Mateo.