ARCHIVED
This job listing has been archived and is no longer accepting applications.

AI Researcher — Inference Optimization

Featherlessai

Remote (worldwide), permanent

Posted: January 23, 2026

Quick Summary

Design, evaluate, and deploy high-performance inference systems for large-scale machine learning models, focusing on latency, throughput, and cost efficiency across real-world production environments.

Job Description

Role Overview

We are seeking an AI Researcher with deep experience in inference optimization to design, evaluate, and deploy high-performance inference systems for large-scale machine learning models. You will work at the intersection of model architecture, systems engineering, and hardware-aware optimization, improving latency, throughput, and cost efficiency across real-world production environments.

Key Responsibilities

• Research and develop techniques to optimize inference performance for large neural networks.

• Improve latency, throughput, memory efficiency, and cost per inference.

• Design and evaluate model-level optimizations (quantization, pruning, KV-cache optimization, architecture-aware simplifications).

• Implement systems-level optimizations (dynamic batching, kernel fusion, multi-GPU inference, prefill vs decode optimization).

• Benchmark inference workloads across hardware accelerators.

• Collaborate with engineering teams to deploy optimized inference pipelines.

• Translate research insights into production-ready improvements.

Required Qualifications

• Strong background in machine learning, deep learning, or AI systems.

• Hands-on experience optimizing inference for large-scale models.

• Proficiency in Python and modern ML frameworks (e.g., PyTorch).

• Experience with inference tooling (e.g., Triton, TensorRT, vLLM, ONNX Runtime).

• Ability to design experiments and communicate results clearly.

Preferred / Nice-to-Have Qualifications

• Experience deploying production inference systems at scale.

• Familiarity with distributed and multi-GPU inference.

• Experience contributing to open-source ML or inference frameworks.

• Authorship or co-authorship of peer-reviewed research papers in machine learning, systems, or related fields.

• Experience working close to hardware (CUDA, ROCm, profiling tools).

What Success Looks Like

• Measurable gains in latency, throughput, and cost efficiency.

• Optimized inference systems running reliably in production.

• Research ideas successfully translated into deployable systems.

• Clear benchmarks and documentation that inform product decisions.

Relevant Research Areas (Bonus)

• Long-context inference optimization

• Speculative decoding

• KV-cache compression and paging

• Efficient decoding strategies

• Hardware-aware inference design
