ARCHIVED
This job listing has been archived and is no longer accepting applications.
Lead Software Engineer, Model Serving Platform

Sciforium

San Francisco, California, United States (permanent)

Posted: December 6, 2025

Quick Summary

We are seeking a Lead Software Engineer to architect and lead the development of Sciforium's next-generation model serving platform, which will bring a multimodal, highly efficient foundation model to market.

Job Description

Sciforium is an AI infrastructure company developing next-generation multimodal AI models and a proprietary, high-efficiency serving platform. Backed by multi-million-dollar funding and direct sponsorship from AMD, with hands-on support from AMD engineers, the team is scaling rapidly to build the full stack powering frontier AI models and real-time applications.

About the role

This is a rare chance to help architect and lead the development of Sciforium’s next-generation model serving platform, the high-performance engine that will bring a multimodal, highly efficient foundation model to market. As a senior technical leader, you’ll not only build core components yourself but also guide and mentor other engineers, influencing engineering direction, standards, and execution quality.

You will learn and shape the full AI stack, from GPU kernels and quantized execution paths to distributed serving, scheduling, and the APIs that power real-time AI applications. If you enjoy deep systems work, thrive on ownership, and want to lead engineers in building foundational AI infrastructure, this role puts you at the center of Sciforium’s mission and growth.

What you'll do

• Lead the technical direction of the model serving platform, owning architecture decisions and guiding engineering execution.

• Build core serving components including execution runtimes, batching, scheduling, and distributed inference systems.

• Develop high-performance C++ and CUDA/HIP modules, including custom GPU kernels and memory-optimized runtimes.

• Collaborate with ML researchers to productionize new multimodal models and ensure low-latency, scalable inference.

• Build Python APIs and services that expose model capabilities to downstream applications.

• Mentor and support other engineers through code reviews, design discussions, and hands-on technical guidance.

• Drive performance profiling, benchmarking, and observability across the inference stack.

• Ensure high reliability and maintainability through testing, monitoring, and engineering best practices.

• Troubleshoot and resolve complex issues across GPU, runtime, and service layers.

Ideal candidate profile

• Bachelor’s degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent practical experience.

• 5+ years of experience designing and building scalable, reliable backend systems or distributed infrastructure.

• Strong understanding of LLM inference mechanics (prefill vs. decode, batching, KV cache).

• Experience with Kubernetes or Ray, and with containerization.

• Strong proficiency in C++ and Python.

• Strong debugging, profiling, and performance optimization skills at the system level.

• Ability to collaborate closely with ML researchers and translate model or runtime requirements into production-grade systems.

• Effective communication skills and the ability to lead technical discussions, mentor engineers, and drive engineering quality.

• Comfortable working from the office and contributing to a fast-moving, high-ownership team culture.

Nice-to-have

• Experience with ML systems engineering, distributed GPU scheduling, or open-source inference engines such as vLLM, SGLang, or TRT-LLM.

• Experience building large-scale ML/MLOps infrastructure.

• Proficiency in CUDA or ROCm and experience with GPU profiling tools.

• Experience at an AI/ML startup, research lab, or Big Tech infrastructure/ML team.

• Familiarity with multimodal model architectures, raw-byte models, or efficient inference techniques.

• Contributions to open-source ML or HPC infrastructure.

Benefits include

• Medical, dental, and vision insurance

• 401k plan

• Daily lunch, snacks, and beverages

• Flexible time off

• Competitive salary and equity

Equal opportunity

Sciforium is an equal opportunity employer. All applicants will be considered for employment without attention to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran or disability status.
