ML/AI Research Engineer — Agentic AI Lab (Founding Team)

Location: San Francisco Bay Area
Type: Full-Time
Compensation: Competitive salary + meaningful equity (founding tier)

Backed by 8VC, we're building a world-class team to tackle one of the industry’s most critical infrastructure problems.

About the Role

We’re designing the future of enterprise AI infrastructure — grounded in agents, retrieval-augmented generation (RAG), knowledge graphs, and multi-tenant governance.

We’re looking for an ML/AI Research Engineer to join our AI Lab and lead the design, training, evaluation, and optimization of agent-native AI models. You'll work at the intersection of LLMs, vector search, graph reasoning, and reinforcement learning — building the intelligence layer that sits on top of our enterprise data fabric.

This isn’t a prompt engineering role. It’s full-cycle ML, from data curation and fine-tuning to evaluation, interpretability, and deployment, with cost-awareness, alignment, and agent coordination all in scope.

Core Responsibilities

• Fine-tune and evaluate open-source LLMs (e.g. LLaMA 3, Mistral, Falcon, Mixtral) for enterprise use cases with both structured and unstructured data

• Build and optimize RAG pipelines using LangChain, LangGraph, LlamaIndex, or Dust — integrated with our vector DBs and internal knowledge graph

• Train agent architectures (ReAct, AutoGPT, BabyAGI, OpenAgents) using enterprise task data

• Develop embedding-based memory and retrieval chains with token-efficient chunking strategies

• Create reinforcement learning pipelines to optimize agent behaviors (e.g. RLHF, DPO, PPO)

• Establish scalable evaluation harnesses for LLM and agent performance, including synthetic evals, trace capture, and explainability tools

• Contribute to model observability, drift detection, error classification, and alignment

• Optimize inference latency and GPU resource utilization across cloud and on-prem environments

Desired Experience

Model Training:

• Deep experience fine-tuning open-source LLMs using Hugging Face Transformers, DeepSpeed, vLLM, FSDP, LoRA/QLoRA

• Worked with both base and instruction-tuned models; familiar with SFT, RLHF, DPO pipelines

• Comfortable building and maintaining custom training datasets, filters, and eval splits

• Understand tradeoffs in batch size, context window length, optimizer, precision (FP16, BF16), and quantization

RAG + Knowledge Graphs:

• Experience building enterprise-grade RAG pipelines integrated with real-time or contextual data

• Familiar with LangChain, LangGraph, LlamaIndex, and open-source vector DBs (Weaviate, Qdrant, FAISS)

• Experience grounding models with structured data (SQL, graph, metadata) + unstructured sources

• Bonus: Worked with Neo4j, PuppyGraph, RDF, OWL, or other semantic modeling systems

Agent Intelligence:

• Experience training or customizing agent frameworks with multi-step reasoning and memory

• Understand common agent loop patterns (e.g. Plan→Act→Reflect), memory recall, and tool use

• Familiar with self-correction, multi-agent communication, and agent ops logging

Optimization:

• Strong background in token cost optimization, chunking strategies, reranking (e.g. Cohere, Jina), compression, and retrieval latency tuning

• Experience running models under quantized (int4/int8) or multi-GPU settings with inference tuning (vLLM, TGI)

Preferred Tech Stack

• LLM Training & Inference: Hugging Face Transformers, DeepSpeed, vLLM, FlashAttention, FSDP, LoRA

• Agent Orchestration: LangChain, LangGraph, ReAct, OpenAgents, LlamaIndex

• Vector DBs: Weaviate, Qdrant, FAISS, Pinecone, Chroma

• Graph Knowledge Systems: Neo4j, PuppyGraph, RDF, Gremlin, JSON-LD

• Storage & Access: Iceberg, DuckDB, Postgres, Parquet, Delta Lake

• Evaluation: OpenLLM Evals, TruLens, Ragas, LangSmith, Weights & Biases

• Compute: Ray, Kubernetes, TGI, SageMaker, Lambda Labs, Modal

• Languages: Python (core), optionally Rust (for inference layers) or JS (for UX experimentation)

Soft Skills & Mindset

• Startup DNA: resourceful, fast-moving, and capable of working in ambiguity

• Deep curiosity about agent-based architectures and real-world enterprise complexity

• Comfortable owning model performance end-to-end: from dataset to deployment

• Strong instincts around explainability, safety, and continuous improvement

• Enjoy pair-designing with product and UX to shape capabilities, not just APIs

Why This Role Matters

This role is foundational to our thesis: that agents + enterprise data + knowledge modeling can create intelligent infrastructure for real-world, multi-billion-dollar workflows. Your work won’t be buried in research reports; it will be productionized and put to work by hundreds of users across hundreds of thousands of decisions. If this is your dream role, we would love to hear from you.
