MisuJob - AI Job Search Platform

Applied Data Scientist, Evaluation & Model Behavior

AGI, Inc.

San Francisco, California, United States

Permanent

Posted: March 31, 2026

Quick Summary

We're looking for a skilled Data Scientist to join our team in San Francisco, California, and build trustworthy, reliable AI agents that redefine human-AI collaboration for millions.

Job Description

Think Different. Build the Future. 🚀

Our Mission

Build everyday AGI. Trustworthy, consumer-grade agents that redefine human–AI collaboration for millions. Software shouldn’t wait for commands; it should partner with you, amplifying what you can do every single day.

Why AGI, Inc.

We’re a stealth team of elite founders and AI researchers, with backgrounds spanning Stanford, OpenAI, and DeepMind. We’re industry leaders in mobile and computer-use agents, bringing these capabilities to consumer scale.

Grounded in years of agent research, our AI is designed with trustworthiness and reliability as core pillars, not afterthoughts.

We are supported by tier-1 investors who funded the first generation of AI giants; now they’re backing us to build the next: everyday AGI. (Watch the demo)

If you see possibility where others see limits, read on.

About the Role

As an Applied Scientist focused on Evaluation & Model Behavior, you will design and implement the systems used to measure and improve the performance of Computer Use Agents.

This is not a support role. You will be responsible for the technical definition of model quality, including the design of evaluation metrics, the curation of training datasets, and the engineering of system prompts. You'll work directly with the engineering team to translate product requirements into technical specifications and quantifiable benchmarks.

You'll focus on rigor, clarity, and impact, ensuring every metric, dataset, and prompt moves us toward more reliable, trustworthy agents.

What You'll Do

Model Behavior Design: Translate product requirements into technical specifications for model behavior. Engineer system prompts and few-shot examples to address specific capability gaps and behavioral failures.

Evaluation Design: Define metrics for reasoning, tool usage, and safety, and validate these metrics against human judgment to ensure statistical rigor.

Data Strategy: Design algorithms to filter, score, and select training data. Write Python scripts to sanitize inputs and manage the training data lifecycle from raw logs to high-quality datasets.

Failure Analysis: Investigate regressions in model benchmarks. Diagnose root causes, distinguishing among data quality issues, prompt instruction failures, and underlying model capability gaps, and implement fixes.

Ground Truth Management: Define rubrics and guidelines for human annotation. Maintain reference datasets ("Golden Sets") to establish a consistent baseline for model performance evaluation.
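To make the "validate these metrics against human judgment" responsibility concrete, here is a minimal sketch of one common check: chance-corrected agreement (Cohen's kappa) between an automated metric's verdicts and human annotations on the same examples. The function name, the pass/fail labels, and the sample data are illustrative assumptions, not part of AGI, Inc.'s actual tooling.

```python
from collections import Counter


def cohens_kappa(metric_labels, human_labels):
    """Chance-corrected agreement between two equal-length label sequences."""
    if len(metric_labels) != len(human_labels) or not metric_labels:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(metric_labels)
    # Fraction of examples where the automated metric and the human agree.
    observed = sum(m == h for m, h in zip(metric_labels, human_labels)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    metric_counts = Counter(metric_labels)
    human_counts = Counter(human_labels)
    expected = sum(metric_counts[c] * human_counts[c] for c in metric_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical pass/fail verdicts on the same eight agent episodes.
automated = ["pass", "pass", "fail", "pass", "fail", "fail", "pass", "pass"]
human = ["pass", "fail", "fail", "pass", "fail", "pass", "pass", "pass"]
kappa = cohens_kappa(automated, human)
print(f"kappa = {kappa:.3f}")
```

A kappa near 1 suggests the automated metric tracks human judgment closely, while a value near 0 suggests agreement no better than chance. Where to set an acceptance threshold is a judgment call that depends on the task.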

Minimum Qualifications

• Master's degree or PhD in Computer Science, Data Science, Statistics, or a related technical field, or equivalent practical experience

• 3+ years of experience in Data Science, Machine Learning, or Applied Science

• Proficiency in Python, with experience writing production-quality code for data pipelines or evaluation harnesses

• Experience with experimental design, A/B testing, or statistical analysis

Preferred Qualifications

• Experience with Large Language Models (LLMs), including prompt engineering, fine-tuning, or RLHF workflows

• Experience building automated evaluation systems or implementing model-based evaluation frameworks

• Ability to translate product requirements into measurable technical metrics

• Experience managing human-in-the-loop data pipelines or annotation quality control

Why This Role Matters: You can't improve what you can't measure. You can't ship what you can't trust.

You will own the technical definition of quality for our agents: the metrics that predict real-world success, the datasets that encode user intent, and the prompts that shape model behavior. Your work will directly determine how quickly we can iterate and how confidently we can ship.
