Freelance Agent Evaluation Engineer
Mindrift
Posted: February 9, 2026
Job Description
Please submit your CV in English and indicate your level of English proficiency.
Mindrift connects specialists with project-based AI opportunities for leading tech companies, focused on testing, evaluating, and improving AI systems. Participation is project-based, not permanent employment.
What this opportunity involves
While each project involves unique tasks, contributors may:
• Create structured test cases that simulate complex human workflows
• Define gold-standard behavior and scoring logic to evaluate agent actions
• Analyze agent logs, failure modes, and decision paths
• Work with code repositories and test frameworks to validate your scenarios
• Iterate on prompts, instructions, and test cases to improve clarity and difficulty
• Ensure that scenarios are production-ready, easy to run, and reusable
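To give a rough sense of what a structured test case with gold-standard behavior and scoring logic might look like, here is a minimal Python sketch. It is purely illustrative: the scenario fields (id, instruction, gold_actions), the action names, and the score_run helper are all hypothetical and do not reflect any actual project format or tooling.

```python
import json

# Hypothetical scenario description; every field name here is an
# illustrative assumption, not a real project schema.
SCENARIO_JSON = """
{
  "id": "refund-request-001",
  "instruction": "Refund order 1234 and notify the customer.",
  "gold_actions": ["lookup_order", "issue_refund", "send_email"]
}
"""


def score_run(gold_actions, agent_actions):
    """Return the fraction of gold actions found in the agent log.

    Matching is greedy and order-aware: each gold action is searched for
    only after the position where the previous match was found.
    """
    if not gold_actions:
        return 1.0
    matched = 0
    position = 0
    for gold in gold_actions:
        try:
            found = agent_actions.index(gold, position)
        except ValueError:
            continue  # gold action missing from the rest of the log
        matched += 1
        position = found + 1
    return matched / len(gold_actions)


scenario = json.loads(SCENARIO_JSON)

# Hypothetical agent log: the agent took an extra step and never sent the email.
agent_log = ["lookup_order", "check_policy", "issue_refund"]
score = score_run(scenario["gold_actions"], agent_log)
print(f"{scenario['id']}: {score:.2f}")
```

In this sketch, extra agent steps are not penalized and a missing step only lowers the score. Real scoring logic would depend on each project's acceptance criteria.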
What we look for
This opportunity is a good fit for software engineers who are open to part-time, non-permanent projects. Ideally, contributors will have:
• 3+ years of software development experience with a strong Python focus
• Experience with Git and code repositories
• Comfort with structured formats like JSON/YAML for scenario descriptions
• Understanding of core LLM limitations (hallucinations, bias, context limits) and how these affect evaluation design
• Familiarity with Docker
• English proficiency at B2 level
How it works
Apply → Pass qualification(s) → Join a project → Complete tasks → Get paid
Project time expectations
Tasks for this project are estimated to take 6-10 hours to complete, depending on complexity. This is an estimate and not a schedule requirement; you choose when and how to work. Tasks must be submitted by the deadline and meet the listed acceptance criteria to be accepted.
Payment
• Paid contributions, with rates up to $40/hour*
• Fixed project rate or individual rates, depending on the project
• Some projects include incentive payments
*Note: Rates vary based on expertise, skills assessment, location, project needs, and other factors. Higher rates may be offered to highly specialized experts. Lower rates may apply during onboarding or non-core project phases. Payment details are shared per project.