ARCHIVED
This job listing has been archived and is no longer accepting applications.

Test Engineer-AI/LLM

OPPO US Research Center

Palo Alto, California, United States (permanent)

Posted: June 30, 2025

Quick Summary

As a Test Engineer-AI/LLM, you will be responsible for evaluating the performance, reliability, and safety of Large Language Models (LLMs) in real-world product scenarios and testing end-to-end generative AI solutions.

Job Description

OPPO US Research Center is seeking a meticulous and innovative full-time AI/LLM Test Engineer to join our cutting-edge AI team. In this critical role, you will evaluate the performance, reliability, and safety of Large Language Models (LLMs) in real-world product scenarios and test end-to-end generative AI solutions. Your work will directly shape how users experience AI-powered features by ensuring robustness, accuracy, and alignment with product goals. This is a unique opportunity to pioneer testing methodologies for next-generation AI systems at the forefront of technology.

We are also seeking a contractor-based LLM Evaluation & QA Engineer to support the testing and validation of large language model (LLM)-powered applications. You will help implement test strategies, execute evaluation workflows, and assist in model performance validation across diverse generative AI use cases.

This contract role is ideal for someone with hands-on experience in AI/ML evaluation, QA engineering, or data analysis who wants to deepen their exposure to generative AI systems.


Requirements:
Full-time position requirement:

Core Testing & Evaluation

• Design and execute performance tests for LLMs across diverse product use cases (e.g., chatbots, content generation).
• Develop automated test frameworks to evaluate LLM outputs for accuracy, bias, safety, and coherence.
• Conduct end-to-end testing of integrated generative AI solutions, including APIs, data pipelines, and user interfaces.

Optimization & Validation

• Collaborate with ML engineers to validate fine-tuned models and optimize prompts for target scenarios.
• Analyze model failures, edge cases, and adversarial inputs to identify risks and improvement areas.
• Benchmark LLM performance against industry standards and product-specific KPIs.

Collaboration & Quality Assurance

• Partner with product, engineering, and research teams to define test requirements and acceptance criteria.
• Document defects, performance metrics, and test results to drive data-driven improvements.
• Advocate for AI ethics and safety through rigorous testing of fairness, bias mitigation, and content moderation.

Innovation & Tooling

• Build scalable tools for synthetic test data generation, prompt variation testing, and automated evaluation workflows.
• Stay current with advancements in generative AI testing, including red-teaming techniques and evaluation frameworks (e.g., HELM, Dynabench).
• Propose novel testing strategies for emerging challenges (e.g., hallucinations, context drift).

Basic Qualifications:

• Bachelor’s degree in Computer Science, Data Science, Engineering, or a related technical field, or equivalent practical experience.
• 1+ years of experience in software testing, data science, or ML validation, with exposure to AI/ML systems.
• Proficiency in Python and testing frameworks (e.g., PyTest, Selenium).
• Hands-on experience evaluating LLMs in production environments (e.g., GPT, Claude, Llama, Gemini).
• Strong analytical skills for dissecting model behavior, statistical performance, and failure modes.
• Familiarity with cloud platforms (GCP, Azure, or AWS) and MLOps tooling (e.g., MLflow, Weights & Biases).
• Experience with version control (Git) and agile development methodologies.

Preferred Qualifications:

• Master’s degree in AI, Machine Learning, or a related field.
• Expertise in prompt engineering, LLM fine-tuning (e.g., LoRA, RLHF), or optimization techniques.
• Experience with automated evaluation tools (e.g., LangChain, TruLens) or LLM-specific test suites.
• Knowledge of data pipelines, SQL/NoSQL databases, and API testing (e.g., Postman).
• Background in statistics, quantitative analysis, or data visualization for test insights.
• Contributions to AI safety/ethics initiatives or open-source LLM evaluation projects.
• Experience testing mobile-integrated AI solutions (Android/iOS).

Contractor position requirements:

Testing & Evaluation Support:

• Execute pre-defined performance tests for LLMs across various tasks (e.g., summarization, Q&A, chatbot flows).
• Run scripted evaluations to assess outputs for factuality, coherence, and safety.
• Perform manual and automated test execution on APIs and LLM-integrated user interfaces.

Prompt & Model Validation:

• Assist ML engineers in evaluating prompt variations and prompt-tuning outcomes.
• Log and analyze failure cases, anomalies, and edge cases based on provided guidelines.

Collaboration & Documentation:

• Work with QA leads, product managers, and ML engineers to understand test goals and criteria.
• Report defects, compile evaluation summaries, and maintain testing logs.

Tooling & Automation:

• Use existing internal tools or frameworks to automate test runs and result collection.
• Contribute to prompt generation, input templating, or result tagging processes.

Basic Qualifications:

• Bachelor's degree or equivalent work experience in a technical field (e.g., Computer Science, Engineering, Data Science).
• 6+ months experience in software QA, data labeling, LLM evaluation, or ML testing projects.
• Basic Python proficiency, especially for data processing and automation tasks.
• Familiarity with LLMs (e.g., GPT, Claude, Gemini) and prompt-based outputs.
• Comfortable working with tools like Jupyter, Postman, or testing dashboards.
• Detail-oriented with good documentation habits.

Contractor Details:

• Duration: Long-term
• Rate: Commensurate with experience
• Conversion Opportunity: High-performing contractors may be considered for full-time roles


Benefits:
OPPO is proud to be an equal opportunity workplace. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements.

The US base salary range for this full-time position is $100,000-$200,000 + bonus + long-term incentive benefits. Our salary ranges are determined by role, level, and location.
