Senior Testing Engineer
Robots & Pencils
Posted: May 14, 2026
Quick Summary
We're seeking a Senior Testing Engineer to join our team and help design and ship AI co-workers that integrate into enterprise operations and deliver measurable results for our clients.
Job Description
Company Overview
Robots & Pencils is an applied AI engineering firm building the next frontier of business architecture. We design and ship AI co-workers that integrate into enterprise operations and deliver measurable results for our clients. We’re all in on AWS, combining deep UX capability with senior engineering talent to get AI into production fast and keep it there. We’ve earned the trust of leaders across Consumer Products and Retail, Education, Energy, Financial Services, Healthcare, Manufacturing, and more, and earned a reputation as the nimble alternative to traditional global systems integrators. Founded in 2009, with delivery centers in Canada, the United States, Eastern Europe, and Latin America, we are smaller, faster, and more senior by design. Our teams average 15+ years of experience. We move fast, sweat the details, and build things that actually ship.
Position Overview
We’re looking for a Senior Testing Engineer to join our team and own quality across a cloud-native AI/ML platform built on AWS. This is not a QA position. It is a hands-on engineering role for someone who writes test code across a production Python and AWS stack. In this role, you will evaluate the platform, create a comprehensive test coverage plan, and drive best practices across the team. You’ll propose and implement improvements to our testing infrastructure, modify production code to improve testability when necessary, and work with the broader engineering team to establish patterns that other developers can adopt and carry forward. You will be the authoritative voice on testing automation and test engineering on this engagement, working with minimal supervision.
Why This Role Matters
At Robots & Pencils, we design AI systems for a human world. Our name says it all. Robots and pencils means engineering paired with creativity, because every agent we ship has to work for real people in real workflows. That balance is baked into how we operate.
This platform delivers AI-powered learning and automation to real users—and its reliability depends on the quality infrastructure you build. You won’t be auditing a test suite. You’ll be building one from the ground up, shaping what “done” means across every layer of the stack: Lambdas, DynamoDB, SQS, event-driven flows, and agentic AI pipelines. When this platform works, people learn better and move faster. That’s what’s on the line.
What You’ll Do
Craft & Delivery
• Evaluate the platform, produce a thorough test coverage plan, and design a scalable testing architecture for the Python/AWS stack (Lambda, DynamoDB single-table design, SQS, S3, EventBridge, CDK) across unit, integration, E2E, agentic eval, and synthetic learner layers
• Write production-grade test code using PyTest and Python-native frameworks, build and maintain agentic evals and synthetic learner pipelines that validate AI-driven workflows end-to-end, and own quality gates in CI/CD pipelines (e.g. GitHub Actions)—modifying production code to improve testability when warranted
• Bring an AI-forward mindset to your daily work, using tools like Claude Code and Cursor to ship higher-quality work at pace
Collaboration & Communication
• Partner with engineering and product leadership to align test strategy with delivery goals and platform architecture decisions
• Translate test coverage status, quality risks, and recommended investments into terms technical and non-technical stakeholders can act on
• Lead test planning sessions and release readiness assessments, driving clear go/no-go signals across the team
Leadership & Influence
• Establish the testing standards, frameworks, and patterns the broader engineering team adopts and extends, mentoring junior and mid-level engineers on testing practices to raise quality ownership across the team rather than centralizing it on yourself
• Take ownership of quality end-to-end, including the unglamorous work of stabilizing flaky suites and paying down test debt
• Evaluate and introduce emerging tools and methodologies, continuously improving testing quality and velocity without chasing novelty for its own sake
What You’ll Bring
• 5+ years of professional software engineering experience with a strong focus on testing—unit, integration, E2E, and/or AI/ML system testing
• Strong Python programming skills; this role writes test code, not just test plans
• Hands-on experience with AWS services including Lambda, DynamoDB, SQS, S3, and EventBridge; CDK experience a strong plus
• Deep expertise with PyTest and Python-native testing frameworks, with a track record of designing and scaling test automation infrastructure
• Experience writing and maintaining E2E and integration tests for event-driven, serverless, or microservices architectures
• Familiarity with DynamoDB single-table design and the specific challenges of testing against it
• Experience building or validating agentic or LLM-based systems; comfort with evals, output consistency testing, and hallucination/accuracy validation
• Strong CI/CD expertise, with experience owning quality gates in delivery pipelines (e.g. GitHub Actions)
• Working knowledge of AI safety and responsible AI principles as they apply to validating LLM behavior, prompt injection defenses, and PII handling in test data
• Demonstrated ability to work independently, drive architectural recommendations, and deliver with minimal supervision
• Demonstrable usage of AI-forward tools such as Claude Code and Cursor
• Strong problem-solving skills and sound judgment in ambiguous technical territory
You’ll Do Well Here if You Are
• A doer. You see something broken and fix it. You’d rather move on clarity than wait for certainty.
• A fast learner who knows you don’t know everything. The AI landscape changes weekly. You’re senior enough to know better and curious enough to keep learning anyway.
• Direct in a way that makes the work better. You give honest feedback. You’d rather have the hard conversation than blow smoke.
• Obsessed with craft. You know genius is in the details. You ship exceptional, not perfect, and you don’t put your name on work you wouldn’t stand behind.
• Built for ownership. You honor commitments, admit mistakes fast, and back your teammates when a decision costs something. No handoffs, no finger-pointing.
• All in. You treat clients’ businesses like your own. You take the work seriously without taking yourself seriously.
• Resourceful when the budget, timeline, or team is tight. Constraints don’t slow you down. They sharpen you.
• Glad to be in the room with people who care as much as you do. Our teams average fifteen-plus years of experience. We hire people who push each other to do better work.