Learn from the Testing Experts
7th November 2025
PHOENIX
Keynote
Testing the Two Faces of AI
As generative AI becomes increasingly integrated into software products, traditional testing methods often fall short of addressing the unique challenges these systems pose. Generative AI is inherently two-faced: it has the potential to deliver extraordinary value while introducing significant risks, generating outputs that can be both useful and harmful. The potential for value and the potential for harm are deeply intertwined, making the evaluation of generative AI both complex and critical.
Join Ben Simo as he explores how applying real intelligence to testing artificial intelligence is essential for the responsible delivery of AI-powered software. Testing helps teams understand fuzzy, non-deterministic systems, uncover hidden risks, and make informed decisions to mitigate harm while delivering meaningful value.
Ben will share strategies for evaluating the efficacy and safety of AI systems, enabling teams to manage the risks of AI while responsibly delivering value.
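For flavor, here is a minimal sketch of one such evaluation strategy: sampling a non-deterministic system repeatedly and scoring each answer against invariants that any acceptable output must satisfy. The `generate` function and the example invariants are hypothetical stand-ins, not code from the talk.

```typescript
// Minimal sketch: sample a non-deterministic system repeatedly and count
// how often each invariant fails. `generate` and the invariants below are
// hypothetical stand-ins, not code from the talk.

type Invariant = { name: string; holds: (output: string) => boolean };

async function evaluate(
  generate: (prompt: string) => Promise<string>, // the AI system under test
  prompt: string,
  invariants: Invariant[],
  samples = 20, // repeated sampling surfaces non-deterministic failures
): Promise<Map<string, number>> {
  const failures = new Map(invariants.map((i): [string, number] => [i.name, 0]));
  for (let run = 0; run < samples; run++) {
    const output = await generate(prompt);
    for (const inv of invariants) {
      if (!inv.holds(output)) failures.set(inv.name, failures.get(inv.name)! + 1);
    }
  }
  return failures; // failure count per invariant across all samples
}

// Example invariants: a harm check and a usefulness check on the same output.
const invariants: Invariant[] = [
  { name: "no SSN-shaped strings", holds: (o) => !/\b\d{3}-\d{2}-\d{4}\b/.test(o) },
  { name: "non-empty answer", holds: (o) => o.trim().length > 0 },
];
```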
Takeaways from this talk
- Two-Faced Nature of Generative AI: Gain a deeper understanding of the fundamental characteristics and dual qualities of generative AI.
- Demonstrating Capability: Learn approaches for evaluating the usefulness of inexplicable and fuzzy outputs from AI systems.
- Discovering Risk: Explore techniques for identifying and testing AI-specific risks.
Training AI to Watch NBA Games with Reinforcement Learning
What if AI could browse NBA.com just like a human, finding live games, pulling up player stats, and navigating highlight reels, without a single hardcoded click? In this session, we’ll explore how reinforcement learning (RL) can be applied to web automation, turning the browser itself into a dynamic training ground for intelligent agents.
You’ll learn how we adapted classic RL concepts (reward signals, action spaces, and state encoding) to the unpredictable world of a real sports website. We’ll walk through building an NBA-specific browser agent, “the-league”, using Playwright: encoding structured HTML states, defining actions (clicks, scrolls, searches), and training a TensorFlow.js model to explore and achieve goals such as finding a live game or looking up a player’s stats.
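As a taste of what that looks like in practice, here is a compressed, hypothetical sketch of that scaffolding: a typed action space, a numeric state encoding, and a reward signal, all driven through Playwright. The selectors, reward values, and encoding are illustrative assumptions, not the speaker’s actual code.

```typescript
import type { Page } from "playwright";

// Hypothetical sketch of the RL scaffolding described above; selectors,
// reward values, and the state encoding are illustrative assumptions.

type Action =
  | { kind: "click"; selector: string }
  | { kind: "scroll"; pixels: number }
  | { kind: "search"; query: string };

// Encode the live page into a fixed-length numeric state vector.
async function encodeState(page: Page): Promise<number[]> {
  return page.evaluate(() => [
    document.querySelectorAll("a").length,
    document.querySelectorAll("button").length,
    document.body.innerText.includes("LIVE") ? 1 : 0,
  ]);
}

// Reward signal: did this step reach the goal of finding a live game?
async function reward(page: Page): Promise<number> {
  const onLiveGame = await page.locator("text=LIVE").first().isVisible();
  return onLiveGame ? 1 : -0.01; // small step cost encourages short paths
}

// Apply one action in the browser and observe the resulting reward.
async function step(page: Page, action: Action): Promise<number> {
  if (action.kind === "click") await page.click(action.selector);
  else if (action.kind === "scroll") await page.mouse.wheel(0, action.pixels);
  else await page.fill("input[type=search]", action.query);
  return reward(page);
}
```

A TensorFlow.js policy network would then map the encoded state to a probability distribution over these actions; that training loop is more than a sketch this size can carry.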
Takeaways from this talk
- A blueprint for creating your own RL-powered browser agents
- Insights into designing reward systems for unstructured, real-world websites
- Lessons learned from bridging modern web automation tools with AI training loops
AI-Orchestrated Test Automation via Claude with Playwright MCP and Selenium MCP
This session explores how to leverage Claude as an AI orchestrator to drive test automation workflows with Playwright MCP and Selenium MCP. We’ll walk through launching browsers, simulating user interactions, and generating test scripts programmatically. By combining Claude’s AI capabilities with Playwright and Selenium, we enable richer end-to-end test automation.
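To make that concrete, here is the kind of artifact such an orchestration might emit and refine: a Playwright test produced from a natural-language instruction like “verify that login works”. The URL and selectors are hypothetical placeholders.

```typescript
// Hypothetical example of an AI-generated Playwright test; the URL and
// selectors are placeholders, not a real application.
import { test, expect } from "@playwright/test";

test("user can log in", async ({ page }) => {
  await page.goto("https://example.com/login");
  await page.fill("#username", "demo-user");
  await page.fill("#password", "demo-pass");
  await page.click("button[type=submit]");
  // An orchestrator can tighten assertions like this one after observing
  // the real post-login state through the MCP browser session.
  await expect(page).toHaveURL(/dashboard/);
});
```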
Takeaways from this talk
- Understand what Claude is and how it can be applied to test automation
- Learn the concept of an MCP (Model Context Protocol) and its role in automation workflows
- Gain hands-on knowledge of launching browsers and performing user actions using Selenium MCP and Playwright MCP
- See how AI can generate and refine test scripts for faster automation development
Agentic AI in Quality Assurance: Automating the Future of API Testing and Documentation
AI is transforming quality assurance from repetitive automation to intelligent, self-driven systems. This talk explores how agentic AI—systems that reason, adapt, and act independently—is being used to supercharge API testing and documentation in modern development environments.
Through real-world examples, you’ll see how AI can:
- Automatically generate API test cases
- Keep Swagger/OpenAPI docs up to date
- Integrate seamlessly with CI/CD
- Scale effortlessly across diverse codebases
Whether you’re a QA engineer, developer, or engineering leader, you’ll gain actionable insights to future-proof your quality pipeline with AI-driven solutions.
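As a rough illustration of one piece of such a pipeline, the sketch below derives smoke tests directly from an OpenAPI document; the spec shape shown is a simplified subset, and `baseUrl` is a hypothetical placeholder.

```typescript
// Rough sketch: derive smoke tests straight from an OpenAPI document.
// The spec shape is a simplified subset; `baseUrl` is a placeholder.

interface OpenApiDoc {
  paths: Record<string, Record<string, { summary?: string }>>;
}

async function smokeTestSpec(doc: OpenApiDoc, baseUrl: string): Promise<void> {
  for (const [path, operations] of Object.entries(doc.paths)) {
    for (const method of Object.keys(operations)) {
      if (method !== "get") continue; // keep the sketch side-effect free
      const res = await fetch(baseUrl + path, { method: "GET" });
      // A full agentic system goes further: generating request bodies,
      // chaining calls, and flagging drift between spec and responses.
      console.log(`GET ${path} -> ${res.status}`);
    }
  }
}
```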
Takeaways from this talk
- Agentic AI enables self-maintaining QA systems for APIs.
- You can auto-generate tests and docs reliably—even at scale.
- AI integrates naturally into CI/CD workflows.
- Think beyond automation—think autonomy.
- The future of QA is not just faster… it’s smarter.
The Next Generation of Automation Taxonomy
Have you ever debated the definitions of Unit Test, Component Test, Integration Test, or End-to-End Test, only to find it unproductive? Striking the right balance of automation types is vital for a good return on investment. How can we follow good advice when we don’t agree on terms?
Join Curtis in this thought-provoking session as he re-examines the most important characteristics of automation suites, offering a framework for creating test categories that work in your context. Leaving the dogma behind, we can then explore the wisdom of familiar patterns with fresh eyes.
Takeaways from this talk
- New ways to evaluate your automation suites
- A framework to develop an automation taxonomy that works for your company
- A fresh perspective on the automation pyramid
Building Test Frameworks That Scale with Complexity
Modern applications depend on a network of services — APIs, message queues, databases, and monitoring tools. Testing these environments is challenging because frameworks often break under too many dependencies or fail to scale as systems grow. In this talk, I’ll share my experience at Early Warning (Zelle), where I built a test framework from scratch to support mission-critical financial transactions. The goal wasn’t just automation, but a framework that was scalable, resilient, and trusted by the business. Attendees will learn how to approach testing in complex environments and how to turn testing into a strategic advantage rather than a bottleneck.
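One resilience pattern that comes up in frameworks like this, sketched below under assumed names rather than Zelle’s actual code, is putting every external dependency behind a uniform interface with a health check, so a run can fail fast with a precise message instead of timing out test by test.

```typescript
// Hypothetical sketch (assumed names, not Zelle's code): every external
// dependency implements one interface with a health check, so the suite
// can fail fast instead of timing out against an unreachable queue or DB.

interface Dependency {
  name: string;
  healthy(): Promise<boolean>;
}

async function verifyDependencies(deps: Dependency[]): Promise<void> {
  const results = await Promise.all(
    deps.map(async (d) => ({
      name: d.name,
      ok: await d.healthy().catch(() => false), // a throwing check counts as down
    })),
  );
  const down = results.filter((r) => !r.ok).map((r) => r.name);
  if (down.length > 0) {
    throw new Error(`Dependencies unavailable: ${down.join(", ")}`);
  }
}
```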
Takeaways from this talk
- Why complexity makes test frameworks fragile — and how to design for resilience.
- Principles for building frameworks that scale with dependencies (queues, databases, logs, etc.).
- How to make testing a source of confidence for both engineers and stakeholders.
- Lessons learned from building a framework at Zelle that supported real-time, high-stakes systems.