
WHAT IT IS
The emergence of agentic software development has accelerated the pace at which code is written, reviewed, and deployed across the industry. Testing frameworks must adapt to this rapidly changing environment: faster development demands faster testing that detects bugs as they appear in a codebase, without constant updates and maintenance.
Just-in-Time Tests (JiTTests) represent a fundamentally new approach to testing in which large language models (LLMs) automatically generate tests on the fly to catch bugs—even those that traditional testing might miss—just before the code is deployed to production.
A Catching JiTTest specifically targets regressions introduced by code changes, reimagining decades of software testing theory and practice. Unlike traditional testing, which relies on static test suites, manual creation, and ongoing maintenance, Catching JiTTests require no test maintenance or code review, allowing engineers to focus on real bugs instead of false positives. They use techniques that maximize the signal value of each test and minimize false positives, targeting serious failures.
HOW TESTING TRADITIONALLY WORKS
Traditionally, tests are written by hand as new code is integrated into a codebase and are executed continuously, which requires regular updates and maintenance. Engineers writing these tests must anticipate the behavior of not only the current code but also all possible future changes. This inherent uncertainty can yield tests that miss real issues, or that fail on intended changes and produce false positives. Agentic development considerably accelerates code changes, straining test development and escalating the costs of false positives and maintenance.
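For example, a conventional unit test hard-codes today's exact behavior, so a later intentional change can make it fail even though nothing is broken. A minimal illustration in Python (the function and its expected output are hypothetical, not drawn from the paper):

    import unittest

    def format_price(cents: int) -> str:
        # Current behavior: plain dollar amount with two decimal places.
        return f"${cents / 100:.2f}"

    class TestFormatPrice(unittest.TestCase):
        def test_format_price(self) -> None:
            # This assertion encodes today's exact output. If the team later
            # decides to add thousands separators ("$1,234.50"), the test
            # fails even though the change is intended: a false positive
            # that an engineer must triage and fix by hand.
            self.assertEqual(format_price(123450), "$1234.50")

    if __name__ == "__main__":
        unittest.main()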
HOW CATCHING JITTESTS WORK
JiTTests are custom tests tailored to a specific code change, giving engineers simple, actionable feedback on unexpected behavior changes without requiring them to read or write test code. LLMs can generate JiTTests automatically as soon as a pull request is submitted. Because the tests are LLM-generated, the generator can often deduce the plausible intent behind a code change and simulate the faults that might arise from it.
By understanding intent, Catching JiTTests can significantly reduce false positives.
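A sketch of how generation might be driven from a pull request. Here llm_complete is a hypothetical prompt-to-completion callable standing in for whatever model endpoint is available; the prompt structure is illustrative, not Meta's actual implementation:

    def generate_jittest(diff: str, llm_complete) -> str:
        # `llm_complete` is a hypothetical callable: prompt in, text out.
        prompt = (
            "Here is a pull request diff:\n"
            f"{diff}\n\n"
            "1. State the plausible intent of this change in one sentence.\n"
            "2. Write a unit test that passes when that intent is honored\n"
            "   and fails if the change introduces a regression.\n"
            "Return only the test code."
        )
        return llm_complete(prompt)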
Key steps in the Catching JiTTest process (sketched in code after this list) include:
- New code is introduced into the codebase.
- The system infers the intention behind the code change.
- Mutants (code versions with intentionally inserted faults) are created to simulate potential issues.
- Tests are generated and executed to detect these faults.
- A combination of rule-based and LLM-based assessors filters the results so that only true-positive failures surface.
- Engineers receive clear, relevant reports of unexpected changes at critical moments.
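A minimal end-to-end sketch of that loop, assuming two hypothetical helpers: llm, any prompt-to-text callable, and run_tests, which executes generated test code against the changed codebase and returns failure descriptions. None of these names are Meta's actual APIs:

    from typing import Callable, List

    def catching_jittest(
        diff: str,
        llm: Callable[[str], str],
        run_tests: Callable[[str], List[str]],  # test code -> failure reports
    ) -> List[str]:
        # Step 2: infer the plausible intent behind the change.
        intent = llm(f"In one sentence, what is the intent of this diff?\n{diff}")

        # Step 3: create mutants -- variants of the new code seeded with
        # faults that would violate the inferred intent.
        mutants = [
            llm(f"Intent: {intent}\nInsert one plausible fault into:\n{diff}")
            for _ in range(3)
        ]

        # Step 4: generate tests aimed at those faults, then execute them
        # against the actual change.
        tests = llm(
            f"Intent: {intent}\nWrite unit tests that fail on these faulty "
            f"variants but pass when the intent is honored:\n{mutants}"
        )
        failures = run_tests(tests)

        # Step 5: assessors (rule-based checks plus an LLM judge) keep only
        # failures that look like genuine regressions.
        def is_true_positive(failure: str) -> bool:
            if "timeout" in failure.lower():  # example rule-based filter
                return False
            verdict = llm(
                f"Intent: {intent}\nTest failure: {failure}\n"
                "Answer YES if this is a genuine regression, otherwise NO."
            )
            return verdict.strip().upper().startswith("YES")

        # Step 6: report only clear, relevant findings to the engineer.
        return [f for f in failures if is_true_positive(f)]

In this framing, tests that kill the mutants but pass on the real change build confidence in the change, while tests that fail on the real change and survive the assessors are the regressions worth an engineer's attention.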
WHY IT MATTERS
Catching JiTTests are designed for an AI-powered, agentic software world: they speed up testing by concentrating on serious, unexpected bugs, and they eliminate the need for engineers to write, review, and maintain intricate test code. JiTTests address many problems of traditional testing:
- They are generated on-the-fly for each code change, eliminating ongoing maintenance costs and shifting effort from humans to machines.
- They are tailored to each change, making them more robust and less likely to fail on intended updates.
- They automatically adapt as code evolves.
- They only require human review when a bug is detected.
This marks a significant shift in testing infrastructure, refocusing from generic code quality to verifying whether a test effectively detects faults in a specific change without false positives. It improves testing overall while keeping pace with agentic coding.
READ THE PAPER
Just-in-Time Catching Test Generation at Meta