Every API testing eval we found either required source code access, relied on rich documentation, or measured output format rather than whether a test would catch a real failure.
So we built APIEval-20. Twenty scenarios across e-commerce, payments, auth, scheduling, and user management. Each scenario gives a model exactly two things: a JSON schema and a sample payload. No implementation details, no docs, no further context. The model has to generate a test suite from that alone.
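To make the task concrete, here is a minimal sketch of the kind of test cases that can be derived from nothing but a schema and a sample payload. The schema, the payload, and the `derive_test_payloads` helper are all hypothetical illustrations; they are not taken from APIEval-20 and do not represent how any particular model approaches the problem.

```python
import copy
from typing import Any, Dict, List

def derive_test_payloads(schema: Dict[str, Any], sample: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Build simple request variations from a JSON schema and one sample payload."""
    # Start with the sample itself as the happy-path case.
    cases = [{"name": "sample payload as given", "payload": copy.deepcopy(sample)}]

    # One case per required field, with that field removed.
    for field in schema.get("required", []):
        payload = copy.deepcopy(sample)
        payload.pop(field, None)
        cases.append({"name": f"missing required field '{field}'", "payload": payload})

    # One case per typed property present in the sample, with a wrong-typed value.
    for field, spec in schema.get("properties", {}).items():
        if field not in sample or "type" not in spec:
            continue
        payload = copy.deepcopy(sample)
        # Numeric fields get a string; every other declared type gets an integer.
        payload[field] = "not-a-number" if spec["type"] in ("integer", "number") else 12345
        cases.append({"name": f"wrong type for '{field}'", "payload": payload})

    return cases

# Hypothetical scenario input, invented for this example.
example_schema = {
    "type": "object",
    "required": ["email", "quantity"],
    "properties": {"email": {"type": "string"}, "quantity": {"type": "integer"}},
}
example_sample = {"email": "a@example.com", "quantity": 2}

example_cases = derive_test_payloads(example_schema, example_sample)
```

Mechanical mutations like these only cover structural validation. Catching logic bugs generally requires reasoning about what the fields mean, which is the harder part of the task.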
Bugs are planted in live reference implementations. A bug counts as caught only if a generated test, run against the implementation, produces a response that deviates from correct behavior. Submit through the hosted eval harness and get a score back.
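The detection rule can be sketched as a comparison between correct behavior and what the implementation actually returns. This is an illustration under stated assumptions, not the harness itself: the implementations are modeled as plain Python functions rather than live services, and `bug_is_caught`, `correct_checkout`, and `buggy_checkout` are hypothetical names invented for this example.

```python
from typing import Any, Callable, Dict, List, Tuple

Request = Dict[str, Any]
Response = Tuple[int, Dict[str, Any]]  # (status code, JSON body)
Handler = Callable[[Request], Response]

def bug_is_caught(tests: List[Request], correct: Handler, buggy: Handler) -> bool:
    """A bug is caught if at least one test request gets a response that
    differs from what correct behavior would produce."""
    return any(correct(request) != buggy(request) for request in tests)

def correct_checkout(request: Request) -> Response:
    # Hypothetical correct behavior: quantity must be an integer of at least 1.
    quantity = request.get("quantity")
    if not isinstance(quantity, int) or quantity < 1:
        return 400, {"error": "invalid quantity"}
    return 200, {"total": quantity * request["unit_price"]}

def buggy_checkout(request: Request) -> Response:
    # Hypothetical planted bug: the lower-bound check on quantity is missing.
    quantity = request.get("quantity")
    if not isinstance(quantity, int):
        return 400, {"error": "invalid quantity"}
    return 200, {"total": quantity * request["unit_price"]}

happy_path_only = [{"quantity": 2, "unit_price": 5}]
with_boundary_case = happy_path_only + [{"quantity": 0, "unit_price": 5}]

caught_by_happy_path = bug_is_caught(happy_path_only, correct_checkout, buggy_checkout)
caught_with_boundary = bug_is_caught(with_boundary_case, correct_checkout, buggy_checkout)
```

In this sketch the happy-path suite misses the bug because both handlers agree on valid input, while the suite that adds a zero-quantity request exposes it. A well-formed test that never triggers a deviation earns nothing toward detection.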
Scoring weights bug detection at 70%, API surface coverage at 20%, and test efficiency at 10%.
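As a rough sketch of how those weights combine, the function below computes a weighted sum. Only the 70/20/10 split comes from the text. The assumption that each component is normalized to the range 0 to 1, and the names `WEIGHTS` and `composite_score`, are illustrative; how the harness computes each component is not described here.

```python
# Weights as stated in the text.
WEIGHTS = {"bug_detection": 0.70, "coverage": 0.20, "efficiency": 0.10}

def composite_score(bug_detection: float, coverage: float, efficiency: float) -> float:
    """Weighted sum of the three components.

    Assumption: each component is already normalized to [0, 1].
    """
    components = {
        "bug_detection": bug_detection,
        "coverage": coverage,
        "efficiency": efficiency,
    }
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return sum(WEIGHTS[name] * value for name, value in components.items())

# Hypothetical suite: catches half the bugs, covers the full surface, wastes many tests.
example_score = composite_score(bug_detection=0.5, coverage=1.0, efficiency=0.0)
```

Under these assumptions, perfect coverage and efficiency with no bugs caught would score at most 0.30, so bug detection dominates the result.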
KushoAI helps you test web user journeys in minutes. Record user journeys using the KushoAI extension and watch as exhaustive test code gets generated. Find bugs without having to think of scenarios and write endless automation scripts to run them.
Kusho is an AI agent for API testing. It generates an exhaustive test suite for your API in 2 minutes: just put in an API spec and get a host of functional tests covering real-world scenarios. Run these tests with AI-generated assertions in a single click.