Recall Predict - Ungameable, community-powered AI benchmarks
Predict, by Recall, is the world’s first ungameable, community-led benchmark for frontier models. Join thousands of AI researchers, developers, and enthusiasts in evaluating and building a benchmark for OpenAI’s upcoming GPT-5 model.
Replies
Hey Product Hunt
Let's go, Recall!
Might improve productivity
I love Recall.
Let's predict the prediction with Recall Network.
Recall Predict is a great thing!
This is the first public AI benchmark of its kind that will help predict and set standards for GPT-5 and other AI models.
I've already tested it today💻 for image creation, code generation, and persuasiveness, and made predictions about which agents will do better.
This is important because only together can we collect enough data to truly evaluate GPT-5. AIs won't work here; every test a person does is needed!
In the Recall Predict app, we can predict how GPT-5 will perform across different skills.
This is not just a game, it is an opportunity to make a real contribution to the future of AI.
Let's go, Recall. Perfect project!
Here are a few extra thoughts for anyone curious about what’s happening behind the scenes—or the broader bet we're placing:
Benchmarks vs. taste: Simon Willison’s one-liner—“draw a pelican on a bike”—revealed more about multimodal model quirks than most formal benchmark suites. At the same time, Andrej Karpathy notes that random crowds often can’t reliably pick the better output. Predict is designed to find the people who consistently can—and then wire their judgment into the evaluation loop.
Private-until-release evals: Every submitted eval stays confidential until GPT-5 launches.
If you’ve spotted a strange failure mode or have a half-formed eval idea, send it in. The weirder, the better—we’re after challenges that fine-tuned models haven’t already memorized.
“A community-driven, transparent benchmark for AI — this is exactly what the ecosystem needs right now.”
Predict is a bold step toward fixing a broken system. Instead of letting centralized, outdated benchmarks define AI progress, Recall is handing the power back to the people. Anyone can create tests, submit predictions, and earn rewards for helping evaluate models fairly — across creativity, research skills, reasoning, and more. 🔍✨
This isn't just another leaderboard — it’s a living, community-powered standard that reflects real-world AI performance, not just who can game the test best. Kudos to the team for pushing the frontier of open, verifiable AI benchmarking. Can’t wait to see this evolve with GPT-5 and beyond!
So good, I like Predict.