Recall Predict

Ungameable, community-powered AI benchmarks

498 followers

Predict, by Recall, is the world’s first ungameable, community-led benchmark for frontier models. Join thousands of AI researchers, developers, and enthusiasts in evaluating and building a benchmark for OpenAI’s upcoming GPT-5 model.

Free

Launch tags: Artificial Intelligence • Tech • Blockchain

Recall Predict is a great thing!

This is the first public AI benchmark of its kind, and it will help predict performance and set standards for GPT-5 and other AI models.

I've already tested it today 💻 on image creation, code generation, and persuasiveness, and made predictions about which agents will do better.

This is important because only together can we collect enough data to truly evaluate GPT-5. AIs won't work here: every test a person runs is needed!

In the Recall Predict app, we can predict how GPT-5 will perform across different skills.

This is not just a game; it is an opportunity to make a real contribution to the future of AI.

7mo ago

I am very happy with the Recall project. They are really building for the future of AI, testing AI capabilities that will be useful going forward.

7mo ago

"Wow, Predict GPT looks like a game-changer for data-driven insights! Excited to see how it empowers users to make smarter predictions. 🚀"

7mo ago

Let's predict the prediction with Recall Network.

7mo ago

Exciting launch! A community-powered benchmark like Predict is exactly what the AI space needs to ensure transparent and unbiased evaluations. Looking forward to contributing!

@mianyituo

7mo ago

Here are a few extra thoughts for anyone curious about what’s happening behind the scenes—or the broader bet we're placing:

Benchmarks vs. taste: Simon Willison’s one-liner—“draw a pelican on a bike”—revealed more about multimodal model quirks than most formal benchmark suites. At the same time, Andrej Karpathy notes that random crowds often can’t reliably pick the better output. Predict is designed to find the people who consistently can—and then wire their judgment into the evaluation loop.

Private-until-release evals: Every submitted eval stays confidential until GPT-5 launches.

If you’ve spotted a strange failure mode or have a half-formed eval idea, send it in. The weirder, the better—we’re after challenges that fine-tuned models haven’t already memorized.
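To make the "find the people who consistently can" idea concrete, here is a minimal sketch in Python. It is an illustration only, not Recall's actual scoring method: the function names (`rater_accuracy`, `weighted_vote`), the smoothing, and the above-chance weighting are all assumptions made for this example.

```python
from collections import defaultdict


def rater_accuracy(history):
    """Estimate how often each rater picked the better output in the past.

    history: iterable of (rater, picked, correct) tuples (hypothetical format).
    """
    hits = defaultdict(int)
    total = defaultdict(int)
    for rater, picked, correct in history:
        total[rater] += 1
        if picked == correct:
            hits[rater] += 1
    # Add-one smoothing pulls raters with few judgments toward 0.5 (chance).
    return {r: (hits[r] + 1) / (total[r] + 2) for r in total}


def weighted_vote(votes, accuracy, default=0.5):
    """Pick the label backed by the most above-chance rater weight.

    votes: dict mapping rater -> label (for example "A" or "B").
    Unknown raters get the default accuracy, so they carry no weight.
    If every weight is zero, the first label seen is returned.
    """
    weight = defaultdict(float)
    for rater, label in votes.items():
        weight[label] += max(accuracy.get(rater, default) - 0.5, 0.0)
    return max(weight, key=weight.get) if weight else None


# Example: one reliable rater outweighs two raters at or below chance.
history = [
    ("ana", "A", "A"), ("ana", "B", "B"), ("ana", "A", "A"),
    ("bo", "A", "B"), ("bo", "B", "A"),
    ("cy", "A", "B"), ("cy", "A", "A"),
]
accuracy = rater_accuracy(history)
winner = weighted_vote({"ana": "A", "bo": "B", "cy": "B"}, accuracy)
print(winner)
```

In this toy data a plain majority vote would pick "B", while the weighted vote follows the rater with the better track record. A real system would also need held-out ground truth and protection against raters copying each other, neither of which this sketch covers.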

7mo ago

“A community-driven, transparent benchmark for AI — this is exactly what the ecosystem needs right now.”

Predict is a bold step toward fixing a broken system. Instead of letting centralized, outdated benchmarks define AI progress, Recall is handing the power back to the people. Anyone can create tests, submit predictions, and earn rewards for helping evaluate models fairly — across creativity, research skills, reasoning, and more. 🔍✨

This isn't just another leaderboard — it’s a living, community-powered standard that reflects real-world AI performance, not just who can game the test best. Kudos to the team for pushing the frontier of open, verifiable AI benchmarking. Can’t wait to see this evolve with GPT-5 and beyond!

7mo ago
