We asked 5 AI models the same 1,000 questions. How often do you think they agreed?

We built a model to generate 1,000 questions that people actually ask.
Not random prompts.
We scraped 50,000 real user queries from search logs, forum threads, and support tickets across 12 industries.
We clustered them by intent and generated 1,000 representative questions.

We asked those same 1,000 questions to 5 AI models: ChatGPT (GPT-4), Gemini (Ultra), Perplexity (Pro), Claude (4.5 Sonnet), and Llama (3).
We ran the experiment daily for 30 days. We tracked every citation at the source level.

The goal: measure citation overlap.
How often do these models cite the same source for the same question?
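
One way to put a number on "same source for the same question" is sketched below. This is only an illustration, not the scoring used in the experiment: it assumes a "source" means the cited URL's host name and that overlap is scored as Jaccard similarity between two models' source sets. The function names, model labels, and URLs are all made up for the example.

```python
from itertools import combinations
from urllib.parse import urlparse


def source_of(url: str) -> str:
    """Reduce a cited URL to its host name, dropping a leading 'www.'.

    Assumption: 'source level' means the host, not the full page URL.
    """
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def jaccard(a: set, b: set) -> float:
    """Shared sources divided by all sources cited by either model."""
    if not a and not b:
        return 0.0  # neither model cited anything, so count it as no overlap
    return len(a & b) / len(a | b)


def pairwise_overlap(citations: dict) -> dict:
    """Score every pair of models for one question.

    `citations` maps a model label to the list of URLs it cited.
    """
    sources = {
        model: {source_of(url) for url in urls}
        for model, urls in citations.items()
    }
    return {
        (m1, m2): jaccard(sources[m1], sources[m2])
        for m1, m2 in combinations(sorted(sources), 2)
    }


# Hypothetical citations for a single question (placeholder data).
example = {
    "model_a": ["https://www.example.com/guide", "https://example.org/faq"],
    "model_b": ["https://example.com/pricing"],
    "model_c": ["https://example.net/blog"],
}
overlap = pairwise_overlap(example)
```

Here model_a and model_b share one source (example.com) out of two distinct sources between them, so that pair scores 0.5. The other two pairs share nothing and score 0.0. Averaging these pair scores across all questions and days would give one overall agreement figure.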

The dataset: