WildChat-4.8M: 4.8M Real User–ChatGPT Conversations (Open Dataset)

2 points · yuntian · 1 comment · 8/11/2025, 8:42:29 PM · huggingface.co

Comments (1)

yuntian · 2h ago
We just released 4.8 million real user–ChatGPT conversations collected from our public chatbot.

- Covers a wide range of topics and languages, all from actual users in the wild.

- Includes 122K conversations from reasoning models (o1-preview and o1-mini). These are long, often involve complex problem solving, and are very costly to collect.

- Includes 2.5M conversations from GPT-4o.

Links:

- Non-toxic version: https://hf.co/datasets/allenai/WildChat-4.8M

- Full version (gated): https://hf.co/datasets/allenai/WildChat-4.8M-Full

- Exploration tool: https://wildvisualizer.com
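
For anyone who wants to look at the data without downloading all of it, here is a minimal sketch that streams the non-toxic version with the Hugging Face `datasets` library and tallies conversations per model. The `train` split name and the `model` field name are assumptions carried over from earlier WildChat releases, and `count_by_model` and `stream_wildchat` are hypothetical helper names; check the dataset card before relying on any of them.

```python
from collections import Counter
from itertools import islice
from typing import Iterable, Mapping


def count_by_model(records: Iterable[Mapping], limit: int = 1000) -> Counter:
    """Tally the `model` field over the first `limit` records.

    Assumption: each record is a mapping with a `model` key; records
    without one are counted under "unknown".
    """
    return Counter(rec.get("model", "unknown") for rec in islice(records, limit))


def stream_wildchat():
    """Return a streaming iterator over the non-toxic release.

    Requires `pip install datasets`. The split name "train" is an assumption.
    """
    from datasets import load_dataset

    return load_dataset("allenai/WildChat-4.8M", split="train", streaming=True)
```

Calling `count_by_model(stream_wildchat(), limit=1000).most_common(5)` would then show which models dominate the first thousand rows. Streaming keeps memory use flat, but the first rows are not a random sample, so the counts only give a rough impression of the mix.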