How Confident Are You, ChatGPT?

1 aylinakkus 1 8/9/2025, 10:48:10 AM aylinakkus.github.io ↗

Comments (1)

martianlantern · 1h ago
Very insightful post. This may work in the IMO setting because mathematical problems are inherently binary, if we ignore some things like the incompleteness theorem. In contrast, subjective tasks, such as evaluating a painting or rating a poem, lack absolute truth. How would such reasoners estimate confidence in these cases, and to what extent could RL techniques that are effective on the IMO transfer to real-world problems?