How Confident Are You, ChatGPT?
2 points by aylinakkus | 8/9/2025, 10:48:10 AM | 1 comment | aylinakkus.github.io
Comments (1)
martianlantern · 10h ago
Very insightful post. This may work in the IMO setting because mathematical problems are inherently binary, if we ignore some things like the incompleteness theorem. In contrast, subjective tasks, such as evaluating a painting or rating a poem, lack absolute truth. How would such reasoners estimate confidence in these cases, and to what extent could the RL techniques that are effective in the IMO transfer to real-world problems?