Evaluating GPT5's reasoning ability using the Only Connect game show
1 scrollaway 1 8/12/2025, 1:52:51 PM ingram.tech
Insights:
- GPT-5 does extremely well, but only marginally better than o3.
- Model verbosity has little impact on accuracy and cleverness, except, interestingly, in the sequences round. "Minimal" verbosity, however, causes accuracy to drop sharply.
We'll be publishing additional results from our extended tests in the coming days. We're looking at different types of evals (for example, how the models fare when shown a single item in a sequence vs. 2, 3, or 4). We would also like to look at how the models behave in a team of 3, replicating the format of the game show.
We were unable to find evidence that the Only Connect games are in the training materials (which of course is likely to change now). Finally, we are looking at replicating the connecting wall results with the New York Times' Connections; however, we suspect those puzzles are in the training materials, which would skew the results.