
Even if GPT-3.5 was noticeably worse for any of these questions, it's honestly more interesting for someone's first experience to be with the exaggerated shortcomings of AI. The slightly-screwy answers are still emblematic of what you see today, so it all ended well enough, I think. It would've been a terribly boring exchange if Knuth's reply was just "looks great, thanks for asking ChatGPT" with no challenging commentary.

vbezhenar · 1h ago

For question 3, ChatGPT 5 Pro gave a better answer:

> It isn’t “wrong.” Wolfram defines Binomial[n,m] at negative integers by a symmetric limiting rule that enforces Binomial[n,m] = Binomial[n,n−m]. With n = −1, m = −1 this forces Binomial[−1,−1] = Binomial[−1,0] = 1. The gamma-formula has poles at nonpositive integers, so values there depend on which limit you adopt. Wolfram chooses the symmetry-preserving limit; it breaks Pascal’s identity at a few points but keeps symmetry. If you want the convention that preserves Pascal’s rule and makes all cases with both arguments negative zero, use PascalBinomial[−1,−1] = 0. Wolfram added this explicitly to support that alternative definition.
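The conflict with Pascal's identity can be checked by hand at n = 0, k = 0, where the rule Binomial[n,k] = Binomial[n−1,k−1] + Binomial[n−1,k] needs Binomial[−1,−1]. A minimal sketch in Python; the table values are taken from the quoted answer (plus the standard Binomial[0,0] = Binomial[−1,0] = 1, assumed here to hold for PascalBinomial as well), not computed by Wolfram, and `pascal_rule_holds` is a hypothetical helper:

```python
# Values as stated in the quoted answer; nothing here calls Wolfram.
# Keys are (n, k) pairs. The PascalBinomial entries other than (-1, -1)
# are assumed to match the ordinary binomial values.
WOLFRAM_BINOMIAL = {(-1, -1): 1, (-1, 0): 1, (0, 0): 1}
PASCAL_BINOMIAL = {(-1, -1): 0, (-1, 0): 1, (0, 0): 1}


def pascal_rule_holds(table, n, k):
    """Check C(n, k) == C(n-1, k-1) + C(n-1, k) using the given table."""
    return table[(n, k)] == table[(n - 1, k - 1)] + table[(n - 1, k)]


# Symmetric convention: 1 == 1 + 1 fails, so Pascal's rule breaks at (0, 0).
print(pascal_rule_holds(WOLFRAM_BINOMIAL, 0, 0))  # False
# Pascal-preserving convention: 1 == 0 + 1 holds.
print(pascal_rule_holds(PASCAL_BINOMIAL, 0, 0))   # True
```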

Of course this particular question might have been in the training set.

Honestly, 2.5 years feels like an eternity when it comes to AI development. I'm using ChatGPT very regularly, and while it's far from perfect, recently it gave obviously wrong answers very rarely. I can't say much about ChatGPT 5. I feel like I've reached my limit in my conversations with AI, so I'd hardly notice AI getting smarter, because it's already smart enough for my questions.

seanhunter · 46m ago

On Wolfram specifically, GPT-5 is a huge step up from GPT-4. One of the first things I asked it was to write me a Mathematica program to test the basic properties (injectivity, surjectivity, bijectivity) of various functions. The notebook it produced was

1) 100% correct

2) Really useful (i.e., it includes various things I didn’t ask for but that are really great, like a little manipulator to walk through the function at various points and visualize what the mapping is doing)

3) Built in a general way so I can easily change the mapping to explore different types of functions and how they work.

It seems very clear (both from what they said in the launch demos etc. and from my experience of trying it out) that performance on coding tasks has been an area of massive focus, and the results are pretty evident to me.
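For readers without Mathematica, the three properties mentioned above can be tested on finite sets in a few lines. This is only a rough Python sketch, not the notebook described in the comment; the function names and the example mappings are hypothetical:

```python
def is_injective(f, domain):
    """True if no two domain elements map to the same image."""
    images = [f(x) for x in domain]
    return len(set(images)) == len(images)


def is_surjective(f, domain, codomain):
    """True if every codomain element is hit by some domain element."""
    return set(codomain) <= {f(x) for x in domain}


def is_bijective(f, domain, codomain):
    """True if f is both injective and surjective on the given sets."""
    return is_injective(f, domain) and is_surjective(f, domain, codomain)


# Squaring on -3..3 collapses x and -x, so it is not injective,
# but it does cover the codomain {0, 1, 4, 9}.
square_domain = list(range(-3, 4))
print(is_injective(lambda x: x * x, square_domain))                 # False
print(is_surjective(lambda x: x * x, square_domain, [0, 1, 4, 9]))  # True

# Shifting 0..4 by one onto 1..5 is a bijection.
print(is_bijective(lambda x: x + 1, list(range(5)), list(range(1, 6))))  # True
```

Swapping in a different lambda and domain is enough to explore other mappings.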

godelski · 5m ago

  > gave *obviously wrong* answers very rarely.

I don't think this is a reason I'd trust it; actually, this is a reason I don't trust it.

There's a big difference between "obviously wrong" and "wrong". What counts as obvious is not objective; it depends entirely on the reader/user.

The problem is it optimizes deception alongside accuracy. It's a useful tool but good design says we should want to make errors loud and apparent. That's because we want tools to complement us, to make us better. But if errors are subtle, nuanced, or just difficult to notice then there is actually a lot of danger to the tool (true for any tool).

I'm reminded of the Murray Gell-Mann Amnesia effect: you read something in the newspaper on a topic you're an expert in and lambast it for its inaccuracies, but then turn the page to something you don't have domain knowledge of and trust it.

The reason I bring up MGA is that we don't often ask GPT about things we know well or have deep knowledge of. But doing so is a good way to learn how much we should trust it. Pretend to know nothing about a topic you are an expert in. Are its answers good enough? If not, then be careful when asking questions you can't verify.

Or, I guess... just ask it to solve "5.9 = x + 5.11"
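For reference, the expected answer is easy to verify: x = 5.9 − 5.11 = 5.90 − 5.11 = 0.79. A minimal sketch using Python's `decimal` module to avoid binary floating-point noise; `solve_for_x` is a hypothetical helper name:

```python
from decimal import Decimal


def solve_for_x(total: str, addend: str) -> Decimal:
    """Solve total = x + addend for x using exact decimal arithmetic."""
    return Decimal(total) - Decimal(addend)


x = solve_for_x("5.9", "5.11")
print(x)  # 0.79

# Sanity check: substituting x back reproduces the left-hand side.
assert x + Decimal("5.11") == Decimal("5.9")
```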

tra3 · 43m ago

Right, I’m still trying to wrap my mind around how GPTs work.

If we keep retraining them on the currently available datasets, then the questions that stumped ChatGPT 3 are in the training set for ChatGPT 5.

I don’t have the background to understand the functional changes between ChatGPT 3 and 5. It can’t be just the training data, can it?

rvba · 37m ago

What is with these reposts?

Someone could at least run the same questions on the latest model and show the new answers.

Farming karma, Reddit style...

wslh · 1h ago

It would be great to have an update from Knuth. There is no other Knuth.