Reasoning LLMs Guide for Devs

3 omarsar 1 6/9/2025, 2:26:09 PM promptingguide.ai ↗

Comments (1)

incomingpain · 5h ago
The ChatGPT, Perplexity, and Grok websites all rapidly answered correctly, as expected.

Claude 3.5 Sonnet got it wrong. It came up with 4227.

Claude 3.7 Sonnet got it right with 5117.

Claude 4 Sonnet also got it wrong, with 4227.

Gemini 2.5 Pro got it wrong with 4227.

GPT-4o got it both right and wrong, answering both 4227 and 5117? It found the right answer on the internet, but its code was wrong. Interesting, because my free ChatGPT in the first line is o4-mini and it got it right?

DeepSeek R1 Qwen3 8B super failed.

Qwen2.5 Coder 7B Instruct came up with some code that runs correctly and gives me 5117, but the answer it stated was 4291?