We should remember that, just as students can become ever more proficient at feigning understanding, examiners must stay focused on judging whether the understanding - the actual performance indicator, the thing being sought - is really there, beneath any alluring surface of presentation.
All the hints we get from LLM use keep bringing us back to the same question: "but can it do procedural reasoning - is it reliable?"
Progress in the direction of "being ever more convincing" is a bad direction - detrimental, not progressive - when what actually counts is "having the juice".
> Is this ...? ... they all vary dramatically in problems
If the above had been achieved, it would be an architectural revolution, and we would have been informed. If it is "more of the same, but more advanced", then the submission shows a structural problem.
paul7986 · 4h ago
I chat with it when I drive to get things done - like having GPT create the best route for a New Mexico trip (El Paso to Carlsbad to Roswell to White Sands to Truth or Consequences to Albuquerque) and create an image of it showing the driving distance between each stop. Otherwise I'd have to slog through Apple or Google Maps to do this .. GPT should do it for me, and much quicker, via an image I can then share with travel friends. Some nerd rage today as I discovered it can't do this ATM. Hopefully they are working on integrating Bing Maps so I can use it for travel and use Apple and Google Maps less.
It can't do that yet, but it does some cool things depending on what you try with it. Recently, in an Icelandic restaurant, I took a pic of the menu, uploaded it to GPT, and asked it to create a mirror image of the menu but in English and US dollars (not Icelandic krona). That was very handy, as I then shared it with my travel friends in the restaurant and those back at the hotel.
Overall I love hearing how people are using it in unique ways too! I used it to count my calories for a year, since I eat out daily at healthy chains (Cava and others), and GPT can easily grab the calorie counts from their sites and do the math.
Is this 4o-mini, 4o, o4-mini, o4-mini-high, o3 (etc)?
However strange their naming, these models vary dramatically on problems like the one presented in the article.
Knowing which one the experience was with is critical to drawing any kind of conclusion.
o3, for example, nailed it on the first try: https://chatgpt.com/share/68231cfc-d258-8013-aad2-5115eba880...