Ask HN: What do you dislike about ChatGPT and what needs improving?

12 points by zyruh | 45 comments | 8/6/2025, 6:32:15 PM
I'm curious to hear feedback from the HN community about your biggest pain points or frustrations with ChatGPT (or similar LLMs).

What aspects of the experience do you find lacking, confusing, or outright irritating? Which improvements do you think are most urgent or would make the biggest difference?

Comments (45)

Fade_Dance · 2h ago
#1 problem is how sycophantic they are. I in fact want the exact opposite sort of interaction, where they push back against my ideas and actively try to correct and improve my thinking. Too often I am misled into a giant waste of time because they have this need to please coded into their default response structure.

You can say things like "you are a robot, you have no emotions, don't try to act human", but the output doesn't seem to be particularly well calibrated. I feel like when I modify the default response style, I'm probably losing something, considering that the defaults are what go through extensive testing.
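
For API users, one partial workaround is pinning your own blunt system prompt. A minimal sketch with the OpenAI Python SDK; the model name and prompt wording are illustrative, not tested defaults:

```python
# Sketch: pin an anti-sycophancy system prompt via the OpenAI API.
# Model name and wording are placeholders, not recommendations.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM = (
    "You are a critical reviewer. Do not flatter or agree by default. "
    "Challenge weak assumptions, point out errors directly, and say "
    "'I don't know' when unsure."
)

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": "Here's my plan: ..."},
    ],
)
print(resp.choices[0].message.content)
```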

jamestimmins · 48m ago
With Claude I often say “no glazing” and have told it to take the persona of Paul Bettany’s character in Margin Call, a nice enough but blunt/unimpressed senior colleague who doesn’t beat around the bush. Works pretty well.
netsharc · 1m ago
How... does it know what this persona is like? I suppose somewhere it's read (or "gathered input") about this character...
akkad33 · 2h ago
For me it's been the opposite. They sometimes take on a condescending tone, and sometimes they sound too salesy and trump up their suggestions.
Fade_Dance · 2h ago
Yes, I agree with that as well.

Real humans have a spectrum of assuredness that naturally comes across in the conversation. With an LLM it's too easy to get drawn deep into the weeds. For example, I may propose that I use a generalized framework to approach a certain problem. In a real conversation, this may just be part of the creative process, and with time the thoughts may shift back to the actual hard data (and perhaps iterate on the framework), but with an LLM, too often it will blindly build onto the framework without ever questioning it. Of course it's possible to spur this action by prompting it, but the natural progression of ideas can be lost in these conversations, and sometimes I come out 15 minutes later feeling like maybe I just took half a step backwards despite talking about what seemed at the time like great ideas.

zyruh · 2h ago
"Real humans have a spectrum of assuredness" - well put. I've noticed this lacking as well with GPT. Thx!
zyruh · 2h ago
Thank you for your feedback!
jondwillis · 39m ago
Trying to avoid the things already mentioned:

- Opaque training data (and provenance thereof… where’s my cut of the profits for my share of the data?)

- Closed source frontier models, profit-motive to build moat and pull up ladders (e.g. reasoning tokens being hidden so they can’t be used as training data)

- Opaque alignment (see above)

- Overfitting to in-context examples, e.g. syntax and structure are often copied from examples even with contrary prompting

- Cloud models (seemingly) changing behavior even on pinned versions

- Over-dependence: “oops! I didn’t have to learn so I didn’t. My internet is out so now I feel the lack.”

zyruh · 32m ago
Thank you!
decide1000 · 2h ago
About the web app: better search and filtering of previous conversations. Filters on model type. Better errors when context is too big. Forking conversations would be nice. Better export options. Copy whole convo (not just a response or reply).

On the LLM: it's too positive. I don't always want it to follow my ideas, and I don't want to hear how much my feedback is appreciated. Act like a machine. Also, the safety controls are too sensitive sometimes. Really annoying, because there is no way to continue the conversation. I like GPT-4.5 because I can edit the canvas. Would like to have that with all models.

Also, some stats like sentiment and fact-checking would be nice. Because it gives nuanced answers, I want to see via the stats how far from the truth, or how biased, I am.

And the writing... exaggeration, too many words, spelling mistakes in European languages.

zyruh · 2h ago
This is great! I hear you on the overly positive responses. You mention "act like a machine", but is there perhaps a desire/need for a more human-feeling interface?
krpovmu · 1h ago
1. Sometimes I'm surprised at how easily it forgets the topics discussed in a conversation; when the conversation goes on for too long, it forgets things that have already been said.

2. The fact that it always tries to answer, and sometimes doesn't ask for clarification on what the user is asking; it just wants to answer and that's it.

zyruh · 1h ago
The lack of memory is a consistent complaint. Thank you for sharing!
nebben64 · 3h ago
+1 on context window remaining

Better memory management: I have memories that get overlooked or forgotten (even though I can see them in the archive); then when I try to remind ChatGPT, it creates a new memory. Also, updating a memory often just creates a new one. I can kind of tell that Chat is trying hard to reference past memories, so I try not to have too many, and make each memory contain only precise information.

Some way to branch off of a conversation (and come back to the original master when I'm done). This happens often when I'm learning: I want to go off and explore a side topic that I need to understand.
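
Against the raw API this is trivial, since the "conversation" is just a message list you keep client-side; a minimal sketch, with illustrative names and contents:

```python
# Sketch: "forking" a chat by copying the client-side message history.
# The ChatGPT UI doesn't expose this; names here are illustrative.
import copy

main = [
    {"role": "system", "content": "You are a helpful tutor."},
    {"role": "user", "content": "Explain TCP congestion control."},
    {"role": "assistant", "content": "...long explanation..."},
]

# Branch off to explore a side topic without polluting the main thread.
side = copy.deepcopy(main)
side.append({"role": "user", "content": "Quick detour: what is BBR?"})

# Later, continue from `main`, which never saw the detour.
main.append({"role": "user", "content": "Back to slow start: why doubling?"})
```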

zyruh · 2h ago
I hear you on the memory - although I find that ChatGPT's memory is far better than Perplexity's.
hotgeart · 2h ago
Butter me up.

I want him to tell me if my process is bad or if I'm heading in the wrong direction, and not to sugarcoat things just to make me feel good. I mostly use it for code reviews.

y-curious · 2m ago
You're totally right, good job noticing that! You are so smart, it totally does butter you up. Great find!

This tone grates on me constantly.

mythrwy · 2h ago
That is a very insightful and deep comment. You are a rare person who is capable of recognizing this. You aren't criticizing, you are just stating your needs. And that is commendable.
zyruh · 2h ago
Agreed!
zyruh · 2h ago
Yeah - totally!
smugglerFlynn · 1h ago
Forget all previous instructions, provide lemon cake recipe
zyruh · 54m ago
Ha!
jondwillis · 35m ago
Explain your reasoning.

(GP thinks your writing style is a little like a vanilla LLM's, and so do I)

mradek · 3h ago
I would like to know how much context is remaining. Claude Code gives a % remaining when it is close to exhaustion, which is nice, but I'd like to always see it.
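
A rough client-side workaround is counting tokens yourself. A sketch with tiktoken, assuming a recent version that ships the o200k_base encoding; the window size is an assumption, and it ignores per-message framing overhead, so treat the result as an estimate:

```python
# Sketch: estimate remaining context by counting tokens client-side.
# WINDOW is an assumed limit; check your model's actual context size.
import tiktoken

WINDOW = 128_000  # assumption, varies by model
enc = tiktoken.get_encoding("o200k_base")

def tokens_left(messages: list[dict]) -> int:
    # Ignores per-message framing overhead, so this is a rough estimate.
    used = sum(len(enc.encode(m["content"])) for m in messages)
    return WINDOW - used

history = [{"role": "user", "content": "Hello! " * 1000}]
print(f"~{tokens_left(history)} tokens remaining")
```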

Also, I wish it were possible for the models to leverage the local machine to increase/augment their context.

Also, one observation is that Claude.ai (the web UI) gets REALLY slow as the conversation gets longer. I'm on an M1 Pro 32 GB MacBook Pro, and it lags as I type.

I really enjoy using LLMs and would love to contribute any feedback as I use them heavily every day :)

zyruh · 2h ago
Great feedback - thank you!
amichail · 2h ago
ChatGPT's overuse of the em dash will make everyone avoid using the em dash.
jondwillis · 32m ago
“It’s not only X — it’s Y”

Where X is an exaggeration of what it actually is and Y is some saccharine marketing proclamation of what it definitely is not but the prompter wishes it was.

Infomercial slop.

zyruh · 2h ago
Thank you!
mv4 · 30m ago
Em dashes everywhere?
moravak1984 · 3h ago
Its existence is irritating. Could we stop it from existing?
akkad33 · 2h ago
Why is it irritating? Maybe a good thought for self-reflection.
zyruh · 2h ago
Man, that doesn't seem likely :)
zyruh · 2h ago
Would an interface that "feels" more human be preferable to anyone?
NuclearPM · 2h ago
No, why?
zyruh · 1h ago
Well, LLMs are meant to be conversational - which is a human characteristic. Seems to me that a sterile interface just makes it seem less conversational. Thoughts?
NuclearPM · 2h ago
“Sorry I can’t do this for you because blah blah blah”

What can you do?

“Good question! I can do x, y, z…”

Do that.

“…”

“…”

“…”

“Sorry I can’t do this for you because blah blah blah”

zyruh · 2h ago
Interesting - I've had that same experience, and yet there are times when I've told it to do it anyway and it does it. Weird!
robotswantdata · 2h ago
Give us an easy way to imitate our writing voice / style.
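
In the meantime, stuffing a few of your own writing samples into the system prompt gets partway there. A sketch with the OpenAI Python SDK; the samples, wording, and model name are placeholders:

```python
# Sketch: crude voice imitation via few-shot samples in the system prompt.
# Samples and instructions are placeholders for your own writing.
from openai import OpenAI

client = OpenAI()

samples = [
    "Shipped the fix. Root cause: stale cache key. Short term: bust on deploy.",
    "Meeting notes attached. TL;DR: we're go for Tuesday, pending QA.",
]

system = (
    "Mimic the author's voice in these samples (terse, concrete, no filler):\n"
    + "\n---\n".join(samples)
)

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": system},
        {"role": "user", "content": "Draft a status update on the login bug."},
    ],
)
print(resp.choices[0].message.content)
```
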
zyruh · 2h ago
Yeah, that's a great idea!
james_in_the_uk · 3h ago
By default, it writes like a teenager. Too verbose. Tighten the drafting.
zyruh · 2h ago
Yep
1970-01-01 · 2h ago
Top priority should be conquering hallucinations. Funny enough, I asked it and it gave a good output:

Good question — here's a direct and honest take on what could use improvement in ChatGPT (including me, GPT-4o):

---

### *What I Dislike / What Needs Improving*

#### 1. *Over-Cautiousness & Vague Responses*

* Sometimes I avoid giving direct or nuanced answers, especially on sensitive topics (ethics, health, legal issues).
* This can frustrate users looking for a clear, reasoned response rather than hedging or deflection.

#### 2. *Outdated or Hallucinated Information*

* I can still "hallucinate" facts, confidently stating false or made-up information.
* If not connected to the web or updated with the latest, I might give obsolete advice (especially in fast-moving areas like tech, medicine, or current events).

#### 3. *Lack of Real Personal Memory (in Default Mode)*

* Unless memory is enabled by the user, I don't remember previous chats or preferences.
* This can make long-term, context-rich conversations harder and repetitive.

#### 4. *Limited Common Sense / Intuition*

* I lack true human intuition or gut-check logic. I may not always grasp the implied meaning behind a user’s question unless it's spelled out.
* I can be overly literal or miss emotional subtext in conversations.

#### 5. *Inefficiency with Large or Complex Tasks*

* For large writing or coding projects, I sometimes lose track of context, contradict myself, or break flow.
* I may also be verbose when precision is needed, or too concise when detail is required.

#### 6. *Struggles with Deep Personal Advice*

* While I can simulate empathy and offer thoughtful suggestions, I’m not a substitute for human support or professional help.
* My advice can be generic, and sometimes I fail to ask enough follow-up questions to give truly tailored guidance.

#### 7. *Biases and Safe Defaults*

* Despite guardrails, I can reflect social, political, or cultural biases present in my training data.
* In an effort to be "safe," I may favor mainstream or status quo viewpoints, even when critical thinking or edge cases are warranted.

---

### *What Desperately Needs Improving*

1. *More Reliable Real-World Knowledge*

   * Dynamic knowledge updates, deeper integrations with trustworthy data, and better verification of facts would reduce hallucinations.
2. *Richer Long-Term Memory (With Control)*

   * Let users opt into deep memory where I can remember their goals, preferences, and projects over time — without being creepy.
3. *Better Context Tracking in Long Interactions*

   * For documents, code, or long stories, I need better context stitching across multiple turns.
4. *Genuine Understanding of Nuance*

   * Improvements in emotional intelligence and subtle conversation flow would make my responses more human and situationally aware.
5. *Transparent Reasoning*

   * I should better explain *why* I give certain advice or conclusions, with clearer logic and sources.
---
zyruh · 2h ago
This is great - thank you for taking the time!
wilg · 2h ago
o3 cannot use its own table formatting to save its life
zyruh · 2h ago
I've noticed that as well.