Why agents are bad pair programmers

124 sh_tomer 83 6/9/2025, 11:36:23 PM justin.searls.co ↗

Comments (83)

khendron · 6h ago

When I first tried an LLM agent, I was hoping for an interactive, 2-way, pair collaboration. Instead, what I got was a pairing partner who wanted to do everything themselves. I couldn't even tweak the code they had written, because it would mess up their context.

I want a pairing partner where I can write a little, they write a little, I write a little, they write a little. You know, an actual collaboration.

icedchai · 4h ago

Have you tried recently? This hasn't been my experience. I modify the code it's written, then ask it to reread the file. It generally responds "I see you changed file and [something.]" Or when it makes a change, I tell it I need to run some tests. I provide feedback, explain the problem, and it iterates. This is with Zed and Claude Sonnet.

dkersten · 38m ago

I do notice though that if I edit what it wrote before accepting it, and then it sees it (either because I didn’t wait for it to finish or because I send it another message), it will overwrite my changes with what it had before my changes every single time, without fail.

(Zed with Claude 4)

psadri · 5h ago

I usually add “discuss first. Don’t modify code yet”. Then we do some back and forth. And finally, “apply”.

dragonfax · 4h ago

Claude Code has "plan mode" for this now. It enforces this behavior. But its still poorly documented.

psadri · 4h ago

They should add a “cmd-enter” for ask, and “enter” to go.

Separately, if I were at cursor (or any other company for that matter), I’d have the AI scouring HN comments for “I wish x did y” suggestions.

carpo · 4h ago

Same. I use /ask in Aider so I can read what it's planning, ask follow-up questions, get it to change things, then after a few iterations I can type "Make it so" while sitting back to sip on my Earl Grey.

tomkwong · 1h ago

I had done something slightly different. I would ask LLM to prepare a design doc, not code, and iterate on that doc before I ask them to start coding. That seems to have worked a little better as it’s less likely to go rogue.

tobyhinloopen · 3h ago

You can totally do that. Just tell it to.

If you want an LLM to do something, you have to explain it. Keep a few prompt docs around to load every conversation.

haneul · 1h ago

Hmm you can tweak fine these days without messing up context. But, I run in “ask mode” only, with opus in claude code and o3 max in cursor. I specifically avoid agent mode because, like in the post, I feel like I gain less over time.

I infrequently tab complete. I type out 80-90% of what is suggested, with some modifications. It does help I can maintain 170 wpm indefinitely on the low-medium end.

Keeping up with the output isn’t much an issue at the moment given the limited typing speed of opus and o3 max. Having gained more familiarity with the workflow, the reading feels easier. Felt too fast at first for sure.

My hot take is that if GitHub copilot is your window into llms, you’re getting the motel experience.

catlifeonmars · 1h ago

> My hot take is that if GitHub copilot is your window into llms, you’re getting the motel experience.

I’ve long suspected this; I lean heavily on tab completion from copilot to speed up my coding. Unsurprisingly, it fails to read my mind a large portion of the time.

Thing is, mind reading tab completion is what I actually want in my tooling. It is easier for me to communicate via code rather than prose, and I find the experience of pausing and using natural language to be jarring and distracting.

Writing the code feels like a much more direct form of communicating my intent (in this case to the compiler/interpreter). Maybe I’m just weird; and to be honest I’m afraid to give up my “code first” communication style for programming.

Edit: I think the reason why I find the conversational approach so difficult is that I tend to think as I code. I have fairly strong ADHD and coding gives me appropriate amount of stimulation to do design work.

mock-possum · 4h ago

In all honesty - have you tried doing what you would do with a paired programmer - that is, talk to them about it? Communicate? I’ve never had trouble getting cursor or copilot to chat with me about solutions first before making changes, and usually they’ll notice if I make my own changes and say “oh, I see you already added XYZ, I’ll go ahead and move on to the next part.”

lomase · 2h ago

I’ve never had trouble getting cursor or copilot to chat with me about solutions first before making changes

Never had any trouble.... and then they lived together happy forever.

rhizome31 · 23m ago

As a developer who doesn't use AI for coding, except for the occasional non-project specific question to a chat bot, I am wondering if you use it for client projects or only for your own projects. If you do use it for client projects, do you have some kind of agreement that you're going to share their code with a third-party? I'm asking because most clients will make you sign a contract saying that you shouldn't disclose any information about the project to a third-party. I even once had a client who explicitly stated that AI should not be used. Do you find clients willing to make an exception for AI coding agents?

bluefirebrand · 7h ago

Pair programming is also not suitable for all cases

Maybe not for many cases

I mentioned this elsewhere but I find it absolutely impossible to get into a good programming flow anymore while the LLM constantly interrupts me with suggested autocompletes that I have to stop, read, review, and accept/reject

It's been miserable trying to incorporate this into my workflow

meesles · 7h ago

Second this. My solution is to have a 'non-AI' IDE and then a Cursor/VS Code to switch between. Deep work cannot be achieved by chatting with the coding bots, sorry.

gen220 · 7h ago

You should try aider! This is my workflow, essentially

NicoSchwandner · 3h ago

I do this as well and it works quite well for me like that!

Additionally, when working on microservices and on issues that don’t seem too straightforward, I use o3 and copy the whole code of the repo into the prompt and refine a plan there and then paste it as a prompt into cursor. Handy if you don’t have MAX mode, but a company-sponsored ChatGPT.

vasusen · 2h ago

I do this too by pasting only the relevant context files into O3 or Claude 4. We have an internal tool that just lets us select folders/files and spit out one giant markdown.

morkalork · 7h ago

Thirded. It was just completely distracting and I had to turn it off. I use AI but not after every keystroke, jeez.

latentsea · 7h ago

But but but... "we are an AI-first company".

Yeah, nah. Fourthed!

mdp2021 · 7h ago

> AI-first company

Does anybody introduce itself like that?

It's like when your date sends subtle signals, like kicking sleeping tramps in the street and snorting the flour over bread at the restaurant.

(The shocking thing is that the expression would even make sense when taken properly - "we have organized our workflows through AI-intelligent systems" -, while at this time it easily means the opposite.)

neilv · 5h ago

> > AI-first company

> Does anybody introduce itself like that?

Yes, I've started getting job posts sent to me that say that.

Declaring one's company "AI-first" right now is a great time-saver: I know instantly that I can disregard that company.

soulofmischief · 7h ago

> Deep work cannot be achieved by chatting with the coding bots, sorry.

...by you. Meanwhile, plenty of us have found a way to enhance our productivity during deep work. No need for the patronization.

bluefirebrand · 6h ago

I don't believe you experience deep work the same way I do then

In my mind you cannot do deep work while being interrupted constantly, and LLM agents are constant interruptions

8note · 6h ago

if youre using a computer at all, youre doing it wrong. deep work can only be done from the forest with no internet reception, pencil and paper

thallada · 4h ago

Everyone knows real programmers only need to use a butterfly.

ashdksnndck · 1h ago

This sounds like an issue with the specific UI setup you are using. I have mine configured so it only starts doing stuff if I ask it to. It never interrupts me.

soulofmischief · 6h ago

You can do better than a No true Scotsman fallacy. The fact is that not everyone works the same way you do, or interacts the same way with agents. They are not constant interruptions if you use them correctly.

Essentially, this is a skill issue and you're at the first peak of the Dunning–Kruger curve, sooner ready to dismiss those with more experience in this area as being less experienced, instead of keeping an open mind and attempting to learn from those who contradict your beliefs.

You could have asked for tips since I said I've found a way to work deeply with them, but instead chose to assume that you knew better. This kind of attitude will stunt your ability to adopt these programs in the same way that many people were dismissive about personal computers or the internet and got left behind.

girvo · 6h ago

It’s quite amusing to see you complain about patronisation, and then see you turn about and do it yourself one comment later.

Timwi · 2h ago

As an observer to this conversation, I can't help but notice that both have a good point here.

Soulofmischief’s main point is that meesles made an inappropriate generalization. Meesles said that something was impossible to do, and soulofmischief pointed out that you can't really infer that it's impossible for everyone just because you couldn't find a way. This is a perfectly valid point, but it wasn't helped by soulofmischief calling the generalization “patronizing”.

Bluefirebrand pushed back on that by merely stating that their experience and intuition match those of meesles, but soulofmischief then interpreted that as implying they're not a real programmer and called it a No True Scotsman fallacy.

It went downhill from there with soulofmischief trying to reiterate their point but only doing so in terms of insults such as the Dunning-Kruger line.

girvo · 1h ago

Oh 100%. I deliberately passed no judgement on the actual main points, as my experience is quite literally in between both of theirs.

I find agent mode incredibly distracting and it does get in the way of very deep focus for implementation for myself for the work I do... but not always. It has serious value for some tasks!

soulofmischief · 4h ago

I'm open to hearing how being honest with them about their negative approach is patronizing them.

jcranmer · 4h ago

Calling someone "on the first peak of the Dunning-Kruger curve" is patronizing them.

soulofmischief · 3h ago

How would you have handled it?

Timwi · 2h ago

Here is how I might have handled it differently:

Instead of

> Meanwhile, plenty of us have found a way to enhance our productivity during deep work. No need for the patronization.

you could have written

> Personally, I found doing X does enhance my productivity during deep work.

Why it's better: 1) cuts out the confrontation (“you're being patronizing!”), 2) offers the information directly instead of merely implying that you've found it, and 3) speaks for yourself and avoids the generalization about “plenty of people”, which could be taken as a veiled insult (“you must be living as a hermit or something”).

> You can do better than a No true Scotsman fallacy.

Even if the comment were a No True Scotsman, I would not have made that fact the central thesis of this paragraph. Instead, I might have explained the error in the argument instead. Advantages: 1) you can come out clean in the case that you might be wrong about the fallacy, and 2) the commenter might appreciate the insight.

Reason you're wrong in this case: The commenter referred entirely to their own experience and made no “true programmer” assertions.

> Essentially, this is a skill issue [...] Dunning–Kruger curve [...] chose to assume that you knew better. [...]

I would have left out these entire two paragraphs. As best as I can tell, they contain only personal attacks. As a result, the reader comes away feeling like your only purpose here is to put others down. Instead, when you wrote

> You could have asked for tips

I personally would have just written out the tips. Advantage: the reader may find it useful in the best case, and even if not, at least appreciate your contribution.

kbelder · 2h ago

Civilly?

antihipocrat · 6h ago

Would be informative if both sides share what the problem domain is when providing their their experiences.

It's possible that the domain or the complexity of the problems are the deciding factor for success with AI supported programming. Statements like 'you'll be left behind' or 'it's a skill issue' are as helpful as 'It fails miserably'

Timwi · 2h ago

> You could have asked for tips since I said I've found a way to work deeply with them

How do you work deeply with them? Looking for some tips.

JumpCrisscross · 6h ago

For what it’s worth, the deepest-thinking and most profound programmers I have met—hell, thinkers in general—have a peculiar tendency to favour pen and paper. Perhaps because once their work is recognised, they are generally working with a team that can amplify them without needing to interrupt their thought flow.

soulofmischief · 4h ago

Ha, I would count myself among those if my handwriting wasn't so terrible and I didn't have bad arthritis since my youth. I still reach for pen and paper on the go or when I need to draw something out, but I've gotten more productive using an outliner on my laptop, specifically Logseq.

I think there's still room for thought augmentation via LLMs here. Years back when I used Obsidian, I created probably the first or second copilot-for-Obsidian plugin and I found it very helpful, even though GPT-3 was generally pretty awful. I still find myself in deep flow, thinking in abstract, working alongside my agent to solve deep problems in less time than I otherwise would.

No comments yet

hnthrowaway121 · 5h ago

Analysis in the last 5-10 years has shown the Dunning-Kruger effect may not really exist. So it’s a poor basis on which to be judgmental and condescending.

soulofmischief · 4h ago

> judgmental and condescending

pushing back against judgement and condescension is not judgemental and condescending.

> may not really exist

I'm open to reading over any resources you would like to provide, maybe it's "real", maybe it isn't, but I have personally both experienced and witnessed the effect in myself, other individuals and groups. It's a good heuristic for certain scenarios, even if it isn't necesarily generalizable.

Timwi · 2h ago

I would invite you to re-read some of the comments you perceived as judgement and condescension and keep an open mind. You might find that you took them as judgement and condescension unfairly.

Meanwhile, you have absolutely been judgemental and condescending yourself. If you really keep the open mind that you profess, you'll take a moment to reflect on this and not dismiss it out of hand. It does not do you any favors to blissfully assume everyone is wrong about you and obliviously continue to be judgmental and condescending.

flessner · 7h ago

I recently got a new laptop and had to setup my IDE again.

After a couple hours of coding something felt "weird" - turns out I forgot to login to GitHub Copilot and I was working without it the entire time. I felt a lot more proactive and confident as I wasn't waiting on the autocomplete.

Also, Cursor was exceptional at interrupting any kind of "flow" - who even wants their next cursor position predicted?

I'll probably keep Copilot disabled for now and stick to the agent-style tools like aider for boilerplate or redundant tasks.

baq · 3h ago

> Also, Cursor was exceptional at interrupting any kind of "flow" - who even wants their next cursor position predicted?

Me, I use this all the time. It’s actually predictable and saves lots of time when doing similar edits in a large file. It’s about as powerful as multi-line regex search and replace, except you don’t have to write the regex.

ipaddr · 6h ago

It's strange the pure llm workflow and boring. I still write most of my own code and will llms when I'm too lazy to write the next piece.

If I give it to an llm most of my time is spent debugging and reprompting. I hate fixing someone elses bug.

Plus I like the feeling of the coding flow..wind at my back. Each keystroke putting us one step closer.

The apps I made with llms I never want to go back to but the apps I made by hand piece by piece getting a chemical reaction when problems were solved are the ones I think positively about and want to go back to.

I always did math on paper or my head and never used a calculator. Its a skill I never have forgotten and I worry how many programmers won't be able to code without llms in the future.

barrenko · 2h ago

Same, I like agents or nothing in between.

johnfn · 7h ago

> who even wants their next cursor position predicted

I'm fascinated by how different workflows are. This single feature has saved me a staggering amount of time.

brianpan · 4h ago

AI "auto-complete" or "code suggestions" is the worst, especially if you are in a strongly-type language because it's 80% correct and competing with an IDE that can be 100% correct.

AI agents are much better for me because 1) they don't constantly interrupt your train of thought and 2) they can run compile, run tests, etc. to discover they are incorrect and fix it before handing the code back to you.

dinosaurdynasty · 4h ago

I love the autocomplete, honestly use it more than any other AI feature.

But I'm forced to write in Go which has a lot of boilerplate (and no, some kind of code library or whatever would not help... it's just easier to type at that point).

It's great because it helps with stuff that's too much of a hassle to talk to the AI for (just quicker to type).

I also read very fast so one line suggestions are just instant anyway (like non AI autocomplete), and longer ones I can see if it's close enough to what I was going to type anyway. And eventually it gets to the point where you just kinda know what it's going to do.

Not an amazing boost, but it does let me be lazy writing log messages and for loops and such. I think you do need to read it much faster than you can write it to be helpful though.

CraigJPerry · 1h ago

Zed has a "subtle" mode, hopefully that feature can become table stakes in all AI editor integrations

jumploops · 6h ago

I’m a Vim user and couldn’t agree more.

Didn’t like any of the AI-IDEs, but loved using LLMs for spinning up one off solutions (copy/paste).

Not to be a fan boy, but Claude Code is my new LLM workflow. It’s tough trying to get it to do everything, but works really well with a targeted task on an existing code base.

Perfect harmony of a traditional code editor (Vim) with an LLM-enhanced workflow in my experience.

amazingamazing · 7h ago

Code regularly, and use ai to get unblocked if you do so or review code for mistakes.

Or have the ai write the entire first draft for some piece and then you give it a once over, correcting it either manually or with prompts.

No comments yet

palisade · 5h ago

LLM agents don't know how to shut up and always think they're right about everything. They also lack the ability to be brief. Sometimes things can be solved with a single character or line, but no they write a full page. And, they write paragraphs of comments for even the most minuscule of changes.

They talk at you, are overbearing and arrogant.

dawnerd · 38m ago

I was trying out sonnet 4 yesterday and it spent 15 minutes changing testing changing etc just to get one config item changed. It ended up changing 40 files for no reason. Also kept trying to open a debugger that didn’t exist and load a webpage that requires auth.

They’re far from perfect that’s for sure.

energy123 · 3h ago

I expect a lot of the things people don't like ("output too long, too many comments in code") are side effects of making the LLM good in other areas.

Long output correlates with less laziness when writing code, and higher performance on benchmarks due to the monotone relationship between number of output tokens and scores. Comment spam correlates with better performance because it's locally-specific reasoning it can attend on when writing the next line of code, leading to reduced errors.

tobyhinloopen · 2h ago

Just add to the prompt not to include comments and to talk less.

I have a prompt document that includes a complete summary of the Clean Code book, which includes the rules about comments.

You do have to remind it occasionally.

Onewildgamer · 3h ago

Finally someone said it, they're overconfident in their approach, don't consult us with the details of the implementation, they're trained to create mock APIs that don't follow structure, leading to lot of rework. The LLM actions should be measured, collaborative, ask for details when it's not present. It is impossible to give every single detail in the initial prompt, and a follow up prompt derails the train thought and design of the application.

I don't know if I'm using it right, I'd love to know more if that's the case. In a way the LLM should improve on being iterative, take feedback, maybe it's a hard problem to add/update the context. I don't know about that either, but love to learn more.

NitpickLawyer · 2h ago

Most stacks now support some form of "plan" workflows. You'd want to first do this, and see if it improves your experience.

One workflow that works well for me, even with small local models, is to start a plan session with something like: "based on @file, and @docs and @examples, I'd like to _ in @path with the following requirements @module_requirements.md. Let's talk through this and make sure we have all the info before starting to code it."

Then go back and forth, make sure everything is mentioned, and when satisfied either put it into a .md file (so you can retry the coding flow later) or just say "ok do it", and go grab a cup of coffee or something.

You can also make this into a workflow with .rules files or .md files, have a snippets thing from your IDE drop this whenever you start a new task, and so on. The idea with all the advancements in LLMs is that they need lots of context if you want them to be anything other than what they were trained on. And you need to try different flows and see what works on your specific codebase. Something that works for projectA might not work for projectB ootb.

csomar · 2h ago

Also giving them more details seems to confuse them. There is probably a way around this, though. They are pretty good in finding a tiny silver of information out of the ocean. What I hate is that the industry is all geared toward the same model (chat bot). Imagine if we never invented the keyboard, mouse, GUI, touch screen, etc...

__MatrixMan__ · 4h ago

I've been considering a... protocol? for improving this. Consider this repo:

    foo.py
    bar.py
    bar.py.vibes.md

This would indicate that foo.py is human-written (or at least thoroughly reviewed by a human), while bar.py is LLM written with a lower bar of human scrutiny.

bar.py.vibes.md would contain whatever human-written guidance describes how bar should look. It could be an empty file, or a few paragraphs, or it it could contain function signatures and partially defined data types.

If an LLM wants to add a new file, it gets a vibes.md with whatever prompt motivated the addition.

Maybe some files keep their assiciated *.vibes.md forever, ready to be totally rewritten as the LLM sees fit. Maybe others stick around only until the next release, after which the associated code is reviewed and the vibes files are removed (or somehow deactivated, I could imagine it being useful for them to still be present).

What do people think, do we need handcuffs of this kind for our pair programming friends the LLMs?

almosthere · 4h ago

I think coding will eventually go away in favor of models with metadata built around them.

How many times did you have a mutation operation where you had to hand code the insert of 3 or 4 entities and make sure they all come back successful, or you back out properly (and perhaps this is without a transaction, perhaps over multiple databases).

Make sure the required fields are present Grab the created inserted ID Rinse, repeat

Or if you're mutating a list, writing code that inserts a new element, but you don't know which one is new. And you end up, again, hand coding loops and checking what you remember to check.

What about when you need to do an auth check.

And the hand coder may fail to remember one little thing somewhere.

With LLM code, you can just describe that function and it will remember to do all the things.

An LLM with a model + metadata - we won't really need to think of it as editing User.java or User.py anymore. Instead User.yaml - and the LLM will just consume that, and build out ALL of your required biz-logic, and be done with it. It could create a fully authenticating/authorizing REST API + GraphQL API with sane defaults - and consistent notions throughout.

And moving into UIs- we can have the same thing. The UI can be described in an organized way. What fields are required for user registration. What fields are optional according to the backend. It's hard to visualize this future, but I think it's a no-code future. It's models of requirements instead.

ChrisMarshallNY · 7h ago

I use an LLM as a reference (on-demand), and don't use agents (yet). I was never into pair programming, anyway, so it isn't a familiar workflow for me.

I will admit that it encourages "laziness," on my part, but I'm OK with that (remember when they said that calculators would do that? They were right).

For example, I am working on a SwiftUI project (an Apple Watch app), and forgot how to do a fairly basic thing. I could have looked it up, in a few minutes, but it was easier to just spin up ChatGPT, and ask it how to do it. I had the answer in a few seconds. Looking up SwiftUI stuff is a misery. The documentation is ... a work in progress ...

petesergeant · 6h ago

> I use an LLM as a reference (on-demand), and don't use agents (yet)

This was me until about three weeks ago. Then, during a week of holiday, I decided I didn't want to get left behind and tried a few side-projects using agents -- specifically I've been using Roo. Now I use agents when appropriate, which I'd guess is about 50% of the work I'm doing.

cpursley · 4h ago

Roo looks interesting. How does it compare with Cursor and Windsurf?

shio_desu · 4h ago

It burns tokens if you BYOK but you can hook into GH Copilot LLMs directly

I really like the orchestrator and architect personas as is out of the box. I prefer it over Cursor / Windsurf for a few reasons - no indexing (double edged sword) - orchestrator I find much more useful than windsurf cascades - tool usage is fantastic

The no indexing is a double edged sword, it does need to read files constantly, contributing to token burn. However, you don't have to worry about indexed data being on a 3rd party server (cursor), and also since it has to crawl to understand the codebase for it to implement, to me it seems like it is more capable of trickier code implementations, as long as you utilize context properly.

For more complex tasks, I usually either spend 20-30 minutes writing a prompt to give it what I'm looking to implement, or write up a document detailing the approach I'd like to take and iterate with the architect agent.

Afterwards, hand it off to the orchestrator and it manages and creates subtasks, which is to provide targeted implementation steps / tasks with a fresh context window.

If you have a GH Copilot license already, give it a shot. I personally think it's a good balance between control as an architect and not having to tie my time down for implementations, since really a lot of the work in coding is figuring out the implementation plan anyways, and the coding can be busy work, to me personally anyways. I prefer it over the others as I feel Windsurf/Cursor encourages YOLO too much.

Pandabob · 35m ago

I basically jump away from Cursor to ChatGPT when I need to think thoroughly on something like an architecture decision or an edge case etc. Then when I've used ChatGPT to come up with an implementation plan, I jump back to Cursor and have Claude do the actual coding. O3 and ChatGPT's search functionality are just better (at least for myself) currently for "type 2" thinking tasks.

travisgriggs · 2h ago

> Allow users to pause the agent to ask a clarifying question or push back on its direction without derailing the entire activity or train of thought

I think I’ve seen Zed/Claude do kind of this. A couple times, I’ve hit return, and then see that I missed a clarifying statement based on the direction it starts going and I put it in fast, and it corrects.

tobyhinloopen · 3h ago

This guy needs a custom prompt. I keep a prompt doc around that is constantly updated based on my preferences and corrections.

Not a few sentences but many many lines of examples and documentation

atemerev · 2h ago

Aider does everything right. Stop using Cursor or any other agentic environments. Try Aider, it works exactly as suggested here.

worldsayshi · 1h ago

I've been wanting to use Aider but I want to use Copilot as a provider (to stay compliant with the wishes of my employer). Haven't gone down that road yet because Aider copilot support seems a bit tentative. I see they have some docs about it up now though: https://aider.chat/docs/llms/github.html

"The easiest path is to sign in to Copilot from any JetBrains IDE"

Somebody must've made a standalone login script by now right? I wonder if `gh auth login` can be used to get a token?

ramesh31 · 7h ago

>LLM agents make bad pairs because they code faster than humans think

This is why I strongly dislike all of the terminal based tools and PR based stuff. If you're left to read through a completed chunk of code it is just overwhelming and your cycle time is too slow. The key to productivity is using an IDE based tool that shows you every line of code as it is being written, so you're reading it and understanding where it's going in real time. Augmentation, not automation, is the path forward. Think of it like the difference between walking and having a manual transmission car to drive, not the difference between having a car and having a self driving car.

bluefirebrand · 7h ago

If I have a 20 line function in my mind and the LLM injects 20 lines for me to accept or reject, I have two problems

First I have to review the 20 lines the LLM has produced

Second, if I reject those lines, it has probably shoved the function I had in mind out of my head

It's enormously disruptive to my progress

ramesh31 · 7h ago

The hard truth here is in accepting that the 20 lines in your head were probably wrong, or suboptimal, and letting go of that urge. Think in interfaces, not implementations. Successive rendering, not one-shot.

shortstuffsushi · 7h ago

This is just fundamentally not the case most of the time. LLMs guess where you're going, but so often what they produce is a "similar looking" non sequitur relative to the lines above it. It guesses, and sometimes that guess is good, but as often, or more, it's not.

The suggestion "think in interfaces" is fine; if you spell out enough context in comments, the LLM may be able to guess more accurately, but in spelling out that much context for it, you've likely already done the mental exercise of the implementation.

Also baffled by "wrong or suboptimal," I don't think I've ever seen an LLM come up with a better solution.

xedrac · 6h ago

Maybe it's the domain I work in, or the languages I use, but the 20 lines the LLM comes up with is almost certainly wrong.

datameta · 7h ago

I agree with the last two sentences but simultaneously think that starting to defacto believe you cannot have an equal or better solution compared to the AI is the start of atrophy of those skills.

bluefirebrand · 7h ago

> The hard truth here is in accepting that the 20 lines in your head were probably wrong, or suboptimal, and letting go of that urge.

Maybe, but the dogshit that Cursor generates is definitely wrong so frankly if it's gonna be my name on the PR then I want it to me my wrong code not hide behind some automated tool

> Think in interfaces, not implementations

In my experience you likely won't know if you've designed the right interface until you successfully implement the solution. Trying to design the perfect interface upfront is almost guaranteed to take longer than just building the thing

simoncion · 6h ago

> ...and letting go of that urge.

What urge? The urge to understand what the software you're about to build upon is doing? If so, uh... no. No thanks.

I've seen some proponents of these code-generation machines say things like "You don't check the output of your optimizing compiler, so why check the output of Claude/Devon/whatever?". The problem with this analogy is that the output from mainstream optimizing compilers is very nearly always correct. It may be notably worse than hand-generated output, but it's nearly never wrong. Not even the most rabid proponent will claim the same of today's output from these code-generation machines.

So, when these machines emit code, I will inevitably have to switch from "designing and implementing my software system" mode into "reading and understanding someone else's code" mode. Some folks may be actually be able to do this context-shuffling quickly and easily. I am not one of those people. The results from those studies from a while back that found that folks take something like a quarter-hour to really get back into the groove when interrupted while doing a technical task suggest that not that many folks are able to do this.

> Think in interfaces...

Like has been said already, you don't tend to get the right interface until you've attempted to use it with a bunch of client code. "Take a good, educated stab at it and refine it as the client implementations reveal problems in your design." is the way you're going to go for all but the most well-known problems. (And if your problem is that well-known, why are you writing more than a handful of lines solving that problem again? Why haven't you bundled up the solution to that problem in a library already?)

> Successive rendering, not one-shot.

Yes, like nearly all problem-solving, most programming is and always has been an iterative process. One rarely gets things right on the first try.

UltraSane · 4h ago

It is rather soul crushing how fast LLMs spit out decent code.

Bob_LaBLahh · 1h ago

In my experience, LLMs are idiot savant coders--but currently more idiot than savant. Claude 3.7 (via cursor and roo) can comment code well, create a starter project 10x faster than I could, and they spit out common crud apps pretty well.

However I've come to the conclusion that LLMs are terrible at decision making. I would much rather have an intern architect my code than let AI do it. It's just too unreliable. It seems like 3 out of 4 decisions that it makes are fine. But that 4th decision is usually asinine.

That said, I now consider LLMs a mandatory addition to my toolkit because they have improved my developer efficiency so much. I really am a fan. But without a seasoned dev to write detailed instructions, break down the project into manageable chunks, make all of the key design decisions, and review every line of code that it writes, today's AI will only add a mountain of technical debt to your project.

I guess I'm trying to say: don't worry because the robots cannot replace use yet. We're still in the middle of the hype cycle.

But what do I know? I'm just an average meat coder.

ninetyninenine · 4h ago

>LLM agents make bad pairs because they code faster than humans think.

Easily solved. Use less compute. Use slower hardware. Or put in the prompt to pause at certain intervals.

Tell HN: Help restore the tax deduction for software dev in the US (Section 174)

How to get started with writing tech video essays

Ask HN: How to learn CUDA to professional level

Control PC games with body movements using webcam andPython+ MediaPipe

Building an Audit Readiness Platform for Startups – Would Love Your Feedback

Ask HN: Do founders get honest feedback on their pitch decks?

Ask HN: Share you personal favorite productivity tools, workflows or setup

Ask HN: In 15 years, what will a gas station visit look like?

Ask HN: Startup getting spammed with PayPal disputes, what should we do?

Ask HN: What API or software are people using for transcription?

Ask HN: How do you integrate AI assistants into your note taking?

Ask HN: Any good tools for viewing congressional bills?

Ask HN: Has anybody built search on top of Anna's Archive?

Ask HN: What Happened to the Apple Vision Pro?

Ask HN: What's your open source stack?

Ask HN: Is there any demand for Personal CV/Resume website?

Bill Atkinson has passed away

Ask HN: Is anyone still programming the old-fashioned way (without LLMs)?

Best place for small remote gigs?

Ask HN: Has Apple lost its way?

Ask HN: Options for One-Handed Typing

Ask HN: How long do you ever leave your server running without updates?

Ask HN: What would you work on if you couldn't fail?

Ask HN: Anyone else feeling increasingly alienated from the industry?

What is a modern successor to HyperCard?

Ask HN: Share real complaints about outsourcing data annotation

When Profit Overshadows Community: A Look at Golang Conferences

Founders: Don't Give Up

Get Your Dev Tool Mentioned by ChatGPT, Gemini Not Just Ranked on Google

Ask HN: Running AI agents in isolated environments

Ask HN: What Does Your Self-Hosted LLM Stack Look Like in 2025?

Ask HN: Best practices for building front ends with AI assistance?

Ask HN: Resources to Learn AI/ML Fundamentals

Ask HN: What do you put in claude.md and what you leave out?

60–70% of YC X25 Agent Startups Are Using TypeScript

Reaching my first 100 users without money or audience (at 10K users now)

Tiptap open-sources 10 formerly Pro extensions under MIT license

Ask HN: In-house or outsourced data annotation? (2025)

Ask HN: Where do you go for cutting-edge dev news and info?

How do you store and maintain your CV/resume over time?

Ask HN: What powerful but lesser-known Python libraries are out there?

Ask HN: Do we need a language designed specifically for AI code generation?

Ask HN: A Tetris variant with greater tactical and strategic depth?

O(1) memory, no-preprocessing reachability algorithm for 2D grids

Ask HN: Is synthetic data generation practical outside academia?

Ask HN: Dealing with Vibe Coding Depression?

Why agents are bad pair programmers

Comments (83)