I don't get the whole "all-in" mentality around LLMs. I'm an iOS dev by trade, I continue to do that as I always have. The difference now is I'll use an LLM to quickly generate a one-off view based on a design. This isn't a core view of an app, the core functionality, or really anything of importance. It's a view that promotes a new feature, or how to install widgets, or random things. This would normally take me 30-60 min to implement depending on complexity, now it takes 5.
I also use it for building things like app landing pages. I hate web development, and LLMs are pretty good at it, I'd guess because it makes up 90% of their software-related training data. For that I make larger changes, review them manually, and commit them to git, like any other project. It's crazy to me that people will go completely off the rails for multiple hours, run into a major issue, and then just start over, when instead you can use a measured approach and keep the forward momentum.
mritchie712 · 9h ago
How useful the various tools will be depends on the person and the problem. Take two hypothetical people working on different problems and consider if, for example, Cursor would be useful.
IF you're a:
* 10 year python dev
* work almost entirely on a very large, complex python code base
* have a pycharm IDE fine tuned over many years to work perfectly on that code base
* have very low tolerance for bugs (stable product, no room for move fast, break things)
THEN: LLMs aren't going to 10x you. An IDE like Cursor will likely make you slower for a very long time until you've learned to use it.
IF you're a:
* 1 year JS (react, nextjs, etc.) dev
* start mostly from scratch on new ideas
* have little prior IDE preference
* have high tolerance for bugs and just want to ship and try stuff
THEN: LLMs will 10x you. An IDE like Cursor will immediately make you way faster.
infecto · 9h ago
These all-or-nothing takes on LLMs are getting tiresome.
I get the point you’re trying to make, LLMs can be a force multiplier for less experienced devs, but the sweeping generalizations don’t hold up. If you’re okay with a higher tolerance for bugs or loose guardrails, sure, LLMs can feel magical. But that doesn’t mean they’re less valuable to experienced developers.
I’ve been writing Python and Java professionally for 15+ years. I’ve lived through JetBrains IDEs, and switching to VS Code took me days. If you’re coming from a heavily customized Vim setup, the adjustment will be harder. I don’t tolerate flaky output, and I work on a mix of greenfield and legacy systems. Yes, greenfield is more LLM-friendly, but I still get plenty of value from LLMs when navigating and extending mature codebases.
What frustrates me is how polarized these conversations are. There are valid insights on both sides, but too many posts frame their take as gospel. The reality is more nuanced: LLMs are a tool, not a revolution, and their value depends on how you integrate them into your workflow, regardless of experience level.
mritchie712 · 9h ago
My stance is the opposite of all-or-nothing. The note above is one example: how much value you get out of Cursor specifically is going to vary based on person & problem. The Python dev in my example might immediately get value out of o3 in ChatGPT.
It's not all or nothing. What you get value out of immediately will vary based on circumstance.
infecto · 8h ago
You say your stance isn't all-or-nothing, but your original comment drew a pretty hard line: junior devs who start from scratch and have a high tolerance for bugs get 10x productivity, while experienced devs with high standards and mature setups will likely be slowed down. That framing is exactly the kind of binary thinking that's making these conversations so unproductive.
ghufran_syed · 8h ago
I wouldn't classify this as binary thinking - isn't the comment you are replying to just defining boundary conditions? Then those two points don't define the entire space, but the output there does at least let us infer (but not prove) something about the nature of the "function" between those two points? Where the function f is something like f: experience -> productivity increase?
infecto · 8h ago
You’re right that it’s possible to read the original comment as just laying out two boundary conditions—but I think we have to acknowledge how narrative framing shapes the takeaway. The way it’s written leads the reader toward a conclusion: “LLMs are great for junior, fast-shipping devs; less so for experienced, meticulous engineers.” Even if that wasn’t the intent, that’s the message most will walk away with.
But they drew boundaries with very specific conditions that lead the reader. It’s a common theme in these AI discussions.
web007 · 7h ago
> LLMs are great for junior, fast-shipping devs; less so for experienced, meticulous engineers
Is that not true? That feels sufficiently nuanced and gives a spectrum of utility, not a binary one and zero but "10x" on one side and perhaps 1.1x at the other extreme.
The reality is slightly different - "10x" is SLoC, not necessarily good code - but the direction and scale are about right.
TeMPOraL · 2h ago
That feels like the opposite of being true. Juniors have, by definition, little experience - the LLM is effectively smarter than them and much better at programming, so they're going to be learning programming skills from LLMs, all while futzing about not sure what they're trying to express.
People with many years or even decades of hands-on programming experience have the deep understanding and tacit knowledge that allows them to tell LLMs clearly what they want, quickly evaluate generated code, guide the LLM out of any rut or rabbit hole it dug itself into, and generally wield LLMs as DWIM tools - because again, unlike juniors, they actually know what they mean.
mritchie712 · 7h ago
no, those are two examples of many, many possible circumstances. I intentionally made them two very specific examples so that was clear. Seems it wasn't so clear.
infecto · 7h ago
Fair enough but if you have to show up in the comments clarifying that your clearly delineated “IF this THEN that” post wasn’t meant to be read as a hard divide, maybe the examples weren’t doing the work you thought they were. You can’t sketch a two-point graph and then be surprised people assume it’s linear.
Again, I think the high-level premise is correct, as I already said; the delivery falls flat though. More junior devs have a larger opportunity to extract value.
bluecheese452 · 7h ago
I and others understood it perfectly well. Maybe the problem wasn’t with the post.
infecto · 6h ago
And I along with others who upvoted me did not. What’s your point? Seems like you have none and instead just want to point fingers.
bluecheese452 · 6h ago
The guy was nice enough to explain his post that you got confused about. Rather than be thankful you used that as evidence that he was not clear and lectured him on it.
I gently suggested that the problem may have not been with his post but with your understanding. Apparently you missed the point again.
infecto · 5h ago
If multiple people misread the post, clarity might be the issue, not comprehension. Dismissing that as misunderstanding doesn’t add much. Let’s keep it constructive.
lelandbatey · 6h ago
This doesn't seem like an "all or nothing" take. This person is trying to be clear about their claims, but they're not trying to state these are the only possible takes. Add the word "probably" after each "then" and I imagine their intended tone becomes a little clearer.
mexicocitinluez · 9h ago
> I get the point you’re trying to make, LLMs can be a force multiplier for less experienced devs, but the sweeping generalizations don’t hold up. If you’re okay with a higher tolerance for bugs or loose guardrails, sure, LLMs can feel magical. But that doesn’t mean they’re less valuable to experienced developers.
Amen. Seriously. They're tools. Sometimes they work wonderfully. Sometimes, not so much. But I have DEFINITELY found value. And I've been building stuff for over 15 years as well.
I'm not "vibe coding", I don't use Cursor or any of the ai-based IDEs. I just use Claude and Copilot since it's integrated.
voidhorse · 9h ago
> Amen. Seriously. They're tools. Sometimes they work wonderfully. Sometimes, not so much. But I have DEFINITELY found value. And I've been building stuff for over 15 years as well.
Yes, but these lax expectations are what I don't understand.
What other tools in software sometimes work and sometimes don't that you find remotely acceptable? Sure all tools have bugs, but if your compiler had the same failure rate and usability issues as an LLM you'd never use it. Yet for some reason the bar is so low for LLMs. It's insane to me how much people have indulged in the hype koolaid around these tools.
TeMPOraL · 2h ago
> What other tools in software sometimes work and sometimes don't that you find remotely acceptable?
Other people.
Seriously, all that advice about not anthropomorphizing computers is taken way too seriously now, and is doing a number on the industry. LLMs are not a replacement for compilers or other "classical" tools - they're replacement for people. The whole thing that makes LLMs useful is their ability to understand what some text means - whether or not it's written in natural language or code. But that task is inherently unreliable because the problem itself is ill-specified; the theoretically optimal solution boils down to "be a simulated equivalent of a contemporary human", and that still wouldn't be perfectly reliable.
LLMs are able to trivially do tasks in programming that no "classical" tools can, tasks that defy theoretical/formal specification, because they're trained to mimic humans. Plenty of such tasks cannot be done to the standards you and many others expect of software, because they're NP-complete or even equivalent to the halting problem. LLMs look at those and go, "sure, this may be provably not solvable, but actually the user meant X therefore the result is Y", and succeed with that reliably enough to be useful.
Like, take automated refactoring in dynamic languages. Any nontrivial ones are not doable "classically", because you can't guarantee there aren't references to the thing you're moving/renaming that are generated on the fly by eval() + string concatenation, etc. As a programmer, you may know the correct result, because you can understand the meaning and intent behind the code, the conceptual patterns underpinning its design. DAG walkers and SAT solvers don't. But LLMs do.
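To make that concrete, here's a tiny sketch (names are invented) of the kind of call site a purely static rename of `load_report` cannot see, because the reference only exists as a string at runtime:

```python
# Hypothetical example: a static rename of `load_report` misses this call,
# because the method name is assembled from strings at runtime.
class Reports:
    def load_report(self):
        return "quarterly numbers"

def dispatch(obj, action: str):
    # The reference to load_report only exists after this concatenation.
    return getattr(obj, "load_" + action)()

print(dispatch(Reports(), "report"))  # a human (or an LLM) can tell these are linked
```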
marcosdumay · 5h ago
There were always lots of code generation tools that people expected to review and fix the output.
Anyway, code generation tools almost always are born unreliable, then improve piecewise into almost reliable, and finally get replaced by something with a mature and robust architecture that is actually reliable. I can't imagine how LLMs could traverse this, but I don't think it's an extraordinary idea.
infecto · 8h ago
My compiler doesn’t write a complete function to visualize a DataFrame based on a vague prompt. It also doesn’t revise that function as I refine the requirements. LLMs can.
There’s definitely hype out there, but dismissing all AI use as “koolaid” is as lazy as the Medium posts you’re criticizing. It’s not perfect tech, but some of us are integrating it into real production workflows and seeing tangible gains, more code shipped, less fatigue, same standards. If that’s a “low bar,” maybe your expectations have shifted.
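For a sense of what I mean, the kind of thing a vague "visualize this DataFrame" prompt gets me is roughly this (a sketch; histogramming the first few numeric columns is just an illustrative assumption):

```python
# Rough sketch of an LLM-produced "visualize a DataFrame" helper.
import pandas as pd
import matplotlib.pyplot as plt

def plot_dataframe(df: pd.DataFrame, max_cols: int = 4):
    """Histogram the first few numeric columns of a DataFrame."""
    numeric = df.select_dtypes(include="number").columns[:max_cols]
    if len(numeric) == 0:
        raise ValueError("no numeric columns to plot")
    fig, axes = plt.subplots(1, len(numeric), figsize=(4 * len(numeric), 3), squeeze=False)
    for ax, col in zip(axes[0], numeric):
        df[col].plot.hist(ax=ax, bins=20)
        ax.set_title(col)
    fig.tight_layout()
    return fig
```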
ljm · 6h ago
People are way too quick to defend LLMs here, because the criticism is exactly on point.
In an era where an LLM can hallucinate (present you with a defect) with 100% conviction, and vibe coders can ship code of completely unknown quality with 100% conviction, the bar by definition has to have been set lower.
Someone with experience will still bring something more than just LLM-written code to the table, and that bar will stay where it is. The people who don't have experience won't even feel the shortcomings of AI because they won't know what it's getting wrong.
tbrownaw · 6h ago
> What other tools in software sometimes work and sometimes don't that you find remotely acceptable?
Searching for relevant info on the Internet can take several attempts, and occasionally I end up not finding anything useful.
My IDE's IntelliSense tries to guess what identifier I want and put it at the top of the list; sometimes it guesses wrong.
I've heard that the various package repositories will sometimes deliberately refuse to work for a while because of some nonsense called "rate limiting".
Cloud deployments can fail due to resource availability.
mexicocitinluez · 7h ago
> Yes, but these lax expectation s are what I don't understand.
It's a really, really simple concept.
If I have a crazy TypeScript error, for instance, I can throw it in and get a much better idea of what's happening. Just because that's not perfect doesn't mean it isn't helpful. Even if it works 90% of the time, it's still better than 0% of the time (which is where I was at before).
It's like google search without ads and with the ability to compose different resources together. If that's not useful to you, then I don't know what to tell you.
UncleEntity · 7h ago
Hell, AI is probably -1x for me because I refuse to give up and do it myself instead of trying to get the robots to do it. I mean, writing code is for the monkeys, right?
Anyhoo... I find that there are times where you have to really get in there and question the robot's assumptions as they will keep making the same mistake over and over until you truly understand what it is they are actually trying to accomplish. A lot of times the desired goal and their goal are different enough to cause extreme frustration as one tends to think the robot's goal should perfectly align with the prompt. Once it fails a couple times then the interrogation begins since we're not making any further progress, obviously.
Case in point, I have this "Operational Semantics" document, which is correct, and a PEG VM, which is tested to be correct, but if you combine the two, one of the operators was being compiled incorrectly due to the way backtracking works in the VM. After Claude's many failed attempts we had a long discussion and finally tracked the problem down to something outside of its creative boundaries; it needed one of those "why don't you do it this way..." moments. Sure, I shouldn't have to do this, but that's the reality of the tools and, like they say, "a good craftsman never blames his tools".
cbm-vic-20 · 9h ago
I've got over 30 years of professional development experience, and I've found LLMs most useful for
* Figuring out how to write small functions (10 lines) in canonical form in a language that I don't have much experience with. This is so I don't end up writing Rust code as if it were Java.
* Writing small shell pipelines that rely on obscure command line arguments, regexes, etc.
maccard · 9h ago
I’ve found the biggest thing LLMs and agents let me do is build the things that I really suck at to a prototype level. I’m not a frontend engineer, and pitching feature prototypes without a frontend is tough.
But with aider/claude/bolt/whatever your tool of choice is, I can give it a handful of instructions and get a working page to demo my feature. It’s the difference between me pitching the feature or not, as opposed to pitching it with or without the frontend.
CuriouslyC · 9h ago
16 year python dev who's done all that, lead multiple projects from inception to success, and I rarely manually code anymore. I can specify precisely what I want, and how I want it built (this is the key part), stub out a few files and create a few directories, and let an agent run wild but configured for static analysis tools/test suite to run after every iteration with the instructions to fix their mistakes before moving on.
I can deliver 5k LoC in a day easily on a greenfield project and 10k if I sweat or there's a lot of boilerplate. I can do code reviews of massive multi-thousand line PRs in a few minutes that are better than most of the ones done by engineers I've worked with throughout a long career, the list just goes on and on. I only manually code stuff if there's a small issue that I see the LLM isn't understanding that I can edit faster than I can run another round of the agent, which isn't often.
LLMs are a force multiplier for everyone, really senior devs just need to learn to use them as well as they've learned to use their current tools. It's like saying that a master archer proves bows are as good as guns because the archer doesn't know how to aim a rifle.
mrweasel · 7h ago
Assuming that your workflow works, and the rest of us just need to learn to use LLMs equally effectively, won't that plateau us at the current level of programming?
The LLMs learn from examples, but if everyone uses LLMs to generate code, there's no new code to learn new features, libraries or methods from. The next generation of models will just be trained on the code generated by their predecessors, with no new inputs.
Being an LLM maximalist basically freezes development in the present, now and forever.
Workaccount2 · 6h ago
If Google's AlphaEvolve is any indication, they already have LLMs writing faster algorithms than humans have discovered.[1]
I'm not thinking of algorithms. Let's say someone writes a new web framework. If there are no code samples available, I don't think whatever is in the documentation will be enough data; the LLMs won't have the training data and won't be able to utilize it.
Would you ever be able to tell e.g. Copilot: I need a web framework with these specs, go create that framework for me. Then later have Claude actually use that framework?
TeMPOraL · 2h ago
> Would you ever be able to tell e.g. Copilot: I need a web framework with these specs, go create that framework for me. Then later have Claude actually use that framework?
Sure, why not?
The "magic sauce" of LLMs is that they understand what you mean. They've ingested all the thinking biases and conceptual associations humans have through their training on the entire training corpus, not just code and technical documentation. When Copilot cobbles together a framework for you, it's going to name the functions and modules and variables using domain terms. For Claude reading it, those symbols aren't just meaningless tokens with identity - they're also words that mean something in English in general, as well as in the web framework domain specifically; between that and code itself having common, cross-language pattern, there's more than enough information for an LLM to use a completely new framework mostly right.
Sure, if your thing is unusual enough, LLMs won't handle it as well as something that's over-represented in their training set, but then the same is true of humans, and both benefit from being provided some guidelines and allowed to keep notes.
(Also, in practice, most code is very much same-ish. Every now and then, someone comes up with something conceptually new, but most of the time, any new framework or library is very likely to be reinventing something done by another library, possibly in a different language. Improvements, if any, tend to be incremental. Now, the authors of libraries and frameworks may not be aware they're retracing prior art, but SOTA LLMs have very likely seen it all, across most programming languages ever used, and can connect the dots.)
And in the odd case someone really invents some unusual, new, groundbreaking pattern, it's just a matter of months between it getting popular and LLMs being trained on it.
mritchie712 · 9h ago
were you immediately more productive in Cursor specifically?
my point is exactly in line with your comment. The tools you get immediate value out of will vary based on circumstance. There's no silver bullet.
CuriouslyC · 8h ago
I use Aider, and I was already quite good at working with AI before that so there wasn't much of a learning curve other than figuring out how to configure it to automatically do the ruff/mypy/tests loop I'd already been doing manually.
The key is that I've always had that prompt/edit/verify loop, and I've always leaned heavily on git to be able to roll back bad AI changes. Those are the skills that let me blow past my peers.
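The verify half of that loop is nothing exotic. Something in this spirit (a sketch; `ask_agent_to_fix` is a placeholder for whatever agent you drive, not a real API):

```python
# Run the same checks after every agent iteration and feed failures back.
import subprocess

CHECKS = [
    ["ruff", "check", "."],
    ["mypy", "."],
    ["pytest", "-q"],
]

def run_checks() -> list[str]:
    """Run each check and collect output from any that fail."""
    failures = []
    for cmd in CHECKS:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failures.append(f"$ {' '.join(cmd)}\n{result.stdout}{result.stderr}")
    return failures

def ask_agent_to_fix(failures: list[str]) -> None:
    """Placeholder: hand the failing output back to the LLM/agent."""
    raise NotImplementedError

for _ in range(5):  # bounded iterations before a human steps in
    failures = run_checks()
    if not failures:
        break
    ask_agent_to_fix(failures)
```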
WD-42 · 6h ago
Let’s see the GitHub project for an easy 10k line day.
CuriouslyC · 6h ago
Not public on GitHub, but here's the cloc output for an easy 5k day (10k means sweating).
And when I say easy, I was playing the bass while working on this project for ~3 hours.
WD-42 · 6h ago
This shows nothing.
otabdeveloper4 · 2h ago
> we're back to counting programming projects in kloc, like it's the radical 1980's again
Yikes. But also lol.
alvis · 8h ago
20+ years dev here, started coding with AI when Jarvis was still a thing (before it became Jasper, and way before Copilot or ChatGPT).
Back then, Jarvis wasn't built for code, but it was a surprisingly great coding companion. Yes, it only gave you 80% working code, but because you had to get your hands dirty, you actually understood what was happening. It didn't give me 10x, but I'm happy with 2x and a good understanding of what's going on.
Fast-forward to now: Copilot, Cursor, Roo Code, Windsurf and the rest are shockingly good at output, but sometimes the more fluent the AI, the sneakier the bugs. They hand you big chunks of code, and I bet most of us don't have a clear picture of what's going on at ground level, just an overall idea. It's just too tempting to blindly "accept all" the changes.
It’s still the old wisdom — good devs are the ones not getting paged at 3am to fix bugs. I'm with the OP. I'm happier with my 2x than waking up at 3am.
palmotea · 1h ago
> IF you're a:
> * 1 year JS (react, nextjs, etc.) dev
> * start mostly from scratch on new ideas
> * have little prior IDE preference
> * have high tolerance for bugs and just want to ship and try stuff
> THEN: LLMs will 10x you. An IDE like Cursor will immediately make you way faster.
And also probably dead-end you, and you'll stay the bug-tolerant 1 year JS dev for the next 10 years of your career.
It's like eating your seed corn. Sure you'll be fat and have it easy for a little while, but then next year...
prisenco · 9h ago
Agreed but the 1 year JS dev should know they're making a deal with the devil in terms of building their skillset long term.
diggan · 9h ago
I basically learned programming by ExpertSexchange, Google (and Altavista...), SourceForge and eventually StackOverflow + GitHub. Many people with more experience than me at the time always told me I was making a deal with the devil since I searched so much, didn't read manuals and asked so many questions instead of thinking for myself.
~15 years later, I don't think I'm worse off than my peers who stayed away from all those websites. Doing the right searches is probably as important as being able to read manuals properly today.
GeoAtreides · 3h ago
>I basically learned programming by ExpertSexchange, Google (and Altavista...), SourceForge and eventually StackOverflow + GitHub. Many people with more experience than me at the time always told me I was making a deal with the devil since I searched so much, didn't read manuals and asked so many questions instead of thinking for myself.
no they didn't, no one said that
i know that because i was around then and everyone was doing the same thing
also, maybe there's a difference between searching and collating answers and just copy and pasting a solution _without thinking_ at all
whiplash451 · 9h ago
The jury is still out on that one.
Having a tool that’s embedded into your workflow and shows you how things can be done based on tons of example codebases could help a junior dev quite a lot to learn, not just to produce.
_heimdall · 9h ago
Like anything else with learning, that will be heavily dependent on the individual's level of motivation.
Based on the classmates I had in college who were paying to get a CS degree, I'd be surprised if many junior devs already working a paid job put much effort into learning rather than producing.
whiplash451 · 6h ago
I wouldn't dismiss the implicit/subconscious aspect of learning by example that occurs when you are "just" producing.
_heimdall · 2h ago
That still comes back to motivation in my opinion. Using an LLM to generate code without studying and understanding the output will teach you very little.
I'd still expect most junior devs that use an LLM to get their job done won't be motivated to study the generated code enough to really learn it.
A student is also only as good as the teacher, though that's a whole other can of worms with LLMs.
aqme28 · 6h ago
Maybe. But I also think that ignoring AI tools will hamper your long-term skillsets, as our profession adapts to these new tools.
bccdee · 6h ago
Why would that be the case? If anything, each successive generation of AI tools gets easier to use and requires less prompt fiddling. I picked up Cursor and was comfortable with it in 20 minutes.
I'm not sure there's much of a skillset to speak of for these tools, beyond model-specific tricks that evaporate after a few updates.
StefanBatory · 9h ago
From a workplace perspective, they don't have a reason to care. What they'll care about is that you're productive right now - if you don't become a better dev in the future? Your issue.
nis251413 · 5h ago
Even a single person may do different things that will change whether using an LLM helps or not.
Much of the time I spend writing code is spent not on the general overview etc. but on the code I am about to write itself, and if I actually care about that code (e.g. I am not gonna throw it away by the end of the day anyway), it is about how to make it as concise and understandable to others (incl. future me) as possible, what cases to care about, and what choices to make so that my code remains maintainable after a few days. It may be about refactoring previous code and all the decisions that go with that. LLM-generated code, imo, is too bloated; where they put stuff like asserts is always hit or miss relative to what actually matters. Their comments tend to be completely trivial, instead of stating the intention of the code, and though I have put some effort into getting them to use a coding style similar to mine, they often fail there too. In such cases, I only use them if the code they write can be isolated enough, e.g. a straightforward auxiliary function here and there that will be called in some places but where what happens inside does not matter as much. There are just too many decisions at each step that LLMs are not great at resolving, ime.
I depend more on LLMs if I care less about maintainability of the code itself and more about getting it done as fast as possible, or if I am just exploring and do not actually care about the code at all. For example, it can be that I am in a rush to get something done and will care about the rest later (granted they can actually do the task, else I am losing time). But when I tried this for my main work, it soon became a mess that would take more time to fix, even if they seemed to speed me up initially. Granted, if my field were different and the languages I was using were more popular/represented in the training data, I might have found more uses for them, but I still think that after some point it becomes unsustainable to leave decisions to them.
diggan · 9h ago
Taking your "to 10x you" as hyperbole and to actually mean "more productive", if you replace "Python" with "Programming" and "IDE" with Neovim, that's basically me. And I'm way more productive with LLMs than without them. Granted, I stay far away from "vibe coding", only use LLMs for some parts and don't use "agentic LLMs" or whatever, just my own programmed "Human creates issue on GitHub, receive N PRs back with implementations" bot.
Basically, use LLMs as a tool for specific things, don't let them do whatever and everything.
stef25 · 9h ago
Recently I tried getting ChatGPT to help me build a WordPress site (which I know nothing about), starting from an Adobe design file. It spent hours thinking, being confused, and eventually failed completely.
However it's great for simple "write a function that does X", which I could do myself but it would take longer, be boring and require several iterations to get it right.
Having said that, blindly copying a few lines of ChatGPT code did lead to automated newsletters being sent out with the wrong content.
anonzzzies · 9h ago
> until you've learned to use it
You have the copilot mode, which takes no learning at all and might give you some speedup, especially if you are doing repetitive stuff; it might even 10x+ you.
You have the cmd-k mode, which you need to prompt and which seems to be a lobotomized version of chat. I find putting in comments and waiting for the copilot mode to kick in works better, as then the way we got there is saved.
Then there is agentic editing chat: that is the timewaster you speak of, I believe, but what is there to learn? Sometimes it generates a metric ton of code, including in massive legacy code bases, that helps, and often it just cannot do whatever you asked.
I don't think the two cases you make are different, at least once the second one goes beyond the basics. There is nothing to learn except that you need to read all the code, decide what you want in technical detail, and ask that of the agentic chat. Anything else fails beyond the basics, and 'learning to use it' will be just that; but if you didn't know that after 5 minutes, you definitely never did any 'fine tuned pycharm ide', ever.
It is a tool that customizes code it ingested for your case specifically, if it can. That is it. If it never saw a case, it won't solve it, no matter what you 'learn to use'. And I am fine saying that in public: we use LLMs a lot, and I can give you very simple cases that, besides typing up the exact code for it (and often even that doesn't work), it will never fix with the current models. It just gets stuck making meaningless changes with confidence.
ekidd · 9h ago
> You have the copilot mode which takes no learning at all which might give you some speedup, especially if you are doing repetitive stuff, it might even 10x+ you.
I have some grey hair and I've been programming since I was a kid. Using CoPilot autocompletion roughly doubles my productivity while cutting my code quality by 10%.
This happens because I can see issues in autocompleted code far faster than I can type, thanks to years of reading code and reviewing other people's code.
The 10% quality loss happens because my code is no longer lovingly hand-crafted single-author code. It effectively becomes a team project shared by me and the autocomplete. That 10% loss was inevitable as soon as I added another engineer, so it's usually a good tradeoff.
Based on observation, I think my productivity boost is usually high compared to other seniors I've paired with. I see a lot of people who gain maybe 40% from Copilot autocomplete.
But there is no world in which current AI is going to give me a 900% productivity boost when working in areas I know well.
I am also quite happy to ask Deep Research tools to look up the most popular Rust libraries for some feature, and to make me a pretty table of pros and cons to skim. It's usually only 90% accurate, but it cuts my research time.
I do know how to drive Claude Code, and I have gotten it to build a non-trivial web front-end and back-end that isn't complete garbage without writing more than a couple of dozen lines myself. This required the same skill set as working with an over-caffeinated intern with a lot of raw knowledge, but who has never written anything longer than 1,000 lines before. (Who is also a cheating cheater.) Maybe I would use it more if my job was to produce an endless succession of halfway decent 5,000-line prototypes that don't require any deep magic.
Auto-complete plus Deep Research is my sweet spot right now.
anonzzzies · 8h ago
I get very good results with very little effort, but that is because I have written code for 40 years fulltime. Not because I know the tool better.
ekianjo · 9h ago
> THEN: LLMs will 10x you. An IDE like Cursor will immediately make you way faster.
they will make you clueless about what the code does and your code will be unmaintainable.
belter · 9h ago
We finally found a metric to identify the really valuable coders in my company :-)
jrh3 · 7h ago
> have high tolerance for bugs and just want to ship
LOL
stevepotter · 9h ago
I do a variety of things, including iOS and web. Like you mentioned, LLM results between the two are very different. I can't trust LLM output to even compile, much less work. Just last night, it told me to use an API called `CMVideoFormatDescriptionGetCameraIntrinsicMatrix`. That API is very interesting because it doesn't exist. It also did a great job of digging some deep holes when dealing with some tricky Swift 6 concurrency stuff. Meanwhile it generated an entire nextjs app that worked great on the first shot. It's all about that training data baby
kartoffelsaft · 7h ago
Honestly, with a lot of HN debating the merits of LLMs for generating code, I wish it were an unwritten rule that everyone states the stack they're using with it. It seems that the people who rave about it creating a whole product line in a weekend are asking it to write them a web interface using [popular js framework] that connects to [ubiquitous database], and their app is a step or two away from being CRUD. Meanwhile, the people who say it's done nothing for them are writing against [proprietary in-house library from 2005].
The worst is the middle ground of stacks that are popular enough to be known but not enough for an LLM to know them well. I say worst because in these cases the facade that the LLM understands how to create your product will fall before the software's lifecycle ends (at least, if you're vibe-coding).
For what it's worth, I've mostly been a hobbyist but I'm getting close to graduating with a CS degree. I've avoided using LLMs for classwork because I don't want to rob myself of an education, but I've occasionally used them for personal, weird projects (or tried to at least). I always give up with it because I tend to like trying out niche languages that the LLM will just start to assume work like python (ex: most LLMs struggle with zig in my experience).
englishspot · 6h ago
> Meanwhile, the people who say it's done nothing for them are writing against [proprietary in-house library from 2005].
there's MCP servers now that should theoretically help with that, but that's its own can of worms.
andy99 · 9h ago
Overshooting the capabilities of LLMs is pretty natural when you're exploring them. I've been using them to partially replace stack overflow or get short snippets of code for ~2 years. When Claude code came out, I gave it increased responsibility until I made a mess with it, and now I understand where it doesn't work and am back to using LLMs more for ideas and advice. I think this arc is pretty common.
maerch · 9h ago
Exactly my thoughts. It seems there’s a lot of all-or-nothing thinking around this.
What makes it valuable to me is its ability to simplify and automate mundane, repetitive tasks. Things like implementing small functions and interfaces I’ve designed, or even building something like a linting tool to keep docs and tests up to date.
All of this has saved me countless hours and a good deal of sanity.
spacemadness · 3h ago
I've found LLMs are extremely hit or miss with iOS development. I think part of that might be how quickly Swift and SwiftUI are changing, coupled with how bad Apple documentation is. I have loved using them to generate quick views and such for scaffolding purposes and quick iterations, but they tend to break down quickly around asynchronous code and non-trivial business logic. I will say they're still incredibly useful to point you in a direction, but they can be very misleading and send you down a hallucination rabbit hole easily.
arctek · 8h ago
Similar to my experience: it works well for small tasks, replacing search (most of the time), and doing a lot of boilerplate work.
I have one project that is very complex, and for this I can't and don't use LLMs.
I've also found it's better if you can get it to generate everything in the one session; if you try other LLMs or sessions it will quickly degrade.
That's when you will see duplicate functions and dead end code.
jrvarela56 · 8h ago
You can use the LLM to decompose tasks. As you said, tasks that are simple and have solutions in the training data can save you time.
Most code out there is glue. So there’s a lot of training data on integrating/composing stuff.
If you take this as a whole, you could turn that 30-60 min into 5 min for most dev work.
sublinear · 9h ago
> I also use it for building things like app landing pages.
This is a reasonable usage of LLMs up to a certain point, and especially if you're in full control of all the requirements as the dev. If you don't mind missing details related to sales and marketing such as SEO and analytics, I think those are not really "landing pages", but rather just basic web pages.
> I hate web development, and LLMs are pretty good at it because I'd guess that is 90% of their training data related to software development.
Your previous sentence does not support this at all since web development is a much more broad topic than your perception of landing pages. Anything can be a web app, so most things are nowadays.
dfxm12 · 7h ago
> I don't get the whole "all-in" mentality around LLMs.
They are being marketed as virtual assistants that will literally do all the work for you. If they become marketed truthfully, however, people will probably realize that they aren't worth the cost and that it's largely more beneficial to search the web and/or crowdsource answers.
a7fort · 9h ago
I think you're doing it right, it's just hard to resist the temptation to use AI for everything, when you're getting decent results for small things.
cosiiine · 8h ago
The AI-Assist tools (Cursor, Windsurf, Claude Code, etc) want you to be "all-in" and that's why so many people end up fighting them. A delicate balance is hard to achieve when you're discarding 80% of the suggestions for 20% of the productivity boosts.
rco8786 · 9h ago
There's a whole section in the doc called "A happy medium"
llm_nerd · 9h ago
> I don't get the whole "all-in" mentality around LLMs
To be uncharitable and cynical for a moment (and talking generally rather than about this specific post), it yields content. It gives people something to talk about. They define their personality by their absolutes, when in reality the world is infinite shades of gradients.
Go "all in" on something and write about how amazing it is. In a month you can write your "why I'm giving up" post about the thing you went all in on and how relieved/better off you are. It's such an incredibly tired gimmick.
"Why I dumped SQL for NoSQL and am never looking back"
"Why NoSQL failed me"
"Why we at FlakeyCo are all in on this new JavaScript framework!"
"Why we dumped that new JavaScript framework"
This same incredibly boring cycle is seen on here over and over and over again, and somehow people fall for it. Like, it's a huge indicator that the writer more than likely has bad judgment and probably shouldn't be the person to listen to about much.
Like most rational people that use decent judgement (rather than feeling I need to go "all in" on something, as if the more I commit the more real the thing I'm committing to is), I leverage LLMs many, many times in my day to day. Yet somehow they have authored approximately zero percent of my actual code and are still a spectacular resource.
spiderfarmer · 9h ago
People just do stupid stuff like "going all in" for their blog posts and videos. Nuance, like rationalism, doesn't get engagement.
dyauspitr · 4h ago
As a previous iOS dev I was able to spin up a moderately complex app in a weekend, something that would have taken me probably at least a couple of weeks in the past. I have no idea what you’re on about. I don’t even use Cursor or Windsurf etc; I’m having ChatGPT and Gemini just dump all their outputs into single files, and I manually break them up.
meander_water · 9h ago
The thing most LLM maximalists don't realize is that the bottleneck for most people is not code generation, it's code understanding. You may have doubled the speed at which you created something, but you need to pay double that time back in code review, testing and building a mental model of the codebase in your head. And you _need_ to do this if you want to have any chance of maintaining the codebase (i.e. bugfixes, refactoring etc.)
emushack · 7h ago
Totally agree! Reading code is harder than writing it, and I think I spend more time reading and trying to understand than I do writing.
But this CEO I just met on LinkedIn?
"we already have the possibility to both improve our productivity and increase our joy. To do that we have to look at what software engineering is. That might be harder than it looks because the opportunity was hidden in plain sight for decades. It starts with rethinking how we make decisions and with eliminating the need for reading code by creating and employing contextual tools."
Context is how AI is a whole new layer of complexity that SWE teams have to maintain.
I have often had the same thought in response to the effusive praise some people have for their sophisticated, automated code editors.
kragen · 6h ago
I've found LLMs are pretty good at explaining my code back to me.
VMG · 7h ago
This is not true.
It may be bad practice, but consider that the median developer does not care at all about the internals of the dependencies that they are using.
They care about the interface and about whether they work or not.
They usually do not care about the implementation.
Code generated by LLM is not that different than pulling in a random npm package or rust crate. We all understand the downsides, but there is a reason that practice is so popular.
rurp · 2h ago
Popular packages are regularly being used and vetted by thousands of engineers and that level of usage generally leads to subtle bugs being found and fixed. Blindly copy/pasting some LLM code is the opposite of that. It might be regurgitating some well developed code, but it's at least as likely to be generating something that looks right but is completely wrong in some way.
emushack · 7h ago
"Code generated by LLM is not that different than pulling in a random npm package or rust crate"
So I really hope you don't pull in packages randomly. That sounds like a security risk.
Also, good packages tend have a team of people maintaining it. How is that the same exactly?
VMG · 7h ago
> So I really hope you don't pull in packages randomly. That sounds like a security risk.
It absolutely is, but that is besides the point
> Also, good packages tend have a team of people maintaining it. How is that the same exactly?
If you're a developer, you do yourself a disservice by describing it this way.
qudat · 7h ago
> They usually do not care about the implementation.
[citation needed]
> Code generated by LLM is not that different than pulling in a random npm package or rust crate
It's not random, there's an algorithm for picking "good" packages and it's much simpler than reviewing every single line of LLM code.
VMG · 7h ago
>> They usually do not care about the implementation.
> [citation needed]
Everybody agrees that e.g. `make` and autotools are a pile of garbage. It doesn't matter; it works and people use it.
> It's not random, there's an algorithm for picking "good" packages and it's much simpler than reviewing every single line of LLM code.
But you don't need to review every single line of LLM code just as you don't need to review every single line of dependency code. If it works, it works.
Why does it matter who wrote it?
marcosdumay · 5h ago
If you as a developer care so much about stuff that the software users won't care about, you should look for better tools.
skydhash · 7h ago
Everything compounds. Good architecture makes it easy to maintain things later. Bad code will slow you down to a snail pace and will result in 1000s of bug tickets.
lawn · 1h ago
> Code generated by LLM is not that different than pulling in a random npm package or rust crate.
Yes, LLM code is significantly worse than even a random package as it very often doesn't even compile.
rco8786 · 9h ago
> So I do a “coding review” session. And the horror ensues.
Yup. I've spoken about this on here before. I was a Cursor user for a few months. Whatever efficiency gains I "achieved" were instantly erased in review, as we uncovered all the subtle and not-so-subtle bugs it produced.
Went back to vanilla VSCode and still use copilot but only when I prompt it to do something specific (scaffold a test, write a migration with these columns, etc).
Cursor's tab complete feels like magic at first, but the shine wore off for me.
Izkata · 7h ago
> Cursor's tab complete feels like magic at first, but the shine wore off for me.
My favorite thing here, watching a co-worker, is when Cursor tries to tab-complete what he just removed, and sometimes he accepts it by reflex.
manmal · 9h ago
What kind of guardrails did you give the agent? Like following SOLID, linting, 100% code coverage, templates, architectural documents before implementing, architectural rules, DRY cleanup cycles, code review guidelines (incl strict rules around consistency), review by another LLM etc?
breckenedge · 7h ago
Not the OP, but in my experience LLMs are still not quite there on guardrails. They might be for 25-50% of sessions, but it’ll vary wildly.
manmal · 3h ago
Depends on the LLM, recent Gemini models are quite good in this regard.
nicodjimenez · 9h ago
I tend to agree with this. These days I usually use LLMs to learn about something new or to help me generate client code for common APIs (especially boto3 these days). I tried Windsurf to help me make basic changes to my docker compose files, but when it couldn't even do that correctly, I lost a little enthusiasm. I'm sure it can build a working prototype of a small web app but that's not enough for me.
For me LLMs are a game changer for devops (API knowledge is way less important now that it's even been) but I'm still doing copy pasting from ChatGPT, however primitive it may seem.
Fundamentally I don't think it's a good idea to outsource your thinking to a bot unless it's truly better than you at long term decision making. If you're still the decision maker, then you probably want to make the final call as to what the interfaces should look like. I've definitely had good experiences carefully defining object oriented interfaces (eg for interfacing with AWS) and having LLMs fill in the implementation details but I'm not sure that's "vibe coding" per se.
pizzathyme · 7h ago
I had a similar experience as the author. I've found that Cursor / Copilot are FANTASTIC at "smart autocomplete", or "write a (small function that does this)", and quick viral prototypes.
But after I got a week into my LLM-led code base, it became clear it was all spaghetti code and progress ground to a halt.
This article is a perfect snapshot of the state of the art. It might improve in the future, but this is where it is in May 2025.
dmazin · 10h ago
This rings true to me.
I still use LLMs heavily. However, I now follow two rules:
* Do not delegate any deep thought to them. For example, when thinking through a difficult design problem, I do it myself.
* Deeply review and modify any code they generate. I go through it line-by-line and edit it thoroughly. I have to do this because I find that much of what they generate is verbose, overly defensive, etc. I don't care if you can fix this through prompting; I take ownership over future maintainability.
"Vibe coding" (not caring about the generated code) gives me a bad feeling. The above approach leaves me with a good feeling. And, to repeat, I am still using them a lot and coding a lot faster because of it.
chuckadams · 9h ago
I delegate all kinds of deep analysis to the AI, but it's to create detailed plans with specific implementation steps and validation criteria, backed by data in reproducible reports (i.e. "generate a script to generate this json data and another to render this data"). Plans have a specific goal that is reflected in the report ("migrated total should be 100%"). It's still an iterative process, the generators and plans have to be refined as it misses edge cases, but that's plans in general, AI or no.
It takes a good hour or two to draw up the plans, but it's the kind of thing that would take me all day to do, possibly several days as my ADHD brain rebels against the tedium. AI can do yeoman's work when it just wings it, and sometimes I have just pointed it at a task and it did it in one shot, but it works best when it has detailed plans. Plus it's really satisfying to be able to point at the plan doc and literally just say "make it so".
rco8786 · 9h ago
> Deeply review and modify any code they generate. I go through it line-by-line and edit it thoroughly
This is the issue, right? If you have to do this, are you saving any time?
tasuki · 8h ago
(Usually) Yes. The LLM can generate three functions, over 100 lines of code, and I spend perhaps 15 minutes rearranging it so it pleases me aesthetically. It would've taken me an hour or two to write.
I find most benefit in writing tests for a yet-inexistent function I need, then giving the LLM the function signature, and having it implement the function. TDD in the age of LLMs is great!
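The shape of it, as a sketch with an invented example function (pytest assumed): I write the tests and the signature, and the body is what I ask the LLM to fill in.

```python
# Tests written by hand first; `slugify` is an invented example.
def slugify(title: str) -> str:
    """Signature and contract are mine; the body is what I ask the LLM to write."""
    raise NotImplementedError  # the LLM's job: make the tests below pass

def test_lowercases_and_hyphenates():
    assert slugify("Hello World") == "hello-world"

def test_strips_punctuation():
    assert slugify("Rust, Go & Python!") == "rust-go-python"
```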
dmazin · 9h ago
I know what you’re saying, but I’m saving time by:
* getting ideas for how logic etc could be implemented
* boilerplate is a thing of the past
The other thing is that I have the LLM make the modifications I want.
I know how long it takes to get an extremely bad programmer to do what you want, but the LLM is far better than that, so I do come out ahead.
jgilias · 9h ago
I believe this depends on the individual. For me, yeah, I am. But I do have colleagues who wouldn’t be.
VMG · 9h ago
I get it and I see the same problems as the author.
I'm working on a few toy projects and I am using LLM for 90% of it.
The result is 10x faster than if I coded it "by hand", but the architecture is worse and somewhat alien.
I'm still keeping at it, because I'm convinced that LLM driven code is where things are headed, inevitably. These tools are just crazy powerful, but we will have to learn how to use them in a way that does not create a huge mess.
Currently I'm repeatedly prompting it to improve the architecture this way or that way, with mixed results. Maybe better prompt engineering is the answer? Writing down the architecture and guidelines more explicitly?
Imagine how the whole experience will be if the latency was 1/10th of what it is right now and the tools are 10x better.
a7fort · 9h ago
I hope we get to that "10x better" point. I think the problem right now is people advertising LLMs as if we're there already. And it's not just the providers, it's also the enthusiasts on X/Reddit/etc that think they have found the perfect workflow or prompt.
Just like you're mentioning "maybe better prompt engineering", I feel like we're being conditioned to think "I'm just not using it right" where maybe the tool is just not that good yet.
VMG · 8h ago
Well "I'm just not using it right" is a perfectly reasonable thought for a new technology. Isn't this the default? When new powerful tools come along, they often fit awkwardly into existing processes.
jfim · 9h ago
One thing you can do is to define the classes and methods that you want to have, and have the LLM implement them. For tricky things, you can leave additional notes in the empty method body as to how things should be implemented.
This way you're doing the big picture thinking while having the LLM do what's it's good at, generating code within the limits of its context window and ability to reason about larger software design.
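A sketch of what that looks like in practice (the class and the notes are invented, not from any particular project):

```python
# I write the skeleton and the notes; the LLM fills in the bodies.
class RateLimiter:
    """Token-bucket limiter shared by all outbound API clients."""

    def __init__(self, rate_per_sec: float, burst: int):
        # NOTE for the LLM: store capacity and refill rate, track current
        # tokens and the last refill timestamp using time.monotonic().
        ...

    def acquire(self) -> bool:
        # NOTE for the LLM: refill based on elapsed time, then take one
        # token if available; return False instead of blocking when empty.
        ...
```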
I mostly treat the LLM as an overly eager to please junior engineer that types very quickly, who can read the documentation really quickly, but also tends to write too much code and implement features that weren't asked for.
One of the good things is that the code that's generated is so low effort to generate that you can afford to throw away large chunks of it and regenerate it. With LLM assistance, I wrote some code to process a dataset, and when it was too screwy, I just deleted all of it and rewrote it a few times using different approaches until I got something that worked and was performant enough. If I had to type all of that I would've been disappointed at having to start over, and probably more hesitant to do so even if it's the right thing to do.
acureau · 7h ago
I've found a lot of value in this approach as well; I don't delegate any architecture decisions to LLMs. I build out the high level and see if the LLM can fill the gaps. I've found they are good at writing pure functions, and I am good at composing them and managing state.
jacob019 · 9h ago
This is where they shine, for prototyping greenfield projects, but as the project gets closer to production that 10x erodes. You have to be really intentional about the architecture, fixing core design issues later can turn 10x into 0.1x.
jdiff · 9h ago
At least currently, the only use pattern that can withstand complex codebases is as advanced speech-to-text, but without the speech if that makes sense. The main issue with that is phrasing things in English is often far more verbose, so without the speech, it's very often faster to just do it by hand.
geraneum · 9h ago
> Writing down the architecture and guidelines more explicitly?
Yes, very explicit like “if (condition) do (action)” and get more explicit when… oh wait!
skydhash · 7h ago
Yeah. I never understood where people are coming from with "you need guardrails, extensive architecture docs, coding rules,…". For every piece of software and feature I wrote, I already had a good idea of the objectives before I even started to code. I do the specific part with code, going back to the whiteboard when I need to think.
It’s an iterative process, not a linear one. And the only huge commits are the scaffolding and the refactorings. It’s more like sculpture than 3D printing: a perpetual refinement of the code instead of adding huge chunks of code.
This is the reason I switched to Vim, then Emacs. They allow for fast navigation, and faster editing. And so easy to add your own tool as the code is a repetitive structure. The rare cases I needed to add 10s of lines of code is with a code generator, or copy-pasting from some other file.
lesser23 · 7h ago
Many places are quite literally forcing their software engineers to use LLMs. Cursor/Copilot come complete with the ability to see usage statistics, and surely at these companies those statistics will eventually be used as firing criteria.
I gave them a fair shake. However, I do not like them for many reasons. Code quality is one major reason. I have found that after around a month of being forced to use them I felt my skill atrophy at an accelerated rate. It became like a drug where instead of thinking through the solution and coming up with something parsimonious I would just go to the LLM and offload all my thinking. For simple things it worked okay but it’s very easy to get stuck in a loop. I don’t feel any more productive but at my company they’ve used it as justification to increase sprint load significantly.
There has been almost a religious quality associated to LLMs. This seems especially true among the worst quality developers and the non-technical morons at the top. There are significant security concerns that extend beyond simple bad code.
To me we have all the indicators of the peak of the hype cycle. Go visit LinkedIn for confirmation. Unless the big AI companies begin to build nuclear power, it will eventually become too expensive and unprofitable to run these models. They will continue to exist as turbo autocomplete but go no further. The transformer model has fundamental limitations, and much like neural networks in the 80s it'll become more niche and die everywhere else. Like its cousins WYSIWYG and NoCode, in 30 more years it'll rise again like a phoenix to bring "unemployment" to developers once more. It will be interesting to see who among us was swimming without clothes when the tide goes out.
alexjplant · 5h ago
> I have found that after around a month of being forced to use them I felt my skill atrophy at an accelerated rate
I've started a "no Copilot Fridays" rule for myself at $DAYJOB to avoid this specifically happening.
bwfan123 · 7h ago
My use of Cursor is limited to auto-complete and small snippets. Even then, I can feel my skills atrophying.
Use it or lose it is the cognitive rule.
mrighele · 9h ago
I have been trying LLMs in a couple of new small projects recently.
I got more success than I hoped for, but I had to adjust how I use them to be effective.
First of all, treat the LLM as a less experienced programmer. Don't trust it blindly; always do a code review of its changes. This gives several benefits.
1) It keeps you in touch with the code base, so when the need arises you can delve into it without too much trouble
2) You catch errors (sometimes huge ones) right away, and you can have them fixed easily
3) You catch errors in your specification right away. Sometimes I forget a detail and only realize it when reviewing; or maybe the LLM did actually handle it, and I can just tell it to update the documentation
4) You can adjust the guidelines for the LLM little by little, so that it won't repeat the same "mistakes" (wrong technical decisions) again.
In time you get a feeling for what it can and cannot do, where you need to be specific, and where you know it will get it right without much detail. The time required is higher than vibe coding, but it decreases over time and is still less than doing it all myself.
There is another important benefit for me in using an LLM. I don't only write code; I do many other things: calls, writing documentation, discussing requirements, etc. Going back to writing code requires a change of mental state and recalling all the required knowledge (how the project is structured, how to use certain APIs, etc.). If I can do two hours of coding, that's fine, but if the change is small, that context switch becomes where I spend the majority of my time and mental energy.
Or I can ask the LLM to make the changes and review them. Seeing the code already done requires less energy and helps me recall the context.
tuan · 6h ago
> LLMs are okay at coding, but at scale they build jumbled messes.
This reminds me of the days of Dreamweaver and the like. Everybody loved how quickly they could drag and drop UI components onto a canvas and have the tool generate HTML code for them. It was great at the beginning, but when something didn't work correctly, you spent hours looking at the spaghetti HTML the tool had generated.
At least, back then, Dreamweaver used deterministic logic to generate the code. Now, you have AI with the capability to hallucinate...
bwfan123 · 7h ago
This [1] is an old and brilliant article by Dijkstra, "On the foolishness of natural language programming", that is relevant to this debate.
The argument is that the precision afforded by formal languages for programming, math, etc. was the key enabler of all the progress made in information processing.
I.e., vibe coding with LLMs will turn coding into a black art known only to the shamans who can prompt well.
Personally, I use LLMs to write code that I would have never bothered writing in the first place. For example, I hate web front-end development. I'm not a web dev, but sometimes it's cool to show visual demos or websites. Without LLMs, I wouldn't have bothered creating those, because I wouldn't have had the time anyway, so in that case, it's a net positive.
I don't use LLMs for my main pieces of work exactly due to the issues described by the author of the blogpost.
qwertox · 9h ago
Go to AI Studio, select Gemini Pro, give it your code, or describe a problem you want to solve, and then tell it that you want to --> discuss <-- the code. That you don't want it to generate code for you, but that you want to discuss it, that you want it to comment on your code or on how to solve the problem.
This is the best way to get Gemini to be a really good assistant, unless you want to add System Instructions which precisely describe how it should behave.
Because if you just ask it to solve some problem for you, it will eagerly generate a lot of code, or pile extra code onto the clean code you provided.
picklesman · 9h ago
Yeah my current approach is to generate a plan with Gemini Pro, with plenty of back and forths, and then have it write the plan to a markdown file. Afterwards I get either it or another model to follow the plan step by step. Without doing this the results are questionable and often require going back and fixing a lot.
jdiff · 9h ago
Even limiting its scope can be risky, if it's a complex problem in a niche that's not well-represented in the training data. Been learning Gleam lately, and yesterday when wrapping my head around recursion was starting to give me a headache, I tried to ask Gemini Pro to write a function. It needed to recurse down a tree, perform a function on each child that requires information accumulated from each of its ancestors, and return all nodes in the tree as a flat list.
It returned over 600 lines of code across 3 code blocks, almost all of them commented out for some reason, each with an accompanying essay, and each stuffed with hallucinated and unnecessary helper functions. Apparently Gemini Pro struggles to wrap its weights around recursion more than I do. I just wrote it myself and only needed 26 lines. It's not using tail calls, but hey, my target platform still doesn't support tail call optimization in 2025 anyway.
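For concreteness, here is a minimal sketch of the kind of function being described, written in Python rather than Gleam, with a hypothetical "sum of ancestor values" standing in for the real per-node computation:

    from dataclasses import dataclass, field

    @dataclass
    class Node:
        value: int
        children: list["Node"] = field(default_factory=list)

    def flatten_with_ancestors(node: Node, acc: int = 0) -> list[tuple[int, int]]:
        # Accumulate along the path from the root, then emit this node and
        # recurse into its children, flattening everything into one list.
        total = acc + node.value
        out = [(node.value, total)]
        for child in node.children:
            out.extend(flatten_with_ancestors(child, total))
        return out

    tree = Node(1, [Node(2, [Node(4)]), Node(3)])
    print(flatten_with_ancestors(tree))  # [(1, 1), (2, 3), (4, 7), (3, 4)]

It's the sort of thing that fits comfortably in a couple of dozen lines once the recursion clicks, which matches the hand-written result described above.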
Ozzie_osman · 9h ago
> One morning, I decide to actually inspect closely what’s all this code that Cursor has been writing.
You can't abdicate your responsibility as a builder to the LLM. You are still responsible for the architecture, for the integrity, for the quality. In the same way you wouldn't abdicate your responsibility if you hired a more junior engineer.
0x500x79 · 4h ago
I have also stepped back from LLM coding a bit. I still use LLMs for API discussions and maybe the one-off things I would have gone to Stack Overflow for before, but I have stepped back from autocomplete everywhere and even agentic flows.
And I feel more productive. I recommend that everyone gives it a try.
As a tenured software developer at my company, my measures of success go well beyond "how much code can I spit out": mentoring, refactoring, code readability/maintainability, and quality are all important to my job. I found that LLM-generated code (agent or autocomplete) was not hitting the necessary bar in these areas, so I stepped back from it. The readability point is extra important to me. Having maintained million-line products, I have found that readability matters more than churning out a ton of code, and LLMs just don't hit the bar here.
When I am playing with side projects where I don't hold the same bar, sure, I'll have Bolt or Lovable generate me some code in combination with Cursor or Windsurf, but those are low stakes and in some ways I just want to get something on paper.
jwblackwell · 7h ago
> the current PHP+MySQL combo was not fit for purpose anymore
He lost me here. It sounds like he tried to change from a boring stack he understood to Go and ClickHouse because they're cooler.
You're asking for trouble. LLMs are great and getting better, but you can't expect them to handle something like this right now.
JimDabell · 9h ago
> I’ve never used Go or Clickhouse before
> I have no concept of Go or Clickhouse best practices.
> One morning, I decide to actually inspect closely what’s all this code that Cursor has been writing. It’s not like I was blindly prompting without looking at the end result, but I was optimizing for speed and I hadn’t actually sat down just to review the code.
> I’m defaulting to coding the first draft of that function on my own.
I feel like he’s learnt the wrong lesson from this. There is a vast gulf between letting an LLM loose without oversight in a language you don’t know and starting from scratch yourself. There’s absolutely nothing wrong with having AI do the first draft. But it actually has to be a first draft, not something you blindly commit without review.
> “Vibe coding”, or whatever “coding with AI without knowing how to code” is called, is as of today a recipe for disaster, if you’re building anything that’s not a quick prototype.
But that’s what vibe coding is. It’s explicitly about quick throwaway prototypes. If you care about the code, you are not vibe coding.
> There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. […] It's not too bad for throwaway weekend projects, but still quite amusing. I'm building a project or webapp, but it's not really coding - I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works.
He’s basically saying that vibe coding is a disaster after his experience doing something that is not vibe coding.
jdiff · 9h ago
That is what vibe coding is. The tweet says "It's not too bad for throwaway projects" but that does not limit the definition, it only limits the advisable application of it.
JimDabell · 4h ago
Vibe coding is “Forget the code even exists; it mostly works for throwaway stuff”. That is clearly not what this person wants.
You can’t pick up vibe coding and then complain that it’s behaving as described or that it isn’t giving you something that wasn’t promised.
jdiff · 4h ago
Don't disagree at all. It was never promised that LLMs could do what he's trying to do. But just because he's holding it wrong doesn't mean that he's not holding it.
psim1 · 7h ago
"Using [your] brain" is key. And your brain has to understand software engineering in the first place, before you can use it. In my project, the product manager has recently tried his hand at vibe coding. LLM produces what you would expect it to produce and as described in the article: ok but not robust code, and fast. So now he thinks he knows how fast coding should be and has gone light speed on the product, expecting engineers to produce high quality software at the speed of LLM-produced "ok code". And it's hard to keep expectations reasonable when the world is all a-hype about LLM coding.
flanked-evergl · 7h ago
You should quit; with people like that running things, it's going to get run into the ground. The problem is that most product managers don't get that theirs is the fail-upwards position, the one you get when you literally can't do anything useful.
bookofjoe · 1h ago
As exemplified by the comments on this HN submission, whether LLMs are — to borrow a phrase from "Blade Runner" — a benefit or a hazard seems unresolvable for now.
in which Windsurf is forging ahead with an agentic LLM product that endeavors to not only replace software engineers but actually take over the entire software engineering PROCESS.
We're at a very interesting point, where investors and corporate interests are crystal clear in their intent to use LLMs to replace as many expensive humans as possible, while the technology available to do so is not there yet. And depending on your own perspective, it's not clear it ever will be, or perhaps it'll eventually be "good enough" for them to downsize us anyway.
I keep thinking about compilers. The old timers had been writing assembly by hand for years. The early C compilers were producing defective garbage output that was so incredibly poor that it was faster to keep coding by hand. Then the compilers got better. And better, and better, and now pretty much nobody inspects the output, let alone questions it.
mrbluecoat · 7h ago
1. Company uses AI for some development
2. The ratio increases and programmer layoffs begin
3. Bugs appear, AI handles most but not all
4. Bug complexity increases, company hires programmers to fix
5. Programmers can't decipher AI code mess
6. Programmers quit
7. Company unable to maintain their product
> the horror ensues
3dsnano · 9h ago
This perspective can only come from someone who started programming in the pre-LLM era. Back in those days the code didn't write itself; you had to think really hard about shit and bang your head against the wall. That literally required strong and coherent mental models, because there was no other way… there were no shortcuts.
Post-LLM era, you can use a shortcut to get a representation of what something could be. However I see a lot of folks shipping that representation straight to prod. By comparison, the mental models created are weaker and have less integrity. You might be able to feel that something is off, but you lack the faculties to express or explain why.
rossdavidh · 7h ago
Hypothesis: the actual long-term impact of LLMs on coding will be that more different languages get used, because it reduces the "ugh, I don't want to have to learn all that" psychological obstacle that prevents programmers experienced in one language from learning another. We should be using different languages for different tasks, but the push has always been to get One Language To Rule Them All. Ten years ago people tried to make Javascript their db query language and their backend language. More recently, the push is to use Python for everything. Both languages are fine in certain domains, but a poor fit in others.
Even if the idea that an LLM will help you do it is false, perhaps it is still a good idea if it convinces the experienced programmer to go ahead and use SQL for the query, Go for the async, Javascript for the frontend, etc. Right now, few if any companies would let you use the best tool for the job, if that's not one they already use. Perhaps the best use of LLMs is to convince programmers, and their bosses, to use the best tool for each job.
But, after you've gotten past that part, you will probably (like the author of this article) need to throw away the LLM-generated code and write it yourself.
pjm331 · 7h ago
Counter-hypothesis: fewer languages will be used, because LLMs will be better at languages with lots of training data.
I never write Python in my day to day, but every from-scratch project I've knocked out with Claude Code has been in Python, because that's what it seems to default to if I don't specify anything.
rossdavidh · 6h ago
Interesting point!
I wonder if this will also mean that new languages (or even algorithms or code patterns) are harder to get adopted, because the mass of existing code (that LLMs learned from) exerts a gravitational force pulling things back down to the status quo.
martinald · 7h ago
I agree with this, and it matches my thinking (however, the fact that this is even an option is an incredible breakthrough that I did not expect to happen pre-LLMs, potentially for decades).
I have basically given up on "in IDE" AI for now. I simply have a web interface on my 2nd monitor of whatever the "best" LLM (currently Gemini, was Claude) is and copy paste snippets of code back and forth or ask questions.
This way I have to physically copy and paste everything back, or just redo it "my way", which seems to be enough of an overhead that I have to mentally understand it. When it's in an IDE and I just have to tick accept, I end up getting over-eager with it, over-accepting things and then wishing I hadn't, and spending more time reverting stuff when the penny drops later that this was actually a bad way to do things.
It's really more a UX/psychology problem at the moment. I don't think the 'git diff' view of suggested changes that many IDEs use is the right one: I'm psychologically conditioned to review these like pull requests, and that kind of critique is just not aligned with critiquing LLM code. 99% of the PR reviews I do are about finding edge cases and clarifying things; they're not looking for very plausible yet subtly, completely wrong code (for the most part).
To give a more concrete example: if someone is doing something incredibly wrong in 'human PRs', they will tend to name the variables wrong because they clearly don't understand the concept, at which point the red flag goes off in my head.
In LLM PRs the variables are often named perfectly, but just don't do what they say they will. This means my 'red flag' doesn't fire as quickly.
Xcelerate · 8h ago
I noticed a funny thing a while back after using the “bird’s eye view” capability in my car to help with parallel parking: I couldn’t parallel park without that feature anymore after a few years. It made for an awkward situation every time I visited the Bay Area and rented a car. I realized I had become so dependent on the bird’s eye view that I had lost the ability to function without it.
Luckily in this particular case, being able to parallel park unassisted isn’t all that critical in the overall scheme of things, and as soon as I turned that feature off, my parking skills came back pretty quickly.
But the lesson stuck with me, and when LLMs became capable of generating some degree of functioning code, I resolved not to use them for that purpose. I’m not trying to behave like a stodgy old-timer who increasingly resists new tech out of discomfort or unfamiliarity—it’s because I don’t want to lose that skill myself. I use LLMs for plenty of other things like teaching myself interesting new fields of mathematics or as a sounding board for ideas, but for anything where it would (ostensibly) replace some aspect of my critical thinking, I try to avoid them for those purposes.
meander_water · 7h ago
The main narrative has been that developers who don't use AI are going to get left behind, but I think what's actually going to happen is that developers who retain their cognitive abilities are going to become rarer and more sought after.
tacheiordache · 8h ago
Same thing with GPS, I lost the ability to memorize routes.
outcoldman · 6h ago
When you have a hammer in your hand everything looks like a nail.
Developing without understanding what the LLM writes is dangerous. For me, having an LLM as a tool is like having a few junior developers around that I can advise and work with. If there's complicated logic I need to write, I write it. Once I've written it, I can ask the LLM to review it; it can be good at finding corner cases in some places. When I need to "move things from one bucket to another", like calling an API and saving to the DB, that's a perfect task for the LLM, and one I can easily review afterwards.
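As a rough illustration of that kind of "bucket to bucket" task, a minimal Python sketch (the endpoint and table schema here are made up for the example): fetch records from an API and upsert them into a local database.

    import json
    import sqlite3
    import urllib.request

    def sync_users(api_url: str = "https://api.example.com/users") -> int:
        # Fetch a JSON list of {"id": ..., "name": ...} records from the API.
        with urllib.request.urlopen(api_url) as resp:
            users = json.load(resp)
        # Save them into a local SQLite table, upserting by primary key.
        conn = sqlite3.connect("app.db")
        conn.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)")
        conn.executemany(
            "INSERT OR REPLACE INTO users (id, name) VALUES (:id, :name)", users
        )
        conn.commit()
        conn.close()
        return len(users)

Code like this is easy to review at a glance, which is what makes it a good fit for delegation.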
At the same time, the LLM is able to write pretty good complicated logic for my side projects as well. I might need to give it a few hints, but the results are amazing.
jacob019 · 9h ago
It's very frustrating when you end up spending more time tracking down bugs and cleaning up inconsistent code than it would have taken to do it right yourself in the first place. It's an important lesson. That said, there are still massive productivity gains available. If you let the model mess with your existing code base, or if the code gets past a small size, it will shit all over it. The LLM needs limited scope. I find it's great for frontend work, which is what I hate doing myself. I've been having it build web components, paired with targeted backend functions that wrap internal libraries I wrote for DB and API calls. Then I compose things. Careful instructions, internal documentation for the model, and well-contained, isolated tasks work best. The models get confused and distracted by existing code; it's often better to document function interfaces and usage rather than showing it the source. If the code is going into production you have to read it line by line and clean it up. I've been using Aider, mostly.
kragen · 6h ago
If and when LLMs are able to complete software development projects autonomously, the first sign probably won't be that your own software development job gets easier. Almost certainly, Google AI, DeepSeek, or OpenAI will notice before you do, and that's when their further progress will be limited only by how fast their GPUs can develop AI software instead of how fast their human SWEs can develop AI software.
Eric Schmidt gave a TED interview about the subject this week. He predicts the US and China bombing each other's data centers.
mjlawson · 9h ago
My rule with Cursor is that unless I know exactly what I want it to do, I have it stay out of my way. If I have a general idea, I may turn on auto-complete. I reserve the agent for things like tests, UX, and rote programming that saves me time.
When I do use the agent, I inspect its output ruthlessly. The idea that pages of code can be written before being inspected is horrifying to me.
icedchai · 6h ago
I've been experimenting with LLMs for coding / "agentic" editing using Zed over the past month or so. It's another tool in the toolbox, good for prototyping, also good for rote tasks (like upgrading a legacy code base, converting one call to another.) So far I've been most impressed by Claude Sonnet. The jury is still out... I haven't used it for any "work" tasks, only hobby projects, because I don't feel comfortable doing that yet.
exiguus · 8h ago
Over the past few months, I have read numerous articles on this topic. Some are in favor, some are against, and others fall somewhere in between. However, they all share a common shortcoming: the absence of code. There are no codebases, no commit histories—nothing that allows readers to track how the codebase evolves, gets refactored, or how new features are integrated.
Without access to the code, it's challenging to verify the authors' claims and form an independent opinion. In my view, we should be cautious about trusting articles that lack examples or sources. As someone wiser than me once said:
> Articles without sources are merely the opinions of the author.
mg · 9h ago
What I witness is that many people pick a random model and expect a prompt they wrote in 10 seconds to replace an hour of coding. And then are disappointed and go back to coding manually.
For me, the magic of LLMs is that I already get an hour of coding via 30 minutes of prompting and finetuning the generated code. And I know that the ratio will constantly improve as LLMs become better and as I finetune my prompts to define the code style I prefer. I have been coding for pretty much all my life and I never felt more excited about it than I do now.
It would be cool if people shared their prompts and the resulting commits. If someone here is disappointed by the commits LLMs make, I would love to see the prompt, the commit, and which model made it.
elzbardico · 9h ago
I prefer to ask LLMs to help me to create highly cohesive functions, and absolutely abhor using agents with their overly verbose code, their tendency to do shotgun surgery and the inevitable regressions.
So I kind of do a top-down design and use the LLM to help me with toil or with unfamiliar things that would require me finding the right documentation, code examples, bug reports, etc, etc... LLMs with web search are great for this kind of toil elimination.
LLMs are useful tools, but they have no notion of hierarchy, of causal relationships, or of any other relationship, actually. Any time they seem to have those capabilities in the code they generate, it is merely a coincidence, a very probable coincidence, but still not intentional.
shahbaby · 9h ago
Let's take a step back and think about how LLMs are trained.
Think about when ChatGPT gives you the side-by-side answers and asks you to rate which is "better".
Now consider the consequence of this at scale with different humans with different needs all weighing in on what "better" looks like.
This is probably why LLM generated code tends to have excessive comments. Those comments would probably get it a higher rating but you as a developer may not want that. It also hints at why there's inconsistency in coding styles.
In my opinion, the most important skill for developers today is not in writing the code but in being able to critically evaluate it.
dev_l1x_be · 9h ago
I think LLMs are great for generating the structure of your code. The only issue is that these models are amazing when you are navigating well-documented, much-discussed subjects, and start to go off the rails when you are doing somewhat esoteric things. The speed boost I get by generating Terraform and HTML, CSS, and JS with DeepSeek and Grok (sometimes asking one to review the other's code) is pretty significant. My biggest disappointment is Claude. I have purchased a subscription from them but I need to cancel it. The usual output from 3.7 is horrendous, and I am not even sure why. The same prompt works very well with DeepSeek and fails miserably with Claude.
sidrag22 · 6h ago
I paid for a year of Claude all at once and have deeply regretted it the last month or so. It seems like it just tries to do so much extra lately; half of my prompts are just reiterating that I don't want it to try to do more than I ask...
If I don't do that, it always seems to throw out 3 fresh files I'll need to add to make its crazy implementation work.
I've pretty much swapped to using it just for minor syntax stuff I forget. I'll take my slower progress in favor of fully grasping everything I've made.
I have one utility in my current project that was largely helped by Claude. It drives me nuts: it works, but I'm terrified of it, and it's so daunting to change now.
a7fort · 9h ago
You're right and I will still use it, but only for more limited scopes.
I've also cancelled my Claude subscription recently, but for different reasons - it doesn't have the "memory" feature that makes ChatGPT so much more worth it at the moment.
abhisek · 9h ago
I think getting the initial domain modeling, structure, and opinions that build up a code base is more of an art than a science. It's highly opinionated. Sure, you have SOLID, but in my experience no one follows it by the book.
Once the code base's structure and opinions are in place, I think LLMs are decent at writing code that is bounded to a single concern and not cross-cutting.
LLM-generated code bases work initially, but so does code written by college kids for an initial set of requirements. Even a staff+ level engineer will struggle to contribute to a code base that is a mess. Things will break randomly. I don't see how LLMs are any different.
Molitor5901 · 9h ago
I love using LLMs for debugging, but also to help me get started. Grok has been helpful when I give it a rubric and have it come up with the framework and headers, which saves some time. Unfortunately, in my humble and limited experience, asking it to then improve on that code is where it runs into mistakes. If you give it a perfect rubric of what you want, I think an LLM can give you some very good code, but it can't think beyond that like a human can.
Something I do very much love about LLMs is that I can feed it my server logs and get back an analysis of the latest intrusion attempts, etc. That has taught me so much on its own.
reconnecting · 10h ago
Off-topic: may I ask why PHP + MySQL is no longer considered suitable for you?
It's hard to say without specifics, but simply upgrading from MySQL to PostgreSQL without rewriting the PHP codebase in Go might resolve most of the potential issues.
a7fort · 9h ago
It's two separate issues:
1) I do a lot of scraping, and Go concurrency + Colly has better performance
2) My DB size is exploding and I have limited budget, and it looks like CH is so much better at compressing data. I recently did a test and for the same table with same exact data, MySQL was using 11GB, ClickHouse 500MB
hodgesrm · 5h ago
> MySQL was using 11GB, ClickHouse 500MB
That's a pretty typical best-case size for web logs and other time-ordered data where column values correlate with time. You do have to tweak the schema a bit to get there (specifically a "good" sort order, codecs, ZSTD instead of LZ4 compression, etc.).
rco8786 · 9h ago
> I recently did a test and for the same table with same exact data, MySQL was using 11GB, ClickHouse 500MB
That's pretty impressive
reconnecting · 9h ago
That actually makes sense. I thought you had a PHP application that you just decided to rewrite in Go.
bhouston · 10h ago
I also tried to go as hard as I could with agents and I ran into a ton of problems. The core issue I found is that inferring "intent" from existing code is hard and it makes errors that accumulate and build up. This ends up wrecking the software project by adding tons of complexity and duplicate ways of doing things and changing assumptions depending on which part of the code base you are in.
I think that the solution is to somehow move towards an "intent" layer that sits above the code.
I still use Cursor, but I use AI judiciously so that I don't wreck the project, and it only acts as an aid.
xyzal · 9h ago
LLMs are a good tool for certain purposes and that's it. Unfortunately, the vendors are obviously interested in inflating their stated capabilities to get the most returns on investment ...
bicepjai · 6h ago
After spending thousands of dollars on 3 projects that I thought I could finish in a week :) using agents with Cline, I reached a similar conclusion. The best use cases I have seen are good commit messages, templated documentation, and getting a cursory understanding of an existing code base. The shine wore off for me too.
Sxubas · 9h ago
IMO the mistake was not knowing what he was doing. He basically (at a macro scale) copy-pasted stack overflow code without understanding what it does... We've all been there.
I don't think LLMs are to blame here; it is a tool, and it can be used poorly. However, I do wonder what the long-term effects are on someone who uses them to work on things they are knowledgeable about. Unfortunately this is not explored in the article.
rcarmo · 9h ago
This isn’t so much about coding as it is about architecting code. As I wrote this week, it’s all about planning and a critical viewpoint:
To me, LLM coding is done best when it's used as a bicycle. You still have to pedal, know where you're going, not get run over by a car.
tasuki · 8h ago
> but I was optimizing for speed and I hadn’t actually sat down just to review the code
There, there's your problem. The problem is not LLMs, the problem is people not using their brain. Can't we use LLMs and our brains as well? Both are amazing tools!
Havoc · 9h ago
Kinda blows my mind that people do this with technology they’ve never used.
If I want to learn something new I won’t vibe code it. And if I vibe code I’ll go with tech I have at least some familiarity with so that I can fix the inevitable issues
rednafi · 7h ago
In my case, LLMs just replaced time wasted in stackoverflow fighting the cocky gatekeepers. Otherwise, I code as usual and it’s a great addition to my repertoire.
mattlondon · 9h ago
So the end game for this is, to a certain extent: does the code quality matter if it does what is needed?
In the past the quality mattered because maintenance and tech-debt was something that we had to spend time and resources to resolve and it would ultimately slow us down as a result.
But if we have LLMs do we even have "resources" any more? Should we even care if the quality is bad if it is only ever LLMs that touch the code? So long as it works, who cares?
I've heard this positioned in two different ways, from two different directions, but I think they both work as analogies to bring this home:
- do engineers care what machine code a compiler generates, so long as it works? (no, or at least very, very, very rarely does a human look at the machine code output)
- does a CEO care what code their engineers generate, so long as it works? (no)
It's a very, very interesting inflection point.
The knee jerk reaction is "yes we care! of course we care about code quality!" but my intuition is that caring about code quality is based on the assumption that bad code = more human engineer time later (bugs, maintenance, refactoring etc).
If we can use an LLM to effectively get an unlimited number of engineering resources whenever we need them, does code quality matter provided it works? Instead of a team of, say, 5 engineers and having to pick what to prioritise, you can just click a button, get the equivalent of 500 engineers working on your feature for 15 minutes, and churn out what you need, and it works and everyone is happy. Should we care about the quality of the code then?
We're not there yet - I think the models we have today kinda work for smaller tasks but are still limited with fairly small context windows even for Gemini (I think we'll need at least a 20x-50x increase in context before any meaningfully complex code can be handled, not just ToDo or CRUD etc), but we'll get there one day (and probably sooner than we think)
jack_pp · 9h ago
LLMs aren't free. The more garbage you accept into your code base, the harder it will be for LLMs to fix or extend it, because it will have context issues.
At this point in history they aren't good enough to just vibe code complex projects, as the author figured out in practice.
They can be very useful for most tasks, even niche ones, but you can't trust them completely.
withinboredom · 10h ago
I've come to mostly use AI to bounce ideas off of. Even that has gotten less useful these last few months for some reason. Now it either wants to do the work for me or just spits out complete agreement without being the least bit critical.
alpaca128 · 8h ago
It's known that OpenAI tweaked ChatGPT output so it's very agreeable and flattering no matter what the input is. Which turns out to be a problem because even the most schizophrenic paranoid rant will get an enthusiastic response praising how the user isn't a sheep and doesn't blindly conform to societal norms etc.
And why did they make this change? Because it makes users spend more time on the platform. ChatGPT wasn't just one of the fastest-growing web services ever, it's now also speedrunning enshittification.
sega_sai · 5h ago
After months of using a hammer, I'm back to using my hands.
LLM is a tool. I think we still need to learn how to use it, and in different areas it can do either more or less stuff for you.
Personally I don't use it for most everyday coding, but if I have something tedious to write, the LLM is the first place I go for a code draft. That works for me and for the LLM I use.
pawanjswal · 9h ago
It’s wild how fast we outsourced thinking to LLMs without realizing the cost.
kburman · 9h ago
I only use LLMs later in the development stage to review my code and check if I’ve missed anything. I don’t use them for implementing features, that just feels like a waste of time.
tantalor · 9h ago
You weren't doing any code reviews? What did you expect?
bytemerger · 9h ago
To be honest, the influencers selling these tools are not helping either. I guess to scale adoption you need to over-promise in this case.
broast · 7h ago
Discern when to delegate to the LLM, and give it enough context and guardrails that it simply saves you the keystrokes by coding exactly what you were thinking.
huqedato · 9h ago
Sadly, the employers (CEOs, managers etc.) don't think like that.
jeremyjh · 9h ago
This tracks for me. It's tempting to see LLMs as a superpower that makes it possible to use any programming language or platform, but I have much better results on platforms I know intimately... and I think this is largely because I'm using them in much more limited ways. When I've tried to pick up new tools and frameworks and lean heavily on LLMs for the early scaffolding, I have experiences similar to the author's.
So someone could come along and say "well don't do that then, duh". But actually a lot of people are doing this, and many of them have far fewer fucks to give than the author and I, and I don't want to inherit their shitty slop in the future.
donatj · 7h ago
I kind of miss figuring things out by myself. I am unwilling to use anything it generates that I don't completely understand, so I often spend just as much time figuring out why the code it created works as I would have spent just writing it myself.
My job has changed from writing code to code reviewing a psychopathic baby.
bowsamic · 9h ago
Coding and writing is like arithmetic and algebra, use it or lose it. You don’t gain fluency by throwing everything into Mathematica.
Of course the real question is, is there any reason to be good at coding and writing, if LLMs can just do it instead? Of course it’s hard to sell that we should know arithmetic when calculators are ubiquitous.
Personally, I value being skilled, even if everyone is telling me my skills will be obsolete. I simply think it is inherently good to be a skilled human
65 · 6h ago
LLMs are pretty similar in concept to site builders. They can do the basics reasonably well but if you want anything custom they show their weaknesses.
I have been building a gallery for my site with custom zoom/panning UX that works exactly how I want it (which, by the way, does not already exist as a library). Figuring out the equations for it is simply impossible for an LLM to do.
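For reference, the basic relation involved, sketched in 1D Python and assuming a simple scale-plus-offset transform (a real gallery adds constraints, inertia, and clamping that are much harder to pin down):

    def zoom_about(cursor: float, offset: float, scale: float, new_scale: float) -> float:
        # The content coordinate currently under the cursor.
        world = (cursor - offset) / scale
        # Pick the new offset so that the same content point maps back to the cursor.
        return cursor - world * new_scale

    # Zooming in 2x around screen x=100 with no prior pan: the new offset is -100,
    # so the content that was under the cursor stays under the cursor.
    print(zoom_about(cursor=100, offset=0, scale=1.0, new_scale=2.0))  # -100.0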
I wouldn't be surprised if after the LLM hype we go back to site/app builders being the entry level option.
jpcompartir · 9h ago
Spot on.
komali2 · 10h ago
> When it was time to start coding, I asked Claude to give me a big and complicated markdown file outlining my existing infrastructure, my desired new infrastructure, what I’m trying to achieve, why I'm doing it, etc.
Yes, imo, too many people believe current LLMs are capable of doing this well. They aren't. Perhaps soon! But not today, so you shouldn't use LLMs to do this for serious projects. Writing that markdown file sounds like a wonderful way to get your own head around the infrastructure and chat with other devs / mentors about it though; that's a great exercise, and I'm going to steal it.
Anyway, LLMs are, as we all know, really good text predictors. Asking a text predictor to predict too much stretches the predictions too thin: give it one file, one function, and very specific instructions, and an LLM is a great way to increase your productivity and reduce mental overload on menial tasks (e.g., swap out all uses of "Button" in @ExperimentModal.tsx with @components/CustomButton.tsx). I think LLM usage in this sense is basically necessary to stay competitively productive, unless you're already a super-genius, in which case you probably don't read HN comments anyway so who cares. For the rest of us mortals, I argue that getting good at using LLM co-pilots is as important as learning the key bindings in your IDE and OS of choice. Peel away another layer of mental friction between you and accomplishing your tasks!
roxolotl · 9h ago
> Some days I feel like we’re also being gaslit by LLM providers. Just look at any AI related subreddit, and you’ll see people having completely opposite experiences, with the same exact model, with the same prompt, on the same day. If you code with AI long enough you’ll be able to relate. One day it’s amazing, the next day it’s incredibly stupid.
> Are they throttling the GPUs? Are these tools just impossible to control? What the fuck is going on?
Money and dreams. As everyone knows there’s an obscene amount of money invested in these tools of course. The capitalist class is optimistic their money can finally do the work for them directly instead of having to hire workers.
But more than that, AI is something that’s been alluring to humans forever. I’m not talking about cyberpunk fantasies I’m talking about The Mechanical Turk, Automata in the Middle Ages, Talos[0]. The desire to create an artificial mind is, if not hardwired into us, at least culturally a strong driver for many. We’re at a point where the test of the computer age for determining if we built AI was so utterly destroyed it’s unclear how to go about judging what comes next.
The hype is understandable if you step back and view it through that lens. Maybe we are at an inflection point and just a bit more scaling will bring us the singularity. Or maybe we've seen a burst of progress that will mostly stall, from a consumer perspective, for another 5-10 years, like all of the major AI moments before.
If you want to use them effectively it’s the same as any tool. Understand what they are good at and where they flounder. Don’t give up your craft, use them to elevate it.
Not quite a source but it’s a fun read from 2019 about this.
apwell23 · 9h ago
it works great for bad engineers who don't abstract and churn out repetitive widgets.
Unfortunately, CEO/manager types cannot distinguish between bad and good engineers.
jmyeet · 9h ago
We have numerous, well-documented cases of LLMs being just plain wrong and/or hallucinating. There is an epidemic of LLM usage in college (and even high school), to the point where lack of critical thinking is becoming a massive problem, and it will only get worse.
This represents a huge security threat too if code is uncritically applied to code bases. We've seen many examples where people try and influence LLM output (eg [1][2]). These attempts have generally varied from crude to laughably bad but they'll get better.
Is it so hard to imagine prompt injection being a serious security threat to a code base?
That aside, I just don't understand being "all in" on LLMs for coding. Is this really productive? How much boilerplate do you actually write? With good knowledge of a language or framework, that tends to be stuff you can spit out really quickly anyway.
Use the tool when it makes sense or when someone shows you how to use it more effectively. This is exactly like the calculator "ruining people's ability to do arithmetic" when the vast majority of the population had been innumerate for hundreds of thousands of years, up until the Industrial Revolution, where suddenly dead white European nobility are cool.
There is nothing fun or interesting about long division, and the same goes for plenty of software development.
If LLMs don't work for your usecase (yet) then of course you have to stick with the old method, but the "I could have written this script myself, I can feel my brain getting slower" spiel is dreadfully boring.
sidrag22 · 5h ago
There is a lot to be said about adding complexity you don't understand and then trying to work around it, despite not grasping it.
Comparing it to no longer doing the long-division portion of a math problem isn't a great 1-to-1 here. Long division would be a great metaphor if the user were TRULY only using LLMs for autocomplete of tasks that add 0 complexity to the overall project. If you use it to implement something and don't fully grasp it, you are just creating a weird gap in your overall understanding of the code base.
Maybe we are in full agreement and the thrust of your argument is just that if it doesn't fit your current use case then don't use it.
I don't think I agree with the conclusion of the article that it is making the non-coding population dumber, but I also AGREE that we should not create these gaps in knowledge within our own codebase by just trusting AI. It's certainly NOT a calculator and is wrong a lot, and even when it IS right, that gap is a gap for the coder, and that's an issue.
maxlin · 9h ago
We're at the point where I'm starting to hear this exact "backlash" sentiment from a lot of people. While I never went that far, out of discomfort (I like understanding things), I do kind of respect the "effort" of being willing to let go a bit. One reason is probably also that I tend to end up working on large Unity projects, which are about as incompatible with letting AI "take the wheel" as projects come.
That aside, I wonder if the OP would have had all the same issues with Grok? In several uses I've seen it surprisingly outperform the competition.
But they drew boundaries with very specific conditions that lead the reader. It's a common theme in these AI discussions.
Is that not true? It feels sufficiently nuanced and gives a spectrum of utility: not a binary one and zero, but "10x" on one side and perhaps 1.1x at the other extreme.
The reality is slightly different - "10x" is SLoC, not necessarily good code - but the direction and scale are about right.
People with many years or even decades of hands-on programming experience have the deep understanding and tacit knowledge that allow them to tell LLMs clearly what they want, quickly evaluate generated code, guide the LLM out of any rut or rabbit hole it has dug itself into, and generally wield LLMs as DWIM tools - because, again, unlike juniors, they actually know what they mean.
Again, I think the high-level premise is correct, as I already said; the delivery falls flat, though. More junior devs have a larger opportunity to extract value.
I gently suggested that the problem may not have been with his post but with your understanding. Apparently you missed the point again.
Amen. Seriously. They're tools. Sometimes they work wonderfully. Sometimes, not so much. But I have DEFINITELY found value. And I've been building stuff for over 15 years as well.
I'm not "vibe coding", I don't use Cursor or any of the ai-based IDEs. I just use Claude and Copilot since it's integrated.
Yes, but these lax expectations are what I don't understand.
What other tools in software sometimes work and sometimes don't that you find remotely acceptable? Sure all tools have bugs, but if your compiler had the same failure rate and usability issues as an LLM you'd never use it. Yet for some reason the bar is so low for LLMs. It's insane to me how much people have indulged in the hype koolaid around these tools.
Other people.
Seriously, all that advice about not anthropomorphizing computers is taken way too seriously now, and it's doing a number on the industry. LLMs are not a replacement for compilers or other "classical" tools - they're a replacement for people. The whole thing that makes LLMs useful is their ability to understand what some text means, whether it's written in natural language or code. But that task is inherently unreliable because the problem itself is ill-specified; the theoretically optimal solution boils down to "be a simulated equivalent of a contemporary human", and even that wouldn't be perfectly reliable.
LLMs are able to trivially do tasks in programming that no "classical" tools can, tasks that defy theoretical/formal specification, because they're trained to mimic humans. Plenty of such tasks cannot be done to the standards you and many others expect of software, because they're NP-complete or even equivalent to the halting problem. LLMs look at those and go, "sure, this may be provably unsolvable, but actually the user meant X, therefore the result is Y", and succeed with that reliably enough to be useful.
Like, take automated refactoring in dynamic languages. Any nontrivial ones are not doable "classically", because you can't guarantee there aren't references to the thing you're moving/renaming that are generated on the fly by eval() + string concatenation, etc. As a programmer, you may know the correct result, because you can understand the meaning and intent behind the code, the conceptual patterns underpinning its design. DAG walkers and SAT solvers don't. But LLMs do.
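A tiny Python illustration of that point about dynamic dispatch defeating static rename tools (the class and method names here are made up):

    class Report:
        def do_export(self):
            return "exported"

    def run(report: Report, action: str) -> str:
        # The method name is assembled at runtime, so a tool scanning the source
        # for "do_export" never sees this call site; rename the method and
        # nothing flags it, but the call breaks at runtime.
        return getattr(report, "do_" + action)()

    print(run(Report(), "export"))  # "exported"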
Anyway, code generation tools almost always are born unreliable, then improve piecewise into almost reliable, and finally get replaced by something with a mature and robust architecture that is actually reliable. I can't imagine how LLMs could traverse this, but I don't think it's an extraordinary idea.
There’s definitely hype out there, but dismissing all AI use as “koolaid” is as lazy as the Medium posts you’re criticizing. It’s not perfect tech, but some of us are integrating it into real production workflows and seeing tangible gains, more code shipped, less fatigue, same standards. If that’s a “low bar,” maybe your expectations have shifted.
In an era where an LLM can hallucinate (present you a defect) with 100% conviction, and vibe coders can ship code of completely unknown quality with 100% conviction, the bar by definition has to have been set lower.
Someone with experience will still bring something more than just LLM-written code to the table, and that bar will stay where it is. The people who don't have experience won't even feel the shortcomings of AI because they won't know what it's getting wrong.
Searching for relevant info on the Internet can take several attempts, and occasionally I end up not finding anything useful.
My ide intellisense tries to guess what identifier I want and put it at the top of the list, sometimes it guesses wrong.
I've heard that the various package repositories will sometimes deliberately refuse to work for a while because of some nonsense called "rate limiting".
Cloud deployments can fail due to resource availability.
It's a really, really simple concept.
If I have a crazy TypeScript error, for instance, I can throw it in and get a much better idea of what's happening. Just because that's not perfect doesn't mean it isn't helpful. Even if it works 90% of the time, it's still better than 0% of the time (which is where I was before).
It's like google search without ads and with the ability to compose different resources together. If that's not useful to you, then I don't know what to tell you.
Anyhoo... I find that there are times where you have to really get in there and question the robot's assumptions as they will keep making the same mistake over and over until you truly understand what it is they are actually trying to accomplish. A lot of times the desired goal and their goal are different enough to cause extreme frustration as one tends to think the robot's goal should perfectly align with the prompt. Once it fails a couple times then the interrogation begins since we're not making any further progress, obviously.
Case in point: I have an "Operational Semantics" document, which is correct, and a PEG VM, which is tested to be correct, but when you combine the two, one of the operators was being compiled incorrectly due to the way backtracking works in the VM. After Claude's many failed attempts we had a long discussion and finally tracked the problem down to something outside of its creative boundaries; it needed one of those "why don't you do it this way..." moments. Sure, I shouldn't have to do this, but that's the reality of the tools, and, like they say, "a good craftsman never blames his tools".
* Figuring out how to write small functions (10 lines) in canonical form in a language that I don't have much experience with. This is so I don't end up writing Rust code as if it were Java.
* Writing small shell pipelines that rely on obscure command line arguments, regexes, etc.
But with aider/claude/bolt/whatever your tool of choice is, I can give it a handful of instructions and get a working page to demo my feature. It’s the difference between me pitching the feature or not, as opposed to pitching it with or without the frontend.
I can deliver 5k LoC in a day easily on a greenfield project and 10k if I sweat or there's a lot of boilerplate. I can do code reviews of massive multi-thousand line PRs in a few minutes that are better than most of the ones done by engineers I've worked with throughout a long career, the list just goes on and on. I only manually code stuff if there's a small issue that I see the LLM isn't understanding that I can edit faster than I can run another round of the agent, which isn't often.
LLMs are a force multiplier for everyone, really senior devs just need to learn to use them as well as they've learned to use their current tools. It's like saying that a master archer proves bows are as good as guns because the archer doesn't know how to aim a rifle.
The LLMs learn from examples, but if everyone uses LLMs to generate code, there's no new code to learn new features, libraries, or methods from. The next generation of models will just be trained on code generated by their predecessors, with no new inputs.
Being an LLM maximalist basically means freezing development in the present, now and forever.
[1]https://deepmind.google/discover/blog/alphaevolve-a-gemini-p...
Would you ever be able to tell, e.g., Copilot: "I need a web framework with these specs, go create that framework for me", and then later have Claude actually use that framework?
Sure, why not?
The "magic sauce" of LLMs is that they understand what you mean. They've ingested all the thinking biases and conceptual associations humans have through their training on the entire training corpus, not just code and technical documentation. When Copilot cobbles together a framework for you, it's going to name the functions and modules and variables using domain terms. For Claude reading it, those symbols aren't just meaningless tokens with identity - they're also words that mean something in English in general, as well as in the web framework domain specifically; between that and code itself having common, cross-language pattern, there's more than enough information for an LLM to use a completely new framework mostly right.
Sure, if your thing is unusual enough, LLMs won't handle it as well as something that's over-represented in their training set, but then the same is true of humans, and both benefit from being provided some guidelines and allowed to keep notes.
(Also, in practice, most code is very much same-ish. Every now and then, someone comes up with something conceptually new, but most of the time, any new framework or library is very likely to be reinventing something done by another library, possibly in different language. Improvements, if any, tend to be incremental. Now, the authors of libraries and frameworks may not be aware they're retracing prior art, but SOTA LLMs very likely seen it all, across most programming languages ever used, and can connect the dots.)
And in the odd case someone really invents some unusual, new, groundbreaking pattern, it's just a matter of months between it getting popular and LLMs being trained on it.
My point is exactly in line with your comment. The tools you get immediate value out of will vary based on circumstance. There's no silver bullet.
The key is that I've always had that prompt/edit/verify loop, and I've always leaned heavily on git to be able to roll back bad AI changes. Those are the skills that let me blow past my peers.
    github.com/AlDanial/cloc v 2.04  T=0.05 s (666.3 files/s, 187924.3 lines/s)
    -------------------------------------------------------------------------------
    Language                     files          blank        comment           code
    -------------------------------------------------------------------------------
    Python                          24           1505           1968           5001
    Markdown                         4             37              0            121
    Jinja Template                   3             17              2             92
    -------------------------------------------------------------------------------
    SUM:                            31           1559           1970           5214
    -------------------------------------------------------------------------------
Note this project also has 199 test cases.
Initial commit for cred:
    commit caff2ce26225542cd4ada8e15246c25176a4dc41
    Author: redacted <redacted>
    Date:   Thu May 15 11:32:45 2025 +0800
And when I say easy, I was playing the bass while working on this project for ~3 hours. Yikes. But also lol.
Back then, Javis wasn’t built for code, but it was a surprisingly great coding companion. Yes, it only gave you 80% working code, but because you had to get your hands dirty, you actually understood what was happening. It didn't give me 10x, but I'm happy with 2x and a good understanding of what's going on.
Fast-forward to now: Copilot, Cursor, Roo Code, Windsurf and the rest are shockingly good at output, but sometimes the more fluent the AI, the sneakier the bugs. They hand you big chunks of code, and I bet most of us don't have a clear picture of what's going on at ground level, just an overall idea. It's just too tempting to blindly "accept all" the changes.
It’s still the old wisdom — good devs are the ones not getting paged at 3am to fix bugs. I'm with the OP. I'm more happy with my 2x than waking up at 3am.
> * 1 year JS (react, nextjs, etc.) dev
> * start mostly from scratch on new ideas
> * have little prior IDE preference
> * have high tolerance for bugs and just want to ship and try stuff
> THEN: LLMs will to 10x you. An IDE like Cursor will immediately make you way faster.
And also probably dead-end you, and you'll stay the bug-tolerant 1-year JS dev for the next 10 years of your career.
It's like eating your seed corn. Sure you'll be fat and have it easy for a little while, but then next year...
~15 years later, I don't think I'm worse off than my peers who stayed away from all those websites. Doing the right searches is probably as important as being able to read manuals properly today.
no they didn't, no one said that
i know that because i was around then and everyone was doing the same thing
also, maybe there's a difference between searching and collating answers and just copy and pasting a solution _without thinking_ at all
Having a tool that’s embedded into your workflow and shows you how things can be done based on tons of example codebases could help a junior dev quite a lot to learn, not just to produce.
Based on the classmates I had in college who were paying to get a CS degree, I'd be surprised if many junior devs already working a paid job put much effort into learning rather than producing.
I'd still expect most junior devs that use an LLM to get their job done won't be motivated to study the generated code enough to really learn from it.
A student is also only as good as the teacher, though that's a whole other can of worms with LLMs.
I'm not sure there's much of a skillset to speak of for these tools, beyond model-specific tricks that evaporate after a few updates.
Much of the time I spend writing code is spent not on the general overview but on the code I am about to write itself. If I actually care about that code (eg I am not gonna throw it away by the end of the day), it is about how to make it as concise and understandable to others (incl future me) as possible, what cases to care about, and what choices to make so that my code remains maintainable after a few days. It may be about refactoring previous code and all the decisions that go with that. LLM-generated code, imo, is too bloated; whether they put in stuff like asserts is always hit or miss relative to what actually matters. Their comments tend to be completely trivial instead of stating the intention of the code, and though I have put some effort into getting them to use a coding style similar to mine, they often fail there too. So I only use them if the code they write can be isolated enough, eg a straightforward auxiliary function here and there that will be called in a few places but where it does not matter as much what happens inside. There are just too many decisions at each step that LLMs are not great at resolving, ime.
I depend more on LLMs if I care less about the maintainability of the code itself and more about getting it done as fast as possible, or if I am just exploring and do not actually care about the code at all. For example, I may be in a rush to get something done and will care about the rest later (granted they can actually do the task, else I am losing time). But when I tried this for my main work, it soon became a mess that would take more time to fix, even if they seemed to speed me up initially. Granted, if my field was different and the languages I was using more popular/represented in the training data, I might have found more uses for them, but I still think that after some point it becomes unsustainable to leave decisions to them.
Basically, use LLMs as a tool for specific things, don't let them do whatever and everything.
However it's great for simple "write a function that does X", which I could do myself but it would take longer, be boring and require several iterations to get it right.
Having said that, blindly copying a few lines of ChatGPT code did lead to automated newsletters being sent out with the wrong content.
You have the copilot mode which takes no learning at all which might give you some speedup, especially if you are doing repetitive stuff, it might even 10x+ you.
You have cmd-k mode, which you need to prompt and seems to be a lobotomized version of chat. I find putting comments and waiting for the copilot mode to kick in better, as then the way we got there is saved.
Then there is agentic editing chat: that is the timewaster you speak of, I believe, but what is there to learn? Sometimes it generates a metric ton of code, including in massive legacy code bases, that helps, and often it simply cannot do the task at all.
I don't think these cases you describe, or at least the second one once it goes beyond the basics, are different. There is nothing to learn except that you need to read all the code, decide what you want in technical detail, and ask that of the agentic chat. Anything else fails beyond the basics, and "learning to use it" amounts to exactly that; if you didn't figure that out after 5 minutes, you definitely never fine-tuned a PyCharm IDE either.
It is a tool that customizes code it ingested for your case specifically, if it can. That is it. If it never saw a case, it won't solve it, no matter what you "learn to use". And I am fine saying that in public: we use LLMs a lot, and I can give you very simple cases that current models will never fix, short of typing up the exact code for them (and often even that doesn't work). They just get stuck making meaningless changes with confidence.
I have some grey hair and I've been programming since I was a kid. Using CoPilot autocompletion roughly doubles my productivity while cutting my code quality by 10%.
This happens because I can see issues in autocompleted code far faster than I can type, thanks to years of reading code and reviewing other people's code.
The 10% quality loss happens because my code is no longer lovingly hand-crafted single-author code. It effectively becomes a team project shared by me and the autocomplete. That 10% loss was inevitable as soon as I added another engineer, so it's usually a good tradeoff.
Based on observation, I think my productivity boost is usually high compared to other seniors I've paired with. I see a lot of people who gain maybe 40% from Copilot autocomplete.
But there is no world in which current AI is going to give me a 900% productivity boost when working in areas I know well.
I am also quite happy to ask Deep Research tools to look up the most popular Rust libraries for some feature, and to make me a pretty table of pros and cons to skim. It's usually only 90% accurate, but it cuts my research time.
I do know how to drive Claude Code, and I have gotten it to build a non-trivial web front-end and back-end that isn't complete garbage without writing more than a couple of dozen lines myself. This required the same skill set as working with an over-caffeinated intern with a lot of raw knowledge, but who has never written anything longer than 1,000 lines before. (Who is also a cheating cheater.) Maybe I would use it more if my job was to produce an endless succession of halfway decent 5,000-line prototypes that don't require any deep magic.
Auto-complete plus Deep Research is my sweet spot right now.
they will make you clueless about what the code does, and your code will be unmaintainable.
LOL
The worst is the middle ground of stacks that are popular enough to be known but not well enough for an LLM to really know them. I say worst because in these cases the facade that the LLM understands how to create your product will fall before the software's lifecycle ends (at least, if you're vibe-coding).
For what it's worth, I've mostly been a hobbyist but I'm getting close to graduating with a CS degree. I've avoided using LLMs for classwork because I don't want to rob myself of an education, but I've occasionally used them for personal, weird projects (or tried to at least). I always give up with it because I tend to like trying out niche languages that the LLM will just start to assume work like python (ex: most LLMs struggle with zig in my experience).
there's MCP servers now that should theoretically help with that, but that's its own can of worms.
I have one project that is very complex, and for that one I can't and don't use LLMs.
I've also found it's better if you can get it to generate everything in the one session; if you switch to other LLMs or sessions it will quickly degrade. That's when you will see duplicate functions and dead-end code.
Most code out there is glue. So there’s a lot of training data on integrating/composing stuff.
If you take this as a whole, you could turn that 30-60 min into 5 min for most dev work.
This is a reasonable usage of LLMs up to a certain point, and especially if you're in full control of all the requirements as the dev. If you don't mind missing details related to sales and marketing such as SEO and analytics, I think those are not really "landing pages", but rather just basic web pages.
> I hate web development, and LLMs are pretty good at it because I'd guess that is 90% of their training data related to software development.
Your previous sentence does not support this at all since web development is a much more broad topic than your perception of landing pages. Anything can be a web app, so most things are nowadays.
They are being marketed as virtual assistants that will literally do all the work for you. If they were marketed truthfully, however, people would probably realize that they aren't worth the cost and it's largely more beneficial to search the web and/or crowdsource answers.
To be uncharitable and cynical for a moment (and talking generally rather than about this specific post), it yields content. It gives people something to talk about. They define their personality by absolutes, when in reality the world is infinite shades of gradients.
Go "all in" on something and write about how amazing it is. In a month you can write your "why I'm giving up" the thing you went all in on and write about how relieved/better it is. It's such an incredibly tired gimmick.
"Why I dumped SQL for NoSQL and am never looking back" "Why NoSQL failed me"
"Why we at FlakeyCo are all in on this new JavaScript framework!" "Why we dumped that new JavaScript framework"
This same incredibly boring cycle is seen on here over and over and over again, and somehow people fall for it. Like, it's a huge indicator that the writer more than likely has bad judgment and probably shouldn't be the person to listen to about much.
Like most rational people that use decent judgement (rather than feeling I need to go "all in" on something, as if the more I commit the more real the thing I'm committing to is), I leverage LLMs many, many times in my day to day. Yet somehow they have authored approximately zero percent of my actual code, and they are still a spectacular resource.
But this CEO I just met on LinkedIn?
"we already have the possibility to both improve our productivity and increase our joy. To do that we have to look at what software engineering is. That might be harder than it looks because the opportunity was hidden in plain sight for decades. It starts with rethinking how we make decisions and with eliminating the need for reading code by creating and employing contextual tools."
Context is how AI is a whole new layer of complexity that SWE teams have to maintain.
I'm so confused.
It may be bad practice, but consider that the median developer does not care at all about the internals of the dependencies that they are using.
They care about the interface and about whether they work or not.
They usually do not care about the implementation.
Code generated by LLM is not that different than pulling in a random npm package or rust crate. We all understand the downsides, but there is a reason that practice is so popular.
So I really hope you don't pull in packages randomly. That sounds like a security risk.
Also, good packages tend to have a team of people maintaining them. How is that the same exactly?
It absolutely is, but that is beside the point
> Also, good packages tend have a team of people maintaining it. How is that the same exactly?
They famously do not https://xkcd.com/2347/
[citation needed]
> Code generated by LLM is not that different than pulling in a random npm package or rust crate
It's not random, there's an algorithm for picking "good" packages and it's much simpler than reviewing every single line of LLM code.
Everybody agrees that e.g. `make` and autotools is a pile of garbage. It doesn't matter, it works and people use it.
> It's not random, there's an algorithm for picking "good" packages and it's much simpler than reviewing every single line of LLM code.
But you don't need to review every single line of LLM code just as you don't need to review every single line of dependency code. If it works, it works.
Why does it matter who wrote it?
Yes, LLM code is significantly worse than even a random package as it very often doesn't even compile.
Yup. I've spoken about this on here before. I was a Cursor user for a few months. Whatever efficiency gains I "achieved" were instantly erased in review, as we uncovered all the subtle and not-so-subtle bugs it produced.
Went back to vanilla VSCode and still use copilot but only when I prompt it to do something specific (scaffold a test, write a migration with these columns, etc).
Cursor's tab complete feels like magic at first, but the shine wore off for me.
My favorite thing here watching a co-worker is when Cursor tries to tab complete what he just removed, and sometimes he does it by reflex.
For me LLMs are a game changer for devops (API knowledge is way less important now than it's ever been), but I'm still copy-pasting from ChatGPT, however primitive that may seem.
Fundamentally I don't think it's a good idea to outsource your thinking to a bot unless it's truly better than you at long term decision making. If you're still the decision maker, then you probably want to make the final call as to what the interfaces should look like. I've definitely had good experiences carefully defining object oriented interfaces (eg for interfacing with AWS) and having LLMs fill in the implementation details but I'm not sure that's "vibe coding" per se.
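To make that concrete, here is a minimal sketch of the split I mean (none of these names are a real AWS wrapper, they're just illustrative): I write the abstract interface by hand, and the concrete class under it is the kind of thing I'd let an LLM fill in and then review.

    from abc import ABC, abstractmethod


    class ObjectStore(ABC):
        """The contract I define by hand; the LLM doesn't get to change this."""

        @abstractmethod
        def put(self, key: str, data: bytes) -> None:
            """Store data under key, overwriting any existing object."""

        @abstractmethod
        def get(self, key: str) -> bytes:
            """Return the object stored under key, or raise KeyError."""


    class InMemoryObjectStore(ObjectStore):
        """The kind of implementation detail I'm happy to let the LLM write."""

        def __init__(self) -> None:
            self._objects: dict[str, bytes] = {}

        def put(self, key: str, data: bytes) -> None:
            self._objects[key] = data

        def get(self, key: str) -> bytes:
            if key not in self._objects:
                raise KeyError(key)
            return self._objects[key]

The point is that the interface pins down the decisions I care about; swapping the in-memory version for a real cloud-backed one later doesn't change any calling code.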
But after I got a week into my LLM-led code base, it became clear it was all spaghetti code and progress ground to a halt.
This article is a perfect snapshot of the state of the art. It might improve in the future, but this is where it is in May 2025.
I still use LLMs heavily. However, I now follow two rules:
* Do not delegate any deep thought to them. For example, when thinking through a difficult design problem, I do it myself.
* Deeply review and modify any code they generate. I go through it line-by-line and edit it thoroughly. I have to do this because I find that much of what they generate is verbose, overly defensive, etc. I don't care if you can fix this through prompting; I take ownership over future maintainability.
"Vibe coding" (not caring about the generated code) gives me a bad feeling. The above approach leaves me with a good feeling. And, to repeat, I am still using them a lot and coding a lot faster because of it.
It takes a good hour or two to draw up the plans, but it's the kind of thing that would take me all day to do, possibly several, as my ADHD brain rebels against the tedium. AI can do yeoman's work when it just wings it, and sometimes I have just pointed it at a task and it did it in one shot, but it works best when it has detailed plans. Plus it's really satisfying to be able to point at the plan doc and literally just say "make it so".
This is the issue, right? If you have to do this, are you saving any time?
I find most benefit in writing tests for a function that doesn't exist yet, then giving the LLM the function signature and having it implement the function. TDD in the age of LLMs is great!
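Roughly what I hand over, as a toy sketch (slugify is just an invented example, not from anyone's real project): the tests and the signature are mine, and the body is the part I ask the LLM to fill in.

    import re


    def slugify(title: str) -> str:
        """Turn a title into a lowercase, hyphen-separated URL slug."""
        # This body is the part I ask the LLM to produce from the signature + tests.
        slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
        return slug.strip("-")


    def test_slugify():
        assert slugify("Hello, World!") == "hello-world"
        assert slugify("  LLMs & TDD  ") == "llms-tdd"
        assert slugify("already-a-slug") == "already-a-slug"

If the generated body doesn't pass the tests, I regenerate or fix it by hand; either way the tests capture the behaviour I actually wanted.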
The other thing is that I have the LLM make the modifications I want.
I know how long it takes to get an extremely bad programmer to do what you want, but the LLM is far better than that, so I do come out ahead.
I'm working on a few toy projects and I am using LLM for 90% of it.
The result is 10x faster than if I coded it "by hand", but the architecture is worse and somewhat alien.
I'm still keeping at it, because I'm convinced that LLM driven code is where things are headed, inevitably. These tools are just crazy powerful, but we will have to learn how to use them in a way that does not create a huge mess.
Currently I'm repeatedly prompting it to improve the architecture this way or that way, with mixed results. Maybe better prompt engineering is the answer? Writing down the architecture and guidelines more explicitly?
Imagine how the whole experience will be if the latency was 1/10th of what it is right now and the tools are 10x better.
Just like you're mentioning "maybe better prompt engineering", I feel like we're being conditioned to think "I'm just not using it right" where maybe the tool is just not that good yet.
This way you're doing the big picture thinking while having the LLM do what's it's good at, generating code within the limits of its context window and ability to reason about larger software design.
I mostly treat the LLM as an overly eager to please junior engineer that types very quickly, who can read the documentation really quickly, but also tends to write too much code and implement features that weren't asked for.
One of the good things is that the code that's generated is so low effort to generate that you can afford to throw away large chunks of it and regenerate it. With LLM assistance, I wrote some code to process a dataset, and when it was too screwy, I just deleted all of it and rewrote it a few times using different approaches until I got something that worked and was performant enough. If I had to type all of that I would've been disappointed at having to start over, and probably more hesitant to do so even if it's the right thing to do.
Yes, very explicit like “if (condition) do (action)” and get more explicit when… oh wait!
It’s an iterative process, not a linear one. And the only huge commits are the scaffolding and the refactorings. It’s more like sculpture than 3D printing: a perpetual refinement of the code instead of adding huge amounts of code at once.
This is the reason I switched to Vim, then Emacs. They allow for fast navigation, and faster editing. And so easy to add your own tool as the code is a repetitive structure. The rare cases I needed to add 10s of lines of code is with a code generator, or copy-pasting from some other file.
I gave them a fair shake. However, I do not like them for many reasons. Code quality is one major reason. I have found that after around a month of being forced to use them I felt my skill atrophy at an accelerated rate. It became like a drug where instead of thinking through the solution and coming up with something parsimonious I would just go to the LLM and offload all my thinking. For simple things it worked okay but it’s very easy to get stuck in a loop. I don’t feel any more productive but at my company they’ve used it as justification to increase sprint load significantly.
There has been almost a religious quality associated to LLMs. This seems especially true among the worst quality developers and the non-technical morons at the top. There are significant security concerns that extend beyond simple bad code.
To me we have all the indicators of the maximum of the hype cycle. Go visit LinkedIn for confirmation. Unless the big AI companies begin to build nuclear power it will eventually become too expensive and unprofitable to run these models. They will continue to exist as turbo autocomplete but no further. The transformer model has fundamental limitations, and much like neural networks in the 80s it’ll become more niche and die everywhere else. Like its cousins WYSIWYG and NoCode, in 30 more years it’ll rise again like a phoenix to bring “unemployment” to developers once more. It will be interesting to see who among us was swimming without clothes when the water goes out.
I've started a "no Copilot Fridays" rule for myself at $DAYJOB to avoid this specifically happening.
use-it-or-lose-it is the cognitive rule.
I got more success than I hoped for, but I had to adjust my usage to be effective.
First of all, treat the LLM as a less experienced programmer. Don't trust it blindly but always make code review of its changes. This gives several benefits.
1) It keeps you in touch with the code base, so when need arise you can delve into it without too much trouble
2) You catch errors (sometimes huge ones) right away, and you can have them fixed easily
3) You can catch errors in your specification right away. Sometimes I forget some detail and only realize it when reviewing, or maybe the LLM did actually handle it and I can just tell it to update the documentation
4) You can adjust little by little the guidelines for the LLM, so that it won't repeat the same "mistakes" (wrong technical decisions) again.
In time you get a feeling of what it can and cannot do, where you need to be specific and where you know it will get it right, or where you don't need to go into detail. The time required will be higher than vibe coding, but decreases over time and still better than doing by myself.
There is another important benefit for me in using an LLM. I don't only write code; I do many other things: calls, writing documentation, discussing requirements, etc. Going back to writing code requires a change of mental state and recalling all the required knowledge (like how the project is structured, how to use certain APIs, etc.). If I can do two hours of coding, that's fine, but if the change is small, that context switch becomes the part where I spend the majority of the time and mental energy.
Or I can ask the LLM to make the changes and review them. Seeing the code already done requires less energy and helps me recall the relevant details.
This reminds me of the days of Dreamweaver and the like. Everybody loved how quickly they could drag and drop UI components on a canvas, and the tool generated the HTML for them. It was great at the beginning, but when something didn't work correctly, you spent hours looking at the spaghetti HTML the tool had generated.
At least, back then, Dreamweaver used deterministic logic to generate the code. Now, you have AI with the capability to hallucinate...
The argument is that the precision allowed by formal languages for programming, math etc were the key enabler for all of the progress made in information processing.
ie, Vibe-coding with LLMs will make coding into a black-art known only to the shamans who can prompt well.
[1] https://www.cs.utexas.edu/~EWD/transcriptions/EWD06xx/EWD667...
I don't use LLMs for my main pieces of work exactly due to the issues described by the author of the blogpost.
This is the best way to get Gemini to be a really good assistant, unless you want to add System Instructions which precisely describe how it should behave.
Because if you just say it should solve some problem for you, it eagerly will generate a lot of code for you, or add a lot of code to the clean code which you provided.
It returned over 600 lines of code across 3 code blocks, almost all of them commented out for some reason, each with an accompanying essay, and each stuffed with hallucinated and unnecessary helper functions. Apparently Gemini Pro struggles to wrap its weights around recursion more than I do. I just wrote it myself and only needed 26 lines. It's not using tail calls, but hey, my target platform still doesn't support tail call optimization in 2025 anyway.
You can't abdicate your responsibility as a builder to the LLM. You are still responsible for the architecture, for the integrity, for the quality. In the same way you wouldn't abdicate your responsibility if you hired a more junior engineer.
And I feel more productive. I recommend that everyone gives it a try.
As a tenured SW developer in my company my measurements for success are much more than "how much code can I spit out". There are mentoring, refactoring, code readability/mantainability, and quality that are important to my job. I found that LLM generated code was not hitting the necessary bars for me in these areas (agent or code autocompletion) and so I have stepped back from them. The readability point is extra important to me. Having maintained million lines of code products, I have found that readability is more important than writing a ton of code: and LLMs just don't hit the bar here.
When I am playing with side projects that I don't have the same bar on, sure Ill have bolt or lovable generate me some code in combination with cursor or windsurf, but these are low stakes and in some ways I just want to get something on paper.
He lost me here. Sounds like he tried to change from a boring stack he understood, to Go and Clickhouse because it's cooler.
You're asking for trouble. LLMs are great and getting better, but you can't expect them to handle something like this right now
> I have no concept of Go or Clickhouse best practices.
> One morning, I decide to actually inspect closely what’s all this code that Cursor has been writing. It’s not like I was blindly prompting without looking at the end result, but I was optimizing for speed and I hadn’t actually sat down just to review the code.
> I’m defaulting to coding the first draft of that function on my own.
I feel like he’s learnt the wrong lesson from this. There is a vast gulf between letting an LLM loose without oversight in a language you don’t know and starting from scratch yourself. There’s absolutely nothing wrong with having AI do the first draft. But it actually has to be a first draft, not something you blindly commit without review.
> “Vibe coding”, or whatever “coding with AI without knowing how to code” is called, is as of today a recipe for disaster, if you’re building anything that’s not a quick prototype.
But that’s what vibe coding is. It’s explicitly about quick throwaway prototypes. If you care about the code, you are not vibe coding.
> There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. […] It's not too bad for throwaway weekend projects, but still quite amusing. I'm building a project or webapp, but it's not really coding - I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works.
— https://x.com/karpathy/status/1886192184808149383
He’s basically saying that vibe coding is a disaster after his experience doing something that is not vibe coding.
You can’t pick up vibe coding and then complain that it’s behaving as described or that it isn’t giving you something that wasn’t promised.
in which Windsurf is forging ahead with an agentic LLM product that endeavors to not only replace software engineers but actually take over the entire software engineering PROCESS.
We're at a very interesting point, where investors and corporate interests are crystal clear in their intent to use LLMs to replace as many expensive humans as possible, while the technology available to do so is not there yet. And depending on your own perspective, it's not clear it ever will be, or perhaps it'll eventually be "good enough" for them to downsize us anyway.
I keep thinking about compilers. The old timers had been writing assembly by hand for years. The early C compilers were producing defective garbage output that was so incredibly poor that it was faster to keep coding by hand. Then the compilers got better. And better, and better, and now pretty much nobody inspects the output, let alone questions it.
2. The ratio increases and programmer layoffs begin
3. Bugs appear, AI handles most but not all
4. Bug complexity increases, company hires programmers to fix
5. Programmers can't decipher AI code mess
6. Programmers quit
7. Company unable to maintain their product
> the horror ensues
Post-LLM era, you can use a shortcut to get a representation of what something could be. However I see a lot of folks shipping that representation straight to prod. By comparison, the mental models created are weaker and have less integrity. You might be able to feel that something is off, but you lack the faculties to express or explain why.
Even if the idea that an LLM will help you do it is false, perhaps it is still a good idea if it convinces the experienced programmer to go ahead and use SQL for the query, Go for the async, Javascript for the frontend, etc. Right now, few if any companies would let you use the best tool for the job, if that's not one they already use. Perhaps the best use of LLMs is to convince programmers, and their bosses, to use the best tool for each job.
But, after you've gotten past that part, you will probably (like the author of this article) need to throw away the LLM-generated code and write it yourself.
I never write python in my day to day but every from-scratch project I've knocked out with Claude code has been in python because that's what it seems to default to if I don't specify anything
I wonder if this will also mean that new languages (or even algorithms or code patterns) are harder to get adopted, because the mass of existing code (that LLMs learned from) exerts a gravitational force pulling things back down to the status quo.
I have basically given up on "in IDE" AI for now. I simply have a web interface on my 2nd monitor of whatever the "best" LLM (currently Gemini, was Claude) is and copy paste snippets of code back and forth or ask questions.
This way I have to physically copy and paste everything back - or just redo it "my way", which seems to be enough of an overhead that I have to mentally understand it. When it's in an IDE and I just have to tick accept I just end up getting over eager with it, over accepting things and then wishing I hadn't, and spend more time reverting stuff when the penny drops later this is actually a bad way to do things.
It's really more a UX/psychology problem at the moment. I don't think the 'git diff' view of suggested changes is the right one that many IDEs use for it - I'm psychologically conditioned to review these like pull requests and it seems my critique of these is just not aligned to critiquing LLM code. 99% of PR reviews I do are finding edge cases and clarifying things. They're not looking at very plausible yet subtly completely wrong code (in the most part).
To give a more concrete example; if someone is doing something incredibly wrong in 'human PRs' they will tend to name the variables wrong because they clearly don't understand the concept, at which point the red flag goes off in my head.
In LLM PRs the variables are named often perfectly - but just don't do what they say they will. This means my 'red flag' doesn't fire as quickly.
Luckily in this particular case, being able to parallel park unassisted isn’t all that critical in the overall scheme of things, and as soon as I turned that feature off, my parking skills came back pretty quickly.
But the lesson stuck with me, and when LLMs became capable of generating some degree of functioning code, I resolved not to use them for that purpose. I’m not trying to behave like a stodgy old-timer who increasingly resists new tech out of discomfort or unfamiliarity—it’s because I don’t want to lose that skill myself. I use LLMs for plenty of other things like teaching myself interesting new fields of mathematics or as a sounding board for ideas, but for anything where it would (ostensibly) replace some aspect of my critical thinking, I try to avoid them for those purposes.
Developing without knowing what the LLM writes is dangerous. For me, having an LLM as a tool is like having a few junior developers around that I can advise and work with. If I have complicated logic that I need to write, I write it. After I've written it, I can ask the LLM to review it; it's often good at finding corner cases in some places. When I need to "move things from one bucket to another", like call an API and save to the DB, that is a perfect task for the LLM that I can easily review after.
At the same time, LLM is able to write pretty good complicated logic as well for my side projects. I might need to give it a few hints, but the results are amazing.
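The "bucket-moving" I mean is roughly this shape (the endpoint and table are made up for illustration): exactly the kind of glue I'll let the LLM draft and then review line by line.

    import sqlite3

    import requests


    def sync_users(api_url: str, db_path: str) -> int:
        """Fetch users from a (hypothetical) API and upsert them into SQLite."""
        resp = requests.get(f"{api_url}/users", timeout=10)
        resp.raise_for_status()
        users = resp.json()  # assumed shape: a list of {"id": ..., "name": ...} dicts

        conn = sqlite3.connect(db_path)
        with conn:  # wraps the inserts in a single transaction
            conn.execute(
                "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)"
            )
            conn.executemany(
                "INSERT OR REPLACE INTO users (id, name) VALUES (:id, :name)",
                users,
            )
        conn.close()
        return len(users)

Nothing here is clever, which is precisely why it's safe to delegate: the review takes a minute, and any mistake is obvious.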
Eric Schmidt gave a TED interview about the subject this week. He predicts the US and China bombing each other's data centers.
When I do use the agent, I inspect its output ruthlessly. The idea that pages of code can be written before being inspected is horrifying to me.
Without access to the code, it's challenging to verify the authors' claims and form an independent opinion. In my view, we should be cautious about trusting articles that lack examples or sources. As someone wiser than me once said:
> Articles without sources are merely the opinions of the author.
For me, the magic of LLMs is that I already get an hour of coding via 30 minutes of prompting and finetuning the generated code. And I know that the ratio will constantly improve as LLMs become better and as I finetune my prompts to define the code style I prefer. I have been coding for pretty much all my life and I never felt more excited about it than I do now.
It would be cool if people shared their prompts and the resulting commits. If someone here is disappointed by the commits LLMs make, I would love to see the prompt, the commit, and which model made it.
So I kind of do a top-down design and use the LLM to help me with toil or with unfamiliar things that would require me finding the right documentation, code examples, bug reports, etc, etc... LLMs with web search are great for this kind of toil elimination.
LLMs are useful tools, but they have no notion of hierarchy, of causal relationships, or any other relationship, actually. Any time they seem to have those capabilities in the code they generate, it is merely a coincidence, a very probable coincidence, but still not intentional.
Think about when chat gpt gives you the side-by-side answers and asks you to rate which is "better".
Now consider the consequence of this at scale with different humans with different needs all weighing in on what "better" looks like.
This is probably why LLM generated code tends to have excessive comments. Those comments would probably get it a higher rating but you as a developer may not want that. It also hints at why there's inconsistency in coding styles.
In my opinion, the most important skill for developers today is not in writing the code but in being able to critically evaluate it.
if i dont do that it always seems to throw out 3 fresh files ill need to add to make their crazy implementation work.
ive pretty much swapped to using it just for asking for minor syntax stuff i forget. ill take my slower progress in favor of fully grasping everything ive made.
i have one utility that was largely helped by claude in my current project. it drives me nuts, it works but im so terrified of it and its so daunting to change now.
Once the code base structure and opinions are in place, I think LLMs are decent at writing code that are bounded to a single concern and not cross cutting.
LLM generated code bases work initially but so will code written by college kids for an initial set of requirements. Even a staff+ level engineer will struggle in contributing to a code base that is a mess. Things will break randomly. Don’t see how LLMs are any different.
Something I do very much love about LLMs is that I can feed it my server logs and get back an analysis of the latest intrusion attempts, etc. That has taught me so much on its own.
It's hard to say without specifics, but simply upgrading from MySQL to PostgreSQL without rewriting the PHP codebase in Go might resolve most of the potential issues.
1) I do a lot of scraping, and Go concurrency + Colly has better performance
2) My DB size is exploding and I have limited budget, and it looks like CH is so much better at compressing data. I recently did a test and for the same table with same exact data, MySQL was using 11GB, ClickHouse 500MB
That's pretty typical best case size for weblogs and other time ordered data where column data correlate with time values. You do have to tweak the schema a bit to get there. (Specifically a "good" sort order, codecs, ZSTD instead of LZ4 compression, etc.)
That's pretty impressive
I think that the solution is to somehow move towards an "intent" layer that sits above the code.
I still use Cursor, but I use AI judiciously so that I don't wreck the project and it only acts as an aid.
I don't think LLMs are at blame here, it is a tool and it can be used poorly. However, I do wonder what's the long term effects on someone who uses them to work on things they are knowledgeable about. Unfortunately this is not explored in the article.
https://taoofmac.com/space/blog/2025/05/13/2230
There, there's your problem. The problem is not LLMs, the problem is people not using their brain. Can't we use LLMs and our brains as well? Both are amazing tools!
If I want to learn something new I won’t vibe code it. And if I vibe code I’ll go with tech I have at least some familiarity with so that I can fix the inevitable issues
In the past the quality mattered because maintenance and tech-debt was something that we had to spend time and resources to resolve and it would ultimately slow us down as a result.
But if we have LLMs do we even have "resources" any more? Should we even care if the quality is bad if it is only ever LLMs that touch the code? So long as it works, who cares?
I've heard this positioned in two different ways, from two different directions, but I think they both work as analogies to bring this home:
- do engineers care what machine code a compiler generates, so long as it works? (no, or at least very, very, very rarely does a human look at the machine code output)
- does a CEO care what code their engineers generate, so long as it works? (no)
Its a very very interesting inflection point.
The knee jerk reaction is "yes we care! of course we care about code quality!" but my intuition is that caring about code quality is based on the assumption that bad code = more human engineer time later (bugs, maintenance, refactoring etc).
If we can use an LLM to effectively get an unlimited number of engineering resources whenever we need them, does code quality matter provided it works? Instead of a team of, say, 5 engineers and having to pick what to prioritise, you can just click a button and get the equivalent of 500 engineers working on your feature for 15 minutes, churning out what you need, and it works and everyone is happy. Should we care about the quality of the code?
We're not there yet - I think the models we have today kinda work for smaller tasks but are still limited with fairly small context windows even for Gemini (I think we'll need at least a 20x-50x increase in context before any meaningfully complex code can be handled, not just ToDo or CRUD etc), but we'll get there one day (and probably sooner than we think)
At this point in history they aren't good enough to just vibe code complex projects as the author figured out in practice.
They can be very useful for most tasks, even niche ones, but you can't trust them completely.
And why did they make this change? Because it makes users spend more time on the platform. ChatGPT wasn't just one of the fastest-growing web services ever, it's now also speedrunning enshittification.
An LLM is a tool. I think we still need to learn how to use it, and in different areas it can do either more or less for you. Personally I don't use it for most everyday coding, but if I have something tedious to write, the LLM is the first place I go for a code draft. That works for me and for the LLM I use.
So someone could come along and say "well don't do that then, duh". But actually a lot of people are doing this, and many of them have far fewer fucks to give than the author and I, and I don't want to inherit their shitty slop in the future.
My job has changed from writing code to code reviewing a psychopathic baby.
Of course the real question is, is there any reason to be good at coding and writing, if LLMs can just do it instead? Of course it’s hard to sell that we should know arithmetic when calculators are ubiquitous.
Personally, I value being skilled, even if everyone is telling me my skills will be obsolete. I simply think it is inherently good to be a skilled human
I have been building a gallery for my site with custom zoom/panning UX to be exactly how I want it (which by the way does not already exist as a library). Figuring out the equations for it is simply impossible for an LLM to do.
I wouldn't be surprised if after the LLM hype we go back to site/app builders being the entry level option.
Yes, imo, too many people believe current LLMs are capable of doing this well, They aren't. Perhaps soon! But not today, so you shouldn't try to use LLMs to do this for serious projects. Writing that MDN file sounds like a wonderful way to get your own head around the infrastructure and chat with other devs / mentors about it though, that's a great exercise, I'm going to steal it.
Anyway, LLMs are, as we all know, really good text predictors. Asking a text predictor to predict too much will be stretching the predictions too thin: give it one file, one function, and very specific instructions, and an LLM is a great way to increase your productivity and reducing mental overload on menial tasks (e.g., swap out all uses of "Button" in @ExperimentModal.tsx with @components/CustomButton.tsx). I think LLM usage in this sense is basically necessary to stay competitively productive unless you're already a super-genius in which case you probably don't read HN comments anyway so who cares. For the rest of us mortals, I argue that getting good at using LLM co-pilots is as important as learning the key bindings in your IDE and OS of choice. Peel away another layer of mental friction between you and accomplishing your tasks!
> Are they throttling the GPUs? Are these tools just impossible to control? What the fuck is going on?
Money and dreams. As everyone knows there’s an obscene amount of money invested in these tools of course. The capitalist class is optimistic their money can finally do the work for them directly instead of having to hire workers.
But more than that, AI is something that’s been alluring to humans forever. I’m not talking about cyberpunk fantasies I’m talking about The Mechanical Turk, Automata in the Middle Ages, Talos[0]. The desire to create an artificial mind is, if not hardwired into us, at least culturally a strong driver for many. We’re at a point where the test of the computer age for determining if we built AI was so utterly destroyed it’s unclear how to go about judging what comes next.
The hype is understandable if you view if you step back and view it through that lens. Maybe we are at an inflection point and just a bit more scaling will bring us the singularity. Maybe we’ve seen a burst of progress that’ll mostly stall from a consumer perspective for another 5-10 years, like all of the major AI moments before.
If you want to use them effectively it’s the same as any tool. Understand what they are good at and where they flounder. Don’t give up your craft, use them to elevate it.
[0]: https://news.stanford.edu/stories/2019/02/ancient-myths-reve...
Not quite a source but it’s a fun read from 2019 about this.
Unfortunately, CEO/manager types cannot distinguish between bad and good engineers
This represents a huge security threat too if code is uncritically applied to code bases. We've seen many examples where people try and influence LLM output (eg [1][2]). These attempts have generally varied from crude to laughably bad but they'll get better.
Is it so hard to imagine prompt injection being a serious security threat to a code base?
That aside, I just don't understand being "all in" on LLMs for coding. Is this really productive? How much boilerplate do you actually write? With good knowledge of a language or framework, that tends to be stuff you can spit out really quickly anyway.
[1]: https://www.theguardian.com/technology/2025/may/16/elon-musk...
[2]: https://www.theguardian.com/technology/2025/jan/28/we-tried-...
Use the tool when it makes sense or when someone shows you how to use it more effectively. This is exactly like the calculator "ruining people's ability to do arithmetic" when the vast majority of the population had been innumerate for hundreds of thousands of years up until the Industrial Revolution, where suddenly dead white European nobility are cool.
There is nothing fun or interesting about long division, and the same goes for a lot of software development.
If LLMs don't work for your usecase (yet) then of course you have to stick with the old method, but the "I could have written this script myself, I can feel my brain getting slower" spiel is dreadfully boring.
comparing it to no longer doing the long division portion of a math problem isnt a great 1 to 1 here. long division would be a great metaphor if the user is TRULY only using llms for auto complete of tasks that add 0 complexity to the overall project. if you use it to implement something and dont fully grasp it, you are just creating a weird gap in your overall understanding of the code base.
maybe we are in full agreement and the brunt of your argument is just that if it doesnt fit ur current usecase then dont use it.
i dont think i agree with the conclusion of the article that it is making the non coding population dumber, but i also AGREE that we should not create these gaps in knowledge within our own codebase by just trusting ai, its certainly NOT a calculator and is wrong a lot and regardless if it IS right, that gap is a gap for the coder, and thats an issue.
That aside, I wonder if the OP would have had all the same issues with Grok? In several uses I've seen it surprisingly outperform the competition.