Dev jobs are about to get a hard reset and nobody's ready

6/22/2025, 11:39:18 PM old.reddit.com ↗

Comments (75)

furyofantares · 6h ago
I'm doing a reasonable size project with Claude Code doing almost all of the programming, and it's quite challenging.

Vibe coding is easy and fast, but you end up not being an expert in the code base or really having any idea about it. And once it reaches a certain size, the LLM isn't an expert on it either. It is only an expert on cleanly isolated sections of it, which, by the way, it's really bad at producing without lots of good guidance.

Bespoke coding is easy and slow, and you end up an expert in the systems you make.

What I've found is once the system is beyond the size that an LLM can reasonably handle, it's faster to be an expert than to try to get the LLM to do things; in some cases infinitely faster (the LLM can't do it.)

Maybe you get a system of that size done in a week instead of 3 months with vibe coding, and this applies to subsystems as well if they're isolated. But for now you really still want someone to end up an expert in all the code that was produced.

So for now I think there's a big skill point that is still only achieved by humans -- guiding the LLM to produce systems that both the human and the LLM will be good at long term. That is, NOT vibe coding from the beginning, but doing something a bit slower than vibe coding and much faster than bespoke coding, with an eye toward clean, testable, isolated systems - something the LLM is not naturally any good at making but can be done with good guidance.

flashgordon · 2h ago
I spent the last two months "vibe coding". I really think vibe coding as it is defined (keep smashing the accept button and let the LLM eventually "get there") is terrible. My flow has been to use Claude Code for incredibly effective code gen where:

1. I know exactly what I want architecturally

2. I know how I want it

In this mode the flow is then about frequently validating the code to make sure it matches my "image", rather than me dreading not knowing what Rube Goldberg machine it generated after tens or thousands of lines.

Sometimes I even let it get "close" (i.e. it took care of all the nitty-gritty), then I take over and finish the interesting bits and tell CC what I did and why, so it can update the project memory. Frequent checkpointing and sprinkling .md files with the latest understanding is very important (it also has the advantage of making your codebase LLM-portable).

I think the biggest irony is that PMs, VPs, and CEOs have traditionally been pretty derisive of "clean code", and yet clean code is absolutely essential to make vibe coding work. Feels like a huge vindication.

furyofantares · 2h ago
We're on the same page.

And you have to be vigilant, too. You can spend a day or two in full vibe code mode if you really want to ship a bunch of features fast, and they'll all work and it will feel amazing, all while it's secretly shitting all over your codebase, and you won't know it until it's too late. And not in an "oh now you just have to fix it up" way - if you go too long it may be just about as difficult to fix as it would have been to write.

bluedino · 6h ago
> Bespoke coding is easy and slow, and you end up an expert in the systems you make.

What good is that when the code gets written by a fresher from Accenture, or someone who ends up leaving the company? (Average job length for a software dev is just over two years.)

zeroCalories · 5h ago
Maybe job mobility goes down as expertise is valued over job hopping?
cwsx · 10m ago
I wrote a comment in a similar thread a few weeks ago describing my LLM-coding experience - here's a copy+paste (so any quote replies will be out of context / not actually replying to your comment):

    I'll preface this comment with: I am a recent startup owner (so I'm the only dev, which is important) and my entire codebase has been generated via Sonnet (mostly 3.7, now using 4.0). If you actually looked at the work I'm (personally) producing, I guess I'm more of a product-owner/project-manager, as I'm really just overseeing the development.

    > I have yet to see an LLM-generated app not collapse under its own weight after enough iterations/prompts.

    There are a few crucial steps to make an LLM-generated app maintainable (by the LLM):

    - _have a very, very strong SWE background_; ideally as a "strong" Lead Dev, _this is critical_

    - your entire workflow NEEDS to be centered around LLM-development (or even model-specific):

      - use MCPs wherever possible and make sure they're specifically configured for your project

      - don't write "human" documentation; use rule + reusable prompt files

      - you MUST do this in a *very* granular but specialized way; keep rules/prompts very small (like you would when creating tickets)

      - make sure rules are conditionally applied (using globs); do not auto include anything except your "system rules"

      - use the LLM to generate said prompts and rules; this forces consistency across prompts, very important

      - follow a typical agile workflow (creating epics, tickets, backlogs etc)

      - TESTS TESTS AND MORE TESTS; add automated tools (like linters) EVERYWHERE you can

      - keep your code VERY modular so the LLM can keep a focused context, rules should provide all key context (like the broader architecture); the goal is for your LLM to only need to read or interact with files related to the strict 'current task' scope

      - iterating on code is almost always more difficult than writing it from scratch: provided your code is well architected, no single rewrite should be larger than a regular ticket (if the ticket is too large then it needs to be split up)

    This is off the top of my head so it's pretty broad/messy but I can expand on my points.

    LLM-coding requires a complete overhaul of your workflow so it is tailored specifically to an LLM, not a human, but this is also a massive learning curve (that takes a lot of time to figure out and optimize). Would I bother doing this if I were still working on a team? Probably not; I don't think it would've saved me much time in a "regular" codebase. As a single developer at a startup? This is the only way I've been able to get "other startup-y" work done while also progressing the codebase - the value is in being able to do multiple things at a time: let the LLM work, intermittently review its output, and get on with other things in between.

    The biggest tip I can give: LLMs struggle at "coding like a human" and are much better at "bad-practice" workflows (e.g. throwing away large parts of code in favour of a total rewrite) - let the LLM lead the development process, with the rules/prompts as guardrails, and try to stay out of its way while it works (instead of saying "hey X thing didn't work, go fix that now") - hold its hand but let it experiment before jumping in.
threatofrain · 6h ago
I've seen a different kind of usage pattern recently, which is to find a problem that the LLM is good at, something which is very local in reasoning, and do it at big scale across the whole codebase.
b33j0r · 6h ago
Could you give an example? I, like, almost know what you mean, but not quite.
spike021 · 5h ago
I've used it to add test coverage. Granted it wasn't for new major features but small features that still necessitated having tests.

IME so far that's what it's best at. As long as there are tests in the codebase that provide enough of a basic skeleton (i.e. if they already test part of a controller or code that interacts with a DB) to guess what's needed, then it can do a decent job. It's still not perfect though, and when it comes to inventing bespoke tests that are uniquely different from other tests it needs much more of a helping hand.
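
As a rough illustration of what I mean by a skeleton - everything here (UserController, makeTestDb, the vitest setup) is made up for the example, not from a real codebase:

    // If a test like this already exists for one endpoint, the LLM mostly has
    // to pattern-match it onto the next endpoint rather than invent structure.
    import { describe, it, expect } from "vitest";
    import { makeTestDb } from "./helpers";        // hypothetical in-memory DB helper
    import { UserController } from "../src/users"; // hypothetical controller under test

    describe("UserController.getById", () => {
      it("returns the stored user", async () => {
        const db = await makeTestDb({ users: [{ id: 1, name: "Ada" }] });
        const controller = new UserController(db);
        const user = await controller.getById(1);
        expect(user?.name).toBe("Ada");
      });

      it("returns null for a missing id", async () => {
        const db = await makeTestDb({ users: [] });
        const controller = new UserController(db);
        expect(await controller.getById(42)).toBeNull();
      });
    });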

threatofrain · 1h ago
A big Bay Area data company recently refactored their old JS codebase from var to let and const. It was done at a scale that is way beyond "senior engineer double-checks LLM to see if code is good."
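
That kind of codemod is about as mechanical as it gets. A minimal before/after sketch, invented for illustration (not from that company's actual code):

    // Before: legacy function-scoped `var` declarations.
    function joinBefore(items: string[]): string {
      var sep = ", ";
      var out = "";
      for (var i = 0; i < items.length; i++) {
        out += (i > 0 ? sep : "") + items[i];
      }
      return out;
    }

    // After the mechanical rewrite: `sep` is never reassigned so it becomes
    // `const`; `out` and `i` are reassigned so they become `let`.
    function joinAfter(items: string[]): string {
      const sep = ", ";
      let out = "";
      for (let i = 0; i < items.length; i++) {
        out += (i > 0 ? sep : "") + items[i];
      }
      return out;
    }
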
conradev · 5h ago
Updating to best API practices. i.e. imagine your linter could take new rules as English
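
For example, an English rule like "use .includes() instead of comparing .indexOf() to -1" maps to a purely mechanical edit; the snippet below is invented for illustration:

    // English rule: "use .includes() instead of comparing .indexOf() to -1".
    function hasAdminBefore(roles: string[]): boolean {
      return roles.indexOf("admin") !== -1; // old idiom
    }

    function hasAdminAfter(roles: string[]): boolean {
      return roles.includes("admin"); // modernized, behavior unchanged
    }
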
ljlolel · 6h ago
refactoring
furyofantares · 5h ago
LLMs are terrible at that in my experience. In what world is refactoring "very local in reasoning"?

Switching libraries/frameworks or switching piecemeal to a new language for a codebase that's already well structured seems like it would be noticeably less costly though.

emeraldd · 5h ago
In a situation where the transformations are explicit and can almost be mechanical with little to no involved reasoning. With those requirements, the LLM just has to recognize the patterns and "translate" them to the new pattern. Which is almost exactly what they were designed to do.
furyofantares · 5h ago
Agree with that but it doesn't really fall under the umbrella that I'd use the term "refactoring" for.

Replacing something with something else isn't a refactor to me, a refactor implies a structural change not a wide surface level change.

chc4 · 5h ago
See, using AI as the equivalent of super-IDE snippets or to generate things in isolation is probably really good! It's also categorically not the same thing as what all of the AI hypemen (including the OP) are describing, where it replaces wide swathes of software developers. It devolves into a motte-and-bailey argument: it's entirely possible for AI to be a useful tool in a programmer's toolbox and make people more productive in isolated ways, without that implying agreement with frankly anything the OP thread is saying.
zeroCalories · 5h ago
I have had a lot of success building small isolated features / projects, but have found Claude to be frustratingly inadequate for any non-trivial work. Ran it on an old and complex C++ codebase, and spent an hour unsuccessfully trying to coax it into fixing a bug. I even helped it by writing a failing test for it by hand. The tools need to improve a ton before software developers can forget how to code.
gwynforthewyn · 6h ago
It's only a feeling, but I'd swear I've seen variations on this post across half a dozen software-adjacent subreddits every day for the last month. The common denominator has always been "Paying $200 for Claude Max is a steal" with absolutely no evidence of what the author did with it.

I honestly think we’re being played.

Bjorkbat · 6h ago
Yeah, I was about to say, it sounds a lot like this guy is just riding an intense high from getting Claude to build some side-project he's been putting off, which I feel is like 90% of all cases where someone writes a post like this.

But then I never really hear any update on whether the high is still there or if it's tapered off and now they're hitting reality.

gwynforthewyn · 5h ago
For sure.

Fwiw, I use Claude Pro in my own side project. When it’s great, I think it’s a miracle. But when it hits a problem it’s a moron.

Recently I was fascinated to see it (a) integrate swagger into a golang project in two minutes with full docs added for my api endpoints, and then (b) spend 90 minutes unable to figure out that it couldn’t align a circle to the edge of a canvas because it was moving the circle in increments of 20px and the canvas was 150px wide.
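
The arithmetic behind (b) is easy to sketch. The 150px canvas and 20px step are from above; the radius and variable names are hypothetical, chosen only to illustrate the mismatch:

    // Why a fixed 20px step never leaves the circle flush with the right edge
    // of a 150px canvas (radius is a made-up example value).
    const canvasWidth = 150;
    const step = 20;
    const radius = 25;
    const flushX = canvasWidth - radius; // center x when the circle touches the edge: 125

    const reachable: number[] = [];
    for (let x = 0; x + radius <= canvasWidth; x += step) {
      reachable.push(x); // center positions the loop can actually produce
    }
    console.log(reachable);                  // [0, 20, 40, 60, 80, 100, 120]
    console.log(reachable.includes(flushX)); // false - 125 is never reached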

Where it’s good it’s a very good tool, where it’s bad it’s very bad indeed.

furyofantares · 2h ago
I don't think you're getting played.

I think it's legitimately possible to feel like you've gotten something done in a week that used to take 3 months, without realizing that you haven't actually done that.

You might have all the features that would have taken 3 months, but you personally have zero understanding of the code produced. And that code is horrible. The LLM won't be able to take it further, and you won't either.

I think we're seeing people on a high before they've come to understand what they have.

dbalatero · 6h ago
Watch: it's just Dario Amodei's marketing reddit account.
tomlockwood · 6h ago
Agreed. Where's the code?
giantrobot · 5h ago
Where's the 10x, 20x, or whatever increases in profit from all the AI "productivity"? Typing is not the challenging aspect of writing code. Writing boilerplate faster isn't the super power that a lot of non-technical people seem to think it is.
jkmcf · 4h ago
Are we returning to ideas having importance now that PoCs are cheaper and easier? Any dev on HN can create a Facebook competitor, but getting the traffic to shift will require some magical thinking.
giantrobot · 4h ago
A PoC has never really been a problem that needed solving. The hard part is going from that to a product that's actually fit for purpose. AWS will happily drain your bank account because you're throwing tons of resources at a poor implementation. Hackers will also happily exploit trivial security vulnerabilities a vibe coder had no ability to identify, let alone fix.

This is not the first time the industry has been down this road. They're not in themselves bad tools but their hype engenders a lot of overconfidence in non-technical users.

XorNot · 5h ago
"it handles the boilerplate!" Has always been the weirdest argument about most things. Like, sure...so does a library. That's why we have libraries.

(Or at the extreme end, this is what something like C++ templates were for).

giantrobot · 3h ago
Libraries and frameworks remove some boilerplate but there's still tons of it. It's rare a library exposes a single doTheThingINeed() function that runs a business. Everyone needs boring but domain specific code.
tomlockwood · 5h ago
Yeah the current Vibe for me seems to be: Congratulations, you trained a giant machine that makes copying code from Stackoverflow marginally faster.
joshdavham · 5h ago
> I honestly think we’re being played.

Supposing this is true, who is playing us and why?

tomlockwood · 5h ago
Sam Altman and the money.
gyomu · 6h ago
Yeah maybe. Talk is cheap, show me the code.

> Last week, I did something I’ve put off for 10 years. Built a full production-grade desktop app in 1 week. Fully reviewed. Clean code. Launched builds on Launchpad. UI/UX and performance? Better than most market leaders. ONE. WEEK.

I really wonder why people who write these things never actually show these apps they vibe coded in a week that are “better than most market leaders”.

> We’ve hit the point where asking “Which programming language should I learn?” is almost irrelevant. The real skill now is system design, architecture, DevOps, cloud — the stuff that separated juniors from seniors. That’s what’ll matter.

My UML professor said very similar things 25 years ago. You’d just draw a UML diagram, and boom! All the code would be generated. The only thing I remember from this class is how terrible she was at coding.

rented_mule · 5h ago
> My UML professor said very similar things 25 years ago.

Yeah, it feels like a lot of these stories are the new version of the methodology / database / framework hype we've seen for so many years... with our new tool, you can write Reddit / Yelp / Wikipedia / S3 / etc. in a week. Sure, you can write a prototype that kind of replicates the surface level functionality. How about the actual hard part? Things like scaling, optimization of computing resources across large fleets, evolution of the system to attract more customers, fighting spam, increasingly complicated integrations with 3rd parties, maintenance, etc.

So much of this will turn out like all those things we've seen before. It will take skilled engineers doing a lot of thinking to figure out the right way to leverage the tools to make better systems for less.

lostmsu · 1h ago
I look at it from the perspective that Claude can already replace the majority of methodology zealots, because it is intellectually on par with them (in coding).
vrosas · 5h ago
> Fully reviewed

Reviewed by a bot (that wrote it no less) is not reviewed. This person is not a software engineer.

joshdavham · 5h ago
I’m always a little surprised when LinkedIn-style posts like this make it on Reddit.
georgemcbay · 5h ago
> Yeah maybe. Talk is cheap, show me the code.

Or even just link the app (which the poster is presumably eager to market, right?) as a bare minimum.

If LLMs for coding were half as productive at cranking out production-quality software unassisted as the astroturfed hype around them suggests, we should have seen a large, unmistakable wave of better, faster-released software by now just in the output of software companies, startups, etc., but beyond the hype blogs (or Reddit posts) I'm not seeing anything other than the status quo.

I'm not saying LLMs are worthless. They're pretty useful as advanced autocomplete and as documentation you can query in natural language and get reasonable results from (so long as you already have a solid enough base of knowledge to recognize when they are going off the rails). They can be very useful productivity speed-up tools, especially when you start working in a new realm.

But there's a long way from that to the "replaces 90% of all developers in writing 100% of their code" that is being sold in the hype.

zeroCalories · 5h ago
Claude Code has been out for a decent amount of time. How long until we have a completely vibe-coded web browser? Linux replacement? New programming language that solves all our previous problems? Serenity was able to make a decent amount of progress with a small team in a few years, so with a 10x productivity improvement surely we'll see fully polished and complete products in the next year. Surely we'll see FAANG become more productive. Surely we'll see new unicorns come to unseat stagnating giants.
chc4 · 6h ago
While everyone is just unilaterally asserting things, I'll jump in: no they won't. Practically every output of LLMs I've seen (even the "cutting edge" agentic ones that everyone says you have to evaluate or else your opinion doesn't count) has been poor quality with baffling bugs. These things do actually matter. Especially for anything that can be remotely described as "niche" or "research" they are astonishingly bad - maybe they're great for Go microservices or webdev, but there's a huge gap from there to "the entire software developer industry is doomed". Just calm down, man. You don't have to breathlessly praise AI, or start doomsaying, or say they're flat worthless; a bit of humility about how uncertain the future may turn out is a noble trait to have.
parpfish · 6h ago
I started trying to vibecode over the last couple weeks, and it sucks at 90% of the stuff I have to do at work. That last 10% is nice, but not worth the cost IMO.

However, because it does show promise on a small segment of my work, I could see there being an effect on the industry where a certain type of dev work gets wiped out. To the extent that there are companies paying people to "be a webmaster" or maintain a basic CRUD app, those teams could drastically downsize. Even if that's just 5% of the total dev population, that's still a lot of jobs and an entire type of work that just gets wiped out.

999900000999 · 6h ago
AI is like an intern with a LOC quota to hit.

These tools love to just write new code, they hate refactoring or maintaining code which is what this job actually is.

falcor84 · 6h ago
The first comment on the post referenced Kent Beck's line:

> "The value of 90% of my skills just dropped to $0. The leverage for the remaining 10% went up 1000x. I need to recalibrate" [0]

At the time that felt like an exaggeration, but from my own use of Claude Code over the last month, I now entirely agree. My take on this is that we need to educate future devs from the very start to think like engineering managers - to be concerned a lot more with the "what", the "why" and the "is it good for our expected needs" rather than the low-level "how".

[0] https://tidyfirst.substack.com/p/90-of-my-skills-are-now-wor...

zeroCalories · 4h ago
What have you accomplished with Claude Code? Anything you can share?
catlifeonmars · 5h ago
> We’ve hit the point where asking “Which programming language should I learn?” is almost irrelevant. The real skill now is system design, architecture, DevOps, cloud — the stuff that separated juniors from seniors. That’s what’ll matter.

I’ve never worked in a shop where knowing how to write code is enough. While I’m sure these places exist, from my first software engineering job ~13 years ago to today, distributed systems design and architecture has always been a table stakes skill and specialization in a particular language has been secondary.

I feel like I'm living in a parallel universe. Where are all the code-only jobs, anyway?

proc0 · 5h ago
> We’ve hit the point where asking “Which programming language should I learn?” is almost irrelevant. The real skill now is system design, architecture, DevOps, cloud — the stuff that separated juniors from seniors. That’s what’ll matter.

Those skills require knowing how code works. You can't leapfrog into being a senior dev, at least not in most cases. Languages have drastically different features, plus AI agents will not be perfect and you'll need to review the code and probably modify some of it. The more you know how to code, the easier this will be.

royal__ · 5h ago
Why is a post by some rando on Reddit considered quality news, worthy of reaching the front page of HN? This is just comment bait. The discussion is fine, but all I see is anecdotal conjecturing.
jen729w · 5h ago
You and I clicked on it, and the algorithm did the rest.
Havoc · 6h ago
>It’s already doing 100% of the coding.

Way to go killing credibility straight out of the gate. I could buy a high percentage for certain types of apps, but not 100%.

have-a-break · 5h ago
Can't even tell you how many times vibe code leaves obvious security bugs. At least it makes me feel like my job isn't going away anytime soon.
goalieca · 6h ago
Probably the biggest sign of the times right now is a massive infusion of capital into data centers and hardware, and much less so into programmers. It's hard to land a job right now if you're looking.
sieve · 5h ago
I like LLMs. I really do. But my experience with them is very different from the Chicken Little folks.

Let's park coding to the side for a bit.

Case 1:

I am collaborating with a friend to build a graded Sanskrit reader for beginners using Aesop's fables.

As a precursor, I asked Gemini 2.5 Pro if it had access to all the stories. Yes, it said. The three popular PD ones? Yes.

I asked it to print all three versions of a particular one, and it did. One of them was not the version it confidently claimed it was. We argued about it for a while. It shut up when I provided actual evidence.

I then decided to upload the three Gutenberg text files and asked it to use them as the source of truth to give me a list of unique stories putting variant plots, variant titles etc under the main heading. I gave it certain formatting requirements so that I could later verify if all 600-odd tales across the three books were properly accounted for.

Gemini tied itself into knots trying to do this. It could not guarantee that all the tales were present in the list it generated. It didn't know how to accomplish the task. Finally, I gave it a series of steps, an algorithm based on an n-branched tree. Only then did it manage to generate the list for me.

This took me four hours of wrangling across three different sessions.
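
To give a rough idea of the kind of n-branched tree I mean - main headings as branches, each book's variant titles/plots as leaves - here's a hedged sketch (the type names, fields, and normalization are invented, not the exact steps I gave Gemini):

    // Rough sketch of a two-level tree: canonical heading -> variants.
    interface Variant {
      source: string; // which Gutenberg file the tale came from
      title: string;  // title as printed in that edition
      plot?: string;  // optional note on a variant plot
    }

    interface StoryNode {
      canonicalTitle: string;
      variants: Variant[];
    }

    // Group all ~600 tales under canonical headings so the final list can be
    // checked for completeness against the three source files.
    function buildStoryTree(tales: Variant[]): Map<string, StoryNode> {
      const tree = new Map<string, StoryNode>();
      for (const tale of tales) {
        const key = tale.title.toLowerCase().replace(/[^a-z ]/g, "").trim();
        const node: StoryNode = tree.get(key) ?? { canonicalTitle: tale.title, variants: [] };
        node.variants.push(tale);
        tree.set(key, node);
      }
      return tree;
    }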

Case 2:

I have been buying TASCHEN editions of impressionists and other classical artists. I wanted Gemini to compare various editions and give me the pros and cons so that I could pick a good edition to buy. By the time we came to Michelangelo it went nuts, hallucinating editions, ISBN numbers, page counts, authoritative URLs, WorldCat searches ...

This took about two hours.

There are more such amusing anecdotes. Some from DeepSeek as well.

I have tried LLMs with Python, Typst, and a few other things. Sometimes they work, sometimes they don't. They definitely do not write code the way I want them to. They will use OOP even if I specifically warn them not to.

LLMs are VERY good at translation and languages. I will give them that. But reasoning? I am not convinced. I will believe that LLMs are good enough to replace programmers when the Amodei siblings can operate their company only using LLM developers.

__MatrixMan__ · 3h ago
> FUTURE GENERATION WILL HAVE HIGHER PRODUCTIVITY INGRAINED AS A EVOLUTIONARY TRAIT IN THEM

If we're lucky, future generations will not have to think about productivity at all. For so long we've been in this mode where more is better: Make the economy go brrr and good things happen by proxy.

If AI can handle the brrr, then maybe we can start being a bit more thoughtful about which direction our efforts are steering it in, about whose dreams we're making come true and whether they're any good.

If so, then the valuable skill is going to be the ability to chart a course to where people want to be, and to be believed when you say you can get us there from here. Letting AI handle most of the middle steps does not strike me as the path to that credibility--that's not a leader, that's just the hype guy.

geodel · 5h ago
I see it from many sides.

1. A non-web dev creating a personal website for my spouse because, well, I am the IT/tech guy.

2. At work, side/productivity projects where the manager asked for something to be done but not how it should be done.

3. At work, an enterprise project as my main role.

So on 1) AI/LLMs are amazing. As easy as Hugo is, maybe I couldn't have created a small website over a weekend without even using readymade themes. For 2) again it is great: I was able to write a lot of interesting code/scripts for personal automation with Go, Java, DuckDB, etc. For 3) they are okay, mainly because those large IT applications have far too many hard dependencies on legacy backends and services. Generating some fancy wrapper in Node, React or whatever is not going to do much (for me). They do help with bits of API discovery and usage examples.

Maybe it is my experience bias, but all the Java code it throws at me uses old Java 8 APIs even though my projects are set up to use at least Java 21. So I have to constantly nudge it towards using better APIs, whereas for 1) and 2) I'll take what I get.

stego-tech · 5h ago
I’m not a developer, just an IT dinosaur, but one comment in the thread stuck with me:

> it has felt like a movement of “code first, think and ask questions later” took over the narrative during the past decade

That, I believe, is the real “hard reset” nobody is really talking about. Because the outcome is the same regardless of how AI goes:

* If AI is real, this progress continues, and magically half of my technical grievances are addressed, then thinking through the actual long-term problems, project planning, and technology architecture skills will explode in value overnight and the next millionaires will be Architects who can orchestrate and integrate complex systems with AI tooling…

OR

* This AI is just a fad, the bubble pops, and suddenly everyone who has “bandwagoned” into tech but not cultivated a growing skill set beyond coding problems will be forced out of a job. Those left behind will be the ones who kept their skills sharp and growing while everyone else drank the Kool-Aid, and no amount of community college programs or ITT Tech schemes will be able to create the amount of talent needed.

Just my two cents, and why I’ve been punching my way into architecture since COVID. I’m no dummy.

jongjong · 5h ago
I've been waiting for architecture skills to 'explode in value' for my entire 15+ year career. Basically the reverse has happened thus far; architecture has been neglected and even the number of roles like 'Software Architect' and 'Solution Architect' has been in decline.

I hope AI will force this shift to occur. Once the juniors stop thinking that they know everything, stop seeing themselves as coders and start seeing themselves as 'vibe coders', they may be more inclined to rely on senior devs to evaluate 'their' code and to fill in the gaps.

Part of the issue before was that juniors who could churn out code at a rapid rate, didn't want to take advice from dinosaurs. I understand this very well because I was on the other side of the fence as a highly sought-after junior dev back in the day. I myself didn't see the value of senior devs back then.

Part of the issue is that it takes a long time/effort even to be a junior dev and so by the time junior devs can write any apps at all and they've read a couple of 'Software design patterns' books (sigh...), they think they're geniuses and don't need to take advice from anyone.

geodel · 4h ago
> Basically the reverse happened thus far; architecture has been neglected...

I see the same. My conclusion is that a lot of people who would otherwise be doing marketing, analyst, and other lower-paying jobs where one needs to appear slick have now joined computing/IT in considerable numbers due to the high pay, to the point that they hold positions like technical director.

At least in my experience I find this push for microservices, async, reactive, cloud, Kubernetes, Kafka - all part of the "we are going state of the art" narrative - is just to appear slick.

stego-tech · 4h ago
> At least in my experience I find this push for microservices, async, reactive, cloud, Kubernetes, Kafka - all part of the "we are going state of the art" narrative - is just to appear slick.

I can second this. If your stuff runs fine on a VM and provides more in value than it costs to support, then there’s no reason to move unless those technologies provide demonstrable/tangible value above and beyond their costs.

As for both of your comments, I'm also admittedly taking a swing for the fences here based on my read of the environment. A lot of metaphorical checks were written with the intent that AI will pay them in full (or whatever the next fad is), but at some point the proverbial bill always comes due.

georgeecollins · 5h ago
I would really like to see the app this person is talking about. I am sure AI coding is very useful, but it's so hard for me to judge without seeing the projects people are making with it. I have seen lots of toy examples in my field (not intended to be commercial), but this person claims they are making a commercial thing from scratch with AI. It's so hard to judge without seeing the work product.
apical_dendrite · 5h ago
I'm curious what software engineering is like inside companies like Anthropic. How much of their work is done using Claude Code and how much more productive are they? I'm not looking for something like an engineering blog that's been crafted for marketing purposes. I would really like an honest appraisal from a developer working there.
tehjoker · 6h ago
post by someone financially self-interested in scaring labor into submission
roarcher · 3h ago
> Going forward, I won’t be hiring for languages, I’ll hire devs who can solve problems, no matter the stack.

Ah, there it is. Every time I see one of these posts, it's by someone who isn't actually a developer. It's always some gleeful executive type salivating over how much money he thinks this will save him.

I suppose we'll have to suffer a wave of these idiots firing their engineering staff and crashing their companies before they learn their lesson.

pxc · 5h ago
I've recently been writing some Nix modules for personal use (mostly flake-parts) and using o3 to help, since o3 is the first OpenAI model I've used that seems good enough for many tasks.

It's useful for answering some questions about Nixpkgs conventions, but it's not much faster than just looking in the manuals or reading the source.

But when it comes to looking at my actual code and answering questions about it, it's extremely hit and miss. It's good at constructing simple functions, but the module code it writes sometimes inappropriately imports idioms from other languages. It hallucinates often.

And it's absolutely worse than useless for debugging Nix module system issues. It gives nonsense answers to questions about infinite recursion issues which manage to be plausible enough to make me waste a lot of time thinking about them, before I learned more details of the module system.

After getting burned for following it down its rabbit holes, I unfortunately find myself ignoring most of its output related to this project, even as I continue to reflexively ask it things. I have often noticed in these cases that it turns out to have been right, but am left still with the empty feeling not that I shouldn't have bothered to figure out my own answer, but that I shouldn't have bothered to ask.

All of that is to say: I think working in a language/ecosystem LLMs are "bad at" is a useful sanity check. The ways that LLMs suck at languages they suck with are instructive because they reiterate the nature of the things. The failure modes are still what you'd expect from a stochastic parrot, even as the models get "smarter".

The massive training data pools for more popular programming ecosystems make it too easy to fool yourself into believing that these things can reason. The unevenness of their performance tells you what hasn't actually "generalized".

tomlockwood · 6h ago
Often see these glowing reviews.

Where's the code?

userbinator · 6h ago
Good luck trying to debug the code you've generated without knowing anything about what it does.
denkmoon · 6h ago
Is this really any different from debugging code that has been built by hundreds of people over 10s of years and you're just the latest schmuck working on it?
DanHulton · 5h ago
It really depends on if the 10-year-old code in question was put together by a talented team with high standards, doesn't it?

But regardless, on a large project, you're always going to have some version of this problem. At least when you yourself are getting in there and writing and understand the code, you have a half-decent shot at debugging in a reasonable period of time. When even you are vibe coding your additions to the pile, you're all the way back at square one when shit hits the fan, trying to learn it all from scratch.

userbinator · 6h ago
Yes, because you actually had an incentive to learn about the code.
plorkyeran · 5h ago
Maintaining legacy code built by a huge team over decades is sort of famously difficult, and the productivity expectations on those teams are incredibly low. A project getting into a similar state after only months would be disastrously bad.
ofjcihen · 5h ago
Yes. Because you know the language
waltbosz · 6h ago
This is true. I've been using LLMs for code generation, and they do stuff I don't understand well. They also make mistakes and introduce bugs.

What I do find helpful is using them to ask for suggestions on how to achieve some result, then learning from the code they give me.

But even then, going back to this code later is difficult because I didn't go through that failure-research-success learning loop.

mcny · 6h ago
Maybe at the high end of the talent pool, but most of us are already there. I wouldn't admit this to my manager, but a lot of the time, if you made me explain what actually happens at the machine level, I have no idea.
waltbosz · 5h ago
I think there is an acceptable level of black boxedness in our profession.
revskill · 6h ago
They're all pattern recognizers. Do not overhype it; that's all they can do. They have no brain.

hnthrow90348765 · 5h ago
>Honestly, I’m questioning why I’d need a designer in a year.

Idk my guy, maybe you should sit down with them and ask them how they're using AI, and if they aren't, maybe be cooperative and introduce them to it instead of getting a stiff one fantasizing about firing them.

This whole hype cycle has some serious arrogance and sociopathic undertones about unemploying a lot of technical people suddenly (like they had it coming or something).

dudeinjapan · 5h ago
Wake me up when Claude Max Ultra Xtreme Millennium Edition can make its own Claude for you, and that Claude is better than the one you are using.