Where's the shovelware? Why AI coding claims don't add up

280 points by dbalatero | 9/3/2025, 9:18:29 PM | 161 comments | mikelovesrobots.substack.com

Comments (161)

some-guy · 2h ago
These claims wouldn't matter if the topic weren't so deadly serious. Tech leaders everywhere are buying into the FOMO, convinced their competitors are getting massive gains they're missing out on. This drives them to rebrand as AI-First companies, justify layoffs with newfound productivity narratives, and lowball developer salaries under the assumption that AI has fundamentally changed the value equation.

This is my biggest problem right now. The types of problems I'm trying to solve at work require careful planning and execution, and AI has not been helpful for it in the slightest. My manager told me that the time to deliver my latest project was cut to 20% of the original estimate because we are "an AI-first company". The mass hysteria among SVPs and PMs is absolutely insane right now, I've never seen anything like it.

Seattle3503 · 1h ago
> My manager told me that the time to deliver my latest project was cut to 20% of the original estimate because we are "an AI-first company".

If we can delegate incident response to automated LLMs too, sure, why not. Let the CEO have his way and pay the reputational price. When it doesn't work, we can revert our git repos to the day LLMs didn't write all the code.

I'm only being 90% facetious.

rglover · 2h ago
> My manager told me that the time to deliver my latest project was cut to 20% of the original estimate because we are "an AI-first company".

Lord, forgive them, they know not what they do.

leoc · 2h ago
I think Chuck Prince's "As long as the music is playing, you've got to get up and dance. We're still dancing." from the GFC (https://www.reuters.com/article/markets/funds/ex-citi-ceo-de...) is the more relevant famous line here.
coffeemug · 1h ago
I haven’t heard this before, this is incredible. Thanks for sharing. There were a bunch of phenomena that didn’t quite make sense to me before, which make perfect sense now that I read the quote.
herpdyderp · 1h ago
Oh, they for sure know what they're doing.
bsder · 2h ago
Do not forgive them. We already have a description for them:

"A bunch of mindless jerks who'll be the first against the wall when the revolution comes."

o11c · 1h ago
Remember, the origin of that quote explicitly specifies "marketing department".

The thing about hype cycles (including AI) is that the marketing department manages to convince the purchasers to do their job for them.

atleastoptimal · 1h ago
I think this hits at the heart of why you and so many people on HN hate AI.

You see yourselves as the disenfranchised proletariat of tech, crusading righteously against AI companies and myopic, trend-chasing managers, resentful of their apparent success at replacing your hard-earned skill with an API call.

It's an emotional argument, born of tribalism. I'd find it easier to believe the many claims on this site that AI is all a big scam if it weren't so obvious that this resentment underlies your very motivated reasoning. It is a big mirage of angst that causes people on here to clamor with perfunctory praise around every blog post claiming that AI companies are unprofitable, AI is useless, etc.

Think about why you believe the things you believe. Are you motivated by reason, or resentment?

nemomarx · 1h ago
Find a way to make sure workers get the value of ai labor instead of bosses and the workers will like it better. If the result is "you do the same work but managers want everything in 20% of the time" why would anyone be happy?
atleastoptimal · 1h ago
I agree that if there are productivity gains, everyone should benefit, but the only thing that would make that happen is systems and incentive structures that allow it. A manager's job is to increase revenue and cut costs; that's how they get their job, how they keep their job, and how they are promoted. People very rarely get free benefits beyond what the incentive structures they exist in allow.
UncleMeat · 11m ago
> I agree that if there are productivity gains, everyone should benefit

And if they don't, then you'd understand the anger surely. You can't say "well obviously everybody should benefit" and then also scold the people who are mad that everybody isn't benefiting.

ozgrakkurt · 44m ago
And people don’t like this. Something being logical doesn’t mean people have to accept it.

Also AI has been basically useless every time I tried it except converting some struct definitions across languages or similar tasks, it seems very unlikely that it would boost productivity by more than 10% let alone 400%.

atleastoptimal · 35m ago
What AI coding tools/models have you been using?
munificent · 1h ago
> Are you motivated by reason, or resentment?

I think most people are motivated by values. Reason and emotion are merely tools one can use in service of those.

My experience is that people who hew too strongly to the former tend to be more oblivious than most to what's going on in their own psychology.

foxylad · 45m ago
I own my company so have no fear of losing my job - indeed I'd love to offload all the development I do, so I have no resentment against AI.

But I also really care about the quality of our code, and so far my experiments with AI have been disappointing. The empirical results described in this article ring true to me.

AI definitely has some utility, just as the last "game changer" - blockchain - does. But both technologies have been massively oversold, and there will be many, many tears before bedtime.

dreadnip · 1h ago
I don’t agree. HN is full of technical people, and technical people see LLMs for what they truly are: pattern matching text machines. We just don’t buy into the AGI hype because we’ve seen nothing to support it.

I’m not concerned for my job, in fact I’d be very happy if real AGI would be achieved. It would probably be the crowning tech achievement of the human race so far. Not only would I not have to work anymore, the majority of the world wouldn’t have to. We’d suddenly be living in a completely different world.

But I don’t believe that’s where we’re headed. I don’t believe LLMs in their current state can get us there. This is exactly like the web3 hype when the blockchain was the new hip tech on the block. We invent something moderately useful, with niche applications and grifters find a way to sell it to non technical people for major profit. It’s a bubble and anyone who spends enough time in the space knows that.

atleastoptimal · 40m ago
Calling LLMs "pattern matching text machines" is a catchy thought-terminating cliche, which amounts to calling a human brain a "blob of fats, salts, and chemicals". It technically makes sense, but it misses the forest for the trees, and ignores the fact that this mere pattern matching text machine is doing things people said were impossible a few years ago. The simplicity and seeming mundanity of a technology has no bearing on its potential or emergent properties. A single termite, observed by itself, could never reveal what it could build when assembled together with its brethren.

I agree that there are lots of limitations to current LLMs, but it seems somewhat naive to ignore the rapid pace of improvement over the last 5 years and the emergent properties of AI at scale, especially in doing things claimed to be impossible only years prior (remember when people said LLMs could never do math, or that image models could never get hands or text right?).

Nobody understands with greater clarity or specificity the limitations of current LLMs than the people working in labs right now to make them better. The AGI prognostications aren't suppositions pulled out of the realm of wishful thinking; they exist because of fundamental revelations that have occurred in the development of AI as it has scaled up over the past decade.

I know I claimed that HN's hatred of AI is an emotional one, but there is an element of reasoning too that leads people down the wrong path. By seeing more flaws than the average person does in these AI systems, and seeing the tactics with which companies describe their AI offerings to make them seem more impressive (currently) than they are, you extrapolate that sense of "figuring things out" into a robust model of how AI is and must really be. In doing so, you pattern match AI hype to web3 hype and assume that since the hype is similar in certain ways, it must also be a bubble/scam just waiting to pop and have all the lies revealed. This is the same pattern-matching trap that people accuse AI of falling into: confidently claiming to have solved the problem while the flaws sit in plain view in the output.

gdbsjjdn · 1h ago
Did you read TFA, which shows that developers are slower with AI and think they're faster?

The two types of responses to AI I see are your very defensive type, and people saying "I don't get it".

atleastoptimal · 1h ago
The article is one person recording their own use of AI, finding no statistical significance but claiming that, since their evaluated ratio of AI:human speed across various coding tasks resembled the METR study's, AI has no value. People have already talked about issues with the METR study, but importantly, both that study and this blog post examined a small number of people using AI tools for the first time, working in a code base they already had experience with and a deep understanding of.

Their claim following that is that because there hasn't been exponential growth in App Store releases, domain name registrations, or Steam games, AI (beyond just producing shoddy code) has led to no increase in the amount of software at all, or none that could be called remarkable or even notable in proportion to the claims made by those at AI companies.

I think this ignores the obvious signs of growth in AI companies which provide software engineering and adjacent services via AI. These companies' revenues aren't emerging from nothing. People aren't paying them billions unless there is value in the product.

These trends include:

1. The rapid revenue growth of AI model companies (OpenAI, Anthropic, etc.)

2. The massive revenue growth of companies built on AI (Cursor, Replit, Lovable, etc.)

3. The massive valuations of these companies

Anecdotally, with AI I can make shovelware apps very easily, spin them up effortlessly, and fix issues I don't have the expertise or time to fix myself. I don't know why the author of TFA claims that he can't make a bunch of one-off apps with the capabilities available today when it's clear that many, many people can, have done so, have documented doing so, have made money selling those apps, etc.

The_Fox · 43m ago
> The rapid growth of revenue of AI model companies, OpenAI, Anthropic, etc.

You can't use growth of AI companies as evidence to refute the article. The premise is that it's a bubble. The growth IS the bubble, according to the claim.

> I don't know why the author of TFA claims that he can't make a bunch of one-off apps

I agree... One-off apps seem like a place where AI can do OK. Not that I care about it. I want AI that can build and maintain my enterprise B2B app just as well as I can in a fraction of the time, and that's not what has been delivered.

atleastoptimal · 37m ago
Bubbles are born out of valuations, not revenue. Web3 was a bubble because the money it made wasn't real productivity, but hype cycles, pyramid schemes, etc. AI companies are merely selling API calls; there is no financial scheming. It is very simply that the product is worth what it is being sold for.

> I want AI that can build and maintain my enterprise B2B app just as well as I can in a fraction of the time, and that's not what has been delivered.

AI isn't at that level yet, but it is making fast strides on subsets of it. I can't imagine that systems of models, and the models themselves, won't reach that level in a couple of years, given how bad AI coding tools were just a couple of years ago.

baxtr · 29m ago
AI has become the ultimate excuse for weak managers to pressure tech folks.
gedy · 6m ago
> My manager told me that the time to deliver my latest project was cut to 20% of the original estimate because we are "an AI-first company".

Challenge your manager to a race, have him vibe code

vkou · 1h ago
I'd like to see those SVPs and PMs, or shit, even a line manager use AI to implement something as simple as a 2-month intern project[1] in a week.

---

[1] We generally budget about half an intern's time for finding the coffee machine, learning how to show up to work on time, going on a fun event with the other interns to play minigolf, discovering that unit tests exist, etc, etc.

elevatortrim · 1h ago
I actually built something in under a week (a time tracking tool that helps developers log their time consistently in Jira and Harvest) that most developers in my company now use.

I have a backend development background, so I was able to review the BE code and fix some bugs. But I did not bother learning the Jira and Harvest API specs at all; the AI (Cursor + Sonnet 4) figured it all out.

I would not be able to write the front-end of this. It is JS-based and updates the UI with real-time HTTP requests (I forgot the name of this technology, the new AJAX that is), and I do not have time to learn it, but again, I was able to tweak what the AI generated and make it work.

Not only did AI help me do something in much less time than it would otherwise have taken, it enabled me to do something that otherwise would not have been possible.

panarchy · 1h ago
I'd rather see those SVPs, PMs, and line managers be turned into AI.
com2kid · 2h ago
Multiple things can be true at the same time:

1. LLMs do not increase general developer productivity by 10x across the board for general purpose tasks selected at random.

2. LLMs dramatically increase productivity for a limited subset of tasks.

3. LLMs can be automated to do busy work and although they may take longer in terms of clock time than a human, the work is effectively done in the background.

LLMs can get me up to speed on new APIs and libraries far faster than I can myself, a gigantic speedup. If I need to write a small bit of glue code in a language I do not know, LLMs not only save me time, but they make it so I don't have to learn something that I'll likely never use again.

Fixing up existing large code bases? Productivity is at best a wash.

Setting up a scaffolding for a new website? LLMs are amazing at it.

Writing mocks for classes? LLMs know the details of using mock libraries really well and can get it done far faster than I can, especially since writing complex mocks is something I do a couple times a year and completely forget how to do in-between the rare times I am doing it.
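
To make the boilerplate concrete, here's a minimal sketch of the kind of thing I mean, using Python's unittest.mock (the service and names are invented for illustration):

```python
from unittest.mock import MagicMock

# Hypothetical service under test; the names here are invented for illustration.
def send_welcome(client, user_id):
    user = client.get_user(user_id)
    client.send_email(user["email"], subject="Welcome!")
    return user["email"]

def test_send_welcome():
    client = MagicMock()
    client.get_user.return_value = {"email": "a@example.com"}

    assert send_welcome(client, 42) == "a@example.com"
    # Verify the interaction without standing up a real mail backend.
    client.send_email.assert_called_once_with("a@example.com", subject="Welcome!")
```

Trivial here, but real mocks pile up exactly this kind of setup and assertion code, which is what the LLM cranks out for me.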

Navigating a new code base? LLMs are ~70% great at this. If you've ever opened up an over-engineered WTF project, just finding where HTTP routes are defined at can be a problem. "Yo, Claude, where are the route endpoints in this project defined at? Where do the dependency injected functions for auth live?"

Right tool, right job. Stop using a hammer on nails.

heavyset_go · 1h ago
> LLMs can get me up to speed on new APIs and libraries far faster than I can myself, a gigantic speedup. If I need to write a small bit of glue code in a language I do not know, LLMs not only save me time, but they make it so I don't have to learn something that I'll likely never use again.

I wax and wane on this one.

I've had the same feelings, but too often I've peeked behind the curtain, read the docs, gotten familiar with the external dependencies, and then realized that whatever the LLM responded with paradoxically either wasn't following convention or tried to shoehorn my problem to fit code examples found online, used features inappropriately, took a long roundabout path to do something that could be done simply, etc.

It can feel like magic until you look too closely at it, and I worry that it'll make me complacent with the feeling of understanding without actually leaving me with an understanding.

SchemaLoad · 1h ago
Yeah LLMs get me _an_ answer far faster than I could find it myself, but it's often not correct. And then I have to verify it myself which was exactly the work I was trying to skip by using the LLM to start with.

If I have to manually verify every answer, I may as well read the docs myself.

ksenzee · 1h ago
> Stop using a hammer on nails.

sorry, what am I supposed to use on nails?

falcor84 · 1h ago
Nail polish remover
mvdtnz · 1h ago
> LLMs can be automated to do busy work and although they may take longer in terms of clock time than a human, the work is effectively done in the background.

What is this supposed busy work that can be done in the background unsupervised?

I think it's about time for the AI pushers to be absolutely clear about the actual specific tasks they are having success with. We're all getting a bit tired of the vagueness and hand waving.

jfengel · 2h ago
If it can figure out where dependencies come from I'm going to have to look more into this. I really hate the way injection makes other people's code bases impenetrable. "The framework scans billions of lines of code to find the implementation, and so can you!"
com2kid · 48m ago
I'm not looking forward to the cancer of @ invading JavaScript code. Ugh. I am a big fan of wysiwyg. Plz don't Decorate my code....
iLoveOncall · 2h ago
> Setting up a scaffolding for a new website? LLMs are amazing at it.

So amazing that every single stat shown by the author in the article has been flat at best, despite all of them being based on new development rather than work on existing code bases.

daxfohl · 1h ago
Maybe the world has run out of interesting websites to create. That they are created faster doesn't necessarily imply they'll be created more frequently.
daxfohl · 1h ago
Of course if that's the case (and it well may be), then THAT is the reason for tech layoffs. Not AI. If anything, it means AI came too late.
rglover · 2h ago
Most of it doesn't exist beyond videos of code spraying onto a screen alongside a claim that "juniors are dead."

I think the "why" for this is that the stakes are high. The economy is trembling. Tech jobs are evaporating. There's a high anxiety around AI being a savior, and so, a demi-religion is forming among the crowd that needs AI to be able to replace developers/competency.

That said: I personally have gotten impressive results with AI, but you still need to know what you're doing. Most people don't (beyond the beginner -> intermediate range), and so, it's no surprise that they're flooding social media with exaggerated claims.

If you didn't have a superpower before AI (writing code), then having that superpower as a perceived equalizer is something you will deploy all resources (material, psychological, etc.) to protect, ensuring that everyone else maintains the position that 1) the superpower is good, 2) the superpower cannot go away, and 3) the superpower being fallible should be ignored.

Like any other hype cycle, these people will flush out, the midpoint will be discovered, and we'll patiently await the next excuse to incinerate billions of dollars.

SchemaLoad · 1h ago
At least in my experience, it excels in blank canvas projects. Where you've got nothing and want something pretty basic. The tools can probably set up a fresh React project faster than me. But at least every time I've tried them on an actual work repo they get reduced to almost useless.

Which is why they generate so much hype. They are perfect for tech demos, then management wonders why they aren't seeing results in the real world.

tomrod · 1h ago
Exactly. It quickly builds a lot of technical debt that must be paid down, especially for people writing code in areas they aren't deep in.

For tight tasks it can be super helpful -- like, for me, an AI/data science guy, setting up a basic reverse proxy. But I do so with a ton of scrutiny -- pushing back on it, searching Kagi or the docs to at least confirm the code, etc. This is helpful because I don't have a mental map of reverse proxies -- so it can help fill in gaps, but only with a lot of caution.

That type of use really doesn't justify the billion dollar valuations of any companies, IMO.

herpdyderp · 1h ago
I've had great success with GPT5 in existing projects because its agent mode is very good (the best I've seen so far) at analyzing the existing codebase and then writing code that feels like it fits in already (without prompt engineering on my part). I still agree that AI is particularly good on fresh projects though.
SchemaLoad · 1h ago
Could be that there is a huge difference in the products. Last few companies have given me Github Copilot which I find entirely useless, I found the automatic suggestions more distracting than useful, and the fix and explain functions never work. But maybe if you burn $1000/day on Claude Code it works a lot better. And then companies see the results from that and wonder why they aren't getting it spending a couple of dollars on Copilot.
fennecbutt · 2h ago
I mean, the truth should be fairly obvious to people, given that a lot of the talk around AI rings very much like IFLScience/mainstream-media-style "science" articles, which always make some outrageous "right around the corner" claim based off some small tidbit from a paper whose abstract they only skimmed.
captainkrtek · 2h ago
This tracks with my own experience as well. I’ve found it useful in some trivial ways (eg: small refactors, type definition from a schema, etc.) but so far tasks more than that it misses things and requires rework, etc. The future may make me eat my words though.

On the other hand, I’ve lately seen it misused by less experienced engineers trying to implement bigger features who eagerly accept all it churns out as “good” without realizing the code it produced:

- doesn’t follow our existing style guide and patterns.

- implements some logic from scratch where there certainly is more than one suitable library, making this code we now own.

- is some behemoth of a PR trying to do all the things.

nicce · 2h ago
> implements some logic from scratch where there certainly is more than one suitable library, making this code we now own - is some behemoth of a PR trying to do all the things

Depending on the amount of code, I see this only as positive? Too often people pull huge libraries for 50 lines of code.

captainkrtek · 2h ago
I'm not talking about generating a few lines instead of importing left-pad. In recent PRs I've had:

- Implementing a scheduler from scratch (hundreds of lines), when there are many many libraries for this in Go.

- Implementing some complex configuration store that is safe for concurrent access, using generics, reflection, and a whole other host of stuff (additionally hundreds of lines, plus more for tests).

While I can't say any of the code is bad, it is effectively like importing a library which your team now owns, but worse in that no one really understands it or supports it.

Lastly, I could find libraries that are well supported, documented, and active for each of these use-cases fairly quickly.

davidcelis · 2h ago
Someone vibe coded a PR on my team where there were hundreds of lines doing complex validation of an uploaded CSV file (which we only expected to have two columns) instead of just relying on Ruby's built-in CSV library (i.e. `CSV.parse` would have done everything the AI produced)
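
For scale, the entire built-in route is a few lines. Here's the same idea sketched in Python's csv module (Ruby's CSV.parse is analogous), with the one check we actually needed; the file name is hypothetical:

```python
import csv

# Sketch of the built-in route; the only validation we actually needed
# was "two columns per row".
with open("upload.csv", newline="") as f:
    rows = list(csv.reader(f))

bad = [i for i, row in enumerate(rows, start=1) if len(row) != 2]
if bad:
    raise ValueError(f"expected 2 columns; bad rows: {bad}")
```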
mandeepj · 1h ago
That’s a good example of ‘getting a desired outcome based on prompt’ - use a built-in lib or not.
vkou · 1h ago
And when it hallucinates a non-existent library, what are the magic prompts that you give it that make it stop trying to bullshit you?
mandeepj · 53m ago
> what are the magic prompts that you give it that makes it stop trying to bullshit you?

Maybe keep your eyes open? :-)

vkou · 16m ago
As I thought.

And for the record - my eyes are open. I'm aware I'm being bullshitted. I don't trust, I verify.

But I also don't have a magical lever that I can pull to make it stop hallucinating.

... and every time I ask if one exists, I get either crickets, or a response that doesn't answer the question.

7thpower · 1h ago
I wonder how many times the LLM randomly tried to steer back to that library only to get chastised for not following instructions.
daxfohl · 2h ago
And that may be where the discrepancy comes in. You feel fast because, whoa, I created this whole scheduler in ten seconds! But then you also have to spend an hour code reviewing that scheduler, which still feels fast: a good working scheduler in such a short time. But without AI, maybe it feels slow to find and integrate with some existing scheduling library, yet in wall clock time it was the same.
SchemaLoad · 1h ago
The trick is that no one is actually carefully reviewing this stuff. Reviewing code properly is extremely hard. I'd say even harder than writing it from scratch. But there's no minimum amount of work you have to do. If you just do a quick skim over the result, no one will know you didn't carefully review every single detail. Then it gets merged to production full of mistakes.
captainkrtek · 43m ago
To add to this:

If I as a reviewer don't know whether the author used AI, I can't even assume a single human (typically the author) has read any of the code, let alone major parts of it. I could be the first person reviewing it.

Not that it's a great assumption to make, but it's also fair to take a PR and trust that the author wrote it, understands it, and considers it ready for production. So much work, outside of tech as well, is built on trust, at least in part.

heavyset_go · 2h ago
Yes, for leftpad-like libraries it's fine, but does your URL or email validation function really handle all valid and invalid cases correctly now and into the future, for example?
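
A sketch of what I mean, in Python: even the "easy" URL case wants a real parser rather than a hand-rolled regex.

```python
from urllib.parse import urlparse

# Sketch: lean on a maintained parser instead of a hand-rolled regex.
def looks_like_http_url(s: str) -> bool:
    try:
        parts = urlparse(s)
    except ValueError:  # some malformed inputs (e.g. bad IPv6 literals) raise
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)

assert looks_like_http_url("https://example.com/path?q=1")
assert not looks_like_http_url("http://")               # no host
assert not looks_like_http_url("javascript:alert(1)")   # scheme smuggling
```

And email is far worse: the RFC address grammar is gnarly enough that a maintained library beats any regex you or the LLM will write inline.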
adelie · 1h ago
i've seen this fairly often with internal libraries as well - a recent AI-assisted PR i reviewed included a complete reimplementation of our metrics collector interface.

suspect this happened because the reimplementation contained a number of standard/expected methods that we didn't have in our existing interface (because we didn't need them), so it was considered 'different' enough. but none of the code actually used those methods (because we didn't need them), so all this PR did was add a few hundred lines of cognitive overhead.

captainkrtek · 41m ago
I’ve seen this as well as PR feedback to authors of AI assisted PRs: “hey we already have a db driver and interface we’re using for this operation, why did you write this?”
mcny · 2h ago
> Too often people pull huge libraries for 50 lines of code.

I used to be one of those people. It just made sense to me when I was (I still am to some extent) more naïve than I am today. But then I also used to think "it makes sense for everyone to eat together at a community kitchen of some sort instead of cooking at home because it saves everyone time and money" but that's another tangent for another day. The reason I bring it up is I used to think if it is shared functionality and it is a small enough domain, there is no need for everyone to spend time to implement the same idea a hundred times. It will save time and effort if we pool it together into one repository of a small library.

Except reality is never that simple. Just like that community kitchen, if everyone decided to eat the same nutritious meal together, we would definitely save time and money but people don't like living in what is basically an open air prison.

codebje · 1h ago
Also there are people occasionally poisoning the community pot, don't forget that bit.
fennecbutt · 2h ago
Granted, _discovery_ of such things is something I'm still trying to solve at my own job, and potentially LLMs can at least be leveraged to analyse and search code(bases) rather than just write them.

It's difficult because you need team members to be able to work quite independently but knowledge of internal libraries can get so siloed.

captainkrtek · 1h ago
I do think the discovery piece is hugely valuable. I’m fairly capable with grep and ag, but asking Claude where something is in my codebase is very handy.
skydhash · 1h ago
I've always gone from the entry point of the code (with a lot of assumptions) and then done a deep dive into one of the modules or branches. After a while you develop an intuition for where code may be (or just follow the import/include statements).

I've explored code like FreeBSD, Busybox, Laravel, Gnome, Blender, ... and it's quite easy to find your way around.

captainkrtek · 41m ago
Definitely, I’ve based a lot of my debugging on this. AI is just another tool in the toolbox for my searching, but usually not my first tool.
lumost · 1h ago
The experience in green-field development is very different. In the early days of a project, the LLM's opinion is about as good as that of the individuals starting the project. The coding standards and other conventions have not yet been established. Even buggy, half-nonsense code still leaves the project demoable. Being able to explore 5 projects to demo status instead of 1 is a major boost.
jryio · 2h ago
I completely agree with the thesis here. I also have not seen a massive productivity boost with the use of AI.

I think that there will be neurological fatigue occurring whereby if software engineers are not actively practicing problem-solving, discernment, and translation into computer code - those skills will atrophy...

Yes, AI is not the 2x or 10x technology of the future™ it was promised to be. It may be the case that any productivity boost is happening within existing private code bases. Even still, there should be a modest uptick in noticeably improved offerings deployed in the market, which does not appear to be there.

In my consulting practice I am seeing this phenomenon regularly, whereby new founders or stir-crazy CTOs push the use of AI and ultimately find that they're spending more time wrangling a spastic code base than they are building shared understanding and working together.

I have recently taken on advisory roles and retainers just to reinstill engineering best practices.

heavyset_go · 1h ago
> I think that there will be neurological fatigue occurring whereby if software engineers are not actively practicing problem-solving, discernment, and translation into computer code - those skills will atrophy...

I've found this to be the case with most (if not all) skills, even riding a bike. Sure, you don't forget how to ride it, but your ability to expertly articulate with the bike in a synergistic and tool-like way atrophies.

If that's the case with engineering, and I believe it to be, it should serve as a real warning.

jryio · 1h ago
Yes and this is the placid version where lazy programmers elect to lighten their cognitive load by farming out to AI.

An insidious version is AGI replacing human cognition.

To replace human thought is to replace a biological ability which progresses on evolutionary timescales, not on a Moore's-law-like curve. The tissue in your skull will quite literally be as useful as a cow's for solving problems... think about that.

Automating labor in the 20th century disrupted society, and we've seen its consequences. Replacing cognition entirely (driving, writing, decision making, and communication) yields far worse outcomes than transitioning the population from food production to knowledge work.

If not our bodies and not our minds, then what do we have? (Note: Altman's universal basic income ought to trip every dystopian alarm bell).

Whether adopted passively or foisted on us actively, cognition is what makes us human. Let's not let Claude Code be the nexus for something worse.

InCom-0 · 14m ago
On one hand, I don't understand what all the fuss is about. LLMs are great at all kinds of things: searching for (good) information, summarizing existing text, conceptual discussions where they point you in the right direction very quickly, etc. They are just not great (some might say harmful) at straight-up non-trivial code generation or design of complex systems, with the added peculiarity that on the surface the models seem almost capable of doing it, but never quite are. That is sort of their central feature: producing text so that it seems correct from a statistical perspective, but without actual reasoning.

On the other hand, I do understand that the things LLMs are really great at are not actually all that spectacular to monetize... and so as a result we have all these snake oil salesmen on every corner boasting about nonsensical vibecoding achievements, because that's where the real money would be... if it were really true... but it is not.

wrs · 2h ago
This makes some sense. We have CEOs saying they're not hiring developers because AI makes their existing ones 10X more productive. If that productivity enhancement was real, wouldn't they be trying to hire all the developers? If you're getting 10X the productivity for the same investment, wouldn't you pour cash into that engine like crazy?

Perhaps these graphs show that management is indeed so finely tuned that they've managed to apply the AI revolution to keep productivity exactly flat while reducing expenses.

heavyset_go · 1h ago
As the rate of profit drops, value needs to be squeezed out of somewhere and that will come from the hiring/firing and compensation of labor, hence a strong bias towards that outcome.

99% of the draw of AI is cutting labor costs, and hiring goes against that.

That said, I don't believe AI productivity claims, just pointing out a factor that could theoretically contribute to your hypothetical.

wrs · 1h ago
Maybe if you have a business where the need for software is a constant, so it’s great to get it for 90% off. (It’s not clear what business that is in 2025, maybe a small plumbing contractor?)

But if your business is making software it’s hard to argue you only need a constant amount of software. I’ve certainly never worked at a software company where the to-do list was constant or shrinking!

moduspol · 2h ago
A lot of these C-suite people also expect the remaining ones to be replaced by AI. They subscribe to the hockey-stick "AGI is around the corner" narrative.

I don't, but at least it is somewhat logical. If you truly believe that, you wouldn't necessarily want to hire more developers.

wrs · 1h ago
Or CEOs.
quantumcotton · 2h ago
Today you will learn what diminishing returns are :)

You can only utilize so many people or so much action within a business or idea.

Essentially it's throwing more stupid at a problem.

The reason there are so many layoffs is because of AI creating efficiency. The thing that people don't realize is that it's not that one AI robot or GPU is going to replace one human at a one-to-one ratio. It's going to replace the amount of workload one person can do, which in turn gets rid of one human employee. It's not that your job isn't taken by AI; it's started. But how much human is needed is where the new supply/demand lies, and how long the job lasts. There will always be more need for more creative minds. The issue is we are lacking them.

It's incredible how many software engineers I see walking around without jobs. Looking for a job making $100,000 to $200,000 a year. Meanwhile, they have no idea how much money they could save a business. Their creativity was killed by school.

They are relying on somebody to tell them what to do, and when nobody's around to tell anybody what to do, they all get stuck. What you are seeing isn't a lack of capability. It's a lack of ability to control direction or create an idea worth following.

Nextgrid · 2h ago
I disagree that layoffs are because of AI-mediated productivity improvements.

The layoffs are primarily due to over-hiring during the pandemic and even earlier during the zero-interest-rate period.

AI is used as a convenient excuse to execute layoffs without appearing in a bad position to the eyes of investors. Whether any code is actually generated by AI or not is irrelevant (and since it’s hard to tell either way, nobody will be able to prove anything and the narrative will keep being adjusted as necessary).

heavyset_go · 1h ago
Bootstrapping is a lot easier when you have your family's or someone else's money to start a business and then fall back on if it doesn't pan out.

The reason people take jobs comes down to economics, not "creativity".

mattmanser · 1h ago
The reason there were so many layoffs is because cheap money dried up.

Nothing to do with AI.

Interest rates are still relatively high.

searls · 1h ago
The answer is that we're making it right now. AI didn't speed me up at all until agents got good enough, which was April/May of this year.

Just today I built a shovelware CLI that exports iMessage archives into a standalone website. It would have taken me weeks by hand. I'll probably have it out as a homebrew formula in a day or two.

I'm working on an iOS app as well that's MUCH further along than it would be if I hand-rolled it, but I'm intentionally taking my time with it.

Anyway, the post's data mostly ends in March/April, which is when generative AI started being useful for coding at all (and I've had Copilot enabled since Nov 2022).

davidcbc · 1h ago
It's amazing how whenever criticisms pop up the responses for the last 3 years have been "well you aren't using <insert latest>, it's finally good!"
shepherdjerred · 27m ago
isn't this likely to be the case when a field is developing quickly and there are a large number of people who have different opinions on the subject?

e.g. I liked GitHub Copilot but didn't find it to be a game changer. I tried Cursor this year and started to see how useful AI can be today.

anp · 1h ago
FWIW this closely matches my experience. I’m pretty late to the AI hype train but my opinion changed specifically because of using combinations of models & tools that released right before the cut off date for the data here. My impression from friends is that it’s taken even longer for many companies to decide they’re OK with these tools being used at all, so I would expect a lot of hysteresis on outputs from that kind of adoption.

That said I’ve had similar misgivings about the METR study and I’m eager for there to be more aggregate study of the productivity outcomes.

noidesto · 37m ago
Agreed. Agentic AI is a completely different tool than “traditional” AI.

I'm curious what the author's data and experiment would look like a year from now.

mvdtnz · 1h ago
> AI didn't speed me up at all until agents got good enough, which was April/May of this year.

That was 5 months ago, which is 6 years in 10x time.

m-hodges · 44m ago
This article reminds me of two recent observations by Paul Krugman about the internet:

"So, here’s labor productivity growth over the 25 years following each date on the horizontal axis [...] See the great productivity boom that followed the rise of the internet? Neither do I. [...] Maybe the key point is that nobody is arguing that the internet has been useless; surely, it has contributed to economic growth. The argument instead is that its benefits weren’t exceptionally large compared with those of earlier, less glamorous technologies."¹

"On the second, history suggests that large economic effects from A.I. will take longer to materialize than many people currently seem to expect [...] And even while it lasted, productivity growth during the I.T. boom was no higher than it was during the generation-long boom after World War II, which was notable in the fact that it didn’t seem to be driven by any radically new technology [...] That’s not to say that artificial intelligence won’t have huge economic impacts. But history suggests that they won’t come quickly. ChatGPT and whatever follows are probably an economic story for the 2030s, not for the next few years."²

¹ https://www.nytimes.com/2023/04/04/opinion/internet-economy....

² https://www.nytimes.com/2023/03/31/opinion/ai-chatgpt-jobs-e...

larve · 2h ago
In case the author is reading this, I have the receipts on how there's a real step function in how much software I build, especially lately. I am not going to put any number on it because that makes no sense, but I certainly push a lot of code that reasonably seems to work.

The reason it doesn't show up online is that I mostly write software for myself and for work, with the primary goal of making things better, not faster. More tooling, better infra, better logging, more prototyping, more experimentation, more exploration.

Here's my opensource work: https://github.com/orgs/go-go-golems/repositories . These are not just one-offs (although there's plenty of those in the vibes/ and go-go-labs/ repositories), but long-lived codebases / frameworks that are building upon each other and have gone through many many iterations.

noidesto · 30m ago
Agreed. In the hands of a seasoned dev, not only does productivity improve but so does the quality of the output.

If I'm working against a deadline, I feel more comfortable spending time on research and design, knowing I can spend less time on implementation. In the end, it took the same amount of time, though hopefully with an increase in reliability, observability, and extensibility. None of these things show up in the author's faulty dataset and experiment.

trenchpilgrim · 2h ago
Same. On many days 90% of my code output by lines is Claude generated and things that took me a day now take well under an hour.

Also, a good chunk of my personal OSS projects are AI assisted. You probably can't tell from looking at them, because I have strict style guides that suppress the "AI style", and I don't really talk about how I use AI in the READMEs. Do you also expect me to mention that I used IntelliSense and syntax highlighting?

droidjj · 2h ago
The author’s main point is that there hasn’t been an uptick in total code shipped, as you would expect if people are 10x-ing their productivity. Whether folks admit to using AI in their workflow is irrelevant.
larve · 2h ago
Their main point is "AI coding claims don't add up", as shown by the amount of code shipped. I personally do think some of the more incredible claims about AI coding add up, and am happy to talk about it based on my "evidence", i.e. the software I am building. 99.99% of my code is AI generated at this point, with the occasional line I fill in because it'd be stupid to wait for an LLM to do it.

For example, I've built 5-6 iphone apps, but they're kind of one-offs and I don't know why I would put them up on the app store, since they only scratch my own itches.

trenchpilgrim · 2h ago
Oh yeah, I love building one off tools with it. I am working on a game mod with a friend, we are hand writing the code that runs when you play it, but we vibe code all sorts of dev tools to help us test and iterate on it faster.

Do internal, narrow purpose dev tools count as shipped code?

daxfohl · 1h ago
This seems to be a common thread. For personal projects where most details aren't important, they are good at meeting the couple things that are important to you and filling in the rest with reasonable, mostly-good-enough guesses. But the more detailed the requirements are, the less filler code there is, and the more each line of code matters. In those situations it's probably faster to type the line of code than to type the English equivalent and hand-hold the assistant through the editing process.
larve · 1h ago
I don't think so, although I think at that point experience heavily comes into play. With GPT-5 especially, I can basically point Cursor/Codex at a repo and say "refactor this to this pattern" and come back 25 minutes later to a pretty much impeccable result. In fact, that's become my favourite pastime lately.

I linked some examples higher up, but I've been maintaining a lot of packages that I started slightly before chatgpt and then refactored and worked on as I progressively moved to the "entirely AI generated" workflow I have today.

I don't think it's an easy skill (not saying that to make myself look good, I spent an ungodly amount of time exploring programming with LLMs and still do), akin to thinking at a strategic level vs at a "code" level.

Certain design patterns also make it much easier to deal with LLM code: state reducers (redux/zustand for example), event-driven architectures, component-based design systems, and building many CLI tools that the agent can invoke to iterate and correct things. So do certain "tools" like sqlite/tmux: just by telling the LLM "btw you can use tmux/sqlite", you allow it to pass hurdles that would otherwise make it spiral into slop-ratatouille.
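
To make the reducer point concrete, here's a toy sketch of the shape in Python (redux/zustand are the real things; this just shows the pattern, and the action names are made up):

```python
# Toy sketch of the reducer shape: every state change flows through one pure
# function, which is easy for an agent to extend and for a human to review.
# (Uses the Python 3.10+ match statement.)
def reducer(state: dict, action: dict) -> dict:
    match action["type"]:
        case "add_item":
            return {**state, "items": state["items"] + [action["item"]]}
        case "clear_items":
            return {**state, "items": []}
        case _:
            return state

state = reducer({"items": []}, {"type": "add_item", "item": "x"})
assert state == {"items": ["x"]}
```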

I also think that a language like go was a really good coincidence, because it is so amenable to LLM-ification.

Aeolun · 2h ago
I don’t think this is necessarily true. People that didn’t ship before still don’t ship. My ‘unshipped projects’ backlog is still nearly as large. It’s just got three new entries in the past two months instead of one.
trenchpilgrim · 2h ago
The bottleneck on how much I ship has never been how fast I can write and deploy code :)
warkdarrior · 2h ago
Maybe people are working less and enjoying life more, while shipping the same amount of code as before.

If someone builds a faster car tomorrow, I am not going to go to the office more often.

leoc · 2h ago
"In this economy?", as the saying goes.
nerevarthelame · 1h ago
How are you sure it's increasing your productivity if it "makes no sense" to even quantify that? What are the receipts you have?
larve · 1h ago
I have linked my github above. I don't know how that fares in the bigger scope of things, but I went from 0 opensource to hundreds of tools and frameworks and libraries. Putting a number on "productivity" makes no sense to me, I would have no idea what that means.

I generate between 10-100k lines of code per day these days. But is that a measure of productivity? Not really...

throwaway13337 · 2h ago
Great angle to look at the releases of new software. I, too, thought we'd see a huge increase by now.

An alternative theory is that writing code was never the bottleneck of releasing software. The exploration of what it is you're building and getting it on a platform takes time and effort.

On the other hand, yeah, it's really easy to 'hold it wrong' with AI tools. Sometimes I have a great day and think I've figured it out. And then the next day, I realize that I'm still holding it wrong in some other way.

It is philosophically interesting that it is so hard to understand what makes building software products hard. And how to make it more productive. I can build software for 20 years and still feel like I don't really know.

balder1991 · 2h ago
Also, when you create a product, you can't speed up the iterative process of seeing how users want it, fixing edge cases that you only realize later, etc. These are the things that make a product good, and why there's that article about software taking 10 years to mature: https://www.joelonsoftware.com/2001/07/21/good-software-take...
Nextgrid · 1h ago
This is the answer. Programming was never the bottleneck in delivering software, whether free-range, organic, grass-fed human-generated code or AI-assisted.

AI is just a convenient excuse to lay off many rounds of over-hiring while also keeping the door open for potential investors to throw more money into the incinerator since the company is now “AI-first”.

stillsut · 1h ago
Got your shovelware right here...with receipts.

Background: I'm building a python package side project which allows you to encode/decode messages into LLM output.

Receipts: the tool I'm using creates a markdown file that displays every prompt typed and every solution generated, along with summaries of the code diffs. You can check it out here: https://github.com/sutt/innocuous/blob/master/docs/dev-summa...

Specific example: I actually used a leetcode-style algorithmic implementation of memoization for branching. This would have taken a couple of days to implement by hand, but it took about 20 minutes to write the spec and 20 minutes to review the generated solutions and merge one. If you're curious, you can see the diff here: https://github.com/sutt/innocuous/commit/cdabc98
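
(For anyone unfamiliar, the generic shape of the technique is just cached recursion. This sketch is illustrative only, not the project's actual code:)

```python
from functools import lru_cache

# Generic memoization sketch: cache a recursive branch count so overlapping
# subproblems are computed once instead of exponentially many times.
@lru_cache(maxsize=None)
def branch_count(depth: int, width: int) -> int:
    if depth == 0:
        return 1
    return sum(branch_count(depth - 1, w) for w in range(1, width + 1))

print(branch_count(20, 8))  # instant with the cache; intractable without it
```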

Noumenon72 · 1h ago
You should have used the word "steganography" in this description like you did in your readme, makes it 100% more clear what it does.
bastawhiz · 2h ago
The amount of shovelware is not a reliable signal. You know what's almost empty for the first time in almost a decade? My backlog. Where AI tools shine is taking an existing codebase and instructions, and going to town. It's not dreaming up whole games from scratch. All the engineers out there didn't quit their jobs to build new stuff, they picked up new tools to do their existing jobs better (or at least, to hate their jobs less).

The shovelware was always there. And it always will be. But that doesn't mean it's spurting out faster, because that's not what AI does. Hell, if anything I expect that there's less visible shovelware, because when it does get created, it's less obvious (and perhaps higher quality).

At some point, the quality of uninspired projects will be lifted up by the baseline of quality that mainstream AI allows. At what point is that "high enough that we can't tell what's garbage"? We've perhaps found ourselves at or around that point.

protocolture · 1h ago
AI has made me a 10x hobby engineer, i.e. when I need skills I don't have, for work that's just for me. It's great.

It's sometimes helpful when writing an email, but otherwise it has not touched any of my productive work.

benjiro · 2h ago
I have to agree with the author, with a caveat: he is a seasoned developer. For somebody like him, churning out good-quality code is probably easy.

Where I expect a lot of those feeling-fast metrics to come from is people who may have less coding experience and who, with AI, are coding way above their level.

My brother-in-law asks for a nice product website; I just feed his business plan into an LLM, do some fine-tuning on the results, and have a good-looking website in an hour's time. If I did it myself manually, just take me behind a barn, as those jobs are so boring and take forever. But I know that website design is a weakness of mine.

That is the power of LLMs: turning out quick code, maybe offering some suggestion you did not think about. But... it also eats time! Making your prompts so that the LLM understands, waiting for the result... waiting... OK, now check the result, can you use it? Oh no, it did X, Y, Z wrong. Prompt again... and again. And this is where your productivity goes to die.

So when you compare a pool of developer feedback, you're going to get a broad mix of "it helps a lot", "some", "it's worse than my code", etc., mixed in with the prompting, result delays, and so on.

It gets even worse with agent/vibe coding, as you just tend to be waiting 5, 10 minutes for changes to be made. You need to review them, test them... oh no, the LLM screwed something up again. Oh no, it removed 50% of my code. Hey, where did my comments go? And we are back to a loss of time.

LLMs are a tool... But after a lot of working with them, my opinion is to use them when needed but not to depend on them for everything. I sometimes look with cow eyes when people say they are coding so much with LLMs and spending 200 or more bucks per month.

They can be powerful tools, but I feel that some folks have become over-dependent on them. And worse is my feeling that our juniors are going to be in a world of hurt if their skills are more LLM monkey coding (or vibe coding) than actually understanding how to code (and the knowledge behind the actual programming languages and systems).

CompoundEyes · 27m ago
It takes time! It took me about two years to become effective. Knowing what I know now, a contract project I shipped when I was just starting out with gpt-3.5 could be done in weeks and not months still using gpt-3.5. During that period I spent hours and hours of my time and a small fortune on inference at work, for contract jobs and plenty out of pocket to learn and experiment. It’s chaos and I love it.
kenjackson · 2h ago
Shovelware may not be a good way to track additional productivity.

That said, I'm skeptical that AI is as helpful for commercial software. It's been great at automating my workflow, because I suck at shell scripting and AI is great at it. But for most of the code I write, I honestly halfway don't know what I'm going to write until I write it. The prompt itself is where my thinking goes, so the time savings would be fairly small. But I also think I'm fairly skilled (except at scripting).

NathanKP · 1h ago
I think the explanation is simple: there is a direct correlation between being too lazy and demotivated to write your own code, and being too lazy and demotivated to actually finish a project and publish your work online.

The same people who are willing to go through all the steps to release an application online are also willing to go through the extra effort of writing their own code. The code is actually the easy part compared to the rest of it... always has been.

atleastoptimal · 1h ago
All these bearish claims about AI coding would hold weight if models were stuck permanently at the capabilities level they are now with no chance at improvement. This is very likely not the case given improvements over the past year, and even with diminishing returns models will be significantly more capable both independently and as a copilot in a year.
SchemaLoad · 1h ago
Sure, no one can say what the future will look like. The problem is these products are being marketed today based on what they might do tomorrow. And it's warping perceptions of management who get sold on hype that isn't real yet and possibly not for a very long time.
hinkley · 1h ago
Hype cycles affect funding. When the Trough of Disillusionment hits anything that's being started will take years to finish due to a more difficult funding terrain.

The arrival of the Trough is predicted by the amount of lies and utter bullshit shoveled out during the earlier parts of the cycle. So while it's unfortunate that the real goods don't get delivered for years and years after they might have been, it's typically, and often entirely, the fault of the people on the train that this has happened.

There's an awful lot of utter bullshit in the AI hype.

goalieca · 2h ago
There's a relatively monotonous task in software engineering that pretty much everyone working on a legacy C/C++ code base has had to face: static analysis and compiler warnings. That seems about as boring and routine an operation as exists. As simple as can be. I've seen this task farmed out to interns paid barely anything just to get it done.

My question to HN is... can LLMs do this? Can they convert all the unsafe C-string invocations to safe ones? Can they replace system calls with POSIX calls? Can they wrap everything in a smart pointer and make sure that mutex locks are added where needed?

jes5199 · 1h ago
if you have a static analysis tool that gives a list of problems to fix, and something like a unit test suite that makes sure nothing got badly broken due to a sloppy edit, then yes. If you don’t have these things, you’ll accumulate mistakes
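
The shape of that loop, sketched in Python ("make lint" and "make test" are placeholders for whatever analyzer and test suite the project actually has):

```python
import subprocess

# Sketch of the guardrail loop; the commands are placeholders.
def clean_warnings(fix_one_warning):
    while True:
        lint = subprocess.run(["make", "lint"], capture_output=True, text=True)
        warnings = lint.stdout.splitlines()
        if not warnings:
            return
        fix_one_warning(warnings[0])  # e.g. hand one warning to the agent
        # Guard every edit with the test suite before moving on.
        subprocess.run(["make", "test"], check=True)
```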
goalieca · 40m ago
We're talking legacy code bases. Automated test coverage is generally poor.
thewarrior · 2h ago
While I agree with the points he's raising, let me play devil's advocate.

There’s a lot more code being written now that’s not counted in these statistics. A friend of mine vibe coded a writing tool for himself entirely using Gemini canvas.

I regularly vibe code little analyses or scripts in ChatGPT which would have required writing code earlier.

None of these are counted in these statistics.

And yes, AI isn't quite good enough to supercharge app creation end to end. Claude has only been good for a few months. That's hardly enough time for adoption!

This would be like analysing the impact of languages like Perl or Python on software 3 months after their release.

timdiller · 2h ago
I haven't found ChatGPT helpful in speeding up my coding because I don't want to give up understanding the code. If I let ChatGPT do it, then there are inevitable mistakes, and it sometimes hallucinates libraries, etc. I have found it very useful in guiding me through the dev-ops of working with and configuring AWS instances for a blog server, for a git server, etc. As a small business owner, that has been a big time saver.
smjburton · 2h ago
I generally agree with the sentiment of the article, but the OP should also be looking at product launch websites like ProductHunt, where there are tens to hundreds of vibe coded SaaS apps listed daily.

From my experience, it's much easier to get an LLM to generate code for a React/Tailwind CSS web app than a mobile app, and that's why we're seeing so many of these apps showing up in the SaaS space.

mattmanser · 1h ago
I actually just looked, and if anything the Product Hunt data supports his theory, assuming the website I found is scraping the data accurately.

In fact, it looks like fewer products were launched on PH last month than in the same period a year ago.

https://hunted.space/monthly-overview

It's a bit hard to tell since they're not summing by month, but quickly scanning it, it looks like fewer to me.

And as Claude Code has only really been out three or four months, you'd expect launches to be shooting up week by week right about now, as all the vibe-coded products get finished.

They're not; see the 8-week graph:

https://hunted.space/stats

mysterydip · 2h ago
Good article, gave me some points I hadn't considered before. I know there are some AI generated games out there, but maybe the same people were using asset flips before?

I'd also be curious how the numbers look for AI generated videos/images, because social media and youtube seem absolutely flooded with the stuff. Maybe it's because the output doesn't have to "function" like code does?

Grammatical nit: the phrase is "neck and neck", as when two racehorses are running nearly even.

whiterook6 · 2h ago
What the author is missing is a metric that matters more than shipped product: how much happier I am when AI autocomplete saves me typing and figures out what I'm trying to articulate. If devs using Copilot are happier--and I am, at least--then that's value right there.
neilv · 1h ago
There's also the questionable copyright/IP angle.

As an analogy, can you imagine being a startup that hired a developer, and months later finding out that the bulk of the new web app they "coded" for you was actually copy-and-pasted open source code, loosely obfuscated, which they were passing off as something they developed and to which the company had IP rights?

You'd immediately convene the cofounders and a lawyer, about how to make this have never happened.

First you realize that you need to hand the lawyer the evidence (against the employee), and otherwise remove all traces of that code and activity from the company.

Simultaneously, you need to get real developers rushing to rewrite everything without obvious IP taint.

Then one of you will delicately ask whether firing and legal action against the employee is sufficient, or whether the employee needs to sleep with the fishes to keep them quiet.

The lawyer will say this kind of situation isn't within the scope of their practice, but here's the number of a person they refer to only as 'the specialist'.

Soon, not only are you losing the startup, and the LLC is being pierced to go after your personal assets, but you're also personally going to prison. Because you were also too cheap to pay the professional fee for 'the specialist', and you asked ChatGPT to make the employee have a freak industrial shredder accident.

All this because you tried to cheap out, and spend $20 or $200 on getting some kind of code to appear in your repo, while pretending you didn't know where it came from.

falcor84 · 1h ago
That's a fantastic piece of short fiction, but it is fiction. In practice, I've seen so many copy-and-pasted, unsourced open source snippets in proprietary code that I've lost all ability to be surprised by it, and I can't think of a single time the company was sued over it, let alone anyone facing personal repercussions, not even the junior devs. And if anything, by being "lossy encyclopedias" rather than copy-pasters, LLMs significantly reduce this ostensible legal liability.

Oh, and then you have the actual tech giants offering legal commitment to protect you against any copyright claims:

https://blogs.microsoft.com/on-the-issues/2023/09/07/copilot...

https://cloud.google.com/blog/products/ai-machine-learning/p...

neilv · 54m ago
The festival of pillaging open source is suddenly so ubiquitous, and protected by deep-pocketed exploiters selling pillaging shovels, that everyone else is just going to try to get their share of the loot?

You might be right, but the point needs to be made.

InCom-0 · 47m ago
You can copy-paste unsourced open source snippets just fine; ain't nothing (usually) legally wrong with that. Whether anyone should do it is another story, for reasons having nothing to do with open source or licensing.
Vanclief · 2h ago
While I like the self-reflection in this article, I don't think his methodology adds up (pun intended). First, there are two main axes along which LLMs can make you more productive: speed and code quality. I think everyone is obsessed with the first one, but it's the less relevant of the two.

My personal hypothesis is that LLMs only make you faster on things like boilerplate code. For the rest, they don't really make you faster, but they can raise your code quality, which means better implementations and catching bugs earlier. I am a big fan of giving the diff of a commit to an LLM that has a file MCP, so it can search the files in the repo, and having it point out any mistakes I've made.

ksenzee · 2h ago
This doesn’t match my experience. I needed a particularly boilerplate module the other day, for a test implementation of an API, so I asked Gemini to magic one up. It was fairly solid code; I’d have been impressed if it had come from a junior engineer. Unfortunately it had a hard-to-spot defect (an indentation error in an annotation, which the IDE didn’t catch on paste), and by the time I had finished tracking down the issue, I could have written the module myself. That doesn’t seem to me like a code quality improvement.
malfist · 2h ago
I don't know what world you're living in, but quality code isn't a forte of ai
kmnc · 2h ago
No one wants it? If there is no demand, then no one is going to become a supplier. You don’t even want the apps you’re dreaming of building; you wouldn’t use them. If you would, you’d already be using the apps that are available. It’s why developers claim huge benefits but the output is the same: there isn’t much demand for your average software company to push more output; the bottleneck is customer demand. If anything, customer demand is falling because of AI. There is no platform that is blowing up for people to shovel shit to. Everything is saturated; there is no room for shovelware.
balder1991 · 2h ago
The argument doesn’t only apply to creating new todo apps. If the speedup were real, we’d be seeing existing open source tools gaining more and more features, more polished than ever, etc.

Instead, I’m not expecting something like Linux on smartphones to arrive any time soon.

rjsw · 57m ago
The human barrier to Linux on smartphones is that the drivers exist only in old vendor forks of the kernel tree and in Android.

I guess someone could try a prompt of "generate a patch set from Linux tree X to apply to mainline Linux for this CPU".

giantg2 · 1h ago
Until AI can understand business requirements and how they are implemented in code (including integrating with existing systems), it will continue to be overhyped. Devs will hate it, but in 10-15 years someone will figure out that the proper paradigm is to train the AI to build from something like Cucumber-style TDD with comprehensive example tables.
bjackman · 2h ago
There is actually a lot of AI shovelware on Steam. Sort by newest releases and you'll see stuff like a developer releasing 10 puzzle games in one day.

I have the same experience as OP, I use AI every day including coding agents, I like it, it's useful. But it's not transformative to my core work.

I think this comes down to the type of work you're doing: most software engineering isn't in fields amenable to shovelware.

Most of us either work in areas where the coding is intensely brownfield, where AI is great but not doubling anyone's productivity, or in areas where the productivity bottlenecks are nowhere near the code.

elzbardico · 1h ago
Got lots of data in my own work. The mission: Demonstrate the gains of AI to C-level.

Well... no significant effects showed up except in a few projects. It was really hard torturing the data into my manager's desired conclusion.

Kiro · 1h ago
> If so many developers are so extraordinarily productive using these tools, where is the flood of shovelware?

On my computer. Once I've built something I often realize the problems with the idea and abandon the project, so I'm never shipping it.

daxfohl · 1h ago
So is a flood of unshippable code now an indicator of increased productivity?
iamkd · 2h ago
My hunch is that the amount of shovelware (or really, any software) is mostly proportional to the number of engineers wishing to work on it.

Even if AI made them more productive, it's on a person to decide what to build and how to ship, so the number (and desire) of humans is a bottleneck. Maybe at some point AI will start buying up domains and spinning up hundreds of random indiehacker micro-SaaS, but we're not there. Yet.

ge96 · 2h ago
I've already been handed a vibe-coded app, and so far the problems have been communication and code cleanliness, e.g. don't leave two versions of an app lying around without saying which one is active. And the docs, man: so many docs, redundant and conflicting.
back2dafucha · 2h ago
I couldn't give a rat's ass what this industry thinks about me or my skills. I can build whole systems. They can't.
Aeolun · 2h ago
Hmm, I definitely have more issues with AI-generated code than I would if I wrote it all manually, but the typing it saves me may make up for the lost time by itself.
stillpointlab · 2h ago
> We all know that the industry has taken a step back in terms of code quality by at least a decade. Hardly anyone tests anymore.

I see pseudo-scientific claims from both sides of this debate, but this is a bit too far for me personally. "We all know" sounds like Eternal September [1] reasoning. I've been in the industry about as long as the article's author, and I think he might be looking at the past through rose-tinted glasses. Every aging generation looks down on the new cohort as if they didn't go through the same growing pains.

But in defense of this polemic, and laying my cards on the table as an AI maximalist and massive proponent of AI coding, I've been wondering the same thing. I see articles all the time about people writing this or that software with these new tools, and so often they never actually share what they built. I can understand if someone is heads-down cranking out amazing software with 10 Claude Code instances and raking in the cash. But not seeing even one open source project that embraces this and demonstrates it is a bit suspicious.

I mean, where is: "I rewrote Redis from scratch using Claude Code and here is the repo"?

1. https://en.wikipedia.org/wiki/Eternal_September

techpineapple · 1h ago
> I mean, where is: "I rewrote Redis from scratch using Claude Code and here is the repo"?

This is one of my big data points for skepticism: there are all these articles about individual developers doing amazing things, but almost no data showing an increase in overall output as a result.

paulhodge · 2h ago
I think different things are happening...

For experienced engineers, I'm seeing (internally in our company at least) a huge amount of caution and hesitancy to go all-in with AI. No one wants to end up maintaining huge codebases of slop code. I think that will shift over time. There are use cases where having quick low-quality code is fine. We need a new intuition about when to insist on handcrafted code, and when to just vibecode.

Non-experienced engineers currently hit a lot of complexity limits getting a finished product to actually work, unless they're building something extremely simple. That will also shift - the range of what you can vibecode is increasing every year. Last year there was basically nothing you could vibecode successfully; this year you can vibecode TODO apps and the like. I definitely think the App Store will be flooded in the coming years. It's just early.

Personally I have a side project where I'm using Claude & Codex and I definitely feel a measurable difference, it's about a 3x to 5x productivity boost IMO.

The summary: just because we don't see it yet doesn't mean it's not coming.

copperx · 1h ago
If I've learned anything about humans in my 40+ years of being alive, it's that in the long term, convenience trumps all other considerations.
bcrosby95 · 47m ago
I think it depends.

There are very simple apps I've tried to vibe code that AI cannot handle. It seems very good in certain domains and complete shit in others.

For example, I hand-wrote a simulation in C in just 900 LOC. I wrote a spec for it and tried to vibe code it in other languages, because I wanted to compare different languages and concurrency strategies. Every LLM I've tried fails, and manages to write 2x+ as much code even in comparatively succinct languages such as Clojure.

I can totally see why people writing small utilities or simple apps in certain domains think it's a miracle. But when it comes to things like games, it seems like a complete flop.

NooneAtAll3 · 1h ago
I wish that first bar graph was log scale...
wewewedxfgdf · 2h ago
If the number of "Show HNs" has gone up, that might be a data point.
lazarus01 · 1h ago
> demand they show receipts or shut the fuck up

This is what I always look for, and I haven't found one salient success story that comes with receipts.

lowbloodsugar · 58m ago
Turns out AI can’t help script kiddies write production ready applications. Also turns out that AI is good for some things and not others, and a coin toss isn’t a good method to decide which tasks to do using AI. I read that JavaScript is by far the most popular language: still not using it for the mission critical software I write. So it doesn’t bother me that 90% of HN is “AI sucks!” stories. I find it extremely effective when used appropriately. YMMV.
scotty79 · 1h ago
How widely is AI adopted in the wider IT industry anyway? I imagine a $200-per-month subscription isn't that popular with people who refuse to pay for their IDEs and go with free alternatives instead. And a month's worth of an AI agent's free tier can be burned through in two intense evenings.

So who pays for developers' AI? Mostly corpos. And the speed of the individual developer was never the limiting factor in corpos. Average corporate development was always 10 times slower than indie, so even doubling it won't make any impression.

I don't know if I'm faster with AI at a specific task, but I know that I'm doing things I wouldn't touch because I hate the tedium. And I'm doing them while cooking and eating dinner and thinking about wider context and next things to come. So for me it feels worth it.

I think it might be something like cars and safety: any safety improvement gets offset by drivers driving faster and more recklessly. So maybe any speed improvement AI brings to a project is nullified by developers taking on things they would otherwise just skip.

groby_b · 2h ago
I think the author misses a few points

* METR was at best a flawed study; repo familiarity and tool unfamiliarity are the biggest points of critique, but far from the only ones

* They assume that all code gets shipped as a product. Meanwhile, AI code has (at least in my field of view) led to a proliferation of useful-but-never-shipped one-off tools. Random dashboards to visualize complex queries, scripts to drive refactors, or just sheer joy like "I want to generate an SVG of my vacation trip and consume 15 data sources and give it a certain look".

* Their own self-experiment is not exactly statistically sound :)

That still leaves the fact that we aren't seeing AI shovelware. I'm still convinced that's because commercially viable software is beyond the AI complexity horizon, not because AI isn't an extremely useful tool.

flyinglizard · 1h ago
I get excellent productivity gains from AI. Not everywhere, and not linearly. It makes the bad parts of the work (boilerplate, dealing with things outside my specialties) tolerable and the good parts a bit better. It makes me want to create more. Business guys missing some visualization? Hell, why not: a few minutes in Aider and it's there. Let's improve our test suites. And let's migrate away from that legacy framework or runtime!

But my workflow is anything but "let her rip". It's very calculated and orderly, just like mastering any other tool, and I'm always in the loop. I can't imagine someone without serious experience getting good results, and when things go bad, oh boy, you're bringing a firehose of crap into your org.

I have a junior programmer who's a bright kid but lacking a lot of depth. Got him a Cursor subscription; I'm tracking his code closely via PRs and calling out the BS, but we're getting serious work done.

I just can't see how this new situation calls for fewer programmers. It will just bring about more software, and surely more capable software, after everyone adjusts.

degamad · 1h ago
> It will just bring about more software

But according to the graphs in the article, after three years of LLM chatbots and coding assistants, we're seeing exactly the same rate of new software...

vFunct · 1h ago
From the post: if AI was supposed to make everyone 25% more productive, then a 4-month project becomes roughly a 3-month project (4 ÷ 1.25 ≈ 3.2 months). It doesn't become a 1-day project.

Was the author trying to make games and other apps in 30 hours each? Because each of those sounds like a 4-month project.

UltraSane · 2h ago
I find LLMs useful to decide what is the best option to solve a problem and see some example code.
gyanchawdhary · 1h ago
Punch card programmers mocked assembly: “Too high-level, real programmers know machine code!”

Assembly programmers mocked C: “It hides what the CPU is doing, you’ll never optimize properly!”

C/C++ programmers mocked Java & Python: “Garbage collection? That’s for people who don’t understand memory management!”

Web developers mocked JavaScript frameworks: “Real engineers write everything in raw JS!”

2025 Developer: I'm furious because the narrative of AI coding tools dramatically boosting developer productivity is false

techpineapple · 1h ago
One of these things is not like the other.
huflungdung · 2h ago
Stupid article. It can be summed up in one sentence: "AI creates fantastic code but cannot market it for you or find a novel USP."