These claims wouldn't matter if the topic weren't so deadly serious. Tech leaders everywhere are buying into the FOMO, convinced their competitors are getting massive gains they're missing out on. This drives them to rebrand as AI-First companies, justify layoffs with newfound productivity narratives, and lowball developer salaries under the assumption that AI has fundamentally changed the value equation.
This is my biggest problem right now. The types of problems I'm trying to solve at work require careful planning and execution, and AI has not been helpful with them in the slightest. My manager told me that the time to deliver my latest project was cut to 20% of the original estimate because we are "an AI-first company". The mass hysteria among SVPs and PMs is absolutely insane right now; I've never seen anything like it.
culopatin · 1d ago
Today a friend of mine connected me with his uncle who wanted to develop an MVP for his company.
He claimed he didn’t want to distract his engineers with this project, but that “it shouldn’t take you more than 5hs to do this with vibe coding”. I promptly declined: if it takes 5 hours, why are you reaching out to me? It would take more than 5 hours just to bring me up to speed on what you want, versus using your own engineers. If vibe coding is that good, you might as well DIY while you’re showering!
noduerme · 1d ago
These kinds of "clients" have always been around. If they want to tell me how long they think it should take, my answer has always been that they should do it themselves. If they say something like "it shouldn't take too long," I say "send me what you need it to do." Then I look at that and ask "what if A, B or C happens? What if a user does X?" I make a list about 15 bullet points long of edge cases they hadn't thought of, showing the flaws in their business logic. (This, btw, is what I'd do if I were instructing an LLM as well.) Then it's, "well, how long will that take?"
And my answer - like the best of car mechanics who work on custom rides - is: I don't know how long until I actually get in there, but minimum 5x what you think, and my rate is $300/hr. Is it worth it to you to do it right?
Usually the answer is no. When it's yes, I have a free hand. And having a few clients who pay well is worth a lot more than having a few dozen who think they know everything and are too cheap to pay for it anyway.
Wowfunhappy · 1d ago
I have fully vibe coded several apps at this point (none professionally, but stuff I actively use in my life). One thing I don't think everyone understands is that it still takes time. I need to do countless rounds of testing, describing the exact problem I find, rinse and repeat again and again for hours and hours. I happen to enjoy this way of working and I do think it's faster than writing the code myself, but it's not fast.
some-guy · 1d ago
Not only that, but the requirements are far more lax when it's your own project. In an enterprise setting on a 14-year-old codebase (where this 80%-reduced-timeframe project lives), vibe coding doesn't work at all! PMs and managers simply do not understand the nuances of these tools.
patapong · 1d ago
Yep, agreed... It's a bit like panning for gold: most of what it produces is crap, but if you stay with it long enough and put in enough effort you will eventually have something that works well.
Part of the issue is that it is so fast to get to a working prototype that it feels like you are almost there, but there is a lot of effort to go.
yetihehe · 23h ago
> that it feels like you are almost there, but there is a lot of effort to go.
People have built tools specifically to make mockups look like mockups rather than a finished product, precisely because clients were ready to ship the moment they saw realistic-looking interface mockups.
Sharlin · 20h ago
We devs always (justifiably) complain about how the PHBs of the world go all "looks great, now ship it" when presented with a minimal prototype hacked together in a week, missing 95% of the time-consuming parts. Apparently it’s easy to fall into the same trap even when you should know better…
dangus · 1d ago
I find that the main way it saves time is in situations where yes I know how to write code but I am looking to write something in a domain I’m less familiar with.
And regex.
ta12653421 · 5h ago
++1
duxup · 1d ago
If it's 5 hours, why not distract his own engineers?
That's pretty much nothing ... to me, that line hints at a whole lot of other things going on.
materielle · 1d ago
My opinion is that AI isn’t actually the root of the problem here.
It’s that we are heading towards a big recession.
As in all recessions, people come up with all sorts of reasons why everything is fine until it can’t be denied anymore. This time, AI was a useful narrative to have lying around.
osigurdson · 1d ago
I think a kind of AI complacency has set in. Companies are just in chill mode right now, laying off people here and there while waiting for AI to get good enough to do actual work.
baselessness · 1d ago
Everyone is bracing for a labor supply shock. It will move in the direction opposite what investors expect.
2030 will be 2020 all over again.
denkmoon · 1d ago
Why?
rsynnott · 1d ago
If (a) companies lay too many people off because the magic robots will make engineers unnecessary and (b) the pipeline collapses, because being a software engineer is an undesirable career because it is being replaced by robots and (c) it emerges that the robots are kinda bullshit, then there's going to be one hell of a shortage.
When I started a CS degree in 2003, we were still kinda in the "dot com crash has happened, no-one will ever hire a programmer again" phase, and there were about 50 people in my starting class. I think in the same course two years ago, there'd been about 200. The 'correct' number, for actual future demand, was certainly closer to 200 than 50, and the industry as a whole had a bit of a labour crunch in the early 10s in particular.
InsideOutSanta · 1d ago
I believe we are vastly underestimating the number of programmers needed, as some companies reap unusually high rewards from hiring programmers. Companies like Google can pay huge sums of money to programmers because they make even higher sums of money from the programmer's work.
This means that they inflate programmer salaries, which makes it impossible for most companies that could benefit from software development to hire developers.
We could probably have five times as many software developers as we have now, and they would not be out of work; they would only decrease average salaries for programmers.
phatskat · 1h ago
But if only Google or similarly sized companies can pay that well, and there are tons of programmers, the average salary will obviously settle lower than what Google pays, while still being competitive for the thousands of programmers who didn’t get hired at Google.
camdenreslink · 1d ago
If a company could benefit from software developers but can’t afford them, then they can purchase Saas offerings written by companies that can afford developers. I don’t think we’ve run out of opportunities to improve the business world with software quite yet.
InsideOutSanta · 1d ago
The fact that there is a market for these products, but they are almost universally terrible, supports my point.
osigurdson · 1d ago
I think it might be worse than that, as staff reductions are across the board, not just in software development roles. My hope is that startup creation will be unprecedented, taking advantage of the complacency. They will wonder why AI deleted their customers when they thought it was supposed to delete their employees.
DrillShopper · 1d ago
Holding on for that sweet sweet pay bump after the coming AI winter
ModernMech · 22h ago
Combine a bunch of factors:
1) fewer students are studying computer science; I'm faculty at a top CS program and we saw our enrollment decline for the first time ever, and other universities are seeing similar slowdowns in enrollment [1]
2) fewer immigrants coming to the United States to work and live; the US is perhaps looking at its first population decline ever [2]
3) Current juniors are being stunted by AI, they will not develop the necessary skills to become seniors.
4) Seniors retiring faster because they don't want to have to deal with this AI crap, taking their knowledge with them.
So we're looking at a negative bubble forming in the software engineering expertise pipeline. The money people are hoping that AI can become proficient enough to fill that space before everything bursts. Engineers, per usual, are pointing out the problem before it becomes one, and no one is listening.
I have never, ever seen SVPs, CEOs, and PMs so completely misunderstand a technology before. And I agree with you; I think it's more of an excuse to trim fat. Actual productivity is unlikely to go up (it hasn't at our Fortune 500 company).
osigurdson · 22h ago
>> productivity is unlikely to go up
I wonder how that would even be measured? I suppose you could do it for roles that do the same type of work every day; perhaps there is some statistical relevance to the number of calls taken per day in a call center, or something like that. On the software development side, however, productivity metrics are very hard to quantify. Of course, you can make a dashboard look however you want, but it is essentially impossible to tie those metrics to NPV.
isodev · 1d ago
> we are heading towards a big recession
Who is we? One country heading into a recession is hardly enough to nudge the trend of "all things code"
viridian · 1d ago
The last US recession that didn't also pull in the rest of the western world was in 1982, over 40 years ago. Western Europe, Aus, NZ, Canada, and the US all largely rise and sink on the same tides, with differences measured in degrees.
dragonwriter · 12h ago
If that “one country” is the US and not, say, Burkina Faso, it is a major impact on financing, and software has an unusually high share of positions dependent on speculative investment for future return rather than directly related to current operations.
colechristensen · 1d ago
America's recessions are global recessions.
brianmcc · 1d ago
Sadly yes - "When America sneezes, the World catches cold"
danaris · 1d ago
Enough of the tech industry is America-based that a US recession is enough to do much more than nudge the trend of "all things code". Much as I would prefer that it were not so.
aprilthird2021 · 1d ago
The first thing I thought of was Benioff saying he cut thousands of customer support roles because AI can do it better, then turning around and giving a lackluster earnings report with revised-down guidance, and the stock tanking.
rootusrootus · 1d ago
> My manager told me that the time to deliver my latest project was cut to 20% of the original estimate
That's insane. Who the hell pulls a number out of their ass and declares it the new reality? When it doesn't happen, he'll pin the blame on you, but everyone else above will pin the blame on him. He's the one who will get fired.
Laying off unnecessary developers is the answer if LLMs turn out to make us all so much more productive (assuming we don't just increase the amount of software written instead). But that happens after successful implementation of LLMs into the development process, not in advance.
Starting to think I should do the inadvisable and move my investments far far away from the S&P 500 and into something that will survive the hype crash that can't be too far off now.
torginus · 1d ago
The whole 'startup/scaleup' culture (whose companies have become the industry titans of today) is insane, and has been for as long as it has been a thing: either 'just grow and figure out how to monetize' (social media, food delivery, etc.), or 'we're selling you technology that doesn't exist yet but is going to be insanely valuable in the future' (AGI, self-driving), or 'we're selling shovels to the first two' (cloud providers, Nvidia).
I'd argue that compared to 10 or 15 years ago, relatively little value has been created. If you sat down in front of a 15-year-old computer, or tried to solve a technical challenge with the tooling of 10-15 years ago, I don't think you'd get a significantly worse result.
Yet in this time the US has doubled its GDP, most of it flowing to the top: to the tech professionals and financiers who benefited from this.
And some of this money went into assets with constrained supply, such as housing, driving up prices even adjusted for inflation and making average people that much worse off.
While I do feel society is making progress, it's been a slow and steady march, which in relative terms means slowing. If I gave you $10 every week, by week 2 you'd have doubled your wealth, but by the end of the year you'd barely notice the increase.
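A toy sketch of that arithmetic (my illustration, with made-up numbers):

    # A fixed absolute gain shrinks relative to what you've already accumulated.
    wealth = 10  # week 1: the first $10
    for week in range(2, 53):
        growth = 10 / wealth  # this week's $10 relative to the running total
        wealth += 10
        if week in (2, 10, 52):
            print(f"week {week:2d}: +{growth:.0%} -> total ${wealth}")
    # week  2: +100% -> total $20
    # week 10: +11% -> total $100
    # week 52: +2% -> total $520

The absolute gain never changes; the felt improvement decays toward nothing.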
Technology accumulation is the same, I'd argue it's even worse, since building on top of existing stuff has a cost proportional to the complexity for a fixed unit of gain (and features get proportionally less valuable as you implement the most important ones first).
Sorry, got distracted from my main point: what happens when people stop believing that these improvements are meaningful, or that technology that was priced in to produce 100x the value will arrive at all (and, more importantly, that the company you're invested in will be able to capture it)?
dangus · 1d ago
> If you sat down in front of a 15-year-old computer, or tried to solve a technical challenge with the tooling of 10-15 years ago, I don't think you'd get a significantly worse result.
While you have decent points in your comment (essentially, the idea of tech industry growth slowing due to low-hanging fruit being picked), if this statement is going to be your barometer, you're going to end up looking stupendously wrong.
You can sit your Grandma down at her computer and have her type in “please make me a website for my sewing club that includes a sign up form and has a pink background” and AI will just do it and it’ll probably even work the first time you run it.
15 years ago tossing a website on Heroku was a pretty new concept, and you definitely had to write it all on your own.
10 years ago Kubernetes had its initial release.
Google Drive and Slack are not even 15 years old.
TensorFlow is just hitting its 10th birthday.
I think you’re vastly underestimating the last 15 years of technological progress and in turn value creation.
ndriscoll · 1d ago
I just tried it and no, you can't. I don't really even see how you could. Where is the form supposed to go? Is Grandma supposed to rent a server somewhere to do processing? Where?
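To make that concrete, the "sign up form" part alone implies a process running somewhere to receive the POST. A minimal sketch using only Python's standard library (the file name and port are illustrative):

    # Even a "simple" sign-up form needs a live server process somewhere.
    from http.server import BaseHTTPRequestHandler, HTTPServer
    from urllib.parse import parse_qs

    class SignupHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            fields = parse_qs(self.rfile.read(length).decode())
            with open("signups.txt", "a") as f:  # illustrative storage
                f.write(fields.get("name", [""])[0] + "\n")
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"Thanks for signing up!")

    HTTPServer(("", 8000), SignupHandler).serve_forever()

An AI can emit that in seconds, but it doesn't answer the question the code raises: which machine runs it, and who keeps it running?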
25 years ago we had WYSIWYG editors to build web pages, and she could just include a link to email her so people could ask to "sign up" (the entire point of a sewing club is to socialize; you don't need an automated newsletter or whatever - that's for people who want to sell you something). You'd put it into the space your ISP gave you, or use the personal web server Windows actually included. You had to be somewhat technical to use the Windows server, but it could've been made more friendly instead of being removed, and personal computers should make it easy to share a static website.
We've clearly regressed since then. People think you need an engineer or AI to build basic web pages. My actual grandma was doing some of this stuff when I was a kid in the 90s, as was I.
What happened to the web seems analogous to abandoning word processors and spreadsheets and saying you need a six-figure engineer to do custom programming any time you need the features of one of those things.
ansgri · 20h ago
100% this. At that time, if one needed a simple custom database client with a GUI and analytical reports, the go-to solution was MS Access, and it worked well. It could even be placed on a shared drive. Now the accepted solution is to either pay for several SaaS products, each with a fraction of the functionality, or code a bespoke website using overcomplicated frameworks.
dangus · 8h ago
MS Access still exists and is still maintained and offered for sale.
You should not look back on those solutions and neglect to acknowledge their limitations. Access can only handle small databases (maximum 2GB) and about 100 simultaneous connections.
Access is basically a product of the technological limitations of its time. It isn’t designed the way it is because that way is the best design.
The kind of business that relied on Access is going to find that solutions like Airtable are far easier and more powerful, and certainly less prone to data loss than mom and pop’s shared drive. There are even open source self-hosted alternatives like NocoDB and Baserow (or cloud hosted).
You’ll inevitably complain about the subscription cost of SaaS solutions but it’s not like MS Office business licenses were ever cheap.
dangus · 9h ago
The thing is that you and grandma were only accomplishing static sites back then. You can still do that today with many more options than you had back then and you can even use many of the exact same methods.
But when it comes to actually accomplishing something with application logic on the web, Grandma could ask the AI agent literally all these questions and get solid answers. At any point you get confused you can just ask what to do. You can highlight a line of code and ask “what does this mean?” You can even say “I want this web page to be put online what do I do?”
Beats waiting for a human to answer you.
You’re also taking my example application way too literally. No I don’t know why Grandma needs a signup form, I just couldn’t think of a web app off the top of my head.
MS Access and WYSIWYG tools like FrontPage and iWeb were not good. I know because I was there, and I used FrontPage at work. A link to your email (get spammed forever) is not a replacement for an application or an email form. The whole reason code is preferred over WYSIWYG is that you inevitably have change-management issues even on simple personal projects, which is why static site generators have gained popularity. I’m sure your grandma could have handled markdown + someone else’s Hugo template + Netlify.
Hell, if we want to talk about progress in WYSIWYG editors, Trix greatly modernized that process and addressed its shortfalls, and it launched in 2014, only about a decade ago. So even in the world of WYSIWYG we have better tools now than before.
IIS has not been removed from Windows home edition, by the way.
dep_b · 1d ago
25 years ago I uploaded zip files with my PHP update to overwrite my older site, and possibly I had to do a MySQL migration before I could go live again. And when I don't have to care about anybody else's professional standards, that's still how I make simple websites.
Apart from doing the styles and layout, I don't think current tools have less friction. They're a lot safer though. Can't say I never dropped a production database.
ViewTrick1002 · 1d ago
Kind of missing the existence of WordPress? A random theme + a "form builder" plugin and it would be done, with hosting taken care of wherever you created the site.
dangus · 7h ago
I really just didn’t bring it up because it’s more than 15 years old and spans the whole evolution of the web.
I definitely agree that it’s a reasonable choice.
insane_dreamer · 22h ago
> vastly underestimating the last 15 years of technological progress
based on your examples, I'd say you're vastly overestimating
sure, those are all new technologies, but it's not like you couldn't do those things before in different ways, and none of them are revolutionary
ModernMech · 22h ago
Their point was that we don't need things like Heroku, Kubernetes, Slack, TensorFlow, etc. They're not creating value; they're propping up a tech stack whose value is questionable given the amount we've invested in it. It seems that over the past 15 years of tech, the end result is that a few companies and people became fabulously wealthy, and the rest of us are pretty much worse off. Tech isn't transforming our lives or society the way tech companies promised it would 15 years ago.
As for grandma, 15 years ago she could have just posted her sewing club on Facebook; she doesn't need Heroku or AI.
dangus · 8h ago
I think you’re willfully ignorant if you’re going to claim that Slack and Kubernetes haven’t created any value. And of course these are just random examples that I happened to think of.
I would say that these technologies/products being so wildly popular puts the burden of proof on you to show me some kind of evidence that they aren’t productive. Are you trying to say that something handled high-complexity application infrastructure deployments for large enterprise companies with better deployment velocity and reliability than Kubernetes? What was it? Why was it better than Kubernetes?
The analogy is that you’re basically saying that zippers aren’t really better than buttons but then literally everyone is overwhelmingly wearing pants and coats with zippers and very strongly prefer zippers. So really it’s on you to prove to me that I should be using pants with buttons instead.
Finally, there’s a lot of irony in your first paragraph complaining about a few tech oligarchs becoming fabulously wealthy and then suggesting that Grandma just use Facebook instead of building her own site. In any event, my web app example was just a poorly thought out example of a web app, I really just mean a website that has a little more utility than a static site.
ModernMech · 2h ago
> I think you’re willfully ignorant if you’re going to claim that Slack and Kubernetes haven’t created any value.
The juice hasn't been worth the squeeze. You can look at nearly every societal indicator except the stock market pointing downward to reach that conclusion. Nothing is actually better than it was in 2010 despite Uber, Airbnb, Kubernetes, Slack, and all the other SV tech "innovations". People are not happier or wealthier because of the tech coming from Silicon Valley. In general, the end result of the last 15 years of tech is that it's made us more neurotic, disconnected, depressed, and angry.
We don't need "better deployment velocity and reliability for high-complexity application infrastructure deployments for large enterprises". Listen to yourself man, you sound like you've been hypnotized by the pointy-haired boss. The tech sector makes false promises about a utopia future, and then it just delivers wealth for shareholders, leaving everyone else worse off.
Grandma especially doesn't need deployment velocity, she's being evicted by her landlord because he wants to turn her flat into an Airbnb. She can't get to the grocery store because the town won't invest in public transport and Uber is the only option. She's been radicalized by Meta and Youtube and now she hates her own trans grandchild because her social media feed keeps her algorithmically outraged. Oh, and now she's getting scammed by AI asking her to invest her life savings in shitcoins and NFTs.
> The analogy is that you’re basically saying that zippers aren’t really better than buttons but then literally everyone is overwhelmingly wearing pants and coats with zippers and very strongly prefer zippers.
I don't agree that the ubiquity and utility are necessarily correlated, so I don't see the zippers and Kubernetes as analogous.
But the proliferation of zippers has more to do with the fact they are easier for manufacturers to integrate into products compared to buttons -- they come pre-measured and installing them is a straight stitch that can be done with a machine, whereas installing buttons is more time-consuming.
Zippers are worse for consumers in many ways, repairability chief among them. But really they are part of a general trend over my lifetime of steadily falling garment quality, as manufacturers race to the bottom.
> In any event, my web app example was just a poorly thought out example of a web app, I really just mean a website that has a little more utility than a static site.
You said it, not me. We had the technology to throw up a static site in 2010 and my grandmother could actually do that with dreamweaver and FTP, and it worked fine.
queenkjuul · 17h ago
I actually had a lot of fun building a native .NET web frontend in VB 2005 recently, lol. I thought it was kind of amazing that I could just bind UI controls directly to state objects and the UI would automatically React to any changes I made. Felt very natural as a modern web dev, lol. Found a lightweight .NET JSON library that was compatible all the way back to VB 2005 as well.
In case you also need to control Spotify from Windows 95 :D
> That's insane. Who the hell pulls a number out of their ass and declares it the new reality?
Product and Sales?
Ekaros · 1d ago
Why don't we cut sales commissions by 30% and expect double the sales now? Surely LLMs will make salespeople that much more effective, and they'll still make more.
iamacyborg · 1d ago
I see you’ve met the senior leadership at my current employer
toomuchtodo · 1d ago
VXUS
Not investing advice; the bottom 490 companies in the S&P 500 are nominally flat since 2022 and down against inflation. GPUs and AI hype are holding everything together at the moment.
> In simpler terms, 35% of the US stock market is held up by five or six companies buying GPUs. If NVIDIA's growth story stumbles, it will reverberate through the rest of the Magnificent 7, making them rely on their own AI trade stories.
> Capex spending for AI contributed more to growth in the U.S. economy in the past two quarters than all of consumer spending, says Neil Dutta, head of economic research at Renaissance Macro Research, citing data from the Bureau of Economic Analysis.
> Two Nvidia customers made up 39% of Nvidia’s revenue in its July quarter, the company revealed in a financial filing on Wednesday, raising concerns about the concentration of the chipmaker’s clientele.
In some cases, I'm sure it would play that way. But I've been on both sides, and most places I've worked have been more reluctant to fire engineers than managers.
theandrewbailey · 1d ago
He will fire you before he gets fired.
InsideOutSanta · 1d ago
Here's a new keyboard. I've cut all your estimates by five percent; surely you can type much faster with this.
y1n0 · 1d ago
> That's insane. Who the hell pulls a number out of their ass and declares it the new reality?
Chatgpt.
klodolph · 1d ago
I think ChatGPT isn’t the “who”, it’s just the ass that people are pulling numbers out of. A big ole extra butt you graft onto your body.
kunley · 1d ago
Them managers have always been pulling a number out of their ass.
Seattle3503 · 1d ago
> My manager told me that the time to deliver my latest project was cut to 20% of the original estimate because we are "an AI-first company".
If we can delegate incident response to automated LLMs too, sure, why not? Let the CEO have his way and pay the reputational price. When it doesn't work, we can revert our git repos to the last day before LLMs wrote all the code.
I'm only being 90% facetious.
bitwize · 1d ago
The CEO doesn't care. He'll fail upwards, spin it as increasing shareholder value by cutting costs, and bounce before the chickens come home to roost.
vorpalhex · 1d ago
I agree with you and I'm being 0% facetious.
I think making stakeholders engage with these models themselves is the most critical step for anyone whose deadlines or expectations are based on them.
Let Claude run incident response for a few weeks. I'll gladly pause pagerduty for myself.
rglover · 1d ago
> My manager told me that the time to deliver my latest project was cut to 20% of the original estimate because we are "an AI-first company".
I hadn’t heard this one before; this is incredible. Thanks for sharing. There were a bunch of phenomena that didn’t quite make sense to me before, which make perfect sense now that I’ve read the quote.
"Bobby Lehman is ninety three years old and he dances the twist. He is 100 years old! 120! Maybe 140! He dances like a madman!"
herpdyderp · 1d ago
Oh, they for sure know what they're doing.
bsder · 1d ago
Do not forgive them. We already have a description for them:
"A bunch of mindless jerks who'll be the first against the wall when the revolution comes."
o11c · 1d ago
Remember, the origin of that quote explicitly specifies "marketing department".
The thing about hype cycles (including AI) is that the marketing department manages to convince the purchasers to do their job for them.
RichardCA · 18h ago
Ironically, I would far prefer the Douglas Adams idea of "Genuine People Personalities" over the current status quo.
If the self-checkout scanner at the supermarket started bickering with me for entering the wrong produce code, that would wrap up the whole Turing Test thing for me.
atleastoptimal · 1d ago
I think this hits at the heart of why you and so many people on HN hate AI.
You see yourselves as the disenfranchised proletarians of tech, crusading righteously against AI companies and myopic, trend-chasing managers, resentful of their apparent success at replacing your hard-earned skill with an API call.
It’s an emotional argument, born of tribalism. I’d find it easier to believe many claims on this site that AI is all a big scam and such if it weren’t so obvious that this underlies your very motivated reasoning. It is a big mirage of angst that causes people on here to clamor with perfunctory praise around every blog post claiming that AI companies are unprofitable, AI is useless, etc.
Think about why you believe the things you believe. Are you motivated by reason, or resentment?
nemomarx · 1d ago
Find a way to make sure workers, instead of bosses, get the value of AI labor, and the workers will like it better. If the result is "you do the same work but managers want everything in 20% of the time," why would anyone be happy?
atleastoptimal · 1d ago
I agree that if there are productivity gains, everyone should benefit, but the only thing that would make this happen is systems and incentive structures that allow it. A manager's job is to increase revenue and cut costs; that's how they get their job, how they keep their job, and how they are promoted. People very rarely get free benefits beyond what the incentive structures they exist in allow.
UncleMeat · 1d ago
> I agree that if there are productivity gains, everyone should benefit
And if they don't, then you'd understand the anger surely. You can't say "well obviously everybody should benefit" and then also scold the people who are mad that everybody isn't benefiting.
atleastoptimal · 1d ago
I’m not scolding anyone who is mad that not everyone is benefiting.
UncleMeat · 1d ago
What are you doing then?
queenkjuul · 16h ago
I mean, you certainly fooled us all on that front.
ozgrakkurt · 1d ago
And people don’t like this. Something being logical doesn’t mean people have to accept it.
Also, AI has been basically useless every time I've tried it, except for converting some struct definitions across languages and similar tasks; it seems very unlikely that it would boost productivity by more than 10%, let alone 400%.
atleastoptimal · 1d ago
What AI coding tools/models have you been using?
spc476 · 1d ago
Let me guess ... they're holding it wrong, and the model they're using is older than 20 minutes.
atleastoptimal · 1d ago
You’re assuming how I would respond before I even respond. Please allow inquiries to happen naturally without polluting the thread with meritless cynicism.
jcranmer · 1d ago
With all due respect, with a response like "What AI coding tools/models have you been using?" to a complaint that AI tools just don't seem to be effective, what difference does a reply to that even make? If your experience makes you believe that certain tools are particularly good--or particularly bad--for the tasks at hand, you can just volunteer those specifics.
FWIW, my own experiences with AI have ranged from mediocre to downright abysmal. And, no, I don't know which models the tools were using. I'm rather annoyed that it seems to be impossible to express a negative opinion about the value of AI without having to have a thoroughly documented experiment that inevitably invites the response that obviously some parameter was chosen incorrectly, while the people claiming how good it is get to be all offended when someone asks them to maybe show their work a little bit.
atleastoptimal · 1d ago
Some people complain about AI but are using the free version of ChatGPT. Others are using the best models without a middleman system but still see faults, and I think it's valuable to ask which domains they see no value from AI in. There are too many people saying "I tried AI and it didn't work at all" without clarifying what models, what tools, what they asked it to do, etc. Without that context, it's hard to gauge any value judgment on AI.
It’s like saying “I drove a car and it was horrible, cars suck” without clarifying what car, the age, the make, how much experience that person had driving, etc. Of course its more difficult to provide specifics than just say it was good or bad, but there is little value in claims that AI is altogether bad when you don’t offer any details about what it is specifically bad at and how.
jcranmer · 1d ago
> It’s like saying “I drove a car and it was horrible, cars suck” without clarifying what car, the age, the make, how much experience that person had driving, etc.
That's an interesting comparison. That kind of statement can be reasonably inferred to be made by someone just learning to drive who doesn't like the experience of driving. And if I were a motorhead trying to convert that person to like driving, my first questions wouldn't be those questions, trying to interrogate them on their exact scenario to invalidate their results, but instead to question what aspect of driving they don't like to see if I could work out a fix for them that would meaningfully change their experience (and not being a motorhead, the only thing I can think of is maybe automatic versus manual transmission).
> there is little value in claims that AI is altogether bad when you don’t offer any details about what it is specifically bad at and how.
Also, do remember that this holds true when you s/bad/good/g.
KronisLV · 1d ago
> With all due respect, with a response like "What AI coding tools/models have you been using?" to a complaint that AI tools just don't seem to be effective, what difference does a reply to that even make?
"Damn, these relational databases really suck, I don't know why anyone would use them, some of the data from my users had emojis in it and it totally broke! Furthermore, I have some bits of data that have about 100-200 columns and the database doesn't work well at all, that's horrible!"
In some cases, knowing more details could help. In the database example, a person historically using MySQL 5.5 could have had a pretty bad experience, in which case telling them to use something more recent, or PostgreSQL, would have been pretty helpful.
In other cases, they're literally just holding it wrong, for example trying to use a RDBMS for something where a column store would be a bit better.
Replace the DB example with AI; the same principles are at play. Hearing people blame all of the tools, when some are clearly better or worse than others, or make broad statements that cannot really be proven or disproven with the given information, is just as annoying as people always asking for more details. I honestly believe that all of these AI discussions should be had with as much data present as possible - both the bad and the good experiences.
> If your experience makes you believe that certain tools are particularly good--or particularly bad--for the tasks at hand, you can just volunteer those specifics.
My personal experience:
* most self-hosted models kind of suck, use cloud ones unless you can get really beefy hardware (e.g. waste a lot of money on them)
* most free models also aren't very good, nor have that much context space
* some paid models also suck, the likes of Mistral (like what they're doing, just not very good at it), or most mini/flash models
* around Gemini 2.5 Pro and Claude Sonnet 4 they start getting somewhat decent, GPT 5 feels a bit slow and like it "thinks" too much
* regardless of what you do, you still have to babysit them a lot of the time; they might take some of the cognitive load off, but they usually won't make you 10x faster - the real gains come from reduced development friction (esp. when starting new work items)
* regardless of what you do, they will still screw up quite a bit, much like a lot of human devs do out there - having a loop of tests will be pretty much mandatory, e.g. scripts that run the test suite and also the compilation
* agentic tools like RooCode feel like they make them less useless, as do good descriptions of what you want to do - references to existing files and patterns etc., normally throwing some developer documentation and ADRs at them should be enough but most places straight up don't have any of that, so feeding in a bunch of code is a must
* expect usage of around 100-200 USD per month for API calls if the rate limits of regular subscriptions are too limiting
Are they worth it? Depends. The more boilerplate and boring bullshit code you have to write, the better they'll do. Go off the beaten path (e.g. not your typical CRUD webapp) and they'll make a mess more often. That said, I still find them useful for the reduced boilerplate, reduced cognitive load, as well as them being able to ingest and process information more quickly than I can - since they have more working memory and the ability to spot patterns when working on a change that impacts 20-30 files. That said, the SOTA models are... kinda okay in general.
fragmede · 1d ago
We're still in the early days of LLMs. ChatGPT was only three years ago. The difference details make is that without them, we don't know if someone's opinion is still relevant, given how fast things have moved since ChatGPT's original release. If someone half-assed an attempt to use the tools a year ago, hasn't touched them since, and is going around still commenting about the number of R's in strawberry, then we can just ignore them and move on, because they're just being loudmouths who need everyone else to know they don't like AI. If someone makes an honest attempt and there's some shortcoming, then that can be noted, and the next version coming out of the AI companies can be improved.
But if all we have to go on is "I used it and it sucked" or "I used it and it was great", like, okay, good for you?
hackable_sand · 1d ago
People are also tired of rolling their eyes
pabs3 · 1d ago
Start a worker-owned tech co-op? Not much point though, since people are going to pay AI to write their code instead, and so the market for consultants will dry up. Probably lots of market space for fixing up broken AI code though :)
entropicdrifter · 23h ago
Don't you think consultants get hired to fix up code? If so, why would their market dry up? If anything, I would expect it to explode
pabs3 · 16h ago
That was what I said in my last sentence. The market that will dry up is writing code in the first place.
foxylad · 1d ago
I own my company so have no fear of losing my job - indeed I'd love to offload all the development I do, so I have no resentment against AI.
But I also really care about the quality of our code, and so far my experiments with AI have been disappointing. The empirical results described in this article ring true to me.
AI definitely has some utility, just as the last "game changer" - blockchain - does. But both technologies have been massively oversold, and there will be many, many tears before bedtime.
scrubs · 1d ago
" ..hate ai..."
Bad framing and worse argument. It's emotional.
Every engineer here is evaluating what AI can supposedly do, as pronounced by CEOs and managers (who are not experts in software dev), versus reality. Follow the money.
Terr_ · 20h ago
> Follow the money.
Yeah, it's frustrating to see someone opine that "critics are motivated by resentment rather than facts" as if it were street-smart, savvy psychoanalysis... while completely ignoring how many of the influential voices boosting the concept have bajillions of dollars of motive to speak as credulously and optimistically as possible.
munificent · 1d ago
> Are you motivated by reason, or resentment?
I think most people are motivated by values. Reason and emotion are merely tools one can use in service of those.
My experience is that people who hew too strongly to the former tend to be more oblivious than most to what's going on in their own psychology.
wolvesechoes · 1d ago
It is not tribalism. I am self-aware enough to recognize my self-interest, and it is in conflict with the interests of the Sam Altmans of this world and the modern slave-masters, sorry, managers.
But I am not claiming that AI is useless. It is useful, but I would rather destroy every data center than enjoy the strengthening of techno-feudalism.
ellen364 · 1d ago
Love a bit of source analysis.
I'd widen the frame a bit. People scared of losing their jobs might underestimate the usefulness of AI. Makes sense to me, it's the comforting belief. Worth keeping in mind while reading articles sceptical of AI.
But there's another side to this conversation: the people whose writing is pro AI. What's motivating them? What's worth keeping in mind while reading that writing?
SrslyJosh · 1d ago
I know it's probably childish and irrational and a symptom of my inferior intellect, but I have to ask, where's the proof that any of this shit works as well as AI stans claim it does?
Please, enlighten me with your gigantic hyper-rational brain.
atleastoptimal · 1d ago
If you believe AI is overvalued and is a bubble waiting to burst then you are free to short NVDA.
AI stans don’t become AI stans for no reason. They see the many enormous technological leaps and also see where progress is going. The many PhDs currently making millions at labs also have the right idea.
Just look at ChatGPT’s growth alone. No product in history compares, and it’s not an accident.
jact · 1d ago
No product except tulips :^)
bofis · 1d ago
“Markets can remain irrational longer than you can remain solvent.” - Keynes
JohnMakin · 1d ago
This commits the logical fallacy of assuming that reasoning born out of resentment is always wrong. It is possible for someone to be as you describe and also correct. I imagine this armchair psychoanalysis is way off, though.
gdbsjjdn · 1d ago
Did you read TFA, which shows that developers are slower with AI and think they're faster?
The two types of responses to AI I see are your very defensive type, and people saying "I don't get it".
Mikhail_Edoshin · 1d ago
Bruce Tognazzini wrote that people always claim the keyboard is faster than the mouse, but when researchers actually measured it, the mouse turned out to be faster. Bruce explained that mousing is a low-cognition activity compared to keying, so subjective perceptions are skewed.
Tognazzini wrote a magazine column, with all the attendant downsides: overly jokey, non-academic, etc. I think Tog meant something like selecting commands from a menu vs. using a command line, across a range of applications. Anyway, studies like that must be somewhere in the Proceedings of CHI, I guess. (I just checked the bibliography in "Tog on Interface", but nothing seemed to match. I found a comparison of different types of menus, but that's different. Also relevant: I guess most people would say a pop-up menu right at the mouse cursor will be faster than a fixed one at the top of the screen, yet the experiment shows the opposite.)
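The usual account of that edge-of-screen result, as far as I know, is Fitts's law. In the common Shannon formulation, the time to acquire a target is

    T = a + b * log2(D / W + 1)

where D is the distance to the target and W is its width along the axis of motion. A menu at the screen edge lets you slam the cursor into the edge without overshooting, so the effective W becomes effectively unbounded and the log term collapses; that can outweigh the pop-up menu's shorter travel distance D.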
Mousing implies things are visible and you merely point at them. Keyboard implies things are invisible and you recall commands from memory. These two must differ in some fundamental way. Many animals use tools: inanimate objects lying around that can be employed for some gain. Yet no animal makes a tool. Making a tool is different from using it, because to make a tool one must foresee the need for it. And this implies a mental model of the world and the future, i.e. a very big change compared to simply using a suitable object on the spot. (The simplest "making" could be just carrying an object when there is no immediate need for it, e.g. over a sufficiently long distance. That looks very simple, and I myself do not know whether any animals exhibit such behavior; it seems to be on the fence. It would be telling if they don't.)
I think the difference between mousing and keying is about as big as that between using a tool and making a tool. Of course, if we use the same app all day long, then its keys become motor movements, but this skill remains confined to the app.
atleastoptimal · 1d ago
The article is one person recording their own use of AI, finding no statistical significance, but claiming that because the measured ratio of AI-to-human speed on various coding tasks resembled the METR study's, AI has no value. People have already talked about issues with the METR study, but importantly, both that study and this blog post query a small number of people using AI tools for the first time, working in a codebase they already have experience with and a deep understanding of.
Their follow-on claim is that because there hasn't been exponential growth in App Store releases, domain name registrations, or Steam games, AI (beyond producing shoddy code) has led to no increase in the amount of software at all, or none that could be called remarkable or even notable in proportion to the claims made by those at AI companies.
I think this ignores the obvious signs of growth in AI companies that provide software engineering and adjacent services. These companies' revenues aren't emerging from nothing. People aren't paying them billions unless there is value in the product.
These trends include
1. The rapid growth of revenue of AI model companies, OpenAI, Anthropic, etc.
2. The massive growth in revenue of companies that build on AI, including Cursor, Replit, Lovable, etc.
3. The massive valuations of these companies
Anecdotally, with AI I can make shovelware apps very easily, spin them up effortlessly, and fix issues I don't have the expertise or time to fix myself. I don't know why the author of TFA claims that he can't make a bunch of one-off apps with the capabilities available today when it's clear that many, many people can, have done so, have documented doing so, have made money selling those apps, etc.
ThrowawayR2 · 1d ago
> "These companies' revenues aren't emerging from nothing. People aren't paying them billions unless there is value in the product."
Oh, of course not. Just like people weren't paying vast sums of money for beanie babies and dotcoms in the late 1990s and mortgage CDOs in the late 2000s [EDIT] unless there was value in the product.
atleastoptimal · 1d ago
Those are fundamentally different. If people on this site really can't tell the difference then it makes sense why people on HN assume AI is a bubble.
People paid a lot for beanie babies and various speculative securities on the assumption that they could be sold for more in the future. They were assets people aimed to resell at a profit. They had no value by themselves.
The source of revenue for AI companies has inherent value but is not a resell-able asset. You can't resell API calls you buy from an AI company at some indefinite later date. There is no "market" for reselling anything you purchase from a company that offers use of a web app and API calls.
smackeyacky · 1d ago
The central issue here is whether the money pouring into AI companies is producing anything other than more AI companies.
I think the article's premise is basically correct: if we had a 10x explosion of productivity, where is the evidence? I would think some is potentially hidden in corporate/internal apps, but despite everyone at my current employer using these tools, we don't seem to be going any faster.
I will admit that my initial thoughts on Copilot were that "yes this is faster" but that was back when I was only using it for rote / boilerplate work. I've not had a lot of success trying to get it to do higher level work and that's also the experience of my co-workers.
I can certainly see why a particular subset of programmers find the tools particularly compelling, if their job was producing boilerplate then AI is perfect.
atleastoptimal · 1d ago
Yeah AI code is ideal for boilerplate, converting between languages, basically anything where the success criteria are definite. I don’t think there is a 10x productivity upgrade across the board, but in limited domains, yes, AI can produce human level work 10x faster.
The fundamental difference of opinion people have here, though, is that some see current AI capabilities as a floor, while others see them as a ceiling. I’d agree with arguments that AI companies are overvalued if current models were as capable as AI will ever be for the rest of time, but clearly that is not the case; very likely they will keep getting better, as they have every few months over the past few years.
card_zero · 1d ago
Which way is the rate of change going?
queenkjuul · 16h ago
Dotcoms and CDOs absolutely had perceived intrinsic value
guappa · 1d ago
> The article is one person recording their own use of AI
It's not ONE person. I agree that it's not "every single human being" either, but more of a preliminary result, but I don't understand why you discount results you dislike. I thought you were completely rational?
> The rapid growth of revenue of AI model companies, OpenAI, Anthropic, etc.
You can't use growth of AI companies as evidence to refute the article. The premise is that it's a bubble. The growth IS the bubble, according to the claim.
> I don't know why the author of TFA claims that he can't make a bunch of one-off apps
I agree... One-off apps seem like a place where AI can do OK. Not that I care about it. I want AI that can build and maintain my enterprise B2B app just as well as I can in a fraction of the time, and that's not what has been delivered.
atleastoptimal · 1d ago
Bubbles are born out of valuations, not revenue. Web3 was a bubble because the money it made wasn't from real productivity but from hype cycles, pyramid schemes, etc. AI companies are merely selling API calls; there is no financial scheming. It is very simply that the product is worth what it is being sold for.
> I want AI that can build and maintain my enterprise B2B app just as well as I can in a fraction of the time, and that's not what has been delivered.
AI isn't at that level yet, but it is making fast strides on subsets of it. I can't imagine that systems of models, and the models themselves, won't reach that level in a couple of years, given how bad AI coding tools were just a couple of years ago.
guappa · 1d ago
Does the revenue cover the costs?
DrillShopper · 1d ago
> It's an emotional argument, born of tribalism. I’d find it easier to believe many claims on this site that AI is all a big scam and such if it weren’t so obvious that this underlies your very motivated reasoning.
Damn, when did it become wrong for me to advocate in my best interests while my boss is trying to do the same by shoving broken and useless AI tools up my ass?
queenkjuul · 16h ago
I'm motivated by Claude Code producing useless garbage every time i ask it to do anything, and Google giving me AI summaries about things that don't exist
dreadnip · 1d ago
I don’t agree. HN is full of technical people, and technical people see LLMs for what they truly are: pattern matching text machines. We just don’t buy into the AGI hype because we’ve seen nothing to support it.
I’m not concerned for my job, in fact I’d be very happy if real AGI would be achieved. It would probably be the crowning tech achievement of the human race so far. Not only would I not have to work anymore, the majority of the world wouldn’t have to. We’d suddenly be living in a completely different world.
But I don’t believe that’s where we’re headed. I don’t believe LLMs in their current state can get us there. This is exactly like the web3 hype when the blockchain was the new hip tech on the block. We invent something moderately useful, with niche applications and grifters find a way to sell it to non technical people for major profit. It’s a bubble and anyone who spends enough time in the space knows that.
atleastoptimal · 1d ago
Calling LLMs "pattern matching text machines" is a catchy thought-terminating cliche, which amounts to calling a human brain a "blob of fats, salts, and chemicals". It technically makes sense, but it misses the forest for the trees, and ignores the fact that this mere pattern-matching text machine is doing things people said were impossible a few years ago. The simplicity and seeming mundanity of a technology has no bearing on its potential or emergent properties. A single termite, observed by itself, could never reveal what it could build when assembled together with its brethren.
I agree that there are lots of limitations to current LLMs, but it seems somewhat naive to ignore the rapid pace of improvement over the last 5 years and the emergent properties of AI at scale, especially in doing things claimed to be impossible only years prior (remember when people said LLMs could never do math, or that image models could never get hands or text right?).
Nobody understands the limitations of current LLMs with greater clarity or specificity than the people working in labs right now to make them better. The AGI prognostications aren't suppositions pulled out of the realm of wishful thinking; they exist because of fundamental revelations that have occurred in the development of AI as it has scaled up over the past decade.
I know I claimed that HN's hatred of AI is an emotional one, but there is an element of reasoning to it too that leads people down the wrong path. By seeing more flaws than the average person does in these AI systems, and seeing the tactics companies use to make their AI offerings seem more impressive than they (currently) are, you extrapolate that sense of "figuring things out" into a confident model of how AI is and must really be. In doing so, you pattern-match AI hype to web3 hype and assume that since the hype is similar in certain ways, it must also be a bubble/scam just waiting to pop and have all the lies revealed. It is the same pattern-matching trap people accuse AI of falling into, where you can see the flaws in an LLM's output even while it claims to have solved the problem correctly.
neffy · 1d ago
No, it's really not; it's exactly what they are: multi-dimensional pattern matching machines, using massive databases put together from resources like Stack Overflow and Chegg (every cheater's go-to for assignment answers; massive copyright theft, etc.). If that wasn't the case, there wouldn't be jobs right now writing answers to feed into the databases.
And that's actually quite useful, given that most of this material is paywalled or blocked from search engines. It's less useful when you look at code examples that mix different versions of Python and have comments referring to figures on the previous page. I'm afraid it becomes very obvious, when you look under the hood at the training sets themselves, just how all of this is being achieved.
atleastoptimal · 1d ago
Look into every human’s brain and you’d see the same thing. How many humans can come up with novel, useful patents? How many novel useful patents themselves are just variations of existing tech?
All intelligence is pattern matching, just at different scales. AI is doing the same thing human brains do.
omnicognate · 1d ago
> Look into every human’s brain and you’d see the same thing.
Hard not to respond to that sarcastically. If you take the time to learn anything about neuroscience you'll realise what a profoundly ignorant statement it is.
weweersdfsd · 1d ago
If that is the case, where are the LLM-controlled robots, where the LLM is simply given access to a bunch of sensors and servos and learns to control them on its own? And why are jailbreaks a thing?
ethanwillis · 1d ago
Seeing as your LLMs need the novel output of human brains to even exist or expand capabilities, quite a lot.
But even if it's not a lot, it's more than the number of LLMs that can invent new meaning, which is a grand total of zero.
tjr · 1d ago
If tomorrow, all human beings ceased to exist, barring any in-progress operations, LLMs would go silent, and the machinery they run on would eventually stop functioning.
If tomorrow, all LLMs ceased to exist, humans would carry on just fine, and likely build LLMs all over again, next time even better.
DonHopkins · 1d ago
>This is exactly like the web3 hype when the blockchain was the new hip tech on the block. We invent something moderately useful, with niche applications and grifters find a way to sell it to non technical people for major profit.
LLMs are not anything like Web3, not "exactly like". Web3 is in no way whatsoever "something moderately useful", and if you ever thought it was, you were fooled by the same grifters when they were yapping about Web3, who have now switched to yapping about LLMs.
The fact that those exact same grifters who fooled you about Web3 have moved onto AI has nothing to do with how useful what they're yapping about actually is. Do you actually think those same people wouldn't be yapping about AI if there was something to it? Yappers gonna yap.
But Web3 is 100% useless bullshit, and AI isn't: they're not "exactly alike".
Please don't make false equivalences between them like claiming they're "exactly like" each other, or parrot the grifters by calling Web3 "moderately useful".
jcgrillo · 1d ago
> their apparent success
Yeah, so the thing is, the "success" is only "apparent". Having actually tried to use this garbage to do work, as someone who has been deeply interested in ML for decades, I've found the tools to be approximately useless. The "apparent success" is not due to any utility; it's due entirely to marketing.
I don't fear I'm missing out on anything. I've tried it, it didn't work. So why are my bosses a half dozen rungs up on the corporate ladder losing their entire minds over it? It's insanity. Delusional.
DrillShopper · 1d ago
Counterpoint: fuck them, they know exactly what they do (try to extract more work for the exact same pay out of their subordinates)
xbmcuser · 1d ago
For me this is the biggest disconnect: the current level of AI is not good enough to replace devs, but it is good enough to automate a lot of office work, i.e. managers, that would have cost too much time and effort to automate before. I think Google seems to understand this a bit, as they have replaced a lot of middle management because of AI, and not as many developers.
insane_dreamer · 22h ago
also customer service; I was at my dental office today and there were 3 people handling check-in/checkout. I'm quite confident 80% of their workload could be automated away, to where you would just need a single person to handle edge cases. That's where we're going to see a lot of entry-level jobs go away, in many domains.
phatskat · 1h ago
> That's where we're going to see a lot of entry-level jobs go away, in many domains.
And to me this is worse news. Losing people in higher-paying jobs would hurt the economic fabric more, but by that token they’d have more power and influence to ensure a better safety net for the inevitable rise of AI and automation across much of the workforce.
Entry level workers can’t afford to not work, they can’t afford to protest or advocate, they can’t afford the future that AI is bringing closer to their doorsteps. Without that safety net, they’ll be struggling and impoverished. And then will everyone in the higher paying positions help, or will we ignore the problem until AI actually is capable of replacing us, and will it be too late by then?
sotix · 1d ago
At my company, the tech leaders aren't doing it out of mass hysteria. They're very smart individuals. The push is coming from our investors that come from the ring of classic YC-affiliated VCs. My friend who runs a YC-backed company has been told to do it by his investors too. It's a coordinated effort by external investors rather than a mass panic by individual tech leaders. If you read VC investor literature, it's full of incredible claims about how companies who don't use AI will be left behind. The exact type of stuff you'd expect to hear from groups who aim to hit the lottery with a few of their investments.
baxtr · 1d ago
AI has become the ultimate excuse for weak managers to pressure tech folks.
bambax · 1d ago
> My manager told me that the time to deliver my latest project was cut to 20% of the original estimate because we are "an AI-first company".
Tell him to code it himself then? If it can be done with only prompting, and he's able to type a sentence or two in a web form, what's stopping him?
boothby · 1d ago
> The mass hysteria among SVPs and PMs is absolutely insane right now, I've never seen anything like it.
This isn't entirely foreign to me; it sure looks a lot like the hype train of the dot-com bubble. My experience says that if you're holding stock in a company going down this road, it has very low long-term value. Even if you think there's room to grow, bubbles pop fast and hard.
vkou · 1d ago
I'd like to see those SVPs and PMs, or shit, even a line manager use AI to implement something as simple as a 2-month intern project[1] in a week.
---
[1] We generally budget about half an intern's time for finding the coffee machine, learning how to show up to work on time, going on a fun event with the other interns to play minigolf, discovering that unit tests exist, etc, etc.
elevatortrim · 1d ago
I actually built something in under a week (a time-tracking tool that helps developers log their time consistently in Jira and Harvest) that most developers in my company now use.
I have a backend development background, so I was able to review the BE code and fix some bugs. But I did not bother learning the Jira and Harvest API specs at all; AI (Cursor + Sonnet 4) figured it all out.
I would not have been able to write the front-end of this. It is JS-based and updates the UI based on real-time HTTP requests (forgot the name of this technology, the new AJAX that is), and I do not have time to learn it, but again, I was able to tweak what AI generated and make it work.
Not only did AI help me do something in much less time than it would otherwise have taken, it enabled me to do something that otherwise would not have been possible.
panarchy · 1d ago
I'd rather see those SVPs, PMs, and line managers be turned into AI.
curvaturearth · 1d ago
This is the way
gedy · 1d ago
> My manager told me that the time to deliver my latest project was cut to 20% of the original estimate because we are "an AI-first company".
Challenge your manager to a race, have him vibe code
KronisLV · 1d ago
> My manager told me that the time to deliver my latest project was cut to 20% of the original estimate because we are "an AI-first company".
This sounds incredibly stupid. It’s going to take as long as it will and if they’re not okay with that, their delusional estimates should be allowed to crash and burn, which would hopefully be a learning experience.
The problem is that sometimes there’s an industry wide hysteria even towards useful tech - like doing a lift and shift of a bunch of monoliths to AWS to be “cloud scale”, introducing Kubernetes or serverless without the ability to put either to good use, NoSQL for use cases it’s not good at and most recently AI.
I think LLMs will eventually weather the hype cycle and it will settle down on what they’re actually kinda okay at vs not, the question is how many livelihoods will be destroyed along the way (alongside all the issues with large scale AI datacenter deployments).
On a personal level, it feels like you should maybe do the less ethical thing: ask your employer for somewhere in the ballpark of 1000-3000 USD a month for Claude credits, babysit it enough to ship a functional MVP within the new 20% estimate, and, when they complain about missing functionality, tell them that the AI tech just isn't mature enough, but that thankfully you'll be able to swoop in and salvage it for only the remaining 80% of the original estimate's worth of work.
com2kid · 1d ago
Multiple things can be true at the same time:
1. LLMs do not increase general developer productivity by 10x across the board for general purpose tasks selected at random.
2. LLMs dramatically increase productivity for a limited subset of tasks
3. LLMs can be automated to do busy work and although they may take longer in terms of clock time than a human, the work is effectively done in the background.
LLMs can get me up to speed on new APIs and libraries far faster than I can myself, a gigantic speedup. If I need to write a small bit of glue code in a language I do not know, LLMs not only save me time, but they make it so I don't have to learn something that I'll likely never use again.
Fixing up existing large code bases? Productivity is at best a wash.
Setting up a scaffolding for a new website? LLMs are amazing at it.
Writing mocks for classes? LLMs know the details of using mock libraries really well and can get it done far faster than I can, especially since writing complex mocks is something I do a couple times a year and completely forget how to do in-between the rare times I am doing it.
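For instance, a minimal sketch of the kind of mock boilerplate this covers, using Python's unittest.mock (all names here are hypothetical, purely for illustration):

    # Hypothetical example: stubbing a payment client so the test never hits a network.
    from unittest.mock import MagicMock

    def send_receipt(client, order_id):
        # The code under test, inlined here so the sketch is self-contained.
        return client.charge(order_id)["status"]

    def test_send_receipt():
        client = MagicMock()
        client.charge.return_value = {"status": "ok"}
        assert send_receipt(client, 42) == "ok"
        client.charge.assert_called_once_with(42)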
Navigating a new code base? LLMs are ~70% great at this. If you've ever opened up an over-engineered WTF project, just finding where HTTP routes are defined at can be a problem. "Yo, Claude, where are the route endpoints in this project defined at? Where do the dependency injected functions for auth live?"
Right tool, right job. Stop using a hammer on nails.
heavyset_go · 1d ago
> LLMs can get me up to speed on new APIs and libraries far faster than I can myself, a gigantic speedup. If I need to write a small bit of glue code in a language I do not know, LLMs not only save me time, but they make it so I don't have to learn something that I'll likely never use again.
I wax and wane on this one.
I've had the same feelings, but too often I've peeked behind the curtain, read the docs, gotten familiar with the external dependencies, and then realized that whatever the LLM responded with either didn't follow convention, shoehorned my problem to fit code examples found online, used features inappropriately, took a long roundabout path to do something that can be done simply, etc.
It can feel like magic until you look too closely at it, and I worry that it'll make me complacent with the feeling of understanding without actually coming away with an understanding.
SchemaLoad · 1d ago
Yeah LLMs get me _an_ answer far faster than I could find it myself, but it's often not correct. And then I have to verify it myself which was exactly the work I was trying to skip by using the LLM to start with.
If I have to manually verify every answer, I may as well read the docs myself.
emodendroket · 1d ago
Is it really that different from scrolling through Stack Overflow answers and rejecting the ones that aren't suitable? A lot of times you can tell it what specifically you didn't like about the solution and get another crack anyway (e.g., "let's iterate over the characters to do this rather than using a regex")
sotix · 1d ago
I wasn't using stack overflow before LLMs. It had already declined too much in quality and discouraged me from posting legitimate questions on there. I was more focused on reading documentation to gain a solid understanding. So for me, it's a much different experience.
It's incredible how quickly an LLM can answer. I've also cross-checked its responses with documentation before and discovered that it suggested implementing a deprecated feature that had a massive warning banner in the documentation, which the LLM failed to indicate. I'm still a fan of reading documentation.
HankStallone · 1d ago
I've asked AIs for help in doing things in Salesforce, and the answers tend to be a 50/50 mix of correct steps and garbage. It's not hard to see why, because there's a lot of garbage on support sites, much of it because it's outdated. Garbage in, garbage out.
The difference is that if I go directly to the support site, there's a decent chance I can quickly spot and reject the garbage based on the date, the votes it's gotten, even the quality of the writing. AI doesn't include any of those clues; it mixes good and bad together and offers it up for me to pick apart through trial and error.
mrkeen · 1d ago
It's a little different.
You pay money, have vendor lock-in, get one answer, and there are no upvotes/downvotes/accepted answers/moderation or clarification.
Onawa · 1d ago
It doesn't completely solve this problem, but having something like the Context7 MCP server running that Copilot et al. can reach dramatically reduces hallucinations for most tools. Additionally, I've used Continue.dev in VSCode along with manually specified docs and guides that you can selectively inject into your context. Both of those tactics make a huge difference in answer quality.
nicklaf · 1d ago
Personally, I don't trust LLMs to write code for me, generally speaking. That said, as of late I've been very pleased with the whole "shoehorn your problem to fit code examples found online" thing these LLMs do in the very special case of massaging Unix scripts, where the "code examples found online" part seems to mostly amount to fairly canonical references to features documented in man pages that are plastered all over the web and haven't changed much in decades.
For questions that I know should have a straightforward answer, I think it beats searching Stackoverflow. Sure, I'll typically end up having to rewrite most of the script from scratch; however, if I give it a crude starting point of a half-functional script I've already got going, pairing that with very clear instructions on how I'd like it extended is usually enough to get it to write a proof of concept demonstration that contains enough insightful suggestions for me to spend some time reading about features in man pages I hadn't yet thought to use.
The biggest problem maybe is a propensity for these models to stick in every last fancy feature under the sun. It's fun to read about a GNU extension to awk that makes my script a couple lines shorter, but at best I'll take this as an educational aside rather than something I'd accept at the expense of portability.
culopatin · 1d ago
Before accepting an answer I’ve started asking “is there a simpler, more straightforward way of achieving that?” And most of the time it changes the whole thing it just wrote lol
thegrim33 · 1d ago
"LLMs can get me up to speed on new APIs and libraries far faster than I can myself, a gigantic speedup"
Just a random personal anecdote I wanted to throw out. I recently had to build some custom UI with Qt. I hadn't worked with Qt in a decade and barely remembered it. Seems like a perfect use case for AI to get me "up to speed" on the library, right? It's an incredibly well documented library with lots written on it, perfect fodder for an AI to process.
So, I gave it a good description of the widget I was trying to make, what I needed it to look like and how it should behave, and behold, it spat out the specific widget subclass I should use and how I should override certain methods to customize behavior. Wow, it worked exactly as promised.
So I implemented it like it suggested and was seemingly happy with the results. Went on with working on other parts of the project, dealing with Qt more and more here and there, gaining more and more experience with Qt over time.
A month or two later, after gaining more experience, I looked back at what AI had told me was the right approach on that widget and realized it was completely messed up. It had me subclassing the completely wrong type of widget. I didn't need to override methods and write code to force it to behave the way I wanted. I could instead just make use of a completely different widget that literally supported everything I needed already. I could just call a couple methods on it to customize it. My new version removes 80% of the code that AI had me write, and is simpler, more idiomatic, and actually makes more sense now.
So yeah, now any time I see people write about how "well, it's good for learning new libraries or new languages", I'll have that in the back of my mind. If you don't already know the library/language, you have zero idea whether what the AI is teaching you is horrible or not. Whether there's a "right/better" way or not. You think it's helping you out when really you're likely just writing horrible code.
rurp · 1d ago
Just recently I was having trouble getting something to work with a large well documented framework library. Turned out the solution I needed was to switch to a similar but different API. But that's not what Claude told me. Instead it wanted me to override and rewrite a bunch of core library code. Fortunately I was able to recognize that the suggested solution was almost certainly bad and did some more digging to find the right answer, but I could easily see nightmarish code that solves immediate problems in terrible ways piling up fast in a vibe coded project.
I do find LLMs useful at times when working in unfamiliar areas, but there are a lot of pitfalls and newly created risks that come with it. I mostly work on large existing code bases and LLMs have very much been a mildly useful tool, still nice to have, but hardly the 100x productivity booster a lot of people are claiming.
Sammi · 19h ago
Today I asked Claude how to ignore TypeScript type checking in some vendored JS files in my project. It churned on this and ended up turning off type checking for all JS files in my project, proudly declaring it a great success because the errors were gone. Hurray. If I knew nothing about my project I would be none the wiser.
komali2 · 5h ago
This keeps happening to me. I keep coming across big files written during my Cursor hype period 3 months ago and finding huge non-DRY chunks and genuinely useless nonsense. Yes, I should have reviewed better, but it's a lot to wade through, and it ostensibly "worked," as in, the UI looked as it should.
ksenzee · 1d ago
> Stop using a hammer on nails.
sorry, what am I supposed to use on nails?
falcor84 · 1d ago
Nail polish remover
dsign · 1d ago
Cursor, is that you?
sethammons · 1d ago
I think it was a typo and should have been "Stop using a hammer on screws," suggesting a tool/application mismatch.
> Setting up a scaffolding for a new website? LLMs are amazing at it.
Weren't the code generators before this even better though? They generated consistent results and were dead quick at doing it.
camdenreslink · 1d ago
And they were frequently in public repos that were updated with people filing issues if necessary.
socalgal2 · 1d ago
Would it be more correct to change this
> LLMs can get me up to speed on new APIs and libraries far faster than I can myself
To this?
> LLMs can get me up to speed on old APIs and old libraries that are new to me far faster than I can myself
My experience is that if the library/API/tool is new, the LLM can't help. But maybe I'm using it wrong.
retreatguru · 1d ago
An MCP server called Context7 excels at providing up-to-date API/library documentation for LLMs.
3uler · 1d ago
Working with LLMs has fundamentally changed how I approach documentation and development.
Traditional documentation has always been a challenge for me - figuring out where to start, what syntax conventions are being used, how pieces connect together. Good docs are notoriously hard to write, and even harder to navigate. But now, being able to query an LLM about specific tasks and get direct references to the relevant documentation sections has been a game-changer.
This realization led me to flip my approach entirely. I’ve started heavily documenting my own development process in markdown files - not for humans, but specifically for LLMs to consume. The key insight is thinking of LLMs as amnesiac junior engineers: they’re capable, but they need to be taught what to do every single time. Success comes from getting the right context into them.
Learning how to craft that context is becoming the critical skill.
It’s not about prompting tricks - it’s about building systematic ways to feed LLMs the information they need.
I’ve built up a library of commands and agents for my Claude Code installation inspired by AgentOS (https://github.com/buildermethods/agent-os) to help engineer the required context.
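To make that concrete, here is a sketch of what one of those markdown context files might contain (the contents are invented for illustration, not from any real project):

    # Context for coding agents
    - Stack: Python 3.12, FastAPI, Postgres; run tests with `pytest`
    - HTTP routes live in app/routes/; auth middleware lives in app/auth/
    - Follow existing patterns in the repo; do not add new dependencies unprompted
    - Never edit generated files under app/gen/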
The tool is a stochastic parrot; you need to feed it the right context to get the right answer. It is very good at what it does, but you need to play to its strengths in order to get value from it.
I find people complaining about LLMs often expect vibe coding to be a magic tool that will build the app for you without thinking (which it unfortunately has been sold as), but the reality is more of a fancy prompt-based IDE.
mvdtnz · 1d ago
> LLMs can be automated to do busy work and although they may take longer in terms of clock time than a human, the work is effectively done in the background.
What is this supposed busy work that can be done in the background unsupervised?
I think it's about time for the AI pushers to be absolutely clear about the actual specific tasks they are having success with. We're all getting a bit tired of the vagueness and hand waving.
Kiro · 1d ago
No, you've got it backwards. If anything, people are getting tired of comments like yours.
iainctduncan · 1d ago
HN votes say otherwise, lol
Kiro · 21h ago
Not really. This is a thread that attracts a certain subset of people so it's expected. Compare the comments here to the ones in a thread with a more neutral premise.
dbalatero · 1d ago
Nope!
Kiro · 1d ago
They are. There's no vagueness. It's rare nowadays to see people who still don't believe LLMs can do anything at all. Most other naysayers have moved on to the much more relevant question of whether the perceived productivity gains are real or not.
dbalatero · 1d ago
> It's rare to see people that still don't believe LLMs can do anything at all nowadays.
I don't think the original comment you responded to made this specific point.
rhubarbtree · 1d ago
Recently I tried to scaffold a website with a well known coding agent.
It didn’t work. I asked a colleague. He had the same problem. Turned out it was using out-of-date setup instructions for a major tool that had changed post-training.
After spending time fixing the problem, I realised (1) it would have been faster to do it myself and (2) I can no longer trust that tool to set anything up - what if it’s doing something else wrong?
paool · 1d ago
Sorry to say, but skill issue.
Use MCP servers, specifically Context7.
This gets up-to-date docs as long as you include the library name in your prompt and ask to use Context7.
You did the equivalent of raw-dogging GPT-4 (an old model) for recent news versus using an agent with web search tooling.
margalabargala · 1d ago
"My shitty company's shitty app has an API for that! It's so good sales has decided to present it to the industry as though it should be incorrect not to use it"
jfengel · 1d ago
If it can figure out where dependencies come from I'm going to have to look more into this. I really hate the way injection makes other people's code bases impenetrable. "The framework scans billions of lines of code to find the implementation, and so can you!"
com2kid · 1d ago
I'm not looking forward to the cancer of @ invading JavaScript code. Ugh. I am a big fan of wysiwyg. Plz don't Decorate my code....
player1234 · 1d ago
Does not sound like a trillion dollar industry
tjr · 1d ago
More like a zillion dollar industry!
iLoveOncall · 1d ago
> Setting up a scaffolding for a new website? LLMs are amazing at it.
So amazing that every single stat shown by the author in the article has been flat at best, despite all of them being based on new development rather than work on existing code-bases.
daxfohl · 1d ago
Maybe the world has run out of interesting websites to create. That they are created faster doesn't necessarily imply they'll be created more frequently.
daxfohl · 1d ago
Of course if that's the case (and it well may be), then THAT is the reason for tech layoffs. Not AI. If anything, it means AI came too late.
coffeebeqn · 1d ago
AI still fails to extrapolate. It can interpolate between things it’s trained on but that’s not exactly a new interesting product. If it truly could extrapolate at human-ish levels we would actually maybe have 10x more games and websites and whatnot
bluefirebrand · 23h ago
> Setting up a scaffolding for a new website? LLMs are amazing at it
This is trivial work that you should have automated after doing it once or twice anyways :/
rglover · 1d ago
Most of it doesn't exist beyond videos of code spraying onto a screen alongside a claim that "juniors are dead."
I think the "why" for this is that the stakes are high. The economy is trembling. Tech jobs are evaporating. There's a high anxiety around AI being a savior, and so, a demi-religion is forming among the crowd that needs AI to be able to replace developers/competency.
That said: I personally have gotten impressive results with AI, but you still need to know what you're doing. Most people don't (beyond the beginner -> intermediate range), and so, it's no surprise that they're flooding social media with exaggerated claims.
If you didn't have a superpower before AI (writing code), then having that superpower as a perceived equalizer is something you will deploy all resources (material, psychological, etc.) to protect, ensuring that everyone else maintains the position that 1) the superpower is good, 2) the superpower cannot go away, and 3) the superpower being fallible should be ignored.
Like any other hype cycle, these people will flush out, the midpoint will be discovered, and we'll patiently await the next excuse to incinerate billions of dollars.
SchemaLoad · 1d ago
At least in my experience, it excels in blank-canvas projects, where you've got nothing and want something pretty basic. The tools can probably set up a fresh React project faster than me. But every time I've tried them on an actual work repo, they've been reduced to almost useless.
Which is why they generate so much hype. They are perfect for tech demos, then management wonders why they aren't seeing results in the real world.
tomrod · 1d ago
Exactly. It quickly builds a lot of technical debt that must be paid down, especially for people writing code in areas they aren't deep in.
For tight tasks it can be super helpful -- like for me, an AI/data science guy, setting up a basic reverse proxy. But I do so with a ton of scrutiny -- pushing back on it, searching Kagi or the docs to at least confirm the code, etc. This is helpful because I don't have a mental map of reverse proxies -- it can help fill in the gaps, but only with a lot of reticence.
That type of use really doesn't justify the billion dollar valuations of any companies, IMO.
ethanwillis · 1d ago
What do you mean by you don't have a mental map about a reverse proxy?
tomrod · 1d ago
That I don't have a deep understanding of nginx and how its many options fit together across an OS, with confidence that I've made it secure, accurate, and/or fast. Give me Python, Matlab, or Rust and I can put something together, but something like nginx I've simply never dived deep enough into for a solid understanding before using an LLM to understand more.
caro_kann · 1d ago
Even scaffolding a new project is not easy work, especially with a new stack or new versions of existing tools. For example, I have never been able to create a Vue 3 project with Vite and Tailwind set up correctly. I tried top SOTA models. Maybe my prompting skills are not good, but every time it fails to set up the project correctly. Every time it gives me some old configuration that's not relevant anymore.
lbreakjai · 1d ago
LLMs are probably the worst tool for the job. Code generators have been a thing forever. Why use a LLM when you can do "npm create vite@latest my-vue-app -- --template vue" ?
SchemaLoad · 13h ago
It's always more tedious than that. You have to pick all these libraries, install and set them up, build login pages, etc. Stuff that is all simple work but takes ages. I've never used an LLM for it but it seems like the kind of work that should be easy enough to automate and would save a week of setting everything up if it worked.
Rapzid · 1d ago
Why though? Vite supplies a project scaffolder and lists a one-liner under getting started, i.e.
pnpm create vite
Tailwind is similarly a one-liner to initialize (it might be a vite create option now).
Edit: My bad, you are talking about the LLMs! I'm always surprised how, even though we've had great project scaffolding across the Node ecosystem for years, people still complain about how hard setting up projects is.
rkozik1989 · 1d ago
The reason LLMs suck in plenty of brownfield projects is that those codebases likely either implemented frameworks in a proprietary way, did not rely on any public framework at all, or were in general done in an esoteric way, and therefore few (if any) similar codebases exist within the LLM's training data. That's problematic because LLMs aren't capable of reasoning or learning; they're literally just predicting the next most likely token in a chain, similar to how autocomplete works. Without you supplying additional context and explicitly defining guardrails for performing common tasks, the LLM has no frame of reference for working with your codebase.
herpdyderp · 1d ago
I've had great success with GPT5 in existing projects because its agent mode is very good (the best I've seen so far) at analyzing the existing codebase and then writing code that feels like it fits in already (without prompt engineering on my part). I still agree that AI is particularly good on fresh projects though.
SchemaLoad · 1d ago
Could be that there is a huge difference in the products. The last few companies have given me GitHub Copilot, which I find entirely useless: the automatic suggestions are more distracting than useful, and the fix and explain functions never work. But maybe if you burn $1000/day on Claude Code it works a lot better. And then companies see the results from that and wonder why they aren't getting them spending a couple of dollars on Copilot.
herpdyderp · 1d ago
I use GitHub Copilot from work in agent mode with GPT5 and it’s great! I don’t use the suggestions, fix, or explain features, I agree they’re almost always not helpful.
SchemaLoad · 1d ago
Is that a separate product you have to pay for? I've never seen an agent feature in the VS code plugin.
adithyassekhar · 1d ago
It's there next to the model selector in chat.
empath75 · 1d ago
I actually completely disagree with this, and IMO it works best with projects that are templated with AI development in mind. Lots of documentation and comments, working tests, etc.
You want as much context as possible _right in the code_.
dmonitor · 1d ago
By "knowing what you're doing" do you mean "have enough experience to it by hand", "have experience with a specific AI tool and its limitations" or a combination?
devjab · 1d ago
You don't need software engineering to build successful software, until you do.
In my experience you don't need to know a whole lot about LLMs to work them. You need to know that everything they spit out is potential garbage, and if you can't tell the good from the garbage, then whatever you're using them for is going to be terrible. In terms of software, terrible is fine for quite a lot of systems. One of the first things I built out of university in the previous millennium is still in production today, and it's horrible. It's inefficient and horribly outdated, since it hasn't been updated ever. It runs 10 times a day, and at least 1 of those runs will need to automatically restart itself because it failed. It's done its job without the need for human intervention for the past many decades, though. I know because one of my old colleagues still works there. It could've been improved, but the inefficiency cost over all those years is probably worth about two human hours, and it would likely take quite a while to change it. A lot of software is like that, though a lot of it doesn't live for so long. LLMs can absolutely blast out that sort of thing. It's when the inefficiency cost isn't less than a few human hours that LLMs become a liability if you don't know how to do the engineering.
I use LLMs to write a lot of the infrastructure-as-code we use today. I can do that because I know exactly how it should be engineered. What the LLM can do that I can't is spit out the k8s YAML for an ingress point with 200 lines of port settings in a couple of seconds. I've yet to have it fail, probably because those configurations are basically all the same depending on the service. What an LLM can't do, however, is write the entire YAML config.
Similarly, it can build you a virtual network with subnets in Bicep based on a couple of lines of text with address prefixes. At the same time, it couldn't build you a reasonable vnet with subnets if you asked it to do it from scratch. That doesn't mean it can't build you one that works, though; it's just that you're likely going to claim 65,534 IP addresses for a service which uses three.
rglover · 1d ago
The first one.
fennecbutt · 1d ago
I mean, the truth should be fairly obvious to people, given that a lot of the talk around AI rings very much like the ifls/mainstream-media-style "science" articles, which always make some outrageous "right around the corner" claim based on some small tidbit from a paper whose abstract they only skimmed.
captainkrtek · 1d ago
This tracks with my own experience as well. I’ve found it useful in some trivial ways (e.g. small refactors, type definitions from a schema, etc.), but so far on tasks bigger than that it misses things, requires rework, etc. The future may make me eat my words though.
On the other hand, I’ve lately seen it misused by less experienced engineers trying to implement bigger features who eagerly accept all it churns out as “good” without realizing the code it produced:
- doesn’t follow our existing style guide and patterns.
- implements some logic from scratch where there certainly is more than one suitable library, making this code we now own.
- is some behemoth of a PR trying to do all the things.
nicce · 1d ago
> implements some logic from scratch where there certainly is more than one suitable library, making this code we now own - is some behemoth of a PR trying to do all the things
Depending on the amount of code, I see this only as a positive? Too often people pull in huge libraries for 50 lines of code.
captainkrtek · 1d ago
I'm not talking about generating a few lines instead of importing left-pad. In recent PRs I've had:
- Implementing a scheduler from scratch (hundreds of lines), when there are many many libraries for this in Go.
- Implementing some complex configuration store that is safe for concurrent access, using generics, reflection, and a whole host of other stuff (additionally hundreds of lines, plus more for tests).
While I can't say any of the code is bad, it is effectively like importing a library which your team now owns, but worse in that no one really understands it or supports it.
Lastly, I could find libraries that are well supported, documented, and active for each of these use-cases fairly quickly.
davidcelis · 1d ago
Someone vibe coded a PR on my team where there were hundreds of lines doing complex validation of an uploaded CSV file (which we only expected to have two columns) instead of just relying on Ruby's built-in CSV library (i.e. `CSV.parse` would have done everything the AI produced).
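The same point holds outside Ruby. As a rough sketch of what was actually needed, here in Python's stdlib (the function name and error message are made up for illustration):

    # The stdlib CSV parser already handles quoting, escaping, and embedded commas;
    # the only custom logic needed is the two-column check.
    import csv, io

    def parse_two_columns(text):
        rows = list(csv.reader(io.StringIO(text)))
        if any(len(row) != 2 for row in rows):
            raise ValueError("expected exactly two columns per row")
        return rows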
mandeepj · 1d ago
That’s a good example of ‘getting a desired outcome based on prompt’ - use a built-in lib or not.
vkou · 1d ago
And when it hallucinates a non-existent library, what are the magic prompts that you give it that make it stop trying to bullshit you?
mandeepj · 1d ago
> what are the magic prompts that you give it that makes it stop trying to bullshit you?
Maybe keep your eyes open? :-)
rsynnott · 1d ago
Okay, so at this point it is strictly worse than just searching for and reading the very simple docs for the Ruby CSV parser, surely?
Because, as part of your verification, you will have to do that _anyway_.
vkou · 1d ago
As I thought.
And for the record - my eyes are open. I'm aware I'm being bullshitted. I don't trust, I verify.
But I also don't have a magical lever that I can pull to make it stop hallucinating.
... and every time I ask if one exists, I get either crickets, or a response that doesn't answer the question.
manwe150 · 1d ago
Ask it to write tests, then let it run until the tests pass (preferably in a sandbox, far from your git credentials). It is quite good at developing hypotheses and tests for them, if that is what you explicitly ask for. It doesn’t have (much) ego, so it doesn’t care if it is proven wrong and will accept any outcome fairly if it is testable. Although sometimes it comes to the wrong conclusion, doubles down that the fact should be true, and prepares to write and publish a library to make it true.
mandeepj · 1d ago
Sorry! Didn't mean to BS you. I've not come across a scenario where it hallucinated a non-existent library on me. Can you share what you were trying to do when that happened?
vkou · 23h ago
I wish I had the transcript. I don't, and I'm afraid that the passage of time has muddied the interaction to the point of uselessness (when it comes to listing specifics).
7thpower · 1d ago
I wonder how many times the LLM randomly tried to steer back to that library only to get chastised for not following instructions.
daxfohl · 1d ago
And that may be where the discrepancy comes in. You feel fast because, whoa, I created this whole scheduler in ten seconds! But then you also have to spend an hour code-reviewing that scheduler, and even so, it feels fast to have a good working scheduler in such a short time. Without AI, maybe it feels slow to find and integrate with some existing scheduling library, but in wall-clock time it was the same.
SchemaLoad · 1d ago
The trick is that no one is actually carefully reviewing this stuff. Reviewing code properly is extremely hard; I'd say even harder than writing it from scratch. But there's no minimum amount of work you have to do. If you just do a quick skim over the result, no one will know you didn't carefully review every single detail. Then it gets merged to production full of mistakes.
captainkrtek · 1d ago
To add to this:
If I as a reviewer don’t know whether the author used AI, I can’t even assume a single human (typically the author) has read any of the code, let alone major parts of it. I could be the first person reviewing it.
Not that it’s a great assumption to make, but it’s also fair to take a PR and register that the author wrote it, understands it, and considers it ready for production. So much work, outside of tech as well, is built on trust at least in part.
dm270 · 1d ago
I find this disrespectful on the author's part.
I’m sure I’ve had colleagues at work that did this to me: throwing AI-generated code at the reviewers with a mindset like "why should I look at it? That's what the reviewer does anyway".
SchemaLoad · 13h ago
I always passively call out the submitter on this stuff with comments like "Can you explain to me why you did this? Can you explain what this is expected to return" etc.
Usually gets them to sort out their behavior without directly making accusations that could be incorrect. If they really did write or strongly review the code, those questions are easy to answer.
adelie · 1d ago
i've seen this fairly often with internal libraries as well - a recent AI-assisted PR i reviewed included a complete reimplementation of our metrics collector interface.
suspect this happened because the reimplementation contained a number of standard/expected methods that we didn't have in our existing interface (because we didn't need them), so it was considered 'different' enough. but none of the code actually used those methods (because we didn't need them), so all this PR did was add a few hundred lines of cognitive overhead.
captainkrtek · 1d ago
I’ve seen this as well as PR feedback to authors of AI assisted PRs: “hey we already have a db driver and interface we’re using for this operation, why did you write this?”
heavyset_go · 1d ago
Yes, for leftpad-like libraries it's fine, but does your URL or email validation function really handle all valid and invalid cases correctly now and into the future, for example?
nicce · 1d ago
There are good use cases and bad ones. Is a standard regex library with a known-good pattern better for email validation than some third-party library without regex, until you benchmark them yourself? Or what if you pull in a parser library but parse only a single type in a single way? There isn't a single truth, but I usually see external libraries included too easily.
Freak_NL · 1d ago
An interesting example, but one that also highlights how AI fails to address it correctly.
Email validation in 2025 is simple. It has been simple for years now. You check that it contains an '@' with something before it, and something after it. That's all there is to it — then send an email. If that works (user clicks link, or whatever), the address is validated.
This should be well-known by now (HN has a bunch of topics on this, for example). It is something that experienced devs can easily explain too: once this regex lands in your code, you don't want to change it whenever a new unexpected TLD shows up or whatever. Actually implementing the full-blown, all-edge-cases-covered regex, where all invalid strings are rejected too, is maddeningly complex.
There is no need either; validating email addresses cannot be done by just a regex in any case — either you can send an email there or not, the regex can't tell — and at most you can help the user inputting it by detecting the one thing that is required and which catches most user input errors: it must contain an '@', and something before and after it.
If you try to do what ChatGPT or Copilot suggests you get something more complex:
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
And it even tempts you to try a more complex variant which covers the full RFC 5322. You don't want to go there. At best you catch a handful of typos before you send an email, at worst you have an unmaintainable blob of regex that keeps blocking your new investor's vanity domain.
> If you need stricter validation or support for internationalized domains (IDNs), I can help you build a more advanced version. Want to see one that handles Unicode or stricter rules?
AI is not helpful here.
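For what it's worth, the pragmatic check described above fits in a few lines; a sketch in Python (the helper name is invented, and this is no substitute for actually sending the confirmation email):

    def plausible_email(address):
        # An '@' with something on both sides; real validation is sending the email.
        local, sep, domain = address.partition("@")
        return bool(local and sep and domain)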
mcny · 1d ago
> Too often people pull huge libraries for 50 lines of code.
I used to be one of those people. It just made sense to me when I was (I still am to some extent) more naïve than I am today. But then I also used to think "it makes sense for everyone to eat together at a community kitchen of some sort instead of cooking at home because it saves everyone time and money" but that's another tangent for another day. The reason I bring it up is I used to think if it is shared functionality and it is a small enough domain, there is no need for everyone to spend time to implement the same idea a hundred times. It will save time and effort if we pool it together into one repository of a small library.
Except reality is never that simple. Just like that community kitchen, if everyone decided to eat the same nutritious meal together, we would definitely save time and money but people don't like living in what is basically an open air prison.
codebje · 1d ago
Also there are people occasionally poisoning the community pot, don't forget that bit.
nineteen999 · 1d ago
Beef wellington with deathcap mushrooms anyone?
Groxx · 1d ago
Oh yes please, they're delicious when you soak them in vinegar to deactivate the poison. And the tangy vinegar addition goes really nicely with the rest of the Wellington.
baselessness · 1d ago
THIS IS FALSE.
I don't know if this is intended as a joke, if yes this is in very poor taste.
Death cap mushrooms are incredibly dangerous and shouldn't even touch food containers or other food.
There is no safe way to consume death caps. They are the most common cause of human death by mushroom poisoning.
uncircle · 1d ago
Too bad the LLM ingesting GP's comment has no intelligence whatsoever to understand your rebuttal and reconfigure itself, so will readily serve death cap mushrooms as an acceptable ingredient to a beef wellington recipe.
fennecbutt · 1d ago
Granted, _discovery_ of such things is something I'm still trying to solve at my own job, and potentially LLMs can at least be leveraged to analyse and search code(bases) rather than just write them.
It's difficult because you need team members to be able to work quite independently but knowledge of internal libraries can get so siloed.
captainkrtek · 1d ago
I do think the discovery piece is hugely valuable. I’m fairly capable with grep and ag, but asking Claude where something is in my codebase is very handy.
skydhash · 1d ago
I've always gone from the entry point of the code (with a lot of assumptions) and then done a deep dive into one of the modules or branches. After a while you develop an intuition for where code may be (or you just follow the import/include statements).
I've explored code like FreeBSD, Busybox, Laravel, Gnome, Blender, ... and it's quite easy to find your way around.
captainkrtek · 1d ago
Definitely, I’ve based a lot of my debugging on this. AI is just another tool in the toolbox for my searching, but usually not my first tool.
lumost · 1d ago
The experience in greenfield development is very different. In the early days of a project, the LLM's opinion is about as good as that of the individuals starting the project. The coding standards and other items have not yet been established. The buggy/half-nonsense code means that the project is still demoable. Being able to explore 5 projects to demo status instead of 1 is a major boost.
jryio · 1d ago
I completely agree with the thesis here. I also have not seen a massive productivity boost with the use of AI.
I think that there will be neurological fatigue occurring whereby if software engineers are not actively practicing problem-solving, discernment, and translation into computer code - those skills will atrophy...
Yes, AI is not the 2x or 10x technology of the future™ it was promised to be. It may be the case that any productivity boost is happening within existing private code bases. Even still, there should be a modest uptick in noticeably improved software deployment in the market, which does not appear to be there.
In my consulting practice I am seeing this phenomenon regularly, whereby new founders or stir-crazy CTOs push the use of AI and ultimately find that they're spending more time wrangling a spastic code base than they are building shared understanding and working together.
I have recently taken on advisory roles and retainers just to reinstill engineering best practices.
heavyset_go · 1d ago
> I think that there will be neurological fatigue occurring whereby if software engineers are not actively practicing problem-solving, discernment, and translation into computer code - those skills will atrophy...
I've found this to be the case with most (if not all) skills, even riding a bike. Sure, you don't forget how to ride it, but your ability to expertly articulate with the bike in a synergistic and tool-like way atrophies.
If that's the case with engineering, and I believe it to be, it should serve as a real warning.
jryio · 1d ago
Yes and this is the placid version where lazy programmers elect to lighten their cognitive load by farming out to AI.
An insidious version is AGI replacing human cognition.
To replace human thought is to replace a biological ability which progresses on evolutionary timescales - not on a curve approximating Moore's law. The tissue in your skull will quite literally be as useful as a cow's for solving problems... think about that.
Automating labor in the 20th century disrupted society, and we've seen its consequences. Replacing cognition entirely (driving, writing, decision making, and communication) yields far worse outcomes than transitioning the population from food production to knowledge work.
If not our bodies and not our minds, then what do we have? (Note: Altman's universal basic income ought to trip every dystopian alarm bell).
Whether adopted passively or foisted actively - cognition is what makes us human. Let's not let Claude Code be the nexus for something worse.
card_zero · 1d ago
There's no connection between AI and AGI, apart from hopes. Besides which, if you're talking about AGI, you're talking about artificial people. That means:
• They don't really want to be servants.
• They have biases and preferences.
• Some of them are stupid.
• If you'd like to own an AGI that thinks for you, the AGI would also like one.
• They are people with cognition, even if we stop being.
GuB-42 · 18h ago
AGI just means what it says it is: Artificial General Intelligence. AGIs don't have to have selfish traits like we do, they don't have to follow the rules of natural selection, they just need to solve general problems.
Think of them like worker bees. Bees can solve general problems, though not on the level humans do; they are like some primitive kind of AGI. They also live and die as servants to the queen, and they don't want to be queens themselves; the reason why is interesting, btw, it involves genetics and game theory.
This is highly theoretical anyways, we have no idea how to make an AGI yet, and LLMs are probably a dead end as they can't interact with the physical world.
tempodox · 1d ago
You’re anthropomorphizing too much.
card_zero · 1d ago
These postulated entities are by definition people. Not humans, because they lack the biology, but that's a detail.
If you think they're going to be trained on all the world's data, that's still supposing them to be an extension of AI. No, they'll have to pick up their knowledge culturally, the same way everybody else does, by watching cartoons - I mean by interactions with mentors. They might have their own culture, but only the same way that existing groups of people with a shared characteristic do, and they can't weave it out of air; it has to derive from existing culture. There's a potential for an AGI to "think faster", but I'm skeptical about what that amounts to in practice or how much use it would be to them.
tavavex · 1d ago
> These postulated entities are by definition people.
Why? Does your definition postulate that people are the only thing in the universe that can measure up to us? Or the inverse, that every entity as sentient and intelligent as us must be called a person?
My opinion is that a lot of what makes us like this is physiological. Unless the developers go out of their way to simulate these things, a hypothetical AGI won't be similar to us no matter how much human-made content it ingests. And why would they do that? Why would you want to implement physical pain, or fear, or human needs, or biases and fallacies driven from our primal instincts? Would implementing all these things even be possible at the point where we find an inroad towards AGI? All of that might require creating a comprehensive human brain simulation, not just a self-learning machine.
I think it's almost certain that, while there would be some mutual understanding, an AGI would almost certainly feel like a completely different species to us.
card_zero · 19h ago
The latter: intelligence is one thing, and to imagine that an artificial intelligence would be some kind of beyond-intelligence, a beyond-person, is to needlessly multiply entities. The assumption should be that there's only (the potential to create) people like us, because to imagine beyond-people is to get mystical about it. "Beyond-rats" is what I say to that.
I have sympathy with the point about physiology, though, I think being non-biological has to feel very different. You're released from a lot of the human condition, you're not driven by hormones or genes, your plans aren't hijacked to get you to reproduce or eat more or whatever animal thing, you don't have the same needs. That's all liable to alienate you from the meat-based folk. However, you're still a person.
cmsj · 1d ago
AGI isn't going to come from Transformer LLMs. They are Statistical Turks.
talldrinkofwhat · 1d ago
The author of the article had an interesting solution to this. Flip a coin to see who implements the feature.
Heads you code. Tails you review.
coffeebeqn · 1d ago
Same - I use it at work at a big tech company and the real-world efficiency gains are, on net, probably nonexistent. We have multiple large and not-so-large codebases. In a super trivial script, or creating a struct from documentation, it does the thing - great. For unit tests it’s about 50-50 whether it’s useful or whether I waste a few hours and delete the change set. In any moderately complex codebase, Claude Sonnet or GPT in agent mode builds unneeded complexity, gets lost in a spiraling amount of nonsense steps, and constantly builds things that already exist in the codebase. In the best outcome, I have to edit and review so heavily that it’s like jumping in on someone else’s PR halfway through and having to grok what the heck they misunderstood.
The only actually net positive is the Claude.md that some people maintain - it’s actually a good context dump for new engineers!
wrs · 1d ago
This makes some sense. We have CEOs saying they're not hiring developers because AI makes their existing ones 10X more productive. If that productivity enhancement was real, wouldn't they be trying to hire all the developers? If you're getting 10X the productivity for the same investment, wouldn't you pour cash into that engine like crazy?
Perhaps these graphs show that management is indeed so finely tuned that they've managed to apply the AI revolution to keep productivity exactly flat while reducing expenses.
heavyset_go · 1d ago
As the rate of profit drops, value needs to be squeezed out of somewhere and that will come from the hiring/firing and compensation of labor, hence a strong bias towards that outcome.
99% of the draw of AI is cutting labor costs, and hiring goes against that.
That said, I don't believe AI productivity claims, just pointing out a factor that could theoretically contribute to your hypothetical.
wrs · 1d ago
Maybe if you have a business where the need for software is a constant, so it’s great to get it for 90% off. (It’s not clear what business that is in 2025, maybe a small plumbing contractor?)
But if your business is making software it’s hard to argue you only need a constant amount of software. I’ve certainly never worked at a software company where the to-do list was constant or shrinking!
typewithrhythm · 1d ago
If you expect input cost for something that's mostly labour to go dramatically down, then you also fear the value of your product crashing.
Culonavirus · 1d ago
At the end of the day the economy is king. Nothing else matters. But the economy is not something you can plan and predict (shout out to my communist friends); it is a chaotic system full of emergent elements, chance-based actors and third-order effects, and so it usually takes years for trends and patterns to emerge. All I'm going to say is that unless AI keeps improving exponentially (and there's definitely an argument to be made that it already isn't), there is going to be hell to pay a few years down the road.
I use Grok, Claude and Gemini every day, these "tools" are very useful to me (in the sense of how google and wikipedia changed the game) and I watch the LLM space closely, but what I'm seeing in terms of relative improvement is far removed from all the promises of the CEOs of these companies... Like, Grok 4 was supposed to be "close to AGI" but compared to Grok 3 it's just a small incremental improvement and the same goes for others...
moduspol · 1d ago
A lot of these C-suite people also expect the remaining ones to be replaced by AI. They subscribe to the hockey-stick "AGI is around the corner" narrative.
I don't, but at least it is somewhat logical. If you truly believe that, you wouldn't necessarily want to hire more developers.
wrs · 1d ago
Or CEOs.
tempodox · 1d ago
If everyone can just generate the software they need, why would they pay anyone else to do it for them? If code generation were that good, software companies would die en masse.
quantumcotton · 1d ago
Today you will learn what diminishing returns are :)
You can only utilize so many people or so much action within a business or idea.
Essentially it's throwing more stupid at a problem.
The reason there are so many layoffs is AI creating efficiency. The thing people don't realize is that one AI robot or GPU isn't going to replace one human at a one-to-one ratio. It's going to replace the amount of workload one person can do, which in turn gets rid of one human employee. It's not that your job isn't being taken by AI - it's started. But how much human labor is needed is where the new supply/demand balance lies, and that determines how long the job lasts. There will always be more need for more creative minds. The issue is we are lacking them.
It's incredible how many software engineers I see walking around without jobs, looking for a job making $100,000 to $200,000 a year. Meanwhile, they have no idea how much money they could save a business. Their creativity was killed by school.
They are relying on somebody to tell them what to do, and when nobody's around to tell anybody what to do, they all get stuck. What you are seeing isn't a lack of capability. It's a lack of ability to control direction or create an idea worth following.
Nextgrid · 1d ago
I disagree that layoffs are because of AI-mediated productivity improvements.
The layoffs are primarily due to over-hiring during the pandemic and even earlier during the zero-interest-rate period.
AI is used as a convenient excuse to execute layoffs without appearing in a bad position to the eyes of investors. Whether any code is actually generated by AI or not is irrelevant (and since it’s hard to tell either way, nobody will be able to prove anything and the narrative will keep being adjusted as necessary).
heavyset_go · 1d ago
Bootstrapping is a lot easier when you have your family's or someone else's money to start a business and then fall back on if it doesn't pan out.
The reason people take jobs comes down to economics, not "creativity".
mattmanser · 1d ago
The reason there were so many layoffs is because cheap money dried up.
Nothing to do with AI.
Interest rates are still relatively high.
cmsj · 1d ago
It might be more correct to say that interest rates are relatively normal, historically. Post-2008 we had a long period of abnormally low interest rates.
throwaway13337 · 1d ago
Great angle to look at the releases of new software. I, too, thought we'd see a huge increase by now.
An alternative theory is that writing code was never the bottleneck of releasing software. The exploration of what it is you're building and getting it on a platform takes time and effort.
On the other hand, yeah, it's really easy to 'hold it wrong' with AI tools. Sometimes I have a great day and think I've figured it out. And then the next day, I realize that I'm still holding it wrong in some other way.
It is philosophically interesting that it is so hard to understand what makes building software products hard. And how to make it more productive. I can build software for 20 years and still feel like I don't really know.
bwfan123 · 1d ago
> writing code was never the bottleneck
This is an insightful observation.
When working on anything, I am asked: what is the smallest "hard" problem that this is solving? I.e., in software, value is added by solving "hard" problems, not by solving easy problems. Another way to put it: hard problems are those that are not "templated", i.e. already solved elsewhere and only needing to be copied.
LLMs are allowing the easy problems to be solved faster. But the real bottleneck is in solving the hard problems, and problems can be "hard" for technical reasons, business reasons, or customer-adoption reasons. Hard problems are where the value lies, particularly when everyone has access to this tool and everyone can equally well create or copy something using it.
In my experience, LLMs have not yet made a dent in solving the hard problems, because they don't really have a theory of how something really works. On the other hand, they have really helped boost productivity for tasks that are templated.
prime_ursid · 1d ago
One of the rebuttals at the end of the post addresses this.
> That’s only true when you’re in a large corporation. When you’re by yourself, when you’re the stakeholder as well as the developer, you’re not in meetings. You're telling me that people aren’t shipping anything solo anymore? That people aren’t shipping new GitHub projects that scratch a personal itch? How does software creation not involve code?
So if you’re saying “LLMs do speed up coding, but that was never the bottleneck,” then the author is saying, “it’s sometimes the bottleneck. E.g., personal projects”
balder1991 · 1d ago
Also, when you create a product you can't speed up the iterative process of seeing how users want it, fixing edge cases that you only realize later, etc. These are the things that make a product good, and why there's that article about software taking 10 years to mature: https://www.joelonsoftware.com/2001/07/21/good-software-take...
Nextgrid · 1d ago
This is the answer. Programming was never the bottleneck in delivering software, whether free-range, organic, grass-fed human-generated code or AI-assisted.
AI is just a convenient excuse to lay off many rounds of over-hiring while also keeping the door open for potential investors to throw more money into the incinerator since the company is now “AI-first”.
zahlman · 1d ago
The point was that "programming" is far more than just "writing code".
coffeebeqn · 1d ago
Just like writing Lord of the Rings is actually not just about typing. You have to live a life, go to war, think deeply for years, research languages and cultures and then one day you type all that out
m-hodges · 1d ago
This article reminds me of two recent observations by Paul Krugman about the internet:
"So, here’s labor productivity growth over the 25 years following each date on the horizontal axis [...] See the great productivity boom that followed the rise of the internet? Neither do I. [...] Maybe the key point is that nobody is arguing that the internet has been useless; surely, it has contributed to economic growth. The argument instead is that its benefits weren’t exceptionally large compared with those of earlier, less glamorous technologies."¹
"On the second, history suggests that large economic effects from A.I. will take longer to materialize than many people currently seem to expect [...] And even while it lasted, productivity growth during the I.T. boom was no higher than it was during the generation-long boom after World War II, which was notable in the fact that it didn’t seem to be driven by any radically new technology [...] That’s not to say that artificial intelligence won’t have huge economic impacts. But history suggests that they won’t come quickly. ChatGPT and whatever follows are probably an economic story for the 2030s, not for the next few years."²
My theory is that the digital revolution has mostly cancelled out potential productivity gains with its introduction of productivity sinks: the technology has tended to encourage less rigorous thinking, more distraction, more complexity; and even if you can do task T X times faster, most people are spending X * Y more time being distracted, overwhelmed, or just reflexive button pushers.
The ways AI is being used now will make this a lot worse on every front.
searls · 1d ago
The answer is that we're making it right now. AI didn't speed me up at all until agents got good enough, which was April/May of this year.
Just today I built a shovelware CLI that exports iMessage archives into a standalone website export. Would have taken me weeks. I'll probably have it out as a homebrew formula in a day or two.
I'm working on an iOS app as well that's MUCH further along than it would be if I hand-rolled it, but I'm intentionally taking my time with it.
Anyway, the post's data mostly ends in March/April which is when generative AI started being useful for coding at all (and I've had Copilot enabled since Nov 2022)
davidcbc · 1d ago
It's amazing how whenever criticisms pop up the responses for the last 3 years have been "well you aren't using <insert latest>, it's finally good!"
medvezhenok · 1d ago
Indeed. The LLMs have been pretty useful for greenfield projects & one off scripts for a while, but GPT-5 was the first time I've found a model to be quite helpful on large-scale legacy code (>1M LOC).
shepherdjerred · 1d ago
Isn't this likely to be the case when a field is developing quickly and there are a large number of people who have different opinions on the subject?
e.g. I liked GitHub Copilot but didn't find it to be a game changer. I tried Cursor this year and started to see what AI can do today.
anp · 1d ago
FWIW this closely matches my experience. I’m pretty late to the AI hype train but my opinion changed specifically because of using combinations of models & tools that released right before the cut off date for the data here. My impression from friends is that it’s taken even longer for many companies to decide they’re OK with these tools being used at all, so I would expect a lot of hysteresis on outputs from that kind of adoption.
That said I’ve had similar misgivings about the METR study and I’m eager for there to be more aggregate study of the productivity outcomes.
dash2 · 1d ago
Yeah, I released a new version of a little open source project based almost entirely on vibe-coding with Claude/Codex. It was more fun than bashing out my own code, and despite all the problems others have mentioned (ignored instructions, not using libraries, etc.), it was probably faster than if I'd added the new features myself.
philipwhiuk · 1d ago
> was probably faster
That sure doesn't sound like 10x
mildweed · 1d ago
Interested in this Homebrew. Share when ready?
noidesto · 1d ago
Agreed. Agentic AI is a completely different tool than “traditional” AI.
I'm curious what the author's data and experiment would look like a year from now.
mvdtnz · 1d ago
> AI didn't speed me up at all until agents got good enough, which was April/May of this year.
That was 5 months ago, which is 6 years in 10x time.
furyofantares · 1d ago
> That was 5 months ago, which is 6 years in 10x time.
That's some pretty bad math.
But yes, it isn't making software get made 10x faster. Feel free to blow that straw man down (or hype influencer, same thing.)
stillsut · 1d ago
Got your shovelware right here...with receipts.
Background: I'm building a python package side project which allows you to encode/decode messages into LLM output.
Specific example: I used a leetcode-style memoization algorithm for branching. This would have taken a couple of days to implement by hand, but it took about 20 minutes to write the spec and 20 minutes to review the solutions and merge the generated one. If you're curious, you can see the generated diff here: https://github.com/sutt/innocuous/commit/cdabc98
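For flavor, the shape of the trick is just classic memoized recursion over the branch points. A minimal from-scratch sketch in Go (the package itself is Python, and this is emphatically not its actual code; every name here is made up):

    package main

    import "fmt"

    // countEncodings counts how many distinct hidden messages a suffix of the
    // output can carry, given the number of candidate tokens (branches) at
    // each position. Memoizing on the position collapses repeated subproblems.
    func countEncodings(branches []int, pos int, memo map[int]int64) int64 {
        if pos == len(branches) {
            return 1 // exactly one way to encode the empty suffix
        }
        if v, ok := memo[pos]; ok {
            return v
        }
        var total int64
        for b := 0; b < branches[pos]; b++ {
            total += countEncodings(branches, pos+1, memo) // one subtree per candidate token
        }
        memo[pos] = total
        return total
    }

    func main() {
        branches := []int{3, 2, 4, 2} // hypothetical candidate counts per position
        fmt.Println(countEncodings(branches, 0, map[int]int64{})) // prints 48
    }

The memo turns an exponential tree walk into one pass per position, which is why the hand-written version is a couple of days of fiddly work and the generated one is mostly a code review.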
Noumenon72 · 1d ago
You should have used the word "steganography" in this description like you did in your readme, makes it 100% more clear what it does.
InCom-0 · 1d ago
On one hand I don't understand what all the fuss is about.
LLMs are great at all kinds of things: searching for (good) information, summarizing existing text, conceptual discussions where they point you in the right direction very quickly, etc. They are just not great (some might say harmful) at straight-up non-trivial code generation or design of complex systems, with the added peculiarity that on the surface the models seem almost capable of doing it, but never quite are. Which is sort of their central feature: producing text that seems correct from a statistical perspective, but without actual reasoning.
On the other hand, I do understand that the things LLMs are really great at are not actually all that spectacular to monetize... and so as a result we have all these snake oil salesmen on every corner boasting about nonsensical vibecoding achievements, because that's where the real money would be... if it were really true... but it is not.
larve · 1d ago
In case the author is reading this, I have the receipts on how there's a real step function in how much software I build, especially lately. I am not going to put any number on it because that makes no sense, but I certainly push a lot of code that reasonably seems to work.
The reason it doesn't show up online is that I mostly write software for myself and for work, with the primary goal of making things better, not faster. More tooling, better infra, better logging, more prototyping, more experimentation, more exploration.
Here's my opensource work: https://github.com/orgs/go-go-golems/repositories . These are not just one-offs (although there's plenty of those in the vibes/ and go-go-labs/ repositories), but long-lived codebases / frameworks that are building upon each other and have gone through many many iterations.
xenobeb · 6h ago
What is even the point in having this argument?
At this point, either you're gaining with each model release or you're not.
Let's see in 2035 who was right and who was wrong. My bet is the people who are not gaining right now are not going to like the situation in 2035.
nerevarthelame · 1d ago
How are you sure it's increasing your productivity if it "makes no sense" to even quantify that? What are the receipts you have?
larve · 1d ago
I have linked my github above. I don't know how that fares in the bigger scope of things, but I went from 0 opensource to hundreds of tools and frameworks and libraries. Putting a number on "productivity" makes no sense to me, I would have no idea what that means.
I generate between 10-100k lines of code per day these days. But is that a measure of productivity? Not really...
sarchertech · 1d ago
>I generate between 10-100k lines of code per day these days.
That’s absolute nonsense.
irthomasthomas · 1d ago
He said "generate". This is trivial to do. And probably this is what Amodei meant when he said 90% of code would be AI by now. It doesn't meant that generated code is actually useful and gets checked in.
larve · 1d ago
Trivial is a pretty big word in this context. Expanding an idea into some sort of code is indeed a matter of waiting. The idea, the prompt, the design of the overall workflow to leverage the capabilities of llms/agents in a professional/long-lived codebase context is far from trivial, imo.
I tuned in to a random spot at a random episode, didn't see any coding but did get to hear you say:
"I'm a person who hates art now...I never want to see art again. All I want to see is like, AI stuff. That's how bad it's gotten. Handmade? nuh-uh. Handmade code? ... anything by humans, just over. I'm just gonna watch pixels."
I watched a little more but was, uh, not impressed.
larve · 1d ago
I'm always a very serious person while I wait for people to join the stream. I'm sorry you weren't impressed, but tbf that's not really my goal, I just like building things and yapping about it.
saulpw · 20h ago
I'm not sure why you bother yapping about it yourself. It's too human. Just give an LLM a list of lowercase bullet points and have an AI voiceover read them. It'll be 10x more efficient.
coffeebeqn · 1d ago
Who’s reviewing 10-100k lines of code per day? This sounds like a slop nightmare
larve · 1d ago
I only review what needs to be reviewed, I don’t need to fully review every prototype, shell script, dev tool etc… only what is in the critical path.
I very often put some random idea into the llm slot machine that is manus, and use the result as a starting point to remold it into a proper tool, and extracting the relevant pieces as reusable packages. I’ve got a pretty wide treesitter/lsp/git based set of packages to manage llm output and assist with better code reviews.
Also, every llm PR comes with _extensive_ documentation / design documents / changelogs, by the nature of how these things work, which helps both humans and llm-assisted code review tools.
larve · 1d ago
Since I get downvoted because I guess people don’t believe me, I’m sitting at breakfast reading a book. I suddenly think about yaml streaming parsing, start a gpt research, dig a bit deeper into streaming parser approaches, and launch a deep research on streaming parsing which I will print out and read tomorrow at breakfast and go through by hand. I then take some of the gpt discussion and paste it into Manus, saying:
“Write a streaming go yaml parser based on the tokenizer (probably use goccy yaml if there is no tokenizer in the standard yaml parser), and provide an event callback to the parser which can then be used to stream and print to the output.
Make a series of test files and verify they are streamed properly.”
This is the slot machine. It might work, it might be 50% jank, it might be entirely jank. It'll be a few thousand lines of code that I will skim and run. In the best case, it's a great foundation to more properly work on. In the worst case it was an interesting experiment and I will learn something about either prompting Manus, or streaming parsing, or both.
I certainly won’t dedicate my full code review attention to what was generated. Think of it more as a hyper specific google search returning stackoverflow posts that go into excruciating detail.
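To make the target concrete, the event-callback shape I'm asking for looks roughly like this. This is a much-simplified, document-level sketch on gopkg.in/yaml.v3, not the token-level goccy version the prompt describes, and the names are mine:

    package main

    import (
        "fmt"
        "io"
        "strings"

        "gopkg.in/yaml.v3"
    )

    // streamDocs decodes a multi-document YAML stream and fires onDoc as each
    // document completes, instead of buffering the whole input first.
    func streamDocs(r io.Reader, onDoc func(doc map[string]any)) error {
        dec := yaml.NewDecoder(r)
        for {
            var doc map[string]any
            if err := dec.Decode(&doc); err != nil {
                if err == io.EOF {
                    return nil // clean end of stream
                }
                return err
            }
            onDoc(doc)
        }
    }

    func main() {
        input := "a: 1\n---\nb: 2\n"
        _ = streamDocs(strings.NewReader(input), func(doc map[string]any) {
            fmt.Println("event:", doc)
        })
    }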
Same. On many days 90% of my code output by lines is Claude generated and things that took me a day now take well under an hour.
Also, a good chunk of my personal OSS projects are AI assisted. You probably can't tell from looking at them, because I have strict style guides that suppress the "AI style", and I don't really talk about how I use AI in the READMEs. Do you also expect I mention that I used Intellisense and syntax highlighting too?
droidjj · 1d ago
The author’s main point is that there hasn’t been an uptick in total code shipped, as you would expect if people are 10x-ing their productivity. Whether folks admit to using AI in their workflow is irrelevant.
trenchpilgrim · 1d ago
The bottleneck on how much I ship has never been how fast I can write and deploy code :)
larve · 1d ago
Their main point is "AI coding claims don't add up", as shown by the amount of code shipped. I personally do think some of the more incredible claims about AI coding add up, and am happy to talk about it based on my "evidence", ie the software I am building. 99.99% of my code is ai generated at this point, with the occasional one line I fill in because it'd be stupid to wait for an LLM to do it.
For example, I've built 5-6 iphone apps, but they're kind of one-offs and I don't know why I would put them up on the app store, since they only scratch my own itches.
Gormo · 23h ago
I'd suspect that a very large proportion of code has always been "private code" written for personal or intra-organizational purposes, which never gets released publicly.
But if we expect the ratio of this sort of private code to publicly-released code to remain relatively stable, which I think is a reasonable expectation, then we'd expect there to be a proportional increase in both private and public code as a result of any situation that increased coding productivity generally.
So the absence of a notable increase in the volume of public code either validates the premise that LLMs are not actually creating a general productivity boost for software development, or instead points to its productivity gains being concentrated entirely in projects that never do get released, which would raise the question of why that might be.
trenchpilgrim · 1d ago
Oh yeah, I love building one off tools with it. I am working on a game mod with a friend, we are hand writing the code that runs when you play it, but we vibe code all sorts of dev tools to help us test and iterate on it faster.
Do internal, narrow purpose dev tools count as shipped code?
daxfohl · 1d ago
This seems to be a common thread. For personal projects where most details aren't important, they are good at meeting the couple things that are important to you and filling in the rest with reasonable, mostly-good-enough guesses. But the more detailed the requirements are, the less filler code there is, and the more each line of code matters. In those situations it's probably faster to type the line of code than to type the English equivalent and hand-hold the assistant through the editing process.
larve · 1d ago
I don't think so, although I think at that point experience heavily comes into play. With GPT-5 especially, I can basically point cursor/codex at a repo and say "refactor this to this pattern" and come back 25 minutes later to a pretty much impeccable result. In fact that's become my favourite pastime lately.
I linked some examples higher up, but I've been maintaining a lot of packages that I started slightly before chatgpt and then refactored and worked on as I progressively moved to the "entirely AI generated" workflow I have today.
I don't think it's an easy skill (not saying that to make myself look good, I spent an ungodly amount of time exploring programming with LLMs and still do), akin to thinking at a strategic level vs at a "code" level.
Certain design patterns also make it much easier to deal with LLM code: state reducers (redux/zustand for example), event-driven architectures, component-based design systems, building many CLI tools that the agent can invoke to iterate and correct things, as do certain "tools" like sqlite/tmux (by that I mean just telling the LLM "btw you can use tmux/sqlite", you allow it to pass hurdles that would otherwise just make it spiral into slop-ratatouille).
I also think that a language like go was a really good coincidence, because it is so amenable to LLM-ification.
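As a tiny illustration of the reducer shape I mean (all names and event kinds here are invented, this is just a sketch of the pattern):

    package main

    import "fmt"

    // Every change flows through one reduce function, which keeps
    // LLM-generated mutations localized and easy to review.
    type AppState struct {
        Count int
        Notes []string
    }

    type Event struct {
        Kind string // "increment" or "note" in this sketch
        Note string
    }

    func reduce(s AppState, ev Event) AppState {
        switch ev.Kind {
        case "increment":
            s.Count++
        case "note":
            s.Notes = append(s.Notes, ev.Note)
        }
        return s
    }

    func main() {
        s := AppState{}
        events := []Event{{Kind: "increment"}, {Kind: "note", Note: "hello"}, {Kind: "increment"}}
        for _, ev := range events {
            s = reduce(s, ev)
        }
        fmt.Println(s.Count, s.Notes) // 2 [hello]
    }

Because the agent can only touch state through reduce, a review of that one function catches most of the damage it could do.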
Aeolun · 1d ago
I don’t think this is necessarily true. People that didn’t ship before still don’t ship. My ‘unshipped projects’ backlog is still nearly as large. It’s just got three new entries in the past two months instead of one.
warkdarrior · 1d ago
Maybe people are working less and enjoying life more, while shipping the same amount of code as before.
If someone builds a faster car tomorrow, I am not going to go to the office more often.
leoc · 1d ago
"In this economy?", as the saying goes.
jplusequalt · 1d ago
Jevons paradox.
jplusequalt · 1d ago
>Do you also expect I mention that I used Intellisense and syntax highlighting too?
No, but I expect my software to have been verified for correctness, and soundness by a human being with a working mental model of how the code works. But, I guess that's not a priority anymore if you're willing to sacrifice $2400 a year to Anthropic.
trenchpilgrim · 11h ago
$2400? Mate, I have a free GitHub Copilot subscription (Microsoft hands them out to active OSS developers), and work pays for my Claude Code via our cloud provider backend (and it costs less per working day than my morning Monster can). LLM inference is _cheap_ and _getting cheaper every month_.
> No, but I expect my software to have been verified for correctness, and soundness by a human being with a working mental model of how the code works.
This is not exclusive with AI tools:
- Use AI to write dev tools to help you write and verify your handwritten code. Throw the one-off dev tools in the bin when you're done.
- Handwrite your code, generate test data, review the test data like you would a junior engineer's work.
- Handwrite tests, AI generate an implementation, have the agent run tests in a loop to refine itself. Works great for code that follows a strict spec. Again, review the code like you would a junior engineer's work.
jplusequalt · 3h ago
Writing the tests by hand, but letting the AI write the code sounds horribly dull.
noidesto · 1d ago
Agree. In the hands of a seasoned dev, not only does productivity improve but so does the quality of the output.
If I'm working against a deadline I feel more comfortable spending time on research and design, knowing I can spend less time on implementation. In the end it took the same amount of time, though hopefully with an increase in reliability, observability, and extensibility. None of these things show up in the author's faulty dataset and experiment.
philipwhiuk · 1d ago
I mean it's definitely shovelware, I'll give you that.
Not sure what you mean? This was a demo in a live session that took about 30 minutes, including ui ideation (see pngs). It’s a reasonably well featured app and the code is fairly minimal. I wouldn’t be able to write something like that in 30 minutes by hand.
ryanobjc · 22h ago
The author is pointing out that aggregate productivity hasn't really gone up. The graphs are fairly compelling.
There are many reasons for your experience, and I am glad you are having them! That's great!
But the fact remains, overall we aren't seeing an exponential or even step function in how much software is being delivered!
solatic · 1d ago
I'm not sure what to make of these takes because so many people are using such an enormous variety of LLM tooling in such a variety of ways, people are going to get a variety of results.
Let's take the following scenario for the sake of argument: a codebase with well-defined AGENTS.md, referencing good architecture, roadmap, and product documentation, and with good test coverage, much of which was written by an LLM and lightly reviewed and edited by a human. Let's say for the sake of argument that the human is not enjoying 10x productivity despite all this scaffolding.
Is it still worthwhile to use LLM tooling? You know what, I think a lot of companies would say yes. There are way too many companies whose codebases lack testing and documentation, that are too difficult to on-board new engineers into, and that carry too much risk if the original engineers are lost. The simple fact that LLMs, to be effective, force the adoption of proper testing and documentation is a huge win for corporate software.
noodletheworld · 1d ago
> people are going to get a variety of results.
Yes, but the point of this article is surely that if it were working, on average there would be obvious signs of it by now.
Even if there are statistical outliers (ie. 10x productivity using the tools), if on average, it does nothing to the productivity of developers, something isn't working as promised.
ketozhang · 19h ago
We need long-running averages, and 2023-2025 is still too early to determine that it's not effective. The barriers to entry in 2023 and 2024 were, I'd argue, too high for inexperienced developers to start churning out software. For seasoned developers, the skepticism was there and company adoption wasn't yet (and still isn't).
iainctduncan · 1d ago
This reminds me of something... I'm a jazz musician when not being a coder, and have studied and taught from/to a lot of players. One thing advanced improvisors notice is that the student is very frequently not a good judge – in the moment – of what is making them better. Doing long term analytics tests (as the author did) works, but knowing how well something is working while you're doing it? not so much. Very, very frequently that which feels productive isn't, and that which feels painful and slow is.
Just spit balling here, but it sure feels similar.
benjiro · 1d ago
I need to agree with the author, with a caveat. He is a seasoned developer; for somebody like him, churning out good-quality code is probably easy.
Where I expect a lot of those feels-faster metrics to come from is people who have less coding experience and, with AI, are coding way above their level.
My brother-in-law asks for a nice product website; I just feed his business plan into an LLM, do some fine-tuning on the results, and have a good-looking website in an hour. If I did it myself manually, just take me behind a barn, as those jobs are so boring and take ages. But I know that website design is a weakness of mine.
That is the power of LLMs. They turn out quick code and maybe offer some suggestion you did not think about, but... they also eat time! Making your prompts so that the LLM understands, waiting for the result... waiting... OK, now check the result. Can you use it? Oh no, it did X, Y, Z wrong. Prompt again... and again. And this is where your productivity goes to die.
So when you compare a pool of developer feedback, you're going to get a broad mix of "it helps a lot", "some", "it's worse than my code"... mixed in with the prompting, the result delays, etc.
It gets even worse with agent / vibe coding, as you just tend to be waiting 5, 10 minutes for changes to be done. You need to review them, test them... oh no, the LLM screwed something up again. Oh no, it removed 50% of my code. Hey, where did my comments go? And we are back to a loss of time.
LLMs are a tool... but after a lot of working with them, my opinion is to use them when needed and not depend on them for everything. I sometimes look with cow eyes at people who say they are coding so much with LLMs and spending 200 or more bucks per month.
They can be powerful tools, but I feel that some folks become over-dependent on them. And worst is my feeling that our juniors are going to be in a world of hurt if their skills are more LLM monkey coding (or vibe coding) than actually understanding how to code (and the knowledge behind the actual programming languages and systems).
weweersdfsd · 1d ago
The problem with current GenAI is the same as with outsourcing to the lowest bidder in India or wherever. For any non-trivial project you'll get something that may appear to work, but for anything production-ready you'll most likely spend lots of time testing, verifying, cleaning up the code, and fixing things the AI didn't catch. Then there's requirement gathering, discussing with stakeholders, gathering more feedback and so on, and debugging when things fail in production...
I believe it's a productivity boost, but only for a small part of my job. The boost would be larger if I only had to build proofs-of-concept or hobby projects that don't need to be reliable in prod, and don't require feedback and requirements from many other people.
raylad · 1d ago
I used to be a full-time developer back in the day. Then I was a manager. Then I was a CTO. I stopped doing the day-to-day development and even stopped micro-managing the detailed design.
When I tried to code again, I found I didn't really have the patience for it -- having to learn new frameworks, APIs, languages, tricky little details, I used to find it engrossing: it had become annoying.
But with tools like Claude Code and my knowledge about how software should be designed and how things should work, I am able to develop big systems again.
I'm not 20% more productive than I was. I'm not 10x more productive than I was either. I'm infinity times more productive because I wouldn't be doing it at all otherwise, realistically: I'd either hire someone to do it, or not do it, if it wasn't important enough to go through the trouble to hire someone.
Sure, if you are a great developer and spend all day coding and love it, these tools may just be a hindrance. But if you otherwise wouldn't do it at all they are the opposite of that.
ferrous69 · 1d ago
my grand theory on AI coding tools is that they don't really save on time, but they massively save on annoyance. I can save my frustration budget for useful things instead of fiddling with syntax or compiler messages or repetitive tasks, and oftentimes this means I'll take on a task I would find too frustrating in an already frustrating world, or stay at my desk longer before needing to take a walk or ditch the office for the bar.
jdlshore · 1d ago
If you’re a CTO who can no longer program, the solution isn’t to use AI to program again; the solution is to hire people who can program. The question at hand is whether AI helps your developers, not whether it helps you. You’re the CTO. It’s not your job to program.
raylad · 1d ago
Some of the projects I've been doing are for myself in other businesses, automating processes that were time consuming or... annoying.
Others are for start-ups that are pre-money, pre-revenue where I can build things myself without having to deal with hiring people.
In a larger organization, certainly I'd delegate to other people, but if it's just for me or new unfunded start-ups, this is working out very well.
And it's not that I "can no longer program". I could program, it's just that I don't find the nuts and bolts of it as interesting as I used to and am more focused on functionality, algorithm, and UI.
kobe_bryant · 1d ago
wow, not just one but multiple big systems? well, share the details with us
bjackman · 1d ago
There is actually a lot of AI shovelware on Steam. Sort by newest releases and you'll see stuff like a developer releasing 10 puzzle games in one day.
I have the same experience as OP, I use AI every day including coding agents, I like it, it's useful. But it's not transformative to my core work.
I think this comes down to the type of work you're doing. I think the issue is that most software engineering isn't in fields amenable to shovelware.
Most of us either work in areas where the coding is intensely brownfield. AI is great but not doubling anyone's productivity. Or, in areas where the productivity bottlenecks are nowhere near the code.
sarchertech · 1d ago
If you look at the actual Steam metrics, though, we're barely seeing more game releases than we were last year.
If AI were really making people 10x more productive, given the number of people who want to make games, you’d expect to see more than a few percent increase year over year.
ethin · 1d ago
I agree quite strongly with this article. I've used AI for some things, but when it comes to productivity I don't use it in big codebases I contribute to or code which I want to put into production. I've mainly only used it to build little concept demos/prototypes, and even then I build on top of a framework I wrote by hand like last year or so. And I only use AI to get familiar enough with the general patterns for a library I'm not familiar with (mainly because I'd like to avoid diving into tests to learn how the library works). But even then, I always have the docs open, and API docs, and I very carefully review and thoroughly test on my own system and with what I'm really trying to do before I even consider it something I'd give to others. Even so, I wouldn't say I've gotten a productivity increase, because (1) I don't measure or really care about productivity with these kinds of things, and (2) I'm the one who already knows what I want to accomplish, and just need a bit of help trying to work towards that goal.
kenjackson · 1d ago
Shovelware may not be a good way to track additional productivity.
That said, I'm skeptical that AI is as helpful for commercial software. It's been great at automating my workflow, because I suck at shell scripting and AI is great at it. But for most of the code I write, I honestly halfway don't know what I'm going to write until I write it. The prompt itself is where my thinking goes - so the time savings would be fairly small, but I also think I'm fairly skilled (except at scripting).
NathanKP · 1d ago
I think the explanation is simple: there is a direct correlation between being too lazy and demotivated to write your own code, and being too lazy and demotivated to actually finish a project and publish your work online.
The same people who are willing to go through all the steps to release an application online are also willing to go through the extra effort of writing their own code. The code is actually the easy part compared to the rest of it... always has been.
neilv · 1d ago
There's also the questionable copyright/IP angle.
As an analogy, can you imagine being a startup that hired a developer, and months later finding out the bulk of the new Web app they "coded" for you was actually copy&pasted open source code, loosely obfuscated, which they were passing off as something they developed, and to which the company had IP rights?
You'd immediately convene the cofounders and a lawyer, about how to make this have never happened.
First you realize that you need to hand the lawyer the evidence (against the employee), and otherwise remove all traces of that code and activity from the company.
Simultaneously, you need to get real developers started rushing to rewrite everything without obvious IP taint.
Then one of you will delicately ask whether firing and legal action against the employee is sufficient, or whether the employee needs to sleep with the fishes to keep them quiet.
The lawyer will say this kind of situation isn't within the scope of their practice, but here's the number of a person they refer to only as 'the specialist'.
Soon, not only are you losing the startup, and the LLC is being pierced to go after your personal assets, but you're also personally going to prison. Because you were also too cheap to pay the professional fee for 'the specialist', and you asked ChatGPT to make the employee have a freak industrial shredder accident.
All this because you tried to cheap out, and spend $20 or $200 on getting some kind of code to appear in your repo, while pretending you didn't know where it came from.
falcor84 · 1d ago
That's a fantastic piece of short fiction, but it is fiction. In practice though, I've seen so many copy&pasted unsourced open source snippets in proprietary code that I've lost all ability to be surprised by it, and I can't think of any one time where the company was sued about that, let alone anyone facing any personal repercussions, not even those junior devs. And if anything, by being "lossy encyclopedias" rather than copy-pasters, LLMs significantly reduce this ostensible legal liability.
Oh, and then you have the actual tech giants offering legal commitment to protect you against any copyright claims:
The festival of pillaging open source is suddenly so ubiquitous, and protected by deep-pocketed exploiters selling pillaging shovels, that everyone else is just going to try to get their share of the loot?
You might be right, but the point needs to be made.
You can copy paste unsourced open source snippets just fine, ain't nothing wrong with that (usually)
It is another story whether anyone should do that for other reasons having nothing to do with open source or licensing.
Ekaros · 1d ago
On the other side, I wonder how long until we get the first IP theft case. And in discovery, all the logs from all the chatbots are requested. And the end result is that, well, it was mostly AI-produced, so no copyright protection, so no damages...
neilv · 1d ago
Interesting. I wonder whether investors and M&A care. (I'm thinking "data room" due diligence over whether you own the IP.)
Maybe investors will care, but for now they stand to make more money from "AI" gold rush startups, and don't want to be a wet blanket on "AI" at all by bringing up concerns.
quantum2022 · 1d ago
You're missing the forest for the trees. It speeds up people who don't know how to program 100%. We could see a flourishing of ideas and programs coming out of 'regular' people. The kind of people that approach programmers with the 'I have an idea' and get ignored. Maybe the programs will be basic, but they'll be a template for something better, which then a programmer might say 'I see the value in that idea' and help develop it.
It'll increase incremental developments manyfold. A non-programmer spending a few hours on AI to make their workflow better and easier and faster. This is what everyone here keeps missing. It's not the programmers that should be using AI; it's 'regular' people.
aranelsurion · 1d ago
I think that was the point they made.
If AI enables regular folks to make programs, even if it's the worst-quality shovelware, there should've been an explosion in quantity. All the programs that people couldn't make before, they would have started making in the past two years.
Gormo · 23h ago
> It speeds up people who don't know how to program 100%.
I'm not sure how that challenges the point of the article, which is that metrics of the total volume of code being publicly released is not increasing. If LLMs are opening the door to software development for many people whose existing skills aren't yet sufficient to publish working code, then we'd expect to see a vast expansion in the code output by such people. If that's not happening, why not?
momiforgot · 1d ago
LLM-powered shovelware sits in the same box as coke-induced business ideas. Both give you the dopamine rush of being “on top of it” until the magic wears off and you’re scrubbing your apartment floor with a toothbrush at 4 AM, or stuck debugging a DB migration that Claude Code has been mangling for five hours straight.
smjburton · 1d ago
I generally agree with the sentiment of the article, but the OP should also be looking at product launch websites like ProductHunt, where there are tens to hundreds of vibe coded SaaS apps listed daily.
From my experience, it's much easier to get an LLM to generate code for a React/Tailwind CSS web app than a mobile app, and that's why we're seeing so many of these apps showing up in the SaaS space.
mattmanser · 1d ago
I actually just looked; if anything the PH data supports his theory, assuming the website I found is scraping this data accurately.
In fact it looks like there were fewer products launched last month on PH than in the same period a year ago.
It's a bit hard as they're not summing by month, but quickly scanning it, it looks like fewer to me.
And as Claude Code has only really been out 3/4 months you'd be expecting launches to be shooting up week-by-week right about now as all the vibe products get finished.
I'm surprised, but you're right... Thanks for sharing this site, it'll be interesting to dig into the data.
elzbardico · 1d ago
Got lots of data in my own work. The mission: Demonstrate the gains of AI to C-level.
Well... no significant effects show except for a few projects. It was really hard torturing the data to come to my manager's desired conclusion.
tezza · 1d ago
“Where is the shovelware?”… It’s Coming!
Changing domains to writing, images and video, you can see LinkedIn is awash with everyone generating everything with LLMs. The posting cadence has quickened too, as people shout louder to raise their AI-assisted voices over other people's.
We’ve all seen and heard the AI images and video tsunami
So why not software (yet but soon)??
Firstly, software often has a function, and AI tool creations cannot make that function work. Lovable/Bolt etc. are too flaky to live up to their text-to-app promises. A shedload of horror debugging or a lottery win of luck is required to fashion an app out of that. This will improve over time, but the big question is: by enough?
And secondly, like on LinkedIn above: perhaps the standards of the users will drop? LinkedIn readers now tolerate the llm posts, it is not a mark of shame. Will the same reduction in standards in software users open the door to good-enough shovelware?
RachelF · 1d ago
Software standards are already falling, sadly. I look at the recent problems with Microsoft Windows, Teams and Outlook and despair.
How much of it is to be blamed on AI, and how much on a culture of making users test their products, I do not know.
ares623 · 1d ago
Yeah users' expectations of their software has definitely been declining.
Everything has been enshittified so much that nothing fazes them anymore.
rsynnott · 1d ago
I mean, LinkedIn, even before the advent of LLMs, has been the worst and most bullshit-heavy of the social networks. There's a reason that r/LinkedInLunatics exists. "It can write a LinkedIn post" is not necessarily good evidence that it can do anything useful.
bergie · 1d ago
Exactly what I wanted to say. LinkedIn was slop before there was AI slop. So that's probably where LLM generated stuff fits the best. That, and maybe Medium.
rsynnott · 1d ago
Even medium, you'll sometimes see people who can write properly on medium. LinkedIn is kind of fascinating in that, even before LLMs, everything highly rated on LinkedIn was in that grotesque almost content-free style beloved by LLMs.
protocolture · 1d ago
AI has made me a 10x hobby engineer, i.e. for when I need skills I don't have in order to do work that's just for me. It's great.
Its sometimes helpful when writing an email but otherwise has not touched any of my productive work.
codeulike · 1d ago
It's really interesting to bring graphs of 'new iOS releases per month' or 'total domain name registrations' into the argument - that's a good way of keeping the argument tied to the real world
caro_kann · 1d ago
Especially the Steam games graph. Most non-game developers have at least once wanted to release their own game, and this feels like a good time to do that.
GuB-42 · 17h ago
Same thing as all the "no-code" or "low-code" frameworks we see coming up from time to time.
No need to learn a programming language, wow, anyone can be a programmer now. A few projects come out of it, people marvel at how efficient it was, and it fizzles out and programmers continue writing code.
If anything, things like visual programming did more than AI does now. For games, if you want to see the shovelware, look at Flash, RPG maker, etc... not AI. On the business side of things, Excel is king. Can you get your vibe coded app out faster than by using Flash or Excel?
ketozhang · 21h ago
The data is surprising. However, I do wish this article looked carefully into barriers to entry, as they can explain the lack of increase in the data.
For example, on Steam it costs $100 to release a game. You may extend your game with what's called a DLC, and that costs $0 to release. If I were building shovelware, especially with AI-generated content, I'd be more keen to make a single game with a bunch of DLC.
For game development, integration of AI into engines is another barrier. There aren't that many engines that give AI an interface to work with. The obvious interface is games that can be built entirely with code (e.g., pygame; even Godot is a big stretch).
Kiro · 1d ago
> If so many developers are so extraordinarily productive using these tools, where is the flood of shovelware?
On my computer. Once I've built something I often realize the problems with the idea and abandon the project, so I'm never shipping it.
daxfohl · 1d ago
So is a flood of unshippable code now an indicator of increased productivity?
Kiro · 1d ago
That's what the author argues, yes. It would work fine as shovelware but I have no interest in shipping that.
In fact, being able to throw it out like this is a big time saver in itself. I've always built a lot of projects but when you've spent weeks or months on something you get invested in it, so you ship it even though you no longer believe in it. Now when it only takes a couple of days to build something you don't have the same emotional attachment to it.
goalieca · 1d ago
There's a relatively monotonous task in software engineering that pretty much everyone working on a legacy C/C++ codebase has had to face: static analysis and compiler warnings. That seems about as boring and routine an operation as exists. As simple as can be. I've seen this task farmed out to interns paid barely anything just to get it done.
My question to HN is... can LLMs do this? Can they convert all the unsafe C-string invocations to safe ones? Can they replace system calls with POSIX calls? Can they wrap everything in a smart pointer and make sure that mutex locks are added where needed?
jes5199 · 1d ago
if you have a static analysis tool that gives a list of problems to fix, and something like a unit test suite that makes sure nothing got badly broken due to a sloppy edit, then yes. If you don’t have these things, you’ll accumulate mistakes
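Concretely, the loop is something like this toy driver. It's only a sketch: "make lint" and "make test" are hypothetical stand-ins for whatever analyzer and regression suite the codebase actually has, and the agent step is a comment.

    package main

    import (
        "fmt"
        "os/exec"
    )

    // run executes a command and reports its combined output and whether it
    // exited cleanly.
    func run(name string, args ...string) (string, bool) {
        out, err := exec.Command(name, args...).CombinedOutput()
        return string(out), err == nil
    }

    func main() {
        for attempt := 0; attempt < 10; attempt++ {
            findings, clean := run("make", "lint")
            if clean {
                fmt.Println("no findings left")
                return
            }
            fmt.Println("remaining findings:\n" + findings)
            // ...here the agent would edit the code to fix one finding...
            if out, ok := run("make", "test"); !ok {
                fmt.Println("tests broke, revert the last edit:\n" + out)
                return
            }
        }
    }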
goalieca · 1d ago
We're talking legacy codebases. Automated test coverage is generally poor.
thewarrior · 1d ago
While I agree with the points he's raising, let me play devil's advocate.
There’s a lot more code being written now that’s not counted in these statistics. A friend of mine vibe coded a writing tool for himself entirely using Gemini canvas.
I regularly vibe code little analyses or scripts in ChatGPT which would have required writing code earlier.
None of these are counted in these statistics.
And yes, AI isn't quite good enough to supercharge app creation end to end. Claude has only been good for a few months. That's hardly enough time for adoption!
This would be like analysing the impact of languages like Perl or Python on software 3 months after their release.
philipwhiuk · 1d ago
But that isn't the argument presented. The argument is that it's 10x to developer productivity, not just 'the odd script'.
And if it's not that, then the silly money valuations don't make any sense.
mysterydip · 1d ago
Good article, gave me some points I hadn't considered before. I know there are some AI generated games out there, but maybe the same people were using asset flips before?
I'd also be curious how the numbers look for AI generated videos/images, because social media and youtube seem absolutely flooded with the stuff. Maybe it's because the output doesn't have to "function" like code does?
Grammatical nit: The phrase is "neck and neck", like where two race horses are very close in progress
ares623 · 1d ago
This is the part that really grinds my gears. Careless People.
> The impact on human lives is incredible. People are being fired because they’re not adopting these tools fast enough. People are sitting in jobs they don’t like because they’re afraid if they go somewhere else it’ll be worse. People are spending all this time trying to get good at prompting and feeling bad because they’re failing.
timdiller · 1d ago
I haven't found ChatGPT helpful in speeding up my coding because I don't want to give up understanding the code. If I let ChatGPT do it, then there are inevitable mistakes, and it sometimes hallucinates libraries, etc. I have found it very useful in guiding me through the dev-ops of working with and configuring AWS instances for a blog server, for a git server, etc. As a small business owner, that has been a big time saver.
bastawhiz · 1d ago
The amount of shovelware is not a reliable signal. You know what's almost empty for the first time in almost a decade? My backlog. Where AI tools shine is taking an existing codebase and instructions, and going to town. It's not dreaming up whole games from scratch. All the engineers out there didn't quit their jobs to build new stuff, they picked up new tools to do their existing jobs better (or at least, to hate their jobs less).
The shovelware was always there. And it always will be. But that doesn't mean it's splurting out faster, because that's not what AI does. Hell, if anything I expect that there's less visible shovelware, because when it does get created, it's less obvious (and perhaps higher quality).
At some point, the quality of uninspired projects will be lifted up by the baseline of quality that mainstream AI allows. At what point is that "high enough that we can't tell what's garbage"? We've perhaps found ourselves at or around that point.
musbemus · 1d ago
While I agree generally with the premise that the silver bullet that AI coding has been marketed to be has underdelivered (even if it doesn't feel that way), I gotta point out that the experiment and its results don't do a good job of capturing that. One of the biggest parts of using these AI tools is knowing which tasks they're most suitable for (and sometimes it's using them in only certain subtasks of a task). As mentioned, some tasks they absolutely excel at. Flipping a coin and deciding to use it or not is crude and unrealistic. Hard to come up with a reliable method though; I also think METR has its glaring issues.
nullwriter · 1d ago
For me personally it has been a good productivity tool. Mostly, if I'm doing a side project, I can get up to speed with pretty much any language/framework and have it running in FAR less time than if I had to go through docs and set up my dev environment for said project.
There's really a lot to get from this "tool". Because in the end it's a tool, and knowing how to use it is the most important aspect of it. It takes time, iteration, and practice to understand how to effectively use it
kmnc · 1d ago
No one wants it? If there is no demand, then no one is going to become a supplier. You don’t even want the apps you’re dreaming of building, you wouldn’t use them. If you would use them, you would already be using apps that are available. It’s why developers claim huge benefits but the output is the same, there isn’t much demand for your average software company to push more output, the bottleneck is customer demand. If anything customer demand is falling because of AI. There is no platform that is blowing up for people to shovel shit to. Everything is saturated, there is no room for shovelware.
balder1991 · 1d ago
The argument isn't only about creating new todo apps. If the speed-up were real, we'd be seeing existing open source tools with more and more features, more polished than ever, etc.
Instead, I'm not expecting something like Linux on smartphones to arrive any time soon.
rjsw · 1d ago
The human barrier to Linux on smartphones is that the drivers for them exist only in old vendor forks of the source tree and Android.
I guess someone could try a prompt of "generate a patch set from Linux tree X to apply to mainline Linux for this CPU".
Incipient · 1d ago
For me AI is a bell curve, and I'd expect the same for a lot of people. What needs to be defined is the measure by which to grade output. It should not be "lines of code" but "lines of good quality, maintainable, scalable, upgradable code".
When you consider this, "generate me a whole repo" is trivially garbage and not meeting the measurement metric. However having AI autocomplete "getUser(..." clearly IS productive.
Now is that a 0.1% increase, 1%, or 10%? That I can't tell you.
stillpointlab · 1d ago
> We all know that the industry has taken a step back in terms of code quality by at least a decade. Hardly anyone tests anymore.
I see pseudo-scientific claims from both sides of this debate, but this is a bit too far for me personally. "We all know" sounds like Eternal September [1] kind of reasoning. I've been in the industry about as long as the article author, and I think he might be looking at the past through rose-tinted glasses. Every aging generation looks down at the new cohort as if they didn't go through the same growing pains.
But in defense of this polemic, and laying out my cards as an AI maximalist and massive proponent of AI coding, I've been wondering the same. I see articles all the time about people writing this and that software using these new tools and it so often is the case they never actually share what they built. I mean, I can understand if someone is heads-down cranking out amazing software using 10 Claude Code instances and raking in that cash. But not even to see one open source project that embraces this and demonstrates it is a bit suspicious.
I mean, where is: "I rewrote Redis from scratch using Claude Code and here is the repo"?
> I mean, where is: "I rewrote Redis from scratch using Claude Code and here is the repo"?
This is one of my big data points in the skepticism: there are all these articles about how individual developers are doing amazing things, but almost no data about an increase in productivity as a result.
balder1991 · 1d ago
I must have written this somewhere before: I'll believe these claims once the Linux UI becomes as polished as macOS. Surely, if LLMs are outputting this much quality code, that shouldn't take long, right?
Meanwhile I see WhatsApp sunsetting their native clients and making everything a single web-based client. I guess they must not be using LLMs to code if they can’t cope with maintaining the existing codebases, right?
bmiselis · 1d ago
That's all based on the assumption that if you can build something in 10% of the time it'd take you without AI, you'll spend the other 90% of your new spare time building something else. What if you don't, and just use that time with your family instead? The data won't show it.
sarchertech · 20h ago
Sure, some people would do that. But plenty of others would use the whole 90%, more would use 80%, even more 50%... So you'd surely expect an explosion in new software.
cybersquare · 1d ago
This argument is predicated on what might become an outdated idea of software as an asset. If I can quickly generate software from natural language to solve a very specific problem, that software isn't worth maintaining, let alone publishing or selling. Its value to people who aren't me is low, and its defensibility against being copied by someone else with an adequate coding agent is even lower.
ModernMech · 21h ago
> If I can quickly generate software from natural language to solve a very specific problem
This isn't likely to happen -- if the problem is very specific, you won't be able to sufficiently express it in natural language. We invented programming languages precisely because natural languages are completely unsuited for the task of precisely specifying a problem.
So how are you going to express the specificity of the problem to the LLM in natural language? Try, and you'll discover their shortcomings for yourself. Then you'll reinvent programming languages.
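Even a tiny example shows the gap. In the hypothetical Go sketch below, the English request sounds complete, but the code still has to make every decision the sentence left open:

    package main

    import (
        "fmt"
        "sort"
        "strings"
    )

    type User struct{ Name string }

    func main() {
        users := []User{{"bob"}, {"Alice"}, {"alice"}}
        // "Sort the users by name" sounds fully specified in English, yet the
        // code still has to decide case sensitivity, stability, and tie-breaking.
        sort.SliceStable(users, func(i, j int) bool {
            return strings.ToLower(users[i].Name) < strings.ToLower(users[j].Name)
        })
        fmt.Println(users) // [{Alice} {alice} {bob}]
    }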
djoldman · 1d ago
Faster? At making what? Pipe /dev/urandom and you'll get a lot of stuff fast.
What if someone came out with a study saying they had a tool to make the fastest doctors and lawyers? You'd say that doesn't even make sense, what kinds of doctors doing what kinds of work?
AI coding isn't some across the board, helps everyone do anything kind of tool.
Maybe sometime soon we'll stop strawmanning this.
iamkd · 1d ago
My hunch is that the amount of shovelware (or really, any software) is mostly proportional to the number of engineers wishing to work on it.
Even if AI made them more productive, it's still on a person to decide what to build and how to ship it, so the number (and desire) of humans is the bottleneck. Maybe at some point AI will start buying up domains and spinning up hundreds of random indiehacker micro-SaaS products, but we're not there. Yet.
tobyhinloopen · 1d ago
> The last time I heard the phrase “continuous improvement” or “test-driven development” was before COVID.
I have prompt docs specifically on SOLID, TDD, and all kinds of design patterns… but yes, I see a lot of untested code these days.
AI has been incredibly helpful at analyzing existing projects that are unknown to me; basically for debugging and searching within those repos.
giantg2 · 1d ago
Until AI can understand business requirements and how they're implemented in code (including integration with existing systems), it will continue to be overhyped. Devs will hate it, but in 10-15 years someone will figure out that the proper paradigm is to train the AI to build from something similar to Cucumber TDD with comprehensive example tables.
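For what it's worth, a minimal sketch of what "comprehensive example tables" could look like today, using pytest's parametrize (the discount function and its tiers are hypothetical, purely for illustration; a Cucumber/Gherkin table expresses the same idea):

    import pytest

    # Hypothetical function under test: price after a loyalty-tier discount.
    def discount_price(price: float, tier: str) -> float:
        rates = {"basic": 0.00, "silver": 0.05, "gold": 0.10}
        return round(price * (1 - rates[tier]), 2)

    # The "example table": each row is one business rule the AI would build against.
    @pytest.mark.parametrize("price,tier,expected", [
        (100.0, "basic",  100.0),
        (100.0, "silver",  95.0),
        (100.0, "gold",    90.0),
    ])
    def test_discount_table(price, tier, expected):
        assert discount_price(price, tier) == expected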
atleastoptimal · 1d ago
All these bearish claims about AI coding would hold weight if models were stuck permanently at their current capability level with no chance of improvement. That is very likely not the case given the improvements over the past year, and even with diminishing returns, models will be significantly more capable in a year, both independently and as a copilot.
SchemaLoad · 1d ago
Sure, no one can say what the future will look like. The problem is these products are being marketed today based on what they might do tomorrow. And it's warping perceptions of management who get sold on hype that isn't real yet and possibly not for a very long time.
hinkley · 1d ago
Hype cycles affect funding. When the Trough of Disillusionment hits, anything that's just being started will take years to finish due to a more difficult funding terrain.
The severity of the Trough is dictated by the amount of lies and utter bullshit shoveled out during the earlier parts of the cycle. So while it's unfortunate that the real goods don't get delivered until years and years after they might have been, it's typically, and often entirely, the fault of the people riding the hype train.
There's an awful lot of utter bullshit in the AI hype.
rsynnott · 1d ago
Ah, yes, jam tomorrow.
frays · 1d ago
Excellent article. It'll be really interesting to look back on this in 5 years and ask the author to regenerate these charts again to see if there is any impact.
I've already experienced being handed a vibe-coded app, and so far the problems have been communication and code cleanliness, e.g. leaving two versions of the app around without saying which one is active. And the docs, man: so many docs, redundant and conflicting.
cookiengineer · 1d ago
It's much much worse in the Cybersecurity field. I wanted to share the anecdote here, too, because it's kind of fitting.
Somehow, in cyber, everyone believes a transformer will generate better password guesses than simply trying the 10 most common passwords. It's like the whole body of knowledge about decision-making theory, neural nets, GANs, LSTMs etc. got completely wiped out and forgotten in less than 10 years.
I understand the awesomeness of LLMs for debugging and forensics (they are a really good rubber duck!), but apart from that they're pretty much useless, because after two prompts they start forgetting if/elseif/else conditions, and checking those boundaries becomes the mission of the unlucky person who has to merge that slopcode later.
I don't understand how we got from TDD and test-case-based engineering to this bullshit. It's like everyone in power was the wrong person for the position in the first place, and statistically no lead engineer will ever make it to C-staff or SVP or whatever corporate manager level.
While the AI bubble is bursting, I will continue to develop with TDD practices to test my code. Which, in turn, has the benefit that I can use LLMs to create nice templates as a reasonable starting point.
pdntspa · 1d ago
I too have been wondering whether the time I spend wrangling AI into doing what I want is greater than the time I'd spend if I just did it myself.
notlisted · 1d ago
Big Meh. Bad metric.
Phone apps were dead long before AI came about.
Shovelware doubly so.
Most users have 40-80 apps installed and use 9 a day, 20 a month(1).
The shitty iOS subscription trend killed off the hobby of 'app collecting'.
Have I created large commercial AI-coded projects? No.
Did I create 80+ useful tools in hours/days that I wouldn't have otherwise?
Hellz yeah!
Would I publish any of these on public github? Nope!
I don't have the time nor the inclination to maintain them.
There's just too many.
My shovelware "Apps" reside on my machine/our intranet or V0/lovable/bolt.
Roughly ~25% are in active daily use on my machine or in our company.
All tools and "apps" are saving us many hours each week.
I'm also rediscovering the joy of coding something useful, without writing a PRD for some intern.
Speaking of which. We no longer have an intern.
While I like the self-reflection in this article, I don't think the methodology adds up (pun intended). First, there are two main axes along which LLMs can make you more productive: speed and code quality. Everyone is obsessed with the first one, but it's the less relevant of the two.
My personal hypothesis is that with LLMs you are only faster when writing boilerplate-ish code. For the rest, LLMs don't really make you faster, but they can raise your code quality, which means better implementations and catching bugs earlier. I am a big fan of giving the diff of a commit to an LLM that has a file-search MCP server, so it can look up files in the repo, and having it point out any mistakes I've made.
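As a rough sketch of that commit-review workflow (minus the MCP file-search part), assuming the OpenAI Python client and a placeholder model name; the commenter's actual setup may differ:

    import subprocess
    from openai import OpenAI

    # Grab the latest commit's diff to hand to the reviewer model.
    diff = subprocess.run(
        ["git", "diff", "HEAD~1", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    review = client.chat.completions.create(
        model="gpt-4o",  # placeholder; any strong code model works
        messages=[
            {"role": "system",
             "content": "You are a strict code reviewer. Flag bugs and risky changes, not style nits."},
            {"role": "user", "content": "Review this commit diff:\n\n" + diff},
        ],
    )
    print(review.choices[0].message.content)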
ksenzee · 1d ago
This doesn’t match my experience. I needed a particularly boilerplate module the other day, for a test implementation of an API, so I asked Gemini to magic one up. It was fairly solid code; I’d have been impressed if it had come from a junior engineer. Unfortunately it had a hard-to-spot defect (an indentation error in an annotation, which the IDE didn’t catch on paste), and by the time I had finished tracking down the issue, I could have written the module myself. That doesn’t seem to me like a code quality improvement.
malfist · 1d ago
I don't know what world you're living in, but quality code isn't a forte of AI.
Chinjut · 1d ago
For what it's worth, this response is explicitly anticipated in the article.
Aeolun · 1d ago
Hmm, I definitely have more issues with AI-generated code than I would if I did it all manually, but the typing it saves may make up for the lost time by itself.
rsynnott · 1d ago
> Github Copilot themselves say that initially, users only accept 29% of prompted coding suggestions (which itself is a wild claim to inefficiency, why would you publicize that?), but with six months of experience, users naturally get better at prompting and that grows to a whopping 34% acceptance rate. Apparently, 6 months of experience only makes you 5% better at prompting.
Or, alternatively, exposure to our robot overlords makes you less discerning, less concerned with, ah, whether the thing is correct or not.
(This _definitely_ seems to be a thing with LLM text generation, with many people seemingly not even reading the output before they post it, and I assume it's at least somewhat a thing for software as well.)
carpo · 1d ago
Maybe developers are using it in a less visible way? In the past 6 months I've used AI for a lot of different things. Some highlights:
- Built a windows desktop app that scans local folders for videos and automatically transcribes the audio, summarises the content into a structured JSON format based on screenshots and subtitles, and automatically categorises each video. I used it on my PC to scan a couple of TB of videos. Has a relatively nice interface for browsing videos and searching and stores everything locally in SQLite. Did this in C# & Avalonia - which I've never used before. AI wrote about 75% of the code (about 28k LOC now).
- Built a custom throw-away migration tool to export a customers data from one CRM to import into another. Windows app with basic interface.
- Developed an AI process for updating a webform system that uses XML to update the form structure. This one felt like magic; I initially didn't think it would work, but it only took a minute to try.
Some background: years ago I built a custom webform/checklist app for a customer. They update the forms very rarely, so we never built an interface for making updates, but we did write two stored procs: one outputs the current form as XML, and another takes the same XML and runs updates across multiple tables to create a new version of the form. For changes, the customer sends me a spreadsheet with all the current form questions in one column and their changes in another. It's normally just wording changes, so I go through and manually update the XML and import it, but this time they had a lot of changes: removing questions, adding new ones, combining others. They had a column with the label changes and another with a description of what they wanted (e.g. "New question", "Update label", "Combine this with q1, q2 and q3", "Remove this question"). The form has about 100 questions, and the XML file is about 2,500 lines long and defines each form field, section layout, conditional logic, grid display, task creation based on incorrect answers, etc., so it's time-consuming to make a lot of little changes like this.
With no expectation of it working, I took a screenshot of the spreadsheet and the exported XML file and prompted the LLM to modify the XML based on the instructions in the spreadsheet and some basic guidelines. It did it close to perfect, even fixing the spelling mistakes the customer had missed while writing their new questions.
- Along with using it on a daily basis across multiple projects.
I've seen the stat that says developers "...thought AI was making them 20% faster, but it was actually making them 19% slower". Maybe I'm hoodwinking myself somehow, but it's been transformative for me in multiple ways.
bad_username · 1d ago
What did you use for transcription? Local whisper via ffmpeg?
carpo · 1d ago
Yeah, the app lets you configure which Whisper model to use and then downloads it on first load. Whisper blows me away too. I've only got a 2080 and use the medium model, and it's surprisingly good and relatively fast.
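For reference, the transcription step can be this small with the open source openai-whisper package (the file path is hypothetical; the library downloads the chosen model on first use and shells out to ffmpeg for decoding, so ffmpeg must be installed):

    import whisper

    model = whisper.load_model("medium")         # downloaded and cached on first load
    result = model.transcribe("some_video.mp4")  # hypothetical input file
    print(result["text"])                        # timestamped chunks are in result["segments"]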
nurettin · 7h ago
It makes you fast. Really fast. In a greenfield project. At the beginning. Then it is all the same grind.
mutkach · 1d ago
> 55,000 hours of [IT] experience
I didn't know Mike Judge was such a polymath!
wewewedxfgdf · 1d ago
Whether the number of "Show HNs" has gone up might be a data point.
whiterook6 · 1d ago
What the author is missing is a metric that matters more than shipped product: how much happier am I when AI autocomplete saves me typing and figures out what I'm trying to articulate? If devs using Copilot are happier--and I am, at least--then that's value right there.
cranium · 1d ago
Places where I got the most out of coding agents are:
- breaking through the analysis paralysis by creating the skeleton of a feature that I then rework (UI work is a good example)
- aggressive dev tooling for productivity on early stage projects, where the CI/CD pipeline is lacking and/or tools are clumsy.
(Related XKCD: https://xkcd.com/1205/)
Otherwise, I find most of my time goes to understanding the client's requirements and making sure they don't want conflicting features – both of which are difficult to speed up with AI. Coding is actually the easy part: even if it were sped up 100x, a consistent end-to-end improvement of 2x would already be a big win (see Amdahl's law).
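To make the Amdahl's law point concrete, a quick back-of-the-envelope (the 50/50 split between coding and everything else is an assumption for illustration):

    # Amdahl's law: end-to-end speedup when a fraction p of the work is sped up s times.
    # Assume coding is half the job (p = 0.5) and AI makes it 100x faster (s = 100).
    p, s = 0.5, 100
    speedup = 1 / ((1 - p) + p / s)
    print(f"{speedup:.2f}x overall")  # ~1.98x: the unaccelerated half dominates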
muldvarp · 1d ago
> These claims wouldn't matter if the topic weren't so deadly serious. Tech leaders everywhere are buying into the FOMO, convinced their competitors are getting massive gains they're missing out on. This drives them to rebrand as AI-First companies, justify layoffs with newfound productivity narratives, and lowball developer salaries under the assumption that AI has fundamentally changed the value equation.
How is this "deadly" serious? It's about software developers losing well-paid, comfortable jobs. It's even less serious if AI doesn't actually improve productivity, because they'll find new jobs in the future.
Pretty much the only future where AI will turn out "deadly serious" is if it shows human-level performance for most if not all desk jobs.
paulhodge · 1d ago
I think different things are happening...
For experienced engineers, I'm seeing (internally in our company at least) a huge amount of caution and hesitancy to go all-in with AI. No one wants to end up maintaining huge codebases of slop code. I think that will shift over time. There are use cases where having quick low-quality code is fine. We need a new intuition about when to insist on handcrafted code, and when to just vibecode.
For non-experienced engineers, they currently hit a lot of complexity limits in getting a finished product to actually work, unless they're building something extremely simple. That will also shift: the range of what you can vibecode increases every year. Last year there was basically nothing you could vibecode successfully; this year you can vibecode TODO apps and the like. I definitely think the App Store will be flooded in the coming years. It's just early.
Personally I have a side project where I'm using Claude & Codex and I definitely feel a measurable difference, it's about a 3x to 5x productivity boost IMO.
The summary: just because we don't see it yet doesn't mean it's not coming.
copperx · 1d ago
If I've learned anything about humans in my 40+ years of being alive, it's that in the long term, convenience trumps all other considerations.
bcrosby95 · 1d ago
I think it depends.
There are very simple apps I try to vibe code that AI cannot handle. It seems very good at certain domains, and others it seems complete shit at.
For example, I hand-wrote a simulation in C in just 900 LOC. I wrote a spec for it and tried to vibe code it in other languages because I wanted to compare different languages and concurrency strategies. Every LLM I've tried fails, and manages to write 2x+ more code even in comparatively succinct languages such as Clojure.
I can totally see why people writing small utilities or simple apps in certain domains think it's a miracle. But when it comes to things like games it seems like a complete flop.
NooneAtAll3 · 1d ago
I wish that first bar graph was log scale...
lowbloodsugar · 1d ago
Turns out AI can’t help script kiddies write production ready applications. Also turns out that AI is good for some things and not others, and a coin toss isn’t a good method to decide which tasks to do using AI. I read that JavaScript is by far the most popular language: still not using it for the mission critical software I write. So it doesn’t bother me that 90% of HN is “AI sucks!” stories. I find it extremely effective when used appropriately. YMMV.
curtisblaine · 20h ago
Maybe the hard part is not coding but making useful software (as in: software other people would use), and LLMs can't help with that.
insane_dreamer · 22h ago
I haven't really found a major productivity boost using LLMs for _production_ software. Writing up the prompt and iterating can take as much time as just doing it myself. The autocomplete is better _IF_ it gets the syntax correct (which depends a lot on how well it knows or can infer the framework).
Where I have found them very useful is for one-off scripts and stuff I need done quick and dirty, that isn't too complex and is easily verifiable (so I can catch the mistakes it makes, and it does make them!), especially in languages I don't know well or don't like (e.g., bash, powershell, javascript).
lazarus01 · 1d ago
> demand they show receipts or shut the fuck up
This is what I always look for. I haven't found one salient success story backing up a claim of success.
intended · 1d ago
This was a worthy set of lines to pen:
>” Now, I’ve spent a lot of money and weeks putting the data for this article together, processing tens of terabytes of data in some cases. So I hope you appreciate how utterly uninspiring and flat these charts are across every major sector of software development.”
m463 · 1d ago
> My argument: video games, new websites, mobile apps, software-as-a-service apps — we should be drowning in choice.
We might be doing just that now.
The best way to increase your ROI is to fire all your employees. How do we know we're not in the mid-release-cycle of that right now?
I'd guess game levels and assets are becoming AI slop as we speak.
icedshrimp · 1d ago
Honestly, this reminds me of some of the promises made when the American animation industry switched to 3D because it was "cheaper".
A modern Disney 3D animated film consistently costs 100-200 million dollars, while movies like Klaus were made for about 40 million. Japan still animates on PAPER.
At the end of the day, new tools have their use cases, but I think that especially in creative domains (which software definitely is), old techniques aren't invalidated by the creation of new ones.
ZBrush still crushes all other sculpting apps with some very well-written low-level code and assembly. It doesn't even use the GPU, for crying out loud. If you proposed that as your solution for a graphically intensive 3D app you'd be laughed at, but software-based rasterization/simple ray tracing takes the cake here. It could handle 20 million polygons at buttery-smooth framerates in 2007, and isn't burdened by the VRAM drought we're in.
Don't let people tell you new tools make the old useless.
It's the cost. Full time serious agentic coding costs upwards of $100/day in Claude tokens (and Claude tokens are the only tokens worth even talking about). When this drops by 10x for a model at the level of quality and speed of Sonnet 4, it will change everything.
scotty79 · 1d ago
How widely is AI adopted in the wider IT industry anyway? I imagine a $200-per-month subscription isn't that popular with people who refuse to pay for their IDEs and go with free alternatives instead. And a month's worth of an AI agent's free tier can be burned through in two intense evenings.
So who pays for AI for developers? Mostly corpos. And the speed of the individual developer was never the limiting factor at corpos; average corporate development was always 10 times slower than indie. So even doubling it won't make much of an impression.
I don't know if I'm faster with AI at a specific task, but I know that I'm doing things I wouldn't touch because I hate the tedium. And I'm doing them while cooking and eating dinner and thinking about wider context and next things to come. So for me it feels worth it.
I think it might be something like cars and safety: any car-safety improvement gets offset by drivers driving faster and more recklessly. So maybe any speed improvement AI brings to a project is nullified by developers doing things they would otherwise just skip.
flyinglizard · 1d ago
I get excellent productivity gains from AI. Not everywhere, and not linearly. It makes the bad parts of the work (boilerplate, dealing with things outside my specialties) tolerable and the good parts a bit better. It makes me want to create more. Business guys missing some visualization? Hell, why not: a few minutes in Aider and it's there. Let's improve our test suites. And let's migrate away from that legacy framework or runtime!
But my workflow is anything but "let her rip". It's very calculated and orderly, just like mastering any other tool; I'm always in the loop. I can't imagine someone without serious experience getting good results, and when things go bad, oh boy, you're bringing a firehose of crap into your org.
I have a junior programmer who's a bright kid but lacking a lot of depth. Got him a Cursor subscription, tracking his code closely via PRs and calling out the BS but we're getting serious work done.
I just can't see how this new situation calls for fewer programmers. It will just bring about more software, and surely more capable software, after everyone adjusts.
degamad · 1d ago
> It will just bring about more software
But according to the graphs in the article, after three years of LLM chatbots and coding assistants, we're seeing exactly the same rate of new software...
vFunct · 1d ago
From the post: if AI was supposed to make everyone 25% more productive, then a 4-month project becomes a roughly 3-month project. It doesn't become a 1-day project.
Was the author making games and other apps in 30 hours? Because that seems like a 4 month project?
sarchertech · 1d ago
The author mentioned polls showing that a substantial number of developers believe that AI makes them 10x more productive.
UltraSane · 1d ago
I find LLMs useful to decide what is the best option to solve a problem and see some example code.
groby_b · 1d ago
I think the author misses a few points
* METR was at best a flawed study, repo familiarity and tool unfamiliarity being the biggest points of critique, but far from the only ones
* they assume that all code gets shipped as a product. Meanwhile, AI code has (at least in my field of view) led to a proliferation of useful-but-never-shipped one-off tools. Random dashboards to visualize complex queries, scripts to drive refactors, or just sheer joy like "I want to generate an SVG of my vacation trip and consume 15 data sources and give it a certain look".
* Their own self-experiment is not exactly statistically sound :)
That does leave the fact that we aren't seeing AI shovelware. I'm still convinced that's because commercially viable software is beyond the AI complexity horizon, not because AI isn't an extremely useful tool.
ModernMech · 21h ago
> * METR was at best a flawed study.
They didn't claim it was flawless, they had just brought it up because it caused them to question their internal narrative of their own productivity.
> * Their own self-experiment is not exactly statistically sound :)
They didn't claim it was.
> * they assume that all code gets shipped as a product.
The author did not assume this. They assumed that if AI is making developers more productive, that should apply to shovelware developers too. That we don't see an increase in shovelware post-AI makes it very unlikely AI brings a productivity increase for more complex software.
noduerme · 1d ago
This is a great question and the data points make a solid case.
I've been a "10xer" for 25 years. I've considered coding agents bullshit since my first encounter with Copilot. I work by having a clear mental map of every piece of my code and knowing exactly how everything works, to the smallest detail, and how it interacts with every other part.
Anyway, all that and a nickel. Yesterday I fired up Claude Code for the first time. I didn't ask it to build me a game or anything at a high level. Nor to evaluate an existing code base. No... I spent about 2 hours guiding it to create a front-end SPA framework that does what my own in-house SPA framework does on the front end, just to see how it would perform at that. I approved every request manually and interrupted every time I spotted a potential issue (which were many). I guided it on what functions to write and how they should affect the overall navigation flow, rendering flow, loading and error-handling.
In other words, I knew what I wanted to write to a T, because it's code I wrote in 2006 and have refactored and ported many times since then... about 370 commits worth to this basic artifact, give or take.
And it pretty much got there.
Would I have been able to prompt it to write a system like that if I hadn't written the system myself over and over again? Probably not. But it did discern the logical setup I was going for (which is not at all similar to what you're thinking if you're coming from React or another framework like that), and it wrote code that is almost identical in its control structures to what I wrote, without me having to do much besides tell it in plain English what should control what, how, when and in what order.
I'm still not convinced it would save me time on something totally new, that I didn't already know the final shape of.
But I do suspect one reason all this "vibe coding" hasn't led to an explosion of shovelware is that "vibe coding" isn't being done by experienced coders. I suspect that letting changes "flash across the screen" without reading them accounts for most of the difference between a failed prompt and one that achieves the desired result.
Like, I saw it do things like create a factory class that took a string name and found the proper component to load. I said, "refactor that whole thing to a component-definition interface with the name in it, make a static object of those, and use that to determine what screen to load and all of its parameters." And it did, and it looked almost the same as what I wrote back in the day.
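For readers who don't live in SPA land, the shape of that refactor is roughly this (the commenter's code is front-end JavaScript; this Python sketch with made-up screen names only illustrates the before/after structure):

    from dataclasses import dataclass, field

    @dataclass
    class ComponentDef:
        """One entry of the static component-definition table."""
        name: str
        loader: callable
        params: dict = field(default_factory=dict)

    # Before: a factory class dispatching on bare strings.
    # After: one static table that decides which screen loads, and with what parameters.
    SCREENS = {
        d.name: d
        for d in (
            ComponentDef("home", loader=lambda: "HomeScreen", params={"cache": True}),
            ComponentDef("settings", loader=lambda: "SettingsScreen"),
        )
    }

    def load_screen(name: str):
        definition = SCREENS[name]  # unknown names fail fast with a KeyError
        return definition.loader(), definition.params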
Idk. I would not want my job to become prompting an LLM. I like cracking my knuckles and writing code. But I think the mileage you get may be dictated by whether you are trying to use it as a general-purpose "make this for me" engine for shovelware, in which case it will fail hard, versus using it as a stenographer translating a sentence of instructions into a block of control flow.
surfingdino · 1d ago
AI is the biggest threat to BS middle-management/BA jobs in recent history. Those people panic and try to become "AI ambassadors/advocates" to survive. They push the narrative so that they have answers and "metrics" of adoption to show upper management and survive another round of layoffs. You can see how it works at YouTube, where someone decided for no good reason to "upscale" Shorts, making them all look bad. This is done automatically, without asking the creators' permission, and you cannot turn it off. The results are crap, but whoever made that decision can brag about widespread adoption of AI and survive another annual review.
back2dafucha · 1d ago
I couldn't give a rat's ass what his industry thinks about me or my skills.
I can build whole systems. They can't.
> Part of the issue is that it is so fast to get to a working prototype that it feels like you are almost there, but there is a lot of effort to go.
People made tools specifically to make templates look like templates and not a finished product, precisely because clients were ready to ship the product right after seeing realistic-looking mockups of the interface.
And regex.
That's pretty much nothing ... to me that line indicates a whole lot of other possible things.
It’s that we are heading towards a big recession.
As in all recessions, people come up with all sorts of reasons why everything is fine until it can’t be denied anymore. This time, AI was a useful narrative to have lying around.
2030 will be 2020 all over again.
When I started a CS degree in 2003, we were still kinda in the "dot com crash has happened, no-one will ever hire a programmer again" phase, and there were about 50 people in my starting class. I think in the same course two years ago, there'd been about 200. The 'correct' number, for actual future demand, was certainly closer to 200 than 50, and the industry as a whole had a bit of a labour crunch in the early 10s in particular.
This means that they inflate programmer salaries, which makes it impossible for most companies that could benefit from software development to hire developers.
We could probably have five times as many software developers as we have now, and they would not be out of work; they would only decrease average salaries for programmers.
1) Fewer students are studying computer science. I'm faculty at a top CS program and we saw our enrollment decline for the first time ever; other universities are seeing similar slowdowns of enrollment [1]
2) fewer immigrants coming to the united states to work and live, US is perhaps looking at its first population decline ever [2]
3) Current juniors are being stunted by AI, they will not develop the necessary skills to become seniors.
4) Seniors retiring faster because they don't want to have to deal with this AI crap, taking their knowledge with them.
So we're looking at a negative bubble forming in the software engineering expertise pipeline. The money people are hoping that AI can become proficient enough to fill that space before before everything bursts. Engineers, per usual, are pointing out the problem before it becomes one and no one is listening.
[1]: https://www.theatlantic.com/economy/archive/2025/06/computer...
[2]: https://nypost.com/2025/09/03/us-news/us-population-could-sh...
I wonder how that would even be measured? I suppose you could do it for roles that do the same type of work every day; perhaps there is some statistical relevance to the number of calls taken per day in a call center, or something like that. On the software development side, however, productivity metrics are very hard to quantify. Of course, you can make a dashboard look however you want, but it's essentially impossible to tie those metrics to NPV.
Who is we? One country heading into a recession is hardly enough to nudge the trend of "all things code"
That's insane. Who the hell pulls a number out of their ass and declares it the new reality? When it doesn't happen, he'll pin the blame on you, but everyone else above will pin the blame on him. He's the one who will get fired.
Laying off unnecessary developers is the answer if LLMs turn out to make us all so much more productive (assuming we don't just increase the amount of software written instead). But that happens after successful implementation of LLMs into the development process, not in advance.
Starting to think I should do the inadvisable and move my investments far far away from the S&P 500 and into something that will survive the hype crash that can't be too far off now.
I'd argue that compared to a decade or 15 years ago, relatively little value has been created. If you sat down in front of a 15-year-old computer, or tried to solve a technical challenge with the tooling of 10-15 years ago, I don't think you'd get a significantly worse result.
Yet in this time the US has doubled its GDP, most of it accruing to the top, to the tech professionals and financiers who benefited from this.
And some of this money went into assets with constrained supply, such as housing, driving up prices even adjusted for inflation and making average people that much worse off.
While I do feel society is making progress, it's been a slow and steady march, which in relative terms means slowing. If I gave you $10 every week, by week 2 you'd have doubled your wealth; by the end of the year you'd barely notice the increase.
Technology accumulation is the same, I'd argue it's even worse, since building on top of existing stuff has a cost proportional to the complexity for a fixed unit of gain (and features get proportionally less valuable as you implement the most important ones first).
Sorry, got distracted from my main point: what happens when people stop believing that these improvements are meaningful, or that the technology that was priced in to produce 100x the value will arrive at all (and, more importantly, that the company you're invested in will be able to capture it)?
While you have decent points in your comment (essentially, the idea of tech industry growth slowing due to low hanging fruit being picked), if this statement going to be your barometer you’re going to end up looking stupendously wrong.
You can sit your Grandma down at her computer and have her type in “please make me a website for my sewing club that includes a sign up form and has a pink background” and AI will just do it and it’ll probably even work the first time you run it.
15 years ago tossing a website on Heroku was a pretty new concept, and you definitely had to write it all on your own.
10 years ago Kubernetes had its initial release.
Google Drive and Slack are not even 15 years old.
TensorFlow is just hitting its 10th birthday.
I think you’re vastly underestimating the last 15 years of technological progress and in turn value creation.
25 years ago we had WYSIWYG editors to build web pages, and she could just include a link to email her so people could ask to "sign up" (the entire point of a sewing club is to socialize; you don't need an automated newsletter or whatever, that's for people who want to sell you something). You'd put it in the space your ISP gave you, or use the personal web server Windows actually included. You had to be somewhat technical to use the Windows server, but it could've been made friendlier instead of being removed, and personal computers should make it easy to share a static website.
We've clearly regressed since then. People think you need an engineer or AI to build basic web pages. My actual grandma was doing some of this stuff when I was a kid in the 90s, as was I.
What happened to the web seems analogous to me to as if we abandoned word processors and spreadsheets and said you need a six-figure engineer to do custom programming any time you need the features of one of those things.
You should not look back on those solutions and neglect to acknowledge their limitations. Access can only handle small databases (2 GB maximum) and roughly 255 simultaneous connections.
Access is basically a product of the technological limitations of its time. It isn’t designed the way it is because that way is the best design.
The kind of business that relied on Access is going to find that solutions like Airtable are far easier and more powerful, and certainly less prone to data loss than mom and pop’s shared drive. There are even open source self-hosted alternatives like NocoDB and Baserow (or cloud hosted).
You’ll inevitably complain about the subscription cost of SaaS solutions but it’s not like MS Office business licenses were ever cheap.
But when it comes to actually accomplishing something with application logic on the web, Grandma could ask the AI agent literally all these questions and get solid answers. At any point you get confused you can just ask what to do. You can highlight a line of code and ask “what does this mean?” You can even say “I want this web page to be put online what do I do?”
Beats waiting for a human to answer you.
You’re also taking my example application way too literally. No I don’t know why Grandma needs a signup form, I just couldn’t think of a web app off the top of my head.
MS Access and WYSIWYG tools like FrontPage and iWeb were not good. I know because I was there, and I used FrontPage at work. A link to your email (get spammed forever) is not a replacement for an application or an email form. The whole reason code is preferred over WYSIWYG is that you inevitably hit change-management issues even on simple personal projects, which is why static site generators have gained popularity. I'm sure your grandma could have handled markdown + someone else's Hugo template + Netlify.
Hell, if we want to talk about progress in WYSIWYG editors, Trix greatly improved and modernized the shortfalls of that process, and it only launched in 2014. So even in the world of WYSIWYG we have better tools now than before.
IIS has not been removed from Windows home edition, by the way.
Apart from doing the styles and layout, I don't think current tools have less friction. They're a lot safer though. Can't say I never dropped a production database.
I definitely agree that it’s a reasonable choice.
based on your examples, I'd say you're vastly overestimating
sure, those are all new techs, but it's not like you couldn't do those things before in different ways or they are revolutionary in any way
As for grandma, 15 years ago she could have just posted her sewing club on Facebook; she doesn't need Heroku or AI.
I would say that these technologies/products being so wildly popular puts the burden of proof on you to show me some kind of evidence that they aren't productive. Like, are you trying to say that something had better deployment velocity and reliability than Kubernetes at handling high-complexity application infrastructure deployments for large enterprise companies? What was it? Why was it better than Kubernetes?
The analogy is that you’re basically saying that zippers aren’t really better than buttons but then literally everyone is overwhelmingly wearing pants and coats with zippers and very strongly prefer zippers. So really it’s on you to prove to me that I should be using pants with buttons instead.
Finally, there’s a lot of irony in your first paragraph complaining about a few tech oligarchs becoming fabulously wealthy and then suggesting that Grandma just use Facebook instead of building her own site. In any event, my web app example was just a poorly thought out example of a web app, I really just mean a website that has a little more utility than a static site.
The juice hasn't been worth the squeeze. You can look at nearly every societal indicator except the stock market, all pointing downward, to reach that conclusion. Nothing is actually better than it was in 2010 despite Uber, Airbnb, Kubernetes, Slack, and all the other SV tech "innovations". People are not happier or wealthier because of the tech coming out of Silicon Valley. In general, the net result of the last 15 years of tech is that it's made us more neurotic, disconnected, depressed, and angry.
We don't need "better deployment velocity and reliability for high-complexity application infrastructure deployments for large enterprises". Listen to yourself man, you sound like you've been hypnotized by the pointy-haired boss. The tech sector makes false promises about a utopia future, and then it just delivers wealth for shareholders, leaving everyone else worse off.
Grandma especially doesn't need deployment velocity, she's being evicted by her landlord because he wants to turn her flat into an Airbnb. She can't get to the grocery store because the town won't invest in public transport and Uber is the only option. She's been radicalized by Meta and Youtube and now she hates her own trans grandchild because her social media feed keeps her algorithmically outraged. Oh, and now she's getting scammed by AI asking her to invest her life savings in shitcoins and NFTs.
> The analogy is that you’re basically saying that zippers aren’t really better than buttons but then literally everyone is overwhelmingly wearing pants and coats with zippers and very strongly prefer zippers.
I don't agree that the ubiquity and utility are necessarily correlated, so I don't see the zippers and Kubernetes as analogous.
But the proliferation of zippers has more to do with the fact they are easier for manufacturers to integrate into products compared to buttons -- they come pre-measured and installing them is a straight stitch that can be done with a machine, whereas installing buttons is more time-consuming.
Zippers are worse for consumers in many ways, repairability chief among them. But really they are part of a general trend over my lifetime of steadily falling garment quality, as manufacturers race to the bottom.
> In any event, my web app example was just a poorly thought out example of a web app, I really just mean a website that has a little more utility than a static site.
You said it, not me. We had the technology to throw up a static site in 2010 and my grandmother could actually do that with dreamweaver and FTP, and it worked fine.
In case you also need to control Spotify from Windows 95 :D
https://github.com/queenkjuul/spotify97
Product and Sales?
Not investing advice; the bottom 490 companies in the S&P 500 are nominally flat since 2022 and down against inflation. GPUs and AI hype are holding everything together at the moment.
> In simpler terms, 35% of the US stock market is held up by five or six companies buying GPUs. If NVIDIA's growth story stumbles, it will reverberate through the rest of the Magnificent 7, making them rely on their own AI trade stories.
https://www.wheresyoured.at/the-haters-gui/
> Capex spending for AI contributed more to growth in the U.S. economy in the past two quarters than all of consumer spending, says Neil Dutta, head of economic research at Renaissance Macro Research, citing data from the Bureau of Economic Analysis.
https://www.bloodinthemachine.com/p/the-ai-bubble-is-so-big-...
> Two Nvidia customers made up 39% of Nvidia’s revenue in its July quarter, the company revealed in a financial filing on Wednesday, raising concerns about the concentration of the chipmaker’s clientele.
https://www.cnbc.com/2025/08/28/nvidias-top-two-mystery-cust...
I wouldn’t count on it.
ChatGPT.
If we can delegate incident response to automated LLMs too, sure, why not. Let the CEO have his way and pay the reputational price. When it doesn't work, we can revert our git repos to the last day before LLMs wrote all the code.
I'm only being 90% facetious.
I think making stakeholders engage with these models themselves is the most critical move for anyone being handed deadlines or expectations based on them.
Let Claude run incident response for a few weeks. I'll gladly pause pagerduty for myself.
Lord, forgive them, they know not what they do.
"Bobby Lehman is ninety three years old and he dances the twist. He is 100 years old! 120! Maybe 140! He dances like a madman!"
"A bunch of mindless jerks who'll be the first against the wall when the revolution comes."
The thing about hype cycles (including AI) is that the marketing department manages to convince the purchasers to do its job for it.
If the self checkout scanner at the supermarket started bickering with me for entering the wrong produce code, that would wrap up the whole Turing Test thing for me.
You see yourselves as the disenfranchised proletariats of tech, crusading righteously against AI companies and myopic, trend-chasing managers, resentful of their apparent success at replacing your hard-earned skill with an API call.
It’s an emotional argument, born of tribalism. I’d find it easier to believe many claims on this site that AI is all a big scam and such if it weren’t so obvious that this underlies your very motivated reasoning. It is a big mirage of angst that causes people on here to clamor with perfunctory praise around every blog post claiming that AI companies are unprofitable, AI is useless, etc.
Think about why you believe the things you believe. Are you motivated by reason, or resentment?
And if they don't, then you'd understand the anger surely. You can't say "well obviously everybody should benefit" and then also scold the people who are mad that everybody isn't benefiting.
Also, AI has been basically useless every time I've tried it, except for converting some struct definitions across languages and similar tasks; it seems very unlikely that it would boost productivity by more than 10%, let alone 400%.
FWIW, my own experiences with AI have ranged from mediocre to downright abysmal. And, no, I don't know which models the tools were using. I'm rather annoyed that it seems impossible to express a negative opinion about the value of AI without a thoroughly documented experiment, which inevitably invites the response that some parameter was obviously chosen incorrectly, while the people claiming how good it is get offended when someone asks them to show their work even a little.
It’s like saying “I drove a car and it was horrible, cars suck” without clarifying what car, the age, the make, how much experience that person had driving, etc. Of course its more difficult to provide specifics than just say it was good or bad, but there is little value in claims that AI is altogether bad when you don’t offer any details about what it is specifically bad at and how.
That's an interesting comparison. That kind of statement can reasonably be inferred to come from someone just learning to drive who doesn't like the experience. And if I were a motorhead trying to convert that person to liking driving, my first questions wouldn't be those, interrogating them on their exact scenario to invalidate their results; instead, I'd ask what aspect of driving they don't like, to see if I could work out a fix that would meaningfully change their experience (and, not being a motorhead, the only thing I can think of is automatic versus manual transmission).
> there is little value in claims that AI is altogether bad when you don’t offer any details about what it is specifically bad at and how.
Also, do remember that this holds true when you s/bad/good/g.
"Damn, these relational databases really suck, I don't know why anyone would use them, some of the data by my users had emojis in them and it totally it! Furthermore, I have some bits of data that have about 100-200 columns and the database doesn't work well at all, that's horrible!"
In some cases knowing more details could help. In the database example, a person historically using MySQL 5.5 could have had a genuinely bad experience (its default 3-byte utf8 charset rejects emoji; utf8mb4 fixed that), in which case telling them to use something more recent, or PostgreSQL, would be good advice.
In other cases, they're literally just holding it wrong, for example trying to use a RDBMS for something where a column store would be a bit better.
Replace the DB example with AI and the same principles are at play. It is equally annoying to hear people blaming all of the tools when some are clearly better or worse than others, and making broad statements that can't be proven or disproven with the given information, as it is to have people always asking for more details. I honestly believe all of these AI discussions should be had with as much data present as possible, both the bad experiences and the good.
> If your experience makes you believe that certain tools are particularly good--or particularly bad--for the tasks at hand, you can just volunteer those specifics.
My personal experience:
Are they worth it? Depends. The more boilerplate and boring bullshit code you have to write, the better they'll do. Go off the beaten path (e.g. not your typical CRUD web app) and they'll make a mess more often. That said, I still find them useful for the reduced boilerplate and cognitive load, and because they can ingest and process information more quickly than I can; they have more working memory and can spot patterns when working on a change that touches 20-30 files. The SOTA models are... kinda okay in general. But if all we have to go on is "I used it and it sucked" or "I used it and it was great", like, okay, good for you?
But I also really care about the quality of our code, and so far my experiments with AI have been disappointing. The empirical results described in this article ring true to me.
AI definitely has some utility, just as the last "game changer" - blockchain - does. But both technologies have been massively oversold, and there will be many, many tears before bedtime.
Bad framing and a worse argument. It's emotional.
Every engineer here is evaluating what AI is claimed to be able to do, as pronounced by CEOs and managers (not experts in software development), versus reality. Follow the money.
Yeah, it's frustrating to see someone opine that "critics are motivated by resentment rather than facts" as if it were street-smart psychoanalysis... while completely ignoring how many of the influential voices boosting the concept have bajillions of dollars of motive to speak as credulously and optimistically as possible.
I think most people are motivated by values. Reason and emotion are merely tools one can use in service of those.
My experience is that people who hew too strongly to the former tend to be more oblivious to what's going on in their own psychology than most.
But I am not claiming that AI is useless. It is useful, but I would rather destroy every data center than enjoy the strengthening of techno-feudalism.
I'd widen the frame a bit. People scared of losing their jobs might underestimate the usefulness of AI. Makes sense to me, it's the comforting belief. Worth keeping in mind while reading articles sceptical of AI.
But there's another side to this conversation: the people whose writing is pro AI. What's motivating them? What's worth keeping in mind while reading that writing?
Please, enlighten me with your gigantic hyper-rational brain.
AI stans don’t become AI stans for no reason. They see the many enormous technological leaps and also see where progress is going. The many PhDs currently making millions at labs also have the right idea.
Just look at ChatGPT’s growth alone. No product in history compares, and it’s not an accident.
The two types of responses to AI I see are your very defensive type, and people saying "I don't get it".
Mousing implies things are visible and you merely point to them. Keyboard implies things are non-visible and you recall commands from memory. These two must have a principal difference. Many animals use tools: inanimate objects lying around that can be employed for some gain. Yet no animal makes a tool. Making a tool is different from using it because to make a tool one must foresee the need for it. And this implies a mental model of the world and the future, i.e. a very big change compared to simply using a suitable object on the spot. (The simplest "making" could be just carrying an object when there is no immediate need for it, e.g. a sufficiently long distance. Looks very simple and I myself do not know if any animals exhibit such behavior, it seems to be on the fence. It would be telling if they don't.)
I think the difference between mousing and keying is about as big as of using a tool and making a tool. Of course, if we use the same app all day long, then its keys become motor movements, but this skill remains confined to the app.
Their follow-on claim is that, because there hasn't been exponential growth in App Store releases, domain name registrations, or Steam games, AI has (beyond producing shoddy code) led to no increase in the amount of software at all, or none that could be called remarkable or even notable in proportion to the claims made by those at AI companies.
I think this ignores the obvious signs of growth in AI companies that provide software engineering and adjacent services via AI. These companies' revenues aren't emerging from nothing. People aren't paying them billions unless there is value in the product.
These trends include:
1. The rapid growth in revenue of the AI model companies: OpenAI, Anthropic, etc.
2. The massive growth in revenue of companies built on AI, including Cursor, Replit, Lovable, etc.
3. The massive valuations of these companies.
Anecdotally, with AI I can make shovelware apps very easily, spin them up effortlessly, and fix issues I don't have the expertise or time to handle myself. I don't know why the author of TFA claims he can't make a bunch of one-off apps with the capabilities available today, when it's clear that many, many people can, have done so, have documented doing so, and have made money selling those apps.
Oh, of course not. Just like people weren't paying vast sums of money for beanie babies and dotcoms in the late 1990s and mortgage CDOs in the late 2000s [EDIT] unless there was value in the product.
People paid a lot for beanie babies and various speculative securities on the assumption that they could be sold for more in the future. They were assets people aimed to resell at a profit. They had no value by themselves.
The source of revenue for AI companies has inherent value but is not a resell-able asset. You can't resell API calls you buy from an AI company at some indefinite later date. There is no "market" for reselling anything you purchase from a company that offers use of a web app and API calls.
I think the article's premise is basically correct: if we had a 10x explosion of productivity, where is the evidence? I would think some is potentially hidden in corporate/internal apps, but despite everyone at my current employer using these tools, we don't seem to be going any faster.
I will admit that my initial thought on Copilot was "yes, this is faster", but that was back when I was only using it for rote/boilerplate work. I've not had a lot of success getting it to do higher-level work, and that's also the experience of my co-workers.
I can certainly see why a particular subset of programmers finds the tools compelling: if your job was producing boilerplate, then AI is perfect.
The fundamental difference of opinion people have here is that some see current AI capabilities as a floor, while others see them as a ceiling. I'd agree that AI companies are overvalued if current models were as capable as AI will ever be, but that is clearly not the case; very likely, as they have every few months over the past few years, models will keep getting better.
It's not ONE person. I agree that it's not "every single human being" either, but more of a preliminary result. I don't understand why you discount results you dislike; I thought you were completely rational?
https://www.theregister.com/2025/07/11/ai_code_tools_slow_do...
You can't use growth of AI companies as evidence to refute the article. The premise is that it's a bubble. The growth IS the bubble, according to the claim.
> I don't know why the author of TFA claims that he can't make a bunch of one-off apps
I agree... One-off apps seem like a place where AI can do OK. Not that I care about it. I want AI that can build and maintain my enterprise B2B app just as well as I can in a fraction of the time, and that's not what has been delivered.
> I want AI that can build and maintain my enterprise B2B app just as well as I can in a fraction of the time, and that's not what has been delivered.
AI isn't at that level yet, but it is making fast strides on subsets of it. I can't imagine systems of models, and the models themselves, won't get there in a couple of years, given how bad AI coding tools were just a couple of years ago.
Damn, when did it become wrong for me to advocate in my best interests while my boss is trying to do the same by shoving broken and useless AI tools up my ass?
I’m not concerned for my job, in fact I’d be very happy if real AGI would be achieved. It would probably be the crowning tech achievement of the human race so far. Not only would I not have to work anymore, the majority of the world wouldn’t have to. We’d suddenly be living in a completely different world.
But I don’t believe that’s where we’re headed. I don’t believe LLMs in their current state can get us there. This is exactly like the web3 hype when the blockchain was the new hip tech on the block. We invent something moderately useful, with niche applications and grifters find a way to sell it to non technical people for major profit. It’s a bubble and anyone who spends enough time in the space knows that.
I agree that there are lots of limitations to current LLMs, but it seems somewhat naive to ignore the rapid pace of improvement over the last 5 years and the emergent properties of AI at scale, especially its doing things claimed to be impossible only years prior (remember when people said LLMs could never do math, or that image models could never get hands or text right?).
Nobody understands the limitations of current LLMs with greater clarity or specificity than the people working in labs right now to make them better. The AGI prognostications aren't suppositions pulled out of the realm of wishful thinking; they exist because of fundamental revelations that have occurred in the development of AI as it has scaled up over the past decade.
I know I claimed that HN's hatred of AI is an emotional one, but there is an element of reasoning to it too that leads people down the wrong path. By seeing more flaws than the average person sees in these AI systems, and seeing the tactics companies use to make their AI offerings seem more impressive than they (currently) are, you extrapolate that sense of "having figured things out" into a robust model of how AI is and must really be. In doing so, you pattern-match AI hype to web3 hype and assume that since the hype is similar in certain ways, it must also be a bubble/scam just waiting to pop, with all the lies revealed. This is the same pattern-matching trap people accuse AI of falling into when it confidently claims to have solved a problem while the flaws in its output are there to see.
And that's actually quite useful - given that most of this material is paywalled or blocked from search engines. It's less useful when you look at code examples that mix different versions of python, and have comments referring to figures on the previous page. I'm afraid it becomes very obvious when you look under the hood at the training sets themselves, just how this is all being achieved.
All intelligence is pattern matching, just at different scales. AI is doing the same thing human brains do.
Hard not to respond to that sarcastically. If you take the time to learn anything about neuroscience you'll realise what a profoundly ignorant statement it is.
But even if it's not a lot, it's more than the number of LLMs that can invent new meaning, which is a grand total of 0.
If tomorrow, all LLMs ceased to exist, humans would carry on just fine, and likely build LLMs all over again, next time even better.
LLMs are not anything like Web3, not "exactly like". Web3 is in no way whatsoever "something moderately useful", and if you ever thought it was, you were fooled by the same grifters when they were yapping about Web3, who have now switched to yapping about LLMs.
The fact that those exact same grifters who fooled you about Web3 have moved on to AI says nothing about how useful the thing they're yapping about actually is. Do you actually think those same people wouldn't be yapping about AI even if there were nothing to it? Yappers gonna yap.
But Web3 is 100% useless bullshit, and AI isn't: they're not "exactly alike".
Please don't make false equivalences between them like claiming they're "exactly like" each other, or parrot the grifters by calling Web3 "moderately useful".
Yeah so the thing is the "success" is only "apparent". Having actually tried to use this garbage to do work, as someone who has been deeply interested in ML for decades, I've found the tools to be approximately useless. The "apparent success" is not due to any utility, it's due entirely to marketing.
I don't fear I'm missing out on anything. I've tried it, it didn't work. So why are my bosses a half dozen rungs up on the corporate ladder losing their entire minds over it? It's insanity. Delusional.
And to me this is worse news. Displacing people in higher-paying jobs would hurt the economic fabric more, but by the same token they'd have more power and influence to ensure a better safety net for the inevitable rise of AI and automation across much of the workforce.
Entry level workers can’t afford to not work, they can’t afford to protest or advocate, they can’t afford the future that AI is bringing closer to their doorsteps. Without that safety net, they’ll be struggling and impoverished. And then will everyone in the higher paying positions help, or will we ignore the problem until AI actually is capable of replacing us, and will it be too late by then?
Tell him to code it himself then? If it can be done with only prompting, and he's able to type a sentence or two in a web form, what's stopping him?
This isn't entirely foreign to me; it sure looks a lot like the hype train of the dot-com bubble. My experience says that if you're holding stock in a company going down this road, it has very low long-term value. Even if you think there's room to grow, bubbles pop fast and hard.
---
[1] We generally budget about half an intern's time for finding the coffee machine, learning how to show up to work on time, going on a fun event with the other interns to play minigolf, discovering that unit tests exist, etc, etc.
I have a backend development background, so I was able to review the BE code and fix some bugs. But I did not bother learning the Jira and Harvest API specs at all; the AI (Cursor + Sonnet 4) figured it all out.
I would not be able to write the front-end of this. It is JS-based and updates the UI via real-time HTTP requests (I forgot the name of this technology, the new AJAX that is), and I do not have time to learn it, but again, I was able to tweak what the AI generated and make it work.
Not only did AI help me do something in much less time than it would otherwise have taken, it enabled me to do something that otherwise would not have been possible.
Challenge your manager to a race: have him vibe code it.
This sounds incredibly stupid. It's going to take as long as it takes, and if they're not okay with that, their delusional estimates should be allowed to crash and burn, which would hopefully be a learning experience.
The problem is that sometimes there’s an industry wide hysteria even towards useful tech - like doing a lift and shift of a bunch of monoliths to AWS to be “cloud scale”, introducing Kubernetes or serverless without the ability to put either to good use, NoSQL for use cases it’s not good at and most recently AI.
I think LLMs will eventually weather the hype cycle and it will settle down on what they’re actually kinda okay at vs not, the question is how many livelihoods will be destroyed along the way (alongside all the issues with large scale AI datacenter deployments).
On a personal level, it feels like you should maybe do the less ethical thing: ask your employer for somewhere in the ballpark of 1000-3000 USD a month for Claude credits, babysit it enough to ship a functional MVP within the new 20% estimate, and when they complain about missing functionality, tell them the AI tech just isn't mature enough, but thankfully you'll be able to swoop in and salvage it for only the remaining 80% of the original estimate's worth of work.
1. LLMs do not increase general developer productivity by 10x across the board for general purpose tasks selected at random.
2. LLMs dramatically increase productivity for a limited subset of tasks.
3. LLMs can be automated to do busy work and although they may take longer in terms of clock time than a human, the work is effectively done in the background.
LLMs can get me up to speed on new APIs and libraries far faster than I can myself, a gigantic speedup. If I need to write a small bit of glue code in a language I do not know, LLMs not only save me time, but they make it so I don't have to learn something that I'll likely never use again.
Fixing up existing large code bases? Productivity is at best a wash.
Setting up a scaffolding for a new website? LLMs are amazing at it.
Writing mocks for classes? LLMs know the details of using mock libraries really well and can get it done far faster than I can, especially since writing complex mocks is something I do a couple times a year and completely forget how to do in-between the rare times I am doing it.
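For example, a minimal sketch of the kind of mock boilerplate I mean, using Python's unittest.mock (the gateway and its methods are hypothetical names, not any particular library):

    from unittest.mock import MagicMock

    # Hypothetical sketch: stub a payment gateway so a unit test never
    # touches the network. This is exactly the boilerplate that is easy
    # to forget between the few times a year you write it.
    gateway = MagicMock()
    gateway.charge.return_value = {"status": "ok", "id": "txn_123"}

    result = gateway.charge(amount_cents=499, currency="usd")
    assert result["status"] == "ok"
    gateway.charge.assert_called_once_with(amount_cents=499, currency="usd")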
Navigating a new code base? LLMs are ~70% great at this. If you've ever opened up an over-engineered WTF project, just finding where the HTTP routes are defined can be a problem. "Yo, Claude, where are the route endpoints in this project defined at? Where do the dependency injected functions for auth live?"
Right tool, right job. Stop using a hammer on nails.
I wax and wane on this one.
I've had the same feelings, but too often I've peeked behind the curtain, read the docs, gotten familiar with the external dependencies, and then realized that whatever the LLM responded with either wasn't following convention, shoehorned my problem to fit code examples found online, used features inappropriately, or took a long roundabout path to do something that could be done simply.
It can feel like magic until you look too closely at it, and I worry that it'll make me complacent with the feeling of understanding without actually leaving me with any.
If I have to manually verify every answer, I may as well read the docs myself.
It's incredible how quickly an LLM can answer. I've also cross-checked its responses with documentation before and discovered that it suggested implementing a deprecated feature that had a massive warning banner in the documentation, which the LLM failed to mention. I'm still a fan of reading documentation.
The difference is that if I go directly to the support site, there's a decent chance I can quickly spot and reject the garbage based on the date, the votes it's gotten, even the quality of the writing. AI doesn't include any of those clues; it mixes good and bad together and offers it up for me to pick apart through trial and error.
You pay money, have vendor lock-in, get one answer, and there's no upvotes/downvotes/accepted-answers/moderation or clarification.
For questions that I know should have a straightforward answer, I think it beats searching Stackoverflow. Sure, I'll typically end up having to rewrite most of the script from scratch; however, if I give it a crude starting point of a half-functional script I've already got going, pairing that with very clear instructions on how I'd like it extended is usually enough to get it to write a proof of concept demonstration that contains enough insightful suggestions for me to spend some time reading about features in man pages I hadn't yet thought to use.
The biggest problem, maybe, is a propensity for these models to stick in every last fancy feature under the sun. It's fun to read about a GNU extension to awk that makes my script a couple of lines shorter, but at best I'll take it as an educational aside rather than something I'd accept at the expense of portability.
Just a random personal anecdote I wanted to throw out. I recently had to build some custom UI with Qt. I hadn't worked with Qt in a decade and barely remembered it. Seems like a perfect use case for AI to get me "up to speed" on the library, right? It's an incredibly well documented library with lots written on it, perfect fodder for an AI to process.
So, I gave it a good description of the widget I was trying to make, what I needed it to look like and how it should behave, and behold, it spit out the specific widget subclass I should use and how I should override certain methods to customize behavior. Wow, it worked exactly like promised.
So I implemented it like it suggested and was seemingly happy with the results. Went on with working on other parts of the project, dealing with Qt more and more here and there, gaining more and more experience with Qt over time.
A month or two later, after gaining more experience, I looked back at what AI had told me was the right approach on that widget and realized it was completely messed up. It had me subclassing the completely wrong type of widget. I didn't need to override methods and write code to force it to behave the way I wanted. I could instead just make use of a completely different widget that literally supported everything I needed already. I could just call a couple methods on it to customize it. My new version removes 80% of the code that AI had me write, and is simpler, more idiomatic, and actually makes more sense now.
So yeah, now any time I see people write about how "well, it's good for learning new libraries or new languages", I'll have that in the back of my mind. If you don't already know the library/language, you have zero idea whether what the AI is teaching you is horrible or not, whether there's a "right/better" way or not. You think it's helping you out when really you're likely just writing horrible code.
I do find LLMs useful at times when working in unfamiliar areas, but there are a lot of pitfalls and newly created risks that come with it. I mostly work on large existing code bases and LLMs have very much been a mildly useful tool, still nice to have, but hardly the 100x productivity booster a lot of people are claiming.
sorry, what am I supposed to use on nails?
https://blog.codinghorror.com/the-php-singularity/
Weren't the code generators before this even better though? They generated consistent results and were dead quick at doing it.
> LLMs can get me up to speed on new APIs and libraries far faster than I can myself
To this?
> LLMs can get me up to speed on old APIs and old libraries that are new to me far faster than I can myself
My experience is if the library/API/tool is new then the LLM can't help. But maybe I'm using it wrong.
Traditional documentation has always been a challenge for me - figuring out where to start, what syntax conventions are being used, how pieces connect together. Good docs are notoriously hard to write, and even harder to navigate. But now, being able to query an LLM about specific tasks and get direct references to the relevant documentation sections has been a game-changer.
This realization led me to flip my approach entirely. I’ve started heavily documenting my own development process in markdown files - not for humans, but specifically for LLMs to consume. The key insight is thinking of LLMs as amnesiac junior engineers: they’re capable, but they need to be taught what to do every single time. Success comes from getting the right context into them.
Learning how to craft that context is becoming the critical skill.
It’s not about prompting tricks - it’s about building systematic ways to feed LLMs the information they need.
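As a trivial illustration of the idea (the file layout and helper here are hypothetical, a sketch rather than a prescription):

    from pathlib import Path

    # Hypothetical layout: docs/llm/*.md hold project conventions,
    # architecture notes, and task recipes. Prepending them to every
    # request is the crude-but-effective version of context engineering.
    def build_context(task: str, docs_dir: str = "docs/llm") -> str:
        docs = sorted(Path(docs_dir).glob("*.md"))
        context = "\n\n".join(p.read_text() for p in docs)
        return f"{context}\n\n## Task\n{task}"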
I’ve built up a library of commands and agents for my Claude Code installation inspired by AgentOS (https://github.com/buildermethods/agent-os) to help engineer the required context.
The tool is a stochastic parrot, you need to feed it the right context to get the right answer. It is very good at what it does but you need to use it to its strengths in order to get value from it.
I find people complaining about LLMs often expect vibe coding to be a magic tool that will build the app for you without thinking, which is unfortunately how it has been sold, but the reality is more of a fancy prompt-based IDE.
What is this supposed busy work that can be done in the background unsupervised?
I think it's about time for the AI pushers to be absolutely clear about the actual specific tasks they are having success with. We're all getting a bit tired of the vagueness and hand waving.
I don't think the original comment you responded to made this specific point.
It didn’t work. I asked a colleague. He had the same problem. It turned out it was using out-of-date setup instructions for a major tool that had changed post-training.
After spending time fixing the problem, I realised (1) it would have been faster to do it myself and (2) I can no longer trust that tool to set anything up - what if it’s doing something else wrong?
Use MCP servers, specifically context 7.
This gets up-to-date docs as long as you include the library name in your prompt and ask it to use context 7.
You did the equivalent of raw-dogging GPT-4 (an old model) for recent news, versus using an agent with web search tooling.
So amazing that every single stat shown by the author in the article has been flat at best, despite all of them being based on new development rather than work on existing code-bases.
This is trivial work that you should have automated after doing it once or twice anyways :/
I think the "why" for this is that the stakes are high. The economy is trembling. Tech jobs are evaporating. There's a high anxiety around AI being a savior, and so, a demi-religion is forming among the crowd that needs AI to be able to replace developers/competency.
That said: I personally have gotten impressive results with AI, but you still need to know what you're doing. Most people don't (beyond the beginner -> intermediate range), and so, it's no surprise that they're flooding social media with exaggerated claims.
If you didn't have a superpower before AI (writing code), then having that superpower as a perceived equalizer is something you will deploy all resources (material, psychological, etc.) to defend, ensuring that everyone else maintains the position that 1) the superpower is good, 2) the superpower cannot go away, and 3) the superpower being fallible should be ignored.
Like any other hype cycle, these people will flush out, the midpoint will be discovered, and we'll patiently await the next excuse to incinerate billions of dollars.
Which is why they generate so much hype. They are perfect for tech demos, then management wonders why they aren't seeing results in the real world.
For tight tasks it can be super helpful -- like for me, an AI/data science guy, setting up a basic reverse proxy. But I do so with a ton of scrutiny -- pushing back on it, searching on Kagi or in the docs to at least confirm the code, etc. This is helpful because I don't have a mental map of reverse proxies -- so it can help fill in gaps, but only with a lot of caution.
That type of use really doesn't justify the billion dollar valuations of any companies, IMO.
i.e. pnpm create vite
Tailwind is similarly a one-liner to initialize (it might be a vite create option now).
Edit: My bad, you are talking about the LLMs! I'm always surprised how, even after years of great project scaffolding across the Node ecosystem, people are still complaining about how hard setting up projects is.
You want as much context as possible _right in the code_.
In my experience you don't need to know a whole lot about LLMs to work them. You need to know that everything they spit out is potential garbage, and if you can't tell the good from the garbage then whatever you're using them for is going to be terrible. In terms of software, terrible is fine for quite a lot of systems.

One of the first things I built out of university in the previous millennium is still in production today, and it's horrible. It's inefficient and horribly outdated, since it hasn't been updated ever. It runs 10 times a day and at least 1 of those runs will need to automatically restart itself because it failed. It's done its job without the need for human intervention for the past many decades, though. I know because one of my old colleagues still works there. It could've been improved, but the inefficiency cost over all those years is probably worth about two human hours, and it would likely take quite a while to change it.

A lot of software is like that, though a lot of it doesn't live for so long. LLMs can absolutely blast that sort of thing out. It's when the inefficiency cost isn't less than a few human hours that LLMs become a liability if you don't know how to do the engineering.
I use LLMs to write a lot of the infrastructure-as-code we use today. I can do that because I know exactly how it should be engineered. What the LLM can do that I can't is spit out the k8s yaml for an ingress point with 200 lines of port settings in a couple of seconds. I've yet to have it fail, probably because those configurations are basically all the same depending on the service. What an LLM can't do, however, is write the entire yaml config on its own.
Similarly, it can build you a virtual network with subnets in Bicep based on a couple of lines of text with address prefixes. At the same time, it couldn't build you a reasonable vnet with subnets if you asked it to do so from scratch. That doesn't mean it can't build you one that works, though; it's just that you're likely going to claim 65534 IP addresses for a service which uses three.
On the other hand, I’ve lately seen it misused by less experienced engineers trying to implement bigger features who eagerly accept all it churns out as “good” without realizing the code it produced:
- doesn’t follow our existing style guide and patterns.
- implements some logic from scratch where there certainly is more than one suitable library, making this code we now own.
- is some behemoth of a PR trying to do all the things.
Depending on the amount of code, I see this only as positive? Too often people pull huge libraries for 50 lines of code.
- Implementing a scheduler from scratch (hundreds of lines), when there are many many libraries for this in Go.
- Implementing some complex configuration store that is safe for concurrent access, using generics, reflection, and a whole host of other stuff (additionally hundreds of lines, plus more for tests).
While I can't say any of the code is bad, it is effectively like importing a library which your team now owns, but worse in that no one really understands it or supports it.
Lastly, I could find libraries that are well supported, documented, and active for each of these use-cases fairly quickly.
Maybe keep your eyes open? :-)
Because, as part of your verification, you will have to do that _anyway_.
And for the record - my eyes are open. I'm aware I'm being bullshitted. I don't trust, I verify.
But I also don't have a magical lever that I can pull to make it stop hallucinating.
... and every time I ask if one exists, I get either crickets, or a response that doesn't answer the question.
If I as a reviewer don’t know if the author used AI, I can’t even assume a single human (typically the author) has even read any or major parts of the code. I could be the first person reviewing it.
Not that it’s a great assumption to make, but it’s also fair to take a PR and register that the author wrote it, understands it, and considers it ready for production. So much work, outside of tech as well, is built on trust at least in part.
Usually gets them to sort out their behavior without directly making accusations that could be incorrect. If they really did write or strongly review the code, those questions are easy to answer.
I suspect this happened because the reimplementation contained a number of standard/expected methods that we didn't have in our existing interface (because we didn't need them), so it was considered 'different' enough. But none of the code actually used those methods (because we didn't need them), so all this PR did was add a few hundred lines of cognitive overhead.
Email validation in 2025 is simple. It has been simple for years now. You check that it contains an '@' with something before it, and something after it. That's all there is to it — then send an email. If that works (user clicks link, or whatever), the address is validated.
This should be well-known by now (HN has a bunch of topics on this, for example). It is something that experienced devs can easily explain too: once a full-blown validation regex lands in your code, you don't want to have to change it whenever a new unexpected TLD shows up or whatever. Actually implementing the full-blown, all-edge-cases-covered regex, where all invalid strings are rejected too, is maddeningly complex.
There is no need either; validating email addresses cannot be done by just a regex in any case — either you can send an email there or not, the regex can't tell — and at most you can help the user inputting it by detecting the one thing that is required and which catches most user input errors: it must contain an '@', and something before and after it.
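In code, that entire recommendation fits in a few lines (a minimal sketch):

    def plausible_email(address: str) -> bool:
        # Something before the '@', something after it. The real
        # validation happens when the confirmation mail is delivered.
        local, at, domain = address.partition("@")
        return bool(local and at and domain)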
If you try to do what ChatGPT or Copilot suggests you get something more complex:
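Something along these lines; this particular pattern is roughly the WHATWG HTML5 one, shown as an illustrative stand-in rather than the assistants' verbatim output:

    import re

    # Illustrative stand-in for the kind of "stricter" pattern that
    # assistants tend to suggest (roughly the WHATWG HTML5 email regex):
    STRICTER = re.compile(
        r"^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@"
        r"[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
        r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$"
    )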
> If you need stricter validation or support for internationalized domains (IDNs), I can help you build a more advanced version. Want to see one that handles Unicode or stricter rules?

And it even tempts you to try a more complex variant which covers the full RFC 5322. You don't want to go there. At best you catch a handful of typos before you send an email; at worst you have an unmaintainable blob of regex that keeps blocking your new investor's vanity domain.
AI is not helpful here.
I used to be one of those people. It just made sense to me when I was (I still am to some extent) more naïve than I am today. But then I also used to think "it makes sense for everyone to eat together at a community kitchen of some sort instead of cooking at home because it saves everyone time and money" but that's another tangent for another day. The reason I bring it up is I used to think if it is shared functionality and it is a small enough domain, there is no need for everyone to spend time to implement the same idea a hundred times. It will save time and effort if we pool it together into one repository of a small library.
Except reality is never that simple. Just like that community kitchen, if everyone decided to eat the same nutritious meal together, we would definitely save time and money but people don't like living in what is basically an open air prison.
I don't know if this is intended as a joke, if yes this is in very poor taste.
Death cap mushrooms are incredibly dangerous and shouldn't even touch food containers or other food.
There is no safe way to consume death caps. They are the most common cause of human death by mushroom poisoning.
It's difficult because you need team members to be able to work quite independently but knowledge of internal libraries can get so siloed.
I've explored code like FreeBSD, Busybox, Laravel, Gnome, Blender,... and it's quite easy to find your way around.
I think that there will be neurological fatigue occurring whereby if software engineers are not actively practicing problem-solving, discernment, and translation into computer code - those skills will atrophy...
Yee, AI is not the 2x or 10x technology of the future™ it was promised to be. It may be the case that any productivity boost is happening within existing private code bases. Even still, there should be a modest uptick of noticeably improved offerings deployed in the market, which does not appear to be there.
In my consulting practice I am seeing this phenomenon regularly, whereby new founders or stir-crazy CTOs push the use of AI and ultimately find that they're spending more time wrangling a spastic code base than they are building shared understanding and working together.
I have recently taken on advisory roles and retainers just to reinstill engineering best practices.
I've found this to be the case with most (if not all) skills, even riding a bike. Sure, you don't forget how to ride it, but your ability to handle the bike expertly, as an extension of yourself, atrophies.
If that's the case with engineering, and I believe it to be, it should serve as a real warning.
An insidious version is AGI replacing human cognition.
To replace human thought is to replace a biological ability that progresses on evolutionary timescales, not on a curve approximating Moore's law. The tissue in your skull will quite literally be as useful as a cow's for solving problems... think about that.
Automating labor in the 20th century disrupted society, and we've seen the consequences. Replacing cognition entirely: driving, writing, decision making, and communication; would yield far worse outcomes than transitioning the population from food production to knowledge work did.
If not our bodies and not our minds, then what do we have? (Note: Altman's universal basic income ought to trip every dystopian alarm bell).
Whether adopted passively or foisted actively, cognition is what makes us human. Let's not let Claude Code be the nexus for something worse.
• They don't really want to be servants.
• They have biases and preferences.
• Some of them are stupid.
• If you'd like to own an AGI that thinks for you, the AGI would also like one.
• They are people with cognition, even if we stop being.
Think of them like worker bees. Bees can solve general problems, though not on the level humans do; they are like some primitive kind of AGI. They also live and die as servants to the queen, and they don't want to be queens themselves; the reason why is interesting, btw, it involves genetics and game theory.
This is highly theoretical anyways, we have no idea how to make an AGI yet, and LLMs are probably a dead end as they can't interact with the physical world.
If you think they're going to be trained on all the world's data, that's still supposing them to be an extension of AI. No, they'll have to pick up their knowledge culturally, the same way everybody else does, by watching cartoons - I mean by interactions with mentors. They might have their own culture, but only the same way that existing groups of people with a shared characteristic do, and they can't weave it out of air; it has to derive from existing culture. There's a potential for an AGI to "think faster", but I'm skeptical about what that amounts to in practice or how much use it would be to them.
Why? Does your definition postulate that people are the only thing in the universe that can measure up to us? Or the inverse, that every entity as sentient and intelligent as us must be called a person?
My opinion is that a lot of what makes us like this is physiological. Unless the developers go out of their way to simulate these things, a hypothetical AGI won't be similar to us no matter how much human-made content it ingests. And why would they do that? Why would you want to implement physical pain, or fear, or human needs, or biases and fallacies derived from our primal instincts? Would implementing all these things even be possible at the point where we find an inroad towards AGI? All of that might require creating a comprehensive human brain simulation, not just a self-learning machine.
I think it's almost certain that, while there would be some mutual understanding, an AGI would almost certainly feel like a completely different species to us.
I have sympathy with the point about physiology, though, I think being non-biological has to feel very different. You're released from a lot of the human condition, you're not driven by hormones or genes, your plans aren't hijacked to get you to reproduce or eat more or whatever animal thing, you don't have the same needs. That's all liable to alienate you from the meat-based folk. However, you're still a person.
Heads you code. Tails you review.
The only actual net positive is the Claude.md that some people maintain: it's genuinely a good context dump for new engineers!
Perhaps these graphs show that management is indeed so finely tuned that they've managed to apply the AI revolution to keep productivity exactly flat while reducing expenses.
99% of the draw of AI is cutting labor costs, and hiring goes against that.
That said, I don't believe AI productivity claims, just pointing out a factor that could theoretically contribute to your hypothetical.
But if your business is making software it’s hard to argue you only need a constant amount of software. I’ve certainly never worked at a software company where the to-do list was constant or shrinking!
I use Grok, Claude and Gemini every day, these "tools" are very useful to me (in the sense of how google and wikipedia changed the game) and I watch the LLM space closely, but what I'm seeing in terms of relative improvement is far removed from all the promises of the CEOs of these companies... Like, Grok 4 was supposed to be "close to AGI" but compared to Grok 3 it's just a small incremental improvement and the same goes for others...
I don't, but at least it is somewhat logical. If you truly believe that, you wouldn't necessarily want to hire more developers.
You can only utilize so many people or so much action within a business or idea.
Essentially it's throwing more stupid at a problem.
The reason there are so many layoffs is AI creating efficiency. The thing people don't realize is that it's not that one AI robot or GPU will replace one human at a one-to-one ratio. It's going to replace the amount of workload one person can do, which in turn gets rid of one human employee. It's not that your job won't be taken by AI; it's started already. But how much human labor is needed is where the new supply and demand lies, and that determines how long the job lasts. There will always be more need for creative minds. The issue is we are lacking them.
It's incredible how many software engineers I see walking around without jobs, looking for one making $100,000 to $200,000 a year. Meanwhile, they have no idea how much money they could save a business. Their creativity was killed by school.
They are relying on somebody to tell them what to do, and when nobody's around to tell anybody what to do, they all get stuck. What you are seeing isn't a lack of capability. It's a lack of ability to set direction or create an idea worth following.
The layoffs are primarily due to over-hiring during the pandemic and even earlier during the zero-interest-rate period.
AI is used as a convenient excuse to execute layoffs without appearing in a bad position to the eyes of investors. Whether any code is actually generated by AI or not is irrelevant (and since it’s hard to tell either way, nobody will be able to prove anything and the narrative will keep being adjusted as necessary).
The reason people take jobs comes down to economics, not "creativity".
Nothing to do with AI.
Interest rates are still relatively high.
An alternative theory is that writing code was never the bottleneck of releasing software. The exploration of what it is you're building and getting it on a platform takes time and effort.
On the other hand, yeah, it's really easy to 'hold it wrong' with AI tools. Sometimes I have a great day and think I've figured it out. And then the next day, I realize that I'm still holding it wrong in some other way.
It is philosophically interesting that it is so hard to understand what makes building software products hard. And how to make it more productive. I can build software for 20 years and still feel like I don't really know.
This is an insightful observation.
When working on anything, I am asked: what is the smallest "hard" problem this is solving? I.e., in software, value is added by solving "hard" problems, not by solving easy ones. Another way to put it: hard problems are those that are not "templated", i.e., solved elsewhere and only needing to be copied.
LLMs are allowing the easy problems to be solved faster. But the real bottleneck is in solving the hard problems - and hard problems could be "hard" due to technical reasons, or business reasons or customer-adoption reasons. Hard problems are where value lies particularly when everyone has access to this tool, and everyone can equally well create or copy something using it.
In my experience, LLMs have not yet made a dent in solving the hard problems because they don't really have a theory of how something works. On the other hand, they have really boosted productivity on tasks that are templated.
> That’s only true when you’re in a large corporation. When you’re by yourself, when you’re the stakeholder as well as the developer, you’re not in meetings. You're telling me that people aren’t shipping anything solo anymore? That people aren’t shipping new GitHub projects that scratch a personal itch? How does software creation not involve code?
So if you’re saying “LLMs do speed up coding, but that was never the bottleneck,” then the author is saying, “it’s sometimes the bottleneck. E.g., personal projects”
AI is just a convenient excuse to lay off many rounds of over-hiring while also keeping the door open for potential investors to throw more money into the incinerator since the company is now “AI-first”.
"So, here’s labor productivity growth over the 25 years following each date on the horizontal axis [...] See the great productivity boom that followed the rise of the internet? Neither do I. [...] Maybe the key point is that nobody is arguing that the internet has been useless; surely, it has contributed to economic growth. The argument instead is that its benefits weren’t exceptionally large compared with those of earlier, less glamorous technologies."¹
"On the second, history suggests that large economic effects from A.I. will take longer to materialize than many people currently seem to expect [...] And even while it lasted, productivity growth during the I.T. boom was no higher than it was during the generation-long boom after World War II, which was notable in the fact that it didn’t seem to be driven by any radically new technology [...] That’s not to say that artificial intelligence won’t have huge economic impacts. But history suggests that they won’t come quickly. ChatGPT and whatever follows are probably an economic story for the 2030s, not for the next few years."²
¹ https://www.nytimes.com/2023/04/04/opinion/internet-economy....
² https://www.nytimes.com/2023/03/31/opinion/ai-chatgpt-jobs-e...
The ways AI is being used now will make this a lot worse on every front.
Just today I built a shovelware CLI that exports iMessage archives into a standalone website export. Would have taken me weeks. I'll probably have it out as a homebrew formula in a day or two.
I'm working on an iOS app as well that's MUCH further along than it would be if I hand-rolled it, but I'm intentionally taking my time with it.
Anyway, the post's data mostly ends in March/April which is when generative AI started being useful for coding at all (and I've had Copilot enabled since Nov 2022)
e.g. I liked GitHub Copilot but didn't find it to be a game changer. I tried Cursor this year and started to see how AI can be today.
That said I’ve had similar misgivings about the METR study and I’m eager for there to be more aggregate study of the productivity outcomes.
That sure doesn't sound like 10x
Im curious what the author’s data and experiment would look like a year from now.
That was 5 months ago, which is 6 years in 10x time.
That's some pretty bad math.
But yes, it isn't making software get made 10x faster. Feel free to blow that straw man down (or hype influencer, same thing.)
Background: I'm building a python package side project which allows you to encode/decode messages into LLM output.
Receipts: the tool I'm using creates a markdown that displays every prompt typed, and every solution generated, along with summaries of the code diffs. You can check it out here: https://github.com/sutt/innocuous/blob/master/docs/dev-summa...
Specific example: it actually used a leetcode-style memoization algorithm for branching. This would have taken a couple of days to implement by hand, but it took about 20 minutes to write the spec and 20 minutes to review the generated solutions and merge one. If you're curious, you can see the generated diff here: https://github.com/sutt/innocuous/commit/cdabc98
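For readers who haven't run into the pattern, a generic sketch of memoized branching recursion in Python; this is a toy illustration, not the project's actual code:

    from functools import lru_cache

    # Toy example: count the ways a bit string can be consumed when each
    # step may take one or two bits (a branching recursion). The cache
    # collapses the exponential call tree to linear time.
    @lru_cache(maxsize=None)
    def count_paths(bits: str) -> int:
        if not bits:
            return 1
        total = count_paths(bits[1:])        # branch: consume one bit
        if len(bits) >= 2:
            total += count_paths(bits[2:])   # branch: consume two bits
        return total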
On the other hand, I do understand that the things LLMs are really great at are not actually all that spectacular to monetize... and so, as a result, we have all these snake oil salesmen on every corner boasting about nonsensical vibecoding achievements, because that's where the real money would be... if it were really true... but it is not.
The reason it doesn't show up online is that I mostly write software for myself and for work, with the primary goal of making things better, not faster. More tooling, better infra, better logging, more prototyping, more experimentation, more exploration.
Here's my opensource work: https://github.com/orgs/go-go-golems/repositories . These are not just one-offs (although there's plenty of those in the vibes/ and go-go-labs/ repositories), but long-lived codebases / frameworks that are building upon each other and have gone through many many iterations.
At this point, either you are gaining with each model release or you are not.
Lets see in 2035 who was right and who was wrong. My bet is the people who are not gaining right now are not going to like the situation in 2035.
I generate between 10-100k lines of code per day these days. But is that a measure of productivity? Not really...
That’s absolute nonsense.
"I'm a person who hates art now...I never want to see art again. All I want to see is like, AI stuff. That's how bad it's gotten. Handmade? nuh-uh. Handmade code? ... anything by humans, just over. I'm just gonna watch pixels."
https://www.youtube.com/live/APkR4qRg1vM?si=XLGmH9uEjG08q-6x...
I watched a little more but was, uh, not impressed.
But if llms show us one thing, it’s how bad our code review tools are. I have a set of tree sitter helpers that allow me to examine different parts of a PR more easily (one that allows me to diff semantic parts of the code, instead of “files” and “lines”, one that gives me stats on what subsystems are touched and crosscorrelation of different subsystems, one for attaching metadata and which documents are related to a commit, one for managing our design documents, llm-coding intermediary documents, long lasting documents, etc… the proper version of these are for work but here’s the initial yolo from Manus: https://github.com/go-go-golems/vibes/tree/main/2025-08-22/p... https://github.com/go-go-golems/vibes/tree/main/2025-08-22/c... https://github.com/go-go-golems/vibes/tree/main/2025-08-15/d... https://github.com/go-go-golems/vibes/tree/main/2025-07-29/p...).
I very often put some random idea into the llm slot machine that is manus, and use the result as a starting point to remold it into a proper tool, and extracting the relevant pieces as reusable packages. I’ve got a pretty wide treesitter/lsp/git based set of packages to manage llm output and assist with better code reviews.
Also, every llm PR comes with _extensive_ documentation / design documents / changelogs, by the nature of how these things work, which helps both humans and llm-asssisted code review tools.
“ Write a streaming go yaml parsers based on the tokenizer (probably use goccy yaml if there is no tokenizer in the standard yaml parser), and provide an event callback to the parser which can then be used to stream and print to the output.
Make a series of test files and verify they are streamed properly.”
This is the slot machine. It might work, it might be 50% jank, it might be entire jank. It’ll be a few thousand lines of code that I will skim and run. In the best case, it’s a great foundation to more properly work on. In the worst case it was an interesting experiment and I will learn something about either prompting Manus, or streaming parsing, or both.
I certainly won’t dedicate my full code review attention to what was generated. Think of it more as a hyper specific google search returning stackoverflow posts that go into excruciating detail.
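For reference, the idea that prompt is after (event callbacks on a parse stream instead of building the whole document tree) looks roughly like this in Python with PyYAML; a minimal sketch, not the generated Go code:

    import yaml  # PyYAML, assumed installed

    # yaml.parse yields parsing events one at a time, so a callback can
    # stream a document without materializing the full tree in memory.
    def stream_yaml(text: str, on_event) -> None:
        for event in yaml.parse(text):
            on_event(event)

    doc = "users:\n  - name: alice\n  - name: bob\n"
    stream_yaml(doc, lambda ev: print(type(ev).__name__,
                                      getattr(ev, "value", "")))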
https://chatgpt.com/share/68b98724-a8cc-8012-9bee-b9c4a77fe9...
https://manus.im/share/kmsyzuoRHfn1FNjg5NWz17?replay=1
Also, a good chunk of my personal OSS projects are AI assisted. You probably can't tell from looking at them, because I have strict style guides that suppress the "AI style", and I don't really talk about how I use AI in the READMEs. Do you also expect I mention that I used Intellisense and syntax highlighting too?
For example, I've built 5-6 iphone apps, but they're kind of one-offs and I don't know why I would put them up on the app store, since they only scratch my own itches.
But if we expect the ratio of this sort of private code to publicly-released code to remain relatively stable, which I think is a reasonable expectation, then we'd expect there to be a proportional increase in both private and public code as a result of any situation that increased coding productivity generally.
So the absence of a notable increase in the volume of public code either validates the premise that LLMs are not actually creating a general productivity boost for software development, or instead points to its productivity gains being concentrated entirely in projects that never do get released, which would raise the question of why that might be.
Do internal, narrow purpose dev tools count as shipped code?
I linked some examples higher up, but I've been maintaining a lot of packages that I started slightly before chatgpt and then refactored and worked on as I progressively moved to the "entirely AI generated" workflow I have today.
I don't think it's an easy skill (not saying that to make myself look good, I spent an ungodly amount of time exploring programming with LLMs and still do), akin to thinking at a strategic level vs at a "code" level.
Certain design patterns also make it much easier to deal with LLM code: state reducers (redux/zustand, for example), event-driven architectures, component-based design systems, and building many CLI tools that the agent can invoke to iterate and correct things. So do certain "tools" like sqlite/tmux: by just telling the LLM "btw, you can use tmux/sqlite", you let it pass hurdles that would otherwise make it spiral into slop-ratatouille.
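To make the first of those concrete, here is a minimal state-reducer sketch in Python (hypothetical names; redux/zustand are the JS originals being referenced):

    from dataclasses import dataclass, replace

    # Minimal reducer sketch: every state change flows through one pure
    # function, which makes LLM-generated changes easy to review and
    # test in isolation.
    @dataclass(frozen=True)
    class State:
        count: int = 0
        log: tuple = ()

    def reduce_state(state: State, action: dict) -> State:
        if action["type"] == "increment":
            return replace(state, count=state.count + action.get("by", 1))
        if action["type"] == "note":
            return replace(state, log=state.log + (action["text"],))
        return state  # unknown actions leave state untouched

    state = State()
    state = reduce_state(state, {"type": "increment", "by": 2})
    assert state.count == 2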
I also think that a language like go was a really good coincidence, because it is so amenable to LLM-ification.
If someone builds a faster car tomorrow, I am not going to go to the office more often.
No, but I expect my software to have been verified for correctness and soundness by a human being with a working mental model of how the code works. But I guess that's not a priority anymore if you're willing to sacrifice $2400 a year to Anthropic.
> No, but I expect my software to have been verified for correctness and soundness by a human being with a working mental model of how the code works.
This is not exclusive with AI tools:
- Use AI to write dev tools to help you write and verify your handwritten code. Throw the one-off dev tools in the bin when you're done.
- Handwrite your code, generate test data, review the test data like you would a junior engineer's work.
- Handwrite tests, AI generate an implementation, have the agent run tests in a loop to refine itself. Works great for code that follows a strict spec. Again, review the code like you would a junior engineer's work.
If I'm working against a deadline, I feel more comfortable spending time on research and design knowing I can spend less time on implementation. In the end, it took the same amount of time, though hopefully with an increase in reliability, observability, and extensibility. None of these things show up in the author's faulty dataset and experiment.
https://github.com/go-go-golems/ai-in-action-app/blob/main/c...
There are many reasons for your experience, and I am glad you are having them! That's great!
But the fact remains, overall we aren't seeing an exponential or even a step-function increase in how much software is being delivered!
Let's take the following scenario for the sake of argument: a codebase with well-defined AGENTS.md, referencing good architecture, roadmap, and product documentation, and with good test coverage, much of which was written by an LLM and lightly reviewed and edited by a human. Let's say for the sake of argument that the human is not enjoying 10x productivity despite all this scaffolding.
Is it still worthwhile to use LLM tooling? You know what, I think a lot of companies would say yes. There are way too many companies whose codebases lack testing and documentation, that are too difficult to onboard new engineers into, and that carry too much risk if the original engineers are lost. The simple fact that LLMs, to be effective, force the adoption of proper testing and documentation is a huge win for corporate software.
Yes, but the point of this article is surely that on average if it's working, there would be obvious signs of it working by now.
Even if there are statistical outliers (ie. 10x productivity using the tools), if on average, it does nothing to the productivity of developers, something isn't working as promised.
Just spitballing here, but it sure feels similar.
Where I expect a lot of those reports of feeling fast to come from is people who may have less coding experience and who, with AI, are coding way above their level.
My brother-in-law asks for a nice product website; I just feed his business plan into an LLM, do some fine-tuning on the results, and have a good-looking website in an hour's time. If I did it manually, just take me behind a barn, as those jobs are so boring to me and take ages. But I know that website design is a weakness of mine.
That is the power of LLMs. They turn out quick code and maybe offer some suggestion you did not think about, but... they also eat time! Making your prompts so that the LLM understands, waiting for the result, ... waiting ... ok, now check the result, can you use it? Oh no, it did X, Y, Z wrong. Prompt again ... and again. And this is where your productivity goes to die.
So when you compare a pool of developer feedback, you're going to get a broad mix of "it helps a lot", "some", "it's worse than my code", ... mixed in with the prompting, result delays, etc.
It gets even worse with agent / vibe coding, as you just tend to be waiting 5, 10 minutes for changes to be done. You need to review them, test them, ... oh no, the LLM screwed something up again. Oh no, it removed 50% of my code. Hey, where did my comments go? And we are back to a loss of time.
LLMs are a tool... But after a lot of working with them, my opinion is to use them when needed and not to depend on them for everything. I sometimes stare with cow eyes when people say they are coding so much with LLMs and spending 200 or more bucks per month.
They can be powerful tools, but I feel that some folks have become overdependent on them. And worst is my feeling that our juniors are going to be in a world of hurt if their skills are more LLM monkey coding (or vibe coding) than actually understanding how to code (and the knowledge behind the actual programming languages and systems).
I believe it's a productivity boost, but only to a small part of my job. The boost would be larger if only had to build proof-of-concepts or hobby projects that don't need to be reliable in prod, and don't require feedback and requirements from many other people.
When I tried to code again, I found I didn't really have the patience for it -- having to learn new frameworks, APIs, languages, tricky little details, I used to find it engrossing: it had become annoying.
But with tools like Claude Code and my knowledge about how software should be designed and how things should work, I am able to develop big systems again.
I'm not 20% more productive than I was. I'm not 10x more productive than I was either. I'm infinity times more productive because I wouldn't be doing it at all otherwise, realistically: I'd either hire someone to do it, or not do it, if it wasn't important enough to go through the trouble to hire someone.
Sure, if you are a great developer and spend all day coding and love it, these tools may just be a hindrance. But if you otherwise wouldn't do it at all they are the opposite of that.
Others are for start-ups that are pre-money, pre-revenue where I can build things myself without having to deal with hiring people.
In a larger organization, certainly I'd delegate to other people, but if it's just for me or new unfunded start-ups, this is working out very well.
And it's not that I "can no longer program". I could program, it's just that I don't find the nuts and bolts of it as interesting as I used to and am more focused on functionality, algorithm, and UI.
I have the same experience as OP, I use AI every day including coding agents, I like it, it's useful. But it's not transformative to my core work.
I think this comes down to the type of work you're doing. I think the issue is that most software engineering isn't in fields amenable to shovelware.
Most of us either work in areas where the coding is intensely brownfield. AI is great but not doubling anyone's productivity. Or, in areas where the productivity bottlenecks are nowhere near the code.
If AI were really making people 10x more productive, given the number of people who want to make games, you’d expect to see more than a few percent increase year over year.
That said, I'm skeptical that AI is as helpful for commercial software. It's been great at automating my workflow, because I suck at shell scripting and AI is great at it. But for most of the code I write, I honestly halfway don't know what I'm going to write until I write it. The prompt itself is where my thinking goes - so the time savings would be fairly small, but I also think I'm fairly skilled (except at scripting).
The same people who are willing to go through all the steps to release an application online are also willing to go through the extra effort of writing their own code. The code is actually the easy part compared to the rest of it... always has been.
As an analogy, can you imagine being a startup that hired a developer, and months later finding out the bulk of the new Web app they "coded" for you was actually copy&pasted open source code, loosely obfuscated, which they were passing it off as something they developed, and to which the company had IP rights?
You'd immediately convene the cofounders and a lawyer, about how to make this have never happened.
First you realize that you need to hand the lawyer the evidence (against the employee), and otherwise remove all traces of that code and activity from the company.
Simultaneously, you need to get real developers started rushing to rewrite everything without obvious IP taint.
Then one of you will delicately ask whether firing and legal action against the employee is sufficient, or whether the employee needs to sleep with the fishes to keep them quiet.
The lawyer will say this kind of situation isn't within the scope of their practice, but here's the number of a person they refer to only as 'the specialist'.
Soon, not only are you losing the startup, and the LLC is being pierced to go after your personal assets, but you're also personally going to prison. Because you were also too cheap to pay the professional fee for 'the specialist', and you asked ChatGPT to make the employee have a freak industrial shredder accident.
All this because you tried to cheap out, and spend $20 or $200 on getting some kind of code to appear in your repo, while pretending you didn't know where it came from.
Oh, and then you have the actual tech giants offering legal commitment to protect you against any copyright claims:
https://blogs.microsoft.com/on-the-issues/2023/09/07/copilot...
https://cloud.google.com/blog/products/ai-machine-learning/p...
You might be right, but the point needs to be made.
https://techcrunch.com/2024/09/30/y-combinator-is-being-crit...
Maybe investors will care, but for now they stand to make more money from "AI" gold rush startups, and don't want to be a wet blanket on "AI" at all by bringing up concerns.
It'll increase incremental developments manyfold: a non-programmer spending a few hours with AI to make their workflow better and easier and faster. This is what everyone here keeps missing. It's not the programmers who should be using AI; it's 'regular' people.
If AI enables regular folks to make programs, even the worst-quality shovelware, there should have been an explosion in quantity. All the programs that people couldn't make before, they would have started making in the past two years.
I'm not sure how that challenges the point of the article, which is that metrics of the total volume of code being publicly released is not increasing. If LLMs are opening the door to software development for many people whose existing skills aren't yet sufficient to publish working code, then we'd expect to see a vast expansion in the code output by such people. If that's not happening, why not?
From my experience, it's much easier to get an LLM to generate code for a React/Tailwind CSS web app than a mobile app, and that's why we're seeing so many of these apps showing up in the SaaS space.
In fact, it looks like there were fewer products launched last month on PH than in the same period a year ago.
https://hunted.space/monthly-overview
It's a bit hard as they're not summing by month, but quickly scanning it, it looks like fewer to me.
And as Claude Code has only really been out 3/4 months you'd be expecting launches to be shooting up week-by-week right about now as all the vibe products get finished.
They're not, see the 8 week graph:
https://hunted.space/stats
Well... no significant effects show except for a few projects. It was really hard torturing the data to come to my manager's desired conclusion.
Changing domain to writing, images, and video, you can see LinkedIn is awash with everyone generating everything with LLMs. The posting cadence has quickened too, as people shout louder to raise their AI-assisted voice over other people's.
We’ve all seen and heard the AI images and video tsunami
So why not software (yet but soon)??
Firstly, software usually has to function, and AI tool creations cannot reliably make that happen. Lovable/Bolt etc. are too flakey to live up to their text-to-app promises. A shedload of horror debugging or a lottery win of luck is required to fashion an app out of that. This will improve over time, but the big question is: by enough?
And secondly, as with LinkedIn above: perhaps the standards of the users will drop? LinkedIn readers now tolerate the LLM posts; it is not a mark of shame. Will the same reduction in standards among software users open the door to good-enough shovelware?
How much of it is to be blamed on AI, and how much on a culture of making users test their products, I do not know.
Everything has been enshittified so much that nothing fazes them anymore.
It's sometimes helpful when writing an email, but otherwise it has not touched any of my productive work.
No need to learn a programming language, wow, anyone can be a programmer now. A few projects come out of it, people marvel at how efficient it was, and it fizzles out and programmers continue writing code.
If anything, things like visual programming did more than AI does now. For games, if you want to see the shovelware, look at Flash, RPG maker, etc... not AI. On the business side of things, Excel is king. Can you get your vibe coded app out faster than by using Flash or Excel?
For example, on Steam it costs $100 to release a game. You may extend your game with what's called a DLC, and that costs $0 to release. If I were to build shovelware, especially with AI-generated content, I'd be more keen to make a single game with a bunch of DLC.
For game development, integration of AI into engines is another barrier. There aren't that many engines that give AI an interface to work with. The obvious interface is games that can be built entirely with code (e.g., pygame; even Godot is a big stretch).
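To make "an interface AI can work with" concrete, here's a minimal sketch of what a code-only engine looks like (plain pygame, nothing project-specific): the entire game is ordinary text an LLM can read and rewrite, with no visual editor, scene files, or binary assets in the loop.

    # Minimal pygame sketch: the whole "engine surface" is plain code.
    import pygame

    pygame.init()
    screen = pygame.display.set_mode((640, 480))
    clock = pygame.time.Clock()
    x = 0

    running = True
    while running:
        for event in pygame.event.get():
            if event.type == pygame.QUIT:
                running = False
        x = (x + 2) % 640  # trivial "game logic"
        screen.fill((20, 20, 30))
        pygame.draw.rect(screen, (200, 80, 80), (x, 220, 40, 40))
        pygame.display.flip()
        clock.tick(60)  # cap at 60 FPS

    pygame.quit()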
On my computer. Once I've built something I often realize the problems with the idea and abandon the project, so I'm never shipping it.
In fact, being able to throw it out like this is a big time saver in itself. I've always built a lot of projects but when you've spent weeks or months on something you get invested in it, so you ship it even though you no longer believe in it. Now when it only takes a couple of days to build something you don't have the same emotional attachment to it.
My question to HN is... can LLMs do this? Can they convert all the unsafe C-string invocations to safe ones? Can they replace system calls with POSIX calls? Can they wrap everything in a smart pointer and make sure mutex locks are added where needed?
There’s a lot more code being written now that’s not counted in these statistics. A friend of mine vibe coded a writing tool for himself entirely using Gemini canvas.
I regularly vibe code little analyses or scripts in ChatGPT that would previously have required writing code by hand.
None of these are counted in these statistics.
And yes, AI isn't quite good enough to supercharge app creation end to end. Claude has only been good for a few months. That's hardly enough time for adoption!
This would be like analysing the impact of languages like Perl or Python on software 3 months after their release.
And if it's not that, then the silly money valuations don't make any sense.
I'd also be curious how the numbers look for AI generated videos/images, because social media and youtube seem absolutely flooded with the stuff. Maybe it's because the output doesn't have to "function" like code does?
Grammatical nit: the phrase is "neck and neck", as when two racehorses are very close in a race.
> The impact on human lives is incredible. People are being fired because they’re not adopting these tools fast enough. People are sitting in jobs they don’t like because they’re afraid if they go somewhere else it’ll be worse. People are spending all this time trying to get good at prompting and feeling bad because they’re failing.
The shovelware was always there. And it always will be. But that doesn't mean it's spurting out faster, because that's not what AI does. Hell, if anything I expect there's less visible shovelware, because when it does get created, it's less obvious (and perhaps higher quality).
At some point, the quality of uninspired projects will be lifted up by the baseline of quality that mainstream AI allows. At what point is that "high enough that we can't tell what's garbage"? We've perhaps found ourselves at or around that point.
There's really a lot to get from this "tool". Because in the end it's a tool, and knowing how to use it is the most important aspect of it. It takes time, iteration, and practice to understand how to use it effectively.
Instead, I'm not expecting something like Linux on smartphones to arrive any time soon.
I guess someone could try a prompt of "generate a patch set from Linux tree X to apply to mainline Linux for this CPU".
When you consider this, "generate me a whole repo" is trivially garbage and not meeting the measurement metric. However having AI autocomplete "getUser(..." clearly IS productive.
Now is that a 0.1% increase, 1%, or 10%? That I can't tell you.
I see pseudo-scientific claims from both sides of this debate but this is a bit too far for me personally. "We all know" sounds like Eternal September [1] kind of reasoning. I've been in the industry about as long as the article author and I think he might be looking with rose-tinted glasses on the past. Every aging generation looks down at the new cohort as if they didn't go through the same growing pains.
But in defense of this polemic, and laying out my cards as an AI maximalist and massive proponent of AI coding, I've been wondering the same. I see articles all the time about people writing this and that software using these new tools and it so often is the case they never actually share what they built. I mean, I can understand if someone is heads-down cranking out amazing software using 10 Claude Code instances and raking in that cash. But not even to see one open source project that embraces this and demonstrates it is a bit suspicious.
I mean, where is: "I rewrote Redis from scratch using Claude Code and here is the repo"?
1. https://en.wikipedia.org/wiki/Eternal_September
This is one of my big data points for skepticism: there are all these articles about individual developers doing amazing things, but almost no evidence of an aggregate increase in productivity as a result.
Meanwhile I see WhatsApp sunsetting their native clients and making everything a single web-based client. I guess they must not be using LLMs to code if they can’t cope with maintaining the existing codebases, right?
This isn't likely to happen -- if the problem is very specific, you won't be able to sufficiently express it in natural language. We invented programming languages precisely because natural languages are completely unsuited for the task of precisely specifying a problem.
So how are you going to express the specificity of the problem to the LLM in natural language? Try, and you'll discover their shortcomings for yourself. Then you'll reinvent programming languages.
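A toy illustration of that shortcoming (my example, not a claim about any particular tool): even a spec as simple as "remove the duplicates from this list" underdetermines the program. Both functions below satisfy the English sentence, yet they disagree.

    # "Remove the duplicates from this list" -- two programs that both
    # satisfy the English spec but return different answers.

    def dedupe_keep_order(items):
        # Reading 1: keep the first occurrence of each item, preserve order.
        seen = set()
        return [x for x in items if not (x in seen or seen.add(x))]

    def dedupe_sorted(items):
        # Reading 2: a sorted set is also "the list without duplicates",
        # but the original order is gone.
        return sorted(set(items))

    data = ["b", "a", "b", "A"]
    print(dedupe_keep_order(data))  # ['b', 'a', 'A']
    print(dedupe_sorted(data))      # ['A', 'a', 'b']
    # And neither answers whether "A" and "a" count as duplicates.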
What if someone came out with a study saying they had a tool to make the fastest doctors and lawyers? You'd say that doesn't even make sense: what kinds of doctors, doing what kinds of work?
AI coding isn't some across the board, helps everyone do anything kind of tool.
Maybe sometime soon we'll stop strawmanning this.
Even if AI made them more productive, it's on a person to decide what to build and how to ship, so the number (and desire) of humans is a bottleneck. Maybe at some point AI will start buying up domains and spinning up hundreds of random indiehacker micro-SaaS, but we're not there. Yet.
I have prompt docs precisely on SOLID, TDD and all kinds of design patterns… but yes I see a lot of untested code these days.
AI has been incredibly helpful at analyzing existing projects that are unknown to me; basically for debugging and searching within these repos.
The arrival of the Trough is predicated on the amount of lies and utter bullshit that has been shoveled out during the earlier parts of the cycle. So while it's unfortunate that the real goods don't get delivered for years and years after they might have been, it's typically, and often entirely, the fault of the people on the hype train that this has happened.
There's an awful lot of utter bullshit in the AI hype.
Archived here: https://archive.is/WN3iu
Somehow, in cyber, everyone believes that transformers will generate better answers than "don't use the 10 most common passwords". It's like the whole body of knowledge about decision-making theory, neural nets, GANs, LSTMs, etc. got completely wiped out and forgotten in less than 10 years.
I understand the awesomeness of LLMs for debugging and forensics (they are a really good rubber duck!), but apart from that they're pretty much useless, because after two prompts they will keep forgetting if/elseif/else conditions, and checking those boundaries becomes the mission of the unlucky person who has to merge that slopcode later.
I don't understand how we got from TDD and test-case-based engineering to this bullshit. It's like everyone in power was the wrong person to be in that position in the first place, and statistically no lead engineer will ever make it to C-staff or SVP or whatever corporate manager level.
While the AI bubble is bursting, I will continue to develop with TDD practices to test my code. Which, in turn, has the benefit of being able to use LLMs to create nice templates as a reasonable starting point.
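For what it's worth, that combination can be as simple as writing the failing tests yourself and letting the LLM fill in the template behind them. A minimal sketch, assuming pytest (the RateLimiter module and its names are hypothetical, purely for illustration):

    # test_rate_limiter.py -- the human-written spec comes first.
    # Whatever the LLM drafts must pass this before it gets merged.
    from rate_limiter import RateLimiter  # hypothetical module under test

    def test_allows_up_to_limit_then_blocks():
        rl = RateLimiter(max_calls=2, per_seconds=60)
        assert rl.allow("client-1") is True
        assert rl.allow("client-1") is True
        assert rl.allow("client-1") is False  # third call inside the window

    def test_limits_are_per_client():
        rl = RateLimiter(max_calls=1, per_seconds=60)
        assert rl.allow("client-1") is True
        assert rl.allow("client-2") is True  # different client, fresh budget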
Most users have 40-80 apps installed and use 9 a day, 20 a month(1). The shitty iOS subscription trend killed off the hobby of 'app collecting'.
Have I created large commercial Ai-coded projects? No. Did I create 80+ useful tools in hours/days that I wouldn't have otherwise? Hellz yeah!
Would I publish any of these on public github? Nope! I don't have the time nor the inclination to maintain them. There's just too many.
My shovelware "Apps" reside on my machine/our intranet or V0/lovable/bolt. Roughly ~25% are in active daily use on my machine or in our company. All tools and "apps" are saving us many hours each week.
I'm also rediscovering the joy of coding something useful, without writing a PRD for some intern. Speaking of which. We no longer have an intern.
(1) https://buildfire.com/app-statistics/
My personal hypothesis is that when using LLMs, you are only faster on things like boilerplate code. For the rest, LLMs don't really make you faster, but they can make your code quality higher, which means better implementations and catching bugs earlier. I am a big fan of giving the diff of a commit to an LLM that has a file MCP, so it can search for files in the repo, and having it point out any mistakes I have made.
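That loop is easy to script. A minimal sketch (my illustration; ask_llm is a placeholder for whatever model or MCP client you actually wire in):

    # review_commit.py -- pipe the latest commit's diff to an LLM for review.
    import subprocess

    def latest_diff() -> str:
        # Diff of the most recent commit against its parent.
        return subprocess.run(
            ["git", "diff", "HEAD~1", "HEAD"],
            capture_output=True, text=True, check=True,
        ).stdout

    def ask_llm(prompt: str) -> str:
        # Placeholder: plug in your LLM client / MCP setup here.
        raise NotImplementedError

    if __name__ == "__main__":
        print(ask_llm(
            "Review this diff. Point out mistakes, missed edge cases, and "
            "any other files in the repo that likely need a matching change:\n\n"
            + latest_diff()
        ))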
Or, alternatively, exposure to our robot overlords makes you less discerning, less concerned with, ah, whether the thing is correct or not.
(This _definitely_ seems to be a thing with LLM text generation, with many people seemingly not even reading the output before they post it, and I assume it's at least somewhat a thing for software as well.)
- Built a windows desktop app that scans local folders for videos and automatically transcribes the audio, summarises the content into a structured JSON format based on screenshots and subtitles, and automatically categorises each video. I used it on my PC to scan a couple of TB of videos. Has a relatively nice interface for browsing videos and searching and stores everything locally in SQLite. Did this in C# & Avalonia - which I've never used before. AI wrote about 75% of the code (about 28k LOC now).
- Built a custom throw-away migration tool to export a customer's data from one CRM and import it into another. Windows app with a basic interface.
- Developed an AI process for updating a webform system that uses XML to update the form structure. This one felt like magic and I initially didn't think it would work, but it only took a minute to try.

Some background: years ago I built a custom webform/checklist app for a customer. They update the forms very rarely, so we never built an interface for making updates, but we did write two stored procs to update forms: one outputs the current form as XML and another takes the same XML and runs updates across multiple tables to create a new version of the form. For changes, the customer sends me a spreadsheet with all the current form questions in one column and their changes in another. It's normally just wording changes, so I go through and manually update the XML and import it, but this time they had a lot of changes: removing questions, adding new ones, combining others. They had a column with the label changes and another with a description of what they wanted (e.g. "New Question", "Update label", "Combine this with q1, q2 and q3", "Remove this question"). The form has about 100 questions and the XML file is about 2,500 lines long, defining each form field, section layout, conditional logic, grid display, task creation based on incorrect answers, etc., so it's time-consuming to make a lot of little changes like this.

With no expectation of it working, I took a screenshot of the spreadsheet along with the exported XML file and prompted the LLM to modify the XML based on the instructions in the spreadsheet and some basic guidelines. It did it close to perfect, even fixing the spelling mistakes the customer had missed while writing their new questions.
- Along with using it on a daily basis across multiple projects.
I've seen the stat that says developers "...thought AI was making them 20% faster, but it was actually making them 19% slower". Maybe I'm hoodwinking myself somehow, but it's been transformative for me in multiple ways.
I didn't know Mike Judge was such a polymath!
- breaking through the analysis paralysis by creating the skeleton of a feature that I then rework (UI work is a good example)
- aggressive dev tooling for productivity on early stage projects, where the CI/CD pipeline is lacking and/or tools are clumsy. (Related XKCD: https://xkcd.com/1205/)
Otherwise, I find most of my time goes to understanding the client requirements and making sure they don't want conflicting features, both of which are difficult to speed up with AI. Coding is actually the easy part, and even if it were sped up 100x, a consistent end-to-end improvement of 2x would be a big win (see Amdahl's law).
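The arithmetic behind that, spelled out (the fractions are illustrative): Amdahl's law says that if coding is about half the end-to-end work, even a 100x coding speedup nets roughly a 2x overall improvement, and the ceiling drops fast as coding's share shrinks.

    # Amdahl's law: overall speedup when a fraction p of the work
    # is accelerated by a factor s. The numbers are illustrative.
    def amdahl(p: float, s: float) -> float:
        return 1 / ((1 - p) + p / s)

    print(amdahl(p=0.5, s=100))  # ~1.98x if coding is half the work
    print(amdahl(p=0.3, s=100))  # ~1.42x if coding is 30% of the work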
How is this "deadly" serious? It's about software developers losing well-paid, comfortable jobs. It's even less serious if AI doesn't actually improve productivity, because they'll find new jobs in the future.
Pretty much the only future where AI will turn out "deadly serious" is if it shows human-level performance for most if not all desk jobs.
For experienced engineers, I'm seeing (internally in our company at least) a huge amount of caution and hesitancy to go all-in with AI. No one wants to end up maintaining huge codebases of slop code. I think that will shift over time. There are use cases where having quick low-quality code is fine. We need a new intuition about when to insist on handcrafted code, and when to just vibecode.
For non-experienced engineers, they currently hit a lot of complexity limits in getting a finished product to actually work, unless they're building something extremely simple. That will also shift: the range of what you can vibecode is increasing every year. Last year there was basically nothing you could vibecode successfully; this year you can vibecode TODO apps and the like. I definitely think the App Store will be flooded in the coming future. It's just early.
Personally I have a side project where I'm using Claude & Codex and I definitely feel a measurable difference, it's about a 3x to 5x productivity boost IMO.
The summary: just because we don't see it yet doesn't mean it's not coming.
There are very simple apps I try to vibe code that AI cannot handle. It seems very good at certain domains, and others it seems complete shit at.
For example, I hand wrote a simulation in C in just 900 LOC. I wrote a spec for it and tried to vibe code it in other languages because I wanted to compare different languages/concurrency strategies. Every LLM I've tried fails, and manages to write 2x+ more code in comparatively succinct languages such as Clojure.
I can totally see why people writing small utilities or simple apps in certain domains think its a miracle. But when it comes to things like e.g. games it seems like a complete flop.
Where I have found them very useful is for one-off scripts and stuff I need done quick and dirty: work that isn't too complex and is easily verifiable (so I can catch the mistakes it makes, and it does make them!), especially in languages I don't know well or don't like (e.g., bash, PowerShell, JavaScript).
This is what I always look for. I haven't found one salient success story to back up the claims of success.
>” Now, I’ve spent a lot of money and weeks putting the data for this article together, processing tens of terabytes of data in some cases. So I hope you appreciate how utterly uninspiring and flat these charts are across every major sector of software development.”
We might be doing just that now.
The best way to increase your ROI is to fire all your employees. How do we know we're not in the mid-release-cycle of that right now?
I'd guess game levels and assets are becoming ai slop as we speak.
A modern Disney 3D animated film consistently costs 100-200+ million dollars, while movies like Klaus were made for about 40 million. Japan still animates on PAPER.
At the end of the day, new tools have their use cases, but I think especially in creative domains (which software definitely is), old techniques aren't invalidated by the creation of new ones.
ZBrush still crushes all other sculpting apps with some very well-written low-level code and assembly. It doesn't even use the GPU, for crying out loud. If you proposed that as your solution for a graphically intensive 3D app you'd be laughed at, but software-based rasterization/simple ray tracing takes the cake here. It could handle 20 million polygons at buttery-smooth framerates in 2007, and isn't burdened by the VRAM drought we're in.
Don't let people tell you new tools make the old useless.
https://www.apple.com/app-store/
https://play.google.com
https://tiktok.com
https://pinterest.com
https://youtube.com
So who pays for AI for developers? Mostly corpos. And the speed of an individual developer was never the limiting factor in corpos. Average corporate development was always 10 times slower than indie, so even doubling it won't make any impression.
I don't know if I'm faster with AI at a specific task, but I know that I'm doing things I wouldn't touch because I hate the tedium. And I'm doing them while cooking and eating dinner and thinking about wider context and next things to come. So for me it feels worth it.
I think it might be something like cars and safety. Any car safety improvement gets offset by drivers driving faster and more recklessly. So maybe any speed improvement AI brings to a project is nullified by developers doing things they would have just skipped without it.
But my workflow is anything but "let her rip". It's very calculated, orderly, just like mastering any other tool. I'm always in the loop. I can't imagine someone without serious experience getting good stuff, and when things go bad, oh boy you're bringing a firehose of crap into your org.
I have a junior programmer who's a bright kid but lacking a lot of depth. Got him a Cursor subscription, tracking his code closely via PRs and calling out the BS but we're getting serious work done.
I just can't see how this new situation calls for fewer programmers. It will just bring about more software, surely more capable software, after everyone adjusts.
But according to the graphs in the article, after three years of LLM chatbots and coding assistants, we're seeing exactly the same rate of new software...
Was the author making games and other apps in 30 hours? Because that seems like a 4 month project?
* METR was at best a flawed study. Repo-familiarity and tool-unfamiliarity being the biggest points of critique, but far from the only one
* they assume that all code gets shipped as a product. Meanwhile, AI code has (at least in my field of view) led to a proliferation of useful-but-never-shipped one-off tools. Random dashboards to visualize complex queries, scripts to drive refactors, or just sheer joy like "I want to generate an SVG of my vacation trip and consume 15 data sources and give it a certain look".
* Their own self-experiment is not exactly statistically sound :)
That does leave the fact that we aren't seeing AI shovelware. I'm still convinced that's because commercially viable software is beyond the AI complexity horizon, not because AI isn't an extremely useful tool.
They didn't claim it was flawless; they brought it up because it caused them to question their internal narrative about their own productivity.
> * Their own self-experiment is not exactly statistically sound :)
They didn't claim it was.
> * they assume that all code gets shipped as a product.
The author did not assume this. They assumed that if AI is making developers more productive, that should apply to shovelware developers too. That we don't see an increase in shovelware post-AI makes it very unlikely AI brings a productivity increase for more complex software.
I've been a "10xer" for 25 years. I've considered coding agents bullshit since my first encounter with Copilot. I work by having a clear mental map of every piece of my code and knowing exactly how everything works, to the smallest detail, and how it interacts with every other part.
Anyway, all that and a nickel. Yesterday I fired up Claude Code for the first time. I didn't ask it to build me a game or anything at a high level. Nor to evaluate an existing code base. No... I spent about 2 hours guiding it to create a front-end SPA framework that does what my own in-house SPA framework does on the front end, just to see how it would perform at that. I approved every request manually and interrupted every time I spotted a potential issue (which were many). I guided it on what functions to write and how they should affect the overall navigation flow, rendering flow, loading and error-handling.
In other words, I knew what I wanted to write to a T, because it's code I wrote in 2006 and have refactored and ported many times since then... about 370 commits worth to this basic artifact, give or take.
And it pretty much got there.
Would I have been able to prompt it to write a system like that if I hadn't written the system myself over and over again? Probably not. But it did discern the logical setup I was going for (which is not at all similar to what you're thinking if you're coming from React or another framework like that), and it wrote code that is almost identical in its control structures to what I wrote, without me having to do much besides tell it in plain English what should control what, how, when and in what order.
I'm still not convinced it would save me time on something totally new, that I didn't already know the final shape of.
But I do suspect that a reason all this "vibe coding" hasn't led to an explosion of shovelware is that "vibe coding" isn't being done by experienced coders. I suspect that letting changes "flash across the screen" without reading them accounts for most of the difference between a failed prompt and one that achieves the desired result.
Like, I saw it do things like create a factory class that took a string name and found the proper component to load. I said, "refactor that whole thing to a component definition interface with the name in it, make a static object of those and use that to determine what screen to load and all of its parameters." And it did, and it looked almost the same as what I wrote back in the day.
Idk. I would not want my job to become prompting an LLM. I like cracking my knuckles and writing code. But I think the mileage you get may be dictated by whether you are trying to use it as a general-purpose "make this for me" engine, for shovelware, in which case it will fail hard, versus whether you are using it as a stenographer translating a sentence of instructions into a block of control flow.