LLMs are very useful tools for software development, but focusing on employment does not really dig into whether they will automate or augment labor (to use their words). Behaviors are changing not just because of outcomes but because of hype, expectations, and B2B sales. You'd expect the initial corporate behaviors to look much the same whether or not LLMs turn into fully fire-and-forget employee-replacement tools.
Some nits I'd pick along those lines:
> For instance, according to the most recent AI Index Report, AI systems could solve just 4.4% of coding problems on SWE-Bench, a widely used benchmark for software engineering, in 2023, but performance increased to 71.7% in 2024 (Maslej et al., 2025).
Something like this should have the context of SWE-Bench not existing before November, 2023.
Pre-2023 systems were flying blind with regard to what they were going to be tested with. Post-2023 systems have been created in a world where this test exists. Hard to generalize from before/after performance.
> The patterns we observe in the data appear most acutely starting in late 2022, around the time of rapid proliferation of generative AI tools.
This is quite early for "replacement" of software development jobs since, by their own prior statement/citation, the tools even a year later, when SWE-Bench was introduced, were only hitting that 4.4% task success rate.
Its timing lines up more neatly with the post-COVID-bubble tech industry slowdown. Or with the start of hype about AI productivity vs actual replaced employee productivity.
NitpickLawyer · 44m ago
> Hard to generalize from before/after performance.
While this is true, there are ways to test (open models) on tasks created after the model was released. We see good numbers there as well, so something is generalising there.
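A minimal sketch of what such a test could look like, assuming each task carries a creation date and a pass/fail result for the model under test. The `Task` type, the field names, and the sample data are hypothetical, and the release date is only a loose proxy for what the model saw in training:

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Task:
    task_id: str
    created: date  # when the task was first published (assumed to be known)
    solved: bool   # whether the model under test solved it


def post_release_pass_rate(tasks, model_release):
    """Pass rate over tasks created strictly after the model's release date."""
    unseen = [t for t in tasks if t.created > model_release]
    if not unseen:
        return None  # every task predates the model, so nothing can be said
    return sum(t.solved for t in unseen) / len(unseen)


# Hypothetical data, for illustration only.
tasks = [
    Task("a", date(2023, 6, 1), True),
    Task("b", date(2024, 3, 1), True),
    Task("c", date(2024, 5, 1), False),
]
rate = post_release_pass_rate(tasks, model_release=date(2024, 1, 1))
```

Only tasks "b" and "c" count here, since "a" existed before the assumed release date and could have leaked into training data.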
hochstenbach · 1h ago
If such studies indeed indicate that AI has an effect on early-career workers in AI-exposed occupations, one would expect this to be a global effect. I wonder if there are good comparable non-US studies available.
moi2388 · 26m ago
As a non-US citizen, in my EU country we’re still starving for new programmers.
trhway · 25m ago
Poland? Some time ago I looked up salaries in Warsaw. They were something like $10-$20K/month, which as I understand it is pretty high by EU standards.
yurishimo · 14m ago
Really? That's crazy. I'm earning a bit over 5k in the Netherlands. Granted, not Amsterdam, but still.
eru · 1h ago
Yes, even if the underlying AI stops advancing today, it will take a while for the economy to digest and adjust to the new systems. Eg a lot of the improvements in usefulness in the last few quarters came from better tooling, not necessarily better models.
But with progress continuing in the models, too, it's an even more complicated affair.
trhway · 27m ago
Offshoring was similar, i.e. companies discovered that expensive labor here can be performed inexpensively there while senior laborers/PMs here perform the overseeing role, and we can look at how long it took to digest it and adjust to it. While 15-20 years ago it was all the rage, today it is just an established, well-understood practice that is used efficiently where applicable.
ath3nd · 1h ago
> LLMs are very useful tools for software development
That's an opinion many disagree with. As a matter of fact, the only study to date, limited as it is, showed that LLM usage decreases productivity for experienced developers by roughly 19%. Let's reserve opinions and link studies.
My anecdotal experience, for example, is that LLMs are such a negative drain on both time and quality that one has to be really early in their career to benefit from their usage.
CuriouslyC · 1m ago
People who suck at typing are better off writing by hand as well. I don't need to argue, I'll let history pick a winner.
9rx · 8m ago
> decrease productivity for experienced developers by roughly 19%.
Seems about right when trying to tell an LLM what to code. But flipping the script, letting the LLM tell you what to code, productivity gains seem much greater. Like most programmers will tell you: Writing code isn't the part of software development that is the bottleneck.
manmademagic · 51m ago
I wouldn't call myself an 'experienced' developer, but I do find LLMs useful for once-off things, where I can't justify the effort to research and implement my own solution. Two recent examples come to mind:
1. Converting exported data into a suitable import format based on a known schema
2. Creating syntax highlighting rules for a language not natively supported in a Typst report
Neither situation had an existing solution, and while the outputs were not exactly correct, they only needed minor adjustments.
Any other situation, I'd generally prefer to learn how to do the thing, since understanding how to do something can sometimes be as important as the result.
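For the first kind of task, the one-off script usually ends up looking something like the sketch below. The column names and the `SCHEMA` mapping are hypothetical stand-ins, not the actual export or schema from the case above:

```python
import csv
import io

# Hypothetical schema: target column -> (source column, converter).
SCHEMA = {
    "id": ("ID", int),
    "full_name": ("Name", str.strip),
    "amount_cents": ("Amount", lambda s: round(float(s) * 100)),
}


def convert(exported_csv: str) -> str:
    """Rewrite an exported CSV into the import format described by SCHEMA."""
    reader = csv.DictReader(io.StringIO(exported_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(SCHEMA), lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(
            {target: conv(row[source]) for target, (source, conv) in SCHEMA.items()}
        )
    return out.getvalue()


# Made-up export, for illustration only.
exported = "ID,Name,Amount\n1, Ada ,12.50\n2,Grace,3\n"
converted = convert(exported)
```

Keeping the mapping in one table is what makes the "minor adjustments" cheap: when the output is slightly off, the fix is usually a one-line change to a converter.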
yakshaving_jgt · 55m ago
I’m 15 years into my career and I write Haskell every day. I’m getting a massive productivity boost from using an LLM.
black_knight · 25m ago
How do you find the quality of the Haskell code produced by the LLM? Also, how do you use the LLM when coding Haskell? Generating single functions or more?
yakshaving_jgt · 11m ago
I'm stuck in my ways with vim/tmux/ghci etc, so I'm not using some AI IDE. I write stuff into ChatGPT and use the output, copying manually, or writing it myself with inspiration from what I get. I feed it a fair bit of context (like, say, a production module with a load of database queries, and the associated spec module) so that it copies the structure and patterns that I've established.
The quality of the Haskell code is about as good as I would have written myself, though I think it falls for primitive obsession more than I would. Still, I can add those abstractions myself after the fact.
Maybe one of the reasons I'm getting good results is because the LLM effectively has to argue with GHC, and GHC always wins here.
I've found that it's a superpower also for finding logic bugs that I've missed, and for writing SQL queries (which I was never that good at).
wahnfrieden · 44m ago
That's a skill issue. That lone study was observing untrained participants.
It's no surprise to me that devs who are accustomed to working on one thing at a time due to fast feedback loops have not learned to adapt to parallelizing their work (something that has been demonized at agile-style organizations) and instead sit and wait on agents and start watching YouTube, as the study found (productivity hits were due to the participants looking at fun non-work stuff instead of attempting to parallelize any work).
The study reflects usage of emergent tools without training, and with regressive training on previous generation sequential processes, so I would expect these results. If there is any merit in coordinating multiple agents on slower feedback work, this study would not find it.
ardit33 · 37m ago
LLMs help a lot in doing 'well defined' tasks, and things that you already know you want, and they just accelerate the development of it. You still have to re-write some of it, but they do the boring stuff fast.
They are not great if your tasks are not well defined. Sometimes they surprise you with great solutions, sometimes they produce a mess that just wastes your time and deviates from your mission.
To me, LLMs have been great accelerants when you know what you want and can define it well. Otherwise, they can waste your time by creating a lot of code slop that you will have to re-write anyway.
One huge positive side effect is that when you create a component (e.g. UI, feature, etc.), you often need a setup to test it: view controllers, data, and so on, which is very boring, annoying, and time-wasting to deal with. An LLM can do that for you within seconds (even creating mock data), and since this is mostly test code, it doesn't matter if the code quality is not great; it just matters to get something on the screen to test the real functionality.
AI/LLMs have been huge time savers for this part.
whatever1 · 1h ago
To me it seems that LLMs are a tool that only increases productivity for a given headcount in dimensions that were neglected in the past.
For example, everyone now writes emails with perfect grammar in a fraction of the time. So now the expectation for emails is that they will have perfect grammar.
Or one can build an interactive dashboard to visualize their spreadsheet and make it pleasing. Again the expectation just changed. The bar is higher.
So far I have not seen productivity increase in dimensions with a direct line of sight to revenue. (Of course there is the niche of customer service, translation services, etc. that were already in the process of being automated.)
manmademagic · 1h ago
It's an interesting dilemma, since if I know that an email was written mostly with AI, it feels to me like the author didn't put effort in, and thus I won't put much effort into reading the email.
I had a conversation with my manager about the implications of everyone using AI to write/summarise everything. The end result will most likely be staff getting Copilot to generate a report, then their manager using Copilot to summarise the report and generate a new report for their manager, ad infinitum.
Eventually all context is lost, busywork is amplified, and nobody gains anything.
chii · 5m ago
> Eventually all context is lost, busywork is amplified
why not fire everyone in between the top-most manager and the actual "worker" doing the work, as the report could be generated with the correct level of summary?
Wololooo · 1h ago
I'm sad to see this for several reasons, because I do not expect or want everyone to use an LLM to converse with me via mail. The whole point is to exchange information; with everyone using an LLM as output and input, the whole thing becomes a game of telephone.
You do not need to build a spreadsheet visualiser tool; there are plenty of options that exist and are free and open source.
I'm not against advances, I'm just really failing to see what problem was in need of solving here.
The only use I can get behind is the translation, which admittedly works relatively well with LLMs in general due to the nature of the work.
sschueller · 56m ago
I don't have time to read paragraphs of AI slop emails. Please keep them short and to the point. No need to send it through an LLM.
dumbfoundded · 54m ago
Corporations will require everything going through an LLM to meet company standards.
ggm · 53m ago
Any Board which supports management hollowing out future profits by either firing, or not hiring junior staff deserves to have their bonus rescinded.
Think like a forestry investor, not like someone planting a cash crop for next season.
joshdavham · 6m ago
It's going to be extremely interesting to see what the field of software dev will look like in a few years given how few juniors are getting hired recently.
monster_truck · 1h ago
I've got a few buddies over at Microsoft, they've all said something along the lines of "I really hate using copilot. They at least let us use pre-approved models in VSCode, we get most that come out. But all AI metrics are tracked and there are layoffs every quarter. I have kids now man. Strange times. I know you would have quit months ago" and they're right.
Now that bs work has next to no cost, I see a lot more bs work being done, and often on pointless bureaucratic activities involving generating questionnaires and answering them. It's as if the activities add up to a big net zero.
https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...
https://www.fool.com/investing/2024/11/29/this-magnificent-s...