The current state of LLM-driven development

66 points by Signez · 8/9/2025, 4:17:16 PM · blog.tolki.dev ↗

Comments (33)

tptacek · 2h ago
Learning how to use LLMs in a coding workflow is trivial. There is no learning curve. You can safely ignore them if they don’t fit your workflows at the moment.

I have never heard anybody successfully using LLMs say this before. Most of what I've learned from talking to people about their workflows is counterintuitive and subtle.

It's a really weird way to open up an article concluding that LLMs make one a worse programmer: "I definitely know how to use this tool optimally, and I conclude the tool sucks". Ok then. Also: the piano is a terrible, awful instrument; what a racket it makes.

prerok · 2h ago
I agree with your assessment of this statement. I had to reread it a few times to actually understand it.

He is actually recommending Copilot for price/performance reasons and his closing statement is "Don’t fall for the hype, but also, they are genuinely powerful tools sometimes."

So it just seems like he never really tried engineering better prompts that these more advanced models can use.

edfletcher_t137 · 2h ago
The first two points directly contradict each other, too. Learning a tool should have the outcome that one is productive with it. If getting to "productive" is non-trivial, then learning the tool is non-trivial.
troupo · 1h ago
> I have never heard anybody successfully using LLMs say this before. Most of what I've learned from talking to people about their workflows is counterintuitive and subtle.

Because for all our posturing about being skeptical and data driven we all believe in magic.

Those "counterintuitive non-trivial workflows"? They work about as well as just prompting "implement X" with no rules, agents.md, careful lists etc.

Because 1) literally no one actually measures whether magical incantations work and 2) it's impossible to make such measurements due to non-determinism

roxolotl · 10m ago
On top of this, a lot of the “learning to work with LLMs” is breaking down tasks into small pieces with clear instructions and acceptance criteria. That’s just part of working efficiently, but maybe people don’t want to be bothered to do it.
bgwalter · 1h ago
Pianists' results are well known to be proportional to their talent/effort. In open source hardly anyone is even using LLMs, and the ones that do have barely any output; in many cases less output than they had before using LLMs.

The blogging output on the other hand ...

ebiester · 3h ago
I disagree from almost the first sentence:

> Learning how to use LLMs in a coding workflow is trivial. There is no learning curve. You can safely ignore them if they don’t fit your workflows at the moment.

Learning how to use LLMs in a coding workflow is trivial to start, but you quickly get a bad taste if you don't learn how to adapt both your workflow and the LLM's. It is easy to get a trivially good result and then be disappointed in the follow-up. It is equally easy to start on something it's not good at and conclude it's worthless.

The outright dismissal of Cursor, for example, suggests the author didn't learn how to work with it. Now, it's certainly limited, and some people just prefer Claude Code; I'm not saying that's unfair. However, it requires a process adaptation.

mkozlows · 3h ago
"There's no learning curve" just means this guy didn't get very far up, which is definitely backed up by thinking that Copilot and other tools are all basically the same.
rustybolt · 1h ago
> "There's no learning curve" just means this guy didn't get very far up

Not everyone with a different opinion is dumber than you.

jaynetics · 53m ago
I'm not a native speaker, but to me that quote doesn't necessarily imply an inability of OP to get up the curve. Maybe they just mean that the curve can look flat at the start?
SadErn · 1h ago
This is all just ignorance. We've all worked with LLMs and know that creating an effective workflow is not trivial and it varies based on the tool.
scrollaway · 12m ago
No, it's sometimes just extremely easy to recognize people who have no idea what they're talking about when they make certain claims.

Just like I can recognize a clueless frontend developer when they say "React is basically just a newer jquery". Recognizing clueless engineers when they talk about AI can be pretty easy.

It's a sector that is both old and new: AI has been around forever, but even people who worked in the sector years ago are taken aback by what is suddenly possible, the workflows that are happening... hell, I've even seen cases where it's the very people who have been following GenAI forever that have a bias towards believing it's incapable of what it can do.

For context, I lead an AI R&D lab in Europe (https://ingram.tech/). I've seen some shit.

leptons · 2h ago
Basically, they are the same: they are all LLMs. They all have similar limitations. They all produce "hallucinations". They can also sometimes be useful. And they are all way overhyped.
trenchpilgrim · 1h ago
The amount of misconceptions in this comment are quite profound.

Copilot isn't an LLM, for a start. You _combine_ it with a selection of LLMs. And it absolutely has severe limitations compared to something like Claude Code in how it can interact with the programming environment.

"Hallucinations" are far less of a problem with software that grounds the AI to the truth in your compiler, diagnostics, static analysis, a running copy of your project, running your tests, executing dev tools in your shell, etc.
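For what it's worth, the core of such a grounding loop is tiny: run real checks and capture their output as the model's next context. A sketch (the file and checks here are stand-ins for a real project's toolchain):

```shell
# Sketch of a grounding loop: run real checks, capture the evidence.
dir=$(mktemp -d)
printf 'def add(a, b):\n    return a + b\n' > "$dir/app.py"
{
  # compile check stands in for your compiler/diagnostics
  python3 -m py_compile "$dir/app.py" && echo "compile: ok"
  # a smoke test stands in for your real test suite
  python3 -c "import sys; sys.path.insert(0, '$dir'); import app; assert app.add(2, 2) == 4" && echo "tests: ok"
} > "$dir/feedback.txt" 2>&1
cat "$dir/feedback.txt"   # this text, not vibes, goes back to the agent
```

Tools like Claude Code essentially automate this loop: propose a change, run the checks, read the failures, try again.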

leptons · 49m ago
>Copilot isn't an LLM, for a start

You're being overly pedantic here and moving goalposts. Copilot (for coding) without an LLM is pretty useless.

I stand by my assertion that these tools are all basically the same fundamental tech - LLMs.

spenrose · 1h ago
So many articles should prepend “My experience with ...” to their title. Here is OP's first sentence: “I spent the past ~4 weeks trying out all the new and fancy AI tools for software development.” Dude, you have had some experiences and they are worth writing up and sharing. But your experiences are not a stand-in for "the current state." This point applies to a significant fraction of HN articles, to the point that I wish the headlines were flagged “blog”.
mettamage · 55m ago
Clickbait gets more reach. It's an unfortunate thing. I remember Veritasium in a video even saying something along the lines of him feeling forced to do clickbaity YouTube because it works so well.

The reach is big enough to not care about our feelings. I wish it wasn't this way.

kodisha · 1h ago
LLM-driven coding can yield awesome results, but you will be typing a lot and, as the article states, it requires an already well-structured codebase.

I recently started a fresh project, and until I got to the desired structure I only used AI to ask questions or get suggestions. I organized and wrote most of the code.

Once it started to get into the shape that felt semi-permanent to me, I started a lot of queries like:

```
- Look at existing service X at folder services/x
- see how I deploy the service using k8s/services/x
- see how the docker file for service X looks like at services/x/Dockerfile
- now, I started service Y that does [this and that]
- create all that is needed for service Y to be skaffolded and deployed, follow the same pattern as service X
```

And it would go, read existing stuff for X, then generate all of the deployment/monitoring/readme/docker/k8s/helm/skaffold for Y

With zero to no mistakes. Both Claude and Gemini are more than capable of such a task. I had both of them generate 10-15 files with no errors, with code that could be deployed right after (of course the service will just answer and not do much more than that)

Then, I will take over again for a bit, do some business logic specific to Y, then again leverage AI to fill in missing bits, review, suggest stuff etc.

It might look slow, but it actually cuts out the most boring and most error-prone steps in developing a medium-to-large k8s-backed project.
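To make the pattern concrete: the files the model ends up producing for Y are near-copies of X's with the names swapped. A hypothetical sketch of one of them (every name here is invented, not from an actual project):

```yaml
# k8s/services/y/deployment.yaml -- hypothetical output, mirroring service X
apiVersion: apps/v1
kind: Deployment
metadata:
  name: service-y
spec:
  replicas: 1
  selector:
    matchLabels: { app: service-y }
  template:
    metadata:
      labels: { app: service-y }
    spec:
      containers:
        - name: service-y
          image: registry.example.com/service-y:latest
          ports:
            - containerPort: 8080
```

This is exactly the kind of mechanical pattern-following that models are reliable at, since the "spec" is an existing working example.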

simonw · 2h ago
> Learning how to use LLMs in a coding workflow is trivial. There is no learning curve. [...]
>
> LLMs will always suck at writing code that has not been written millions of times before. As soon as you venture slightly offroad, they falter.

That right there is your learning curve! Getting LLMs to write code that's not heavily represented in their training data takes experience and skill and isn't obvious to learn.


randfish · 3h ago
Deeply curious to know if this is an outlier opinion, a mainstream but pessimistic one, or the general consensus. My LinkedIn feed and personal network certainly suggest that it's an outlier, but I wonder if the people around me are overly optimistic or out of sync with what the HN community is experiencing more broadly.
MobiusHorizons · 3h ago
My impression has been that in corporate settings (and I would include LinkedIn in that) AI optimism is basically used as virtue signaling, making it very hard to distinguish people who are actually excited about the tech from people wanting to be accepted.

My personal experience has been that AI has trouble keeping the scope of the change small and targeted. I have only been using Gemini 2.5 Pro though, as we don't have access to other models at my work. My friend tells me he uses Claude for coding and Gemini for documentation.

WD-42 · 3h ago
I think it’s pretty common among people whose job it is to provide working, production software.

If you go by MBA types on LinkedIn that aren’t really developers or haven’t been in a long time, now they can vibe out some react components or a python script so it’s a revolution.

danielbln · 2h ago
Hi, my job is building working production software (these days heavily LLM assisted). The author of the article doesn't know what they're talking about.


Terretta · 3h ago
Which part of the opinion?

I tend to strongly agree with the "unpopular opinion" about the IDEs mentioned versus CLI (specifically, aider.chat and Claude Code).

Assuming (this is key) you have mastery of the language and framework you're using, working with the CLI tool in 25 year old XP practices is an incredible accelerant.

Caveats:

- You absolutely must bring taste and critical thinking, as the LLM has neither.

- You absolutely must bring systems thinking, as it cannot keep deep weirdness "in mind". By this I mean the second and third order things that "gotcha" about how things ought to work but don't.

- Finally, you should package up everything new about your language or frameworks since a few months or year before the knowledge cutoff date, and include a condensed synthesis in your context (e.g., Swift 6 and 6.1 versus the 5.10 and 2024's WWDC announcements that are all GPT-5 knows).

For this last one I find it useful to (a) use OpenAI's "Deep Research" to first whitepaper the gaps, then another pass to turn that into a Markdown context prompt, and finally bring that over to your LLM tooling to include as needed when doing a spec or in architect mode. Similarly, (b) use repomap tools on dependencies if creating new code that leverages those dependencies, and have that in context for that work.

I'm confused why these two obvious steps aren't built into leading agentic tools, but maybe handling the LLM as a naive and outdated "Rain Man" type doesn't figure into mental models at most KoolAid-drinking "AI" startups, or maybe vibecoders don't care, so it's just not a priority.

Either way, context based development beats Leroy Jenkins.

procaryote · 1h ago
Linkedin posts seems like an awful source. The people I see posting for themselves there are either pre-successful or just very fond of personal branding
dezmou · 2h ago
OP did miss the VS Code extension for Claude Code. It is still terminal based, but:

- it shows you the diff of the incoming changes in VS Code (like git)
- it knows the line you selected in the editor for context
sudhirb · 2h ago
I have a biased opinion since I work for a background agent startup currently - but there are more (and better!) out there than Jules and Copilot that might address some of the author's issues.
troupo · 1h ago
And those mythical better tools that you didn't even bother to mention are?
philipwhiuk · 1h ago
There’s an IntelliJ extension for GitHub CoPilot.

It’s not perfect but it’s okay.

SadErn · 54m ago
It's all about the Kilo Code extension.
dash2 · 1h ago
They missed OpenAI Codex, maybe deliberately? It's less LLM-development and more vibe-coding, or maybe "being a PHB of robots". I'm enjoying it for my side project this week.
yogthos · 37m ago
Personally, I’ve had a pretty positive experience with the coding assistants, but I had to spend some time to develop intuition for the types of tasks they’re likely to do well. I would not say that this was trivial to do.

Like if you need to crap out a UI based on a JSON payload, make a service call, add a server endpoint, LLMs will typically do this correctly in one shot. These are common operations that are easily extrapolated from their training data. Where they tend to fail are tasks like business logic which have specific requirements that aren’t easily generalized.

I’ve also found that writing the scaffolding for the code yourself really helps focus the agent. I’ll typically add stubs for the functions I want, and create overall code structure, then have the agent fill the blanks. I’ve found this is a really effective approach for preventing the agent from going off into the weeds.
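A minimal sketch of that stub-first approach (all names here are invented for illustration): the human writes the signatures, types, and docstrings; the agent only fills in the marked body.

```python
from dataclasses import dataclass


@dataclass
class Order:
    subtotal_cents: int
    country: str


def tax_rate(country: str) -> float:
    """Return the tax rate for a country code (human-written, auditable)."""
    return {"US": 0.07, "DE": 0.19}.get(country, 0.0)


def order_total(order: Order) -> int:
    """Total in cents including tax.

    TODO(agent): implement using tax_rate(); round to the nearest cent.
    """
    raise NotImplementedError
```

The structure constrains the agent to one well-defined blank per function, so its output is easy to review and it can't redesign the surrounding code.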

I also find that if it doesn’t get things right on the first shot, the chances are it’s not going to fix the underlying problems. It tends to just add kludges on top to address the problems you tell it about. If it didn’t get it mostly right at the start, then it’s better to just do it yourself.

All that said, I find enjoyment is an important aspect as well and shouldn’t be dismissed. If you’re less productive, but you enjoy the process more, then I see that as a net positive. If all LLMs accomplish is to make development more fun, that’s a good thing.

I also find that there's use for both terminal based tools and IDEs. The terminal REPL is great for initially sketching things out, but IDE based tooling makes it much easier to apply selective changes exactly where you want.

As a side note, got curious and asked GLM-4.5 to make a token field widget with React, and it did it in one shot.

It's also strange not to mention DeepSeek and GLM as options given that they cost orders of magnitude less per token than Claude or Gemini.

weeksie · 1h ago
Yet another developer who is too full of themselves to admit that they have no idea how to use LLMs for development. There's an arrogance that can set in when you get to be more senior and unless you're capable of force feeding yourself a bit of humility you'll end up missing big, important changes in your field.

It becomes farcical when not only are you missing the big thing but you're also proud of your ignorance and this guy is both.