This article was a bit confusing for me. It starts off by describing what "doing it wrong" looks like (okay). It then goes on to talk about Agents. Perhaps it's just that my human brain needs a firmware update, but I was expecting the "what doing it wrong looks like" section to be followed by a "what doing it right looks like" section. Instead, the next paragraph just begins with "Agents".
Sure, one could surmise that perhaps "doing it right" means "using Agents", but that's not even how the article reads:
> "To make AI development work for you, you’ll need to provide your AI assistant with two things: the proper context and specific instructions (prompts) on how to behave under certain circumstances."
This, to me, doesn't necessitate the use of agents, so jumping straight into a section on Agents seems to skip over a potentially implied logical connection between the problem in the "doing it wrong" section and how it's solved in the "Agents" section.
Copying code snippets into web UIs and testing manually is slow and clunky, but Agents are essentially just automations around these same core actions. I feel this article could've made a stronger point by getting at the core of what it means to do it wrong.
• Is "doing it wrong" indicated by the time wasted by not using an agentic mechanism vs manual manipulation?
• Is "doing it wrong" indicated by manually switching between tools instead of using MCP to automate tool delegation?
Having written several non-trivial agents myself using Gemini and OpenAI's APIs, the main difference between handing off a task to an agent and manually copy/pasting into chat UIs is efficiency — I usually do a task manually in chat UIs first, but once I have a pattern established, or have identified a set of tools to validate responses, I can then "agentify" it if it's something I need to do repeatedly.
But the quality of both approaches still depends on the same core principles: adequate context (no more and no less than what keeps the LLM's attention on the task at hand) and adequate instructions for the task (often with a handful of examples). In this regard, I agree with the author: correct context + instructions are the key ingredients to a useful response. The agentic element is an efficiency layer on top of those key ingredients, which frees the dev from manual orchestration and potentially avoids human error (while potentially introducing LLM error).
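To make the "agentify" step concrete, here's a rough sketch of the loop I have in mind, using the OpenAI Python client (the Gemini version looks much the same). The model name, the validate_sql tool, and the five-round cap are placeholders I made up for illustration; the point is just that the context and instructions going in are the same ones you'd paste into a chat UI, with a validation tool layered on top.

    # Rough sketch of "agentifying" a chat-UI workflow: same context and
    # instructions, plus a tool the model can call to validate its own output.
    # The validate_sql helper and the model name are illustrative placeholders.
    import json
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def validate_sql(query: str) -> str:
        """Hypothetical validator; in practice you might run EXPLAIN on a dev DB."""
        return "ok" if query.strip().lower().startswith("select") else "error: not a SELECT"

    TOOLS = [{
        "type": "function",
        "function": {
            "name": "validate_sql",
            "description": "Check whether a SQL query is well formed before returning it.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    }]

    def run_agent(task: str, context: str) -> str:
        # Context + instructions up front, exactly as you would in a chat UI.
        messages = [
            {"role": "system", "content": "You write SQL. Validate every query before answering."},
            {"role": "user", "content": f"Context:\n{context}\n\nTask: {task}"},
        ]
        for _ in range(5):  # cap the loop so a confused model can't spin forever
            resp = client.chat.completions.create(model="gpt-4o", messages=messages, tools=TOOLS)
            msg = resp.choices[0].message
            if not msg.tool_calls:
                return msg.content  # model is done; hand the answer back
            messages.append(msg)
            for call in msg.tool_calls:  # run each requested tool, feed the result back
                args = json.loads(call.function.arguments)
                result = validate_sql(**args)
                messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
        return "gave up after 5 rounds"

The same pattern holds for any "check before you trust it" tool: swap validate_sql for a test runner or a linter and the loop doesn't change.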
Am I missing something here?
mattkrick · 37m ago
I want to believe, and I promise I'm not trying to be a Luddite here. Has anyone with decent (5+ years) experience built a non-trivial new feature in a production codebase quicker by letting AI write it?
Agents are great at familiarizing me with a new codebase. They're great at debugging because even when they're wrong, they get me thinking about the problem differently so I ultimately get the right solution quicker. I love using it like a super-powered search tool and writing single functions or SQL queries about the size of a unit test. However, reviewing a junior's code ALWAYS takes more time than writing it myself, and I feel like AI quality is typically at the junior level. When it comes to authorship, either I'm prompting it wrong, or the emperor just isn't wearing clothes. How can I become a believer?
rco8786 · 18m ago
“Kinda.” I run Claude Code on a parallel copy of our monorepo, while I use my primary copy.
I typically only give Claude the boring stuff: refactors, tech debt cleanup, etc. But occasionally I will give it a real feature if the urgency is low and the feature is extremely well defined.
That said, I still spend a considerable amount of time reviewing and massaging Claude’s code before it gets to PR. I haven’t timed myself or anything, but I suspect that when the task is suitable for an LLM, it’s maybe 20-40% faster. But when it’s not, it’s considerably slower and sometimes just fails completely.
9rx · 33m ago
> Has anyone with decent (5+ years) experience built a non-trivial new feature in a production codebase quicker by letting AI write it?
I would say yes. I have been blown away a couple of times. But I find it is like playing a slot machine. Occasionally you win — most of the time you lose. As long as my employer is willing to continue to cover the bet, I may as well pull the handle. I think it would be pretty hard to convince myself to pay for it myself, though.
ramesh31 · 17m ago
>Has anyone with decent (5+ years) experience built a non-trivial new feature in a production codebase quicker by letting AI write it?
Yes. Claude Code has turned quarter-long initiatives into a few afternoons of prompting for me, in the context of multiple massive legacy enterprise codebases. It all comes down to reaching that "Jesus take the wheel" level of trust in it. You have to be OK with letting it go off and potentially waste hundreds of dollars in tokens giving you nonsense, which it sometimes will. But when it doesn't, it's like magic, and that makes the times it does worth the cost. Obviously you'll still review every line before merging, but that takes an order of magnitude less time than wrestling with it in the first place. It has fundamentally changed what my team and I are able to accomplish.
glhaynes · 6m ago
>Obviously you'll still review every line before merging, but that takes an order of magnitude less time than wrestling with it in the first place.
Just speculating here, but I wouldn't be surprised if the truth of both parts of that sentence varies quite a bit among users of AI coding tools and their various applications, and, if so, if that explains a lot of the discrepancy among reports of success/enthusiasm levels.
ath3nd · 19m ago
> and I feel like AI quality is typically at the junior level. When it comes to authorship, either I'm prompting it wrong, or the emperor just isn't wearing clothes. How can I become a believer?
The emperor is stark naked, but the hype is making people see clothes where there is only a hairy, shriveled old man.
Sure, I can produce "working" code with Claude, but I have never been able to produce good working code. Yes, it can write an okay-ish unit test (almost 100% identical to how I'd have written it), and on a well-structured codebase (not built with Claude) and with some preparation, it can kind of produce a feature. However, on more interesting problems it's just slop, and you gotta keep trying and prodding until it produces something remotely reasonable.
It's addictive to watch it conjure up trash while you constantly try to steer it in the right direction, but I have never, ever, ever been able to achieve the code quality level that I'm comfortable with. Fast prototype? Sure. Code that can pass my code review? Nah.
What is also funny is how non-deterministic the quality of the output is. Sometimes it really does feel like you're about to fly off with it, and then, bam, garbage. It feels like roulette, and you gotta keep spinning the wheel to get your dopamine hit/reward.
All while wasting money and time, and it still ends up far, far worse than if you had just done it yourself in the first place. Hard pass.
ActionHank · 39m ago
"If you are only using your hammer to hammer nails, you're doing it wrong" then goes on to explain how you should use agents.
I would've thought that, following the initial argument and the progression to the latest trend, we would've ended up at "use agents, write specs, and use these several currently popular MCPs".
I guess the point of my rant is that no one knows what the "correct" way to use them is yet. A hammer has many uses.