Just today I had my first real success with Claude (and with coding agents generally). I’ve played with Cursor in the past but am now trying Claude and others.
As mentioned in the article, the big trick is having clear specs. In my case I sat down for 2 hours and wrote a 12 step document on how I would implement this (along with background information). Claude went through step by step and wrote the code. I imagine this saved me probably 6-10 hours. I’m now reviewing and am going to test etc. and start adjusting and adding future functionality.
Its success was rooted in the fact I knew exactly how to do what it needed to do. I wrote out all the steps and it just followed my lead.
It makes it clear to me that mid and senior developers aren’t going anywhere.
That said, it was amazing to just see it go through the requirements and implement modules full of organised documented code that I didn’t have to write.
mft_ · 5h ago
Can you (or anyone) share an example of such a specification document? As an amateur programmer experimenting with CC, it would be very helpful to understand the nature and depth of the information that is helpful.
jamesponddotco · 5h ago
I have multiple system prompts that I use before getting to the actual specification.
1. I use the Socratic Coder[1] system prompt to have a back and forth conversation about the idea, which helps me hone the idea and improve it. This conversation forces me to think about several aspects of the idea and how to implement it.
2. I use the Brainstorm Specification[2] user prompt to turn that conversation into a specification.
3. I use the Brainstorm Critique[3] user prompt to critique that specification and find flaws in it which I might have missed.
4. I use a modified version of the Brainstorm Specification user prompt to refine the specification based on the critique and have a final version of the document, which I can either use on my own or feed to something like Claude Code for context.
Doing those things improved the quality of the code and work spit out by the LLMs I use by a significant amount, but more importantly, it helped me write much better code on my own, because I now have something to guide me, whereas before I used to go in blind.
As a bonus, it also helped me decide if an idea was worth it or not; there are times I'm talking with the LLM and it asks me questions I don't feel like answering, which tells me I'm probably not into that idea as much as I initially thought; it was just my ADHD hyperfocusing on something.
[1]: https://github.com/jamesponddotco/llm-prompts/blob/trunk/dat...
[2]: https://github.com/jamesponddotco/llm-prompts/blob/trunk/dat...
[3]: https://github.com/jamesponddotco/llm-prompts/blob/trunk/dat...
> I use the Socratic Coder[1] system prompt to have a back and forth conversation about the idea. (prompt starts with: 1. Ask only one question at a time)
Why only 1? IMHO it's better to write a long prompt explaining yourself as much as possible (it exercises your brain and you figure things out), and to request as many clarifying questions, reviews, and suggestions as possible, all at once. This is better because:
1. It makes you think deeper and practice writing clearly.
2. Even though each interaction is quite slower, since you are more active and engaged it feels shorter (try it), and you minimize interactions significantly.
3. It's less wasteful than going back and forth.
4. You converge in much shorter time as your misconceptions, misunderstandings, problems expressing yourself, or confusion on the part of the LLM are all addressed very early.
5. I find it annoying to wait for the replies.
I guess if you use a fast response conversational system like ChatGPT app it would make more sense. But I don't think that way you can have deep conversations unless you have a stellar working memory. I don't, so it's better for me to write and read, and re-write, and re-read...
jamesponddotco · 6m ago
I do one question at a time so I don't feel overwhelmed and can answer questions with more details.
I start with an idea between <idea> tags, write as much as I possibly can between these tags, and then go one question at a time, answering each question with as much detail as I possibly can.
Sometimes I'll feed the idea to yet another prompt, Computer Science PhD[1], and use the response as the basis for my conversation with the socratic coder, as the new basis might fill in gaps that I forgot to include initially.
[1]: https://github.com/jamesponddotco/llm-prompts/blob/trunk/dat...
[2]: Something like "Based on my idea, can you provide your thoughts on how the service should be built, please? Technologies to use, database schema, user roles, permissions, architectural ideas, and so on."
ctoth · 2h ago
You may want to turn these good prompts into slash commands! :)
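For anyone who hasn't set one up: as I understand it, Claude Code picks up custom slash commands from Markdown files in .claude/commands/ (project) or ~/.claude/commands/ (personal), with $ARGUMENTS standing in for whatever you type after the command. A rough sketch, with the command name and wording invented for illustration (saved as .claude/commands/critique-spec.md):
    Act as a critical reviewer of the specification in $ARGUMENTS.
    List ambiguities, missing edge cases, and unstated assumptions as
    numbered questions. Do not propose solutions yet.
Inside a session that becomes /critique-spec docs/spec.md.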
jamesponddotco · 1h ago
They are subagents and slash commands, depending on the project. Eventually, I need to come up with a “dotclaude” repository with these and a few others I use.
Edit: Sorry, I had a brain fart for a second, thought you were talking about other prompts. I prefer to keep those as chats with the API, not Claude Code, but yeah, they might work as slash commands too.
time0ut · 5h ago
Thank you for sharing these prompts. These are excellent.
indigodaddy · 3h ago
Wish we could star files in addition to repos
addandsubtract · 1h ago
You mean like adding a bookmark, or downloading the files? Yeah, wish that was possible on the web.
indigodaddy · 1h ago
Well I use GitHub stars kind of like coding/cool project/idea/whatever bookmarks, so yeah would be neat to be able to star just any file within a repo in addition to the repo itself
taude · 3h ago
Search for Claude Code's plan mode. You can use Claude to help you write specs. There are many YouTube videos as well. I think spec docs are pretty personal and project-specific....
bongodongobob · 5h ago
I do a multistep process
Step 1: back and forth chat about the functionality we want. What do we want it to do? What are the inputs and outputs? Then generate a spec/requirements sheet.
Step 2: identify what language, technologies, frameworks to use to accomplish the goal. Generate a technical spec.
Step 3: architecture. Get a layout of the different files that need to be created and a general outline of what each will do.
Step 4: combine your docs and tell it to write the code.
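If those docs end up as Markdown files, step 4 can be a one-liner; a minimal sketch, assuming Claude Code's -p (non-interactive print) mode and invented filenames:
    cat requirements.md tech-spec.md architecture.md | \
      claude -p "Implement the plan in these documents, one phase at a time, and stop after each phase for review."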
camel_gopher · 5h ago
Many mid and senior developers cannot write specs. I agree with the intent of your statement.
miroljub · 5h ago
> That said, it was amazing to just see it go through the requirements and implement modules full of organised documented code that I didn’t have to write
Small side remark, but what is the value added of AI-generated documentation for AI-generated code? It's just a burden that increases context size whenever the AI needs to re-analyse or change the existing code. It's not like any human is ever going to read the code docs when they can just ask the AI what it's about.
lurking_swe · 3h ago
This is sort of like asking “why do pilots still perform manual takeoffs and landing even though full autopilot is possible?” It’s because autopilot is intended to help pilots, not replace them. Too much could go wrong in the real world. Having some skills that you practice daily is crucial to remaining a good pilot. Similarly, it’s probably good to write some code daily to keep skills sharp.
1) when your cloud LLM has an outage, your manager probably still expects you to be able to do your work for the most part. Not to go home because openai is down lol. You being productive as an engineer should not depend on the cloud working.
2) You may want to manually write code for certain parts of the project. Important functions, classes, modules, etc. Having good auto-generated docs is still useful when using a traditional IDE like IntelliJ, WebStorm, etc.
3) Code review. I’m assuming your team does code review as part of your SDLC??? Documentation can be helpful when reviewing code.
fragmede · 2h ago
> You being productive as an engineer should not depend on the cloud working.
lol where do you work? This obviously isn't true for the entire industry. If GitHub or AWS or your WiFi/ISP is down, productivity is greatly reduced. Many SaaS companies don't have local dev, so they rely on the cloud broadly being up. "Should" hasn't been the reality in the industry for years.
xcf_seetan · 1h ago
Well, the only thing I need to write code is to be alive. No GitHub or AWS? No problem, I have local copies of everything. No Claude? OK, I have a local LLM to give some help. So the internet is not really needed to write code. No IDE, just a CLI? Sure, all I need is a text editor and a compiler/linker working. No computer or electricity? Get a pen and paper and start writing code on paper; I'll get to the computer when possible. I do not depend on the cloud working to be productive.
lurking_swe · 2h ago
My company deploys everything in the cloud, but we can still do some meaningful work locally for a few hours if needed in a pinch. Our GitLab is self-hosted because that's extra critical.
I continue writing code and unit tests while i wait for the cloud to work again. If the outage is a long time, I may even spin up “DynamoDB Local” via docker for some of our simpler services that only interact with DynamoDB. Our apache flink services that read from kafka are a lost cause obviously lol.
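For reference, standing up DynamoDB Local really is a one-liner with the official image (port 8000 is its default):
    docker run --rm -p 8000:8000 amazon/dynamodb-local
Point the AWS SDK's endpoint at http://localhost:8000 and the simpler services keep working offline.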
It’s also a good opportunity to tackle any minor refactoring that you’ve been hoping to do. Also possible without the cloud.
You can also work on _designing_ new features (whiteboarding, creating a design document, etc). Often when doing so you need to look at the source to see how the current implementation works. That’s much easier with code comments.
SatvikBeri · 3h ago
Examples. Examples are helpful for both humans and LLMs, especially if you have a custom framework or are using an unusual language. And I find I can generate ~10 good examples with LLMs in the time it would take me to generate ~3 good examples manually.
aosaigh · 5h ago
I’m not sure I agree that I’ll never look at the code. I think it’s still important to know how the code is working for your own mental model of the app. So in this case I’ll be testing and reviewing everything to see how it’s implemented. With that in mind it’s useful for me as well as serving as context for the AI. That said, you may be right.
cpursley · 3h ago
Claude's proclivity for writing detailed inline comments and very nearly perfect commit messages is one of the best things about it.
r_murphey · 3h ago
Often someone will have to maintain the code. Whether the maintainer is a human or an AI, an explanation of the intent of the code could be helpful.
weego · 4h ago
written once, looked at 100 times.
I try to prompt-enforce no line-by-line documentation, but encourage function/class/module-level documentation that will help future developers and AI coding agents. Humans are generally better at this, but the AI sometimes needs help to stop it from misunderstanding a piece of code's context and just writing its own new function that does the same thing.
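For what it's worth, a sketch of how that might read as a CLAUDE.md section (my wording, not a canonical format):
    ## Documentation rules
    - No line-by-line or end-of-line comments; the code should speak for itself.
    - Every function/class/module gets a short doc comment describing intent
      and context for future developers and coding agents.
    - Before writing a new helper, search for an existing function that
      already does the job and reuse or extend it.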
infecto · 5h ago
Doc strings within the code can be helpful for both humans and AI. Sometimes spoken-word intent is easier to digest than code, and it helps identify side effects for both humans and AI.
felixgallo · 5h ago
frequently your session/context may drop (e.g. claude crashes, or your internet dies, or your computer restarts, etc.). Claude does best when it can recover the context and understand the current situation from clear documentation, rather than trying to reverse engineer intent and structure from an existing code base. Also, the human frequently does read the code docs as there may be places where Claude gets stuck or doesn't do what you want, but a human can reason their way into success and unstick the obstacle.
manwe150 · 4h ago
With claude -r you can resume any conversation at any previous point, so there isn't a way to lose context that way. As opposed to compact, which I find makes it act brain-dead for a while afterwards.
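For anyone who hasn't tried it, the flags are roughly:
    claude -c    # continue the most recent conversation for this project
    claude -r    # pick any earlier conversation to resume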
r_murphey · 3h ago
Oh God yes, I wish there were better tools to help one curate and condense a context when one finds that sweet spot where it's writing great code.
Der_Einzige · 4h ago
I promise you that token context rot is worse than the gains from added natural language explanations
alwillis · 3h ago
Keep in mind each Claude subagent gets its own context.
felixgallo · 4h ago
This hasn't been my experience.
nisegami · 5h ago
It's entirely possible that the parameters that get activated by comments in code are highly correlated with the parameters involved in producing good code.
The_Fox · 3h ago
I too just yesterday had my first positive experience with Claude writing code in my project. I used plan mode for the first time and gave it the "think harder" shove. It was a straightforward improvement but not trivial. The spec wasn't even very detailed- I mentioned a couple specific classes and the behaviour to change, and it wrote the code I would have expected to write, with even a bit more safety checking than I would have done.
dewey · 4h ago
After someone mentioned this recently, I've started writing really detailed specs with the help of ChatGPT Deep Research, then editing them myself. Exporting the result as a Markdown document and passing it to Cursor has worked very well.
It puts you in a different mind space to sit down and think about it instead of iterating too much and in the end feeling productive while actually not achieving much and going mostly in circles.
sillyfluke · 4h ago
The test and review cycle is what determines time saved in my view. Since you were satisfied overall I take it that cycle was not too cumbersome?
The parent wrote:
>I imagine this saved me probably 6-10 hours. I’m now reviewing and am going to test etc.
Guessing the time saved prior to reviewing and testing seems premature from my end.
mfalcon · 2h ago
That's the way I've used it: I built a document with all the requirements and then gave it to CC. But it was not a final document; I had to go back and make some changes after experimenting with the code CC built.
philipwhiuk · 5h ago
Frankly, even if you ignore Claude entirely, being able to write a good spec for yourself is a worthwhile endeavour.
aosaigh · 5h ago
Completely agree. It’s a core skill of a good developer. What’s interesting is that in the past I’d have started this process but then jumped into coding prematurely. Now, when you know you are using an agent, the more you write, the better the results.
danielbln · 5h ago
Yes, but let's not forget the lessons of waterfall planning. You can't anticipate everything, so the detail level of the implementation plan should be within a Goldilocks zone of detailed but not too detailed, and after each implementation and test phase one should feel comfortable adjusting the spec/plan to the current state of things.
aosaigh · 5h ago
Another good point. I noticed this happening while writing my document.
A few times while writing the doc I had to go back and update the previous steps to add missing features.
Also, I knew when to stop. It’s not fully finished yet; there are additional stages I need to implement. But as an experienced developer, I knew when I had enough for “core functionality” that was well defined.
What worries me is how do you become a good developer if AI is writing it all?
One of my strengths as a developer is understanding the problem and breaking it down into steps, creating requirements documents like I’ve discussed.
But that’s a hard-earned skill from years of client work where I wrote the code. I have a huge step up in getting the most from these agents now.
danielbln · 4h ago
Agents raise the floor for all, but they raise the ceiling for those of us with sufficient priors.
closewith · 5h ago
The downside of waterfall was not overly detailed specs. In fact, the best software development is universally waterfall following a good, ideally formal spec.
The downside that Agile sought to remedy was inflexibility, which is an issue greatly ameliorated by coding agents.
danielbln · 4h ago
Maybe if you know the entire possibility space beforehand, in which case that's a great position to be in. In other cases, if the spec doesn't align with reality after implementation has begun or unforeseen issues pop up, the spec needs revision, does it not?
zer00eyz · 46m ago
> In other cases and if the spec doesn't align with reality after implementation has begun or unforeseen issues pop up, the spec needs revision, does it not?
Yes, and then it gets pumped back to the top of the waterfall and goes through the entire process. Many organizations became so rigid that this was a problem. It is what Tom Smykowski in Office Space is a parody of. It's why so much of the early web was full of things like "feature creep" and "if web designers were architects".
Waterfall failed because of politics mingled into the process; it was the worst sort of design by committee. If you want a sense of how this plays out, you simply have to look at Wayland development. The fact that it has been as successful as it is is a testament to the will and patience of those involved.
noiv · 44m ago
This is the way.
spyckie2 · 5h ago
Can’t you send the same spec through cursor? Am I missing something there?
aosaigh · 4h ago
Yes certainly. I’m sure Cursor would do a good job.
That said, I think that the differing UIs of Cursor (in the IDE) and Claude (in the CLI) fundamentally change how you approach problems with them.
Cursor is “too available”. It’s right there and you can be lazy and just ask it anything.
Claude nudges you to think more deeply and construct longer prompts before engaging with it.
That's my experience anyway.
danielbln · 4h ago
Fun fact: there is a Cursor CLI now
esafak · 5h ago
You can use Claude to write the spec next time.
maherbeg · 4h ago
I highly recommend having fairly succinct project-level CLAUDE.md files and deferring more things to sub-folders. Use the top level as a map. Then, during your planning of a feature, it can reach into each folder as it sees fit to find useful context to build out your phased implementation plan. I have it use thinking mode to figure out the right set of context.
At the end of each phase, I ask claude to update my implementation plan with new context for a new instance of claude to pick it up. This way it propagates context forward, and then I can clear the context window to start fresh on the next phase.
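To make the "map" idea concrete, a sketch of a top-level CLAUDE.md in that style (folder names invented):
    # Project map
    - api/         HTTP handlers and routing -- see api/CLAUDE.md
    - core/        domain logic, no I/O      -- see core/CLAUDE.md
    - storage/     database access           -- see storage/CLAUDE.md
    - docs/plans/  phased implementation plans, one file per feature

    Read the CLAUDE.md inside a folder before editing anything in it.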
gabrielpoca118 · 1h ago
I don’t write any structured specs and I still get a lot of value out of it. I basically use it in incremental steps where I’m telling it what I want at a much lower level. Am always watching what it is doing and stopping it to correct the action. At least for me this approach has worked much better than asking it for bigger things.
bgirard · 5h ago
I'm playing with Claude Code to build an ASCII Factorio-like. I first had it write code without much supervision. It quickly added most of the core features you'd expect (save/load, options, debug, building, map generation, belts, crafting, smart belt placement, QoL). Then I started fixing minor bugs, and each time it would break something, e.g. tweaking movement broke belts. So I prompted it to add Playwright automation. Then it wasn't able to write good quality tests and have them all pass; the tests were full of sleep calls, etc...
So I looked at the code more closely and found it was using the React frontend and useEffect instead of a proper game engine. It's also not great at following hook rules and understanding their timing in advanced scenarios. So now I'm prompting it to use a proper tick-based game engine and rebuilding the game up from there, doing code reviews. It's going 'slower' now, but it's going much better.
My goal is to make a Show HN post when I have a good demo.
FeepingCreature · 2h ago
Yep, human contribution is extremely valuable, especially very early on, before the AI has a skeleton it can work off of. You have to review those first few big refactors like a hawk. After that you can relax a bit.
JulesRosser · 2h ago
For anyone like me who was struggling to manage work and personal Claude subscriptions, you can just use an alias like this:
alias claude-personal="CLAUDE_CONFIG_DIR=~/.claude-personal claude"
https://julesrosser.com/blog/Multiple-Claude-accounts.html
I've been working with Claude Code daily for a month or so. It is quite excellent and better than the other agents I have used (Cursor, Q). This article has some good tips that echo some of the things I have learned.
Some additional thoughts:
- I like to start with an ideation session with Claude in the web console. I explain the goals of the project, work through high level domain modeling, and break the project down into milestones with a target releasable goal in mind. For a small project, this might be a couple hours of back and forth. The output of this is the first version of CLAUDE.md.
- Then I start the project with Claude Code, have it read my global CLAUDE.md and the project CLAUDE.md and start going. Each session begins this way.
- I have Claude Code update the project CLAUDE.md as it goes. I have it mark its progress through the plan as it goes. Usually, at the end of the session, I will have it rewrite a special section that contains its summary of the project, how it works, and how to navigate the code. I treat this like Claude's long term memory basically. I have found it helps a lot.
- Even with good guidelines, Claude seems to have a tendency to get ahead of itself. I like to keep it focused and build little increments, as I would myself, if it is something I care about. If it's just some one-off or prototype, I let it go crazy and churn out whatever works.
kace91 · 4h ago
Does the $20 subscription hold a similar bang for your buck as cursor?
I’m curious about the tool but I wonder if it requires more significant investment to be a daily driver.
time0ut · 3h ago
Using claude code feels like pairing with another programmer. Cursor feels like a polished extension of the IDE. They are both good tools and easily worth $20/mo. I think Anthropic has a 7 day free trial going on. Worth trying it out.
delichon · 5h ago
Asking the agent to perform a code review on its own work is surprisingly fruitful.
I do this routinely with its suggestions, usually before I apply them. It is surprising how often Claude immediately dumps on its own last output, talking both of us out of it, and usually with good reasons. I'd like to automate this double-take.
I found that for a period of time Claude was almost excessively negative when reviewing its own work. It was only after some contemplation that I realized that it was the phrasing of my code review slash command that framed the review with a negative bent, essentially prompting Claude to dump on its own stuff. The phrasing of that prompt has been a focus of a lot of effort on my side since.
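A hedged illustration of the reframing: instead of a review command that only asks what is wrong, something closer to (wording invented):
    Review the changes in $ARGUMENTS. Classify each finding as defect, risk,
    or style preference, note at least one thing that should stay as-is, and
    only recommend changes you would defend in a team code review.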
michaelteter · 1h ago
There’s an interesting transition point that we must keep in mind when using these tools.
For research, investigation, and proof of concept, it is good to be flexible and a bit imprecise.
But once a path seems clear, writing a single detailed document (even with “help”) is valuable before working with a separate AI assistant.
The challenge is recognizing that transition point. It’s very easy to just meander from zero to sort-of-product without making this separation.
abroun_beholder · 5h ago
Nice post, I'll try a few of those in my own file. From my side, one thing in the troubleshooting section that I think is missing is telling the agent that it should collect some proof of what it thinks is wrong before trying to proceed with a fix. I have burnt through a large number of tokens in the past in situations where Claude took a look at the dodgy code (that it had written) and went 'aha! I know what the problem is here' before proceeding to make things worse. Telling Claude to add in debug print statements can be remarkably effective but I'm sure it can also come up with other approaches.
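One way to encode that is a short guideline in CLAUDE.md or the relevant slash command; a sketch (phrasing is mine):
    ## Debugging
    - Before changing code, reproduce the failure and show the evidence:
      failing test output, a log line, or a temporary debug print.
    - Propose the smallest fix consistent with that evidence; no
      re-architecting while the bug is still open.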
enobrev · 4h ago
Nothing quite like "I see what the problem is", and then seeing Claude start reading a bunch of files and strategizing the re-architecture of a feature just to resolve its own 3-line blunder.
If you happen to catch it and you're quick to "esc" and just tell it to find a simpler solution, it's surprisingly great at reconsidering, resolving the issue simply, and picking up where it left off before the blunder.
nightshift1 · 2h ago
Yes, that is why, even in fully automated mode, it is better to pay attention.
Sometimes it even tries to git checkout the file when it is stuck on some indentation issues.
toxik · 2h ago
I just never run it in full auto, I look at its proposed changes and more often than not ask it to do it another way. Sometimes I'm just so disappointed that I just go code it up myself.
monkeydust · 4h ago
Been playing around with Claude Code for a home project over the last week.
I started with an idea but no spec. I got it to a happy place I can deploy yesterday. Spent around $75 on tokens. It was starting to feel expensive towards the end.
I did wonder whether I could have got there quicker and for less money if I had started with a clearer specification.
The thing is, though, looking back at the conversations I had with it, the back and forth (vibe coding, I guess) helped me refine what I was actually after, so I'm in two minds about whether a proper tight specification upfront would have been the best thing.
electroly · 4h ago
Switch from Opus to Sonnet. When people report high spending in Claude Code it's always because they're using Opus. Opus is for people on unlimited plans who aren't paying API rates.
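For reference, the switch is a one-liner, either at launch or mid-session (assuming the 'sonnet' alias I'm used to):
    claude --model sonnet     # start the session on Sonnet
    /model sonnet             # or switch inside a running session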
JulesRosser · 4h ago
You could also define a subagent that uses Opus, for special cases such as planning
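A sketch of what that could look like as a subagent definition, as far as I understand the format (a Markdown file under .claude/agents/; the frontmatter fields may need adjusting):
    ---
    name: planner
    description: Break a feature request into a phased implementation plan.
    model: opus
    ---
    Produce a step-by-step plan with file-level detail. Do not write code;
    stop after the plan and wait for review.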
libraryofbabel · 4h ago
I use Claude Code regularly and have been responsible for introducing colleagues to it. The consensus here seems to be that it’s the best coding agent out there. But since it’s the only coding agent I’ve used, when colleagues ask why it’s better than Cursor, Cline, GitHub Copilot, Gemini CLI, etc., I sometimes struggle to articulate reasons.
Claude Code power users, what would you say makes it superior to other agents?
paulhodge · 4h ago
Lots of signs point to a conclusion that the Opus and Sonnet models are fundamentally better at coding, tool usage, and general problem solving across long contexts. There is some kind of secret sauce in the way they train the models. Dario has mentioned in interviews that this strength is one of the company's closely guarded secrets.
And I don't think we have a great eval benchmark that exactly measures this capability yet. SWE Bench seems to be pretty good, but there's already a lot of anecdotal comments that Claude is still better at coding than GPT 5, despite having similar scores on SWE Bench.
libraryofbabel · 2h ago
Yeah, agree that the benchmarks don't really seem to reflect the community consensus. I wonder if part of it is the better symbiosis between the agent (Claude Code) and the Opus and Sonnet models it uses, which supposedly are fine-tuned on Claude Code tool calls? But agree, there is probably some additional secret sauce in the training, perhaps to do with RL on multi-step problems...
aosaigh · 4h ago
I mentioned this in another comment, but for me one of the big positives is nothing to do with the model; it’s the UI of how it presents itself.
I hated at first that it wasn’t like Cursor, sitting in the IDE. Then I realised I was using Cursor completely differently, using it often for small tasks where it’s only moderately helpful (refactoring, adding small functions, autocompleting)
With Claude I have to stop, think and plan before engaging with it, meaning it delivers much more impactful changes.
Put another way, it demands more from me, meaning I treat it with more respect and get more out of it.
libraryofbabel · 4h ago
This is a good point, the CLI kind of forces you to engage with the coding process through the eyes of the agent, rather than just treating it as “advanced autocomplete” in the IDE.
However, there are a lot of Claude Code clones out there now that are basically the same (Gemini CLI, Codex, now Cursor CLI etc.). Claude still seems to lead the pack, I think? Perhaps it’s some combination of better coding performance due to the underlying LLM (usually Sonnet 4) being fine-tuned on the agent tool calls, plus Claude is just a little more mature in terms of configuration options etc.?
enobrev · 4h ago
I haven't tried codex or cursor-cli yet, but I have tried to give gemini a few tasks and in my experience, compared to claude code, it's not great.
Gemini's been very quick to dive in and start changing things, even when I don't want it to. But those changes almost always fall short of what I'm after. They don't run or they leave failing tests, and when I ask it to fix the tests or the underlying issue, it churns without success. Claude is significantly slower and definitely not right all the time, but it seems to do a better job of stepping through a problem and resolving it well enough, while also improving results when I interject.
CamouflagedKiwi · 4h ago
Not a power user, but most recently I tried it out against Gemini and Claude produced something that compiled and almost worked - it was off in some specifics that I could easily tweak. The next thing I asked it (with slightly more detailed prompting) it more or less just nailed.
Meanwhile Gemini got itself stuck in a loop of compile/fail/try to fix/compile/fail again. Eventually it just gave up and said "I'm not able to figure this out". It does seem to have a kind of self-esteem problem in these scenarios, whereas Claude is more bullish on itself (maybe not always a good thing).
Claude seems to be the best at getting something that actually works. I do think Gemini will end up being tough competition, if nothing else because of the price, but Google really need a bit of a quality push on it. A free AI agent is worthless if it can't solve anything for me.
nlh · 5h ago
One fantastic tip I discovered (sorry I've forgotten who wrote it but probably a fellow HNer):
If you're using an AI for the "architecture" / spec phase, play a few of the models off each other.
I will start with a conversation in Cursor (with appropriate context) and ask Gemini 2.5 Pro to ask clarifying questions and then propose a solution, and once I've got something, switch the model to O3 (or your other thinking model of choice - GPT-5 now?). Add the line "please review the previous conversation and critique the design, ask clarifying questions, and propose alternatives if you think this is the wrong direction."
Do that a few times back and forth and with your own brain input, you should have a pretty robust conversation log and outline of a good solution.
Export that whole conversation into an .md doc, and use THAT in context with Claude Code to actually dive in and start writing code.
You'll still need to review everything and there will still be errors and bad decisions, but overall this has worked surprisingly well and efficiently for me so far.
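The hand-off itself can be a single command; a minimal sketch, assuming the exported conversation is saved as design-notes.md:
    claude "Read design-notes.md, restate the agreed design in your own words, then implement step 1 only and stop for review."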
enobrev · 4h ago
I do something very similar for the planning phase, as well as for the code-review after a task is complete. I like to switch between opus in claude code and gemini cli, so I can work from the same files rather than copying and pasting things.
One tip I picked up from a video recently to avoid sycophancy was to take the resulting spec and instead of telling the reviewing LLM "I wrote this spec", tell it "an engineer on my team wrote this spec". When it doesn't think it's insulting you, it tends to be a bit more critical.
jihadjihad · 3h ago
"The summer intern wrote this spec."
iambateman · 5h ago
If you use Laravel, I wrote github.com/iambateman/speedrun to help get good results. You type /feature [[description of feature]] and it takes it from there.
The system helps you build out a spec first, then uses a few subagents which are tuned for placing files, reviewing for best practice, etc.
I've been using it for about a week and about 70% of my Claude Code usage runs through /feature right now.
The nice thing is you can give it a _lot_ of requests and let it run for 10-15 minutes without interruption. Plus, it makes a set of planning documents before it implements, so you can see exactly what it thought it was changing.
naiv · 4h ago
The update to Opus 4.1 really improved the quality.
I personally really like to use Claude Code together with Zen MCP https://github.com/BeehiveInnovations/zen-mcp-server to analyse existing and review fresh code with additional eyes from Gpt5 and Gemini.
I get a lot of success when I’ve laid out the patterns and first implementation of an idea in my code. Then tell Claude to repeat the pattern to implement X feature.
And do it very step by step, in what would equate to a tiny PR that gradually rolls out the functionality. Too big and I find lots of ugly surprises, bugs, and reorganizations that don’t make sense.
andrew_lastmile · 4h ago
Creating temporary artifacts of implementation plans seems to be very useful for breaking down complex tasks and, even more so, for me to double-check the logic and plans.
tobyhinloopen · 5h ago
That's a great, short prompt. I'm going to steal it.
OJFord · 5h ago
Do you mean the claude.md?
renewiltord · 1h ago
Claude code is fantastic. For me, the insight was that you have to give it the ability to close the loop. If it writes code it will try to reason about the code. "The button has the right CSS and so should be visible".
But everything is better if it can close the loop. So I instead instruct it to always use the puppeteer tool to launch the app and use some test credentials and see if the functionality works.
That's for a web app but you can see how you can do this for other things. Either unit tests, integration tests, or the appropriate MCP.
It needs to see what it's done and observe the resulting world. Not just attempt to reason to it.
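If you go the MCP route, wiring up the reference Puppeteer server is roughly one command (package name per the modelcontextprotocol servers repo; treat this as a sketch):
    claude mcp add puppeteer -- npx -y @modelcontextprotocol/server-puppeteer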
Claude also leans towards what it's good at. Repetition costs it nothing, so it doesn't mind implementing the same thing 5 times. One thing it did when I started is implement a sidebar on every page rather than using a component. So you need to provide some pressure against that with your prompts, or at least force it to refactor at the end.
berlinismycity · 5h ago
Including Claude Code in the normal subscription was a genius move by Anthropic. It's so much better than copying and pasting code from chat windows, but it's hard to say whether it would be worth it if I had to pay for the service via the API.