Ask HN: How much better are AI IDEs vs. copy pasting into chat apps?
For context, I'm a heavy user of Gemini / ChatGPT for coding, and of Copilot. But I haven't used Cursor / Windsurf / etc.
Copy pasting into chat apps is a first world problem: it will do the work for you, but you have to give it all the context in the prompt, which, for a larger project, gets tedious.
The issue with Copilot is that it's not as smart as the "thinking" chat apps.
This makes it clear why there's such a need for AI IDEs. I don't want to construct the context for a chat app myself. The context is already in my codebase, so the AI should pick up on it. But I also hear that it gets expensive because of the pay-per-use pricing, as opposed to the effectively unlimited prompts you get from a thinking chat app if you pay the monthly subscription.
So I just wanted to get the lay of the land. How good are these IDEs at constructing your context for the LLMs? How much more expensive is it, and is it worth it for you?
I use both Cursor on Claude 3.7 and ChatGPT on 4o/o3. Cursor seems kind of "dumb" compared to 4o, but it's a good workhorse.
I let Cursor handle the basics - basically acting as a glorified multi-file autocomplete. I think through common problems with 4o and tough problems with o3. I copy all of Svelte's docs into 4o (https://svelte-llm.khromov.se/) to get good Svelte 5-focused feedback. I have 4o code-review what Cursor writes from time to time. And I have 4o, sometimes o3, generate "precise" prompts that I'll give to Cursor when talking off-the-cuff to Cursor doesn't get good results after a few attempts.
I don't consider myself an expert in these areas yet so I might be misusing Cursor, or not making enough use of its rules system, or something. I feel like I get good value for my ChatGPT subscription. I don't feel like I get good value for my Cursor subscription, but I also still feel like keeping it because $20 to type a lot less is still pretty nice. I would be upset if I only had a Cursor subscription and no access to ChatGPT. I am pretty hesitant to pay for AI à la carte. I feel much better about working within the limitations of a known subscription cost.
Same experience. I don't need the AI to write large amounts of real, working code. But it's amazing for fast refactoring, function signatures, boilerplate, etc. The real boring parts. I only wish it wouldn't be so eager to dump 50 lines of bullshit code every time. But for small changes, super cool.
What do you mean by that? You can use the 4o model in Cursor?
Even if both companies (OpenAI and Cursor) use the same model, the system prompt has a huge impact on results.
1. You have chats right there in the editor, easy to copy/paste and manage context/history without switching to a browser. You can also quickly add files or portions of files to the context or remove them again.
2. You can choose which model you want to use for what, granted you have an API key.
3. You can quickly highlight some code and ask for a change to it, which along with managed context is powerful.
I tried autocomplete again and again, but it doesn't work for me. At first I think "yeah, that's what I wanted to write", and then I have to look closer to realise it's not, and that completely breaks my flow and concentration. I can always write some pseudocode and proactively convert it to real code; I like to be in the driver's seat.
Context management is really central to my workflow; I manage it like a hawk. Models tend to deteriorate as the context grows, in my experience, so I really try to keep it narrow.
For that reason, and because our clients didn't sign up for their code to be sent to Anthropic et al, I _mostly_ use models like I would use StackOverflow, not to generate non-trivial code I'd actually use.
But having the chats in my editor is really invaluable for me. Powerful text wrangling features make a difference in both speed and motivation.
I use it pretty heavily with pretty much only the high-end models and pay about $15 per month.
1. Drop project files into https://files2prompt.com to copy a prompt with all file contents
2. Paste into https://aistudio.google.com/ and set a low (or 0) temperature and low top_p
Since Gemini 2.5 Pro is free in AI Studio at the moment, and there's a 1M token limit, this works for most things I need. Cursor is better in some cases where I need a bunch of small edits across a bunch of different files.
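For anyone who'd rather not upload code to a third-party site, a rough stand-in for step 1 is a few lines of Python. This is just a sketch with a hard-coded extension list, not how files2prompt actually works:

```python
# Sketch: dump project files, prefixed with their paths, into one
# clipboard-ready prompt. Extensions and root dir are just examples.
from pathlib import Path

EXTENSIONS = {".py", ".js", ".ts", ".svelte", ".md"}

def build_prompt(root: str = ".") -> str:
    parts: list[str] = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in EXTENSIONS:
            parts.append(f"--- {path} ---\n{path.read_text(errors='ignore')}")
    return "\n\n".join(parts)

if __name__ == "__main__":
    print(build_prompt())  # pipe into xclip/pbcopy, then paste into AI Studio
```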
I've been using it for about a year and it's incredible.
My take is that most people did not invest the little time necessary to get accustomed to its workflow.
My advice for aider:
- familiarize yourself with the chat modes (architect, code, ask)
- familiarize yourself with the edit modes (diff, whole, etc) and know which to use for a given model. Indeed not all models handle all modes equally well.
- write the code one feature at a time, in small chunks if needed, limiting the context to the relevant files.
- practice to learn how best to phrase things.
- write your coding preferences into aider convention files. Things like "always use type hints, beartype for type checking, python click for the cli, add many comments".
I'm mainly doing python and with proper comments and type hints it's really easy for models to write code that works with the rest of the repo even with only a few files in its context.
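To make the conventions point concrete, here's a tiny sketch (hypothetical, not from my repo) of the style those rules push the model toward - type hints, beartype for runtime checks, click for the CLI, plenty of comments:

```python
# Hypothetical example of the style the convention file asks for.
from pathlib import Path

import click
from beartype import beartype


@beartype
def count_lines(path: Path) -> int:
    """Return the number of lines in a text file."""
    # Iterate lazily so large files don't blow up memory.
    with path.open() as handle:
        return sum(1 for _ in handle)


@click.command()
@click.argument("file", type=click.Path(exists=True, path_type=Path))
def cli(file: Path) -> None:
    """Print the line count of FILE."""
    click.echo(f"{file}: {count_lines(file)} lines")


if __name__ == "__main__":
    cli()
```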
I'm now also trying the same with Claude Code, which has been pretty useful too. It managed to figure out and explain a fairly technical codebase (a few DNA processing algorithms). I haven't tried it with code I'm unfamiliar with yet.
So, my verdict so far: well worth the effort to try it and learn what's available. It's not that expensive to try. If it can help automate some things (documentation and tests, for example), that's what I'm really hoping to use the AI assistants for. It works for actual coding too, but I still prefer to provide the main foundation.
Of course it also gets stuff wrong and everything needs to be properly validated, but it's a nice tool to try out.
VSCode/Cursor run natively under Linux.
I guess it's mostly because of my usage type, more 'explain these few lines of code' or 'write a short function for X' in Python or JS, not the 'create this complete app' I see people using the agentic approaches for.
OpenAI is mostly out of habit, and because I have ground out a high rate-limit tier since I use it through the API in my production systems. With Anthropic I still run into throttling, even though its output seems better in terms of less hallucinating.
Wrote about it: https://ghiculescu.substack.com/p/nobody-codes-here-anymore
The developers now have access to the software license, probably via their work email address.
I don't think the person you're replying to necessarily said that they are forcing their developers to use the tool.
https://plugins.jetbrains.com/plugin/26658-contextbuilder
> ContextBuilder is a plugin that lets you combine and manage code files for AI prompts or other tooling. Features include a tool window with the same UI as the old ContextHistoryDialog, filetype exclusions, and a history manager.
You're able to select a directory, multiple directories, a file, multiple files, or a combination of them. After determining your selection, right-click in the Project Explorer and you'll see "Generate Context".
The "context-in-the-codebase" thing for AI-based IDEsis overrated IMO. To extract the most from it, you have to remember to @mention the files.
If you don't remember to @mention specific files, the agent simply tries to perform searches (i.e., it has access to that tool) on the files and folders until it gets some results... and will usually keep broadening the search until it does.
It works well enough I suppose. But I still find myself beginning new chats (for example, per feature) because the model still loses its place, and with all the code/lint fixes it does, it starts to lose context.
Then you're right back having to @mention more files to ensure the model knows how to structure your front end, back end, etc.
(Please excuse any misnamed development terms.) :-)
What I do notice is that the AI systems seem to forget to pull in a lot of context as the project size grows. It's almost as if there's something of a limit to the amount of input data, either to manage costs or compute cycles or something like that.
When I start "vibecoding" it goes really well in the beginning but then when the project grows in complexity, asking it to do something relatively simple falls apart as the AI "forgets" to do that really simple thing in every applicable place (e.g., removing a feature that is no longer needed).
- A custom vscode plugin to help me copy file contents from the tree view, along with file paths
- A chat based ide (LibreChat, but anything will do)
- An agent that syncs code back, once I'm happy with the conversation (https://github.com/codespin-ai/codebox)
Sometimes I add the agent right at the beginning. Sometimes I do that when I feel the code is ready to be written back. Another thing with the codebox agent is that it lets the agent run unit tests (even inside a docker compose environment, if you want integration tests); so in some cases it can be fire-and-forget.
However, I think it can be "vibe coded" these days.
It also has shortcuts, for instance fixing errors in the code by just selecting "Fix with Copilot". It's a bit hit or miss, but for simple things it works well.
It is very good and seamless. I can't imagine ever copy-pasting multiple files of code into some web chat and back.
From your description of how this plugin works, I see why it works so well. It is basically just focusing on the essentials (populating the appropriate context) and then doing its best to get out of the way (<- this part is really important, I think).
That, and the way it generates commit comments - I don't think I will write another commit message ever again.
They are a pain to apply manually. I've tried several Visual Studio Code extensions that apply patches from the clipboard or provide an input for them, and even saving them to a file and running command-line tools - but they all fail.
Does anyone have a good system, extension, or application to easily copy these diffs from ChatGPT and apply them to the code in Visual Studio Code (or some other editor)?
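Not a full solution, but the lowest-tech thing that sometimes works is piping the clipboard straight into `git apply`. A sketch only - it assumes Linux + xclip (pbpaste on macOS) and a diff whose paths actually match the repo, which is exactly where LLM-generated diffs tend to fall over:

```python
# Sketch: apply a unified diff from the clipboard with `git apply`.
import subprocess

# Read the diff from the clipboard (swap xclip for pbpaste on macOS).
diff = subprocess.run(
    ["xclip", "-sel", "clip", "-o"], capture_output=True, text=True
).stdout

# Feed it to git apply; --whitespace=fix cleans up trailing-space noise.
result = subprocess.run(
    ["git", "apply", "--whitespace=fix"],
    input=diff, text=True, capture_output=True,
)
print(result.stderr or "Applied cleanly.")
```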
You can of course use their new agentic features which can automate a lot of the context management via tool use, but I find that to be a waste of time and resources for the majority of tasks I care about.
I currently manage my own fork to use a custom FIM completion provider I built, that tends to do most of the heavy lifting for me in terms of anything AI-generated.
Upgrade to a paid plan and you can use many of the same “thinking” models.
Honestly, your entire question is best answered by signing up for a free trial of one of these tools and using it. Not the free tier, a trial or a paid plan.
Copying and pasting into another app is extremely inefficient. You really need to try a tool that integrates your model of choice into the editor.
This is unlike most other companies, say Spotify, where if you get access to their source, there's still not much that you can do with it.
I have found Cursor sort of useful for aiding in refactors, where I need to e.g. refactor dozens of function calls to move to a new generic function or something. So stuff where it is more than just a find and replace. I've also used it a bit for programming languages I'm not familiar with, though I feel it hinders my learning of the language so I try to avoid that.
I use chat based stuff also to do these Input -> ? -> Output kind of things for data conversion or refactorings, I also use it a lot for generating more involved SQL (which I'm not very good at).
- Gemini 2.5 Pro tries to fix and refactor my entire codebase unless I aggressively constrain it, which is too much work. I've written it off as unusable for coding, but it can be fine for educational purposes.
- For AI IDE coding, I narrow down the scope of what I ask so Gemini 2.0 Flash or Haiku can handle it. I haven't seen better results switching to paid models.
- For generating large swaths of code, I recently went back to copy/pasting into Claude Projects, with my github project hooked up and relevant files added to context. For a moderately complex component, it usually takes 5-15 generated versions to work, though I end up adjusting my specs a bit in the process. This is still faster than using the agent. Claude Code might get similar results, but I already pay for Anthropic so haven't tried it out.
- If I'm picking up a new library, concept, or pattern, I usually chat with Claude to level up my knowledge. The "Plan" mode in cline focuses more on task execution than skill development.
I'm open to suggestions on what to try next!
https://prompt.16x.engineer/
It is used by quite a lot of people. So the problem is definitely there.
The app supports API integration as well. From usage stats, there are still more people using the copy-pasting flow than the API flow.
I suspect it is because people already have a subscription, so they can basically use that for free, versus the API, where they have to pay.
With that said, Cursor's $20/month unlimited usage is really too good to miss. I expect it to end soon.
They acquired Supermaven last year, which was also one of the fastest code-completion models, with the biggest context window, even versus the new models. So they are able to do a lot of quality work locally before involving the major LLMs.
A year's subscription works out to about $16/month. I would say Cursor is the first real use of AI, and it gives such an unfair advantage to those who use it. You get quality + speed/stability + low price; there's just little reason to paste anything into websites anymore, and no reason to go back to Copilot.
Another thing I've noticed is that writing the code myself often has a net long-term benefit. For a while I was generating all kinds of Python scripts for converting data between formats etc. I'm not good with Python, but LLMs are, so I just used that. Until I ran into a wall too many times. Eventually every "temporary script" kept growing until LLMs could no longer provide what I wanted, and then I was stuck. I would have had to go through the whole code from the start and internalize it fully to be able to get the thing done. I would have saved time doing it myself from the start. So now I mainly use the chat just to ask about available functions, syntax, etc., but rarely use the generated code. IDEs are not great for this kind of approach, in my experience.
I use RepoPrompt [0] + Cursor Tab Complete (since it's easily the best on the market), but I don't use the in-IDE cursor/windsurf agentic flows unless I have a small task I need doing where I can describe it perfectly. RepoPrompt is an underhyped context selection tool that includes
- A prompt formatter so you can easily dump it + prompts into chat windows
- A built-in diff format prompt template that gets auto-inserted so you can auto-apply the resulting changes.
- A unique "Chat" mode that's an API frontend for multi-turn conversations with dynamic code context, templated prompting, etc.
- An incredible "delegate edit" feature where the results of your conversation get sent to cheaper/smaller LLMs to actually enact. The cost savings and increase in coherency are unreal. It's not unusual for a single turn to generate something like 50 specific edits across a dozen files that are completely coherent.
Right now my main workflow is to discuss architecture with a context dump + prompt to o3, settle on a solution, build a PRD, give that to 2.5 Pro to create a more granular plan, make edits, then send it back to 2.5 Pro to delegate edits to Flash 2.0/DeepSeek V3/o4-mini. Not unusual to see it produce 20-50 edits across a dozen files with perfect coherency. Any fixups or minor things I do in Cursor.
There are also the CLI agentic code tools. I find claude code to be like a more powerful version of the cursor agent flow. They're great for directed tasks that involve a lot of exploration, like understanding how you need to integrate with some submodule on the other side of the codebase.
[0] https://repoprompt.com/
Full files, JIRA tickets with is/should (as-is vs. to-be), DRY/KISS refactorings, and root cause analysis are done this way.
Full files are usually just copy/pasted; everything else is added for Cursor as instructions.
To answer why I don't use chat at all for coding anymore, some benefits of IDE / CLI agents: copy-pasting into chat apps doesn't allow for agentic runs (i.e., run bash commands, edit code, validate with linting/testing before marking the task as 'complete'). I can mostly just write my descriptive task (usually with code references if they aren't already there in clinerules / claude.md, etc.), come back after a short walk or drinking some tea, and check the code, or just check at runtime and ask it to fix up the code. This is honestly very similar to "waiting for code to compile."
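If it helps to picture it, the loop those agents run is roughly this. A hypothetical skeleton, not any tool's actual implementation; ask_model() stands in for whatever LLM/tool-use layer does the editing:

```python
# Hypothetical skeleton of an agentic "edit -> lint/test -> retry" loop.
import subprocess

def ask_model(task: str, feedback: str) -> None:
    """Placeholder for the LLM call that edits files on disk (hypothetical)."""
    raise NotImplementedError("wire up your model / tool-use layer here")

def checks_pass() -> tuple[bool, str]:
    """Run lint and tests; return (ok, combined output)."""
    output = ""
    for cmd in (["ruff", "check", "."], ["pytest", "-q"]):
        proc = subprocess.run(cmd, capture_output=True, text=True)
        output += proc.stdout + proc.stderr
        if proc.returncode != 0:
            return False, output
    return True, output

def run_task(task: str, max_rounds: int = 5) -> None:
    feedback = ""
    for _ in range(max_rounds):
        ask_model(task, feedback)    # model edits files based on prior failures
        ok, feedback = checks_pass()
        if ok:
            print("Task complete.")  # only now does the agent mark it done
            return
    print("Giving up; needs a human.")
```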
For reference, I've used the following to make production code changes:
- Cursor: which, when Claude 3.5 was out, seemed to be the best AI IDE, but the changes they've made around agent mode and 3.7 haven't really worked super well for me. Migrated to Claude Code after 3.7 came out.
- Claude Code: Very good, I generally just ran it in dangerous mode and it could accomplish tasks fairly well. Wrote a decent Claude.md file as well. Only con is that it got expensive quickly. On the order of $3-5 per session which would accomplish a single task.
- Cline with Gemini 2.5 Pro: Moved to it yesterday, it's been doing a good job and is effectively free right now using my own API key for Gemini. Seems a bit verbose at times although that might just be Cline's prompts.
I haven't tried Aider or Windsurf, but have heard good things about Windsurf's agentic mode. Although I might not move to Windsurf at all since Cline with Gemini works pretty well & is free to use.
I think the cost is/was a big factor preventing many from switching from Chat-based coding to Cline/Cursor/Aider.
For my hobby projects I don't want to spend 3 USD to get some assistance which I can get for free through a monthly subscription (and some extra pasting).
Google is smart to offer a free tier on API usage, which seems to be enough for agentic coding.
I was extremely disappointed. It recreated enum files with the same values, tons of changes, overall I rolled back.
I use copy/paste into ChatGPT/Claude a lot. I will stick with that. I control the scope, I control what I copy or how I take the solutions.
I stand by my belief that AI cannot work on medium to large codebases. I know we see vibe coding as a trend, but those are usually greenfield applications.
Very surprised when I asked Cursor to diagram some code, saves me so much time. I wish it would output a better format than ASCII.
I would definitely use it over copy-pasting for the convenience. However, I only code as a hobby these days, so I probably won't pay for it unless I really get stuck into a larger bit of work - but I definitely would then.
For engineering though, it's probably better to spend more money on Claude Code. It will be your super-knowledgeable junior assistant, which will result in much higher efficiency. Aider is also an option.
Much more hit and miss for actually writing code. Sometimes it works.
When I copy paste via ChatGPT I direct the model on what to do and retain responsibility and understanding of all code getting saved.
I do really like Cascade in windsurf because it does a decent job of parsing the entire codebase and finding the spot to make changes. But I think I’ll probably be most successful combining my approaches and using Cascade at a coding level not a “vibe” level
That being said, it is worth the price as it can be helpful sometimes. I pay $15 per month and haven't gone over the token limit yet. ChatGPT is still fundamentally more useful however.
For writing code the old-fashioned way, where you type the code yourself, the IntelliSense/auto-complete built into the likes of Cursor is the most annoying thing I have ever seen in an IDE. You're constantly hitting the escape key trying to get the auto-complete suggestion to go away. Most of the time it's guessing wrong and is mostly useless.
The biggest issue is when you try to use a class that has properties. When you press the dot after the class name, the IDE normally lists the public properties on the class.
Now you have to deal with a list that no longer contains just the properties found on the class, but also non-existent properties invented by the AI IDE. As clever as this AI autocomplete can be, I really hate it, because I can no longer rely on my IDE telling the truth about what properties and functions exist on a class.
And if you are the type of programmer who likes to put empty lines in your code for readability, then the moment you hit enter you get an often completely irrelevant code suggestion and have to battle to make it go away. I just want a blank line, for crying out loud.
PS: my AI IDE is cursor
Compared to Cursor with full codebase context where I can get away with less, so I typically fall back to the lazy prompting patterns of "not like that, do better". Which likely eats up more time than had I prompted well to begin with.
VS Code is very quickly catching up to Cursor as well (added tab-driven completions).
Also you don’t have to be right on the edge of things. You can be, or you can wait for the dust to settle. Claude Code / OpenAI Codex and the agent modes are probably the edge now.
Also, if you’re still using a terminal (who isn’t), try Warp. It’s improved a ton over the last year and supports more niche shells like Fish now. It’s actually a really “fluid” AI chat integration, because it smartly figures out when what you’re typing is not a command but a prompt. (It’s probably also on the edge; I haven’t figured out how to tell it “you’re wrong” when it asks for permission, for example - I have to “cancel” the chat.)
With every update to the Copilot extension, I see less and less of a need to switch away from VS Code to some “dedicated AI editor du jour”.
Maybe I’m missing something, or maybe people aren’t keeping up with Copilot’s capabilities?
⁽¹⁾ https://code.visualstudio.com/docs/copilot/reference/copilot...
⁽²⁾ https://code.visualstudio.com/docs/copilot/chat/chat-agent-m...
⁽³⁾ https://code.visualstudio.com/docs/copilot/chat/copilot-edit...
https://prompt.16x.engineer/cli-tools
You can install it in a few seconds, one line: curl 9ol.es/talker | sh
tell me what you think.
Is this true? Isn't the context size controlled by the model? Is there any difference (outside convenience) to pasting your code in the web interface?
Maybe the logic for gathering the context is wrong, the sampling parameters are tweaked or the tool context is messing things up.
`cat file.js | xclip -sel clip`
On Linux, this copies the whole file to the clipboard and I can paste it in.
This is considerably faster than using the mouse.
I can also get a code review done by getting the diff `git diff develop..feature/my-branch | xclip -sel clip`
The big secret to success?
JetBrains IDEs have “compare with clipboard” - essential tool to verify what the LLM changed.
With ChatGPT Canvas I’ve completed small projects like WordPress plugins, handy personal Python scripts, chrome extensions (my favorite use of ChatGPT) and most importantly debugged faster than any tool I’ve ever used. Sure, my customer-facing GPT API apps are burning credits like a rocket on full throttle but at least I’m billing them for it. The API speed is wildly inconsistent so you can’t rely on it for everything.
One key tip is you can’t keep a single chat going forever. It gets sluggish, especially when you paste long code blocks. My method is simple. I ask for the final clean code, then request a complete detailed summary of what we built and what the functions do. Then I start a fresh chat, paste the summary and code, and continue clean. Think of it like making a commit, just a very crude version.
It suits my workflow a lot better because I can reject suggestions easily, or accept partial diffs. I can approve certain chunks but instruct it to rethink other parts if I think they are a bad approach. This isn’t possible if you are just copypasting chunks of code back and forth between an IDE and browser window.
As for the cost, you get 500 “fast” requests per month for Premium models. If you use them up, then you can still use the Premium models, but are placed in a queue so have to wait a bit longer (Cursor refers to these as slow requests). I haven’t been in this situation yet, but I am not a super heavy user.
I find the trick is to give simpler requests to the unmetered models, and save the harder requests for the premium models. You can select the model you want to use in the chat or agent box.
The model list is here : https://docs.cursor.com/settings/models
So, for example, I might give simpler requests that save time or involve boilerplate to Deepseek-v3, gemini-2.5-flash-preview, gpt-4o-mini, or grok3-mini-beta. These are unmetered, so you can use them as much as you like. For requests that are more difficult and require better thought, I would choose something like claude-3.7-sonnet, or gemini-2.5-pro.
For context, I prefer the workflow of using Ask (chat) mode and working on a single file at a time. If I want another file to be considered, I refer to it using the @ convention in Cursor to refer to other sources. Others have complained about Cursor skimping on the context window of requests to control costs, so I follow this approach to make sure that the things I specify are considered. The files referenced remain in the context of the chat, so they are included for followup requests.
I feel that for those of us with programming experience, using it in a discriminating way is the best approach. It might be disconcerting to see people rage about Cursor on Reddit, but there are a lot of Vibe coders on r/cursor with unrealistic expectations. So they might well only use Agentic mode with vague directions, then rage because they burned through all their credits in a weekend, or had the models erase code because they’re unaware of Version Control. Agentic mode can be useful, but I want to at least glance at every diff to make sure things are going in the right direction.
I would suggest the free trial to see if it improves your workflow as much as it did mine. BTW, I find that working in TypeScript is a sweet spot for this, as VSCode and derivatives excel at TS, and when diffs are suggested you get immediate feedback on whether it would break compilation before you’ve even accepted the suggestions.
I guess they try to do things to give better results: give the context a list of file names, give the context the full class graph of the classes in the current working file (not sure about any specifics, really). In reality they don't do a whole bunch... but without what they do, it would be infuriating to figure that all out yourself, over and over and over, for each thing you want to work on.
But don't worry, we just have to wrap layers of LLMs around layers of LLMs, around loops of retrying and testing, around loops of detecting when the censorship starts evolving and breaking, until we finally all decide to become farmers.