Cursor raised $900M, is losing market share to Claude Code (resorting to poaching 2 leads from there [1]), AND they're decreasing the value of their product? Huge red flag. They should be able to burn cash like there's no tomorrow. Also, the PR language on this post and the timing (midnight on a US holiday) are not ideal.
This news, coupled with Google raising the new Gemini Flash cost by 5x, Azure dropping their startup credits, and 2-3 others (papers showing RL has also hit a wall for distilling or improving models), is now a solid signal that, despite what Sam Altman says, intelligence will NOT soon be too cheap to meter. I think we are starting to see the squeeze from the big players. Interesting. I wonder how many startups are betting on models becoming 5-10x cheaper for their business models. If on-device models don't get good, I bet a lot of them are in big trouble.
> I think we are starting to see the squeeze from the big players.
I’m not convinced that these price increases represent an attempt to squeeze more profit out of a saturated market.
To me they look an awful lot like people realising that the sheer compute cost associated with modern models makes the historical zero-marginal-cost model of software impossible. API calls to LLMs have far more in common with making calls to EC2 or Lambda for compute than they do with standard API calls for SaaS.
A lot of early LLM-based business models seemed to assume that the historical near-zero marginal cost of delivery for software would somehow apply to hosted LLMs, which clearly isn’t the case.
Mix that in with rising datacenter costs, driven by a lack of available electricity infrastructure to meet their demands, plus everyone trying to grab as much LLM land as possible, which requires more datacenters, faster, and the result is rapidly increasing base costs for compute. Which we’re now seeing reflected in LLM pricing.
For me the thing that stands out about LLMs is that their compute costs are easily 100-10000x greater per API call than a traditional SaaS API call. That fact alone should be enough for people to realise that the historically bottomless VC money that normally funds this stuff isn’t quite as bottomless as it needs to be to meaningfully subsidise consumer pricing.
barrell · 3h ago
I don’t see any difference between your perception of events and “the squeeze,” except yours has more details. I think you and GP actually agree
(Edited for tone)
patapong · 5h ago
Very insightful. I think the payment model would have worked out just fine if the state of the art was the optimization of GPT-4 class models to bring down the cost over time, which would have made the services profitable eventually. Instead, newer models are getting larger and more resource heavy through reasoning, meaning costs per request are going up instead of down.
avianlyric · 4h ago
I think we’ll start seeing people focus on optimisation; we already see companies like Apple focus on it.
LLMs are still too new, and still advancing too quickly, for optimisation to take place. It’s like we’re back in the MHz wars of old between CPU manufacturers. The goal is just more performance, regardless of cost, because it was clear that even in the consumer space, people wanted more performance.
Then we hit a kind of plateau in the last 10 years, where basic compute is so powerful that your average consumer is no longer upgrading every year for better performance. A 5-year-old machine has enough performance for most people. Then the focus on energy efficiency kicked in, because people didn’t want faster computers, they wanted battery life and cheaper computers.
No doubt we’ll see the same with LLMs, possibly quite soon. Claude Sonnet 4 and similar class models have enough reasoning performance that agentic systems can be quite reliable. Which means we’ve hit the base level of “reasoning” performance needed, and we can extend that “performance” in domain-specific ways by lightly customising the agentic framework, with no need for fine-tuning. The elimination of fine-tuning to build domain-specific agents is a huge game changer. But it also means that putting together a 10x or 100x more efficient model, with “reasoning” performance equivalent to current-gen LLMs, would also be a huge game changer. It opens up the possibility of applying this tech in spaces that currently require either lots of specialist knowledge to fine-tune an LLM, or a huge amount of on-tap compute to allow the agents to take enough turns to slowly “reason” their way through problems.
But a Claude Sonnet 4 that runs on an iPhone, for example? That would make Apple’s complete failure to improve Siri look like a genius-level move. Why bother with small incremental improvements using current tech, when waiting a few years and just stuffing a full-fat LLM and agent system into an iPhone will basically give you the ultimate Siri?
jonplackett · 6h ago
What’s funny is even electricity (nuclear in particular) isn’t ’too cheap to meter’ as originally promised. It’s actually the most expensive.
exclipy · 6h ago
What happened to wafer inference hardware like Cerebras? Why isn't Claude being served from that if it's so much faster and energy efficient?
totaa · 6h ago
Currently Cerebras, although faster, is more expensive than the traditional alternatives. Cursor's use case doesn't benefit from instant responses; users are happy to wait a few seconds (and watching the magic may even be beneficial).
niux · 5h ago
How is it more expensive?
avianlyric · 5h ago
I doubt Cerebras has even close to the scale to be a major player in this area.
Nvidia sold $35B of just datacenter GPUs last year. Of which the vast majority will be used for AI.
Cerebras’ entire revenue last year was only $78M. That’s nearly three orders of magnitude smaller than Nvidia’s datacenter GPU business. Scaling a company 10x in a year is a pretty hard thing to do, and it’s not a question of money, it’s a question of people and organisation. So much stuff in a business breaks when it scales 10x that it takes months to years to fix enough stuff to support another 10x growth spurt without everything just imploding.
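For what it's worth, the size of that gap can be checked in a few lines. This is only a sketch that takes the $35B and $78M figures exactly as quoted above (not independently verified); the variable names are made up for illustration.

```python
import math

# Figures as quoted in the comment above, not independently verified.
nvidia_datacenter_revenue = 35e9  # USD
cerebras_revenue = 78e6           # USD

# How many times larger one business is than the other.
ratio = nvidia_datacenter_revenue / cerebras_revenue

# The same gap expressed in powers of ten.
orders_of_magnitude = math.log10(ratio)

# Number of consecutive 10x growth spurts needed to match or exceed that scale.
growth_spurts_needed = math.ceil(orders_of_magnitude)

print(f"ratio: {ratio:.0f}x, orders of magnitude: {orders_of_magnitude:.2f}")
print(f"10x growth spurts needed: {growth_spurts_needed}")
```

On those numbers the gap is a bit under three orders of magnitude, which still means about three back-to-back 10x growth spurts.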
anonthrowawy · 6h ago
> papers showing RL also hitting a wall
any reference for this?
msgodel · 5h ago
Heh. Qwen3 still works on my machine.
NitpickLawyer · 8h ago
Cursor is an interesting case study on the wrapper vs. core debate. They were the first big success story in the coding space, enjoyed first mover advantages, and sweetheart deals with volume tokens from providers.
Now that all the providers have moved towards in-housing their coding solutions, the sweetheart deals are gone. And the wrapper goes back to "at cost" usage. Which, on paper should be less value / $ than any of the tiers offered by the providers themselves.
What data they collected, and what investments they made in core tech, remains to be seen. And it's a question of making use of that data. We can see that it is highly valuable for providers (as they all subsidise usage to an extent). Google is offering lots of stuff for free, presumably to collect data.
One interesting distinction is on cursor vs. windsurf. Windsurf actually developed some internal models (either pre-trained or post-trained I don't know, but probably doesn't matter much) swe1 and swe1-lite that are actually pretty good on their own. I don't think cursor has done that, beyond their tab-next-prediction model. A clue, perhaps.
Anyway, it will be interesting to see how this all unfolds 2-5 years from now.
avianlyric · 5h ago
I think LLM agents have completely broken the business model that companies like Cursor were founded on.
Early on Cursor added value by finding clever ways to integrate LLMs into an IDE, which would allow the single-shot output of an LLM to produce something useful, and do so quickly. That required a fair bit of engineering to make happen reliably.
But LLM agents completely break that. The moment people realised that rather than trying to bend our tools to work within the limits of an LLM, we could instead just make LLMs “self-prompt” their way to better outputs, Cursor’s model stopped being viable.
It’s another classic case of the AI “Bitter Lesson”[0] being learned. Throwing more data and more compute at AI produces faster, better progress than careful, methodical engineering.
I’m interested if anyone here knows what exactly Cursor has built? My limited understanding is that they’ve done nothing but fork VS code and add a chat window and AI-integrated editing tools.
stavros · 4h ago
Google has done nothing but make another search engine, Apple has done nothing but make another phone, Ferrari has done nothing but make another car, etc.
roxolotl · 3h ago
Eh that’s not quite fair. Google isn’t wrapping a search engine. Apple isn’t selling slightly optimized versions of other manufacturers’ phones.
Cursor is a solid tool but as best I can tell there’s not a ton there.
stavros · 3h ago
They have their own models, though, and their code completion is basically magic. It's not just a UI around GPT.
bravesoul2 · 7h ago
I saw an ad today from OpenAI offering a free AI interior design mock-up. Not sure if it's a specific feature or just a way to use image generation, but either way they are commoditizing the thin wrappers.
jerpint · 7h ago
Cursor developed their own tab models IIRC
fullstackwife · 5h ago
But Gemini is perfect for tab completion, and the price for Gemini will always be lower.
apt-apt-apt-apt · 7h ago
'We previously described ... as "rate limits", which wasn't an intuitive way to describe the limit. It is a usage credit pool'
Very strange that they decided to describe monthly credits as rate limits, and then spin it as 'unintuitive'. Feels like someone is trying to pull a fast one.
eru · 5h ago
Well, rate limits take a moving window of time (say, one second) and check how many requests you make during that time, and throttle you, if necessary.
Cursor just makes that window one month long.
Technically, that's a rate limit.
But yeah, only technically.
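To make the "only technically" point concrete, here is a minimal sketch of a sliding-window limiter where the only thing separating a classic per-second rate limit from a monthly quota is the window length. The class name, the limits, and the 500-request monthly figure are all hypothetical choices for illustration, not Cursor's actual implementation.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests in any trailing `window_seconds` span."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window_seconds = window_seconds
        self._timestamps = deque()  # times of accepted requests, oldest first

    def allow(self, now: float) -> bool:
        # Drop accepted requests that have aged out of the trailing window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.limit:
            self._timestamps.append(now)
            return True
        return False  # over the limit: throttle this request


# A conventional rate limit: 2 requests per second (hypothetical numbers).
per_second = SlidingWindowLimiter(limit=2, window_seconds=1.0)
burst = [per_second.allow(t) for t in (0.0, 0.1, 0.2, 1.05)]
print(burst)  # the third request is throttled, the fourth gets through

# The same mechanism with a 30-day window behaves like a monthly quota.
MONTH_SECONDS = 30 * 24 * 3600
monthly = SlidingWindowLimiter(limit=500, window_seconds=MONTH_SECONDS)
used_up = [monthly.allow(float(t)) for t in range(500)]
print(monthly.allow(500.0))  # denied until the oldest request ages out
```

With a one-second window a throttled user waits a moment and carries on; with a one-month window the same code locks them out for weeks, which is why calling it a "rate limit" feels like a stretch.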
cpursley · 7h ago
I’ve deleted all of these wrappers (cursor, windsurf etc) after discovering Claude Code on pro. I’m not sure how it does it, but it’s just better. And ultimately, more cost effective.
rorads · 5h ago
I think it’s fundamentally about context management and business model. Claude Code is expensive because it will happily put very large volumes into context because Anthropic are paid by the token. Cursor makes the bet that it can pay less per token whilst giving you enough value to still make margins on your $20 per month (assuming you’re using their default models).
This all becomes very clear when you do something that feels like magic in Claude Code and then run /cost and see you’ve blown through $10 in a single hour long session. Which is honestly worth it for me.
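A rough sketch of why a long agentic session adds up under per-token billing: if every turn re-sends a large context, input tokens dominate the bill. Every number below (per-token prices, turn count, context size) is a made-up placeholder for illustration, not Anthropic's actual pricing or what /cost really measures.

```python
def session_cost(
    turns: int,
    context_tokens_per_turn: int,
    output_tokens_per_turn: int,
    price_in_per_mtok: float,
    price_out_per_mtok: float,
) -> float:
    """Estimate the USD cost of a session billed per million tokens."""
    input_tokens = turns * context_tokens_per_turn
    output_tokens = turns * output_tokens_per_turn
    total = input_tokens * price_in_per_mtok + output_tokens * price_out_per_mtok
    return total / 1_000_000


# Hypothetical hour-long session: 40 agent turns, each re-sending 80k tokens
# of context and producing 1k tokens of output, at placeholder prices.
cost = session_cost(
    turns=40,
    context_tokens_per_turn=80_000,
    output_tokens_per_turn=1_000,
    price_in_per_mtok=3.0,    # assumed USD per million input tokens
    price_out_per_mtok=15.0,  # assumed USD per million output tokens
)
print(f"${cost:.2f}")
```

Under those assumptions the input side accounts for almost all of the total, which is the part a flat-fee tool has to trim (smaller context, cheaper models) to keep its margins.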
anon7000 · 4h ago
I’ve found cursor to be meaningfully faster, and tab autocomplete is really nice. It’s not like I can avoid touching code anyways, and when I do, cursor tab is near perfect at being a very smart auto complete. Claude is running through AWS bedrock though, so that could be the performance issue. But I do much prefer the terminal app for prompting
lunalabs · 5h ago
how many requests do you get per 5 hour window?
cpursley · 4h ago
More than I can use on the $200 plan, and I hit it hard 9 hours a day. It’s mind blowingly good at react and sql as well as cli tools
exclipy · 6h ago
That adage that "this is the worst it'll ever be" when people espouse AI coding agents is looking a bit shoddy. No, costs don't inevitably go down when you're on a sweetheart deal.
zyngaro · 6h ago
"We’re improving how we communicate future pricing changes" like clearly and explicitly stating what your customers are paying for? What kind of BS is this?
submeta · 7h ago
I moved to Claude Code, Ghostty, Tmux and NeoVim. Very happy with my setup.
Not because of Cursor’s pricing, but because in the end Claude Code is unmatched.
eknkc · 7h ago
I seem to like cursor agent with sonnet. Or even copilot agent with sonnet. The editor integrated agent feels better.
For example they can react to in editor linter errors without running a lint command etc.
niux · 5h ago
Claude Code can also do that if you pair it with the VS Code extension.
obblekk · 7h ago
Have people been dropping Cursor usage for Claude Code? I have dropped to using Cursor as just an IDE with autocomplete. Curious if others are doing this too.
ipnon · 6h ago
Claude Code makes me feel like I'm dispatching a legit engineer to go get something done. But they come back in a minute instead of a week. Most of the time the solution gets the job done. Sometimes it introduces too much complexity, sometimes it's totally wrong, but it gets the job done. Cursor meanwhile just feels like shortening the (copy editor/paste chat/copy chat/paste editor) loop.
For $200/month you can get equivalent value to a team of engineers. Plan accordingly! The stack is no longer safe for employment. You need to move up to manager or move down to metal.
ducksinhats · 5h ago
> You need to move up to manager or move down to metal.
Why couldn't Claude do a manager's job?
aprilthird2021 · 5h ago
What's an example of something you've had Claude Code do that would take a software engineer a week to do? Just curious.
I see people mention converting old legacy code from an old language to something more modern. I've also seen people mention greenfield projects.
Anything other than this? I'm trying to bring this productivity to my work but so far haven't been able to replace a week of work in a few minutes yet
mtkd · 2h ago
Last week stripped out all CSS from a fairly substantial project and replaced with Tailwind equivs, it got all but a few cases right
That was gemini-cli, I could see some mistakes on trial run so created a GEMINI.md with system prompt and project description (about 50 lines) which clarified some tricky source layout situations
Second run it was fine, ran for about an hour or so -- I had attempted to do it manually a while back but it started to look like it would take a week or two
matt3210 · 6h ago
Auto complete isn’t an AI thing
hn_throw2025 · 6h ago
Cursor’s autocomplete is Supermaven (which they acquired).
From the site:
“Supermaven uses Babble, a proprietary language model specifically optimized for inline code completion. Our in-house models and serving infrastructure allow us to provide the fastest completions and the longest context window of any copilot.”
avianlyric · 4h ago
LLMs are literally auto-complete models. It just so happens that when your auto-complete model gets big enough, and you poke it in the right way, it accidentally pretends to be intelligent. And it turns out that pretending to be intelligent is almost as useful as actually being intelligent.
hemmert · 6h ago
I recently started getting the feeling that they also make their workflows way more inefficient for me: their models started to always ask "do you want me to make that change for you", before they made the edits and I would simply reject them if they were not what I needed.
martin-adams · 6h ago
I noticed that as well, so I switched it out of auto mode and onto Claude Sonnet 4. It takes longer but gives better results.
I also saw that it was 0.8x the ‘credit cost’, still thinking that I had 500.
Now to learn that the 500 has gone and you get unlimited only on auto shows how easy it has been to misunderstand what they’re trying to say.
Also, I’ve no idea how to find out the cost of MAX. Especially as their web agent has the text MAX next to the selected non-max model.
primitivesuave · 6h ago
For me personally, I only care about the tab autocomplete. If that's a feature I have unlimited access to, I will happily pay $20/month for 10x productivity.
hn_throw2025 · 3h ago
It is. Tab is always included.
minimaxir · 8h ago
Did they really drop this news on a Friday night, on a holiday?
mceoin · 8h ago
I was wondering who would catch that.
joshdavham · 8h ago
That's startups for you!
bravesoul2 · 7h ago
It's 9am on a workday somewhere in the world. Let's work!
mrklol · 6h ago
Just switch to amp code, their agent is superior in every way. With their tooling you achieve even more although they are using the same models.
$20 on API pricing is what Claude Pro will give you in a day. It doesn't matter how good cursor is, this is a massive limitation and price differential that they can't overcome. Even if they go with DeepSeek which is much cheaper, they are still significantly more expensive than a Claude subscription.
likium · 7h ago
That's what Claude Pro will give you... now. As with Cursor there's no guarantee that it'll last. Next year, next month, next week, they may change their pricing when the competition runs dry.
senko · 7h ago
With competition being Microsoft and Google, that might take a while ...
csomar · 5h ago
It'll not last but it might last longer than Cursor can remain solvent.
mellosouls · 6h ago
NB. For the time being at least, you can OPT OUT of the new pricing on the Settings Menu in Advanced Settings.
Cursor deserve the criticism, but it's been pretty obvious since they introduced the new Ultra plan that we were going to see classic enshittification on the formerly premium options. Very frustrating for long-term supporters especially.
Note that although this update apologises for the miscommunication (which looks more like deliberate obfuscation), the only option for getting what we paid for (I'm on the annual plan) is the menu setting above, which should be default on!
[1] https://www.investing.com/news/economy-news/anysphere-hires-...
[0] http://www.incompleteideas.net/IncIdeas/BitterLesson.html
https://ampcode.com