A major current problem is that we're smashing gnats with sledgehammers via undifferentiated model use.
Not every problem needs a SOTA generalist model, and as we get systems/services that are more "bundles" of different models with specific purposes I think we will see better usage graphs.
simonjgreen · 1m ago
Completely agree. It’s worth spending time to experiment too. A reasonably simple chat support system I built recently uses 5 different models depending on the function it’s in. Swapping out different models for different things makes a huge difference to cost, user experience, and quality.
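A minimal sketch of what this kind of per-function routing might look like. The route table, model names, and task labels here are all invented for illustration, not taken from the system described above:

```python
# Hypothetical model-routing table: cheap models for simple functions,
# an expensive generalist only for hard cases. Names are illustrative.
ROUTES = {
    "triage": "small-fast-model",    # classify incoming messages
    "search": "embedding-model",     # retrieve relevant docs
    "draft": "mid-size-model",       # write first-pass replies
    "escalate": "sota-generalist",   # reserved for hard cases
}

def pick_model(task: str) -> str:
    """Return the model assigned to a task, defaulting to the cheapest."""
    return ROUTES.get(task, "small-fast-model")
```

The point is less the lookup itself than the discipline: each function of the app gets an explicitly chosen model, so cost and quality can be tuned per function instead of paying SOTA prices everywhere.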
mustyoshi · 2m ago
Yeah this is the thing people miss a lot. 7B and 32B models work perfectly fine for a lot of things, and run on what was until recently high-end consumer hardware.
But we're still in the hype phase, people will come to their senses once the large model performance starts to plateau
mark_l_watson · 8m ago
I have already thought a lot about the large packaged inference companies hitting a financial brick wall, but I was surprised by material near the end of the article: the discussion of lock-in for companies that can’t switch, and of Replit making money on the whole stack. Really interesting.
I managed a deep learning team at Capital One and the lock-in thing is real. Replit is an interesting case study for me because after a one-week free agent trial I signed up for a one-year subscription, had fun with their LLM-based coding agent for a few weeks, and almost never used it after that, but I still have fun with Replit as an easy way to spin up Nix-based coding environments. Replit seems to offer something for everyone.
comrade1234 · 21m ago
I'm kind of curious what IntelliJ's deal is with the different providers. I usually just keep it set to Claude but there are others that you can pick. I don't pay extra for the AI assistant - it's part of my regular subscription. I don't think I use the AI features as heavily as many others, but it does feed my code base to whoever I'm set to...
louthy · 12m ago
Are you sure you don’t pay extra? I’m on Rider and it’s an additional cost. Unless us C# and F# devs are subsidising everyone else :D
Edit: It says on the Jetbrains website:
“The AI Assistant plugin is not bundled and is not enabled in IntelliJ IDEA by default.
AI Assistant will not be active and will not have access to your code unless you install the plugin, acquire a JetBrains AI Service license and give your explicit consent to JetBrains AI Terms of Service and JetBrains AI Acceptable Use Policy while installing the plugin.”
double051 · 7m ago
If you pay for the all products subscription, their AI features are now bundled in.
I believe that may be a relatively recent change, and I would not have known about it if I hadn't been curious and checked.
comrade1234 · 9m ago
When they first added the assistant it was $100/yr to enable it. However, it's now part of the subscription and they even reimbursed me a portion of the $100 that I paid.
flyinglizard · 12m ago
The truth is we're brute-forcing some problems via a tremendous amount of compute. Especially for apps that use AI backends (rather than chats where you interface with the LLM directly), there needs to be hybridization. I haven't used Claude Code myself, but I did a screenshare session with someone who does and I think I saw it running old-fashioned keyword search on the codebase. That's much more effective than just pushing more and more raw data into the chat context.
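For concreteness, here's a rough sketch of the kind of old-fashioned keyword search described above: a plain regex scan over source files used to pre-filter what goes into the context window. The file extensions, limits, and function name are my own assumptions, not anything Claude Code is known to do internally:

```python
# Hedged sketch: keyword pre-filtering over a codebase, so only matching
# lines (not whole files) are fed into an LLM context. Illustrative only.
import os
import re

def keyword_search(root: str, keywords: list[str], max_hits: int = 20):
    """Return (path, line_number, line) tuples containing any keyword."""
    pattern = re.compile("|".join(map(re.escape, keywords)), re.IGNORECASE)
    hits = []
    for dirpath, _, files in os.walk(root):
        for name in files:
            if not name.endswith((".py", ".ts", ".go")):  # assumed filter
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as f:
                    for i, line in enumerate(f, 1):
                        if pattern.search(line):
                            hits.append((path, i, line.strip()))
                            if len(hits) >= max_hits:
                                return hits
            except OSError:
                continue  # skip unreadable files
    return hits
```

A few hundred matched lines are far cheaper to stuff into a prompt than entire files, which is the hybridization point: deterministic retrieval first, LLM reasoning second.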
On one of the systems I'm developing I'm using LLMs to compile user intents to a DSL, without ever looking at the real data to be examined. There are ways; increased context length is bad for speed, cost, and scalability.
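One way this pattern can work is to have the LLM emit only a constrained query expression, then validate it against a whitelist before it ever touches real data. The grammar, field names, and operators below are invented for illustration; the actual DSL in the system above is not described:

```python
# Hypothetical sketch: validating an LLM-emitted query DSL before
# execution, so the model never sees the underlying data. Field names
# and the grammar are invented for this example.
import re

ALLOWED_FIELDS = {"status", "created_at", "amount"}

def validate_dsl(expr: str) -> bool:
    """Accept only 'field op value' clauses joined by ' AND '."""
    clause = re.compile(r"^(\w+)\s*(=|!=|>|<)\s*\S+$")
    for part in expr.split(" AND "):
        m = clause.match(part.strip())
        if not m or m.group(1) not in ALLOWED_FIELDS:
            return False
    return True
```

Because the model only produces (and the app only accepts) expressions in this tiny language, the private data stays out of the context window entirely, which also keeps prompts short and cheap.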
ath3nd · 19m ago
Mathematics is not relevant when we have hype and vibes. We can't let facts, projections, and the absence of any path to profitability distract us from our final goal.
Which, of course, is to donate money to Sama so he can create AGI and be less lonely with his robotic girlfriend, I mean...change the world for the better somehow. /s
NitpickLawyer · 4m ago
I get your point but I think it's debatable. As long as the capabilities increase (and they have, IMO) cost isn't really relevant. If you can reasonably solve problems of a given difficulty (and we're starting to see that), then suddenly you can do stuff that you simply can't with humans. You can "hire" 100 agents / servers / API bundles, whatever, and "solve" all tasks with difficulty x in your business. Then you cancel, and your baseline is suddenly raised. You can't do that with humans. You can't suddenly hire 100 entry-level SWEs and fire them after 3 months.
Then you can think about automated labs. If things pan out, we can have the same thing in chemistry/bio/physics. Having automated labs definitely seems closer now than 2.5 years ago. Is cost relevant when you can have a lab test formulas 24/7/365? Is cost a blocker when you can have a cure to cancer_type_a? And then _b_c...etc?
Also, remember that costs go down within a few generations. There's no reason to think this will stop.