GPT-4 at $24.7 per million tokens vs Mixtral at $0.24 - that's a 100x cost difference! Even if routing gets it wrong 20% of the time, the economics still work. But the real question is how you measure 'performance' - user satisfaction doesn't always correlate with technical metrics.
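The economics claimed above can be sanity-checked with a tiny expected-cost calculation. This is a sketch using the comment's own numbers ($24.7/M tokens for GPT-4, $0.24/M for Mixtral); the 20% figure is treated as the fraction of traffic a router escalates to the expensive model.

```python
# Back-of-envelope check of the routing economics (numbers from the comment).
GPT4_COST = 24.7      # $ per million tokens
MIXTRAL_COST = 0.24   # $ per million tokens

def expected_cost(escalation_rate: float) -> float:
    """Blended cost per million tokens if `escalation_rate` of traffic
    goes to GPT-4 and the remainder goes to Mixtral."""
    return escalation_rate * GPT4_COST + (1 - escalation_rate) * MIXTRAL_COST

# Even sending 20% of requests to the expensive model stays far cheaper
# than sending everything there.
print(expected_cost(0.20))              # 5.132
print(GPT4_COST / expected_cost(0.20))  # ~4.8x cheaper overall
```

So even a router that escalates a fifth of all queries cuts the blended cost roughly fivefold versus using GPT-4 for everything, which is the point the comment is making.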
FINDarkside · 11m ago
It's trivial to get about the same score as GPT-4 at 1% of the cost by using my proprietary routing algorithm, which routes all requests to Gemini 2.5 Flash.
Keyframe · 32m ago
number of complaints / million tokens?
pqtyw · 16m ago
> GPT-4 at $24.7 per million tokens
While technically true, why would you want to use it when OpenAI itself provides a bunch of models that are many times cheaper and better?
QuadmasterXLII · 37m ago
The framing in the headline is interesting. As far as I recall, spending 4x more compute on a model to improve performance by 7% is the move that has worked over and over again up to this point. 101% of GPT-4 performance (potentially at any cost) is what I would expect an improved routing algorithm to achieve.
spoaceman7777 · 22m ago
Incredible that they are using contextual bandits, and named it:
Preference-prior Informed Linucb fOr adaptive rouTing (PILOT)
Rather than the much more obvious:
Preference-prior Informed Linucb For Adaptive Routing (PILFAR)
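For readers unfamiliar with the acronym's middle term: LinUCB is a standard contextual-bandit algorithm, and a router built on it would treat each candidate model as an arm. The sketch below is not the paper's PILOT method (which adds a preference prior); it is a minimal, dependency-free illustration of plain LinUCB routing with a 2-dimensional query feature vector. All names (`cheap_model`, `big_model`, the features) are hypothetical.

```python
# Minimal LinUCB-style router sketch (d = 2 context features, e.g.
# [bias, estimated_query_difficulty]). Each arm keeps a ridge-regression
# estimate of reward and adds an optimism bonus for exploration.
import math

D = 2  # context dimension

def mat_vec(A, x):
    return [sum(A[i][j] * x[j] for j in range(D)) for i in range(D)]

def inv2x2(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[ A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det,  A[0][0] / det]]

class LinUCBArm:
    def __init__(self):
        self.A = [[1.0, 0.0], [0.0, 1.0]]  # ridge matrix: I + sum(x x^T)
        self.b = [0.0, 0.0]                # sum of reward-weighted contexts

    def ucb(self, x, alpha=1.0):
        A_inv = inv2x2(self.A)
        theta = mat_vec(A_inv, self.b)               # ridge estimate of reward
        mean = sum(t * xi for t, xi in zip(theta, x))
        Ax = mat_vec(A_inv, x)
        width = math.sqrt(sum(xi * axi for xi, axi in zip(x, Ax)))
        return mean + alpha * width                  # optimism in the face of uncertainty

    def update(self, x, reward):
        for i in range(D):
            for j in range(D):
                self.A[i][j] += x[i] * x[j]
            self.b[i] += reward * x[i]

arms = {"cheap_model": LinUCBArm(), "big_model": LinUCBArm()}

def route(x):
    """Pick the arm (model) with the highest upper confidence bound."""
    return max(arms, key=lambda name: arms[name].ucb(x))
```

After each routed request, you would observe a quality score (e.g. from human preference data) and call `update` on the chosen arm, so the router learns which model wins for which kinds of queries.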
fny · 51m ago
Is there a reason human preference data is even needed? Don't LLMs already have a strong enough notion of question complexity to build a dataset for routing?
delichon · 43m ago
> a strong enough notion of question complexity
A.k.a. wisdom. No, LLMs don't have that. Me neither; I usually have to step into the rabbit holes in order to detect them.
jibal · 20m ago
LLMs don't have notions ... they are pattern matchers against a vast database of human text.
mhh__ · 8m ago
Please do a SELECT * from this database
andrewflnr · 48m ago
Is this really the frontier of LLM research? I guess we really aren't getting AGI any time soon, then. It makes me a little less worried about the future, honestly.
kenjackson · 41m ago
First, I don't think we will ever get to AGI. Not because we won't see huge advances, but because AGI is a moving, ambiguous target that we will never reach consensus on.
But why does this paper impact your thinking on it? It is about budget and recognizing that different LLMs have different cost structures. It's not really an attempt to improve LLM performance measured absolutely.
yahoozoo · 26s ago
That, and LLMs are seemingly plateauing. Earlier this year, it seemed like the big companies were releasing noticeable improvements every other week. People would joke that a few weeks is "an eternity" in AI... so what time span are we looking at now?
srekhi · 43m ago
I'm not following this either. You'd think this would have been the frontier back in 2023.
jibal · 19m ago
LLMs are not on the road to AGI, but there are plenty of dangers associated with them nonetheless.
guluarte · 34m ago
I'm starting to think that there will not be an 'AGI moment'; we will simply build smarter machines over time until we look back and realize we have 'AGI'. It would be like video calls in the '90s: everybody wanted them, now everybody hates them, lmao.