I tried Kimi on a few coding problems that Claude was spinning on. It’s good. It’s huge, way too big to be a “local” model (I think you need something like 16 H200s to run it), but it has a slightly different vibe than some of the other models. I liked it. It would definitely be useful in ensemble use cases at the very least.
summarity · 10h ago
Reasonable speeds are possible with 4-bit quants on two 512GB Mac Studios (MLX TB4 Ring - see https://x.com/awnihannun/status/1943723599971443134) or even a single-socket Epyc system with >1TB of RAM (about the same real-world memory throughput as the M Ultra). So $20k-ish to play with it.
For real-world speeds though yeah, you'd need serious hardware. This is more of a "deploy your own stamp" model, less a "local" model.
wongarsu · 3h ago
Reasonable speeds are possible if you pay someone else to run it. Right now both
NovitaAI and Parasail are running it, both available through Openrouter and both promising not to store any data. I'm sure the other big model hosters will follow if there's demand.
I may not be able to reasonably run it myself, but at least I can choose who I trust to run it and can have inference pricing determined by a competitive market. According to their benchmarks the model is about in a class with Claude 4 Sonnet, yet it already costs less than one third of Sonnet's inference pricing.
gpm · 8h ago
> or even a single socket Epyc system with >1TB of RAM
How many tokens/second would this likely achieve?
kachapopopow · 3h ago
around 1 by the time you try to do anything useful with it (>10000 tokens)
neuroelectron · 6h ago
1
refulgentis · 10h ago
I write a local LLM client, but sometimes, I hate that local models have enough knobs to turn that people can advocate they're reasonable in any scenario - in yesterday's post re: Kimi k2, multiple people spoke up that you can "just" stream the active expert weights out of 64 GB of RAM, and use the lowest GGUF quant, and then you get something that rounds to 1 token/s, and that is reasonable for use.
Good on you for not exaggerating.
I am very curious what exactly they see in that, 2-3 people hopped in to handwave that you just have it do agent stuff overnight and it's well worth it. I can't even begin to imagine unless you have a metric **-ton of easily solved problems that aren't coding. Even a 90% success rate gets you into "useless" territory quick when one step depends on the other, and you're running it autonomously for hours
segmondy · 8h ago
I do deepseek at 5tk/sec at home and I'm happy with it. I don't need to do agent stuff to gain from it, I was saving to eventually build out enough to run it at 10tk/sec, but with kimi k2, plan has changed and the savings continue with a goal to run it at 5 tk/sec at home.
fzzzy · 8h ago
I agree, 5 tokens per second is plenty fast for casual use.
overfeed · 4h ago
Also works perfectly fine in fire-and-forget, non-interactive agentic workflows. My dream scenario is that I create a bunch of kanban tickets and assign them to one or more AI personas[1], and wake up to some Pull Requests the next morning. I'd be more concerned about tickets-per-day, and not tk/s, as I have no interest in watching the inner-workings of the model.
1. Some more creative than others, with slightly different injected prompts or perhaps even different models entirely.
numpad0 · 1h ago
> I create a bunch of kanban tickets and assign them to one or more AI personas[1],
Yeah, that. Why can't we just `find ./tasks/ | grep '\.md$' | xargs llm`? Can't we just write up a government-proposal-style document, have an LLM recurse down into sub-sub-projects and back up until the original proposal document can be translated into a completion report? Constantly correcting a humongous LLM with infinite context length that can keep everything in its head doesn't feel like the right approach.
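A minimal sketch of that idea, assuming a hypothetical `llm` CLI that reads a prompt on stdin. `llm` is stubbed here with `cat` so the loop runs as-is; swap in the real tool:

```shell
# Stand-in stub for a real `llm` CLI -- replace with the actual tool.
llm() { cat; }

# One markdown file per kanban ticket.
mkdir -p ./tasks
echo "Fix the login bug" > ./tasks/T-1.md

# Feed each ticket to the model, keeping the answer next to the ticket.
for ticket in ./tasks/*.md; do
  llm < "$ticket" > "${ticket%.md}.out"
done

cat ./tasks/T-1.out
```

A glob (or `find -name '*.md'`) also avoids the pitfall in `grep \.md$`, where the unquoted backslash is eaten by the shell and the dot matches any character.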
refulgentis · 8h ago
Cosign for chat, that's my bar for usable on mobile phone (and correlates well with avg. reading speed)
tuananh · 5h ago
looks very much usable for local usage.
handzhiev · 8h ago
I tried it a couple of times in comparison to Claude. Kimi wrote much simpler and more readable code than Claude's over-engineered solutions.
It missed a few minor subtle edge cases that Claude took care of though.
airstrike · 7h ago
Claude what? Sonnet? 3.7? 3.5? Opus? 4?
nathan_compton · 7h ago
The first question I gave it (a sort of pretty simple recreational math question I asked it to code up for me) it got outrageously wrong. In fairness, and to my surprise, OpenAI's model also failed at this task, although with some prompting it sort of got it.
moffkalast · 10h ago
Still pretty good, someone with enough resources could distil it down to a more manageable size for the rest of us.
ozgune · 11h ago
This is a very impressive general-purpose LLM (in the GPT-4o / DeepSeek-V3 family). It’s also open source.
I think it hasn’t received much attention because the frontier shifted to reasoning and multi-modal AI models. In accuracy benchmarks, all the top models are reasoning ones:
https://artificialanalysis.ai/
If someone took Kimi k2 and trained a reasoning model with it, I’d be curious how that model performs.
GaggiX · 11h ago
>If someone took Kimi k2 and trained a reasoning model with it
I imagine that's what they are doing at MoonshotAI right now
Alifatisk · 5h ago
Why haven’t Kimi’s current and older models been benchmarked and added to Artificial Analysis yet?
satvikpendem · 10h ago
This is not open source, they have a "modified MIT license" where they have other restrictions on users over a certain threshold.
Our only modification part is that, if the Software (or any derivative works
thereof) is used for any of your commercial products or services that have
more than 100 million monthly active users, or more than 20 million US dollars
(or equivalent in other currencies) in monthly revenue, you shall prominently
display "Kimi K2" on the user interface of such product or service.
diggan · 10h ago
That seems like a combination of Llama's "prominently display “Built with Llama”" and "greater than 700 million monthly active users" terms but put into one and masquerading as "slightly changed MIT".
kragen · 10h ago
I feel like those restrictions don't violate the OSD (or the FSF's Free Software Definition, or Debian's); there are similar restrictions in the GPLv2, the GPLv3, the 4-clause BSD license, and so on. They just don't have user or revenue thresholds. The GPLv2, for example, says:
> c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.)
And the 4-clause BSD license says:
> 3. All advertising materials mentioning features or use of this software must display the following acknowledgement: This product includes software developed by the organization.
Both of these licenses are not just non-controversially open-source licenses; they're such central open-source licenses that IIRC much of the debate on the adoption of the OSD was centered on ensuring that they, or the more difficult Artistic license, were not excluded.
It's sort of nonsense to talk about neural networks being "open source" or "not open source", because there isn't source code that they could be built from. The nearest equivalent would be the training materials and training procedure, which isn't provided, but running that is not very similar to recompilation: it costs millions of dollars and doesn't produce the same results every time.
But that's not a question about the license.
ensignavenger · 2h ago
The OSD does not allow for discrimination:
"The license must not discriminate against any person or group of persons."
"The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research."
By having a clause that discriminates based on revenue, it cannot be Open Source.
If they had required everyone to provide attribution in the same manner, then we would have to examine the specifics of the attribution requirement to determine if it is compatible... but since they discriminate, it violates the open source definition, and no further analysis is necessary.
mindcrime · 6h ago
It may not violate the OSD, but I would still argue that this license is a Bad Idea. Not because what they're trying to do is inherently bad in any way, but simply because it's yet another new, unknown, not-fully-understood license to deal with. The fact that we're having this conversation illustrating that very fact.
My personal feeling is that almost every project (I'll hedge a little because life is complicated) should prefer an OSI certified license and NOT make up their own license (even if that new license is "just" a modification of an existing license). License proliferation[1] is generally considered a Bad Thing for good reason.
[1]: https://en.wikipedia.org/wiki/License_proliferation
I'm of the personal opinion that it's quite reasonable for the creators to want attribution in case you manage to build a "successful product" off their work. The fact that it's a new or different license is a much smaller thing.
A lot of open source, copyleft things already have attribution clauses. You're allowed commercial use of someone else's work already, regardless of scale. Attribution is a very benign ask.
wongarsu · 3h ago
Aren't most licenses "not fully understood" in any reasonable legal sense? To my knowledge only the Artistic License and the GPL have seen the inside of a court room. And yet to this day nobody really knows how the GPL works with languages that don't follow C's model of a compile and a link step. And the boundaries of what's a derivative work in the GPL are still mostly set by convention, not a legal framework.
What makes us comfortable with the "traditional open source licenses" is that people have been using them for decades and nothing bad has happened. But that's mostly because breaking an open source license is rarely litigated against, not because we have some special knowledge of what those licenses mean and how to abide by that
mindcrime · 3h ago
> Aren't most licenses "not fully understood" in any reasonable legal sense?
OK, fair enough. Pretend I said "not well understood" instead. The point is, the long-standing, well-known licenses that have been around for decades are better understood than some random "I made up my own thing" license. And yes, some of that may be down to just norms and conventions, and yes, not all of these licenses have been tested in court. But I think most people would feel more comfortable using an OSI approved license, and are hesitant to foster the creation of even more licenses.
If nothing else, license proliferation is bad because of the combinatorics of understanding license compatibility issues. Every new license makes the number of permutations that much bigger, and creates more unknown situations.
alt187 · 6h ago
What part of this goes against the four fundamental freedoms? Can you point at it?
simonw · 5h ago
"The freedom to run the program as you wish, for any purpose (freedom 0)."
Being required to display branding in that way contradicts "run the program as you wish".
weitendorf · 3h ago
You are still free to run the program as you wish; you just have to provide attribution to the end user. It's essentially CC BY but even more permissive, because the attribution only kicks in once specific, relatively uncommon conditions are met.
I think basically everybody considers CC BY to be open source, so a strictly more permissive license should be too, I think.
a2128 · 3h ago
Being required to store the GPL license notice on my hard drive is contradicting my wishes. And I'm not even earning $20 million US dollars per month off GPL software!
owebmaster · 2h ago
This freedom might be against the freedom of others to get your modifications.
Alifatisk · 6h ago
Exactly, I wouldn’t mind adding that text to our service if we made $20M; the parent made it sound like a huge clause
tonyhart7 · 4h ago
Yeah, it's fair for them if they want a little bit of credit
nothing gucci there
moffkalast · 10h ago
That's basically less restrictive than OpenStreetMap.
echelon · 10h ago
> This is not open source
OSI purism is deleterious and has led to industry capture.
Non-viral open source is simply a license for hyperscalers to take advantage. To co-opt offerings and make hundreds of millions without giving anything back.
We need more "fair source" licensing to support sustainable engineering that rewards the small ICs rather than mega conglomerate corporations with multi-trillion dollar market caps. The same companies that are destroying the open web.
This license isn't even that protective of the authors. It just asks for credit if you pass a MAU/ARR threshold. They should honestly ask for money if you hit those thresholds and should blacklist the Mag7 from usage altogether.
The resources put into building this are significant and they're giving it to you for free. We should applaud it.
teiferer · 8h ago
> small ICs
The majority of open source code is contributed by companies, typically very large corporations. The thought of the open source ecosystem being largely carried by lone hobbyist contributors in their spare time after work is a myth. There are such folks (heck I'm one of them) and they are appreciated and important, but their perception far exceeds their real role in the open source ecosystem.
wredcoll · 8h ago
I've heard people go back and forth on this before, but you seem pretty certain about it. Can you share some stats so I can see too?
Intermernet · 3h ago
Yep, awesome stuff. Call it "fair source" if you want to. Don't call it open source. I'm an absolutist about very few things, but the definition of open source is one of them. Every bit of variation given in the definition is a win for those who have ulterior motives for polluting the definition. Open source isn't a vague concept, it's a defined term with a legally accepted meaning. Very much like "fair use". It's dangerous to allow this definition to be altered. OpenAI (A deliberate misnomer if ever there was one) and friends would really love to co-opt the term.
satvikpendem · 8h ago
That's great, nothing wrong with giving away something for free, just don't call it open source.
drawnwren · 6h ago
It's silly, but in the LLM world - "open source" is usually used to mean "weights are published". This is not to be confused with the software licensing meaning of "open source".
simonw · 5h ago
The more tasteful corners of the LLM world use "open weights" instead of "open source" for licenses that aren't OSI.
fzysingularity · 10h ago
If I had to guess, the OpenAI open-source model got delayed because Kimi K2 stole their thunder and beat their numbers.
irthomasthomas · 10h ago
Someone at OpenAI did say it was too big to host at home, so you could be right. They will probably be benchmaxxing right now, searching for a few evals they can beat.
johnb231 · 6h ago
These are all "too big to host at home". I don't think that is the issue here.
https://github.com/MoonshotAI/Kimi-K2/blob/main/docs/deploy_...
"The smallest deployment unit for Kimi-K2 FP8 weights with 128k seqlen on mainstream H200 or H20 platform is a cluster with 16 GPUs with either Tensor Parallel (TP) or "data parallel + expert parallel" (DP+EP)."
16 GPUs costing ~$30k each. No one is running a ~$500k server at home.
weitendorf · 2h ago
For most people, before it makes sense to buy all the hardware yourself, you should probably be renting GPUs by the hour from the various providers serving that need. On Modal, I think it should cost about $72/hr to serve Kimi K2: https://modal.com/pricing
Once that's running it can serve the needs of many users/clients simultaneously. It'd be too expensive and underutilized for almost any individual to use regularly, but it's not unreasonable for them to do it in short intervals just to play around with it. And it might actually be reasonable for a small number of students or coworkers to share a $70/hr deployment for ~40hr/week in a lot of cases; in other cases, that $70/hr expense could be shared across a large number of coworkers or product users if they use it somewhat infrequently.
So maybe you won't host it at home, but it's actually quite feasible to self-host, and is it ever really worth physically hosting anything at home except as a hobby?
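Rough math on that sharing scenario, using the comment's assumed $72/hr and 40 hr/week figures and a hypothetical group of 10 coworkers:

```shell
# Assumed figures from the comment above: $72/hour, 40 hours/week, 10 people.
rate=72
hours_per_week=40
people=10
weekly=$((rate * hours_per_week))
echo "${weekly} USD/week total"                  # prints: 2880 USD/week total
echo "$((weekly / people)) USD/week per person"  # prints: 288 USD/week per person
```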
pxc · 5h ago
I think what GP means is that because the (hopefully) pending OpenAI release is also "too big to run at home", these two models may be close enough in size that they seem more directly comparable, meaning that it's even more important for OpenAI to outperform Kimi K2 on some key benchmarks.
ls612 · 4h ago
This is a dumb question I know, but how expensive is model distillation? How much training hardware do you need to take something like this and create a 7B and 12B version for consumer hardware?
johnb231 · 2h ago
The process involves running the original model. You can rent these big GPUs for ~$10 per hour, so that is ~$160 per hour for the 16-GPU setup, for as long as it takes.
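Back-of-envelope, using the parent's assumed ~$10/GPU-hour rental rate and the 16-GPU minimum deployment quoted upthread:

```shell
gpus=16       # minimum H200 deployment unit for the FP8 weights
rate_usd=10   # assumed rental price per GPU-hour
echo "$((gpus * rate_usd)) USD/hour"   # prints: 160 USD/hour
```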
exegeist · 7h ago
Technical strengths aside, I’ve been impressed with how non-robotic Kimi K2 is. Its personality is closer to Anthropic’s best: pleasant, sharp, and eloquent. A small victory over botslop prose.
jug · 10h ago
I like new, solid non-reasoning models that push the frontier. These still have nice use cases (basically anything where logic puzzles or STEM subjects don't apply) where you don't want to spend cash on reasoning tokens.
emacdona · 9h ago
To me, K2 is a mountain and SOTA is “summits on the air”. I saw that headline and thought “holy crap” :-)
All the AI models are now using em-dashes.
ChatGPT keeps using them even after explicitly told not to.
Anybody know what’s up with these models?
pxc · 6h ago
So far, I like the answer quality and its voice (a bit less obsequious than either ChatGPT or DeepSeek, more direct), but it seems to badly mangle the format of its answers more often than I've seen with SOTA models (I'd include DeepSeek in that category, or close enough).
data_maan · 11h ago
"Open source" lol
Open-weight. As usual, you don't get the dataset, training scripts, etc.
CaptainFever · 10h ago
It's not even open-weight. It's weight-available. It uses a "modified MIT license":
Modified MIT License
Copyright (c) 2025 Moonshot AI
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the “Software”), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Our only modification part is that, if the Software (or any derivative works
thereof) is used for any of your commercial products or services that have
more than 100 million monthly active users, or more than 20 million US dollars
(or equivalent in other currencies) in monthly revenue, you shall prominently
display "Kimi K2" on the user interface of such product or service.
mitthrowaway2 · 9h ago
This seems significantly more permissive than GPL. I think it's reasonable to consider it open-weight.
weitendorf · 3h ago
So "MIT with attribution" (but only for huge commercial use cases making tons of money off the product) is not open-weight? Do you consider CC BY photos on Wikipedia to be Image Available or GPL licensed software to be code-available too?
Tangent: I don't understand the contingent that gets upset about open LLMs not shipping with their full training regimes or source data. The software a company spent hundreds of millions of dollars creating, which you are now free to use and distribute with essentially no restrictions, is open source. It has weights in it, and a bunch of related software for actually running a model with those weights. How dare they!
MallocVoidstar · 9h ago
4-clause BSD is considered open source by Debian and the FSF and has a similar requirement.
mistercheph · 10h ago
Won't happen under the current copyright regime; it is impossible to train a SOTA model without copyrighted text. How do you propose distributing that?
irthomasthomas · 10h ago
List the titles.
mixel · 10h ago
But they probably don't have the rights to actually train on them, and that's why they don't publish the list. Otherwise it may be laziness, who knows.
msk-lywenn · 10h ago
Bibtex
awestroke · 11h ago
This is the model release that made Sam Altman go "Oh wait actually we can't release the new open source model this week, sorry. Something something security concerns".
Perhaps their open source model release doesn't look so good compared to this one
bhouston · 9h ago
Impressive benchmarks!
jacooper · 5h ago
The problem with Chinese models is finding decent hosting.
The best you can find right now for kimi k2 is only 30 tps, not great.
38 · 8h ago
The web chat has extremely low limits FYI. I ran into the limit twice before getting a sane answer and gave up
It kinda feels like it, but Moonshot's deliveries have been like this before as well; it's just that their new release got way more highlight than usual. When they released Kimi k1.5, those benches were impressive at the time! But everyone was busy with DeepSeek v3 and QwQ-32B
DataDaemon · 10h ago
Oops, China is leading in AI. When will the Nasdaq investors check their AI investments?