I tried Kimi on a few coding problems that Claude was spinning on. It’s good. It’s huge, way too big to be a “local” model — I think you need something like 16 H200s to run it - but it has a slightly different vibe than some of the other models. I liked it. It would definitely be useful in ensemble use cases at the very least.

summarity · 4h ago

Reasonable speeds are possible with 4bit quants on 2 512GB Mac Studios (MLX TB4 Ring - see https://x.com/awnihannun/status/1943723599971443134) or even a single socket Epyc system with >1TB of RAM (about the same real world memory throughput as the M Ultra). So $20k-ish to play with it.

For real-world speeds though yeah, you'd need serious hardware. This is more of a "deploy your own stamp" model, less a "local" model.
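A rough way to sanity-check speeds like these is a memory-bandwidth bound: during decoding, each token has to read roughly the active weights once. The sketch below assumes about 32 billion active parameters per token, 4-bit weights, and about 800 GB/s of memory bandwidth. None of these figures come from this thread, and the function name is made up for illustration. The result is an upper bound that ignores KV cache reads, compute, and interconnect overhead, so real throughput should land well below it.

```python
def decode_tokens_per_second(active_params: float,
                             bits_per_weight: float,
                             bandwidth_gb_s: float) -> float:
    """Upper bound on decode speed when memory bandwidth is the bottleneck."""
    # Bytes of weights that must be read to produce one token.
    bytes_per_token = active_params * bits_per_weight / 8
    # Bandwidth in bytes/second divided by bytes read per token.
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Assumed inputs (not from the thread): 32e9 active params, 4-bit, 800 GB/s.
estimate = decode_tokens_per_second(32e9, 4, 800)
print(f"bandwidth-bound ceiling: {estimate:.0f} tokens/s")
```

Halving the bandwidth or doubling the bits per weight halves the ceiling, which is why quantization level matters so much on this kind of hardware.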

gpm · 2h ago

> or even a single socket Epyc system with >1TB of RAM

How many tokens/second would this likely achieve?

refulgentis · 4h ago

I write a local LLM client, but sometimes, I hate that local models have enough knobs to turn that people can advocate they're reasonable in any scenario - in yesterday's post re: Kimi k2, multiple people spoke up that you can "just" stream the active expert weights out of 64 GB of RAM, and use the lowest GGUF quant, and then you get something that rounds to 1 token/s, and that is reasonable for use.

Good on you for not exaggerating.

I am very curious what exactly they see in that. 2-3 people hopped in to handwave that you just have it do agent stuff overnight and it's well worth it. I can't even begin to imagine, unless you have a metric **-ton of easily solved problems that aren't coding. Even a 90% success rate gets you into "useless" territory quickly when each step depends on the previous one and you're running it autonomously for hours.
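To put a number on the compounding: if every step must succeed and each succeeds with the same probability, the whole chain succeeds with that probability raised to the number of steps. This is a minimal sketch that assumes steps are independent and there are no retries, which real agent runs may not satisfy; the function name is illustrative.

```python
def chain_success_probability(per_step_success: float, steps: int) -> float:
    """Probability that every step in a dependent chain succeeds."""
    return per_step_success ** steps


# How a 90% per-step success rate decays as the chain gets longer.
for steps in (1, 5, 10, 20, 50):
    p = chain_success_probability(0.9, steps)
    print(f"{steps:>2} steps: {p:.1%}")
```

At 10 dependent steps the chain already succeeds only about a third of the time, and at 20 steps roughly one run in eight.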

segmondy · 2h ago

I run DeepSeek at 5 tk/sec at home and I'm happy with it. I don't need to do agent stuff to gain from it. I was saving to eventually build out enough to run it at 10 tk/sec, but with Kimi K2 the plan has changed, and the savings continue with a goal of running it at 5 tk/sec at home.

fzzzy · 2h ago

I agree, 5 tokens per second is plenty fast for casual use.

refulgentis · 2h ago

Cosign for chat, that's my bar for usable on mobile phone (and correlates well with avg. reading speed)

handzhiev · 2h ago

I tried it a couple of times in comparison to Claude. Kimi wrote much simpler and more readable code than Claude's over-engineered solutions. It missed a few minor subtle edge cases that Claude took care of though.

nathan_compton · 56m ago

The first question I gave it was a fairly simple recreational math problem that I asked it to code up for me, and its answer was outrageously wrong. In fairness, and to my surprise, OpenAI's model also failed at this task, although with some prompting it sort of got there.

airstrike · 1h ago

Claude what? Sonnet? 3.7? 3.5? Opus? 4?

moffkalast · 4h ago

Still pretty good, someone with enough resources could distil it down to a more manageable size for the rest of us.

ozgune · 5h ago

This is a very impressive general-purpose LLM (in the same class as GPT-4o and the DeepSeek-V3 family). It’s also open source.

I think it hasn’t received much attention because the frontier shifted to reasoning and multi-modal AI models. In accuracy benchmarks, all the top models are reasoning ones:

https://artificialanalysis.ai/

If someone took Kimi k2 and trained a reasoning model with it, I’d be curious how that model performs.

GaggiX · 5h ago

>If someone took Kimi k2 and trained a reasoning model with it

I imagine that's what they are doing at MoonshotAI right now.

satvikpendem · 4h ago

This is not open source, they have a "modified MIT license" where they have other restrictions on users over a certain threshold.

    Our only modification part is that, if the Software (or any derivative works
    thereof) is used for any of your commercial products or services that have
    more than 100 million monthly active users, or more than 20 million US dollars
    (or equivalent in other currencies) in monthly revenue, you shall prominently
    display "Kimi K2" on the user interface of such product or service.
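Read literally, the clause is a simple threshold rule. The sketch below encodes one plausible reading of it: the display requirement applies to commercial products or services that exceed either threshold, with "more than" treated as a strict inequality. The function and constant names are hypothetical, and the revenue argument is assumed to be already converted to US dollars.

```python
MAU_THRESHOLD = 100_000_000                 # "more than 100 million monthly active users"
MONTHLY_REVENUE_THRESHOLD_USD = 20_000_000  # "more than 20 million US dollars ... in monthly revenue"


def must_display_kimi_k2(monthly_active_users: int,
                         monthly_revenue_usd: float,
                         is_commercial: bool = True) -> bool:
    """One reading of the clause: True if "Kimi K2" must be shown in the UI."""
    if not is_commercial:
        # The clause only mentions commercial products or services.
        return False
    # Either threshold alone is enough ("or" in the license text).
    return (monthly_active_users > MAU_THRESHOLD
            or monthly_revenue_usd > MONTHLY_REVENUE_THRESHOLD_USD)


print(must_display_kimi_k2(5_000_000, 1_000_000))     # small product
print(must_display_kimi_k2(150_000_000, 1_000_000))   # over the MAU threshold
```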

alt187 · 19m ago

What part of this goes against the four fundamental freedoms? Can you point at it?

drawnwren · 2m ago

It's silly, but in the LLM world - "open source" is usually used to mean "weights are published". This is not to be confused with the software licensing meaning of "open source".

kragen · 4h ago

I feel like those restrictions don't violate the OSD (or the FSF's Free Software Definition, or Debian's); there are similar restrictions in the GPLv2, the GPLv3, the 4-clause BSD license, and so on. They just don't have user or revenue thresholds. The GPLv2, for example, says:

> c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.)

And the 4-clause BSD license says:

> 3. All advertising materials mentioning features or use of this software must display the following acknowledgement: This product includes software developed by the organization.

Both of these licenses are not just non-controversially open-source licenses; they're such central open-source licenses that IIRC much of the debate on the adoption of the OSD was centered on ensuring that they, or the more difficult Artistic license, were not excluded.

It's sort of nonsense to talk about neural networks being "open source" or "not open source", because there isn't source code that they could be built from. The nearest equivalent would be the training materials and training procedure, which isn't provided, but running that is not very similar to recompilation: it costs millions of dollars and doesn't produce the same results every time.

But that's not a question about the license.

mindcrime · 33m ago

It may not violate the OSD, but I would still argue that this license is a Bad Idea. Not because what they're trying to do is inherently bad in any way, but simply because it's yet another new, unknown, not-fully-understood license to deal with. The fact that we're having this conversation illustrates that very point.

My personal feeling is that almost every project (I'll hedge a little because life is complicated) should prefer an OSI certified license and NOT make up their own license (even if that new license is "just" a modification of an existing license). License proliferation[1] is generally considered a Bad Thing for good reason.

[1]: https://en.wikipedia.org/wiki/License_proliferation

diggan · 4h ago

That seems like a combination of Llama's "prominently display “Built with Llama”" and "greater than 700 million monthly active users" terms but put into one and masquerading as "slightly changed MIT".

moffkalast · 4h ago

That's basically less restrictive than OpenStreetMap.

echelon · 4h ago

> This is not open source

OSI purism is deleterious and has led to industry capture.

Non-viral open source is simply a license for hyperscalers to take advantage. To co-opt offerings and make hundreds of millions without giving anything back.

We need more "fair source" licensing to support sustainable engineering that rewards the small ICs rather than mega conglomerate corporations with multi-trillion dollar market caps. The same companies that are destroying the open web.

This license isn't even that protective of the authors. It just asks for credit if you pass a MAU/ARR threshold. They should honestly ask for money if you hit those thresholds and should blacklist the Mag7 from usage altogether.

The resources put into building this are significant and they're giving it to you for free. We should applaud it.

teiferer · 2h ago

> small ICs

The majority of open source code is contributed by companies, typically very large corporations. The idea that the open source ecosystem is largely carried by lone hobbyist contributors in their spare time after work is a myth. There are such folks (heck, I'm one of them) and they are appreciated and important, but the perception of their role far exceeds their real role in the open source ecosystem.

wredcoll · 2h ago

I've heard people go back and forth on this before, but you seem pretty certain about it. Can you share some stats so I can see too?

satvikpendem · 2h ago

That's great, nothing wrong with giving away something for free, just don't call it open source.

exegeist · 1h ago

Technical strengths aside, I’ve been impressed with how non-robotic Kimi K2 is. Its personality is closer to Anthropic’s best: pleasant, sharp, and eloquent. A small victory over botslop prose.

fzysingularity · 4h ago

If I had to guess, the OpenAI open-source model got delayed because Kimi K2 stole their thunder and beat their numbers.

irthomasthomas · 4h ago

Someone at OpenAI did say it was too big to host at home, so you could be right. They are probably benchmaxxing right now, searching for a few evals they can beat.

johnb231 · 11m ago

These are all "too big to host at home". I don't think that is the issue here.

https://github.com/MoonshotAI/Kimi-K2/blob/main/docs/deploy_...

"The smallest deployment unit for Kimi-K2 FP8 weights with 128k seqlen on mainstream H200 or H20 platform is a cluster with 16 GPUs with either Tensor Parallel (TP) or "data parallel + expert parallel" (DP+EP)."

16 GPUs costing ~$30k each. No one is running a ~$500k server at home.
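The arithmetic behind that figure, using only the numbers quoted above (16 GPUs at roughly $30k each). It covers the GPUs alone; chassis, networking, and power are not included.

```python
gpu_count = 16              # smallest deployment unit quoted from the deploy guide
price_per_gpu_usd = 30_000  # rough per-GPU price given in the comment

gpu_cost_usd = gpu_count * price_per_gpu_usd
print(f"GPUs alone: ${gpu_cost_usd:,}")  # prints "GPUs alone: $480,000"
```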

emacdona · 3h ago

To me, K2 is a mountain and SOTA is “summits on the air”. I saw that headline and thought “holy crap” :-)

jug · 4h ago

I like new, solid non-reasoning models that push the frontier. These still have nice use cases (basically anything where logic puzzles or STEM subjects don't apply) where you don't want to spend cash on reasoning tokens.

awestroke · 4h ago

This is the model release that made Sam Altman go "Oh wait actually we can't release the new open source model this week, sorry. Something something security concerns".

Perhaps their open source model release doesn't look so good compared to this one

bhouston · 3h ago

Impressive benchmarks!

data_maan · 4h ago

"Open source" lol

Open-weight. As usual, you don't get the dataset, training scripts, etc.

CaptainFever · 4h ago

It's not even open-weight. It's weight-available. It uses a "modified MIT license":

    Modified MIT License
    
    Copyright (c) 2025 Moonshot AI
    
    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the “Software”), to deal
    in the Software without restriction, including without limitation the rights
    to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
    copies of the Software, and to permit persons to whom the Software is
    furnished to do so, subject to the following conditions:
    
    The above copyright notice and this permission notice shall be included in all
    copies or substantial portions of the Software.
    
    THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
    AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
    OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
    SOFTWARE.
    
    Our only modification part is that, if the Software (or any derivative works
    thereof) is used for any of your commercial products or services that have
    more than 100 million monthly active users, or more than 20 million US dollars
    (or equivalent in other currencies) in monthly revenue, you shall prominently
    display "Kimi K2" on the user interface of such product or service.

mitthrowaway2 · 3h ago

This seems significantly more permissive than GPL. I think it's reasonable to consider it open-weight.

MallocVoidstar · 3h ago

4-clause BSD is considered open source by Debian and the FSF and has a similar requirement.

mistercheph · 4h ago

Won't happen under the current copyright regime. It is impossible to train SOTA without copyrighted text, so how do you propose distributing that?

irthomasthomas · 4h ago

List the titles.

mixel · 4h ago

But they probably don't have the rights to actually train on them, and that's why they do not publish the list. Otherwise it may be laziness, who knows.

msk-lywenn · 4h ago

Bibtex

38 · 2h ago

The web chat has extremely low limits FYI. I ran into the limit twice before getting a sane answer and gave up

ChrisArchitect · 5h ago

[dupe] https://news.ycombinator.com/item?id=44533403

brcmthrowaway · 5h ago

Is Kimi the new DeepSeek?

DataDaemon · 4h ago

Oops, China is leading in AI. When will Nasdaq investors check their AI investments?

Kimi k2 largest open source SOTA model?

Comments (44)