MiniMax-M1 open-weight, large-scale hybrid-attention reasoning model

150 points by danboarder | 6/18/2025, 6:53:34 AM | github.com ↗

Comments (15)

swyx · 2h ago
1. This is apparently MiniMax's "launch week": they did M1 on Monday and Hailuo 2 on Tuesday (https://news.smol.ai/issues/25-06-16-chinese-models). It remains to be seen whether they can keep up the pace of model releases for the rest of the week; these two were the big ones, and they aren't yet known for much beyond LLM and video models. Just watch https://x.com/MiniMax__AI for announcements.

2. MiniMax M1's tech report is worthwhile: https://github.com/MiniMax-AI/MiniMax-M1/blob/main/MiniMax_M... While it may not be the SOTA open-weights model, they make some very big/notable claims about lightning attention and their GRPO variant, CISPO (a rough sketch of the CISPO idea is at the end of this comment).

(I'm unaffiliated, just sharing what I've learned so far, since no comments had been made here yet.)
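
To make the CISPO claim concrete, here is a minimal sketch of the idea as I understand it from the report: instead of clipping the per-token update the way PPO/GRPO do, you clip the importance-sampling weight itself and stop its gradient, so every token still contributes a policy-gradient term. This is my own illustrative PyTorch, not MiniMax's code; the clipping range, shapes, and default values are assumptions.

    import torch

    def cispo_loss(logp_new, logp_old, advantages, eps_low=0.2, eps_high=0.2):
        # Importance-sampling weight between the current and behaviour policies.
        ratio = torch.exp(logp_new - logp_old)
        # Clip the weight (not the update) and detach it, so even clipped
        # tokens still pass gradient through log pi_theta.
        weight = torch.clamp(ratio, 1 - eps_low, 1 + eps_high).detach()
        # REINFORCE-style term scaled by the advantage and the clipped weight.
        return -(weight * advantages * logp_new).mean()

    # Toy usage with 8 tokens of per-token log-probs and advantages.
    logp_old = torch.randn(8)
    logp_new = (logp_old + 0.1 * torch.randn(8)).requires_grad_()
    advantages = torch.randn(8)
    cispo_loss(logp_new, logp_old, advantages).backward()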

vintermann · 1h ago
"We publicly release MiniMax-M1 at this https url" in the arxiv paper, and it isn't a link to an empty repo!

I like these people already.

markkitti · 17m ago
Please come up with better names for these models. This sounds like the processor in my Mac Studio.
npteljes · 1h ago
This is stated nowhere on the official pages, but it's a Chinese company.

https://en.wikipedia.org/wiki/MiniMax_(company)

iLoveOncall · 1h ago
Why would you expect them to mention that on their project's page?
noelwelsh · 58m ago
1. It's conventional to do so.

2. It's a legal requirement in some jurisdictions (e.g. https://www.gov.uk/running-a-limited-company/signs-stationer...)

3. It's useful for people who may be interested in applying for jobs

noelwelsh · 1h ago
A few thoughts:

* A Singapore-based company, according to LinkedIn. There doesn't seem to be much of a barrier to entry for building a very good LLM.

* Open-weight models plus the development of Strix Halo / Ryzen AI Max make me optimistic that running great LLMs locally will be relatively cheap in a few years.

rfoo · 1h ago
> A Singapore based company, according to LinkedIn

Nah, this is a Shanghai-based company.

manc_lad · 1h ago
It seems more and more inevitable that we will run models locally. Exciting and concerning implications.

If anyone has any suggestions of people thinking about this space they respect, I'd love to listen to more ideas and thoughts on the developments.

noelwelsh · 44m ago
I think the main limitation, right now, is hardware. For GPUs the main limit is the VRAM available on consumer models. CPUs have plenty of memory but don't have the bandwidth or vector compute power for LLMs. This is why I think the Strix Halo is so exciting: it has bandwidth + compute power plus a lot of memory. It's not quite where it needs to be to replace a dedicated GPU, but in a few iterations it could be.
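
A back-of-the-envelope way to see why bandwidth is the bottleneck: during decoding, every weight has to be read from memory for each generated token, so tokens/sec tops out at roughly bandwidth divided by model size. The figures below are rough assumptions, not benchmarks:

    # Rough decode-speed ceiling: tokens/sec ~= memory bandwidth / model bytes.
    model_bytes = 40e9  # e.g. a ~70B-parameter model quantized to ~4.5 bits/weight

    systems = {
        "dual-channel DDR5 desktop": 90e9,      # ~90 GB/s
        "Strix Halo (256-bit LPDDR5X)": 256e9,  # ~256 GB/s
        "Apple M4 Max": 546e9,                  # ~546 GB/s
        "RTX 4090": 1008e9,                     # ~1 TB/s
    }

    for name, bandwidth in systems.items():
        print(f"{name}: ~{bandwidth / model_bytes:.1f} tokens/sec")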

I'm interested in other opinions. I'm no expert on this stuff.

jb1991 · 27m ago
How does the shared-memory model for GPUs on Apple Silicon factor into this? These are technically consumer-grade and not very expensive, but they can offer a huge amount of memory since it is all shared between CPU and GPU; even a mid-tier machine can easily have 100 GB of GPU memory.
noelwelsh · 13m ago
If you squint, the M4 is roughly comparable to the Strix Halo:

* Double the bandwidth

* Half the compute

* Double the price for comparable memory (128GB)

I'm more interested in the AMD chips because of cost, and, while I have an Apple laptop, I do most of my work on a Linux desktop, so a killer AMD chip works better for me. If you don't mind paying the Apple tax, a Mac is a viable option. I'm not sure about the software side of running LLMs on Apple Silicon, but I can't imagine it's unusable.

An example of a desktop with the Strix Halo is the Framework Desktop (AI Max+ 395 is the marketing name for the Strix Halo chip with the most juice): https://frame.work/gb/en/products/desktop-diy-amd-aimax300/c...

pantulis · 40m ago
Honest question: what is the concerning aspect to it?
htrp · 59m ago