Qwen3-Next

30 points by tosh · 5 comments · 9/12/2025, 6:32:04 AM · qwen.ai ↗

Comments (5)

jychang · 4m ago
Coolest part of Qwen3-Next, in my opinion, is that they do MTP without adding another un-embedding matrix.

Deepseek R1 also has a MTP layer (layer 61) https://huggingface.co/deepseek-ai/DeepSeek-R1/blob/main/mod...

But Deepseek R1 adds embed_tokens and shared_head.head tensors, each of shape [129280, 7168], together about 2GB at FP8.

Qwen3-Next doesn't have that, so it saves a few GB in active parameters for MTP, which is a Big Deal.
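The ~2GB figure above is easy to sanity-check; a quick sketch (tensor shapes taken from the comment, FP8 assumed to be 1 byte per parameter):

```python
# Back-of-the-envelope size of DeepSeek R1's two extra MTP tensors.
vocab, hidden = 129280, 7168          # shape of each [vocab, hidden] tensor
params_per_tensor = vocab * hidden    # ~0.93B parameters each
fp8_bytes_per_param = 1               # FP8 stores one byte per parameter

# embed_tokens + shared_head.head = two such tensors
total_gb = 2 * params_per_tensor * fp8_bytes_per_param / 1e9
print(f"{total_gb:.2f} GB")           # ~1.85 GB, i.e. "about 2GB"
```

Skipping those two tensors is exactly the saving in active parameters the comment describes.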

croemer · 9m ago
Jgoauh · 17m ago
Seems impressive. I believe better architectures are really the path forward; I don't think you need more than 100B params, given this model and what GPT OSS 120B can achieve.
NitpickLawyer · 1m ago
New arch seems cool, and it's amazing that we have these published in the open.

That being said, qwen models are extremely overfit. They can do some things well, but they are very limited in generalisation compared to closed models. I don't know if it's simply scale, or training recipes, or regimes. But if you test them out-of-distribution (OOD), the models utterly fail to deliver, where the closed models still provide value.

croemer · 10m ago
ERR_NAME_NOT_RESOLVED