Qwen3-Coder-30B-A3B-Instruct

29 points by swesnow · 10 comments · 7/31/2025, 2:44:01 PM · huggingface.co ↗

Comments (10)

danielhanchen · 11h ago
I uploaded quantized and full precision GGUFs for local llama.cpp inference to https://huggingface.co/unsloth/Qwen3-Coder-30B-A3B-Instruct-.... Docs to run them: https://docs.unsloth.ai/basics/qwen3-coder-how-to-run-locall...

Also fixed tool calling for the 480B Coder https://docs.unsloth.ai/basics/qwen3-coder-how-to-run-locall... and made 1 million context ones as well.

matznerd · 10h ago
Love the work you're doing at Unsloth!
danielhanchen · 3h ago
Oh thank you! :)
incomingpain · 8h ago
Huge high five!
danielhanchen · 3h ago
:)
d_sc · 3h ago
Between this and models like exaone 4 32b reasoning is it possible to have a TUI setup similar to Claude code?
matznerd · 10h ago
If you're on Mac you can download LM Studio and get the MLX version (edit - links below). I am running it on 64 gb M1 and it takes about ~30 gb ram. I've been on the hunt for a local orchestrator model that interprets input with speech to text (STT) from WhisperX, then can decide what to do. I have only been running it for a day, but it may be overkill for my setup.

For simple tasks it can respond quickly, and it understands when to use MCP servers for tasks and other things, while offloading all the heavy lifting to Claude Code via its SDK and CLI, then bringing the results back as a summary or as clarifying questions via text-to-speech (TTS). I'm playing with Kyutai TTS b/c they have great models that sound real and can do conversational streaming with VAD (though my MBP is too slow with it for now, but see https://unmute.sh/ for a demo).

I'm looking for an orchestrator model that runs in 10-15 GB of RAM and can do really good tool calling and model routing. I will likely move to something even smaller designed specifically for this, like Jan Nano, and then spin up an intermediate model like Qwen if needed, or try a smaller Qwen. https://github.com/menloresearch/jan?tab=readme-ov-file
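The routing decision described above (answer simple queries locally, hand multi-step coding work off to a heavier agent like Claude Code) could be sketched roughly like this. This is a hypothetical illustration with made-up heuristics and names, not any real project's API:

```python
# Hypothetical orchestrator routing sketch: classify an incoming request and
# decide whether the small local model answers it directly or the task is
# delegated to a heavier coding agent. Keywords and names are illustrative.
from dataclasses import dataclass

# Phrases that suggest a multi-step coding task worth delegating (assumption)
HEAVY_HINTS = ("refactor", "debug", "write a", "implement", "fix the build")

@dataclass
class Route:
    target: str   # "local" or "delegate"
    reason: str

def route_request(text: str) -> Route:
    lowered = text.lower()
    if any(hint in lowered for hint in HEAVY_HINTS):
        return Route("delegate", "looks like a multi-step coding task")
    return Route("local", "simple query, answer directly")

print(route_request("What's on my calendar today?").target)          # local
print(route_request("Refactor the auth module to use JWT").target)   # delegate
```

In practice the routing would itself be a tool call made by the small orchestrator model rather than a keyword match, but the control flow is the same.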

Ultimately, I want something that can see my screen, know what's going on, have full context, and be live, so I was excited about Gemma 3n multimodal, but it's not really fully available yet, at least with vision for MLX. https://deepmind.google/models/gemma/gemma-3n/

The next 6 months in this area are going to be pretty wild, though.

edit: Fixed links below, thanks

https://huggingface.co/lmstudio-community/Qwen3-Coder-30B-A3... https://huggingface.co/mlx-community/Qwen3-Coder-30B-A3B-Ins...

DarmokJalad1701 · 9h ago
bijant · 11h ago
Trying this out for the last few minutes feels like how Unix admins must have felt when they first used Linux. Sure, it's still a bit rough around the edges, but you instantly realize that it's just a question of time before it's "game over" for all commercial Unix vendors.
incomingpain · 8h ago
Trying this out.

Qwen Code just straight up fails to use it. It fails to use tools and crashes. I dunno.

Aider was a huge waste of time.

Openhands is working really well but occasional:

Parameter 'command=str_replace</parameter' is not allowed for function 'str_replace_editor'. Allowed parameters: {'view_range', 'path', 'command', 'old_str', 'insert_line', 'file_text', 'new_str'}
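The error above is the model leaking a closing-tag fragment (`</parameter`) into the parameter string, so the harness sees `command=str_replace</parameter` as the parameter name and rejects it. A harness could defensively strip that residue before validating; this is a hypothetical sketch, not OpenHands' actual code:

```python
# Hypothetical sanitizer for a malformed "name=value" tool-call parameter in
# which the model leaked an unclosed XML-ish fragment like "</parameter".
# The allowed-parameter set matches the error message above.
import re

ALLOWED = {"view_range", "path", "command", "old_str",
           "insert_line", "file_text", "new_str"}

def parse_param(raw: str) -> tuple[str, str]:
    name, _, value = raw.partition("=")
    # Drop a trailing, unclosed closing-tag fragment leaked by the model
    value = re.sub(r"</[A-Za-z_]*$", "", value)
    if name not in ALLOWED:
        raise ValueError(f"parameter {name!r} not allowed")
    return name, value

print(parse_param("command=str_replace</parameter"))  # ('command', 'str_replace')
```

A stricter alternative is to reject the call and feed the validation error back to the model, which usually self-corrects on the retry.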

The jury is still out on whether this is better than Devstral.