Version of OpenAI's new open-source 20B model, optimized to run on Mac (MLX)

1 point by matznerd · 8/6/2025, 9:40:41 PM · huggingface.co

Comments (1)

matznerd · 7h ago
These are 8-bit versions. On Mac, use LM Studio to download them: just search "oss mlx", and note there is an MLX toggle box in the search.

Link for the 120B version: https://huggingface.co/lmstudio-community/gpt-oss-120b-MLX-8...

It's taking 21 GB of memory on my 64 GB MacBook Pro; I'm still tuning it and settling on context size, temperature, and other settings.

My comment from yesterday:

"thanks openai for being open ;) Surprised there are no official MLX versions and only one mention of MLX in this thread. MLX basically converst the models to take advntage of mac unified memory for 2-5x increase in power, enabling macs to run what would otherwise take expensive gpus (within limits). So FYI to any one on mac, the easiest way to run these models right now is using LM Studio (https://lmstudio.ai/), its free. You just search for the model, usually 3rd party groups mlx-community or lmstudio-community have mlx versions within a day or 2 of releases. I go for the 8-bit quantizations (4-bit faster, but quality drops). You can also convert to mlx yourself...

Once you have it running in LM Studio, you can chat there in the built-in chat interface, or hit it through the local API server, which defaults to http://127.0.0.1:1234 (quick example below).
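For instance, the local server speaks an OpenAI-compatible API under /v1; the model name below is a placeholder, so use whatever identifier LM Studio shows for your loaded model:

    # Minimal sketch: LM Studio's local server exposes an OpenAI-compatible
    # API at http://127.0.0.1:1234/v1 by default.
    import requests

    resp = requests.post(
        "http://127.0.0.1:1234/v1/chat/completions",
        json={
            "model": "gpt-oss-20b-mlx-8bit",  # placeholder identifier
            "messages": [{"role": "user", "content": "Say hi in five words."}],
            "temperature": 0.7,
        },
        timeout=120,
    )
    print(resp.json()["choices"][0]["message"]["content"])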

You can keep multiple models downloaded, hot-swap between them, and they load almost instantly.

It's surprisingly easy, and fun. There are actually a lot of cool niche models coming out, like this tiny high-quality search model released today (whose creators put out an official MLX version): https://huggingface.co/Intelligent-Internet/II-Search-4B

Other fun ones are Gemma 3n, which is multimodal; the new Qwen3 30B A3B (Coder and Instruct variants), a larger one that is actually a solid model but takes more memory; Pixtral (Mistral's vision model, with full-resolution images); etc. Looking forward to playing with this model and seeing how it compares."