Version of OpenAI's new open-source 20B model, optimized to run on Mac (MLX)

1 point by matznerd · 8/6/2025, 9:40:41 PM · huggingface.co

Comments (1)

matznerd · 7h ago
These are 8-bit versions. On Mac, use LM Studio to download them: just search "oss mlx", and note there is an MLX toggle box in the search.

Link for the 120B version: https://huggingface.co/lmstudio-community/gpt-oss-120b-MLX-8...

It's taking 21 GB of memory on my 64 GB MacBook Pro; I'm still tuning it and settling on context size, temperature, and other settings.

My comment from yesterday:

"thanks openai for being open ;) Surprised there are no official MLX versions and only one mention of MLX in this thread. MLX basically converst the models to take advntage of mac unified memory for 2-5x increase in power, enabling macs to run what would otherwise take expensive gpus (within limits). So FYI to any one on mac, the easiest way to run these models right now is using LM Studio (https://lmstudio.ai/), its free. You just search for the model, usually 3rd party groups mlx-community or lmstudio-community have mlx versions within a day or 2 of releases. I go for the 8-bit quantizations (4-bit faster, but quality drops). You can also convert to mlx yourself...

Once you have it running in LM Studio, you can chat there in the built-in chat interface, or hit it through the local API server, which defaults to http://127.0.0.1:1234 (quick example below).
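For instance, the local server speaks an OpenAI-compatible API under /v1; the model name below is a placeholder, so use whatever identifier LM Studio shows for your loaded model:

    # Minimal sketch: LM Studio's local server exposes an OpenAI-compatible
    # API at http://127.0.0.1:1234/v1 by default.
    import requests

    resp = requests.post(
        "http://127.0.0.1:1234/v1/chat/completions",
        json={
            "model": "gpt-oss-20b-mlx-8bit",  # placeholder identifier
            "messages": [{"role": "user", "content": "Say hi in five words."}],
            "temperature": 0.7,
        },
        timeout=120,
    )
    print(resp.json()["choices"][0]["message"]["content"])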

You can keep multiple models downloaded, hot-swap between them, and they load almost instantly.

It's surprisingly easy, and fun. There are actually a lot of cool niche models coming out, like this tiny high-quality search model released today (whose creators put out an official MLX version): https://huggingface.co/Intelligent-Internet/II-Search-4B

Other fun ones are Gemma 3n, which is multimodal; the new Qwen3 30B A3B (Coder and Instruct variants), a larger one that is actually a solid model but takes more memory; Pixtral (Mistral's vision model, with full-resolution images); etc. Looking forward to playing with this model and seeing how it compares."