Ask HN: Which laptop can run the largest LLM?

3 points by grokblah · 4 comments · 8/14/2025, 4:30:31 PM
I’d like to experiment with LLMs locally and understand their infrastructure better.

Comments (4)

PaulHoule · 3h ago
Don’t the M-series processors in MacBook Pros have a huge amount of high-bandwidth unified memory, which is good for models? I see you can get a Pro with 48GB of unified memory, whereas Alienware will sell you a machine with 32GB of regular RAM and 24GB of graphics RAM on a 5090 discrete GPU. So the Pro has twice the RAM accessible to the GPU.
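
For scale, weight memory is roughly parameter count times bytes per weight. Here is a back-of-the-envelope sketch in Python; the bytes-per-weight figures are approximate GGUF quantization sizes and the 15% overhead factor is an assumption, not a measured number:

    # Rough weight-memory estimate: params (billions) * bytes/weight * overhead.
    # Bytes/weight values are approximate GGUF quantization sizes (assumptions).
    BYTES_PER_WEIGHT = {"fp16": 2.0, "q8_0": 1.07, "q4_k_m": 0.57}

    def est_gb(params_b, quant, overhead=1.15):
        # params are in billions and output is in GB, so the 1e9 factors cancel
        return params_b * BYTES_PER_WEIGHT[quant] * overhead

    for quant in BYTES_PER_WEIGHT:
        print(f"70B @ {quant}: ~{est_gb(70, quant):.0f} GB")
    # fp16 ~161 GB, q8_0 ~86 GB, q4_k_m ~46 GB: a Q4 70B just barely fits in
    # 48 GB of unified memory (before KV cache), and not in a 24 GB GPU.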
grokblah · 59m ago
Looks like the MacBook Pro might be more cost-effective? I like the support for larger models. Thanks!
incomingpain · 3h ago
https://rog.asus.com/us/laptops/rog-flow/rog-flow-z13-2025/s...

The out-of-stock one has 128GB of unified system RAM, on AMD's Ryzen AI Max+ 395 chip.

So you can easily run 70B models in that much VRAM, just slower, probably in the 30-40 tokens/s range, which is very usable.

Qwen3 30B will be in the 60 tokens/s range.

Llama 4 Scout will be around 20-30 tokens/s.
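
Those numbers follow from memory bandwidth: decoding streams the active weights once per generated token, so tokens/s is capped near bandwidth divided by active-weight bytes. A quick sketch, assuming the commonly cited ~256 GB/s for this chip and ~0.57 bytes/weight at Q4 (both assumptions):

    # Upper bound on decode speed: memory bandwidth / bytes read per token.
    BW_GBPS = 256  # commonly cited Strix Halo bandwidth (assumption)

    def tps_ceiling(active_params_b, bytes_per_weight=0.57):
        # MoE models only stream their active experts' weights each token
        return BW_GBPS / (active_params_b * bytes_per_weight)

    print(f"Qwen3 30B-A3B (~3B active):  <= {tps_ceiling(3):.0f} tok/s")
    print(f"Llama 4 Scout (~17B active): <= {tps_ceiling(17):.0f} tok/s")
    print(f"Dense 70B:                   <= {tps_ceiling(70):.0f} tok/s")

Real throughput lands below these ceilings, and a dense 70B is bandwidth-limited to single digits here, so the faster figures depend heavily on MoE sparsity and aggressive quantization.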

grokblah · 53m ago
Interesting, I found it on Amazon for $5k: https://a.co/d/h085rvP

That’s the same price as an M4 Max MBP with the same RAM and storage. Any idea how they compare in performance?
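
One way to get an apples-to-apples answer is to run the same GGUF on both machines and time it. A minimal sketch with llama-cpp-python (the model filename is a placeholder, and it assumes the wheel was built with the right GPU backend: Metal on the Mac, Vulkan or ROCm on the AMD):

    # Time generation of the same quantized model on each laptop.
    import time
    from llama_cpp import Llama  # pip install llama-cpp-python

    llm = Llama(
        model_path="qwen3-30b-a3b-q4_k_m.gguf",  # placeholder filename
        n_gpu_layers=-1,  # offload every layer to the GPU
        verbose=False,
    )

    t0 = time.perf_counter()
    out = llm("Explain unified memory in one paragraph.", max_tokens=256)
    dt = time.perf_counter() - t0

    n_tok = out["usage"]["completion_tokens"]
    print(f"{n_tok} tokens in {dt:.1f}s -> {n_tok / dt:.1f} tok/s")

This lumps prompt processing in with generation; llama.cpp's bundled llama-bench tool reports the two separately if you want cleaner numbers.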