WebGPU enables local LLM in the browser. Demo site with AI chat

27 points by andreinwald | 11 comments | 8/2/2025, 2:09:12 PM | andreinwald.github.io

Comments (11)

maxmcd · 1h ago
Looks like this is a wrapper around: https://github.com/mlc-ai/web-llm

Which has a full web demo: https://chat.webllm.ai/

refulgentis · 19m ago
I am glad to see it regardless - the project has seen very little activity for months. Just last night I was thinking about ripping it out before launch. No observable future.
petermcneeley · 12m ago
This demo only works if you have the WebGPU feature "f16". You can find out if you have it by checking the feature list at https://webgpureport.org/. The page itself could of course check for this, but since f16 support is common they probably just didn't bother.
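
A page can run that probe itself with the standard WebGPU API; the minimal sketch below assumes the feature string "shader-f16", which is how current browsers expose f16 shader support.

    // Probe for WebGPU and the "shader-f16" feature from inside the page.
    async function checkF16Support() {
      if (!("gpu" in navigator)) {
        console.log("WebGPU is not available in this browser.");
        return false;
      }
      const adapter = await navigator.gpu.requestAdapter();
      if (!adapter) {
        console.log("No WebGPU adapter found.");
        return false;
      }
      const hasF16 = adapter.features.has("shader-f16");
      console.log(hasF16 ? "shader-f16 is supported." : "shader-f16 is missing.");
      return hasF16;
    }

To actually use f16 shaders, the feature also has to be listed in requiredFeatures when calling adapter.requestDevice().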
andreinwald · 1h ago
A browser LLM demo built with JavaScript and WebGPU. WebGPU is already supported in Chrome, Safari, Firefox, iOS (v26), and Android.

Demo, similar to ChatGPT https://andreinwald.github.io/browser-llm/

Code https://github.com/andreinwald/browser-llm

- No need to use your OPENAI_API_KEY - it's a local model that runs on your device (see the sketch after this list)

- No network requests to any API

- No need to install any program

- No need to download files to your device (the model is cached in the browser)

- The site asks before downloading large files (the LLM model) to the browser cache

- Hosted on GitHub Pages from this repo - secure, because you can see what you are running
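
Given maxmcd's note above that this wraps @mlc-ai/web-llm, here is a minimal sketch of what such an in-browser chat can look like with that library; the model id and progress logging are illustrative assumptions, not the demo's actual code.

    // Minimal in-browser chat sketch using @mlc-ai/web-llm (assumed dependency).
    import { CreateMLCEngine } from "@mlc-ai/web-llm";

    async function runLocalChat() {
      // First visit downloads the model weights and caches them in the browser;
      // later visits load from the cache instead of the network.
      // The model id is illustrative; q4f16_1 variants need the "shader-f16" feature.
      const engine = await CreateMLCEngine("Llama-3.2-1B-Instruct-q4f16_1-MLC", {
        initProgressCallback: (report) => console.log(report.text),
      });

      // OpenAI-style chat completion, computed locally on the GPU via WebGPU.
      const reply = await engine.chat.completions.create({
        messages: [{ role: "user", content: "Hello from the browser!" }],
      });
      console.log(reply.choices[0].message.content);
    }

Everything runs client-side, which is why no API key or server round trip is needed.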

cgdl · 2m ago
Which model does the demo use?
asim · 17m ago
What's the performance of a model like this vs the OpenAI API? What's the comparable here? Edit: I see these are the same models you'd run locally with Ollama or something similar, so it's basically just constrained by the size of the model, the GPU, and the performance of the machine.
pjmlp · 41m ago
Beware of opening this on mobile Internet.
lukan · 14m ago
Well, I am on mobile right now; can someone share anything about the performance?
andreinwald · 39m ago
The demo site asks before downloading.
andsoitis · 1h ago
Very cool. An improvement would be to keep the input text box always on screen, rather than having to scroll down manually as the screen fills.