Show HN: OfflineLLM: Live Voice Chat with DeepSeek, Llama on iOS and VisionOS
4 bilaal_dc5631 2 6/1/2025, 7:47:28 PM offlinellm.bilaal.co.uk
Hi, this is something I've been working on for the past 18 months. There's an abundance of tools for running LLMs locally on desktops (e.g. ollama, LM Studio), but other devices have been left out. This project brings those models to iOS and visionOS, and it has turned out to work really well. Even an iPhone 14 Pro can quite easily run the 3B-parameter version of Llama 3.2, and CLIP models work well too!
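For a rough sense of why that fits: a 3B model quantized to ~4 bits per weight is about 3 × 10⁹ × 0.5 bytes ≈ 1.5 GB of weights (closer to 2 GB with overhead), which sits comfortably in the 14 Pro's 6 GB of RAM with headroom left for the KV cache.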
It also has a Live Voice Chat mode that gives a two-way conversation experience, similar to Google's cloud-based Gemini Live feature.
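In broad strokes, the voice loop is speech-to-text → model → text-to-speech. Here's a simplified sketch of how such a loop can be wired up with Apple's Speech and AVFoundation frameworks; the `LocalLLM` protocol and `complete(prompt:)` call are hypothetical stand-ins for the real inference layer, and the actual app may do this differently:

    import Speech
    import AVFoundation

    // Hypothetical stand-in for the on-device inference layer (the real
    // app wraps its llama.cpp fork here).
    protocol LocalLLM {
        func complete(prompt: String) -> String
    }

    // Simplified loop: microphone -> Speech framework -> local model ->
    // AVSpeechSynthesizer. Audio session setup, permissions and error
    // handling are omitted.
    final class VoiceChat {
        private let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-GB"))!
        private let synthesizer = AVSpeechSynthesizer()
        private let audioEngine = AVAudioEngine()
        private let llm: LocalLLM

        init(llm: LocalLLM) { self.llm = llm }

        func startListening() throws {
            let request = SFSpeechAudioBufferRecognitionRequest()
            let input = audioEngine.inputNode
            input.installTap(onBus: 0, bufferSize: 1024,
                             format: input.outputFormat(forBus: 0)) { buffer, _ in
                request.append(buffer)
            }
            audioEngine.prepare()
            try audioEngine.start()

            recognizer.recognitionTask(with: request) { [weak self] result, _ in
                guard let result, result.isFinal else { return }
                self?.respond(to: result.bestTranscription.formattedString)
            }
        }

        private func respond(to transcript: String) {
            let reply = llm.complete(prompt: transcript)   // hypothetical API
            let utterance = AVSpeechUtterance(string: reply)
            utterance.voice = AVSpeechSynthesisVoice(language: "en-GB")
            synthesizer.speak(utterance)
        }
    }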
Under the hood it can run most GGUF models, using a heavily forked and diverged version of llama.cpp, which has improved performance on mobile devices.
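To give a sense of the bridge, loading a model through llama.cpp's C API looks roughly like this from Swift. These are upstream function names, which have shifted between llama.cpp versions, and my fork has diverged, so treat this purely as a sketch:

    // Bridged llama.cpp C API (via a module map or bridging header).
    func loadModel(at path: String) -> OpaquePointer? {
        llama_backend_init()

        var params = llama_model_default_params()
        // Offload as many layers as Metal can take; llama.cpp clamps
        // this to the model's actual layer count.
        params.n_gpu_layers = 99

        return llama_load_model_from_file(path, params)
    }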
The next step is to integrate Apple's on-device 3B model, which they will hopefully open up access to at WWDC in a week's time. I'm also in the middle of adding support for Gemma 3 and Qwen 3.
Let me know what you think!
Wow. I never thought a non-Apple Intelligence phone would be able to run this. Does the phone get hot at all?
Also, how long did it take you to build this and how easy is it to test this in Xcode?
> Does the phone get hot at all?
The heat is reasonable, similar to what you'd get playing an intensive game. If you're sensible with it, it's perfectly usable.
> how long did it take you to build this
I first started in 2023 and managed to get an MVP out the same year. That was pretty basic and a lot of work has been done since. I don't have an accurate measure of how much time has been spent, but it's had a lot of my attention since I released the first MVP.
> how easy is it to test this in Xcode?
This is pretty nice actually. It runs absolutely fine in the simulator, which is where I do most of my testing. The only time I have to move to a physical device is for performance testing, which isn't a huge drain on productivity.
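For context, the performance checks are basically XCTest measure blocks around the inference call, something like the sketch below. The `LocalLLM` type here is a hypothetical stand-in for the app's real wrapper, and the model path is made up:

    import XCTest

    // Hypothetical stand-in for the app's real inference wrapper.
    struct LocalLLM {
        init(modelPath: String) { /* load the GGUF here */ }
        func complete(prompt: String) -> String { "" }
    }

    final class InferencePerfTests: XCTestCase {
        // measure {} runs the block several times and reports the
        // average wall-clock time. On the simulator this exercises the
        // Mac's hardware, not the phone's, hence the physical device.
        func testCompletionLatency() {
            let llm = LocalLLM(modelPath: "model.gguf") // hypothetical path
            measure {
                _ = llm.complete(prompt: "Explain GGUF in one sentence.")
            }
        }
    }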