Show HN: Aotol AI – Offline LLM app for iOS with voice and multilingual support
What it does:
- Fully offline LLM chat (works even with no signal)
- Multilingual text + voice (switch languages on the fly)
- Conversations stay on your phone (no data sent out)
How it works:
- Model: Llama 3.2 3B (quantized q4f16)
- Size: ~2 GB app bundle
- Inference: MLC-LLM + TVM runtime, optimized for iOS
- Average response: ~1–2 s/token on iPhone 15 Pro Max
- Voice chat: AVSpeechSynthesizer (TTS) + SFSpeechRecognizer (STT); rough sketch below
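For anyone curious, the voice layer is just those two system frameworks wired around the model. A minimal sketch of the loop, assuming a plain callback API — the class and method names are illustrative, not the app's actual code, and a real app also needs the mic and speech-recognition permission prompts:

    import AVFoundation
    import Speech

    // Minimal voice loop: SFSpeechRecognizer for speech-to-text,
    // AVSpeechSynthesizer for text-to-speech. Names are illustrative.
    final class VoiceChat {
        private let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))
        private let audioEngine = AVAudioEngine()
        private let synthesizer = AVSpeechSynthesizer()

        // Stream mic audio into the recognizer; deliver the final transcript.
        func listen(onFinal: @escaping (String) -> Void) throws {
            let request = SFSpeechAudioBufferRecognitionRequest()
            request.shouldReportPartialResults = true

            let input = audioEngine.inputNode
            let format = input.outputFormat(forBus: 0)
            input.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
                request.append(buffer)
            }
            audioEngine.prepare()
            try audioEngine.start()

            recognizer?.recognitionTask(with: request) { result, _ in
                if let result, result.isFinal {
                    self.audioEngine.stop()
                    input.removeTap(onBus: 0)
                    onFinal(result.bestTranscription.formattedString)
                }
            }
        }

        // Speak the model's reply; swapping the language code switches voices.
        func speak(_ text: String, language: String = "en-US") {
            let utterance = AVSpeechUtterance(string: text)
            utterance.voice = AVSpeechSynthesisVoice(language: language)
            synthesizer.speak(utterance)
        }
    }

The transcript can then feed the same prompt path as typed input, so voice and text share one pipeline.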
Why: I wanted to test if “desktop-grade” LLM experiences could run locally on a phone, both for privacy and offline availability.
Limitations:
- Accuracy is around 70% on general QA (small model, quantized)
- Long prompts slow generation down noticeably
- Memory footprint is tight on older devices
Download (iOS): https://apps.apple.com/app/aotol-ai-private-on-device-ai/id6...
I’d love feedback from anyone experimenting with:
- Smaller models on-device (sub-4B)
- Optimizing quantization for speed vs. accuracy
- UX patterns for chat when inference can stall (one strawman sketched below)
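To make that last point concrete, here's one strawman: stream tokens into the chat bubble, but arm a watchdog so the UI flips to an explicit "still thinking" state (spinner plus cancel) whenever no token lands for a few seconds. The AsyncStream stands in for whatever token stream the runtime exposes; every name here is hypothetical:

    import Foundation

    enum ChatState {
        case streaming(String)          // partial reply, tokens flowing
        case stalled(partial: String)   // nothing recently; show spinner + cancel
        case done(String)               // final reply
    }

    // Drive one chat turn; `tokens` stands in for the runtime's token stream.
    func runTurn(
        tokens: AsyncStream<String>,
        stallTimeout: Duration = .seconds(3),
        onState: @escaping @Sendable (ChatState) -> Void
    ) -> Task<Void, Never> {
        Task {
            var reply = ""
            var watchdog: Task<Void, Never>?

            // (Re)arm the stall timer; each new token cancels the old one.
            func armWatchdog(partial: String) {
                watchdog?.cancel()
                watchdog = Task {
                    try? await Task.sleep(for: stallTimeout)
                    if !Task.isCancelled { onState(.stalled(partial: partial)) }
                }
            }

            armWatchdog(partial: reply)
            for await token in tokens {
                if Task.isCancelled { break }
                reply += token
                onState(.streaming(reply))
                armWatchdog(partial: reply)
            }
            watchdog?.cancel()
            onState(.done(reply))
        }
    }

Returning the Task means a cancel button can just call .cancel() on it, which is the affordance I'm most unsure about — curious what others do here.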