Ask HN: Are you running local LLMs? What are your key use cases?

5 points by briansun | 8/8/2025, 2:07:05 PM | 1 comment
2025 feels like a breakout year for local models. Open‑weight releases are getting genuinely useful: from Google’s Gemma to recent *gpt‑oss* drops, the gap with frontier commercial models keeps narrowing for many day‑to‑day tasks.

Yet outside of this community, local LLMs still don’t seem mainstream. My hunch: *great UX and durable apps are still thin on the ground.*

If you are using local models, I’d love to learn from your setup and workflows. Please be specific so others can calibrate:

Model(s) & size: exact name/version, and quantization (e.g., Q4_K_M).

Runtime/tooling: e.g., Ollama, LM Studio, etc.

Hardware: CPU/GPU details (VRAM/RAM), OS. If it’s a laptop, edge device, or home server, mention that.

Workflows where local wins: privacy/offline, data security, coding, bulk data extraction, RAG over your files, agents/tools, screen-capture processing—what’s actually sticking for you?

Pain points: quality on complex reasoning, context management, tool reliability, long‑form coherence, energy/thermals, memory, Windows/Mac/Linux quirks.

Favorite app today: the one you actually open daily (and why).

Wishlist: the app you wish existed.

Gotchas/tips: config flags, quant choices, prompt patterns, or evaluation snippets that made a real difference.

If you’re not using local models yet, what’s the blocker—setup friction, quality, missing integrations, battery/thermals, or just “cloud is easier”? Links are welcome, but what helps most is concrete numbers and anecdotes from real use.

A simple reply template (optional):

```
Model(s):
Runtime/tooling:
Hardware:
Use cases that stick:
Pain points:
Favorite app:
Wishlist:
```

Also curious how people think about privacy and security in practice. Thanks!

Comments (1)

incomingpain · 1h ago
Python coding is practically the only use case for local models for me.

Cloud LLMs can run 1-trillion-parameter models and have all of Python knowledge behind transparent RAG over 100 Gbit or faster links. Of course they'll be the bestest on the block.

But the new GPT-OSS coding benchmarks are only barely behind Grok 4 or GPT-5 with high reasoning.

>Model(s) & size: exact name/version, and quantization (e.g., Q4_K_M).

My most reliable setup is Devstral + OpenHands: Unsloth Q6_K_XL, 85,000 context, flash attention, K-cache and V-cache quantized at Q8 (see the sketch below).

Second most reliable: GPT-OSS-20B + opencode. Default MXFP4. I can only load 31,000 context or it fails (still plenty, but hoping this bug gets fixed), and you can't use flash attention or K/V cache quantization or it becomes dumb as rocks. This Harmony stuff is annoying.

Still preliminary (just got it working today), but testing is really good: Qwen3-30b-a3b-thinking-2507 + Roo Code or Qwen Code, 80,000 context, Unsloth Q4_K_XL, flash attention, K-cache and V-cache quantized at Q8.
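
Those cache and attention settings translate fairly directly if you script against llama.cpp instead of a GUI. A minimal sketch with llama-cpp-python, assuming a local GGUF path; the parameter and constant names below follow llama.cpp's conventions, so treat them as assumptions and check your installed version:

```
# Rough mapping of the Devstral settings above onto llama-cpp-python.
# Model path, parameter names, and the Q8_0 constant are assumptions --
# verify against your installed llama-cpp-python before relying on them.
from llama_cpp import Llama, GGML_TYPE_Q8_0

llm = Llama(
    model_path="Devstral-Small-UD-Q6_K_XL.gguf",  # hypothetical local GGUF path
    n_ctx=85000,            # 85,000-token context window
    n_gpu_layers=-1,        # offload every layer that fits into VRAM
    flash_attn=True,        # flash attention
    type_k=GGML_TYPE_Q8_0,  # K-cache quantized at Q8
    type_v=GGML_TYPE_Q8_0,  # V-cache quantized at Q8
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a Python function that dedupes a list while keeping order."}],
    max_tokens=512,
)
print(out["choices"][0]["message"]["content"])
```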

>Runtime/tooling: e.g., Ollama, LM studio, etc.

LM Studio. I need Vulkan for my setup; ROCm is just a pain in the ass, and they need to support way more Linux distros.

24 GB VRAM.
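
If you'd rather drive the same local model from a script than a chat UI, LM Studio can run an OpenAI-compatible server (default port 1234), so the standard openai Python client works unchanged. A minimal sketch; the model identifier below is a placeholder, use whatever id your local server lists:

```
# Minimal sketch: call a local model through LM Studio's OpenAI-compatible
# server. localhost:1234 is LM Studio's default port; the model name is a
# placeholder -- match it to whatever your local server reports.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")  # key is ignored locally

resp = client.chat.completions.create(
    model="devstral-small",  # placeholder id
    messages=[
        {"role": "system", "content": "You are a concise Python coding assistant."},
        {"role": "user", "content": "Turn this nested loop into a list comprehension: ..."},
    ],
    temperature=0.2,
)
print(resp.choices[0].message.content)
```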