Show HN: A private, flat monthly subscription for open-source LLMs

12 points by reissbaker | 4 comments | 8/28/2025, 7:03:24 PM | synthetic.new
Hey HN! We've run our privacy-focused open-source inference company for a while now, and we're launching a flat monthly subscription similar to Anthropic's. It should work with Cline, Roo, KiloCode, Aider, and so on; any OpenAI-compatible API client should do. The rate limits at every tier are higher than Claude's, so even if you prefer using Claude, it's a helpful backup for when you get rate limited, at a pretty low price. Let me know if you have any feedback!
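
If it helps, here's a rough sketch of what "OpenAI-compatible" means in practice, using the official OpenAI Python SDK pointed at a custom base URL. The URL, env var, and model name below are placeholders, not our real endpoints; check the docs for the actual values:

    import os
    from openai import OpenAI

    # Any OpenAI-compatible endpoint works by overriding the base URL.
    # Placeholder values: swap in the real endpoint, key, and model name.
    client = OpenAI(
        base_url="https://example-inference-provider.com/v1",  # placeholder URL
        api_key=os.environ["INFERENCE_API_KEY"],               # placeholder env var
    )

    response = client.chat.completions.create(
        model="glm-4.5",  # placeholder model identifier
        messages=[{"role": "user", "content": "Write a haiku about rate limits."}],
    )
    print(response.choices[0].message.content)

Coding tools like Cline or Aider do essentially the same thing under the hood: point them at the base URL and an API key, and they should just work.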

Comments (4)

logicprog · 2h ago
I was literally just wishing for something like this; this is perfect! Do you do prompt caching?
reissbaker · 2h ago
Aw thanks! We don't currently, but from a cost perspective as a user it shouldn't matter much, since it's all bundled into the same subscription (we rate-limit by requests, not by tokens; our request rate limits are set higher than the number of messages per hour that Claude Code promises, haha). We might add it at some point just to save GPUs though!
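
For anyone curious what request-based limiting looks like conceptually, here's a toy fixed-window sketch; the numbers and names are made up and this isn't our actual implementation:

    import time
    from collections import defaultdict

    # Illustrative fixed-window, per-request limiter: every request costs 1,
    # no matter how many tokens it uses. Limits here are hypothetical.
    WINDOW_SECONDS = 3600
    MAX_REQUESTS_PER_WINDOW = 300  # made-up hourly cap

    _counts = defaultdict(int)  # (subscription_id, window_start) -> request count

    def allow_request(subscription_id: str, now: float | None = None) -> bool:
        now = time.time() if now is None else now
        window_start = int(now // WINDOW_SECONDS)
        key = (subscription_id, window_start)
        if _counts[key] >= MAX_REQUESTS_PER_WINDOW:
            return False
        _counts[key] += 1
        return True

A token-based limiter would increment by each request's token count instead, which is exactly the model we're avoiding.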
logicprog · 42m ago
Yeah, I wasn't so much worried about costs to me as about the sustainability of your own pricing; I don't want to run into a "we're lowering quotas" situation like CC did :P
reissbaker · 14m ago
Lol fair! I think we're safe for now; our most popular model (and my personal favorite coding model) is GLM-4.5, which fits on a relatively small node compared to the rumored sizes of Anthropic's models. We can throw a lot of tokens at it before running into issues. It's actually kind of nice to launch without prompt caching, since it means that if we're flying too close to the sun on tokens, we still have some pretty large levers to pull on the infra side before needing to do anything drastic with rate limits.