Show HN: Inworld Runtime – A C++ graph-based runtime for production AI apps
We built it to solve the common problem we and our customers had: engineers spend more time on AI ops and plumbing than on actual feature development. This was often due to the challenge of using Python for I/O-bound, high-concurrency workloads and complexity maintaining pipelines with streams that use always-changing ML models.
Our solution is a high-performance runtime written in C++ with the core idea of defining AI logic as graphs. For instance, a basic voice-to-voice agent consists of STT → LLM → TTS nodes, while the connecting edges stream data and enforce conditions. This graph engine is portable (Linux, Windows, macOS) and can run on-device.
We built a few key features on top of this C++ core:
- Extensions. Runtime architecture decouples graph definition from implementation. If a pre-built component doesn't exist, you can register your own custom node/code and reuse it in any graph without writing any glue code.
- Routers. You can dynamically select models/settings on the per-node basis depending on the traffic as well as configure policies for fallbacks and retries to get the app ready for production.
- The Portal. A web-based control plane UI to deploy graphs, push config changes instantly, run A/B tests on live traffic, and monitor your app with logs, traces, and metrics.
- Unified API. Use our optimized models or route to providers like OpenAI, Anthropic, and Google through a single, consistent interface and one API key.
We have a Node.js SDK out now, with Python, Unity, Unreal, and native C++ coming soon. We plan to open-source the SDKs, starting with Node.js.
The docs are here: https://docs.inworld.ai/docs/runtime/overview
We're eager for feedback from fellow engineers and builders. What do you think?
No comments yet