What do y'all think – weekend project

Venkymatam · 5/18/2025, 1:11:38 AM
Today, many software teams add AI features to their apps (customer support bots, writing tools, internal copilots) by writing prompts directly into their code. These prompts tell the model what to say and do, but once the product is live there's no visibility into what the AI is actually telling users, how much it is costing, or when things silently go wrong: hallucinations, tone drift, token overuse.

I'm hoping to build a solution that helps teams keep these AI features healthy and reliable in production. Teams would get a central database for all their prompts, test different versions across multiple AI models, compare costs and outputs, and, most importantly, evaluate the "human touch" of the responses. The platform would support A/B testing across prompt versions to identify which responses perform best, whether measured by marketing impact, sales conversion, engagement, or overall usage. It would track every AI response, detect unusual or risky behavior, and suggest, or even apply, fixes automatically. Think of it as a real-time quality-control system for the AI layer of your product.

The system would be powered by lightweight autonomous agents that watch every model call, flag anomalies, and make context-aware recommendations, or take direct action when it's safe to do so. These agents would monitor prompt behavior over time, compare version performance, and optimize for clarity, safety, and cost. Technically, it's a real-time observability and correction runtime: Datadog plus LaunchDarkly, but built specifically for managing AI prompts and agentic behavior in production.
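To make the "watch every model call" piece concrete, here is a minimal sketch of the kind of wrapper such an agent might sit behind. Everything here is an assumption for illustration: the class and field names, the thresholds, the per-token pricing table, and the crude token estimate are all hypothetical, and the actual provider SDK is left as a callable the team already has.

```python
# Hypothetical sketch: wrap each model call, record prompt/response metadata,
# estimate cost, and flag simple anomalies. Names, prices, and thresholds are
# illustrative assumptions, not a real API.
import time
from dataclasses import dataclass, field
from typing import Callable, List

# Illustrative per-1K-token prices; real values depend on the provider/model.
PRICE_PER_1K = {"model-a": 0.0015, "model-b": 0.03}

@dataclass
class CallRecord:
    prompt_version: str
    model: str
    response: str
    latency_s: float
    tokens_in: int
    tokens_out: int
    cost_usd: float
    flags: List[str] = field(default_factory=list)

class PromptMonitor:
    """In-memory stand-in for the central prompt/response store."""

    def __init__(self, max_cost_usd: float = 0.05, max_tokens_out: int = 800):
        self.records: List[CallRecord] = []
        self.max_cost_usd = max_cost_usd
        self.max_tokens_out = max_tokens_out

    def watch(self, prompt_version: str, model: str, prompt: str,
              call_model: Callable[[str, str], str]) -> CallRecord:
        start = time.time()
        response = call_model(model, prompt)  # whatever provider SDK the team uses
        latency = time.time() - start

        # Crude ~4-chars-per-token estimate; a real system would use a tokenizer.
        tokens_in = max(1, len(prompt) // 4)
        tokens_out = max(1, len(response) // 4)
        cost = (tokens_in + tokens_out) / 1000 * PRICE_PER_1K.get(model, 0.01)

        record = CallRecord(prompt_version, model, response, latency,
                            tokens_in, tokens_out, cost)

        # Simple anomaly flags; the "agents" would add richer, context-aware checks.
        if cost > self.max_cost_usd:
            record.flags.append("cost_over_budget")
        if tokens_out > self.max_tokens_out:
            record.flags.append("response_too_long")
        if not response.strip():
            record.flags.append("empty_response")

        self.records.append(record)  # would be a database write in production
        return record
```

Usage would look something like `monitor.watch("v2", "model-a", prompt, call_model=my_provider_call)`, where `my_provider_call` is a thin function around the team's existing SDK; the anomaly-detection and A/B-comparison logic would then read from `monitor.records`.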

Comments (2)

airylizard · 6h ago
I like the idea; the TSCE framework should make the individual agents more reliable and deterministic: https://github.com/AutomationOptimization/tsce_demo
Venkymatam · 12m ago
Thanks for sharing this! I appreciate it. In your opinion, is it good enough for YC?