Show HN: Play with an AI agent that debugs incidents in our sandbox
We’ve been working on an AI Agent that helps SRE/DevOps teams diagnose incidents without juggling dozens of dashboards. Instead of learning another tool, you can talk to it in plain English, and it correlates metrics, traces, and logs to explain what’s really happening.
Why we built this:
We’ve used a lot of observability tools ourselves as SREs, and we’ve also helped others set up observability systems. The idea of observability is great, but in practice we kept running into the same problems: a steep learning curve, and constant context-switching between tools when debugging.
We thought AI might help here: if an agent can correlate data from multiple sources, you can just ask questions in natural language and let it gather and analyze the data for you.
Right now we’re focused on incident analysis: when an incident happens, the AI Agent looks across metrics, logs, traces, and extra eBPF signals we collect, then tries to explain the causal chain leading to the issue. You can also add the agent to a Slack channel and chat with it directly during incidents.
It’s powered by eBPF for deep kernel-level visibility and an LLM-based agent for reasoning, and it integrates with your existing observability stack, so you don’t need to replace tools.
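To make the reasoning step concrete, here is a rough sketch of the correlation idea in Python. Every name here (the Signal type, the example evidence, the prompt wording) is hypothetical and for illustration only, not our actual implementation:

    from dataclasses import dataclass

    @dataclass
    class Signal:
        source: str   # e.g. "metrics", "logs", "traces", "ebpf"
        summary: str  # condensed evidence for the incident window

    def build_prompt(service: str, signals: list[Signal]) -> str:
        # Fold evidence from every source into one prompt for the LLM to reason over
        evidence = "\n".join(f"[{s.source}] {s.summary}" for s in signals)
        return (f"Service {service} is degraded. Given the evidence below, "
                f"explain the causal chain step by step:\n{evidence}")

    print(build_prompt("checkout-svc", [
        Signal("metrics", "p99 latency up 8x starting 14:02"),
        Signal("ebpf", "TCP retransmits spiking on node-3"),
        Signal("logs", "connection pool exhausted in checkout-svc"),
    ]))

In practice the agent gathers and analyzes the data itself, as described above; the sketch just shows the shape of the problem: turning scattered signals into a single causal explanation.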
To make it easy to try, we built a sandbox: https://sandbox.syn-cause.com
How to use the sandbox:
1. Deploy a test app
2. Inject failures (e.g. latency, errors, resource exhaustion; see the sketch after these steps)
3. Chat with the AI Agent as it analyzes the signals and explains the incident
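To give a feel for step 2, here is a toy failure injector in Python. It's purely illustrative (the sandbox injects faults for you, and these names are made up); it just shows the kind of latency and error faults the agent is then asked to explain:

    import random, time

    # Toy fault injector: wraps any handler with random latency and errors.
    # Hypothetical sketch, not how the sandbox implements fault injection.
    def flaky(handler, max_latency_s=2.0, error_rate=0.2):
        def wrapped(*args, **kwargs):
            time.sleep(random.uniform(0, max_latency_s))  # injected latency
            if random.random() < error_rate:              # injected errors
                raise RuntimeError("injected failure")
            return handler(*args, **kwargs)
        return wrapped

    # e.g. wrap a (hypothetical) request handler:
    checkout = flaky(lambda order_id: f"ok:{order_id}")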
We’d love your feedback:
* Does this approach feel useful for diagnosing incidents?
* What would you want to see it do better/differently?
* Any concerns about usability or integration?
Thanks!