Show HN: AgentCheck – Snapshot and Replay AI Agents Like Real Software

1 hvardhan878 0 7/2/2025, 1:18:26 PM github.com ↗

Hey HN,

I built AgentCheck, an open-source testing tool for LLM agents. It lets you:

Snapshot full agent runs (prompt, LLM calls, tool outputs, final answer)

Replay the trace locally — no API calls, no token costs

Diff agent behavior over time

Assert outputs to catch regressions
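
To make the snapshot and replay steps concrete, here is a minimal sketch of the general idea. It is not AgentCheck's actual API: `Recorder`, `Replayer`, `fake_llm`, and `agent` are hypothetical names made up for illustration, and the trace is assumed to be a plain JSON list of input/output pairs.

```python
import json
import tempfile
from pathlib import Path


class Recorder:
    """Wraps a callable and appends each call's input and output to a trace."""

    def __init__(self, fn):
        self.fn = fn
        self.steps = []

    def __call__(self, prompt):
        output = self.fn(prompt)
        self.steps.append({"input": prompt, "output": output})
        return output

    def save(self, path):
        Path(path).write_text(json.dumps(self.steps, indent=2))


class Replayer:
    """Serves recorded outputs in order instead of calling the real function."""

    def __init__(self, path):
        self.steps = json.loads(Path(path).read_text())
        self.cursor = 0

    def __call__(self, prompt):
        if self.cursor >= len(self.steps):
            raise RuntimeError("replay ran past the end of the recorded trace")
        step = self.steps[self.cursor]
        if step["input"] != prompt:
            raise ValueError(f"trace mismatch at step {self.cursor}: {prompt!r}")
        self.cursor += 1
        return step["output"]


def fake_llm(prompt):
    # Hypothetical stand-in for a paid model call.
    return prompt.upper()


def agent(llm, question):
    # A toy two-step agent: plan first, then answer using the plan.
    plan = llm(f"plan: {question}")
    return llm(f"answer using {plan}")


# Record one live run, then replay it without touching the "model".
trace_path = Path(tempfile.mkdtemp()) / "trace.json"
recorder = Recorder(fake_llm)
live_answer = agent(recorder, "what is 2+2?")
recorder.save(trace_path)

replayed_answer = agent(Replayer(trace_path), "what is 2+2?")
```

The replayer fails loudly when the agent asks something the trace did not record, which is how a cassette-style tool surfaces a changed prompt instead of silently making a new API call.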

Why? Because today, most AI agents are tested by spot-checking outputs or rerunning flaky evals — which breaks CI, costs money, and misses edge cases. AgentCheck works more like Jest or VCR.py, but for LLM workflows. It records and replays traces so you can test agents like real software.
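
The diff and assert steps can be sketched the same way, in the spirit of a snapshot test. This is an illustration under assumptions, not the tool's real interface: `diff_traces` and `assert_no_regression` are hypothetical helpers, and the two traces are invented sample data.

```python
import difflib
import json


def diff_traces(old, new):
    """Return unified-diff lines between two traces (lists of step dicts)."""
    old_lines = json.dumps(old, indent=2, sort_keys=True).splitlines()
    new_lines = json.dumps(new, indent=2, sort_keys=True).splitlines()
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile="baseline", tofile="current", lineterm=""
        )
    )


def assert_no_regression(old, new):
    """Raise AssertionError with a readable diff if the traces differ."""
    changes = diff_traces(old, new)
    if changes:
        raise AssertionError("agent behavior changed:\n" + "\n".join(changes))


# Invented sample traces: same tool call, different final answer.
baseline = [{"tool": "search", "output": "Paris"}, {"final": "Paris"}]
current = [{"tool": "search", "output": "Paris"}, {"final": "Lyon"}]

changes = diff_traces(baseline, current)
```

In CI, a test would call `assert_no_regression(baseline, current)` against a committed baseline trace, so a changed final answer fails the build with the diff in the error message.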

It’s CLI-first, dev-friendly, and designed to plug into LangChain/OpenAI workflows.

It's still early, so I'd love feedback, contributors, and use cases from folks building agentic systems. The code's here: https://github.com/hvardhan878/agentcheck

Thanks!

