Show HN: Ask-human-mcp – zero-config human-in-loop hatch to stop hallucinations
ask-human-mcp pauses your agent whenever it’s stuck, logs a question into ask_human.md in your root directory with answer: PENDING, and then resumes as soon as you fill in the correct answer.
the pain:
your agent hallucinates an endpoint that never existed, makes confident assumptions, and you spend hours debugging false leads.
the fix:
ask-human-mcp gives your agent an escape hatch. when it’s unsure, it calls ask_human(), writes a question into ask_human.md, and waits. you swap answer: PENDING for the real answer and it keeps going.
some features:
- zero config: `pip install ask-human-mcp` + one line in .cursor/mcp.json → boom, you're live
- cross-platform: works on macOS, Linux, and Windows; no extra servers or webhooks
- markdown Q&A: agent calls `await ask_human()`, question lands in ask_human.md with answer: PENDING. you write the answer, agent picks back up
- file locking & rotation: prevents corrupt files, limits pending questions, auto-rotates when ask_human.md hits ~50 MB (sketched below)
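the rotation is simple in spirit; here's a minimal sketch of the size-check-and-archive idea (illustrative only, not the library's actual code; the function name and backup scheme are assumptions):

```python
import os
import shutil
from datetime import datetime

MAX_BYTES = 50 * 1024 * 1024  # the ~50 MB threshold mentioned above

def maybe_rotate(path: str = "ask_human.md") -> None:
    """Sketch: archive the Q&A file once it grows past the size limit."""
    if os.path.exists(path) and os.path.getsize(path) > MAX_BYTES:
        stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
        shutil.move(path, f"{path}.{stamp}.bak")  # keep history, start fresh
```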
the quickstart
```bash
pip install ask-human-mcp
ask-human-mcp --help
```
add to .cursor/mcp.json and restart:

```json
{
  "mcpServers": {
    "ask-human": {
      "command": "ask-human-mcp"
    }
  }
}
```
now any call like:
```python
answer = await ask_human(
    "which auth endpoint do we use?",
    "building login form in auth.js",
)
```
creates:
```markdown
### Q8c4f1e2a
ts: 2025-01-15 14:30
q: which auth endpoint do we use?
ctx: building login form in auth.js
answer: PENDING
```
just replace answer: PENDING with the real endpoint (e.g., `POST /api/v2/auth/login`) and your agent continues.
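so the finished entry looks like:

```markdown
### Q8c4f1e2a
ts: 2025-01-15 14:30
q: which auth endpoint do we use?
ctx: building login form in auth.js
answer: POST /api/v2/auth/login
```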
link:
github -> https://github.com/Masony817/ask-human-mcp
feedback:
I'm Mason, a 19yo solo founder at Kallro. Happy to hear about any bugs, feature requests, or weird edge cases you uncover - drop a comment or open an issue! buy me a coffee -> coff.ee/masonyarbrough
Not much is stopping you from buying products from a retailer and selling them at a wholesaler, but you'd lose money in doing so.
"If you don't know the answer to a question and need the answer to continue, ask me before continuing"
Will you have some other person answer the question?
What you are asking for is AGI. We still need human in the loop for now.
I like the idea but would rather it use Slack or something if it's meant to ask anyone.
> (problem description) your agent […] makes confident assumptions
> (solution description) when it’s unsure
I read this as a contradiction: in one sentence you describe the problem as an agent being confident while hallucinating and in the next phrase the solution is that the agent can ask you if it’s unsure.
Your tool is interesting, but you may want to consider rephrasing that part.
So not at all, but that doesn't mean it's not useful.
Yes, it may not need to know with perfect certainty when it's unsure or stuck, but even to meet a lower bar of usefulness, it'll need at least an approximate means of determining that its knowledge is inadequate. To purport to help with the hallucination problem requires no less.
To make the issue a bit more clear, here are some candidate components of a stuck() predicate (rough sketch after this list):
- possibilities considered
- time taken
- tokens consumed/generated (vs expected? vs static limit? vs dynamic limit?)
If the unsure/stuck determination is defined via more qualitative prompting, what's the prompt? How well has it worked?
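To make that concrete, here's what a quantitative stuck() predicate built from those components might look like; the thresholds are invented, and as noted further down, most of these signals aren't even visible to an MCP server:

```python
from dataclasses import dataclass

@dataclass
class AgentState:
    # hypothetical signals; an MCP server generally can't observe these
    candidates_considered: int
    seconds_elapsed: float
    tokens_generated: int
    token_budget: int

def stuck(state: AgentState) -> bool:
    """Toy heuristic combining the candidate components listed above."""
    too_many_paths = state.candidates_considered > 5
    too_slow = state.seconds_elapsed > 120
    over_budget = state.tokens_generated > 0.8 * state.token_budget
    return too_many_paths or too_slow or over_budget
```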
(But you could build one that does this, and ask the LLM to call it and give your MCP that data... when it feels like it)
So you'd be using this by telling the LLM to run it when it thinks it's stuck. Or needs human input.
1: I am not anything even approaching deeply knowledgeable about MCP, so please, someone correct me if I'm wrong! There do seem to be some bi-directional messaging abilities, e.g. notification, but to figure out thinking time / token use / etc you would need to have access to the infrastructure running the LLM, e.g. Cursor itself or something.
You can probably get somewhere by indeed running a task 1000 times and looking for outliers in the execution time or token count. But that is of minimal use, and anything more advanced than that is akin to water divining.
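That outlier pass is at least easy to sketch (synthetic setup; the z-score cutoff is chosen arbitrarily):

```python
import statistics

def outliers(samples: list[float], z_threshold: float = 3.0) -> list[float]:
    """Flag runs whose duration or token count sits far from the mean."""
    if len(samples) < 2:
        return []
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    if stdev == 0:
        return []
    return [x for x in samples if abs(x - mean) / stdev > z_threshold]

# e.g. token counts collected from 1000 runs of the same task:
# suspicious = outliers(token_counts)
```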
a) It doesn't know when it's hallucinating.
b) It can't provide you with any accurate confidence score for any answer.
c) Your library is still useful but any claim that you can make solutions more robust is a lie. Probably good enough to get into YC / raise VC though.
this is a streamlined implementation of an internally scraped-together tool that i decided to open-source for people to either use or build off of.
I’m interested. Where can I read more about this?
You've just described AGI.
If this were possible you could create an MCP server that has a continually updated list of FAQ of everything that the model doesn't know.
Over time it would learn everything.
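A toy version of that FAQ cache, just to show the mechanics (file format and names are invented; a sync callable stands in for the awaited ask_human() for brevity; and the hard part, knowing which questions belong here, is exactly what's in dispute):

```python
import json
from pathlib import Path

FAQ_PATH = Path("faq.json")

def lookup_or_ask(question: str, ask_human) -> str:
    """Answer from the accumulated FAQ when possible; otherwise ask and record."""
    faq = json.loads(FAQ_PATH.read_text()) if FAQ_PATH.exists() else {}
    if question in faq:
        return faq[question]
    answer = ask_human(question)  # fall back to the human escape hatch
    faq[question] = answer        # the FAQ grows with every answered question
    FAQ_PATH.write_text(json.dumps(faq, indent=2))
    return answer
```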
It is amazing how bad LLMs are at reasoning about simple dynamics within trivial electronic circuits, and how eager they are to insist that the opposite of how things work in the real world is the settled truth.