If an AI agent can't figure out how your API works, neither can your users

96 mattmarcus 49 5/20/2025, 2:52:21 PM stytch.com ↗

Comments (49)

neves · 7h ago
One of my first uses of an LLM to help me code was with the API of an unpopular Python library. It completely hallucinated the API calls, but what was interesting is that the invented API was much more Pythonic than the real one. It created a better API.

Now I'm waiting for the chance to use LLMs to create an API for a package of mine. It will be averaged from all the other APIs and won't have unexpected calls.

energy123 · 7h ago
> It completely hallucinated the API calls, but what was interesting is that the invented API was much more pythonic than the real one. It created a better API.

Something I realized when refactoring is it's easier to vibe refactor a codebase that was itself vibe coded because all the code is "in distribution". If I try to vibe refactor a codebase I wrote, it doesn't cohere with what it expects to see and hallucinates more.

dustingetz · 7h ago
well, yes, because the imaginary world has no real world constraints
hombre_fatal · 7h ago
But it's not necessarily constraints that drive most library API design. I'd say most of it is arbitrary, kind of like which classes you decide to make in an OOP system. We tend to pick something that works and roll with it, maybe only refactoring it if it's too hard to test.
ModernMech · 7h ago
> It created a better API.

Did it though? It didn't create an API, it created the appearance of an API. The reality of the API, something the library author had to wrestle with and the LLM didn't, is probably much more complex and nuanced than the LLM is hallucinating.

Maybe the API it's hinting at would be better if made real. But it pains me you're telling us the LLM, a tool known for making things up and being wrong, despite not even having done it, can do it better than a person who actually spent their time to make and put a real thing out into the world. Maybe one day, but just making up BS is not creating a better API.

eddd-ddde · 6h ago
An API is just a set of functions and types that determine an interface. I fully believe an LLM is better at creating APIs than most programmers.
ModernMech · 6h ago
> An API is just a set of functions and types that determine an interface.

That's the kind of thinking that leads to janky APIs. When you say "just" you're doing the same thing the LLM does - you're removing all the nuance and complexity from the activity.

For example, your concept of an API as just a set of functions does not consider how the API changes over time. Library authors who take this into account will have a better time evolving the library API. Library authors who don't might write themselves into a corner, which might force some sort of API version schism which causes half the API to be nice while the other half has questionable decisions, causing perpetual confusion and frustration with users for decades.

The LLM hallucinating some nice looking function calls doesn't really take any of that into account.

palmotea · 2h ago
>> An API is just a set of functions and types that determine an interface.

> That's the kind of thinking that leads to janky APIs. When you say "just" you're doing the same thing the LLM does - you're removing all the nuance and complexity from the activity.

I wonder if that kind of thinking is a component of the difference between LLM fans and LLM skeptics. If the programmer who is not sensitive to "nuance and complexity" gets something lacking it from an LLM, they're happy today. A programmer who is sensitive to "nuance and complexity" would be unhappy with the same response.

Groxx · 7h ago
Indeed, because the only things worth building are things you could give to a junior dev with little oversight.

That's why programming salaries are so low and why nobody stays in the field after a couple years - it's too hard to make a living when you have to compete with people fresh out of training bootcamps.

98codes · 7h ago
/s, presumably
diggan · 7h ago
> That's why programming salaries are so low

Lol, where do you live/work? Almost everywhere, you get paid more as a software developer than as a nurse (just one example), and the difference in impact on your health/sanity between the two roles is huge.

I think people who have never worked in anything other than software don't know how easy they have it.

RexySaxMan · 7h ago
They are being sarcastic.
unshavedyak · 7h ago
I took their comments as sarcasm, fwiw.
dramdass868 · 7h ago
It’s a great forcing function for:

* Simplicity of input knobs - way too many APIs are unapproachable because of the number and complexity of their inputs

* Complete documentation - if you don't document a parameter or endpoint, expect that an AI agent is never going to use it (or at least not use it the way you want), especially in multi-agentic systems where your tool needs to be chosen by the LLM

* Clear, descriptive API outputs, so that an agentic system knows if and how to include them in its final output

* No overloading of endpoint functionality - one endpoint, one purpose

* Handle errors and retries gracefully

These guidelines are not rocket science, but API design has shifted toward being too user-unfriendly and lacking empathy for users. It's ironic that user empathy needs to increase again now, with agent users :)
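To make the "complete documentation" and "one endpoint, one purpose" points concrete, here is a minimal sketch of an LLM tool definition where every parameter is described. The tool name, fields, and the refund scenario are all invented for illustration, not anyone's actual API:

```python
# Hypothetical tool spec: one purpose per tool, every parameter documented.
create_refund_tool = {
    "name": "create_refund",
    "description": "Create a refund for a single order. Does nothing else.",
    "parameters": {
        "type": "object",
        "properties": {
            "order_id": {
                "type": "string",
                "description": "ID of the order to refund, e.g. 'ord_123'.",
            },
            "amount_cents": {
                "type": "integer",
                "description": "Refund amount in cents; must not exceed the order total.",
            },
        },
        "required": ["order_id", "amount_cents"],
    },
}

def undocumented_params(tool: dict) -> list[str]:
    """Return parameter names missing a description.

    An agent is likely to misuse (or never choose) these."""
    props = tool["parameters"]["properties"]
    return [name for name, spec in props.items() if not spec.get("description")]

print(undocumented_params(create_refund_tool))  # -> []
```

A linter-style check like `undocumented_params` could run in CI so the "undocumented parameter" failure mode never ships.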

P.S. this topic is near and dear to my heart since I just designed and implemented an agent-friendly API at Tako. Check out the docs here - docs.trytako.com and the API playground (trytako.com/playground) [requires login for free custom queries] we built to showcase how easy the API is to use. Feedback and discussion welcome!

thechao · 6h ago
Todd Veldhuizen — of C++ template metaprogramming (in)fame(y) — had a paper called "Parsimony Principles in Software Programming" or something like that. His argument was that libraries should be built in decomposable layers. The bottom layer should be all of the basic utilities that can be cobbled together to build the thing you want. On top of that should be convenience layers that follow "standard happy paths". The unbreakable rule of the upper layers was that they could only be written to the public API of the lower layers, and they should try to expose their internals (given appropriate invariants) as much as possible. Such libraries then come "knob free" for people who just need to "knock stuff together"; but, if you had to dig down, there was a discipline and a public API you could use for the parts that the higher levels were built from.
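A toy illustration of that layering (my own made-up example, not Veldhuizen's): a bottom layer of composable primitives, and a "knob free" convenience layer written only against the bottom layer's public API:

```python
# --- bottom layer: small, composable primitives (the "knobs") ---
def read_bytes(path: str) -> bytes:
    with open(path, "rb") as f:
        return f.read()

def decode(data: bytes, encoding: str = "utf-8") -> str:
    return data.decode(encoding)

def split_records(text: str, sep: str = "\n") -> list[str]:
    return [rec for rec in text.split(sep) if rec]

# --- top layer: the "standard happy path", built ONLY from the
# --- public API above, never from private internals ---
def read_lines(path: str) -> list[str]:
    """Convenience wrapper. Anyone who needs an odd encoding or
    separator drops down one layer and recombines the primitives."""
    return split_records(decode(read_bytes(path)))
```

The discipline is in what the top layer is forbidden to do: it cannot reach into private state, so everything it does remains reproducible by a user of the lower layer.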

I mean — maybe in some glorious alternate timeline, but not ours, I guess?

shoobs · 6h ago
I did an experiment a few weeks back as a bit of a litmus test for a new data format my team and I have been working on. I fed the spec documentation into ChatGPT and asked it to write a file reader for it. It spat out a mostly-correct implementation almost immediately, and with some extra coaxing managed to fix the small bugs in it too.

To be fair, it's a very simple format, but it made me feel good about the quality of the documentation.

exabrial · 7h ago
We used to have things called WSDLs and XSD schemas that made it _extraordinarily_ easy to make remote calls. Granted, a bunch of ding dongs never loaded their own WSDL to look at it, creating a bad rap.

We do have:

* WADLs: https://en.wikipedia.org/wiki/Web_Application_Description_La...

* JSON Schema: https://json-schema.org/learn/miscellaneous-examples

And when they are available they're incredible, but nobody uses them.
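For anyone who hasn't used it: a JSON Schema is just a JSON document describing the shape of your payloads. A rough sketch, with a deliberately toy validator covering only a tiny subset of the spec (real code should use a full implementation such as the `jsonschema` package):

```python
# A minimal JSON Schema for an API request body.
schema = {
    "type": "object",
    "required": ["email"],
    "properties": {
        "email": {"type": "string"},
        "age": {"type": "integer"},
    },
}

def check(instance: dict, schema: dict) -> list[str]:
    """Toy validator: checks only 'required' and property types."""
    errors = []
    types = {"string": str, "integer": int, "object": dict}
    for field in schema.get("required", []):
        if field not in instance:
            errors.append(f"missing required field: {field}")
    for field, spec in schema.get("properties", {}).items():
        if field in instance and not isinstance(instance[field], types[spec["type"]]):
            errors.append(f"{field}: expected {spec['type']}")
    return errors

print(check({"email": "a@b.com", "age": "old"}, schema))
```

The value for agents (and humans) is that the contract is machine-readable: a client can know a request is malformed before it ever hits the wire.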

marcosdumay · 7h ago
If you ever tried to read an actual WSDL file, you'd have noticed it was not great.

Yes, having an API type declaration is really important. And yes, somehow a lot of people just don't use those things. But WSDL was one of the worst standards for that of all time. (It also inherited all of the shittiness of XML, even allowing non-deterministic processing and side effects while reading the file.)

Anyway, I'm not really disagreeing on your main point. Read the JSON Schema docs, people, use it.

exabrial · 5h ago
> Granted, a bunch of ding dongs never loaded their own WSDL to look at it

...

Groxx · 7h ago
Swagger/OpenAPI has been reasonably popular for a while: https://swagger.io/docs/specification/v3_0/about/
liampulles · 6h ago
I think one can define a good API or a bad API using WSDLs (as with most commonly used schemas), but I have to say my experience with consuming WSDLs has universally been the latter.

"My theory" is that the ease at which one can turn a function into an exposed, documented API is inversely proportional to the likelihood of it being a quality API. I think automagic annotations which turn functions into JSON APIs obey the same principle, for what its worth.

duttish · 7h ago
JSON Schema is great; we used it at my last job, and I built a fuzzer that generates data based on it, so the data passes basic validation but possibly crashes the backend afterwards. It found lots of problems on the first run.
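A sketch of that idea, assuming a simple object schema (the actual fuzzer described above was presumably far more thorough): generate payloads that are type-valid per the schema but adversarial in content.

```python
import random
import string

def fuzz(schema: dict, rng: random.Random) -> object:
    """Generate data that passes basic type validation for a tiny
    subset of JSON Schema -- enough to probe what a backend does
    with technically-valid-but-weird payloads."""
    t = schema.get("type")
    if t == "object":
        return {k: fuzz(v, rng) for k, v in schema.get("properties", {}).items()}
    if t == "string":
        # all valid strings, all liable to break a naive backend
        return rng.choice(["", "a" * 10_000, "'; DROP TABLE users;--",
                           "".join(rng.choices(string.printable, k=20))])
    if t == "integer":
        return rng.choice([0, -1, 2**63 - 1, rng.randint(-10**6, 10**6)])
    return None

schema = {"type": "object",
          "properties": {"name": {"type": "string"}, "qty": {"type": "integer"}}}
payload = fuzz(schema, random.Random(0))
print(payload)
```

Because every payload validates, any resulting crash is unambiguously a backend bug rather than a rejected request.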
liampulles · 7h ago
A good API is upfront with its caveats, e.g. saying that you should do an exponential backoff X times with Y initial delay. But the only way to verify that claim is through usage. I wouldn't expect a junior or an AI agent to be able to assess that in the initial client implementation, and I really wouldn't want a junior or AI agent to hit the prod instance of this service in myriad different ways to try and unearth these quirks.
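For reference, this is the kind of client-side contract being described, sketched with made-up numbers (X = 5 retries, Y = 0.5 s initial delay) rather than any particular API's documented values:

```python
import random
import time

def call_with_backoff(fn, retries: int = 5, initial_delay: float = 0.5):
    """Retry fn() with exponential backoff and jitter -- the sort of
    caveat a good API states up front instead of leaving clients to
    discover it by hammering prod."""
    delay = initial_delay
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise  # out of retries; surface the real error
            time.sleep(delay * random.uniform(0.5, 1.5))  # jitter avoids thundering herds
            delay *= 2
```

Whether those numbers are right for a given service is exactly the knowledge that only usage (or the provider's docs) can supply.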
charlimangy · 7h ago
Often API errors are intentionally vague to discourage hacking attacks. Especially public APIs that create accounts or control access like the example given in the article.

In fact, I suspect that endpoints that create users and upgrade permissions will probably have to have special attention to protect against AI agent attacks.

"Claude -- sign me up for a new account so I can get free shipping on my first purchase!"

NitpickLawyer · 7h ago
Surely there are better ways to gatekeep user creation than intentionally bad APIs, right?! Plus, with the various browser integrations out there, the agents will follow the same UX you have for your human users. Make it too hard on them and you're in that bears and trashcans meme.
blopker · 7h ago
I get the feeling we're going to end up in a place where we don't make docs any more. A project will have a trusted agent that can see the actual code, maybe just the API surface, and that agent acts like a customer service rep to a user's agent. It will generate docs on the fly, with specific examples for the task needed. Maybe the agents will find bugs together and update the code too.

Not exactly where I'd like to see us go, but at least we'll never get outdated information.

levkk · 7h ago
There are lots of things that neither the code nor the docs cover, so I suspect that's not quite possible, yet.

For example, if you're deploying a Postgres proxy, it will have a TCP timeout setting that you can tweak. Neither the docs nor the code will tell you what the value should be set to though.

Your engineers might know, because they have seen your internal network fail dozens of times and have a good intuition about it.

Software complexity has a wide range. If you're thinking of simple things like Sendgrid, Twilio or Stripe APIs, sure, an agent can easily write some boilerplate. But I think in certain sectors, we would need to attach some more inputs to the model that we currently don't have to get it to a good spot.

amelius · 7h ago
Apparently the API of my kitchen is very bad.
pcwelder · 7h ago
I believe this applies to all AI use cases to varying degrees.

If AI can't use X, then there is something wrong with X.

X in { website, codebase, function, language, library, mcp, ...}

lesser23 · 7h ago
“AI” can’t use X so we have to dumb it down to the point a next token predictor can figure it out. Every day it seems like we are using spicy autocomplete as a measure of understandability which seems entirely silly to me. My own employer has ascribed some sort of spiritual status to prompts. The difference between prompting an LLM and a seance with psychedelics is getting smaller and smaller.

The next AI winter is going to be brutal and highly profitable for actual skilled devs.

nativeit · 2h ago
Your description of your employer struck a chord that's been resonating in me for the last several months. I'm legitimately concerned about the knowledge gap with regards to how LLMs work, and a new generation of cults using them as quasi-deities (in both good/bad faith, as it were).
kurthr · 7h ago
I don't know, man. I also find AI slop to be uncanny and strangely repulsive, but the other side is that, if there's not enough information in your API and documentation for an LLM, then there's a clever intern who will get it just exactly wrong.

The challenge with making things idiot-proof is the ingenuity of idiots. Remember, 50% of people are below the median.

john-h-k · 5h ago
> a next token predictor can figure it out

Describing LLMs as "next token predictors" is disingenuous and wrong

poly2it · 4h ago
How so? Autoregressive LLMs are quite literally "next token predictors", just very sophisticated ones.
jtrn · 7h ago
I wanted to just reply "this", but I feared it would be seen as unbecoming on HN. But I really, truly, and deeply mean it: this!!!
koakuma-chan · 7h ago
I agree, sometimes API docs are missing key details and the LLM has to make assumptions
jelambs · 7h ago
Thanks for sharing! Would love to hear how others are thinking about this problem.
stronglikedan · 5h ago
I mean, that's kinda why we're all here, so...

EDIT: Coincidentally, it just dawned on me that I'm very likely replying to an AI that knows how to use the HN API.

poly2it · 4h ago
I don't think they are, based on their previous activity. Seems a bit too sporadic and tempered.
sublinear · 7h ago
As someone who is reasonably skeptical about a lot of this stuff, I have a few takeaways.

Docs will always have things missing, regardless of whether a human or an AI writes them. A fuzzer might overshoot and document a ton of "unintended features" (bugs). Bugs are inevitable for similar reasons. And lastly, is this how the rest of the world finally realizes how hard this stuff really is? Can we please get rid of pointy-haired bosses and iron-fisted management who refuse to cut some slack for lower-level problems like this?

I'm all for living in this century, AI included, but that also means new ways of running a business and of treating the people we hire.

datpuz · 7h ago
AI agents can't even do very basic things without causing absolute mayhem. Do you really think it makes sense for your basic assumption to be that your user is dumber than a token generator?
furyofantares · 7h ago
The article title stinks, I agree. Users are way better at figuring out APIs than AI agents right now.

The actual contents of the article are more about using an AI agent to playtest your docs. The premise is actually the opposite of the title: if an AI agent can figure out your API then your users probably can too.

blitzar · 7h ago
> your basic assumption to be that your user is dumber than a token generator

No. You should assume your user is dumber than a brick - and not one of the smart or clever bricks, one of the really dumb ones.

MacsHeadroom · 7h ago
YES 100%, and severely disabled too.
energy123 · 7h ago
I legitimately wish developers of consumer apps would assume I am extremely stupid (and impatient).

I'll grant you your silly hamburger icon, I can memorize that one, if you grant me text icons for everything else, and low latency, and leave your ego and whatever you think you know about customers wanting to memorize 8 different icons for each app they use at the door.

Zambyte · 7h ago
I wish developers of consumer applications wouldn't assume anything about me, and would let me use my computer to my whims of the moment.
diggan · 7h ago
> Do you really think it makes sense for your basic assumption to be that your user is dumber than a token generator?

Have you seen the average user in action? I'm fairly sure that's true, at least on average. Even putting huge red warnings like "This action is irreversible" on some things will lead to users reaching out to you saying they didn't see it.

datpuz · 6h ago
The average user of a developer API is quite a bit smarter than the average user of, say, a mobile game
MyOutfitIsVague · 7h ago
> AI agents can't even do very basic things without causing absolute mayhem.

Agreed.

> Do you really think it makes sense for your basic assumption to be that your user is dumber than a token generator?

Absolutely. Users will fuck up basic things. I wrote a simple inotify wrapper, and half of the issues I got were that it didn't work on Windows and macOS.