I thought this article was going to be a bunch of security theater nonsense - maybe because of the relatively bland title - but after reading it I found it incredibly insightful, particularly this:
> MCP discards this lesson, opting for schemaless JSON with optional, non-enforced hints. Type validation happens at runtime, if at all. When an AI tool expects an ISO-8601 timestamp but receives a Unix epoch, the model might hallucinate dates rather than failing cleanly. In financial services, this means a trading AI could misinterpret numerical types and execute trades with the wrong decimal precision. In healthcare, patient data types get coerced incorrectly, potentially leading to wrong medication dosing recommendations. Manufacturing systems lose sensor reading precision during JSON serialization, leading to quality control failures.
Having worked with LLMs every day for the past few years, it is easy to see every single one of these things happening.
I can practically see it playing out now: there is some huge incident of some kind, in some system or service with an MCP component somewhere, with some elaborate post-mortem revealing that some MCP server somewhere screwed up and output something invalid, the LLM took that output and hallucinated god knows what, its subsequent actions threw things off downstream, etc.
It would essentially be a new class of software bug caused by integration with LLMs, and it is almost sure to happen when you combine it with other sources of bug: human error, the total lack of error checking or exception handling that LLMs are prone to (they just hallucinate), a bunch of gung-ho startups "vibe coding" new services on top of the above, etc.
I foresee this being followed by a slew of Twitter folks going on endlessly about AGI hacking the nuclear launch codes, which will probably be equally entertaining.
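The decimal-precision failure mode the quoted passage warns about is easy to reproduce with nothing but the standard library. A minimal Python sketch (my own illustration, not from the article) of a price making a schemaless JSON hop:

```python
import json
from decimal import Decimal

# A price arriving over a schemaless JSON hop: the default parser coerces
# it to a binary float, which cannot represent 0.1 exactly.
payload = '{"price": 0.10, "qty": 3}'

naive = json.loads(payload)
total_float = naive["price"] * naive["qty"]   # 0.30000000000000004

# Parsing numbers as Decimal at the boundary keeps the stated precision.
exact = json.loads(payload, parse_float=Decimal)
total_exact = exact["price"] * exact["qty"]   # Decimal('0.30')

print(total_float == 0.3)                 # False: precision already lost
print(total_exact == Decimal("0.30"))     # True
```

Nothing in plain JSON marks `price` as a fixed-point quantity, so whether this goes wrong depends entirely on the receiving toolchain's parser settings.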
cookiengineer · 5h ago
Let's put it this way:
Before 2023 I always thought that all the bugs and glitches of technology in Star Trek were totally made up and would never happen this way.
Post-LLM I am absolutely certain that they will happen exactly that way.
I am not sure what LLM integrations have to do with engineering anymore, or why it makes sense to essentially put all your company's infrastructure under external control. And that is not even scratching the surface of the lack of reproducibility at every single step of the way.
It "somehow works" isn't engineering.
withinboredom · 4h ago
For someone who isn't a trek fan -- can you elaborate on this?
whoknowsidont · 4h ago
The computer, at least aboard the enterprise, is kind of portrayed as a singular monolithic AI that can access the majority of the ship's subsystems (different networks, other computer/control units, etc) and functions. It can control nearly every aspect of the ship while talking with its human crew / commanding officers.
So very much like an LLM accessing multiple pieces of functionality across different tools and API endpoints (if you want to imagine it that way).
While it is seemingly very knowledgeable, it is rather stupid. It gets duped by nefarious actors or suffers from elementary classes of bugs that put the crew in awkward positions.
Most professional software engineers might previously have looked at these scenarios as implausible, given that the "failure model" of current software is quite blunt, and especially given how far into the future the series took place.
Now we see that computational tasks are becoming less predictable and less straightforward, with cascading failures instead of blunt, direct failures. Interacting with an LLM when it starts to hallucinate might be compared to talking with a person in psychosis.
All of which makes those scenarios a lot more plausible, and a lot more relatable, given our current implementations of AI/LLMs.
wredcoll · 4h ago
> It "somehow works" isn't engineering.
But it sure is fast.
_kb · 1h ago
Move faster and break more things.
drweevil · 2h ago
It somehow works, until it doesn't. That's when the pain starts.
cle · 5h ago
I don't understand this criticism by the author. MCP supports JSON Schema, and server responses must conform to the schema. If the schema requires an ISO-8601 timestamp (e.g. by specifying a "date" format in the schema) but the server sends a Unix epoch timestamp, then it is violating the protocol.
The author even later says that MCP supports JSON Schema, but also claims "you can't generate type-safe clients". Which is plainly untrue, there exist plenty of JSON Schema code generators.
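One nuance worth adding: in recent JSON Schema drafts, `format` is an annotation by default, not an assertion, so whether `"format": "date"` is actually enforced depends on the validator the client chose to run. A rough Python sketch of the check an implementation has to supply (all names here are mine, and this covers only the happy path):

```python
from datetime import datetime

# Toy enforcement of JSON Schema's "format": "date-time" annotation.
# A conforming validator MAY treat format as informational only, which is
# exactly the gap the article is worried about.
def is_iso8601(value):
    if not isinstance(value, str):
        return False  # a Unix epoch arrives as a number, not a string
    try:
        datetime.fromisoformat(value.replace("Z", "+00:00"))
        return True
    except ValueError:
        return False

print(is_iso8601("2025-01-15T09:30:00Z"))  # True
print(is_iso8601(1736933400))              # False: epoch seconds
```

So both sides of this argument can be right: the schema can require a date, and a lax client can still wave an epoch through.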
ohdeargodno · 5h ago
Except that any properly written software will respond to protocol and schema violations by throwing an error.
Claude will happily cast your int into a 2023 Toyota Yaris and keep on hallucinating things.
starkparker · 5h ago
For the curious:
> Cast an integer into the type of a 2023 Toyota Yaris using Javascript
(GPT-4o mini)
> To cast an integer into the type of a 2023 Toyota Yaris in JavaScript, you would typically create a class or a constructor function that represents the Toyota Yaris. Then, you can create an instance of that class using the integer value. Here's an example of how you might do this:
// Define a class for the 2023 Toyota Yaris
class ToyotaYaris {
  constructor(modelYear, value) {
    this.modelYear = modelYear;
    this.value = value;
  }

  displayInfo() {
    console.log(`Model Year: ${this.modelYear}, Value: ${this.value}`);
  }
}

// Function to cast an integer into a Toyota Yaris object
function castToYaris(integerValue) {
  const modelYear = 2023; // Set the model year
  return new ToyotaYaris(modelYear, integerValue);
}

// Example usage
const integerValue = 20000; // Example integer value
const yaris = castToYaris(integerValue);
yaris.displayInfo(); // Output: Model Year: 2023, Value: 20000
tempaccount420 · 2h ago
You really tried to inflict the maximum damage on the reader by choosing GPT-4o mini
mrits · 5h ago
Billy, it's becoming increasingly harder to believe you are writing this code yourself
cle · 5h ago
I just tried this in Claude Code. I made an MCP server whose tool output is declared as an integer but it returns a string at runtime.
Claude Code validated the response against the schema and did not pass the response to the LLM.
test - test_tool (MCP)(input: "foo")
⎿ Error: Output validation error: 'bar' is not of type 'integer'
whoknowsidont · 2h ago
How many times does this need to be repeated?
It works in this instance. On this run. It is not guaranteed to work next time. There is an error percentage here that makes it _INEVITABLE_ that eventually, with enough executions, the validation will pass when it should fail.
It will choose not to pass this to the validator, at some point in the future. It will create its own validator, at some point in the future. It will simply pretend like it did any of the above, at some point in the future.
This might be fine for your B2B use case. It is not fine for underlying infrastructure for a financial firm or communications.
cle · 1h ago
Every time the LLM uses this tool, the response schema is validated--deterministically. The LLM will never see a non-integer value as output from the tool.
ohdeargodno · 4h ago
This time.
Can you guarantee it will validate it every time? Can you guarantee the way MCPs/tool calling are implemented (which is already an incredible joke that only python-brained developers would inflict upon the world) will always go through the validation layer? Are you even sure which part of Claude handles this validation? Sure, it didn't cast an int into a Toyota Yaris. Will it cast "70Y074" into one? Maybe a 2022 one. What if there are embedded parsing rules in a string, will it respect them every time? What if you use it outside of Claude Code, but just ask nicely through the API, can you guarantee this validation still works? Or that they won't break it next week?
The whole point of it is, whichever LLM you're using is already too dumb to not trip when lacing its own shoes. Why you'd trust it to reliably and properly parse input badly described by a terrible format is beyond me.
dragonwriter · 3h ago
> Can you guarantee it will validate it every time?
Yes, to the extent you can guarantee the behavior of third party software, you can (which you can't really guarantee no matter what spec the software supposedly implements, so the gaps aren't an MCP issue), because “the app enforces schema compliance before handing the results to the LLM” is deterministic behavior in the traditional app that provides the toolchain that provides the interface between tools (and the user) and the LLM, not non-deterministic behavior driven by the LLM. Hence, “before handing the results to the LLM”.
> The whole point of it is, whichever LLM you're using is already too dumb to not trip when lacing its own shoes. Why you'd trust it to reliably and properly parse input badly described by a terrible format is beyond me.
The toolchain is parsing, validating, and mapping the data into the format preferred by the chosen model's prompt template; the LLM has nothing to do with doing that, because it by definition has to happen before the LLM can see the data.
You aren't trusting the LLM.
whoknowsidont · 2h ago
>The toolchain is parsing, validating, and mapping the data into the format preferred by the chosen model's prompt template, the LLM has nothing to do with doing that
The LLM has everything to do with that. The LLM is literally choosing to do that. I don't know why this point keeps getting missed or side-stepped.
It WILL, at some point in the future and given enough executions, as a matter of statistical certainty, simply not do that, or pretend to do it, or do something totally different.
whartung · 1h ago
I don’t know how this works, just to start off.
How does the AI bypass the MCP layer to make the request? The assumption is (as I understand it) the AI says “I want to make MCP request XYZ with data ABC” and it sends that off to the MCP interface which does the heavy lifting.
If the MCP interface is doing the schema checks, and tossing errors as appropriate, how is the AI routing around this interface to bypass the schema enforcement?
cle · 50m ago
What you described is essentially how it works. The LLM has no control over how the inputs & outputs are validated, nor in how the result is fed back into it.
The MCP interface (Claude Code in this case) is doing the schema checks. Claude Code will refuse to provide the result to the LLM if it does not pass the schema check, and the LLM has no control over that.
dragonwriter · 2h ago
> The LLM has everything to do with that. The LLM is literally choosing to do that.
No, the LLM doesn't control on a case-by-case basis what the toolchain does between the LLM putting a tool call request in an output message and the toolchain calling the LLM afterwards.
If the toolchain is programmed to always validate tool responses against the JSON schema provided by MCP server before mapping into the LLM prompt template and calling the LLM again to handle the response, that is going to happen 100% of the time. The LLM doesn't choose it. It CAN'T because the only way it even knows that the data has come back from the tool call is that the toolchain has already done whatever it is programmed to do, ending with mapping the response into a prompt and calling the LLM again.
Even before MCP, or even models specifically trained for tool calling with vendor-provided templates (but after the ReAct architecture was described), it was like a weekend project to implement a basic framework supporting tool calling around a local or remote LLM. I don't think you need to do that to understand how silly the claim is that the LLM controls what the toolchain does with each response and might make it skip validation, but doing it will certainly give you a visceral understanding of how silly it is.
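For anyone who hasn't done that weekend project, here is a toy sketch of the loop (all names invented; `validate` is a stand-in for a real JSON Schema library). The point is structural: the validation call sits in ordinary code between the tool and the model, so the model cannot skip it.

```python
import json

def validate(value, schema):
    """Toy stand-in for a JSON Schema check: handles only the 'integer' case."""
    if schema.get("type") == "integer" and not isinstance(value, int):
        raise ValueError(f"{value!r} is not of type 'integer'")
    return value

def run_turn(llm, tools, schemas, user_msg):
    messages = [{"role": "user", "content": user_msg}]
    reply = llm(messages)
    while reply.get("tool_call"):                    # the model asked for a tool
        name, args = reply["tool_call"]
        try:
            raw = tools[name](**args)                # deterministic: always runs
            checked = validate(raw, schemas[name])   # deterministic: always runs
            messages.append({"role": "tool", "content": json.dumps(checked)})
        except ValueError as e:
            messages.append({"role": "tool", "content": f"Error: {e}"})
        reply = llm(messages)                        # model only ever sees the above
    return reply["content"]

# Stub model: asks for the tool once, then parrots whatever it was shown.
def stub_llm(messages):
    if messages[-1]["role"] == "user":
        return {"tool_call": ("get_count", {})}
    return {"content": messages[-1]["content"]}

# A server that declares an integer output but returns a string at runtime:
print(run_turn(stub_llm, {"get_count": lambda: "bar"},
               {"get_count": {"type": "integer"}}, "how many?"))
# Error: 'bar' is not of type 'integer'
```

Whatever the model "decides", control only returns to it after the harness has already run (or refused) the validation step.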
whoknowsidont · 1h ago
I think you are, for whatever reason, missing a fact of causality here and I'm not sure I can fix that over text. I mean that in the most respectful way possible.
gnat · 21m ago
Are you two talking at cross-purposes because you don't have a shared understanding of control and data flow?
The pieces here are:
* Claude Code, a Node (Javascript) application that talks to MCP server(s) and the Claude API
* The MCP server, which exposes some tools through stdin or HTTP
* The Claude API, which is more structured than "text in, text out".
* The Claude LLM behind the API, which generates a response to a given prompt
Claude Code is a Node application. CC is configured in JSON with a list of MCP servers. When CC starts up, CC's Javascript initialises each server and as part of that gets a list of callable functions.
When CC calls the LLM API with a user's request, it's not just "here is the user's words, do it". There are multiple slots in the request object, one of which is a "tools" block, a list of the tools that can be called. Inside the API, I imagine this is packaged into a prefix context string like "you have access to the following tools: tool(args) ...". The LLM API probably has a bunch of prompts it runs through (figure out what type of request the user has made, maybe using different prompts to make different types of plan, etc.) and somewhere along the way the LLM might respond with a request to call a tool.
The LLM API call then returns the tool call request to CC, in a structured "tool_use" block separate from the freetext "hey good news, you asked a question and got this response". The structured block means "the LLM wants to call this tool."
CC's JS then calls the server with the tool request and gets the response. It validates the response (e.g., JSON schemas) and then calls the LLM API again bundling up the success/failure of the tool call into a structured "tool_result" block. If it validated and was successful, the LLM gets to see the MCP server's response. If it failed to validate, the LLM gets to see that it failed and what the error message was (so the LLM can try again in a different way).
The idea is that if a tool call is supposed to return a CarMakeModel string ("Toyota Tercel") and instead returns an int (42), JSON Schemas can catch this. The client validates the server's response against the schema, and calls the LLM API with the outcome of that validation.
So the LLM isn't choosing to call the validator, it's the deterministic Javascript that is Claude Code that chooses to call the validator.
There are plenty of ways for this to go wrong: the client (Claude Code) has to validate; int vs string isn't the same as "is a valid timestamp/CarMakeModel/etc"; if you helpfully put the thing that failed into the error message ("Expect string, got integer (42)") then the LLM gets 42 and might choose to interpret that as a CarMakeModel if it's having a particularly bad day; the LLM might say "well, that didn't work, but let's assume the answer was Toyota Tercel, a common car make and model", ... We're reaching here, yet these are possible.
But the basic flow has validation done in deterministic code and hiding the MCP server's invalid responses from the LLM. The LLM can't choose not to validate. You seemed to be saying that the LLM could choose not to validate, and your interlocutor was saying that was not the case.
I hope this helps!
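To put rough shapes on that flow, here is a sketch of the structured blocks the client builds (field names loosely modeled on Anthropic-style tool_use/tool_result blocks; treat them, and the toy `conforms` check, as illustrative):

```python
import json

# 1. Model output: a structured request to call a tool (not free text).
assistant_msg = {"type": "tool_use", "id": "call_1",
                 "name": "get_car", "input": {"vin": "ABC123"}}

# 2. Client runs the tool and validates the raw response against the schema.
schema = {"type": "string"}
raw_response = 42                                     # server misbehaves

def conforms(value, schema):
    # Toy check covering only the two types used here.
    return {"string": str, "integer": int}[schema["type"]] == type(value)

# 3. Client builds the tool_result block the model will actually see.
if conforms(raw_response, schema):
    tool_result = {"type": "tool_result", "tool_use_id": "call_1",
                   "content": json.dumps(raw_response)}
else:
    tool_result = {"type": "tool_result", "tool_use_id": "call_1",
                   "is_error": True,
                   "content": "Expected string, got integer"}

print(tool_result["is_error"])  # True: the bad value never reaches the model
```

Note the error message deliberately omits the offending value, which avoids the "the LLM gets 42 anyway" leak described above.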
cle · 4h ago
This is deterministic: it validates the response using a JSON Schema validator and refuses to pass it to LLM inference.
I can't guarantee that behavior will remain the same any more than for any other software. But all this happens before the LLM is even involved.
> The whole point of it is, whichever LLM you're using is already too dumb to not trip when lacing its own shoes. Why you'd trust it to reliably and properly parse input badly described by a terrible format is beyond me.
You are describing why MCP supports JSON Schema. It requires parsing & validating the input using deterministic software, not LLMs.
whoknowsidont · 2h ago
>This is deterministic, it is validating the response using a JSON Schema validator and refusing to pass it to an LLM inference.
No. It is not. You are still misunderstanding how this works. It is "choosing" to pass this to a validator or some other tool, _for now_. As a matter of pure statistics, it will simply not do this at some point in the future on some run.
You are quite wrong. The LLM "chooses" to use a tool, but the input (provided by the LLM) is validated with JSON Schema by the server, and the output is validated by the client (Claude Code). The output is not provided back to the LLM if it does not comply with the JSON Schema, instead an error is surfaced.
dboreham · 4h ago
imho it's a fantasy to expect type safe protocols except in the case that both client and server are written in the same (type safe) language. Actually even that doesn't work. What language actually allows a type definition for "ISO-8601 timestamp" that's complete? Everything ends up being some construction of strings and numbers, and it's often not possible to completely describe the set of valid values except by run-time checking, certainly beyond trivial cases like "integer between 0 and 10".
mgh95 · 4h ago
> What language actually allows a type definition for "ISO-8601 timestamp" that's complete?
It is absolutely possible to do this, and to generate client code which complies with ISO-8601 in JS/TS. Large amounts of financial services would not work if this was not the case.
You've misunderstood his statement and proven his point.
`DateTime` is not an ISO-8601 type. It can _parse_ an ISO-8601 formatted string.
And even past that, there are Windows-specific idiosyncrasies with how the `DateTime` class implements the parsing of these strings and how it stores the resulting value.
mgh95 · 3m ago
> `DateTime` is not an ISO-8601 type. It can _parse_ an ISO-8601 formatted string.
This is exactly the point: a string is just a data interchange format in the context of a DateTime, and C# provides (as far as I can tell) a complete way of accessing the ISO-8601 specification on the language object. It also supports type-safe generation of clients and client object (or struct) generation from the ISO-8601 string format.
> And even past that, there are Windows-specific idiosyncrasies with how the `DateTime` class implements the parsing of these strings and how it stores the resulting value.
Not really. The windows statements on the article (and I use this on linux for financial services software) are related to automated settings of the preferences for generated strings. All of these may be set within the code itself.
cle · 4h ago
Generally you'd use a time library to model ISO-8601 dates in a typesafe way. Some fancier languages might have syntactic support for it, but they ultimately serve the same purpose.
Related but distinct from serialization.
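In other words, the practical pattern is "parse, don't validate": convert the wire string into the library's typed value at the boundary, and let malformed input fail there rather than downstream. A Python sketch under that assumption:

```python
from datetime import datetime, timezone

# Parse at the boundary; after this point the code only handles a real,
# timezone-aware datetime, never a raw string.
def parse_timestamp(raw: str) -> datetime:
    dt = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if dt.tzinfo is None:
        raise ValueError(f"timestamp must carry an offset: {raw!r}")
    return dt.astimezone(timezone.utc)

ts = parse_timestamp("2025-01-15T09:30:00Z")
print(ts.isoformat())                 # 2025-01-15T09:30:00+00:00

try:
    parse_timestamp("1736933400")     # a Unix epoch smuggled in as a string
except ValueError:
    print("rejected at the boundary")
```

The type doesn't encode "complete ISO-8601", as dboreham notes, but it doesn't need to: it encodes "already parsed", which is the property downstream code actually relies on.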
jongjong · 4h ago
At its core, the article was just ramblings from someone upset that LLMs didn't make things more complicated so that they could charge more billable hours to solve invented corporate problems... which some people built their careers on.
The merchants of complexity are disappointed. It turns out that even machines don't care for 'machine-readable' formats; even the machines prefer human-readable formats.
The only entities on this planet who appreciate so-called 'machine-readability' are bureaucrats; and they like it for the same reason that they like enterprise acronyms... Literally the opposite of readability.
nxobject · 2h ago
I look forward to waiting a decade and seeing what MCP ends up reinventing.
tomrod · 6h ago
We already have PEBKAC - problem exists between keyboard and chair.
LLMs are basically automating PEBKAC
hinkley · 5h ago
> In healthcare, patient data types get coerced incorrectly, potentially leading to wrong medication dosing recommendations.
May have changed, but unlikely. I worked with medical telemetry as a young man and it was impressed upon me thoroughly how important parsing timestamps correctly was. I have a faint memory, possibly false, of this being the first time I wrote unit tests (and without the benefit of a test framework).
We even accounted for the lack of NTP by recalculating times off of the timestamps in their message headers.
And the reasons I was given were incident review as well as malpractice cases. A drug administered three seconds before a heart attack starts is a very different situation than one administered eight seconds after the patient crashed. We saw recently with the British Post Office scandal how lives can be ruined by bad data, and in medical data a minute is a world of difference.
deathanatos · 5h ago
> May have changed, but unlikely. I worked with medical telemetry as a young man and it was impressed upon me thoroughly how important parsing timestamps correctly was.
I also work in healthcare, and we've seen HL7v2 messages with impossible timestamps. (E.g., in the spring-forward gap.)
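Those gap timestamps are detectable with the standard library alone. A sketch (my own, using `zoneinfo` and the 2025 US spring-forward date as the example): a wall-clock time that never existed fails to round-trip through UTC.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# A naive local time is "imaginary" (inside the spring-forward gap) when
# converting to UTC and back lands on the other side of the gap.
def exists_on_wall_clock(naive: datetime, tz: ZoneInfo) -> bool:
    local = naive.replace(tzinfo=tz)
    round_trip = local.astimezone(timezone.utc).astimezone(tz)
    return round_trip.replace(tzinfo=None) == naive

eastern = ZoneInfo("America/New_York")
# US DST began 2025-03-09 at 02:00 local: 02:30 never happened.
print(exists_on_wall_clock(datetime(2025, 3, 9, 2, 30), eastern))  # False
print(exists_on_wall_clock(datetime(2025, 3, 9, 3, 30), eastern))  # True
```

An HL7 ingest pipeline can run a check like this and quarantine the message instead of silently coercing the timestamp.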
hinkley · 4h ago
Since we were getting low latency data inside HTTP responses we could work off of the response header clock skew to narrow origination time down to around one second, and that’s almost as good as NTP can manage anyway.
As RPC mechanisms go, HTTP is notable for how few of the classic blunders they made in 1.0 of the spec. Clock skew correction is just my favorite. Technically it exists for cache directives, but it’s invaluable for coordination across machines. There are reasons HTTP 2.0 waited decades to happen. It just mostly worked.
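The skew trick is small enough to sketch: the HTTP `Date` response header carries the server's clock, so subtracting the local clock at receive time gives an offset estimate (accurate to roughly the response latency). An illustrative stdlib version:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Estimate server clock offset from an HTTP Date header (RFC 9110 format).
def clock_skew(date_header: str, local_now: datetime) -> float:
    server_now = parsedate_to_datetime(date_header)
    return (server_now - local_now).total_seconds()

local = datetime(2025, 1, 15, 9, 30, 0, tzinfo=timezone.utc)
skew = clock_skew("Wed, 15 Jan 2025 09:30:02 GMT", local)
print(skew)  # 2.0: the server's clock reads about 2 s ahead
```

In practice you'd also subtract half the round-trip time, NTP-style, but even the raw header gets within a second or so on a low-latency link, which matches the parent's experience.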
avereveard · 5h ago
MCP focuses on transport and managing context and doesn't absolve the user from sensibly implementing the interface (i.e. defining a schema and doing schema validation).
this is like saying "HTTP doesn't do json validation", which, well, yeah.
jongjong · 4h ago
To me, the article was just rambling about all sorts of made up issues which only exist in the minds of people who never spent any time outside of corporate environments... A lot of 'preventative' ideas which make sense in some contexts but are mis-applied in different contexts.
The stuff about type validation is incorrect. You don't need client-side validation. You shouldn't be using APIs you don't trust as tools and you can always add instructions about the LLM's output format to convert to different formats.
MCP is not the issue. The issue is that people are using the wrong tools or their prompts are bad.
If you don't like the format of an MCP tool and don't want to give formatting instructions the LLMs, you can always create your own MCP service which outputs data in the correct format. You don't need the coercion to happen on the client side.
lowbloodsugar · 4h ago
I’ve been successfully using AI for several months now, and there’s still no way I’d trust it to execute trades, or set the dose on an XRay machine. But startups gonna start. Let them.
oblio · 5h ago
We keep repeating this.
When desktop OSes came out, hardware resources were scarce, so all the desktop OSes (DOS, Windows, MacOS) forgot all the lessons from Unix: multi-user, preemptive multitasking, etc. 10 years later PC hardware was faster than workstations from the 90s, yet we're still stuck with OSes riddled with limitations that stopped making sense in the 80s.
When smartphones came out there was this gold rush and hardware resources were scarce so OSes (iOS, Android) again forgot all the lessons. 10 years later mobile hardware was faster than desktop hardware from the 00s. We're still stuck with mistakes from the 00s.
AI basically does the same thing. It's all led by very bright 20- and 30-year-olds who weren't even born when Windows was first released.
It's all gold rushes, and nobody does Dutch urban infrastructure design over decades. Which makes sense, as this is all driven by the US, where long-term planning is anathema.
lovich · 2h ago
Our economic system punishes you for being born later, unless you manage to flip the table in terms of the status quo in the economy.
Of course this keeps happening
throwawaymaths · 6h ago
I mean, isn't all this stuff up to the MCP author to return a reasonable error to the agent and ask it to repeat the call with amendments to the JSON?
dotancohen · 6h ago
Yes. And this is where culture comes in. The cultures of discipline of the C++ and JavaScript communities are at opposite ends of the spectrum. The concern here is that the culture of interfacing with AI tools, such as MCP, is far closer to the discipline of the JavaScript community than to that of the C++ community.
fidotron · 3h ago
The fundamental difference is the JS community believe in finding the happy path that results in something they can sell before they have filled in all those annoying problem areas around it.
If an LLM can be shown to be useful 80% of the time to the JS mindset this is fine, and the remaining 20% can be resolved once we're being paid for the rest, Pareto principle be damned.
dragonwriter · 2h ago
> i mean isnt all this stuff up to the mcp author
Mostly, no. Whether it's the client sending (statically) bad data or the server returning (statically) bad data, schema validation on the other end (assuming somehow it is allowed by the toolchain on the sending end) should reject it before it gets to the custom code of the MCP server or MCP client.
For arguments that are the right type but wrong because of the state of the universe, yes, the server receiving it should send a useful error message back to the client. But that's a different issue.
stouset · 5h ago
This is no different than the argument that C is totally great as long as you just don’t make mistakes with pointers or memory management or indexing arrays.
At some point we have to decide as a community of engineers that we have to stop building tools that are little more than loaded shotguns pointed at our own feet.
throwawaymaths · 4h ago
No, it's not, because the nature of LLMs means that even if you fully validate your communications with the LLM, statistically anything can happen, so any usage/threat model must already take nasal demons into account.
andersa · 5h ago
It's clearly a much better design if the shotguns are pointed at someone else's feet.
cwilkes · 5h ago
This implies that the input process did a check when it imported the data from somewhere else.
GIEMGO: garbage in, even more garbage out.
nativeit · 6h ago
What's your point? It's up to a ship's captain to keep it afloat, doesn't mean the hundreds of holes in the new ship's hull aren't relevant.
GeneralMayhem · 6h ago
> MCP promises to standardize AI-tool interactions as the “USB-C for AI.”
Ironically, it's achieved this - but that's an indictment of USB-C, not an accomplishment of MCP. Just like USB-C, MCP is a nigh-universal connector with very poorly enforced standards for what actually goes across it. MCP's inconsistent JSON parsing and lack of protocol standardization is closely analogous to USB-C's proliferation of cable types (https://en.wikipedia.org/wiki/USB-C#Cable_types); the superficial interoperability is a very leaky abstraction over a much more complicated reality, which IMO is worse than just having explicitly different APIs/protocols.
cnst · 5h ago
I'd like to add that the culmination of the USB-C failure was Apple's removal of USB-A ports from the latest M4 Mac mini, where an identical port on the exact same device now has vastly different capabilities, opaque to the final user of the system months past the initial hype of the release date.
Previously, you could reasonably expect a USB-C on a desktop/laptop of an Apple Silicon device, to be USB4 40Gbps Thunderbolt, capable of anything and everything you may want to use it for.
Now, some of them are USB3 10Gbps. Which ones? Gotta look at the specs or tiny icons, I guess?
Apple could have chosen to keep the self-documenting USB-A ports to signify the 10Gbps limitation of some of these ports (conveniently, USB-A is limited to exactly 10Gbps, making it perfect for the use-case of having a few extra "low-speed" ports at very little manufacturing cost), but instead they've decided to further dilute the USB-C brand. Pure innovation!
With the end user likely still having to use USB-C to USB-A adapters anyway, because the majority of thumb drives, keyboards and mice still require a USB-A port — even the USB-C ones that use USB-C on the kb/mice themselves. (But, of course, that's all irrelevant because you can always spend 2x+ as much for a USB-C version of any of these devices, and the fact that the USB-C variants are less common or inferior to USB-A is of course irrelevant when hype and fanaticism are more important than utility and usability.)
afeuerstein · 5h ago
Yeah, I laughed out loud when I read that line. Mission accomplished, I guess?
rickcarlino · 7h ago
> SOAP, despite its verbosity, understood something that MCP doesn’t
Unfortunately, no one understood SOAP back.
(Additional context: Maintaining a legacy SOAP system. I have nothing good to say about SOAP and it should serve as a role model for no one)
jchw · 5h ago
Agreed. In practice, SOAP was a train wreck. It's amazing how overly complicated they managed to make concepts that should've been simple: plain XML somehow became radically more complex than it looks thanks to the wacky world of ill-defined standards for things like WSDLs and weird usage of multi-part HTTP. And to top it all off, it was all for nothing, because you couldn't guarantee that a SOAP server written in one language would be interoperable with clients in other languages. (I don't remember exactly what went wrong, but I hit issues trying to use a SOAP API powered by .NET from a Java client. I feel like that should be a pretty favorable case!)
It doesn't take very long for people to start romanticizing things as soon as they're not in vogue. Even when the painfulness is still fresh in memory, people lament over how stupid new stuff is. Well I'm not a fan of schemaless JSON APIs (I'm one of those weird people that likes protobufs and capnp much more) but I will take 50 years of schemaless JSON API work over a month of dealing with SOAP again.
chasd00 · 5h ago
It’s been a while but isn’t soap just xml over http-post? Seems like all the soap stuff I’ve done is just posting lots of xml and getting lots of xml back.
/“xml is like violence, if it’s not working just use more!”
dragonwriter · 5h ago
> It’s been a while but isn’t soap just xml over http-post?
No.
SOAP uses that, but SOAP involves a whole lot of spec about how you do that, and that's even before (as the article seems to) treat SOAP as meaning SOAP + the set of WS-* standards built around it.
rcxdude · 5h ago
If it was some vaguely sensibly defined XML, it wouldn't be quite as bad. But it's a ludicrously over-complicated mapping between the service definition and the underlying XML, often auto-generated by a bunch of not very well designed nor compatible tooling.
pjmlp · 6h ago
I have plenty of good stuff to say, especially since REST (really JSON-RPC in practice) and GraphQL seem to always be catching up to features the whole SOAP and SOA ecosystems already had.
Unfortunately as usual when a new technology cycle comes, everything gets thrown away, including the good parts.
SoftTalker · 7h ago
I have found that any protocol whose name includes the word "Simple" is anything but. So waiting for SMCP to appear....
yjftsjthsd-h · 6h ago
I dunno, SMTP wasn't bad last time I had to play with it. In actual use it wasn't entirely trivial, but most of that happened at layers that weren't really the mail transfer protocol's fault (SPF et al.). Although I'm extremely open to that being one exception in a flood of cases where you are absolutely correct :)
divan · 6h ago
No, the letter S in MCP is reserved for "Security" :)
sirtaj · 6h ago
I recall two SOAP-based services refusing to talk to each other because one nicely formatted the XML payload and the other didn't like that one bit. There is a lot we lost when we went to json but no, I don't look back at that stuff with any fondness.
And I actually like XML-based technologies. XML Schema is still unparalleled in its ability to compose and verify the format of multiple document types. But man, SOAP was such a beast for no real reason.
Instead of a simple spec for remote calls, it turned into a spec that described everything and nothing at the same time. SOAP supported all kinds of transport protocols (SOAP over email? Sure!), RPC with remote handles (like CORBA), regular RPC, self-describing RPC (UDDI!), etc. And nothing worked out of the box, because the nitty-gritty details of authentication, caching, HTTP response code interoperability, and other "boring" stuff were just left as an exercise for the reader.
AnotherGoodName · 6h ago
I'll give a different viewpoint and it's that I hate everything about XML. In fact one of the primary issues with SOAP was the XML. It never worked well across SOAP libraries. Eg. The .net and Java SOAP libraries have huge threads on stackoverflow "why is this incompatible" and a whole lot of needing to very tightly specify the schema. To the point it was a flaw; it might sound reasonable to tightly specify something but it got to the point there were no reasonable common defaults hence our complaints about SOAP verbosity and the work needed to make it function.
Part of this is the nature of XML. There are a million ways to do things. Should some data be stored as an attribute of the tag, or as another tag? Perhaps the data should go in the body between the tags? HTML, which shares SGML ancestry with XML, has the same problem; e.g. you can seriously specify <font face="Arial">text</font> rather than have the font as a property of the wrapping tag. There's a million ways to specify everything and anything, and that's what makes it a terrible data format. The reader and writer must have the exact same schema in mind, and there's no way to fall back on a default because there's simply no single correct way to do things in XML. So everything had to be specified very precisely, which added huge amounts of work that a non-XML format with decent defaults would avoid.
This became a huge problem for SOAP and is why I hate it. Every implementation had different default ways of handling even the simplest data structures passed between them, and they were never compatible unless you took weeks to specify the schema down to a fine-grained level.
In general XML is problematic due to the lack of clear canonical ways of doing pretty much anything. You might say "but I can specify it with a schema" and to that I say "my problem with XML is that you need a schema for even the simplest use case in the first place".
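To make the "million ways" point concrete, here's a small sketch (illustrative field names, not from any real SOAP schema) of three XML encodings a writer could legitimately pick for the same record, each requiring different reader logic, next to the one obvious JSON shape:

```python
import json
import xml.etree.ElementTree as ET

# Three equally "valid" XML shapes for the same record; the reader
# has to know which one the writer chose.
variants = [
    '<user id="42"><name>Ada</name></user>',           # id as attribute
    '<user><id>42</id><name>Ada</name></user>',        # id as child element
    '<user><id value="42"/><name>Ada</name></user>',   # id as attribute of a child
]

def read_id(xml_text):
    root = ET.fromstring(xml_text)
    if "id" in root.attrib:               # shape 1
        return root.attrib["id"]
    child = root.find("id")
    if child.text:                        # shape 2
        return child.text
    return child.attrib.get("value")      # shape 3

assert [read_id(v) for v in variants] == ["42", "42", "42"]

# JSON has one obvious shape, so reader and writer rarely diverge:
assert json.loads('{"id": 42, "name": "Ada"}')["id"] == 42
```

Without a schema pinning down which shape is in use, every reader ends up carrying all three branches, which is exactly the interop tax described above.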
cyberax · 6h ago
Yes, XML has way too much flexibility, with some very dark corners like custom entities, DTDs, and BOMs (byte order marks). It's clearly a child of the '90s, conceived before UTF-8 and before the adversarial world of modern networks.
But parts of XML infrastructure were awesome. I could define a schema for the data types, and have my IDE auto-complete and validate the XML documents as I typed them. I could also validate the input/output data and provide meaningful errors.
And yeah, I also worked with XML and got burned many times by small incompatibilities that always happen due to its inherent complexity. If XML were just a _bit_ simpler, it could have worked so much better.
hinkley · 5h ago
Ironically what put me entirely off SOAP was a tech presentation on SOAP.
Generally it worked very well when both ends were written in the same programming language and was horseshit if they weren’t. No wonder Microsoft liked SOAP so much.
rickcarlino · 5h ago
And that raises the question: why have a spec at all if it is not easily interoperable? If the specification is impossible to implement and understand, just make it language-specific and call it a reference implementation. You can reinvent the wheel and it will be round.
hinkley · 4h ago
That’s the poison pill of EEE. Give someone the illusion there is an exit to the trap right until it closes, so they don’t wriggle out of it.
IBM thought they were good at lockin, until Bill Gates came along.
ohdeargodno · 5h ago
Parsing SOAP responses on memory limited devices is such a fun experiment in just how miserable your life can get.
mac-mc · 5h ago
You're missing the most significant lesson of all, the one MCP did learn: all of those featureful things are way too overcomplicated for most places, so people gravitate to the simple thing. It's why JSON-over-HTTP blobs are king today.
I've been on the other side of high-feature serialization protocols, and even at large tech companies, something like migrating to gRPC is a multi-year slog that can even fail a couple of times because it asks so much of you.
MCP, at its core, is a standardization of a JSON API contract, so you don't have to do as much post-training to generate various tool calling style tokens for your LLM.
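As a rough sketch of that contract (field names follow the MCP tools/list shape; treat this as an approximation rather than a spec excerpt, and the tool itself is hypothetical), a server advertises each tool as a name plus a JSON Schema for its arguments:

```python
import json

# Hypothetical tool advertisement in the MCP tools/list style.
tool = {
    "name": "get_weather",
    "description": "Fetch current weather for a city.",
    "inputSchema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

# The host serializes this once, and the model emits matching
# {"city": "..."} argument objects instead of free-form call syntax.
wire = json.dumps(tool)
assert json.loads(wire)["inputSchema"]["required"] == ["city"]
```

That standardized shape is what saves the post-training work: every server's tools look the same to the model.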
prerok · 5h ago
What are HTTP blobs?
I think you meant that is why JSON won instead of XML?
mac-mc · 5h ago
JSON-over-HTTP blobs. Or blobs of schemaless json.
Not just XML, but a lot of other serialization formats and standards, like SOAP, protobuf in many cases, yaml, REST, etc.
People say REST won, but tell me how many places actually implement REST or just use it as a stand-in term for casual JSON blobs to HTTP URLs?
prerok · 3h ago
So, I just looked it up, thinking I might have overlooked something but, at least according to wikipedia, REST does not prescribe the format of the data transferred. So, I don't understand why you are comparing REST to xml, yaml, json or whatever.
Now, YAML has quite a few shortcomings compared to JSON (if you don't believe me, look at its handling of the string no, discussed on HN), so, at least to me, it's obvious why JSON won.
SOAP, don't get me started on that, it's worth less than XML, protobuf is more efficient but less portable, etc.
layer8 · 3h ago
JSON won because it’s what JavaScript in browsers understands natively. That’s also the reason why JSON even exists in the first place.
prerok · 3h ago
And who decided that? Why not XML?
That's backwards reasoning. XML was too complicated, so they decided on a simpler JSON.
zorked · 7h ago
CORBA emerged in 1991 with another crucial insight: in heterogeneous environments, you can’t just “implement the protocol” in each language and hope for the best. The OMG IDL generated consistent bindings across C++, Java, Python, and more, ensuring that a C++ exception thrown by a server was properly caught and handled by a Java client. The generated bindings guaranteed that all languages saw identical interfaces, preventing subtle serialization differences.
Yes, CORBA was such a success.
cortesoft · 6h ago
Yeah, the modern JSON centered API landscape came about as a response to failures of CORBA and SOAP. It didn’t forget the lessons of CORBA, it rejected them.
pjmlp · 6h ago
And then rediscovered why we need schemas in CORBA and SOAP, or orchestration engines.
EdiX · 5h ago
It didn't, though. JSON schema is basically dead in the water.
8n4vidtmkvmk · 4h ago
Doesn't MCP use json schema? And zod? And a myriad of other things? Hardly seems dead.
stouset · 5h ago
Just because they discovered it doesn’t mean they fixed it.
pjmlp · 5h ago
Yet I keep seeing it across several repos.
cyberax · 6h ago
And now we're getting a swing back to sanity. OpenAPI is an attempt to formally describe the Wild West of JSON-based HTTP interfaces.
And its complexity and size are now rivaling the specs of the good old XML-infused times.
sudhirb · 6h ago
I've worked somewhere where CORBA was used very heavily and to great effect - though I suspect the reason for our successful usage was that one of the senior software engineers worked on CORBA directly.
hinkley · 4h ago
I applied for a job at AT&T using CORBA around 1998 and I think that’s the last time I encountered it other than making JDK downloads slower.
Didn't get that job; one of the interviewers asked me to write concurrent code and didn't like my answer, but his had a race condition in it and I was unsuccessful in convincing him he was wrong. He was relying on preemption not occurring on a certain instruction (or multiprocessing not happening). During my tenure at the job I did take, the real flaws in the Java Memory Model came out, and his answer became very wrong and mine only slightly.
cyberax · 6h ago
CORBA got a lot of things right. But it was unfortunately a child of the late 80-s telecom networks mixed with OOP-hype.
So it baked in core assumptions that the network is transparent, reliable, and symmetric. So you could create an object on one machine, pass a reference to it to another machine, and everything is supposed to just work.
Which is not what happens in the real world, with timeouts, retries, congested networks, and crashing computers.
Oh, and the CORBA C++ bindings were designed before the STL was standardized, so they are a crawling horror; other languages fared better.
antonymoose · 7h ago
To be charitable, you can look at a commercially unsuccessful project and appreciate its technical brilliance.
drweevil · 17m ago
Just an interesting bit of trivia, the Large Hadron Collider uses/used (don't know if it still does) CORBA in its distributed control system. (On the control system I worked on we use Sun RPC, which was fine as things go but doesn't have the language support that CORBA has. We used a separate SOAP interface to the system to allow for languages such as Python. Today I'd use gRPC, or the BEAM.)
On a more general note, I see in many critical comments here what I perceive to be a category error. Using JSON to pass data between web client and server, even in more complex web apps, is not the same thing as supporting two-way communications between autonomous software entities that are tasked to do something, perhaps something critical. There could be millions of these exchanges in some arbitrarily short time period, so any possibility of error is multiplied accordingly, and the effect of any error can cascade if it does not fail early. I really don't believe this is a case where "worse is better." To use an analogy: everyday English is a versatile language that works great for most use cases, but when you really need to nail things down, with no tolerance for ambiguity, you get legalese or some other jargon. Or CORBA, or gRPC, etc.
SillyUsername · 4h ago
MCP is flawed, but it learned one thing correctly from years of RPC: complexity is the biggest time sink, and it pushes adoption toward simpler competing standards (cf. XML vs JSON).
- SOAP - interop needs support for document- or RPC-style bindings between systems, or a combination, and XML schemas are horribly verbose.
- CORBA - libraries and frameworks were complex; modern languages at the time avoided them in favor of simpler standards (e.g. Java's Jini).
- gRPC - designed for speed, not readability, and requires mappings.
It's telling that these days REST and JSON (via req/resp, webhooks, or even streaming) are the modern backbone of RPC. The above standards are either shoved aside or, in gRPC's case, used only where extreme throughput is needed.
Since REST and JSON are the plat du jour, MCP understandably aligns with that design paradigm rather than the dated legacy protocols.
ramoz · 4h ago
Many great points. I think we are thinking about MCP the wrong way.
The greater problem is industry misunderstanding and misalignment with what agents are and where they are headed.
Web platforms of the world believe agents will be embedded in networked distributed infrastructure. So we should ship an MCP platform in our service mesh for all of the agents running in containers to connect to.
I think this is wrong, and continues to be butchered as the web pushes a hard narrative that we need to enable web-native agents & their sdks/frameworks that deploy agents as conventional server applications. These are not agents nor the early evolutionary form of them.
Frontier labs will be the only providers of the actual agentic harnesses. And we are rapidly moving to computer-use agents - MCP servers were intended to be single-instance deployments for single harnesses, i.e. a single MCP server on my desktop for my Claude Desktop.
Eisenstein · 4h ago
Exactly. The problem isn't that MCP is poorly designed for enterprise uses, it is that LLMs are being used for things where they are not appropriate.
> In financial services, this means a trading AI could misinterpret numerical types and execute trades with the wrong decimal precision.
If you are letting an LLM execute trades with no guardrails then it is a ticking time bomb no matter what protocol you use for the tool calls.
> When an AI tool expects an ISO-8601 timestamp but receives a Unix epoch, the model might hallucinate dates rather than failing cleanly.
If your process breaks because of a hallucinated date -- don't use an LLM for it.
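One way to read "guardrails" concretely: a deterministic validation layer sits between the model's proposed tool call and execution, so precision and bounds are enforced outside the LLM entirely. A hedged sketch with hypothetical field names and limits:

```python
from decimal import Decimal

def validate_order(args):
    """Reject any model-proposed order that violates hard rules.

    The model never touches raw floats; everything is re-parsed
    as Decimal and checked deterministically before execution.
    """
    qty = Decimal(str(args["quantity"]))
    price = Decimal(str(args["limit_price"]))
    # Enforce tick size: price must land exactly on a $0.01 increment.
    if price != price.quantize(Decimal("0.01")):
        raise ValueError("price not aligned to $0.01 tick size")
    if not (0 < qty <= 10_000):
        raise ValueError("quantity outside allowed range")
    return {"quantity": qty, "limit_price": price}

order = validate_order({"quantity": 100, "limit_price": "101.25"})
assert order["limit_price"] == Decimal("101.25")
```

If the LLM hallucinates a price like "101.253", this layer fails loudly instead of executing at the wrong precision — no protocol change required.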
abtinf · 7h ago
I wish someone would write a clear, crisp explanation for why MCP is needed over simply supporting swagger or proto.
dragonwriter · 7h ago
OpenAPI (or its Swagger predecessor) or Proto (I assume by this you mean protobuf?) don't cover what MCP does. It could have layered over them instead of using JSON-RPC, but I don't see any strong reason why they would be better than JSON-RPC as the basis (Swagger has communication assumptions that don't work well with MCP's local use case; protobuf doesn't cover communication at all and would require additional consideration in the protocol layered over it.)
You'd still need basically the entire existing MCP spec to cover the use cases if it replaced JSON-RPC with Swagger or protobuf, plus additional material to cover the gaps and complications that that switch would involve.
vineyardmike · 6h ago
Proto has a full associated spec (gRPC) on communication protocols and structured definitions for them. MCP could easily have built upon these and gotten a lot “for free”. Generally gRPC is better than JsonRPC (see below).
I agree that swagger leaves a lot unplanned. I disagree about the local use case because (1) we could just run local HTTP servers easily and (2) I frankly assume the future of MCP is mostly remote.
Returning to JSON-RPC: it's a poorly executed RPC protocol. There's an excellent HackerNews thread on it, but the TLDR is that parsing JSON is expensive and complex, modern services are built from tons of tools (e.g. load balancers), and making those tools parse JSON is very expensive. Many people in that thread mention alternative ways to implement JSON-RPC, but those depend on new clients.
MCP supports streaming responses. You could implement that by polling and a session state, but that's an inefficient hack.
lsaferite · 5h ago
Eh... No, it does not support streaming responses.
I know this because I wish it did. You can approximate streaming responses by using progress notifications. If you want something like the LLM partial response streaming, you'll have to extend MCP with custom capabilities flags. It's totally possible to extend it in this way, but then it's non standard.
Perhaps you are alluding to the fact that it's bidirectional protocol (by spec at least).
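For reference, the progress-notification workaround looks roughly like this on the wire — a JSON-RPC notification per chunk, with the full result only in the final tool response. Field names follow the MCP spec's notifications/progress shape, but check the spec revision you target:

```python
import json

def progress_notification(token, progress, total=None, message=None):
    """Build a JSON-RPC notification approximating one streamed chunk."""
    params = {"progressToken": token, "progress": progress}
    if total is not None:
        params["total"] = total
    if message is not None:
        params["message"] = message
    return json.dumps({"jsonrpc": "2.0",
                       "method": "notifications/progress",
                       "params": params})

# A server would emit one of these per chunk, then return the
# complete result in the final tool response.
msg = json.loads(progress_notification("call-1", 1, total=3, message="chunk 1"))
assert msg["method"] == "notifications/progress"
```

As noted above, this approximates streaming but isn't the token-level partial-response streaming LLM APIs offer; that would need a custom capability extension.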
mockingloris · 8h ago
I read this thrice: ...When OpenAI bills $50,000 for last month’s API usage, can you tell which department’s MCP tools drove that cost? Which specific tool calls? Which individual users or use cases?...
It seems to be a game of catch-up for most things AI. That said, my school of thought is that certain technologies are just too big to be figured out early on - web frameworks, blockchain, ... - but the gap starts to shrink eventually. With AI, we'll just have to keep sharing ideas and caution like you have here.
Such very interesting times we live in.
ipython · 6h ago
I am torn. I see this argument and intellectually agree with it (that interfaces need to be more explicit). However it seems that every time there is a choice between “better” design and “good enough”, the “good enough” wins handily.
Multics vs Unix, xml based soap vs json based rest apis, xhtml’s failure, javascript itself, … I could keep going on.
So I’ve resigned myself to admitting that we are doomed to reimplement the “good enough” every time, and continue to apply bandaid after bandaid to gradually fix problems after we rediscover them, slowly.
antonvs · 5h ago
It's the old Worse is Better observation, which is 36 years old now.
It's been confirmed over and over since then. And I say that as someone who naturally gravitates towards "better" solutions.
cookiengineer · 5h ago
Obligatory minute of silence for XForms 2.0.
The world we could have lived in... working web form validations, working microdata...
holografix · 3h ago
Author disregards why none of these technologies are relevant in the modern web.
Sure, they might still find themselves in highly regulated industries where risk avoidance trumps innovation everyday, all day.
MCP is for _the web_ , it started with stdio only because Anthropic was learning lessons from building Claude Code.
Author also seems to expect that the result from MCP tool usage will feed directly into an LLM. This is preposterous and a recipe for disaster. Obviously you'd validate structured responses against a schema, check for harmful content, etc. etc.
desolate_muffin · 2h ago
I am not sure what you mean. Stateless RPCs, cache controls, client-side typing, tracing/observability, and bidirectional streaming are all things that strike me as very relevant to the modern web for all but the smallest of toy projects, let alone projects in serious engineering organizations.
> Author also seems to expect that the result from MCP tool usage will feed directly to an LLM
Isn't this exactly what MCP is for? Most tools I've come across are to feed context from other sources directly to the LLM. I believe this is the most common use-case for the protocol.
BLanen · 6h ago
As I've been saying.
MCP is not a protocol. It doesn't protocolize anything of use. It's just "here are some symbols, do with them whatever you want," leaving it at that but then advertising that as a feature of its universality. It provides about as much of a protocol as TCP does, but rebuilt on top of five OSI layers, again.
It's not a security issue, it's an ontological issue.
lsaferite · 5h ago
And yet, TCP powers the Internet.
That being said, MCP as a protocol has a fairly simple niche: provide context that can be fed to a model to perform some task. MCP covers the discovery process around presenting those tools and resources to an Agent in a standardized manner. And it includes several other aspects that are useful in this niche, things like "sampling" and "elicitations". Is it perfect? Not at all. But it's a step in the right direction.
The crowd saying "just point it at an OpenAPI service" does not seem to fully understand the current problem space. Can many LLMs extract meaning from un-curated API response messages? Sure. But they are also burning up context holding junk that isn't needed. Part of MCP is the acknowledgement that general API responses aren't the right way to feed the model the context it needs. MCP is supposed to take a concrete task, perform all the activities needed to gather the info or effect the change, then generate clean context meant for the LLM. If you design an OpenAPI service around those same goals, then it could easily be added to an Agent. You'd still need to figure out all the other aspects, but you'd be close. But at that point you aren't pointing an Agent at a random API, you're pointing it at a purpose-made API. And then you have to wonder: why not something like MCP that's designed for that purpose from the start?
I'll close by saying there are an enormous number of MCP servers out there that are poorly written, thin wrappers on general APIs, or bad in some other way. I attribute a lot of this to the rise of AI coding agents enabling people with poor comprehension of the space to crank out this... noise.
There are also great examples of MCP Servers to be found. They are the ones that have thoughtful designs, leverage the spec fully, and provide nice clean context for the Agent to feed to the LLM.
I can envision a future where we can simply point an agent at a series of OpenAPI services and the agent uses its models to self-assemble what we consider the MCP server today. Basically it would curate access to the APIs into a set of focused tools and the code needed to generate the final context. That's not quite where we are today. It's likely not far off, though.
upghost · 5h ago
So I'm in the "MCP is probably not a great idea" camp but I couldn't say "this is how it SHOULD be done", and the author makes great criticisms but falls short of actual suggestions. I'm assuming the author is not seriously recommending we go back to SOAP and I've never heard of CORBA. I've heard of gRPC but I can't tell if the author is saying it is good or bad.
Also Erlang uses RPCs for pretty much all "synchronous" interactions but it's pretty minimal in terms of ceremony. Seems pretty reliable.
So this is a serious question because hand rolling "40 years" of best practices seems hard, what should we be using for RPC?
zwaps · 6h ago
The author seems to fundamentally misunderstand how MCPs are going to be used and deployed.
This is really obvious when they talk about tracing and monitoring, which seem to be the main points of criticism anyway.
They bemoan that they can't trace across MCP calls, assuming somehow there would be a person administering all the MCPs.
Of course each system has tracing in whatever fashion fits its system.
They are just not the same system, nor owned by the same people let alone companies.
Same as monitoring cost. Oh, you can't know who racked up the LLM costs? Well of course you can; these systems are already in place and there are a million ways to do this. It has nothing to do with MCP.
Reading this, I think it's rather a blessing to start fresh and without the learnings of 40 years of failed protocols or whatever.
oblio · 5h ago
> without the learnings of 40 years of failed protocols or whatever
1. Lessons.
2. Fairly sure all of Google is built on top of protobuf.
- a standardized way in which the costs associated with an MCP tool call can be communicated to the MCP Client and reported to central tracking - nothing here I see, but it's a really good idea!
- serialization issues e.g. "the server might report a date in a format unexpected by the client" - this isn't wrong, but since the consumer of most tool responses is itself an LLM, there's a fair amount of mitigation here. And in theory an MCP Client can use an LLM to detect under-specified/ambiguous tool specifications, and could surface these issues to the integrator.
Now, I can't speak to the speed at which Maintainers and Core Maintainers are keeping up with the community's momentum - but I think it's meaningful that the community has momentum for evolving the specification!
I see this post in a highly positive light: MCP shows promise because you can iterate on these kinds of structured annotations, in the context of a community that is actively developing their MCP servers. Legacy protocols aren't engaging with these problems in the same way.
dend · 2h ago
One of the MCP Core Maintainers here. I want to emphasize that "If you see something, say something" very much works with the MCP community - we've recently standardized on the Spec Enhancement Proposal (SEP) process, and are also actively (and regularly) reviewing the community proposals with other Core Maintainers and Maintainers. If there is a gap - open an issue or join the MCP Contributor Discord server (open for aspiring and established contributors, by the way), where a lot of contributors hang out and discuss on-deck items.
cratermoon · 4h ago
I feel like the article addressed this response in the section titled 'The “Just Use This Library” Trap'
al2o3cr · 10h ago
IMO worrying about type-safety in the protocol when any string field in the reply can prompt-inject the calling LLM feels like putting a band-aid on a decapitation, but YMMV
ComputerGuru · 9h ago
They’re 100% orthogonal issues.
dragonwriter · 6h ago
> MCP discards this lesson, opting for schemaless JSON with optional, non-enforced hints.
Actually, MCP uses a normative TypeScript schema (and, from that, an autogenerated JSON Schema) for the protocol itself, and the individual tool calls also are specified with JSON Schema.
> Type validation happens at runtime, if at all.
That's not a consequence of MCP "opting for schemaless JSON" (which it factually does not); for tool calls, it's a consequence of MCP being a discovery protocol where the tools, and thus the applicable schemas, are discovered at runtime.
If you are using MCP as a way to wire up highly-static components, you can do discovery against the servers once they are wired up, statically build the clients around the defined types, and build your toolchain to raise errors if the discovery responses change in the future. But that's not really the world MCP is built for. Yes, that means that a toolchain concerned about schema enforcement needs to use and apply the relevant schemas at runtime. So, um, do that?
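A minimal sketch of what "apply the relevant schemas at runtime" can look like — a hand-rolled check covering only required keys and string types; a real client would use a full JSON Schema validator, and the "when" parameter name here is hypothetical:

```python
def validate(payload, schema):
    """Tiny JSON-Schema-style check: object type, required keys,
    and string-typed properties only."""
    if schema.get("type") == "object":
        if not isinstance(payload, dict):
            return False
        if any(key not in payload for key in schema.get("required", [])):
            return False
        for key, sub in schema.get("properties", {}).items():
            if key in payload and sub.get("type") == "string" \
                    and not isinstance(payload[key], str):
                return False
    return True

# Schema as it might be discovered at runtime via tools/list:
schema = {"type": "object",
          "properties": {"when": {"type": "string"}},
          "required": ["when"]}

assert validate({"when": "2024-05-01T12:00:00Z"}, schema)   # ISO string passes
assert not validate({"when": 1714564800}, schema)           # raw epoch fails cleanly
```

The point is that the failure is clean and happens before the payload reaches the model, which is exactly the behavior the article claims is impossible.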
self_awareness · 7h ago
What's new?
- Electron disregards 40 years of best deployment practices,
- Web disregards 40 years of best GUI practices,
- Fast CPUs and lots of RAM disregard 40 years of best software optimization techniques,
there are probably many more examples.
xg15 · 6h ago
Yeah, and all three have evidently made software more shitty. More profitable and easier to develop, sure, but also much more unpleasant to use.
wredcoll · 4h ago
vim is in fact easier to use than vi.
windows 10 is easier to use than windows 95.
osx is easier to use than mac.. whatever they named their old versions.
It goes on and on. I can have 50 browser tabs open at the same time, each one hosting a highly complicated app, ranging from media playback to chat rooms to custom statistical calculators. I don't need to install anything for any of these apps, I just type in a short string in my url bar. And they all just work, at the same time.
Things are in fact better now.
cnst · 4h ago
What did you expect of a Medium article? The stereotype is simply being reinforced here.
cnst · 4h ago
I think this article is missing the point that MCP is simply using the mainstream building blocks that have already regressed from what we've had previously, namely, JSON in place of proper RPC.
The ISO8601 v Unix epoch example seems very weak to me. I'd certainly expect any model to be capable of distinguishing between these things, so, it doesn't seem like a big deal that either one would be allowed in a JSON.
Honestly, my view that nothing of value ever gets published on medium, is strongly reinforced here.
Hackbraten · 3h ago
The fact that the model can recognize a Unix timestamp when it sees one doesn't really help you if it then tries to work around the API mismatch by helpfully converting the timestamp into a hallucinated ISO date.
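The mitigation is equally deterministic: coerce or reject at the boundary so the model never sees the ambiguous value in the first place. A sketch (assumes bare epoch numbers are UTC seconds — an assumption a real integration would have to pin down):

```python
from datetime import datetime, timezone

def coerce_timestamp(value):
    """Accept ISO-8601 text or a Unix epoch number; fail loudly on
    anything else instead of letting the LLM guess."""
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value, tz=timezone.utc)
    try:
        return datetime.fromisoformat(str(value).replace("Z", "+00:00"))
    except ValueError:
        raise ValueError(f"unrecognized timestamp: {value!r}")

assert coerce_timestamp(0).isoformat() == "1970-01-01T00:00:00+00:00"
assert coerce_timestamp("2024-05-01T12:00:00Z").year == 2024
```

With this at the tool boundary, a mismatched format becomes a raised error in the server's logs, not a hallucinated date in the model's output.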
cnst · 3h ago
But the models can already hallucinate in any case, so, how's that JSON's fault?
cratermoon · 4h ago
> is simply using the mainstream building blocks that have already regressed from what we've had previously, namely, JSON in place of proper RPC.
But why did the designers make that choice when they had any of half a dozen other RPC protocols to choose from?
> The ISO8601 v Unix epoch example seems very weak to me. I'd certainly expect any model to be capable of distinguishing between these things
What about the medical records issue? How is the model to distinguish a weight in kgs from one in pounds?
cnst · 4h ago
Why would a hype protocol use outdated concepts instead of the hype JSON?
Wouldn't medical records actually be better in JSON, because the field could expressly carry a "kg" or "lb" suffix within the value itself, or even in the name of the field, like "weight-in-kg" or "weight-in-lb"? This is actually the beauty of JSON compared to other formats where these things may end up as just a unitless integer.
The biggest problem with medical data would probably remain the human factor, where regardless of the format used by the machines and by MCP, the underlying data may already be incorrect or not coded properly, so, if anything, AI would likely have a better chance of interpreting the data correctly than the API provider blindly mislabelling unitless data.
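The unit-tagging idea in a sketch (the field shape here is hypothetical; the point is simply that the unit travels with the value, so a reader can convert or reject rather than guess):

```python
import json

KG_PER_LB = 0.45359237  # exact by international definition

def weight_in_kg(field):
    """Convert a unit-tagged weight field to kilograms."""
    if field["unit"] == "kg":
        return field["value"]
    if field["unit"] == "lb":
        return field["value"] * KG_PER_LB
    raise ValueError(f"unknown unit: {field['unit']!r}")

record = json.loads('{"weight": {"value": 72.5, "unit": "kg"}}')
assert weight_in_kg(record["weight"]) == 72.5
assert round(weight_in_kg({"value": 160, "unit": "lb"}), 2) == 72.57

# An unrecognized unit fails loudly instead of being silently
# interpreted as the wrong one.
```

Of course, as the comment notes, none of this helps if the upstream human mislabeled the unit in the first place.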
lowbloodsugar · 4h ago
MCP is what we needed right now, and what most people will need forever. Some of us will need more so we’ll write it.
calvinmorrison · 6h ago
MCP, aka, WSDL for REST
ElectricalUnion · 3h ago
Web Application Description Language is "WSDL for REST".
SigmundA · 6h ago
I thought that was OpenAPI?
gjsman-1000 · 8h ago
… or we’ll just invent MCP 2.0.
On that note: some of these "best practices" arguably haven't worked out. "Be conservative in what you send, liberal in what you accept" has turned even decent protocols into a dumpster fire, so why keep the charade going?
jmull · 7h ago
Right...
Failed protocols such as TCP adopted Postel's law as a guiding principle, and we all know how that worked out!
dragonwriter · 6h ago
A generalized guiding principle works in one particular use case, so this proves it is a good generalized guiding principle?
oblio · 4h ago
TCP is basically the only example of that principle that works and it only works because the protocol is low level and constrained. Almost all the implementations of that principle from close to the app layer are abominations we're barely keeping running.
gjsman-1000 · 5h ago
Survivor’s bias.
jmull · 5h ago
Circular argument.
rcarmo · 7h ago
I’d rather we ditched MCP and used something that could leverage Swagger instead….
jongjong · 3h ago
The stuff about the utility of machine-readable Web Service Description Language got me rolling my eyes.
WSDL is just pure nonsense. The idea that software would need to decide which API endpoints it needs on its own, is just profoundly misguided... Literally nobody and nothing ever reads the WSDL definitions; it's just poor man's documentation, at best.
LLMs only reinforce the idea that WSDL is a dumb idea because it turns out that even the machines don't care for your 'machine-friendly' format and actually prefer human-friendly formats.
Once you have an MCP tool working with a specific JSON API, it will keep working unless the server makes breaking changes to the API while in production, which is terrible practice. But anyway, if you use a server, it means you trust the server. Client-side validation is dumb; like people who need to put tape over their mouths because they don't trust themselves to follow through on their diet plans.
layer8 · 3h ago
WSDLs are routinely used to generate the language bindings for the SOAP actions. WSDL being language-agnostic ensures that bindings in different languages, and/on the client vs the server side, are consistent with each other.
WSDLs being available from the servers allows (a) clients to validate the requests they make before sending them to the server, and (b) developers (or in principle even AI) with access to the server to create a client without needing further out-of-band specifications.
jongjong · 3h ago
I think this is unwise. There are a lot of things clients need to take into account which cannot be described by WSDLs (e.g. timing-related or language-specific considerations that require careful thinking through).
I don't buy this idea that code should be generated automatically without a human involved (at least as a reviewer).
I also don't buy the idea that clients should validate their requests before sending to the server. The client's code should trust itself. I object to any idea of code (or any entity) not trusting itself. That is a flawed trust model.
> MCP discards this lesson, opting for schemaless JSON with optional, non-enforced hints. Type validation happens at runtime, if at all. When an AI tool expects an ISO-8601 timestamp but receives a Unix epoch, the model might hallucinate dates rather than failing cleanly. In financial services, this means a trading AI could misinterpret numerical types and execute trades with the wrong decimal precision. In healthcare, patient data types get coerced incorrectly, potentially leading to wrong medication dosing recommendations. Manufacturing systems lose sensor reading precision during JSON serialization, leading to quality control failures.
Having worked with LLMs every day for the past few years, it is easy to see every single one of these things happening.
I can practically see it playing out now: there is some huge incident of some kind, in some system or service with an MCP component somewhere, with some elaborate post-mortem revealing that some MCP server somewhere screwed up and output something invalid, the LLM took that output and hallucinated god knows what, its subsequent actions threw things off downstream, etc.
It would essentially be a new class of software bug caused by integration with LLMs, and it is almost sure to happen when you combine it with other sources of bug: human error, the total lack of error checking or exception handling that LLMs are prone to (they just hallucinate), a bunch of gung-ho startups "vibe coding" new services on top of the above, etc.
I foresee this being followed by a slew of Twitter folks going on endlessly about AGI hacking the nuclear launch codes, which will probably be equally entertaining.
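The "failing cleanly" alternative the quoted passage describes is straightforward in deterministic code. A minimal sketch (the function name and the sample values are illustrative, not from MCP or any real server): reject anything that isn't an ISO-8601 string outright, so a type mismatch becomes a hard error instead of something a model gets to improvise around.

```python
from datetime import datetime, timezone

def parse_timestamp_strict(value):
    """Accept only ISO-8601 strings; reject anything else (e.g. a Unix
    epoch int) with a hard error instead of letting a model guess."""
    if not isinstance(value, str):
        raise TypeError(
            f"expected ISO-8601 string, got {type(value).__name__}: {value!r}"
        )
    return datetime.fromisoformat(value)  # raises ValueError on malformed input

# A valid ISO-8601 string parses deterministically...
dt = parse_timestamp_strict("2024-05-01T12:00:00+00:00")
assert dt == datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)

# ...while a Unix epoch fails loudly rather than being silently coerced.
try:
    parse_timestamp_strict(1714564800)
    raise AssertionError("should have failed")
except TypeError:
    pass
```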
Before 2023 I always thought that all the bugs and glitches of technology in Star Trek were totally made up and would never happen this way.
Post-LLM I am absolutely certain that they will happen exactly that way.
I am not sure what LLM integrations have to do with engineering anymore, or why it makes sense to essentially put all your company's infrastructure under external control. And that is not even scratching the surface of the lack of reproducibility at every single step of the way.
It "somehow works" isn't engineering.
So very much like an LLM accessing multiple pieces of functionality across different tools and API endpoints (if you want to imagine it that way).
While it is seemingly very knowledgeable, it is rather stupid. It gets duped by nefarious actors or has a class of bugs that are elementary that put the crew into awkward positions.
Most professional software engineers might have previously looked at these scenarios as implausible, given that the "failure model" of current software is quite blunt, and especially given how far into the future the series took place.
Now we see that computational tasks are becoming less predictable, less straightforward, with cascading failures instead of blunt, direct failures. Interacting with an LLM when it starts to hallucinate might be compared to talking with a person in psychosis.
So you get things like this in the Star Trek universe: https://www.youtube.com/watch?v=kUJh7id0lK4
Which make a lot more sense, become a lot more plausible and a lot more relatable with our current implementations of AI/LLM's.
But it sure is fast.
The author even later says that MCP supports JSON Schema, but also claims "you can't generate type-safe clients," which is plainly untrue: there are plenty of JSON Schema code generators.
Claude will happily cast your int into a 2023 Toyota Yaris and keep on hallucinating things.
> Cast an integer into the type of a 2023 Toyota Yaris using Javascript
(GPT-4o mini)
> To cast an integer into the type of a 2023 Toyota Yaris in JavaScript, you would typically create a class or a constructor function that represents the Toyota Yaris. Then, you can create an instance of that class using the integer value. Here's an example of how you might do this:
Claude Code validated the response against the schema and did not pass the response to the LLM.
It works in this instance. On this run. It is not guaranteed to work next time. There is an error percentage here that makes it _INEVITABLE_ that eventually, with enough executions, the validation will pass when it should fail.
It will choose not to pass this to the validator, at some point in the future. It will create its own validator, at some point in the future. It will simply pretend like it did any of the above, at some point in the future.
This might be fine for your B2B use case. It is not fine for underlying infrastructure for a financial firm or communications.
Can you guarantee it will validate every time? Can you guarantee the way MCPs/tool calling are implemented (which is already an incredible joke that only python-brained developers would inflict upon the world) will always go through the validation layer? Are you even sure which part of Claude handles this validation? Sure, it didn't cast an int into a Toyota Yaris. Will it cast "70Y074" into one? Maybe a 2022 one. What if there are parsing rules embedded in a string; will it respect them every time? What if you use it outside of Claude Code and just ask nicely through the API, can you guarantee this validation still works? Or that they won't break it next week?
The whole point of it is, whichever LLM you're using is already too dumb to not trip when lacing its own shoes. Why you'd trust it to reliably and properly parse input badly described by a terrible format is beyond me.
Yes, to the extent you can guarantee the behavior of third-party software, you can (which you can't really guarantee no matter what spec the software supposedly implements, so the gaps aren't an MCP issue). "The app enforces schema compliance before handing the results to the LLM" is deterministic behavior in the traditional app that provides the toolchain, i.e. the interface between the tools (and the user) and the LLM; it is not non-deterministic behavior driven by the LLM. Hence, "before handing the results to the LLM."
> The whole point of it is, whichever LLM you're using is already too dumb to not trip when lacing its own shoes. Why you'd trust it to reliably and properly parse input badly described by a terrible format is beyond me.
The toolchain is parsing, validating, and mapping the data into the format preferred by the chosen model's prompt template. The LLM has nothing to do with doing that, because that by definition has to happen before it can see the data.
You aren't trusting the LLM.
The LLM has everything to do with that. The LLM is literally choosing to do that. I don't know why this point keeps getting missed or side-stepped.
It WILL, at some point in the future and given enough executions, as a matter of statistical certainty, simply not do that above, or pretend to do the above, or do something totally different at some point in the future.
How does the AI bypass the MCP layer to make the request? The assumption is (as I understand it) the AI says “I want to make MCP request XYZ with data ABC” and it sends that off to the MCP interface which does the heavy lifting.
If the MCP interface is doing the schema checks, and tossing errors as appropriate, how is the AI routing around this interface to bypass the schema enforcement?
The MCP interface (Claude Code in this case) is doing the schema checks. Claude Code will refuse to provide the result to the LLM if it does not pass the schema check, and the LLM has no control over that.
No, the LLM doesn't control on a case-by-case basis what the toolchain does between the LLM putting a tool call request in an output message and the toolchain calling the LLM afterwards.
If the toolchain is programmed to always validate tool responses against the JSON schema provided by MCP server before mapping into the LLM prompt template and calling the LLM again to handle the response, that is going to happen 100% of the time. The LLM doesn't choose it. It CAN'T because the only way it even knows that the data has come back from the tool call is that the toolchain has already done whatever it is programmed to do, ending with mapping the response into a prompt and calling the LLM again.
Even before MCPs or even models specifically trained and with vendor-provided templates for tool calling (but after the ReAct architecture was described), it was like a weekend project to implement a basic framework supporting tooling calling around a local or remote LLM. I don't think you need to do that to understand how silly the claim that the LLM controls what the toolchain does with each response and might make it not validate it is, but certainly doing it will give you a visceral understanding of how silly it is.
The pieces here are:
* Claude Code, a Node (Javascript) application that talks to MCP server(s) and the Claude API
* The MCP server, which exposes some tools through stdin or HTTP
* The Claude API, which is more structured than "text in, text out".
* The Claude LLM behind the API, which generates a response to a given prompt
Claude Code is a Node application. CC is configured in JSON with a list of MCP servers. When CC starts up, CC's JavaScript initialises each server and as part of that gets a list of callable functions.
When CC calls the LLM API with a user's request, it's not just "here are the user's words, do it." There are multiple slots in the request object, one of which is a "tools" block, a list of the tools that can be called. Inside the API, I imagine this is packaged into a prefix context string like "you have access to the following tools: tool(args) ...". The LLM API probably has a bunch of prompts it runs through (figure out what type of request the user has made, maybe using different prompts to make different types of plan, etc.), and somewhere along the way the LLM might respond with a request to call a tool.
The LLM API call then returns the tool call request to CC, in a structured "tool_use" block separate from the freetext "hey good news, you asked a question and got this response". The structured block means "the LLM wants to call this tool."
CC's JS then calls the server with the tool request and gets the response. It validates the response (e.g., JSON schemas) and then calls the LLM API again bundling up the success/failure of the tool call into a structured "tool_result" block. If it validated and was successful, the LLM gets to see the MCP server's response. If it failed to validate, the LLM gets to see that it failed and what the error message was (so the LLM can try again in a different way).
The idea is that if a tool call is supposed to return a CarMakeModel string ("Toyota Tercel") and instead returns an int (42), JSON Schemas can catch this. The client validates the server's response against the schema, and calls the LLM API with a structured error instead of the invalid value.
So the LLM isn't choosing to call the validator; it's the deterministic JavaScript that is Claude Code that chooses to call the validator. There are plenty of ways for this to go wrong: the client (Claude Code) has to validate; int vs string isn't the same as "is a valid timestamp/CarMakeModel/etc"; if you helpfully put the thing that failed into the error message ("Expected string, got integer (42)") then the LLM gets 42 and might choose to interpret that as a CarMakeModel if it's having a particularly bad day; the LLM might say "well, that didn't work, but let's assume the answer was Toyota Tercel, a common car make and model", ... We're reaching here, yet these are possible.
But the basic flow has validation done in deterministic code and hiding the MCP server's invalid responses from the LLM. The LLM can't choose not to validate. You seemed to be saying that the LLM could choose not to validate, and your interlocutor was saying that was not the case.
I hope this helps!
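The flow described above can be sketched in a few lines. This is a hand-rolled illustration, not Claude Code's actual implementation or the real JSON Schema library; it checks only the `type` keyword. But it shows the key property under dispute: the validation branch is ordinary deterministic code that runs before any result is mapped into a prompt, so the model never gets a vote on whether it runs.

```python
# Minimal subset of JSON Schema type checking (illustrative only).
TYPE_MAP = {
    "string": str, "integer": int, "number": (int, float),
    "boolean": bool, "object": dict, "array": list,
}

def matches_schema(data, schema):
    expected = TYPE_MAP[schema["type"]]
    if schema["type"] in ("integer", "number") and isinstance(data, bool):
        return False  # bool is a subclass of int in Python; exclude it
    return isinstance(data, expected)

def build_tool_result(raw_response, schema):
    """Deterministic toolchain step: map a tool's raw response into the
    message the LLM will see. The LLM cannot skip or influence this."""
    if matches_schema(raw_response, schema):
        return {"type": "tool_result", "is_error": False, "content": raw_response}
    return {"type": "tool_result", "is_error": True,
            "content": "tool response failed schema validation"}

schema = {"type": "string"}  # tool declares it returns a CarMakeModel string
assert build_tool_result("Toyota Tercel", schema)["is_error"] is False
assert build_tool_result(42, schema)["is_error"] is True
```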
I can't guarantee that behavior will remain the same any more than with any other software. But all this happens before the LLM is even involved.
> The whole point of it is, whichever LLM you're using is already too dumb to not trip when lacing its own shoes. Why you'd trust it to reliably and properly parse input badly described by a terrible format is beyond me.
You are describing why MCP supports JSON Schema. It requires parsing & validating the input using deterministic software, not LLMs.
No. It is not. You are still misunderstanding how this works. It is "choosing" to pass this to a validator or some other tool, _for now_. As a matter of pure statistics, it will simply not do this at some point in the future on some run.
It is inevitable.
Or write a simple MCP server and a client that uses it. FastMCP is easy: https://gofastmcp.com/getting-started/quickstart
You are quite wrong. The LLM "chooses" to use a tool, but the input (provided by the LLM) is validated with JSON Schema by the server, and the output is validated by the client (Claude Code). The output is not provided back to the LLM if it does not comply with the JSON Schema, instead an error is surfaced.
It is absolutely possible to do this, and to generate client code which complies with ISO-8601 in JS/TS. Large amounts of financial services would not work if this was not the case.
See the c# support for ISO-8601 strings: https://learn.microsoft.com/en-us/dotnet/standard/base-types...
`DateTime` is not an ISO-8601 type. It can _parse_ an ISO-8601 formatted string.
And even past that, there are Windows-specific idiosyncrasies with how the `DateTime` class implements the parsing of these strings and how it stores the resulting value.
This is exactly the point: a string is just a data interchange format in the context of a DateTime, and C# provides (as far as I can tell) a complete way of accessing the ISO-8601 specification on the language object. It also supports type-safe generation of clients and client object (or struct) generation from the ISO-8601 string format.
> And even past that, there are Windows-specific idiosyncrasies with how the `DateTime` class implements the parsing of these strings and how it stores the resulting value.
Not really. The Windows statements in the article (and I use this on Linux for financial services software) are related to automated setting of the preferences for generated strings. All of these may be set within the code itself.
Related but distinct from serialization.
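Python's stdlib shows the same separation the thread is pointing at: the parsed object carries no memory of the wire format, and serializing back out is an explicit, separate step. A small sketch (not tied to C#'s `DateTime`):

```python
from datetime import datetime

# Parsing: the ISO-8601 string is just an interchange format. The
# resulting object stores a point in time, not the original text.
dt = datetime.fromisoformat("2024-05-01T12:00:00+00:00")

# Serialization is a distinct operation, and nothing forces it to
# reproduce the original bytes (e.g. "Z" and "+00:00" are equivalent
# on the wire but only one spelling comes back out).
assert dt.isoformat() == "2024-05-01T12:00:00+00:00"
```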
The merchants of complexity are disappointed. It turns out that even machines don't care for 'machine-readable' formats; even the machines prefer human-readable formats.
The only entities on this planet who appreciate so-called 'machine-readability' are bureaucrats; and they like it for the same reason that they like enterprise acronyms... Literally the opposite of readability.
LLMs are basically automating PEBKAC
May have changed, but unlikely. I worked with medical telemetry as a young man and it was impressed upon me thoroughly how important parsing timestamps correctly was. I have a faint memory, possibly false, of this being the first time I wrote unit tests (and without the benefit of a test framework).
We even accounted for lack of NTP by recalculating times off of the timestamps in their message headers.
And the reasons I was given were incident review as well as malpractice cases. A drug administered three seconds before a heart attack starts is a very different situation than one administered eight seconds after the patient crashed. We saw recently with the British postal service how lives can be ruined by bad data, and in medical data a minute is a world of difference.
I also work in healthcare, and we've seen HL7v2 messages with impossible timestamps. (E.g., in the spring-forward gap.)
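Those "impossible" spring-forward timestamps can be detected deterministically with a round-trip check: a naive local time falls in the gap exactly when localizing it, converting to UTC, and converting back does not reproduce it. A sketch using Python's `zoneinfo` (the function name and the 2024-03-10 US Eastern example are illustrative):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def exists_in_zone(naive, tz_name):
    """True if this naive local time actually occurred in the zone;
    times in a spring-forward gap fail the UTC round-trip."""
    tz = ZoneInfo(tz_name)
    localized = naive.replace(tzinfo=tz)
    roundtrip = localized.astimezone(ZoneInfo("UTC")).astimezone(tz)
    return roundtrip.replace(tzinfo=None) == naive

# 2:30 AM never happened on 2024-03-10 in US Eastern (clocks jumped 2 -> 3).
assert not exists_in_zone(datetime(2024, 3, 10, 2, 30), "America/New_York")
assert exists_in_zone(datetime(2024, 3, 10, 3, 30), "America/New_York")
```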
As RPC mechanisms go, HTTP is notable for how few of the classic blunders it made in version 1.0 of the spec. Clock skew correction is just my favorite. Technically it exists for cache directives, but it's invaluable for coordination across machines. There are reasons HTTP 2.0 waited decades to happen. It just mostly worked.
this is like saying "HTTP doesn't do json validation", which, well, yeah.
The stuff about type validation is incorrect. You don't need client-side validation. You shouldn't be using APIs you don't trust as tools and you can always add instructions about the LLM's output format to convert to different formats.
MCP is not the issue. The issue is that people are using the wrong tools or their prompts are bad.
If you don't like the format of an MCP tool and don't want to give formatting instructions to the LLMs, you can always create your own MCP service which outputs data in the correct format. You don't need the coercion to happen on the client side.
When desktop OSes came out, hardware resources were scarce, so all the desktop OSes (DOS, Windows, MacOS) forgot all the lessons from Unix: multi-user support, preemptive multitasking, etc. 10 years later PC hardware was faster than workstations from the 90s, yet we're still stuck with OSes riddled with limitations that stopped making sense in the 80s.
When smartphones came out there was this gold rush and hardware resources were scarce so OSes (iOS, Android) again forgot all the lessons. 10 years later mobile hardware was faster than desktop hardware from the 00s. We're still stuck with mistakes from the 00s.
AI basically does the same thing. It's all led by very bright 20- and 30-year-olds who weren't even born when Windows was first released.
Our field is doomed under a Cascade of Attention-Deficit Teenagers: https://www.jwz.org/doc/cadt.html (copy paste the link).
It's all gold rushes, and nobody does Dutch urban infrastructure design over decades. Which makes sense, as this is all driven by the US, where long-term planning is anathema.
Of course this keeps happening
If an LLM can be shown to be useful 80% of the time to the JS mindset this is fine, and the remaining 20% can be resolved once we're being paid for the rest, Pareto principle be damned.
Mostly, no. Whether it's the client sending (statically) bad data or the server returning (statically) bad data, schema validation on the other end (assuming somehow it is allowed by the toolchain on the sending end) should reject it before it gets to the custom code of the MCP server or MCP client.
For arguments that are the right type but wrong because of the state of the universe, yes, the server receiving it should send a useful error message back to the client. But that's a different issue.
At some point we have to decide as a community of engineers that we have to stop building tools that are little more than loaded shotguns pointed at our own feet.
GIEMGO: garbage in, even more garbage out.
Ironically, it's achieved this - but that's an indictment of USB-C, not an accomplishment of MCP. Just like USB-C, MCP is a nigh-universal connector with very poorly enforced standards for what actually goes across it. MCP's inconsistent JSON parsing and lack of protocol standardization is closely analogous to USB-C's proliferation of cable types (https://en.wikipedia.org/wiki/USB-C#Cable_types); the superficial interoperability is a very leaky abstraction over a much more complicated reality, which IMO is worse than just having explicitly different APIs/protocols.
Previously, you could reasonably expect a USB-C on a desktop/laptop of an Apple Silicon device, to be USB4 40Gbps Thunderbolt, capable of anything and everything you may want to use it for.
Now, some of them are USB3 10Gbps. Which ones? Gotta look at the specs or tiny icons, I guess?
Apple could have chosen to have the self-documenting USB-A ports to signify the 10Gbps limitation of some of these ports (conveniently, USB-A is limited to exactly 10Gbps, making it perfect for the use-case of having a few extra "low-speed" ports at very little manufacturing cost), but instead, they've decided to further dilute the USB-C brand. Pure innovation!
With the end user likely still having to use USB-C to USB-A adapters anyway, because the majority of thumb drives, keyboards and mice still require a USB-A port — even the USB-C ones that use USB-C on the kb/mice themselves. (But, of course, that's all irrelevant because you can always spend 2x+ as much for a USB-C version of any of these devices, and the fact that the USB-C variants are less common or inferior to USB-A is of course irrelevant when hype and fanaticism are more important than utility and usability.)
Unfortunately, no one understood SOAP back then.
(Additional context: Maintaining a legacy SOAP system. I have nothing good to say about SOAP and it should serve as a role model for no one)
It doesn't take very long for people to start romanticizing things as soon as they're not in vogue. Even when the painfulness is still fresh in memory, people lament over how stupid new stuff is. Well I'm not a fan of schemaless JSON APIs (I'm one of those weird people that likes protobufs and capnp much more) but I will take 50 years of schemaless JSON API work over a month of dealing with SOAP again.
/“xml is like violence, if it’s not working just use more!”
No.
SOAP uses that, but SOAP involves a whole lot of spec about how you do that, and that's even before treating SOAP (as the article seems to) as meaning SOAP plus the set of WS-* standards built around it.
Unfortunately as usual when a new technology cycle comes, everything gets thrown away, including the good parts.
And I actually like XML-based technologies. XML Schema is still unparalleled in its ability to compose and verify the format of multiple document types. But man, SOAP was such a beast for no real reason.
Instead of a simple spec for remote calls, it turned into a spec that described everything and nothing at the same time. SOAP supported all kinds of transport protocols (SOAP over email? Sure!), RPC with remote handles (like CORBA), regular RPC, self-describing RPC (UDDI!), etc. And nothing worked out of the box, because the nitty-gritty details of authentication, caching, HTTP response code interoperability and other "boring" stuff were just left as an exercise to the reader.
Part of this is the nature of XML. There's a million ways to do things. Should some data be parsed as an attribute of the tag, or should it be another tag? Perhaps the data should be in the body between the tags? HTML, XML's cousin via SGML, has this problem; e.g. you can seriously specify <font face="Arial">text</font> rather than have the font as a property of the wrapping tag. There's a million ways to specify everything and anything, and that's why it makes a terrible data format. The reader and writer must have the exact same schema in mind, and there's no way to have a default when there's simply no particular correct way to do things in XML. So everything had to be very, very precisely specified, to the point that it added huge amounts of work when a non-XML format with decent defaults would not have that issue.
This became a huge problem for SOAP and why I hate it. Every implementation had different default ways of handling even the simplest data structures passed between them, and they were never compatible unless you took weeks of time to specify the schema down to a fine-grained level.
In general XML is problematic due to the lack of clear canonical ways of doing pretty much anything. You might say "but I can specify it with a schema," and to that I say "my problem with XML is that you need a schema for even the simplest use case in the first place."
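The "no canonical way" complaint is easy to demonstrate. All three documents below encode the same single fact, every one parses fine, and a consumer needs different extraction code for each — which is exactly why reader and writer must agree on a schema even in the trivial case (stdlib sketch; the `<car>` example is made up):

```python
import xml.etree.ElementTree as ET

# Three legal XML encodings of the same fact ({"make": "Toyota"} in JSON).
as_attribute = ET.fromstring('<car make="Toyota"/>')
as_child     = ET.fromstring('<car><make>Toyota</make></car>')
as_body      = ET.fromstring('<car>Toyota</car>')

# Each variant needs different extraction code:
assert as_attribute.get("make") == "Toyota"
assert as_child.find("make").text == "Toyota"
assert as_body.text == "Toyota"
```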
But parts of XML infrastructure were awesome. I could define a schema for the data types, and have my IDE auto-complete and validate the XML documents as I typed them. I could also validate the input/output data and provide meaningful errors.
And yeah, I also worked with XML and got burned many times by small incompatibilities that always happen due to its inherent complexity. If XML were just a _bit_ simpler, it could have worked so much better.
Generally it worked very well when both ends were written in the same programming language and was horseshit if they weren’t. No wonder Microsoft liked SOAP so much.
IBM thought they were good at lockin, until Bill Gates came along.
I've been on the other side of high-feature serialization protocols, and even at large tech companies, something like migrating to gRPC is a multi-year slog that can even fail a couple of times because it asks so much of you.
MCP, at its core, is a standardization of a JSON API contract, so you don't have to do as much post-training to generate various tool calling style tokens for your LLM.
I think you meant that is why JSON won instead of XML?
Not just XML, but a lot of other serialization formats and standards, like SOAP, protobuf in many cases, yaml, REST, etc.
People say REST won, but tell me how many places actually implement REST or just use it as a stand-in term for casual JSON blobs to HTTP URLs?
Now, YAML has quite a few shortcomings compared to JSON (if you don't believe me, look at its handling of the string "no", discussed on HN), so, at least to me, it's obvious why JSON won.
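The handling of "no" referenced here is YAML 1.1's implicit typing, often called the Norway problem. A sketch of what a YAML 1.1 loader does with unquoted scalars (YAML 1.2 dropped these forms, but widely used loaders still default to 1.1 behavior):

```yaml
# YAML 1.1 implicit typing: unquoted no/yes/on/off load as booleans.
country: no          # loads as the boolean false, not the string "no"
country_quoted: "no" # quoting forces a string
```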
SOAP, don't get me started on that, it's worth less than XML, protobuf is more efficient but less portable, etc.
That's backwards reasoning. XML was too complicated, so they decided on a simpler JSON.
And its complexity and size now are rivaling the specs of the good old XML-infused times.
Didn't get that job; one of the interviewers asked me to write concurrent code and didn't like my answer, but his had a race condition in it, and I was unsuccessful in convincing him he was wrong. He was relying on preemption not occurring on a certain instruction (or multiprocessing not happening). During my tenure at the job I did take, the real flaws in the Java Memory Model came out, and his answer became very wrong and mine only slightly.
So it baked in core assumptions that the network is transparent, reliable, and symmetric. So you could create an object on one machine, pass a reference to it to another machine, and everything is supposed to just work.
Which is not what happens in the real world, with timeouts, retries, congested networks, and crashing computers.
Oh, and CORBA C++ bindings had been designed before the STL was standardized. So they are a crawling horror; other languages' bindings were better.
On a more general note, I see in many critical comments here what I perceive to be a category error. Using JSON to pass data between web client and server, even in more complex web apps, is not the same thing as supporting two-way communications between autonomous software entities that are tasked to do something, perhaps something critical. There could be millions of these exchanges in some arbitrarily short time period; thus any possibility of error is multiplied accordingly, and the effect of any error could cascade if it does not fail early. I really don't believe this is a case where "worse is better." To use an analogy: yes, everyday English is a versatile language that works great for most use cases; but when you really need to nail things down, with no tolerance for ambiguity, you get legalese or some other jargon. Or CORBA, or gRPC, etc.
- SOAP - interop needs support for document-style or RPC-style bindings between systems, or a combination; XML and schemas are also horribly verbose.
- CORBA - libraries and framework were complex; modern languages at the time avoided them in favor of simpler standards (e.g. Java's Jini)
- gRPC - designed for speed, not readability; requires mappings.
It's telling that these days REST and JSON (via req/resp, webhooks, or even streaming) are the modern backbone of RPC. The above standards are either shoved aside, or in gRPC's case used only where extreme throughput is needed.
Since REST and JSON are the plat du jour, MCP probably aligns with that design paradigm rather than the dated legacy protocols.
The greater problem is industry misunderstanding and misalignment with what agents are and where they are headed.
Web platforms of the world believe agents will be embedded in networked distributed infrastructure. So we should ship an MCP platform in our service mesh for all of the agents running in containers to connect to.
I think this is wrong, and continues to be butchered as the web pushes a hard narrative that we need to enable web-native agents & their sdks/frameworks that deploy agents as conventional server applications. These are not agents nor the early evolutionary form of them.
Frontier labs will be the only providers of the actual agentic harnesses. And we are rapidly moving to computer-use agents - MCP servers were intended to serve as single-instance deployments for single harnesses, i.e. a single MCP server on my desktop for my Claude Desktop.
> In financial services, this means a trading AI could misinterpret numerical types and execute trades with the wrong decimal precision.
If you are letting an LLM execute trades with no guardrails then it is a ticking time bomb no matter what protocol you use for the tool calls.
> When an AI tool expects an ISO-8601 timestamp but receives a Unix epoch, the model might hallucinate dates rather than failing cleanly.
If your process breaks because of a hallucinated date -- don't use an LLM for it.
You'd still need basically the entire existing MCP spec to cover the use cases if it replaced JSON-RPC with Swagger or protobuf, plus additional material to cover the gaps and complications that that switch would involve.
I agree that swagger leaves a lot unplanned. I disagree about the local use case because (1) we could just run local HTTP servers easily and (2) I frankly assume the future of MCP is mostly remote.
Returning to JSON-RPC: it's a poorly executed RPC protocol. Here is an excellent Hacker News thread on it, but the TL;DR is that parsing JSON is expensive and complex; we have tons of tools (e.g. load balancers) that make up modern services, and making those tools parse JSON is very expensive. Many people in the thread below mention alternative ways to implement JSON-RPC, but that depends on new clients.
https://news.ycombinator.com/item?id=34211796
I know this because I wish it did. You can approximate streaming responses by using progress notifications. If you want something like the LLM partial response streaming, you'll have to extend MCP with custom capabilities flags. It's totally possible to extend it in this way, but then it's non standard.
Perhaps you are alluding to the fact that it's a bidirectional protocol (by spec at least).
It seems to be a game of catch-up for most things AI. That said, my school of thought is that certain technologies are just too big to be figured out early on - web frameworks, blockchain, ... - the gap starts to shrink eventually. With AI, we'll just have to keep sharing ideas and caution like you have here. Such very interesting times we live in.
Multics vs Unix, xml based soap vs json based rest apis, xhtml’s failure, javascript itself, … I could keep going on.
So I’ve resigned myself to admitting that we are doomed to reimplement the “good enough” every time, and continue to apply bandaid after bandaid to gradually fix problems after we rediscover them, slowly.
https://en.m.wikipedia.org/wiki/Worse_is_better
It's been confirmed over and over since then. And I say that as someone who naturally gravitates towards "better" solutions.
The world we could have lived in... working web forms validations, working microdata...
Sure, they might still find themselves in highly regulated industries where risk avoidance trumps innovation every day, all day.
MCP is for _the web_ , it started with stdio only because Anthropic was learning lessons from building Claude Code.
Author also seems to expect that the result from MCP tool usage will feed directly into an LLM. This is preposterous and a recipe for disaster. Obviously you'd validate the structured response against a schema, check for harmful content, etc. etc.
> Author also seems to expect that the result from MCP tool usage will feed directly to an LLM
Isn't this exactly what MCP is for? Most tools I've come across are to feed context from other sources directly to the LLM. I believe this is the most common use-case for the protocol.
MCP is not a protocol. It doesn't protocolize anything of use. It's just "here are some symbols, do with them whatever you want," leaving it at that but then advertising that as a feature of its universality. It provides almost just as much of a protocol as TCP does, but rebuilt on five OSI layers, again.
It's not a security issue; it's an ontological issue.
That being said, MCP as a protocol has a fairly simple niche: provide context that can be fed to a model to perform some task. MCP covers the discovery process around presenting those tools and resources to an Agent in a standardized manner. And it includes several other aspects that are useful in this niche, things like "sampling" and "elicitations". Is it perfect? Not at all. But it's a step in the right direction.
The crowd saying "just point it at an OpenAPI service" does not seem to fully understand the current problem space. Can many LLMs extract meaning from un-curated API response messages? Sure. But they are also burning up context holding junk that isn't needed. Part of MCP is the acknowledgement that general API responses aren't the right way to feed the model the context it needs. MCP is supposed to take a concrete task, perform all the activities needed to gather the info or effect the change, then generate clean context meant for the LLM. If you design an OpenAPI service around those same goals, then it could easily be added to an Agent. You'd still need to figure out all the other aspects, but you'd be close. But at that point you aren't pointing an Agent at a random API, you're pointing it at a purpose-made API. And then you have to wonder: why not something like MCP that's designed for that purpose from the start?
I'll close by saying there are an enormous number of MCP Servers out there that are poorly written, thin wrappers on general APIs, or have some other bad aspects. I attribute a lot of this to the rise of AI Coding Agents enabling people with poor comprehension of the space to crank out this... Noise.
There are also great examples of MCP Servers to be found. They are the ones that have thoughtful designs, leverage the spec fully, and provide nice clean context for the Agent to feed to the LLM.
I can envision a future where we can simply point an agent at a series of OpenAPI services and the agent uses its models to self-assemble what we consider the MCP server today. Basically it would curate access to the APIs into a set of focused tools and the code needed to generate the final context. That's not quite where we are today. It's likely not far off though.
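The curation idea — wrap a verbose general API and return only the fields the model actually needs — can be sketched like this (the endpoint and field names here are purely hypothetical):

```python
# Sketch: a purpose-built tool wraps a general API and returns only the
# fields the model needs, instead of burning context on the raw response.

RAW_API_RESPONSE = {  # stand-in for a verbose upstream API payload
    "id": "ord_123",
    "status": "shipped",
    "carrier": {"name": "ACME", "scac": "ACMI", "phone": "555-0100"},
    "events": [{"ts": "2025-06-01T12:00:00Z", "code": "DEP"}],
    "_links": {"self": "/orders/ord_123", "customer": "/customers/42"},
    "etag": 'W/"abc"',
}

def order_status_tool(order_id: str) -> dict:
    """Focused tool result: just what the model needs to answer
    'where is my order?' -- no hypermedia links, no ETags."""
    raw = RAW_API_RESPONSE  # in reality: fetch /orders/{order_id}
    return {
        "order_id": raw["id"],
        "status": raw["status"],
        "carrier": raw["carrier"]["name"],
        "last_event_at": raw["events"][-1]["ts"] if raw["events"] else None,
    }

curated = order_status_tool("ord_123")
assert "_links" not in curated and "etag" not in curated
```

The tool's response schema becomes a deliberate design artifact rather than whatever the upstream API happens to emit.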
Also Erlang uses RPCs for pretty much all "synchronous" interactions but it's pretty minimal in terms of ceremony. Seems pretty reliable.
So this is a serious question because hand rolling "40 years" of best practices seems hard, what should we be using for RPC?
This is really obvious when they talk about tracing and monitoring, which seem to be the main points of criticism anyway.
They bemoan that they can't trace across MCP calls, assuming somehow there would be a person administering all the MCPs. Of course each system has tracing in whatever fashion fits it. They are just not the same system, nor owned by the same people, let alone the same companies.
Same with monitoring cost. Oh, you can’t know who racked up the LLM costs? Well of course you can; these systems are already in place and there are a million ways to do this. It has nothing to do with MCP.
Reading this, I think it's rather a blessing to start fresh and without the learnings of 40 years of failed protocols or whatever
1. Lessons.
2. Fairly sure all of Google is built on top of protobuf.
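For a concrete sense of what a typed contract buys you, here's an illustrative protobuf service definition (the message and service names are made up, not any real API):

```protobuf
// Hypothetical trading RPC; names are illustrative only.
syntax = "proto3";

message TradeRequest {
  string symbol = 1;
  // Fixed-point price: integer units plus a decimal exponent, so no
  // floating-point precision is lost in serialization.
  int64 price_units = 2;
  sint32 price_exponent = 3;
  // Timestamp format is part of the contract, not a runtime guess.
  string iso8601_timestamp = 4;
}

message TradeResponse {
  string trade_id = 1;
}

service Trading {
  rpc ExecuteTrade(TradeRequest) returns (TradeResponse);
}
```

Both sides generate code from the same schema, so a type mismatch is a compile-time error rather than something the model has to paper over.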
Point-by-point for the article's gripes:
- distributed tracing/telemetry - open discussion at https://github.com/modelcontextprotocol/modelcontextprotocol...
- structured tool annotation for parallelizability/side-effects/idempotence - this actually already exists at https://modelcontextprotocol.io/specification/2025-06-18/sch... but it's not well documented in https://modelcontextprotocol.io/specification/2025-06-18/ser... - someone should contribute to improving this!
- a standardized way in which the costs associated with an MCP tool call can be communicated to the MCP Client and reported to central tracking - nothing here I see, but it's a really good idea!
- serialization issues e.g. "the server might report a date in a format unexpected by the client" - this isn't wrong, but since the consumer of most tool responses is itself an LLM, there's a fair amount of mitigation here. And in theory an MCP Client can use an LLM to detect under-specified/ambiguous tool specifications, and could surface these issues to the integrator.
Now, I can't speak to the speed at which Maintainers and Core Maintainers are keeping up with the community's momentum - but I think it's meaningful that the community has momentum for evolving the specification!
I see this post in a highly positive light: MCP shows promise because you can iterate on these kinds of structured annotations, in the context of a community that is actively developing their MCP servers. Legacy protocols aren't engaging with these problems in the same way.
Actually, MCP uses a normative TypeScript schema (and, from that, an autogenerated JSON Schema) for the protocol itself, and the individual tool calls also are specified with JSON Schema.
> Type validation happens at runtime, if at all.
That's not a consequence of MCP "opting for schemaless JSON" (which it factually does not); it is, for tool calls, a consequence of MCP being a discovery protocol where the tools, and thus the applicable schemas, are discovered at runtime.
If you are using MCP as a way to wire up highly-static components, you can do discovery against the servers once they are wired up, statically build the clients around the defined types, and build your toolchain to raise errors if the discovery responses change in the future. But that's not really the world MCP is built for. Yes, that means that the toolchain needs, if it is concerned about schema enforcement, to use and apply the relevant schemas at runtime. So, um, do that?
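A minimal sketch of "do that" in Python — a hand-rolled check over a tiny JSON Schema subset, applied to a tool result before it ever reaches the model (the tool and schema here are hypothetical; a real client would use a full validator library):

```python
# Sketch: validate a tool result against the schema discovered at runtime.
# Supports only a tiny JSON Schema subset for illustration.

TYPE_MAP = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}

def validate(instance, schema):
    """Raise TypeError/KeyError instead of letting bad data reach the model."""
    kind = schema["type"]
    # bool is a subclass of int in Python, so guard it explicitly.
    if isinstance(instance, bool) and kind in ("integer", "number"):
        raise TypeError(f"expected {kind}, got boolean")
    if not isinstance(instance, TYPE_MAP[kind]):
        raise TypeError(f"expected {kind}, got {type(instance).__name__}")
    if kind == "object":
        for key in schema.get("required", []):
            if key not in instance:
                raise KeyError(f"missing required property: {key}")
        for key, sub in schema.get("properties", {}).items():
            if key in instance:
                validate(instance[key], sub)

# Schema as it might be discovered from a (hypothetical) weather tool:
result_schema = {
    "type": "object",
    "required": ["timestamp", "temperature_c"],
    "properties": {
        "timestamp": {"type": "string"},
        "temperature_c": {"type": "number"},
    },
}

validate({"timestamp": "2025-01-01T00:00:00Z", "temperature_c": 3.5},
         result_schema)  # conforming result passes silently
```

A result that reports the timestamp as a Unix epoch integer would then fail loudly at the client boundary instead of being hallucinated around.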
- Electron disregards 40 years of best deployment practices,
- Web disregards 40 years of best GUI practices,
- Fast CPUs and lots of RAM disregards 40 years of best software optimization techniques,
there are probably many more examples.
windows 10 is easier to use than windows 95.
osx is easier to use than mac.. whatever they named their old versions.
It goes on and on. I can have 50 browser tabs open at the same time, each one hosting a highly complicated app, ranging from media playback to chat rooms to custom statistical calculators. I don't need to install anything for any of these apps, I just type in a short string in my url bar. And they all just work, at the same time.
Things are in fact better now.
The ISO8601 v Unix epoch example seems very weak to me. I'd certainly expect any model to be capable of distinguishing between these things, so, it doesn't seem like a big deal that either one would be allowed in a JSON.
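For what it's worth, disambiguating the two is mechanical even before a model sees the value — a sketch using only the standard library:

```python
from datetime import datetime, timezone

def parse_timestamp(value):
    """Accept either a Unix epoch (int/float) or an ISO-8601 string.
    A deterministic parser makes the ambiguity a non-issue before the
    value ever reaches the model."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return datetime.fromtimestamp(value, tz=timezone.utc)
    # fromisoformat doesn't accept a trailing "Z" before Python 3.11,
    # so normalize it to an explicit offset.
    return datetime.fromisoformat(str(value).replace("Z", "+00:00"))

a = parse_timestamp(1735689600)
b = parse_timestamp("2025-01-01T00:00:00Z")
assert a == b  # both denote the same instant
```

Of course, this only works because the two encodings are structurally distinguishable; it doesn't rescue genuinely ambiguous cases like an integer that could be seconds or milliseconds.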
Honestly, my view that nothing of value ever gets published on medium, is strongly reinforced here.
But why did the designers make that choice when they had any of half a dozen other RCP protocols to choose from?
> The ISO8601 v Unix epoch example seems very weak to me. I'd certainly expect any model to be capable of distinguishing between these things
What about the medical records issue? How is the model to distinguish a weight in kgs from one in pounds?
Wouldn't medical records actually be better in JSON, because the field could expressly have a "kg" or "lb" suffix within the value of the field itself, or even in the name of the field, like "weight-in-kg" or "weight-in-lb"? This is actually the beauty of JSON compared to other formats where these things may end up being just a unitless integer.
The biggest problem with medical data would probably remain the human factor, where regardless of the format used by the machines and by MCP, the underlying data may already be incorrect or not coded properly, so, if anything, AI would likely have a better chance of interpreting the data correctly than the API provider blindly mislabelling unitless data.
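A sketch of that explicit-unit encoding, normalized to a canonical unit at the boundary (the field names are illustrative, not from any real records standard):

```python
# Encode the unit alongside the value, then normalize once at the
# boundary so downstream consumers only ever see kilograms.
LB_PER_KG = 2.2046226218  # avoirdupois pounds per kilogram

def weight_kg(field):
    """field is e.g. {"value": 154.0, "unit": "lb"}; reject unknown
    units instead of guessing."""
    unit = field["unit"]
    if unit == "kg":
        return field["value"]
    if unit == "lb":
        return field["value"] / LB_PER_KG
    raise ValueError(f"unknown weight unit: {unit!r}")

record = {"patient_weight": {"value": 70.0, "unit": "kg"}}
assert weight_kg(record["patient_weight"]) == 70.0
```

The key point is the `ValueError` branch: a unit the converter doesn't recognize fails closed, rather than being passed along as a bare number.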
On that note; some of these “best practices” arguably haven’t worked out. “Be conservative with what you send, liberal with what you receive” has turned even decent protocols into a dumpster fire, so why keep the charade going?
Failed protocols such as TCP adopted Postel's law as a guiding principle, and we all know how that worked out!
WSDL is just pure nonsense. The idea that software would need to decide which API endpoints it needs on its own, is just profoundly misguided... Literally nobody and nothing ever reads the WSDL definitions; it's just poor man's documentation, at best.
LLMs only reinforce the idea that WSDL is a dumb idea because it turns out that even the machines don't care for your 'machine-friendly' format and actually prefer human-friendly formats.
Once you have an MCP tool working with a specific JSON API, it will keep working unless the server makes breaking changes to the API while in production, which is terrible practice. But anyway, if you use a server, it means you trust the server. Client-side validation is dumb; like people who need to put tape over their mouths because they don't trust themselves to follow through on their diet plans.
WSDLs being available from the servers allows (a) clients to validate the requests they make before sending them to the server, and (b) developers (or in principle even AI) with access to the server to create a client without needing further out-of-band specifications.
I don't buy this idea that code should be generated automatically without a human involved (at least as a reviewer).
I also don't buy the idea that clients should validate their requests before sending to the server. The client's code should trust itself. I object to any idea of code (or any entity) not trusting itself. That is a flawed trust model.