How and where will agents ship software?
122 stopachka 58 7/16/2025, 5:47:08 PM instantdb.com ↗
The linked article explains this in detail, but today we're releasing:
1. An API to spin up apps programmatically. This is great if you are building platforms, where you can spin up databases and backends with 0 additional compute
2. An MCP server, which lets you and your agents talk to Instant and create apps
3. Agent rules, which tell agents how to use Instant
If you want to try this yourself, we have a tutorial that lets you run Instant in your own workflow: https://www.instantdb.com/tutorial. Let us know what you think!
I suggest jumping straight to this document, which is designed to tell the agent how to work with Instant but is pretty great documentation for humans who want to understand what it can do at the same time: https://www.instantdb.com/mcp-tutorial/claude-rules.md
We have an llms.txt and llms-full.txt (~9k lines) which contain all our documentation. Feeding these to Claude didn't get great results; it was just too much information.
We manually compressed our llms-full.txt into a rules file (~1.5k lines) which declared the API upfront and provided snippets of how to do different things with callouts to common examples. This condensed version did better but would cause Claude to make subtle mistakes.
Looking at the kind of mistakes Claude made, it seemed like a human could make those mistakes too (very useful feedback for us to improve our API). We thought “what's one of the smallest fully contained examples we can make that packs a bunch of info on how to use Instant?” That would probably be useful for both a human and an agent. And indeed it seemed to be the case.
This is something we've found for our API -- just having LLMs attempt to use it helps us identify things that we haven't documented well or placed enough emphasis on (for things that are critical but are non-obvious or may be drowned out by other less important information). Improvements that help the LLM tend to be good for developers too.
The experience was brilliant.
Pros:
+ Fast
+ Easy
+ "Vibe coding on steroids" basically
+ The sense of 'wow' that comes very rarely with new tech
Cons:
- It used Instant as the database/backend, but I wasn't sure what it had done / how exactly it worked and had to spend a bunch of time asking Claude + reading the code to get it. It seemed reasonable, but if I were doing a prod system vs a PoC, this is where the time would be spent. ("Vibe coding lets you create tech debt 10x faster")
Net-net: This is the way for prototyping / validating. This is probably the way for production systems in N months too once the toolchain + agents get better.
This made me wonder: can I share Claude Code's conversation history? Turns out Claude stores them.
So I made a full-stack "snippet" app with Claude and Instant. You can:
1. Upload jsonl files
2. Share them in a nice UI
(Going meta) here's the first conversation I had with Claude in order to build it:
https://claude-code-viewer.vercel.app/view/c4ca91ac-9624-40f...
After I deployed, I asked it to fix the tool use UI:
https://claude-code-viewer.vercel.app/view/faf9b2cc-c3cf-4d0...
I used Instant's auth to gate uploads. Views are public, but limited to the snippets you know about (i.e., have links for).
If you want to upload your own conversations:
1. They live in ~/.claude. Head on over and grab a file
2. Go to https://claude-code-viewer.vercel.app and sign up
3. Start uploading : )
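Since Claude's logs can contain secrets, it may be worth scanning a file before uploading it. Here is a minimal sketch, assuming the log is JSONL text already read into a string. The `findSecretHints` helper and its list of suspicious names are hypothetical and far from exhaustive; a clean result does not prove a file is safe to share.

```typescript
// Names that often sit next to credentials. Illustrative only, not exhaustive.
const SECRET_HINT = /(admin[_-]?token|api[_-]?key|secret|password)/gi;

interface Finding {
  line: number; // 1-based line number within the JSONL text
  match: string; // the suspicious name that was found
}

// Hypothetical helper: flag every line that mentions a credential-like name.
function findSecretHints(jsonl: string): Finding[] {
  const findings: Finding[] = [];
  jsonl.split("\n").forEach((raw, index) => {
    const matches = raw.match(SECRET_HINT);
    if (matches === null) return;
    matches.forEach((match) => findings.push({ line: index + 1, match }));
  });
  return findings;
}
```

Anything it flags still needs a human look, and rotating a leaked token remains the real fix.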
Some notes:
* Be careful when sharing log files. Claude can include secrets in there. Some hackers may notice an adminToken in the convo. I rotated it before we pushed.
* It was fun to see Claude use the query language. It thought we had a `$startsWith` modifier. Right now we only have $like. But `$startsWith` is a great idea, we may just implement it real quick!
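In the meantime, a prefix match can probably be expressed with `$like`. A minimal sketch, assuming `$like` takes SQL-style patterns where `%` matches any run of characters. The `startsWith` helper, the `snippets` namespace, and the `title` attribute are all made up for illustration.

```typescript
// Assumption: $like uses SQL-style wildcards, so "abc%" means "starts with abc".
type LikeFilter = { $like: string };

// Hypothetical helper standing in for the $startsWith modifier Claude imagined.
// Note: a prefix that itself contains "%" or "_" would be read as a wildcard.
function startsWith(prefix: string): LikeFilter {
  return { $like: `${prefix}%` };
}

// Hypothetical query shape; "snippets" and "title" are illustrative names.
const prefixQuery = {
  snippets: {
    $: { where: { title: startsWith("claude-") } },
  },
};
```

The helper only builds a plain object, so it can be dropped into whatever `where` clause needs it.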
Haha, that's great. Turns out that "hallucinations" are just things that make sense in context, and that can translate to feature requests from our agents :)
I use Claude. I like Claude. But I’ve backed away from having Claude actually write my code other than in the most limited circumstances.
I caught it copying one of my TS Interfaces, for example. And modifying, then using, the copy. So my type-checks pass, yay! But wait what?
It wrote a test for a tricky bit of code. The test wouldn’t pass. So it re-wrote it in a way that couldn’t possibly fail, mocking all elements inside the test itself.
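To make that failure mode concrete, here is a contrived sketch (not the actual code from my project; every name is hypothetical). The first "test" asserts on a mock's canned value, so it passes even against a buggy implementation. The second one calls the real code and catches the bug.

```typescript
type Discount = (price: number, percent: number) => number;

// Correct implementation: takes a percentage off the price.
const correctDiscount: Discount = (price, percent) => price - (price * percent) / 100;

// Buggy implementation: subtracts the percent as if it were an absolute amount.
const buggyDiscount: Discount = (price, percent) => price - percent;

// Anti-pattern: the code under test is ignored and replaced by a mock,
// so this check can never fail.
function tautologicalTest(_impl: Discount): boolean {
  const mock: Discount = () => 90;
  return mock(100, 10) === 90;
}

// Meaningful: exercises the implementation it was given.
function realTest(impl: Discount): boolean {
  return impl(200, 10) === 180;
}
```

Here `tautologicalTest(buggyDiscount)` still returns `true`, while `realTest(buggyDiscount)` returns `false`.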
I’m not anti-AI. But I wouldn’t trust anything vibe-coded above the importance of, say, Wordle.
For what it's worth, Instant is fully open source. The UI, the sync engine, and the multi-tenant database live here:
https://github.com/instantdb/instant
But hey, rewriting the plethora of vibe-coded long tail* apps might be a major source of employment in the future.
* small but loyal and profitable userbases
Interesting point.
I keep coming back to the idea that users could request changes, and they could be experimentally deployed immediately.
Some open questions I had as I thought through extensions:
We talked about the data abstraction side: when you expose data, it's easier for end-users to build extensions. But there are questions on UIs and data modeling.
UIs: How cool would it be if agents could "enter" into applications and change the UI? In one sense this is hard, but at least a demo feels in reach. What if an app exposed the UI components that it was built out of? This would let the agent remix them.
Data modeling: Exposing data works, but what if users want to store extra information? Maybe each user could spin up their own separate "extra" database.
I have a friend who owns a small/medium sized marketing firm. They typically manage social media and advertising for local businesses (butchers, plumbers, NPOs, etc.). A major cost center for them is dev. They can generally handle developing assets (images, videos, text copy) and publishing them (Facebook, YouTube, Instagram) but if they need any kind of interactivity (even basic forms or CRM-like stuff) they used to hire programmers.
This friend is now "vibe coding" the simple interactivity that previously they had to outsource. In the last few months he has pitched, won and crucially delivered simple apps for a few clients. We're not talking complex web apps, it's mostly CRUD forms and basic workflows, the kind you see people go on about using n8n on Twitter. He's talking to me these days about React, Tailwind, DNS and all of that stuff.
His clients don't know, or care, how he delivers. The local butcher doesn't know about "best practices" or whatever. He just cares that if someone signs up for his newsletter that he gets a notification and that person gets his weekly meat deals email.
His firm is picking up more and more complex projects like these and saving a huge amount on costs. Turn-key services that enable guys like him are going to reap the rewards.
There are a lot more ideas, and people who would love to put in the effort to give them to the world, than there are expert programmers to build them.
[1] https://apps.apple.com/us/app/go-deeper/id6745434359
I'm going to jump on this to think aloud about the unlock this ability gives the world for customized apps. My neighbor is a landscaper and he is constantly complaining to me about invoicing software. He has gone through 10+ apps trying to find one that fits his particular set of requirements. He was telling me recently that he spent several phone calls with a developer who had shipped an iOS app that was close to what he needed trying to explain what he wanted. He knows I am a programmer and is always hinting that I should develop an app that would meet his requirements.
But I know better. Invoicing/scheduling software is really difficult, especially to appeal to everyone. Each small business has so many tiny requirements that are specific to their business and their personality. You can't just have one piece of software that appeals to everyone, that meets all of the requirements, without it becoming bloated and complicated. And if I built to his particular requirements, I would have exactly 1 customer, which isn't sustainable as a business (I mean, he wants to pay ~20/month).
But now we have a world where that kind of highly customized software will be possible. As more and more LLM-ready building blocks emerge, custom software may become the norm rather than the exception.
Just two things:
- Wouldn't his firm be better served by website builders like WordPress, Squarespace, Wix, etc.? These services have enabled millions of less technical people to create and publish websites for decades now. Most of them support a large ecosystem of plugins and 3rd-party tools that make adding interactivity such as forms and CRMs a breeze.
I mean, it's great that your friend is enjoying getting into web development, and that LLMs are helping him, but I reckon he would be much more productive and deliver more value to his customers by using one of the established services on the market. Unless the projects require some bespoke solutions, or mobile apps, but it doesn't sound like it.
- What happens when one of his customers asks for authentication, session management, a comment system, payments, or something non-trivial or sensitive like that? If all requirements are trivial as you say, then a web site builder could handle it, but if they stop being trivial, then he is bound to run into issues.
LLMs will happily generate non-trivial code, but there are high chances that it will contain security issues or bugs that someone inexperienced won't be able to spot and fix.
So what happens then? He will deliver a seemingly working site to his customers with security issues and bugs, and it will only be a matter of time for them to be exploited. It doesn't matter that his customers don't know or care about "best practices". They surely care about a functioning product that doesn't leak or mishandle their customer data. These issues could be mitigated or avoided by hiring an experienced developer.
So I hope that he has the wisdom and humility to determine when a developer is still required and pay for them, instead of relying on the false confidence provided by LLMs. Or he could take the time to actually learn to program and adopt best practices instead of vibe coding, which sounds like he would be interested in doing anyway.
My understanding is the majority of his work is on WordPress. It's worth noting this is a partnership with 100+ clients, 5+ full time employees. They do television commercials, websites, banner ads, social media campaigns, etc. He is a partner at the firm and while he calls himself "non-technical" he does have experience with website design (HTML/CSS) and the administration of WordPress and databases.
To be clear: he was already delivering these kinds of custom solutions to clients using contract programmers. He is well aware of requirements like authentication (in fact, in our last conversation he mentioned a project he was working on that did just that). But previously, the cost of custom work was too high in some cases, since bringing on a contract programmer for certain kinds of projects pushed the budget out of range for the client. Vibe coding is opening up a new avenue for custom-built functionality that was previously too expensive.
> I hope that he has the wisdom and humility
I notice this kind of thing frequently. I mean, who is lacking humility here? Someone thinking they have all of the facts, offering advice and "Why don't you just ..." kind of thinking based on assumptions. If you really think you can diagnose issues and offer advice based on the quick comment I made, you should reassess your own humility before recommending it to others.
I'm not debating that. What I am arguing for is using these new tools smartly and conservatively, because they have produced and will continue to produce low-quality software in the hands of inexperienced developers. It's easy to be misled by their confident tone and the overhyped marketing around them into thinking that they're able to do things they realistically cannot. Those best practices you say that customers don't care about are precisely what help prevent quality issues from impacting them, regardless of the software complexity. Vibe coding throws all of that out the window. It's tempting to cut corners to keep the cost of projects down, but ignoring well-established software development practices is not a safe way to do it.
> If you really think you can diagnose issues and offer advice based on the quick comment I made, you should reassess your own humility before recommending it to others.
I'm not offering advice. I'm going by what you said, and voicing a concern that the apparent utility of LLMs has some important caveats. I don't particularly care about your friend's firm nor their customers. What I do care about is that the widespread adoption of vibe coding is doing more harm than good to the software industry and society at large, which will have destructive consequences in the near future.
Instead of engaging with this argument and filling in any details I might be missing, you chose to attack me personally, which says more about you than me.
There is an assumption being made here that isn't being made explicit: the only way that malicious behavior can be avoided is by paying a programmer. Is that a valid assumption? Or the less strong: a plugin is less secure if developed by a coding agent when compared to any possible programmer. Is that a valid assumption? Aren't all of the well-known issues in WordPress plugins the fault of programmers?
What I feel in these comments isn't a genuine attempt to engage but rather Fear, Uncertainty and Doubt (FUD) writ large.
Also, for what it is worth, the most recent project he developed was using React, Tailwind and Postgres (which he called "Post ... something?"). It was very work-flowy (user uploads a doc, it goes into a queue for manual review, once approved it is converted and uploaded to Google Docs, an email is sent, etc). I asked him if he had investigated any workflow builders and he said no, he just vibe coded it. It's also worth noting that he is paying for QA, I think that existed already in house for his other projects. Well, actually what he said was "it is currently in testing", so I can't confirm if it is professional QA.
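For what it's worth, a workflow like that boils down to a small state machine. Here is a minimal sketch of the steps as I understood them; this is not his code, the status names are my own, and the `rejected` branch is an assumption, since he only described the approval path.

```typescript
// Hypothetical statuses for one uploaded document.
type Status = "uploaded" | "in_review" | "approved" | "rejected" | "converted" | "notified";

// Allowed next steps from each status. "rejected" is assumed, not described.
const NEXT: Record<Status, Status[]> = {
  uploaded: ["in_review"], // the doc enters the manual review queue
  in_review: ["approved", "rejected"],
  approved: ["converted"], // converted and uploaded to Google Docs
  rejected: [],
  converted: ["notified"], // the email is sent
  notified: [],
};

// Move a document to the next status, refusing any step the table does not allow.
function advance(current: Status, next: Status): Status {
  if (NEXT[current].indexOf(next) === -1) {
    throw new Error(`illegal transition: ${current} -> ${next}`);
  }
  return next;
}
```

Keeping the transitions in one table makes it harder for generated code to skip the manual review step by accident.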
I officially logged into Wordpress for the last time six weeks ago.
I’m currently migrating a bunch of my sites over to Next.JS.
Claude has vibed the best SEO, E.E.A.T., CRO (CXL best practice), WCAG 2.0, and schema.org compared to any site I’ve ever built in Wordpress.
The audits OPUS was creating for each of these areas are astonishing.
I’m simply migrating them across to Next.JS and hosting them on Netlify.
I haven’t paid for any premium plugins to get these sites up and running; I just used Claude Max 100.
I won’t be renewing the AUD$3500 in Wordpress ecosystem subscriptions after they run out this year.
For my gardening business (I’m now a professional gardener), I’ve integrated a job route scheduling tool with Claude Code. This tool calculates travel times between my gardening jobs and provides basic CRM functionality for my clients. It uses the Google Distance Matrix API, and my week is laid out like a Kanban board.
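The scheduling idea can be sketched roughly like this. It is an illustration rather than my actual tool: it assumes the travel times have already been fetched into a matrix of minutes, and it orders jobs with a simple greedy nearest-neighbour pass, which is a heuristic and not guaranteed to find the shortest route. The function name and the numbers in the usage note are made up.

```typescript
interface Route {
  order: number[]; // job indices in visiting order
  total: number; // total travel time in minutes
}

// minutes[i][j] is the travel time from job i to job j.
// Greedy heuristic: always drive to the closest job not yet visited.
function nearestNeighborRoute(minutes: number[][], start: number): Route {
  const visited = minutes.map(() => false);
  const order = [start];
  visited[start] = true;
  let total = 0;
  let current = start;
  for (let step = 1; step < minutes.length; step++) {
    let best = -1;
    for (let j = 0; j < minutes.length; j++) {
      if (visited[j]) continue;
      if (best === -1 || minutes[current][j] < minutes[current][best]) best = j;
    }
    visited[best] = true;
    order.push(best);
    total += minutes[current][best];
    current = best;
  }
  return { order, total };
}
```

With the made-up matrix `[[0, 10, 25], [10, 0, 12], [25, 12, 0]]` and a start at job 0, this visits the jobs in order 0, 1, 2 for 22 minutes of driving.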
For my new gardening website, I’ve created dozens of new service pages over the last ten days. I’ve also created a local admin dashboard that ingests my 1200 or so before and after pictures. This dashboard provides a neat interface to match before and after “pairs,” extracts the EXIF data, calculates the suburb, and allows me to tag by job type. It then moves the photos (stripped of EXIF) into the Next.JS public folder with AVIF and WebP versions and a JSON file that specifies their content.
Claude then uses the JSON to build custom gallery components for each service page.
None of this was conceivable for me two months ago.
I’m primarily building static JamStack sites that are secure.
Is Wordpress secure? I don’t think so.
I’ve done many months of work in the last twenty-one days.
Have I saved myself $50k by doing all this with Claude Code? No, because that was never an option previously.
I understand your concerns about false confidence, and I genuinely respect that perspective. I backed out of Firebase Studio a while ago because I lacked confidence in Gemini’s ability to create safe and functional Firebase rules.
However, the landscape is changing, and the new interface for CMS systems will no longer be the traditional wp-admin. Instead, it will be a user-friendly chat agent with a robust system prompt for building websites, forms, basic workflow rules, business logic, and authentication.
Although I’m not a programmer, I have experience as a digital producer, which has given me a good understanding of toolchains.
If I were a startup envisioning the next generation of CMS, I would be actively working on it and developing it as quickly as possible.
this is a hosted LAMP stack, we had it 20 years ago. is cPanel not fashionable anymore?
I suppose that most deployment/devops is done using existing git push workflows and IaaC. Has anyone had good experience with LLM/agent-compatible tools?
> Gemini 2.5 attempting to create firebase rules
That is very interesting. I wonder if Claude Code would do better on Firebase rules.
Make sure the agent knows how much it costs to query
Once built, the solution is plain-old-runnable-code (PORC :-), as long as the business logic implemented doesn't call out to an LLM. So I don't fret so much about the AI hype story here.
For anyone starting off building with new tech, an AI assistant is really helpful.
And every one of them will be ads.
Who knows…
A Disneyland with no children.
Moloch.
The systems we use can only be as smart and intuitive as the people who prompt them.
On top of it, this (LLMs) is not AI, not even close, if anything they are glorified prediction systems that require human prompting.
puts in retainer; pushes glasses back up bridge of nose
Technically schpeaking, what you're talking about is the difference between weak AI and strong AI/artificial general intelligence (AGI). AGI is the kind of AI that has reached human levels of consciousness. We're not there yet. Personally, I hope we don't get there, but I'm not the one in charge, so shrug.
You can do a lot with glorified prediction systems that require human prompting. Actually, they are arguably more valuable than AGI because you can more easily communicate and utilize their value proposition. People don't need a machine that wonders the same stuff they do; they need something that does a specific task in lieu of their own effort.
>You can do a lot with glorified prediction systems that require human prompting
>People don't need a machine that wonders the same stuff they do; they need something that does a specific task in lieu of their own effort.
This is the problem with our current revision of AI; the way I see it, those two are in conflict with each other. "In lieu of their own effort," the way a vast number of the would-be users think, means "without prompting," which would lean more towards AGI than AI.
>Actually, they are arguably more valuable than AGI because you can more easily communicate and utilize their value proposition.
To you and me this might be true, but to your average non-techie I don't think it's quite as true as you would like it to be.
Short term it is very true; everyone sees the value until they realize its inherent limitations and the 'shiny' wears off.
Do you think that the LLM/AI tools today are better than those from 2 years ago? Do you think the LLM/AI tools in 2 years time will be no better than the ones we have today?
Interpreting your non-response: No, two years have not improved things and two more years will not either.