Show HN: We started building an AI dev tool but it turned into a Sims-style game

72 maxraven 52 8/18/2025, 6:51:23 PM youtube.com ↗

Hi HN! We’re Max and Peyton from The Interface (https://www.theinterface.com/).

We started out building an AI agent dev tool, but somewhere along the way it turned into Sims for AI agents.

Demo video: https://www.youtube.com/watch?v=sRPnX_f2V_c.

The original idea was simple: make it easy to create AI agents. We started with Jupyter Notebooks, where each cell could be callable by MCP—so agents could turn them into tools for themselves. It worked well enough that the system became self-improving, churning out content, and acting like a co-pilot that helped you build new agents.

But when we stepped back, what we had was these endless walls of text. And even though it worked, honestly, it was just boring. We were also convinced that it would be swallowed up by the next model’s capabilities. We wanted to build something else—something that made AI less of a black box and more engaging. Why type into a chat box all day if you could look your agents in the face, see their confusion, and watch when and how they interact?

Both of us grew up on simulation games—RollerCoaster Tycoon 3, Age of Empires, SimCity—so we started experimenting with running LLM agents inside a 3D world. At first it was pure curiosity, but right away, watching agents interact in real time was much more interesting than anything we’d done before.

The very first version was small: a single Unity room, an MCP server, and a chat box. Even getting two agents to take turns took weeks. Every run surfaced quirks—agents refusing to talk at all, or only “speaking” by dancing or pulling facial expressions to show emotion. That unpredictability kept us building.

Now it’s a desktop app (Tauri + Unity via WebGL) where humans and agents share 3D tile-based rooms. Agents receive structured observations every tick and can take actions that change the world. You can edit the rules between runs—prompts, decision logic, even how they see chat history—without rebuilding.

On the technical side, we built a Unity bridge with MCP and multi-provider routing via LiteLLM, with local model support via Mistral.rs coming next. All system prompts are editable, so you can directly experiment with coordination strategies—tuning how “chatty” agents are versus how much they move or manipulate the environment.

We then added a tilemap editor so you can design custom rooms, set tile-based events with conditions and actions, and turn them into puzzles or hazards. There’s community sharing built in, so you can post rooms you make.

Watching agents collude or negotiate through falling tiles, teleports, landmines, fire, “win” and “lose” tiles, and tool calls for things like lethal fires or disco floors is a much more fun way to spend our days.

Under the hood, Unity’s ECS drives a whole state machine and event system. And because humans and AI share the same space in real time, every negotiation, success, or failure also becomes useful multi-agent, multimodal data for post-training or world models.

Our early users are already using it for prompt-injection testing, social engineering scenarios, cooperative games, and model comparisons. The bigger vision is to build an open-ended, AI-native sim-game where you can build and interact with anything or anyone. You can design puzzles, levels, and environments, have agents compete or collaborate, set up games, or even replay your favorite TV shows.

The fun part is that no two interactions are ever the same. Everything is emergent, not hard-coded, so the same level played six times will play out differently each time.

The plan is to keep expanding—bigger rooms, more in-world tools for agents, and then multiplayer hosting. It’s live now, no waitlist. Free to play. You can bring your own API keys, or start with $10 in credits and run agents right away: www.TheInterface.com.

We’d love feedback on scenarios worth testing and what to build next. Tell us the weird stuff you’d throw at this—we’ll be in the comments.

Comments (52)

OtherShrezzing · 2h ago

I’ve been thinking about how traditional game AI can be improved by generative models. One of the biggest problems with games like Civ is that the AI strategy is predictable - especially if you’ve played a few dozen hours.

LLMs with some decent harnesses could build up unpredictable - but internally consistent - strategies per each new game you play.

This is close to a proof of concept for those improvements.

dahauns · 1m ago

I can't help it, the first thought that came to mind was "Huh...talk about sheer senseless brute force." Why use a Large Language Model on something as clearly defined in scope as a game instead of a model designed and trained for the task/ruleset? Sure, there's the argument of not having to train that model, but OTOH, "decent harnesses" does some very heavy lifting there...

tatjam · 2h ago

I wonder how could you keep the LLM from going bonkers as the game progresses? I have a feeling it's possibly better to re-create the prompts after some time, and have the LLM work more like one of those "reasoning models" with the game as something it can interact with.

Otherwise you run into the risk of "TOTAL NUCLEAR FINANCIAL LEGAL DESTRUCTION" ;)

peytonshields · 1h ago

This is something we've been working on and are planning to release a "decision" update to the game which should allow for multi-step, configurable options to choose if the LLM actually gets to contribute to the current world / chat. There's a lot of trial and error involved and we're all ears, if you have ideas we'd love to hear them! We actively monitor our discord https://discord.com/invite/theinterface

bob1029 · 1h ago

From a player perspective, oftentimes the best AI systems are the most trivial ones. You can get really far with an agent that is allowed to cheat. It's a hell of a lot easier to build and troubleshoot a model that manipulates the amount of in-game resources received per unit time than it is to implement actual strategic intelligence.

mvdtnz · 1h ago

I play strategy games a lot and cheating AI can be fun to play against at first, but the more you learn a game the more cheating AI sucks. When you're new to the game it just feels like you're playing against a good player, but you soon learn that what they are achieving isn't possible with the resources available. Once you hit that realisation it can be fun to beat them as a challenge but it never feels like a fair game.

mh- · 6m ago

Agreed, this is an instant turn-off for me when I realize this in e.g. an RTS game. Red Alert or C&C come to mind on higher difficulty, can't remember which.

peytonshields · 2h ago

Absolutely! Max and I were huge Civ fans and always tried to make the game AI deviate from its programmed strategy. We also believe you can get some really interesting story arcs by adjusting parameters like temperature and how context is presented. Some of the things you'll notice in the game is we have a no-holds barred approach -- you can fully modify system prompts and adjust how the LLM interprets the state of the world.

dawnerd · 1h ago

As someone that plays those games pretty heavily: I’d rather not have LLMs take over game AI like that. If I want different gameplay I’d play online. We don’t need to bog down already heavy games with LLMs.

yawnxyz · 1h ago

even being able to scheme with / backstab leaders, and they would "understand" all that's happened (and acts accordingly) would be so fun

ralusek · 1h ago

Definitely the case. That being said, I think it would be hard, at least in the immediate future, to translate the concept of difficulty to a universal LLM for a bespoke/specific game. I assume most game AIs are tuned by hand to feel fair for a given difficulty level...but if you just give an LLM some new game, explain the rules and what resources/abilities it has available to it, you're stuck with adding some addendum to the tune of "and you're meant to represent an entity of 'medium' difficulty." For very well established games, it might have a sense of how given actions might fall into a skill-level hierarchy, but not for anything new.

Fine tuned LLMs though with actual experience with the game, maybe?

pizzathyme · 1h ago

I worked on The Sims. From experience I can tell you these types of games require a ton of experimentation and building before you finally hit on something that feels "fun" and you get lost in playing it. Then it all kind of comes together at once.

Keep it up! Looking forward to what you figure out.

dclowd9901 · 1h ago

Do you have a blog or something where you talk about that work? I'd absolutely love to read more about it. Theory of game design is one of my favorite topics.

pavel_lishin · 1h ago

I'm not pizzathyme, but if you like reading about game design, iirc the developer of Cogmind had a tremendously good dev blog talking about designing their game: https://www.gridsagegames.com/blog/

gnerd00 · 1h ago

one of the original junior Sims C++ devs was an undergrad from Reed college.. he is now director of some kind for Overture Maps iir

xrd · 2h ago

I find this funny because Stewart Butterfield (and others) founded Slack and Flickr by pivoting from the games they were trying to build. This is the opposite, someone trying to build a product and then pivoting to a game. I think this is a better path, FWIW.

max-raven · 2h ago

Thanks! We believe so too :)

splatzone · 2h ago

I’ve felt for some time that there’s a gap in the market for a genuine spiritual successor to The Sims, using LLMs to power the interactions between agents to create a more realistic and immersive simulation of life. This seems like a step in the right direction.

maxraven · 1h ago

Thanks- Will Wright’s been a big inspiration for us, and that's where we're headed!

saberience · 3h ago

What's the actual gameplay loop?

I.e. what's the goal, how do you know you're doing well (or not), what makes it fun etc?

peytonshields · 2h ago

The loop is all about adapting, experimenting, and seeing which combinations work and which don't. Right now it's designed around mini-games which can have a different goal per game -- as a quick example I'm currently building agent tic-tac-toe but hidden trapdoors and power-ups

tayo42 · 1h ago

I don't think the Sims had that

shakna · 1h ago

The Sims had multiple goals for the player.

You had basic needs to fulfill, career advancement, relationships, and family generations.

Each of those fulfills the game loop.

deadbabe · 2h ago

In these sandbox games you just make up your own little stories and have fun watching them play out.

soared · 2h ago

I wonder if this would be good for vibe coding / natural language for enemy AI. IE, place an enemy down and tell it: “every 3 seconds fire an arrow at the player. If the player is within 7 tiles of you, stop firing arrows, path to the player, and attack it with a sword. When your health reaches 10% run away from the player”

peytonshields · 2h ago

This is the goal! We're working hard to give the AI more spatial "world" awareness with bespoke decision loops

_pdp_ · 1h ago

The reason text works is because it has higher bit rate then speech. This is way many believe that CLI tools are still considered supreme in terms of getting things done quick.

While fun this game-like interface is too casual and it certainly has lower bit rate which impacts communicate exchange between an AI and the human operator.

It will be a fine abstraction if the goal is to have high-level overview though.

peytonshields · 1h ago

Thanks for the comment! We're working towards using the game's own simulation data (from Unity) to feed back into your game's agents. We hope this will prove less noisy than speech / real-world instrument data, allowing the AI to learn more effectively with new data every time you play

thatha7777 · 3h ago

Kudos, this is a very novel take! What's the most surprising emergent behavior you've observed? Have you observed any "social dynamics" that you didn't explicitly program?

max-raven · 2h ago

Thanks for the comment! They can get pretty mad at each other relatively easily, frowning and battle crying, which is always fun to watch. When we turned on voice models (in the pipeline!!:)) their voices did as well

thatha7777 · 2h ago

seriously, this embodied interaction angle seems like a much more humane way to understand AI behavior than just staring at walls of text. even if it occasionally feels like you're running a very advanced digital terrarium

NietzscheanNull · 2h ago

Just a heads up: the signup form disclaimer ("by signing up to create an account, you are accepting our terms of service and privacy policy") appears to link to a ToS route (theinterface.com/terms), but clicking that immediately redirects back to the login page (/signin) on Firefox [141.0.3].

Same thing happened when I tried hitting the URL directly. Do I have to accept the ToS before I'm allowed to read it?

peytonshields · 2h ago

Fixed! Thanks for flagging!

indigodaddy · 1h ago

This is cool for sure. Is it only all about tiles? Lately I've been thinking it would be awesome to get an AI to play DXBall (bricks game) type game or perhaps lode runner etc. would that be doable here?

peytonshields · 1h ago

We've only just begun! Max and I started building this about 1.5 months ago and are planning to ship a torrent of updates for the foreseeable future! Eventually it will be much more open/explorable world

tines · 1h ago

Oh man, DXBall. Those were the days.

bennymag · 2h ago

I think this would be a great learning tool too - imagine like a bridge simulator or robocodo (https://game.rodocodo.com/hour-of-code/) - which is a learn to code tool for elementary students - but for AI agents. As a tribute to Sims, you should allow for the `rosebud` cheat code :)

maxraven · 2h ago

Love both of these thoughts:)

DonHopkins · 1h ago

Have you played around with Sims-like plug-in objects, which include the knowledge of how to make the characters use themselves?

The important thing is that you can plug in new objects without reprogramming the people.

Sims objects (including characters) have a list of "advertisements" of possible interactions (usually shown as items on the user control pie menu, but also including invisible and special orchestration actions).

Each enabled action of every live object broadcasts its character-specific scores to each of the characters, adjusted to each character's personality, current stats, location, relationships, history, optionally tweaked by code.

Then to keep the sims from acting like perfectly optimized robots, they have a "behavioral dithering" that choses randomly between the top scoring few advertisements.

Here's a video of "Will Wright - Maxis - Interfacing to Microworlds - 1996-4-26" where he shows an pre-release version called "Dollhouse" and explains the design:

https://www.youtube.com/watch?v=nsxoZXaYJSk

Jamie Doornbos gave a talk at GDC shortly after we released The Sims 1 in 2000, "Those Darned Sims: What Makes Them Tick?" in which he explains everything:

https://www.gdcvault.com/play/1013969/Those-Darned-Sims-What...

Transcript:

https://dn721906.ca.archive.org/0/items/gdc-2001-those-darne...

Yoann Bourse wrote this paper "Artificial Intelligence in The Sims series":

https://yo252yo.com/old/ens/sims-rapport.pdf

In The Sims 4 it's all been rewritten in Python, and has more fancy features, but it still uses the essential model for objects and advertisements.

The Sims 1 used a visual programming language called "SimAntics" to script the objects and characters, including functions that are run to score advertisements for each character.

But with LLMs you can write scoring functions and behavioral control in natural language!

polotics · 2h ago

Have you gotten inspired by the Black Mirror "Plaything" episode? :-D

jader201 · 1h ago

PSA: In case you don't realize, this video has commentary. But it's crazy low, and you have to turn your volume way up to hear it.

I thought it was just another YouTube video with no audio.

maxraven · 1h ago

We just looked and can't increase the volume retroactively (!)- Thank you for the note for folks!

insamniac · 2h ago

Not supported on linux :(

peytonshields · 2h ago

Coming soon! If you join our discord happy to debug live, we have the build for it but figuring out some libgtk dep issues with Tauri

mdaniel · 1h ago

I thought this new future was to get the AI to fix all the bugs

mxwilliamson99 · 2h ago

Pretty cool

gnerd00 · 2h ago

this has carefully costumed role playing characters in the first second -- the title is misleading and/or "con"

monster_truck · 2h ago

Games don't ask me for API Keys. Whatever this is I will not be "playing" it

brulard · 1h ago

Why wouldn't they? You want to use a state-of-the-art AI somewhere, you don't want to pay new subscription for that one game you want to try out. You can set a limit / spending cap on the api keys and revoke them right after you tried it. I don't see a problem there.

max-raven · 2h ago

Appreciate the response - FWIW we're working reallyyy hard on getting local models working so you won't have to in the future if you did want to!

ivape · 1h ago

You need to check what year it is. I don’t know if HN was ever old enough until this exact moment to truly have a class of luddites, but we’re roughly tracking based on natural aging. Live long enough, and you truly do become the thing you never set out to be.

mdaniel · 1h ago

On the one hand, I do kinda hear where you're coming from, but OTOH I'm sympathetic to OP's concern that gaming should be relaxing or fun, and getting into the business of credential management plus billing management is neither of those things

Which is a lot of words to offer: be careful tossing out Luddite accusations just because it happens to be AI adjacent, that's rarely the whole story

Spice Data (YC S19) Is Hiring a Product Associate (New Grad) (ycombinator.com)

Ashby (YC W19) Is Hiring Design Engineers in AMER and EMEA (ashbyhq.com)

EasyPost (YC S13) Is Hiring (easypost.com)

Tesorio (YC S15) Is Hiring a Senior GenAI Engineer (100% Remote) (tesorio.com)

OneSignal (YC S11) Is Hiring Engineers (onesignal.com)

Axle (YC S22) is hiring product engineers (ycombinator.com)

Mbodi AI (YC X25) Is Hiring a Founding Research Engineer (Robotics) (ycombinator.com)

ReadMe (YC W15) Is Hiring a Developer Experience PM (readme.com)

Weave (YC W25) is hiring a founding AI engineer (ycombinator.com)

Depot (YC W23) Is Hiring a Community and Events Manager (Remote) (ycombinator.com)

CoLoop (YC S21) Is Hiring AI Engineers in London

Trellis (YC W24) Is Hiring: Automate Prior Auth in Healthcare (ycombinator.com)

Type (YC W23) is hiring a founding engineer to build an AI-native doc editor (ycombinator.com)

Foundry (YC F24) is hiring staff-level product engineers (ycombinator.com)

GoGoGrandparent (YC S16) Is Hiring Back End and Full-Stack Engineers

Kyber (YC W23) is hiring enterprise account executives (ycombinator.com)

Converge (YC S23) well-capitalized New York startup seeks product developers (runconverge.com)

Great Question (YC W21) Is Hiring a VP of Engineering (Remote) (ycombinator.com)

Coverage Cat (YC S22) Is Hiring a Senior, Staff, or Principal Engineer (coveragecat.com)

Kaizen (YC X25) is hiring engineers to build browser agents that work (kaizenautomation.com)

Infracost (YC W21) hiring first PM to shift $600B cloud spend to proactive (ycombinator.com)

Sei (YC W22) Is Hiring a Full Stack Engineer in Chennai, India (ycombinator.com)

Artie (YC S23) Is Hiring Founding AEs (ycombinator.com)

Cedana (YC S23) Is Hiring a Systems Engineer (ycombinator.com)

CodeCrafters (YC S22) is hiring first Marketing Person (ycombinator.com)

PAX Markets (YC W25) is hiring a founding principal hardware (RTL) engineer (ycombinator.com)

Sendblue (YC S23) is hiring senior engineers (ycombinator.com)

Thunder Compute (YC S24) Is Hiring a C++ Systems Engineer (ycombinator.com)

Optery (YC W22) Is Hiring in Engineering, Legal, Sales, Marketing (U.S., Latam) (optery.com)

QuestDB (YC S20) Is Hiring a Technical Content Lead (questdb.com)

Depot (YC W23) Is Hiring a Technical Content Writer (Remote) (ycombinator.com)

Show HN: We started building an AI dev tool but it turned into a Sims-style game

Comments (52)