The feature I want most from all of these "agentic" coding tools is a robust, trustworthy sandbox that limits the blast radius for when something goes wrong.
I'm currently leaning on Docker for Mac for this, which seems robust enough - but it would be nice if sensible sandboxes were the default, not something you have to actively enable yourself.
Claude Artifacts and ChatGPT Code Interpreter are still the AI-assisted coding tools I use most often, mainly because I know their sandboxes are rock solid.
fellowniusmonk · 1d ago
This is amazing.. the escalation comes when LLMs realize they are stuck in a VM and try to hack their way out and then we realize something about ourselves.
asadm · 1d ago
I think spawning a new worktree and then mounting it to a docker container is good enough and quick to do.
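For anyone wanting to try the worktree-plus-container approach, here is a rough sketch of what it could look like. It only assumes that `git` and `docker` are on the PATH. The image name, branch name, `/work` mount point, and helper function names are all made up for illustration and are not anything from the comment above.

```python
import subprocess
from pathlib import Path


def worktree_cmd(repo: Path, worktree: Path, branch: str) -> list[str]:
    """Build a `git worktree add` command that checks out a new branch
    into its own directory, separate from the main checkout."""
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(worktree)]


def docker_cmd(worktree: Path, image: str = "python:3.12-slim") -> list[str]:
    """Build a `docker run` command that bind-mounts only the worktree.

    The image name is a placeholder; use whatever image has your toolchain.
    """
    host_path = worktree.absolute()  # bind mounts need an absolute host path
    return [
        "docker", "run", "--rm", "-it",
        "-v", f"{host_path}:/work",  # the only host directory the container sees
        "-w", "/work",
        image, "bash",
    ]


def run_sandbox(repo: Path, worktree: Path, branch: str) -> None:
    """Create the worktree, then open a shell in a container mounted on it."""
    subprocess.run(worktree_cmd(repo, worktree, branch), check=True)
    subprocess.run(docker_cmd(worktree), check=True)
```

Calling `run_sandbox(Path("."), Path("../agent-wt"), "agent/task")` from the repository root would create the worktree next to the repo and drop you into a shell inside the container. One caveat: a worktree's `.git` is a file that points back into the main repository's `.git` directory, so git commands inside the container will fail unless that directory is also mounted at the same path.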
SV_BubbleTime · 1d ago
I'm running Claude Code in a container and have been quite pleased. I mean... I'm not going to hook it to any MCP that can contact the outside world besides making commits, so I'm good... but it does seem like a lot of people are handing the keys to drunk teenagers.
> Hi everyone - as a previous context I’m an AI Program Manager at J&J and have been using Cursor for personal projects since March.
> Yesterday I was migrating some of my back-end configuration from Express.js to Next.js and Cursor bugged hard after the migration - it tried to delete some old files, didn’t work at the first time and it decided to end up deleting everything on my computer, including itself. I had to use EaseUS to try to recover the data, but didn’t work very well also. Lucky I always have everything on my Google Drive and Github, but it still scared the hell out of me.
> Now I’m allergic to YOLO mode and won’t try it anytime soon again. Does anyone had any issue similar than this or am I the first one to have everything deleted by AI?
msgodel · 1d ago
That's crazy anyone would unleash one of these agents on a work laptop with so little supervision.
ketzo · 1d ago
Completely unsourced and the site is run by a marketing/PR/growth consultancy.
Between that and the utter lack of detail, feels like not worthy of HN front page.
an0malous · 1d ago
Doesn’t matter, AI
tlarkworthy · 1d ago
I wrote an agent that works in userspace inside the developing program and it frequently reads its own code to diagnose errors and sometimes tries to upgrade itself, but that causes a hot reload and it loses its own conversation.
It does seem to be useful though that it can read its own tool implementations.
geuis · 1d ago
Was just talking with a coworker yesterday about how we both don't let Cursor automatically run commands without permission.
Case in point.
It's a bit nerve-wracking when it starts YOLO rebasing and force pushing, but it works out in the end.
Also with Claude Code I've never had it go outside the original folder I've started it in, even when I've made it do it.