Jules, our asynchronous coding agent

163 meetpateltech 124 8/6/2025, 4:05:39 PM blog.google ↗

Comments (124)

turblety · 2h ago
Why has Google totally overcomplicated their subscription models?

Looking at "Google AI Ultra" it looks like I get this Jules thing, Gemini App, Notebook, etc. But if I want Gemini CLI, then I've got to go through the GCP hellscape of trying to create subscriptions, billing accounts then buying Google Code Assist or something, but then I can't get the Gemini app.

Then of course, this Google AI gives me YouTube Premium for some reason (no idea how that's related to anything).

ryandvm · 1h ago
And God forbid you were an early Google for Domains adopter and have your own Google Workspace account because nothing fucking works right for those poor saps.
rkomorn · 1h ago
Add "moving to a different country while owning an account that started as Google Apps for Domains" for a little more flavor.

"Can't share the subscription because the other person in your family is in another country."

Okay guess I'll change countr- "No you can't change your Google Workspace account's country."

kyleee · 50m ago
Nobody is getting a promotion for fixing that shit
esher · 2h ago
Watch YouTube while AI is coding for you.
weakwire · 2h ago
That’s actually great!
absurddoctor · 2h ago
But unlike some other pieces of the Ultra subscription you can’t share YouTube premium with family. So now I have both and Google has suggested a few times that I shouldn’t be doing that.
jacksnipe · 27m ago
I wonder if bundling it with ai is to deal with that pesky internal issue where engineers are always trying to turn off ads for their yt accounts
coredog64 · 2h ago
> Then of course, this Google AI gives me YouTube Premium for some reason (no idea how that's related to anything).

One of the common tests I've seen for the Google models specifically is understanding of YT videos: Summarization, transcription, diarization, etc. One of their APIs allows you to provide a YT video ID rather than making you responsible for downloading the content yourself.

gman83 · 2h ago
I was wondering about this too, and apparently they're working on integrating it, so the Google AI Pro/Ultra subscriptions will also give API/CLI credits or something -- https://github.com/google-gemini/gemini-cli/issues/1427
p1nkpineapple · 2h ago
I've been actually kind-of enjoying using Jules as a way of "coding" my side project (a react native app) using my phone.

I have very limited spare time these days, but sometimes on my walk to work I can think of an idea/feature, plan out what I want it to do (and sometimes use the github app to revise the existing code), then send out a few jobs. By the time I get home in the evening I've got a few PRs to review. Most of the code is useless to me, but it usually runs, and means I can jump straight into testing out the idea before going back and writing it properly myself.

Next step is to add automatic builds to each PR, so that on the way home I can just check out the different branches on my phone instead of waiting to be home to run the ios simulator :D

rcakebread · 18m ago
I don't have much experience with using LLMs to help write code, but I gave Jules a try on a new, very unorganized Python project I recently started. About 800 lines of code. It needed a major refactoring, so I simply asked Jules to make suggestions.

At a cursory glance, it did a great job. It failed the first time. I gave it the error message and it fixed it. I was shocked it ran after that. Not bad for the free plan.

esafak · 3h ago
The daily task limit went down from 60 to 15 (edit: on the free plan) with this release. Personally I wasn't close to exhausting the limit because I had to spend time going back and forth, and fixing its code.

To communicate with the Jules team join https://discord.gg/googlelabs

lacoolj · 2h ago
That's odd cuz my daily task limit went up to 100.

Are you on Google Pro or using it free?

Also, I've found that even with 60, over an entire full day/night of using it for different things, I never went over 10 tasks and didn't feel like I was losing anything. To be clear, I've used this every weekend for months and I mean that I've never gone over 10 on any one day, not overall.

15 should be plenty, especially if you aren't paying for it. I will likely never use 100 even on my busiest of weekends

mvieira38 · 3h ago
Good to see competition for Codex. I think cloud-based async agents like Codex and Jules are superior to the Claude Code/Aider/Cursor style of local integration. It's much safer to have them completely isolated from your own machine, and the loop of sending them commands, doing your own thing on your PC and then checking back whenever is way better than having to set up git worktrees or any other type of sandbox yourself
agentastic · 2h ago
Codex/Jules are taking a very different approach than CC/Cursor.

There used to be this thesis in software of [Cathedral vs Bazaar](https://en.wikipedia.org/wiki/The_Cathedral_and_the_Bazaar). The modern version of it is that you either 1) build your own cathedral and bring the user to your house. It is a more controlled environment and deployment is easier, but the upside is more limited, and it also shows the model can't perform out-of-distribution. OpenAI has taken this approach for all of its agentic offerings, whether ChatGPT Agent or Codex.

2) The alternative is the Bazaar, where you bring the agent to the user and let it interact with 1000 different apps/things/variables in their environment. It is 100x more difficult to pull this off, and you need better models that are more adaptable. But the payoff is higher. The issues that you raised (env setup/config/etc) are temporary and fixable.

throwup238 · 2h ago
Cursor now has “Background Agents” which do the same thing as Codex/Jules.
highfrequency · 2h ago
Can you elaborate on how Codex vs. CC maps onto this cathedral vs. bazaar dichotomy? They seem fairly similar to me.
agentastic · 1h ago
of course,

cathedral = sandbox env in the provider's cloud, so [codex](https://chatgpt.com/codex) uses this model. Their codex-cli product is the Bazaar model, where you run it on your computer, in your own environment.

Claude Code, on the other hand, doesn't have a cloud-based sandboxing product; you have to run it on your computer, so it's the bazaar model. You can also run it in a way that Anthropic never envisioned (e.g. give it control of your house). Cursor also follows the same model, albeit they have been trying to get into the cathedral model with the background agent (as someone also pointed out below), presumably so as not to lose market share to codex/jules/etc.

vb-8448 · 3h ago
It's safer to have them completely isolated, but it's slower and more expensive.

Sometimes I realize that CC is going nuts and stop it before it goes too far (and consumes too much). With this async setup, you may come back after a couple of hours and see utter madness (and millions of tokens burned).

unshavedyak · 2h ago
Completely agree. I also want to tightly control the output, and the more it just burns and burns, the more I become overwhelmed by a giant pile of work to review.

A tight feedback loop is best for me. The opposite of these async models. At least for now.

xiphias2 · 3h ago
I agree, but I just love the codex-1 model that is powering codex and see pro 2.5 as inferior.

It's interesting that most people seem to prefer local code, I love that it allows me to code from my mobile phone while on the road.

jondwillis · 1h ago
What kind of things are you coding while “on the road”? Phone addiction aside, the UX of tapping prompts into my phone and either collaborating with an agent, or waiting for a background agent to do its thing, is not very appealing.
xiphias2 · 1h ago
Mainly thinking about what the minimum testable changes are that I can give to codex to work on in the background.

Tapping the prompts in is the easy part, but the async model is different to work with. I feel more like a manager, not a co-developer.

mattnewton · 3h ago
Getting the environment set up in the cloud is a pain vs just running in your environment imo. I think we’ll probably see both for the foreseeable future but I am betting on the worse-is-better of cli tools and ide integrations winning over the next 2 years.
mvieira38 · 2h ago
It took me like half an afternoon to get set up for my workplace's monorepo, but our stack is pretty much just Python and MongoDB so I guess that's easier. I agree, it's a significant trade-off, it just enables a very convenient workflow once it's done, and stuff like having it make 4 different versions with no speed loss is mind-blowing.

One nice perk on the ChatGPT Team and Enterprise plans is that Codex environments can be shared, so my work setting this up saved my coworkers a bunch of time. I pretty much just showed how it worked to my buddy and he got going instantly

drdrey · 3h ago
with something like github copilot coding agent it's really not, the environment setup is just like github actions
MattGaiser · 3h ago
It’s surprisingly good. If you try Copilot in GitHub, it has had no issues setting up temporary environments every single time in my case.

No special environment instructions required.

timdumol · 2h ago
I've tried using Jules for a side project, and the code quality it emits is much worse than GH Copilot (using Claude Sonnet), Gemini CLI, and Claude Code (which is odd, since it should have the same model as Gemini CLI). It also had a tendency to get confused in a monorepo -- it would keep trying to `cd backend && $DO_STUFF` even when it was already in backend, and iterate by trying to change `$DO_STUFF` rather than figure out that it's already in the backend directory.
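
To make the failure mode concrete, here is a minimal sketch of the check the agent was effectively skipping: decide where to run a command based on the current directory, instead of always prefixing `cd backend`. The helper name `command_dir` and the `/repo` layout are hypothetical and purely illustrative; nothing here reflects how Jules works internally.

```python
from pathlib import Path


def command_dir(repo_root: Path, cwd: Path, target: str) -> Path:
    """Pick the directory to run a command in without re-entering `target`.

    Hypothetical helper for illustration only.
    """
    target_dir = (repo_root / target).resolve()
    cwd = cwd.resolve()
    # Already inside the target (or one of its subdirectories): stay put
    # rather than attempting a second, failing `cd backend`.
    if cwd == target_dir or target_dir in cwd.parents:
        return cwd
    # Otherwise the command should run from the target directory.
    return target_dir


if __name__ == "__main__":
    root = Path("/repo")
    print(command_dir(root, root, "backend"))
    print(command_dir(root, root / "backend", "backend"))
```

The point is only that the working directory is state worth checking before retrying; changing `$DO_STUFF` cannot fix a failing `cd`.
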
qingcharles · 1h ago
I just tried Jules for the first time and it did a fantastic job on reworking a whole data layer. Probably better than I would have expected from Copilot. So.. I'm initially impressed. We'll see how it holds up. I was really impressed with Copilot, but after a lot of use there are times when it gets really bogged down and confused and you waste all the time you would have saved. Which is the story of AI right now.
xnx · 1h ago
> I've tried using Jules for a side project, and the code quality it emits is much worse than GH Copilot

It might be worth trying again.

"Jules now uses the advanced thinking capabilities of Gemini 2.5 Pro to develop coding plans, resulting in higher-quality code outputs"

timdumol · 53m ago
Ah, I missed that. I do vaguely remember that it used to use Flash, but I can't find where I saw it now. Thanks, I'll give it a shot!
simonw · 51m ago
I like the term "asynchronous coding agent" for this class of software. I found a couple of other examples of it in use, which makes me hope it's going to stick:

- https://blog.langchain.com/introducing-open-swe-an-open-sour...

- https://github.com/newsroom/press-releases/coding-agent-for-...

ramoz · 2h ago
There is only one true agent in 2025, Claude Code.

That said, Gemini is very powerful for its quality long-context capabilities: https://www.reddit.com/r/ClaudeAI/comments/1miweuv/comment/n...

patrickhogan1 · 1h ago
I agree with you at this point. Even though Google is performing well on benchmarks and releasing impressive models like World Models Genie 3, the Gemini CLI suggestions/changes feel overly formulaic. Almost like its priorities are those of an OCD coder who cares more about tabs vs spaces than about building a useful feature. For example, in a recent project, Gemini CLI spent all of my token allotment for that day on trivial tasks like tweaking ESLint configs or modularizing code that didn't need modularization.

In contrast, Claude Code seems to interpret my prompts better and helps me ship real product features for users.

Maybe it’s a system prompt issue. It’s likely my prompting causing the problem. But Claude Code seems to understand my intent better.

dash2 · 21m ago
Perhaps this is the modern version of "every company ships its own org chart"? Maybe Gemini's priorities are those of a Google engineer, Claude's are those of an engineer at Anthropic....
ramoz · 57m ago
It's how these models/their-harnesses (e.g. the Claude Code js program) are being trained together in the RL stages.

I think the software is now a very important part of the training process. Which is why I think frontier labs are only capable of shipping "actual" agents.

Anthropic has figured something out here that others have not.

https://news.ycombinator.com/item?id=44816424

the_sleaze_ · 2h ago
Thinking the same. I don't want Github approval process to sit in between me and the changes - the killer feature of claude code is being able to head it off as it starts to go down a bad path, and to code myself in between its steps.

Do you let juniors complete full features without asking questions or make them check in when they get flustered?

jondwillis · 1h ago
I do want to try out some background agents, but from my experience with Cursor’s (frontier model agents) frequency of going off the rails despite having rules and context to help avoid producing slop, I can’t see background agents being that generally useful yet.
ramoz · 1h ago
for you or anyone else that wants this to be real - I would love to test a solution out with you.
lacoolj · 2h ago
I've used this tool for a few months now and have been pretty impressed by it. It handles large quantities of tasks very well and is good at making tests for very specific/isolated functions.

I have found it is not very good when trying to make new projects with different react libraries, inside of existing projects (for instance, my admin UI that I had it place inside of my existing server project).

If you start noticing it change directories and move around and delete/move directories a lot, you should stop the process, reconsider what you're telling it to do and how, then start from scratch with a new task.

purpleidea · 2h ago
I've been playing with it, and I've been generally not impressed.

There are both obvious annoying UI bugs (which should be easy to fix unless they vibe coded the whole thing) and the output of the tool isn't very good for anything but the simplest problems.

If the model was really good, I'd love this, but it's not.

xnx · 1h ago
> If the model was really good, I'd love this, but it's not.

Might be worth trying again now:

"Jules now uses the advanced thinking capabilities of Gemini 2.5 Pro to develop coding plans, resulting in higher-quality code outputs"

natch · 3h ago
Why is the pricing so well hidden? I had to ask Grok. Google would not show even the overview page unless I click-to-agree to all their terms and conditions.

OK found a good page for the plans here… ymmv if you're not logged in:

https://gemini.google/subscriptions/

rvnx · 3h ago
It should be illegal to say "> Highest task limits" or change them retroactively like Claude or Cursor did
unreal6 · 2h ago
In the middle of a billing cycle (which could be a month or year, in some cases), I would agree
jondwillis · 1h ago
>had to use grok

Had to implies that you pointed other models at the task and they failed, or that grok is your go-to model for this.

Can you explain?

SchizoDuckie · 3h ago
Who in their right mind hands off tasks to one of these for their day job? They can never be trusted.
esafak · 3h ago
You have to review their work, the same as any human's. What's the matter, you don't like cheap assistants?
munificent · 3h ago
> What's the matter, you don't like cheap assistants?

I think the main reason I'm not personally excited about AI is that... no, I don't, actually.

I'm in my late 40s. I have had many opportunities to move into management. I haven't because while I enjoy working with others, I derive the most satisfaction from feeling like I'm getting my hands dirty and doing work myself.

Spending the entire day doing code reviews of my army of minions might be strictly more productive, but it's not a job I would enjoy having. I have never, for a second, felt some sort of ambitious impulse to move up the org chart and become some sort of executive giving marching orders.

The world that AI boosters are driving towards seems to me to be one where the only human jobs left are effectively middle management where the leaf nodes of the org chart are all machines. It may be the case that such a world has greater net productivity and the stock prices will go up.

But it's not a world that feels meaningful, dignified, or desirable to me.

lbrito · 2h ago
I feel exactly this but I'm in my mid 30s. You're lucky in the sense that you probably have a longer career and may be able to retire.
munificent · 2h ago
I'm definitely not at retirement age yet, but I do have to admit that I'm hopeful I can make it to retirement while still mostly working in a way that I enjoy.

At the same time, I've realized that "let me just try to squeeze out the last of my career" is a really unhealthy mindset for me to hold. It sort of locks me into a feeling like my best days are behind me or something.

So I am trying to dabble in using AI for coding and trying to make sure I stay open-minded and open to learning new things. I don't want to feel like a dinosaur.

freshtake · 39m ago
I've used all of the popular coding agents, including Jules. The reality to me is that they can and should be used for certain kinds of low severity and low complexity tasks (documentation, writing tests, etc.). They should not be used for the opposite end of the spectrum.

There are many perspectives on coding agents because there are many different types of engineers, with different levels of experience.

In my interactions I've found that junior engineers overestimate or overuse the capabilities of these agents, while more senior engineers are better calibrated.

The biggest challenge I see is what to do in 5 years once a generation of fresh engineers never learned how compilers, operating systems, hardware, memory, etc actually work. Innovation almost always requires deep understanding of the fundamentals, and AI may erode our interest in learning these critical bits of knowledge.

What I see as a hiring manager is senior (perhaps older) engineers commanding higher comp, while junior engineers become increasingly less in demand.

Agents are here to stay, but I'd estimate your best engineering days are still ahead.

esafak · 2h ago
You could consider yourself liberated to concentrate on higher level concerns like architecture and API/product design.
dingnuts · 2h ago
oh come on, I got into this field because I like to code.

now I'm liberated to do all the crap I don't like and never code. fuck off

vb-8448 · 3h ago
They will produce PRs (and probably shitty code) at a rate you are not able to review XD
lbrito · 2h ago
There will likely be another agent to review the PRs and make questionable choices :D
9dev · 2h ago
And all that with an energy requirement a lot higher than a single human just doing it right in the first place, and learning something in the process. It all seems so incredibly weird and futile to me.
vb-8448 · 2h ago
And token will burn and provider will bill XD
esafak · 2h ago
And it often does! When I don't like its work I provide stricter instructions and repeat if I think it will succeed.

I still end up ahead.

percentcer · 3h ago
Assistants can be taught
esafak · 2h ago
And these models get upgraded -- at a much faster averaged rate than humans. Continual vs punctuated improvement :)
SchizoDuckie · 3h ago
I can trust humans to do as I ask.
jondwillis · 1h ago
Have you met humans? I can’t trust myself with half of the things I do.

Not saying that I trust LLMs more…

jmtulloss · 1h ago
https://blog.singleton.io/posts/2025-06-14-coding-agents-cro...

The former CTO of stripe, for one.

They show you the code they produce. Why wouldn’t you trust it after reading it?

ActionHank · 3h ago
They can be great for focused tasks with very specific acceptance criteria. Especially in cases where you have broad test coverage that can verify nothing broke.

We already see bots that monitor repos to bump versions. I suspect we will see this expand to handle larger version bumps, minor issues, minor features. Basically junior dev learning tasks.

SchizoDuckie · 3h ago
Great. So Junior devs will be useless now. Now how are we going to train more senior devs that know what they're doing?
ianandrich · 3h ago
That's the neat part. We won't.
alex_suzuki · 3h ago
No need. In a year, senior devs will be useless as well. </sarcasm>
seunosewa · 2h ago
They will train themselves by doing open source projects with AI.
midnitewarrior · 3h ago
I really appreciate your optimism about a future world where you expect senior devs will be needed. How do we get the tech bros to share your vision for the future?
SchizoDuckie · 3h ago
As it stands right now, until there is some radically new way that doesn't hallucinate implementations, is grounded in security rules and actually understands what it's doing in the larger context of the system it's working in I am not really worried about my job.

I stopped worrying about what techbros think a long time ago. Yesterday on Twitter I saw one slinging a blockchain AI NFT filesystem that will ingest and organize your documents for you.

brap · 3h ago
I’m sorry but how is that any of your business?

If a company prefers small teams right now, at the cost of not having juniors to grow into seniors in the future, they are well within their rights to make that decision.

Might be an awful decision, might be a smart one, in any case there is no “we” here.

SchizoDuckie · 3h ago
How is that any of my business? Well, I'm a software dev by trade and hobby, and I hack the planet on the side and advise multibillion $$$ companies on the security mistakes they make.

Even for the next 5 years I'd like to be able to have some capable humans in my teams.

brap · 31m ago
Then hire juniors for your own team? How is this an issue?
vineyardmike · 2h ago
> I’m sorry but how is that any of your business?

Part of living in a society is considering the social impact of things. Such as the erosion of training opportunities for young talent.

Each business can make their own decisions, but someone should be thinking about the greater good. “Within your rights” doesn’t mean it’s a good thing, nor should that be the sole standard we set for members of our society. Same reason people hire interns and write technical blogs, open source code and sponsor school hackathons. Sometimes the greater good should be a consideration.

brap · 32m ago
>Same reason people hire interns and write technical blogs

I’m sorry but almost nobody does this for the greater good

asadm · 56m ago
i do. ALL. THE. TIME.
byefruit · 3h ago
How is this different from https://github.com/google-gemini/gemini-cli ?

Edit: it seems this is a hosted version. Would be nice if they actually joined up some of their products.

mattnewton · 3h ago
Idk, I think this is easier to talk about than “codex” by OpenAI, which means either the CLI or the web interface to an agent with its own computer.

(Or a deprecated code fine tuned model)

0x457 · 54m ago
Jules is web only from my understanding, similar to OpenAI's Codex (web version...)

You give it a task and it produces a PR. While gemini-cli is more like pair programming with AI.

esafak · 3h ago
Being hosted, it does not have access to your development environment. Its Ubuntu sandbox is quite restricted. https://jules.google/docs/environment/
computer23 · 35m ago
Waiting for Google to buy the rights to Ask Jeeves.
Retr0id · 3h ago
> over 140,000 code improvements shared publicly.

Where can I check them out?

UncleOxidant · 2h ago
What does it mean by "asynchronous coding agent" exactly? They don't go into any details there. Like how does this differ from Gemini CLI? Is this more of a pass a high level idea to it and then go on vacation sort of thing? If so, I don't see how that can't end badly.
nemomarx · 2h ago
give high level user stories to it > it writes code and tests and etc for several hours > returns to you when it thinks it's done for you to review a pull request or etc
UncleOxidant · 2h ago
I'm afraid that's a hard nope. Gemini CLI is already doing stuff I don't want it to unless I'm very careful to keep it on a short leash.
0x457 · 49m ago
Well, first Jules came before Gemini CLI. Second, that's okay, as long as it can verify its work (i.e. run tests) it will eventually figure out what to do.

Its sandbox is very limited and prevents proper grounding IMO. However, if their sandbox works for your project, it will be alright.

franze · 3h ago
the agent is good, the UI horrible.

"Usability declines in inverse proportion to the number of vice-presidents who sign the release notes." Law of Interface Inversion

esafak · 3h ago
theusus · 3h ago
Used it, didn't like it. Claude Code is far better because of the active collaboration part.
mvieira38 · 3h ago
Different use cases, IMO. With a cloud solution like this it's much easier to ask it to solve whatever issues or backlog tasks you have and continue working on your own on your main project. I don't think this is a solution for vibecoding or for the AI copilot crowd
throwup238 · 3h ago
It is also great for on the go when you only have a phone. I frequently fire off agents when I get a new idea or some backlog I want to tackle while I’m at the gym - the 2 minute rest periods between sets are perfect to write up a prompt or review some changes.
r0fl · 3h ago
I thought I would like it based on the pitch but gave up using it after just a handful of times

Liking kiro a lot these days

ghawkescs · 3h ago
How long is the queue for invites to Kiro these days? I joined the wait-list right after it launched.
jjani · 3h ago
Seemingly infinite, I don't think they've invited anyone from the list so far.
0x457 · 47m ago
Do you need an invitation? I'm just using my Amazon Q Dev account that I pay $20 a month for. Works fine with Kiro.
beefnugs · 2h ago
So are there just 100 developers sitting on the edge of their seats constantly refreshing all the spy reports from other AI companies, waiting to copy the exact same idea and shit it out at top speed?

Or is it more of a vibe code thing where every new feature from everyone is recreated by every other company in a matter of days?

Do they even realize they are destroying their own industry economics? The only reason anyone uses big tech is because there are no alternatives

dvngnt_ · 2h ago
How long do we think it will take for Google to rename this like they did from Bard > Gemini
arcticfox · 3h ago
My problem with Codex is it can't really run Docker. Can Jules or any other competitor?
achierius · 3h ago
Claude Code can.
0x457 · 46m ago
Entirely different product. Claude Code runs on my machine, Jules runs on some sandboxed Ubuntu VM in GCP without any input (beyond a high-level user story prompt)
herval · 2h ago
Am I being pedantic or does the jules.google landing page scream “howdie, kids” (the Buscemi meme)?

It tries to be funny and authentic, but the cheap-looking mascot and low-contrast text make it feel like IBM pretending to be a vibecoded startup.

Google has/had a distinct branding with its austere and no-nonsense style in the past, then moved into a clunky-but-not-AWS design aesthetic with GCP (which is still recognizable), and now the AI products just look so completely inconsistent, you can’t even tell they’re from Google

jmtulloss · 1h ago
Both from the design scheme and the process it uses to go about its business, Jules seems very inspired by replit
varispeed · 1h ago
They could call it Deidre. Missed opportunity.
simonpure · 1h ago
There's now also Gemini CLI GitHub Actions for a similar async experience -

https://github.com/google-github-actions/run-gemini-cli

kundi · 2h ago
Tried both Jules and Gemini CLI: heavily advertised and disappointing. Running it on any slightly more complex codebase, it will crash every few iterations and then complain I have drained all the credits (although it hasn’t done anything yet). It doesn't come close to living up to even basic expectations of their advertised generosity. Disappointing experience
oblio · 4h ago
What does this compete with?
felipemesquita · 3h ago
The codex thing inside ChatGPT, the copilot thing in the github web ui
throwup238 · 3h ago
OpenAI Codex, Github Copilot Agents, Cursor Background Agents, and Devin.
rmonvfer · 3h ago
Is Devin still alive?
hiatus · 1h ago
Yes, and Cognition AI bought Windsurf.
esafak · 3h ago
Hosted agents, followed by local CLI agents.
joshdick · 3h ago
Claude Code
mattnewton · 3h ago
No, Gemini-cli competes with that; this competes with the web-interface-around-agent-with-its-own-machine space,

not the pair-programming-on-your-machine space I would put the cli tools in

loloquwowndueo · 3h ago
They even had to choose a French-sounding name to make the comparison clear?
longtimelistnr · 3h ago
don't shoot the messenger, but it's supposed to be like a "butler sounding name"
loloquwowndueo · 1h ago
Got it - and Jenkins and Hudson were already taken.
nathan_douglas · 1h ago
Wadsworth is free. brb startup
span_ · 3h ago
Codex by openai
jebronie · 3h ago
it's way better than the github thing; in my experience it produces usable PRs
0x457 · 33m ago
A blind monkey smashing a keyboard can produce better PR and PR reviews than GitHub copilot. I don't get how they managed to make copilot so bad.
42lux · 3h ago
The naming is pathetic.
esafak · 3h ago
Jules the octopus!