> Copilot excels at low-to-medium complexity tasks in well-tested codebases, from adding features and fixing bugs to extending tests, refactoring, and improving documentation.
Bounds bounds bounds bounds. The important part for humans seems to be maintaining boundaries for AI. If your well-tested codebase has the tests built through AI, it's probably not going to work.
I think it's somewhat telling that they can't share numbers for how they're using it internally. I want to know that Microsoft, the company famous for dog-fooding, is using this day in and day out, with success. There's real stuff in there, and my brain has an insanely hard time separating the trillion dollars of hype from the usefulness.
timrogers · 1h ago
We've been using Copilot coding agent internally at GitHub, and more widely across Microsoft, for nearly three months. That dogfooding has been hugely valuable, with tonnes of valuable feedback (and bug bashing!) that has helped us get the agent ready to launch today.
So far, the agent has been used by about 400 GitHub employees in more than 300 of our repositories, and we've merged almost 1,000 pull requests contributed by Copilot.
In the repo where we're building the agent, the agent itself is actually the #5 contributor - so we really are using Copilot coding agent to build Copilot coding agent ;)
(Source: I'm the product lead at GitHub for Copilot coding agent.)
overfeed · 51m ago
> we've merged almost 1,000 pull requests contributed by Copilot
I'm curious to know how many Copilot PRs were not merged and/or required human take-overs.
Every bullet hole in that plane is one of the 1k PRs contributed by Copilot. The missing dots, and whole missing planes, are unaccounted for. I.e., "AI ruined my morning".
NitpickLawyer · 32m ago
> In the repo where we're building the agent, the agent itself is actually the #5 contributor - so we really are using Copilot coding agent to build Copilot coding agent ;)
Really cool, thanks for sharing! Would you perhaps consider implementing something like these stats that aider keeps on "aider writing itself"? - https://aider.chat/HISTORY.html
binarymax · 1h ago
So I need to ask: what is the overall goal of your project? What will you do in, say, 5 years from now?
timrogers · 1h ago
What I'm most excited about is allowing developers to spend more of their time working on the work they enjoy, and less of their time working on mundane, boring or annoying tasks.
Most developers don't love writing tests, or updating documentation, or working on tricky dependency updates - and I really think we're heading to a world where AI can take the load of that and free me up to work on the most interesting and complex problems.
petetnt · 12m ago
What about developers who do enjoy writing for example high quality documentation? Do you expect that the status quo will be that most of the documentation will be AI slop and AI itself will just bruteforce itself through the issues? How close are we to the point where the AI could handle "tricky dependency updates", but not being able to handle "most interesting and complex problems"? Who writes the tests that are required for the "well tested" codebases for GitHub Copilot Coding Agent to work properly?
binarymax · 59m ago
Thanks for the response… do you see a future where engineers are just prompting all the time? Do you see a timeline in which today's programming languages are “low level” and rarely coded by hand?
ilaksh · 1h ago
That's a completely nonsensical question given how quickly things are evolving. No one has a five year project timeline.
binarymax · 1h ago
Absolutely the wrong take. We MUST think about what might happen in several years. Anyone who says we shouldn’t is not thinking about this technology correctly. I work on AI tech. I think about these things. If the teams at Microsoft or GitHub are not, then we should be pushing them to do so.
ilaksh · 20m ago
He asked that in the context of an actual specific project. It did not make sense the way he asked it. And it's the executives' job to plan that out five years down the line, although I guarantee you none of them are trying to predict that far.
ilaksh · 1h ago
What model does it use? gpt-4.1? Or can it use o3 sometimes? Or the new Codex model?
aaroninsf · 1h ago
Question you may have a very informed perspective on:
Where are we wrt the agent surveying open issues (say, via JIRA) and evaluating which ones it would be most effective at handling, and taking them on, ideally with some check-in for confirmation?
Or, contrariwise, from having product management agents which do track and assign work?
The entire website was created by Claude Sonnet through Windsurf Cascade, but with the “Fair Witness” prompt embedded in the global rules.
If you regularly guide the LLM to “consult a user experience designer”, “adopt the multiple perspectives of a marketing agency”, etc., it will make rather decent suggestions.
I’ve been having pretty good success with this approach, granted mostly at the scale of starting the process with “build me a small educational website to convey this concept”.
aegypti · 6m ago
Tell Claude the site is down!
twodave · 2h ago
I feel like I saw a quote recently that said 20-30% of MS code is generated in some way. [0]
In any case, I think this is the best use case for AI in programming—as a force multiplier for the developer. It’s for the best benefit of both AI and humanity for AI to avoid diminishing the creativity, agency and critical thinking skills of its human operators. AI should be task oriented, but high level decision-making and planning should always be a human task.
So I think our use of AI for programming should remain heavily human-driven for the long term. Ultimately, its use should involve enriching humans’ capabilities over churning out features for profit, though there are obvious limits to that.
> I feel like I saw a quote recently that said 20-30% of MS code is generated in some way. [0]
Similar to Google. MS now requires devs to use AI.
ilaksh · 1h ago
You might want to study the history of technology and how rapidly compute efficiency has increased as well as how quickly the models are improving.
In this context, assuming that humans will still be able to do high level planning anywhere near as well as an AI, say 3-5 years out, is almost ludicrous.
_se · 58m ago
Reality check time for you: people were saying this exact thing 3 years ago. You cannot extrapolate like that.
greatwhitenorth · 1h ago
How much was previously generated by intellisense and other code gen tools before AI? What is the delta?
tmpz22 · 2h ago
How much of that is protobuf stubs and other forms of banal autogenerated code?
twodave · 2h ago
Updated my comment to include the link. As much as 30% specifically generated by AI.
OnionBlender · 1h ago
The 2nd paragraph contradicts the title.
The actual quote by Satya says, "written by software".
shafyy · 1h ago
I would still wager that most of the 30% is some boilerplate stuff. Which is ok. But it sounds less impressive with that caveat.
Scene_Cast2 · 2h ago
I tried doing some vibe coding on a greenfield project (using gemini 2.5 pro + cline). On one hand - super impressive, a major productivity booster (even compared to using a non-integrated LLM chat interface).
I noticed that LLMs need a very heavy hand in guiding the architecture, otherwise they'll add architectural tech debt. One easy example is that I noticed them breaking abstractions (putting things where they don't belong). Unfortunately, there's not that much self-retrospection on these aspects if you ask about the quality of the code or if there are any better ways of doing it. Of course, if you pick up that something is in the wrong spot and prompt better, they'll pick up on it immediately.
I also ended up blowing through $15 of LLM tokens in a single evening. (Previously, as a heavy LLM user including coding tasks, I was averaging maybe $20 a month.)
candiddevmike · 2h ago
> I also ended up blowing through $15 of LLM tokens in a single evening.
This is a feature, not a bug. LLMs are going to be the next "OMG my AWS bill" phenomenon.
Scene_Cast2 · 2h ago
Cline very visibly displays the ongoing cost of the task. Light edits are about 10 cents, and heavy stuff can run a couple of bucks. It's just that the tab accumulates faster than I expect.
eterm · 1h ago
> Light edits are about 10 cents
Some well-paid developers will excuse this with, "Well if it saved me 5 minutes, it's worth an order of magnitude more than 10 cents".
Which is true, however there's a big caveat: Time saved isn't time gained.
You can "save" 1,000 hours every night, but you don't actually get those 1,000 hours back.
PretzelPirate · 2h ago
> Cline very visibly displays the ongoing cost of the task
LLMs are now being positioned as "let them work autonomously in the background" which means no one will be watching the cost in real time.
Perhaps I can set limits on how much money each task is worth, but very few would estimate that properly.
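A minimal sketch of what such a per-task limit could look like, assuming the tool reports token counts for each model call. `TaskBudget`, `BudgetExceeded` and the per-1k-token prices passed in are hypothetical names and numbers for illustration, not any vendor's API or pricing:

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a call would push a task past its spending cap."""


@dataclass
class TaskBudget:
    # Hypothetical per-task cap, in US dollars.
    limit_usd: float
    spent_usd: float = 0.0

    def charge(self, input_tokens: int, output_tokens: int,
               usd_per_1k_input: float, usd_per_1k_output: float) -> float:
        """Record the cost of one model call, refusing it if it would exceed the cap."""
        cost = (input_tokens / 1000) * usd_per_1k_input \
            + (output_tokens / 1000) * usd_per_1k_output
        if self.spent_usd + cost > self.limit_usd:
            raise BudgetExceeded(
                f"call costing ${cost:.2f} would exceed the ${self.limit_usd:.2f} cap"
            )
        self.spent_usd += cost
        return cost
```

The hard part raised above stays unsolved by this: the cap only helps if someone picks a sensible `limit_usd` for each task.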
BeetleB · 2h ago
> I also ended up blowing through $15 of LLM tokens in a single evening.
Consider using Aider, and aggressively managing the context (via /add, /drop and /clear).
My tool Plandex[1] allows you to switch between automatic and manual context management. It can be useful to begin a task with automatic context while scoping it out and making the high level plan, then switch to the more 'aider-style' manual context management once the relevant files are clearly established.
If you want to use Cline and are at all price sensitive (in these ranges) you have to do manual context management just for that reason. I find that too cumbersome and use Windsurf (currently with Gemini 2.5 pro) for that reason.
falcor84 · 2h ago
> LLMs need a very heavy hand in guiding the architecture, otherwise they'll add architectural tech debt
I wonder if the next phase would be the rise of (AI-driven?) "linters" that check that the implementation matches the architecture definition.
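As a rough, non-AI illustration of the idea, a check like this could be written with Python's standard `ast` module. The layer names and the `ALLOWED_IMPORTS` table are made up for the example, and a real architecture definition would need far more than import rules:

```python
import ast

# Hypothetical architecture definition: each layer lists the layers it may import.
ALLOWED_IMPORTS = {
    "ui": {"services"},
    "services": {"storage"},
    "storage": set(),
}


def layer_violations(layer: str, source: str) -> list[str]:
    """Return the imports in `source` that the rule for `layer` does not allow."""
    allowed = ALLOWED_IMPORTS[layer]
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            top = name.split(".")[0]
            # Only police imports of other known layers; ignore stdlib and third-party code.
            if top in ALLOWED_IMPORTS and top != layer and top not in allowed:
                violations.append(name)
    return violations
```

For instance, calling `layer_violations("ui", "import storage.db")` flags the `ui` layer reaching directly into `storage`, which is the kind of broken abstraction described in the parent comment.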
dontlikeyoueith · 2h ago
And now we've come full circle back to UML-based code generation.
Everything old is new again!
tmpz22 · 2h ago
While it's being touted for greenfield projects, I've noticed a lot of failures when it comes to bootstrapping a stack.
For example it (Gemini 2.5) really struggles with newer ecosystems like FastAPI when wiring libraries like SQLAlchemy, Pytest, Python-playwright, etc., together.
I find more value in bootstrapping myself, and then using it to help with boiler plate once an effective safety harness is in place.
nodja · 1h ago
I wish they optimized things before adding more crap that will slow things down even more. The only thing that's fast with copilot is the autocomplete, it sometimes takes several minutes to make edits on a 100 line file regardless of the model I pick (some are faster than others). If these models had a close to 100% hit rate this would be somewhat fine, but going back and forth with something that takes this long is not productive. It's literally faster to open claude/chatgpt on a new tab and paste the question and code there and paste it back into vscode than using their ask/edit/agent tools.
I cancelled my Copilot subscription last week and when it expires in two weeks I'll most likely shift to local models for autocomplete/simple stuff.
brushfoot · 1h ago
My experience has mostly been the opposite -- changes to several-hundred-line files usually only take a few seconds.
That said, months ago I did experience the kind of slow agent edit times you mentioned. I don't know where the bottleneck was, but it hasn't come back.
I'm on library WiFi right now, "vibe coding" (as much as I dislike that term) a new tool for my customers using Copilot, and it's snappy.
nodja · 43m ago
Here's a video of what it looks like with sonnet 3.7.
Kicking the can down the road. So we can all produce more code faster but there is NSB. Most of my time isn't spent writing the code anyway.
muglug · 2h ago
> Copilot excels at low-to-medium complexity tasks
Oh cool!
> in well-tested codebases
Oh ok never mind
lukehoban · 1h ago
As peer commenters have noted, coding agent can be really good at improving test coverage when needed.
But also as a slightly deeper observation - agentic coding tools really do benefit significantly from good test coverage. Tests are a way to “box in” the agent and allow it to check its work regularly. While they aren’t necessary for these tools to work, they can enable coding agents to accomplish a lot more on your behalf.
(I work on Copilot coding agent)
CSMastermind · 1h ago
In my experience they write a lot of pointless tests that technically increase coverage while not actually adding much more value than a good type system/compiler would.
They also have a tendency to suppress errors instead of fixing them, especially when the right thing to do is throw an error on some edge case.
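A small sketch of the difference, using a made-up port parser rather than any real generated code. The function names are hypothetical:

```python
def parse_port_suppressed(value: str) -> int:
    """The pattern described above: the error is hidden behind a default."""
    try:
        return int(value)
    except ValueError:
        return 0  # bad input silently becomes a plausible-looking value


def parse_port_strict(value: str) -> int:
    """Fail loudly on the edge case so the caller has to deal with it."""
    port = int(value)  # raises ValueError on non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port
```

Both versions can reach full line coverage in a test suite, but only the second one surfaces bad input instead of passing it downstream.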
abraham · 2h ago
Have it write tests for everything and then you've got a well tested codebase.
danielbln · 1h ago
Caveat emptor, I've seen some LLMs mock the living hell out of everything, to the point of not testing much of anything. Something to be aware of.
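To make the failure mode concrete, here is a minimal sketch using the standard library's `unittest.mock`. `apply_discount` is a hypothetical function under test, invented for the example:

```python
from unittest.mock import MagicMock


def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function under test."""
    return round(price * (1 - percent / 100), 2)


def over_mocked_test() -> None:
    # The function under test is replaced by a mock, so this only checks the mock
    # and would keep passing even if apply_discount were completely broken.
    fake = MagicMock(return_value=90.0)
    assert fake(100.0, 10) == 90.0


def real_test() -> None:
    # Exercises the actual implementation, including a zero-discount edge case.
    assert apply_discount(100.0, 10) == 90.0
    assert apply_discount(100.0, 0) == 100.0
```

Both tests pass today, but only `real_test` would fail if the discount arithmetic changed.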
yen223 · 21m ago
I've seen too many human operators do that too. Definitely a problem to watch out for
eikenberry · 1h ago
You forgot the /s
throwaway12361 · 1h ago
In my experience it works well even without good testing, at least for greenfield projects. It just works best if there are already tests when creating updates and patches.
shwouchk · 1h ago
I played around with it quite a bit. it is both impressive and scary. most importantly, it tends to indiscriminately use dependencies from random tiny repos, and often enough not the correct ones, for major projects. buyer beware.
boomskats · 2h ago
My buddy is at GH working on an adjacent project & he hasn't stopped talking about this for the last few days. I think I've been reminded to 'make sure I tune into the keynote on Monday' at least 8 times now.
I gave up trying to watch the stream after the third authentication timeout, but if I'd known it was this I'd maybe have tried a fourth time.
unshavedyak · 2h ago
What specific keynote are they referring to? I'm curious, but thus far my searches have failed
babelfish · 2h ago
MS Build is today
tmpz22 · 2h ago
I’m always hesitant to listen to the line coders on projects because they’re getting a heavy dose of the internal hype every day.
I’d love for this to blow past cursor. Will definitely tune in to see it.
dontlikeyoueith · 2h ago
>I’m always hesitant to listen to the line coders on projects because they’re getting a heavy dose of the internal hype every day.
I'm senior enough that I get to frequently see the gap between what my dev team thinks of our work and what actual customers think.
As a result, I no longer care at all what developers (including myself on my own projects) think about the quality of the thing they've built.
throwaway12361 · 1h ago
Word of advice: just go to YouTube and skip the MS registration tax
jerpint · 2h ago
These kinds of patterns allow compute to take much more time than a single chat since it is asynchronous by nature, which I think is necessary to get to working solutions on harder problems
lukehoban · 1h ago
Yes. This is a really key part of why Copilot coding agent feels very different to use than Copilot agent mode in VS Code.
In coding agent, we encourage the agent to be very thorough in its work, and to take time to think deeply about the problem. It builds and tests code regularly to ensure it understands the impact of changes as it makes them, and stops and thinks regularly before taking action.
These choices would feel too “slow” in a synchronous IDE based experience, but feel natural in a “assign to a peer collaborator” UX. We lean into this to provide as rich of a problem solving agentic experience as possible.
(I’m working on Copilot coding agent)
fvold · 37m ago
The biggest change Copilot has done for me so far is to have me replace my VSCode with VSCodium to be sure it doesn't sneak any uploading of my code to a third party without my knowing.
I'm all for new tech getting introduced and made useful, but let's make it all opt in, shall we?
qwertox · 34m ago
Care to explain? Where are they uploading code to?
In the early days of LLMs, I had developed an "agent" using a GitHub Actions + issues workflow[1], similar to how this works. It was very limited but kinda worked, i.e. you assign it a bug and it fired an action, did some architect/editing tasks, validated changes and finally sent a PR.
In hindsight it was a mistake that Google killed Google Code. Then again, I guess they wouldn't have put enough effort into it to develop into a real GitHub alternative.
Now Microsoft sits on a goldmine of source code and has the ability to offer AI integration even to private repositories. I can upload my code into a private repo and discuss it with an AI.
The only thing Google can counter with would be to build tools which developers install locally, but even then I guess that the integration would be limited.
And considering that Microsoft owns the "coding OS" VS Code, it makes Google look even worse. Let's see what they come up with tomorrow at Google I/O, but I doubt that it will be a serious competition for Microsoft. Maybe for OpenAI, if they're smart, but not for Microsoft.
joelthelion · 1h ago
I don't know, I feel this is the wrong level to place the AI at this moment. Chat-based AI programming (such as Aider) offers more control, while being almost as convenient.
softwaredoug · 2h ago
Is Copilot a classic case of slow megacorp gets outflanked by more creative and unhindered newcomers (ie Cursor)?
It seems Copilot could have really owned the vibe coding space. But that didn’t happen. I wonder why? Lots of ideas gummed up in organizational inefficiencies, etc?
ilaksh · 1h ago
This is a direct threat to Cursor. The smarter the models get, the less often programmers really need to dig into an IDE, even one with AI in it. Give it a couple of years and there will be a lot of projects that were done just by assigning tasks where no one even opened Cursor or anything.
OutOfHere · 2h ago
GitHub had this exact feature late last year itself, perhaps under a slightly different name.
Copilot Workspace could take a task, implement it and create a PR - but it had a linear, highly structured flow, and wasn't deeply integrated into the GitHub tools that developers already use like issues and PRs.
With Copilot coding agent, we're taking all of the great work on Copilot Workspace, and all the learnings and feedback from that project, and integrating it more deeply into GitHub and really leveraging the capabilities of 2025's models, which allow the agent to be more fluid, asynchronous and autonomous.
(Source: I'm the product lead for Copilot coding agent.)
throwup238 · 2h ago
Are you thinking of Copilot Workspaces?
That seemed to drop off the Github changelog after February. I’m wondering if that team got reallocated to the copilot agent.
WorldMaker · 1h ago
Probably. Also this new feature seems like an expansion/refinement of Copilot Workspaces to better fit the classic Github UX: "assign an issue to Copilot to get a PR" sounds exactly like the workflow Copilot Workspaces wanted to have when it grew up.
theusus · 2h ago
I have been so far disappointed by copilot's offerings. It's just not good enough for anything valuable. I don't want you to write my getter and setter. And call it a day.
rvz · 1h ago
I think we expected disappointment with this one. (I expected it at least)[0]
But the upgraded Copilot was just in response to Cursor and Windsurf.
Which model does it use? Will this let me select which model to use? I have seen a big difference in the type of code that different models produce, although their prompts may be to blame/credit in part.
qwertox · 40m ago
I assume you can select whichever one you want (GPT-4o, o3-mini, Claude 3.5, 3.7, 3.7 thinking, Gemini 2.0 Flash, GPT-4.1 and the previews o1, Gemini 2.5 Pro and o4-mini), subject to the pricing multipliers they announced recently [0].
Edit: From the TFA: Using the agent consumes GitHub Actions minutes and Copilot premium requests, starting from entitlements included with your plan.
Check in unreviewed slop straight into the codebase. Awesome.
timrogers · 1h ago
Copilot pushes its work to a branch and creates a pull request, and then it's up to you to review its work, approve and merge.
Copilot literally can't push directly to the default branch - we don't give it the ability to do that - precisely because we believe that all AI-generated code (just like human generated code) should be carefully reviewed before it goes to production.
(Source: I'm the product lead for Copilot coding agent.)
olex · 2h ago
> Once Copilot is done, it’ll tag you for review. You can ask Copilot to make changes by leaving comments in the pull request.
To me, this reads like it'll be a good junior and open up a PR with its changes, letting you (the issue author) review and merge. Of course, you can just hit "merge" without looking at the changes, but then it's kinda on you when unreviewed stuff ends up in main.
DeepYogurt · 2h ago
Management: "Why aren't you going faster now that the AI generates all the code and we fired half the dev team?"
tmpz22 · 2h ago
A good junior has strong communication skills, humility, asks many good questions, has imagination, and a tremendous amount of human potential.
odiroot · 2h ago
I'm waiting for the first unicorn that uses just vibe coding.
erikerikson · 2h ago
I expect it to be a security nightmare
freeone3000 · 1h ago
And why would that matter?
postalrat · 2h ago
Now developers can produce 20x the slop and refactor at 5x speed.
OutOfHere · 2h ago
In my experience in VSCode, Claude 3.7 produced more unsolicited slop, whereas GPT-4.1 didn't. Claude aggressively paid attention to type compatibility. Each model would have its strengths.
[0] https://www.cnbc.com/2025/04/29/satya-nadella-says-as-much-a...
https://aider.chat/
1 - https://github.com/plandex-ai/plandex
Also, a bit more on auto vs. manual context management in the docs: https://docs.plandex.ai/core-concepts/context-management
https://streamable.com/rqlr84
The claude and gemini models tend to be the slowest (yes, including flash). 4o is currently the fastest but still not great.
Good to see an official way of doing this.
1. https://github.com/asadm/chota
We'll see.
[0] https://news.ycombinator.com/item?id=43904611
[0] https://docs.github.com/en/copilot/managing-copilot/monitori...