If AI is so good at coding where are the open source contributions?

61 points · thm · 28 comments · 5/15/2025, 6:24:27 PM · pivot-to-ai.com ↗

Comments (28)

avbanks · 7h ago
This is exactly what I've been trying to point out: while LLMs and coding agents are certainly helpful, they're extremely over-hyped. We don't see a significant bump in open source contributions, optimizations, or innovation in general.
drob518 · 5h ago
Yep. It’s all frothy hype. It’s all crashing down at some point. The billion dollar valuations. Nvidia’s revenue. All of it. That’s not to say that it’s worthless, but everyone is throwing money around like the dot-com days and LLMs (thus far) are not nearly as transformative as the Internet. I’m even hearing that phrase investors use when it’s about to pop: “this time it’s different.”
api · 3h ago
I lived through the late 90s and the dot-com bubble. This is exactly the same. The Internet was real, but a lot of the hype crashed and a lot of ventures from those days failed.

LLMs are real but they are also overhyped and imbued with powers I don’t think they have.

The most interesting stuff with AI will be done after the bubble collapses.

rerdavies · 2h ago
I lived through the 80s. Those who owned Microsoft stock did pretty well. This is exactly the same.

:-P

johnisgood · 5h ago
I have made some "serious" contributions with the help of LLMs. Sadly I cannot share them, but hey, N = 1, so it is not significant.
jsheard · 6h ago
I keep seeing users of the AI-centric VSCode forks impotently complaining about Microsoft withholding their first-party extensions from those forks, rather than using their newfound AI superpowers to whip up open source replacements. That should be easy, right? Surely there's no need to be spoonfed by Microsoft when we're in the golden age of agentic coding?
jwitthuhn · 3h ago
There are some, but not a lot; current AI is a lot better at focusing on smaller, more well-defined problems than at stuff like "add this feature" or "fix this bug".

A good example is this PR for llama.cpp which the author says was written 99% by AI. https://github.com/ggerganov/llama.cpp/pull/11453

When the problem statement can be narrowed down to "Here is some code that already produces a correct result, optimize it as much as possible" then the AI doesn't need to understand any context specific to your application at all and the chance it will produce something usable is much higher.
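A toy sketch of why that framing is so tractable (hypothetical code, nothing from the actual PR): the existing correct implementation doubles as the test oracle, so an optimized rewrite can be judged with zero application-specific context.

```python
import random

def dot_reference(a, b):
    """Obviously-correct baseline: this is the 'code that already
    produces a correct result' in the prompt."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total

def dot_unrolled(a, b):
    """A faster rewrite (4-wide unrolling, SIMD-style). Whatever the
    LLM produces here gets judged purely against the baseline."""
    total = 0.0
    n = len(a)
    i = 0
    while i + 4 <= n:
        total += (a[i] * b[i] + a[i + 1] * b[i + 1]
                  + a[i + 2] * b[i + 2] + a[i + 3] * b[i + 3])
        i += 4
    while i < n:  # leftover tail elements
        total += a[i] * b[i]
        i += 1
    return total

# The correctness check needs no project-specific knowledge at all.
xs = [random.random() for _ in range(1001)]
ys = [random.random() for _ in range(1001)]
assert abs(dot_reference(xs, ys) - dot_unrolled(xs, ys)) < 1e-9
```

The reviewer's job collapses to "does it match the reference, and is it faster", which is exactly the kind of narrow, verifiable task the comment describes.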

Maybe it will eventually be better at big picture stuff but for now I don't see it.

rerdavies · 2h ago
The problem I have is keeping up.

Current-generation coding assistants are capable of doing things they could not do mere months ago, and are dramatically more capable than they were a year ago. I recently gave my coding assistant a task that was much larger than I usually trust my coding assistant with, and it produced a very large chunk of code that was remarkably well implemented. (And then rewrote about 2/3 of it in order to add an innovative feature. But still.)

As things stand at this particular moment, I am unsure of the upper limit of what I can trust this month's AI coding assistant with. Learning to use one's tools properly is a big part of programming.

MoonGhost · 6h ago
Probably because AI-generated code cannot be copyrighted. And the second reason is that it's not as good as the AI sellers tell you.
mcny · 5h ago
> Probably because AI-generated code cannot be copyrighted. And the second reason is that it's not as good as the AI sellers tell you.

Is this true? I anal but I have a publicly available GitHub repo which shall go unnamed where most of the heavy lifting is by Claude and Gemini. If I can't copyright this code, does that mean I can't license it under AGPL? If someone copied some code from my repo, does that somehow now taint their code as well?

My application is a simple console app. The original code mostly worked. I have been updating it for the past month or so when I run this application and discover some defect. I have tried hard to avoid the temptation to write much of the code myself with this application and only made minor edits myself, opting to try coaxing an LLM to do the right thing.

This is not for work or even remotely work-adjacent, but I have been pleasantly surprised by the code quality. It still doesn't replace a programmer, but it is impressive for what is essentially a glorified autocomplete iterated a little.
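"Autocomplete iterated" is a fair one-line description of the decoding loop. A toy sketch (a trivial bigram counter standing in for the model; all code hypothetical) shows the loop shape:

```python
from collections import Counter, defaultdict

corpus = "the model picks the next word and the loop repeats".split()

# Count which word follows which -- a crude stand-in for a learned model.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def complete(word, steps):
    """Greedy decoding: predict the likeliest next word, append it,
    repeat -- literally autocomplete iterated."""
    out = [word]
    for _ in range(steps):
        if word not in follows:
            break
        word = follows[word].most_common(1)[0][0]
        out.append(word)
    return " ".join(out)

print(complete("the", 3))  # "the model picks the"
```

An LLM swaps the bigram table for a vastly better next-token predictor, but the outer loop is the same shape.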

dragonwriter · 5h ago
> Is this true?

It is true that the product of purely a prompt passed through an AI and returned cannot be copyrighted in the US currently. Whether a particular body of code that involves AI generation but has a more involved workflow can be copyrighted is a case-by-case question: for copyright registration you need to document the actual human and AI process and present it to the Copyright Office, and they will make a decision. If you didn't document this at the time, you've probably accidentally rendered the work unprotectable even if it would have been protectable had the facts been documented. (Technically, copyright exists at creation without registration, but since you can't sue for infringement without registration, if you make it so you can't register the copyright, you've basically made it unenforceable even if it might abstractly exist.)

> I anal

IANAL should really be kept together and capitalized.

> but I have a publicly available GitHub repo which shall go unnamed where most of the heavy lifting is by Claude and Gemini. If I can't copyright this code, does that mean I can't license it under AGPL?

Absolutely. The AGPL is a copyright license; if you have no copyright, you can't enforce any licensing restrictions. You can obviously include uncopyrightable-because-of-how-it-was-generated (and therefore public-domain, unless it is violating someone else's copyright) code alongside other code that is AGPL-licensed, but you can't enforce the AGPL terms on the code you don't have a copyright to, because the copyright is what gives you the power to prohibit uses of the exclusive rights of copyright (copying, derivative works, etc.) that you haven't licensed.

> If someone copied some code from my repo, does that somehow now taint their code as well?

No, if they copied legally-unprotected code from your repo, it has no adverse effect on their repo. They, too, cannot enforce any licensing restrictions on the unprotected code, because neither you nor they have a copyright on it to enforce, but it doesn't adversely impact their ability to enforce their license on any of the code they have that is protected by a copyright.

heavyset_go · 5h ago
> Is this true? I anal but I have a publicly available GitHub repo which shall go unnamed where most of the heavy lifting is by Claude and Gemini. If I can't copyright this code, does that mean I can't license it under AGPL? If someone copied some code from my repo, does that somehow now taint their code as well?

No one knows, it's untested in court.

As far as I understand, you can own and license the code you modify or create yourself, but if someone decides to copy and paste code they can prove came from an LLM that they found in your project, you don't have much of a standing to enforce your license on them.

protocolture · 5h ago
>Is this true?

Depends. Probably not.

Code produced solely by AI with no inputs or modification is not copyrightable.

But there's an amount of human labor and direction you can mix in to clear that hurdle.

Likely:

1. If you never tell anyone it's AI-created, it doesn't matter.

2. Someone will test this in court one day and we will discover where the line is actually drawn.

Until then you should follow the Terms and Conditions of the tool you use, it probably assigns you all applicable rights to the generated code.

Not a lawyer etc.

dragonwriter · 4h ago
> If you never tell anyone it's AI-created, it doesn't matter

If you try to register the copyright (which you legally must to sue over copyright infringement), you must state whether it was AI generated and, if it was AI generated, provide the information that supports it being copyrightable for the Copyright Office to evaluate. Falsifying this information is a crime.

If it ever goes to court and the code did not provably exist before AI coding became common, you will probably be questioned about it, and failing to respond truthfully is, again, a crime. So, when talking about the ability to claim and enforce legal rights, "if you never tell anyone about it" means "if you are willing to actively and criminally lie".

> Until then you should follow the Terms and Conditions of the tool you use, it probably assigns you all applicable rights to the generated code.

It probably explicitly doesn't claim such rights (because the tool owner has no basis for claiming them, especially since they may not even exist), but it can't "assign" you anything the party offering the T&C's doesn't have in the first place, and even if it tried to that assignment would have no effect.

protocolture · 4h ago
>If you try to register the copyright

Which legal jurisdiction? Because I don't have to register shit?

>You don’t need to register for copyright in Australia. The moment an idea or creative concept is documented on paper or electronically it is automatically protected by copyright in Australia. Copyright protection is free and automatic under the Copyright Act 1968.

>It probably explicitly doesn't claim such rights

Most I have read assign everything to the user, Midjourney being an outlier because it reserves the right to distribute and remix your work without a bigger license fee.

They do this for practical reasons, they don't want to be seen to own anything that might be associated with criminal activity.

Sudowrite has their terms reproduced in human-readable form.

> We understand it's super important for you to retain ownership over the content you create, so whether you use Sudowrite to brainstorm, enhance your ideas, or fine-tune your writing, all the writing you generate remains yours. When you input text into Sudowrite and use its features to generate or improve content, the output is based on your initial input and creative direction. This means that the resulting content, enriched or transformed by Sudowrite, is considered your intellectual property. You have initiated the creative process, and therefore, the contributions made by Sudowrite in response to your inputs are yours to own, use, and distribute as you see fit.

kazinator · 3h ago
If you get something upstreamed into a project by misrepresenting it as human work copyrighted by you, you are doing a great harm to that project. Don't do that.
drob518 · 5h ago
You own your prompts. /s
Gigachad · 2h ago
How would this even be enforced? If I privately used AI to generate some code and then posted it on GitHub, how could someone verify whether it was written by me or by AI to decide if copyright applies?
heavyset_go · 5h ago
Your modifications are owned by you.
kazinator · 4h ago
Open source dev here. I cannot merge something generated by AI, because it is plagiarized, and therefore incompatible with the project license (pretty much any license: BSD, MIT, Apache, GPLn, ...).

A significant contribution to a project requires that the contributor put a copyright notice and license on it. The license has to be compatible with the project being contributed to. (Note: I'm not talking about copyright assignment, but, yes, that's a thing also, e.g. with GNU projects.)

You can't put your copyright notice and license on something that you generated with AI.

Small changes to existing code are in a gray area. They are based more on the existing code than on the unauthorized training data, but the latter is hanging over them like a spectre.

I won't merge anything knowing it was done with AI, even a one liner bug fix. If AI were used to describe the issue and propose a fix, and then someone coded it based on that, I think that would be okay; that's something analogous to a clean room approach.

rerdavies · 4h ago
How could somebody who has actually used coding assistants be asking for evidence of open source projects that had been 100% written by an AI? That's not what the tools are used for.

Here: https://rerdavies.github.io/ About (at least) 30% of the NEW code in my open source repos is currently written by AI (older code is not). I don't think this is extraordinary. This is the new normal. This is what AI-generated source code looks like. I don't imagine anyone could actually point to a specific AI-generated chunk of code in this codebase and say with certainty that it had been written by an AI.

I can’t help feeling that people who make these sorts of complaints have not used coding assistants. Or perhaps have not used coding assistants recently. Are non-professional programmers writing about AI coding assistants they have never used really any better than non-programmers submitting AI generated pull requests? I think not.

davidgerard · 4h ago
of course, that doesn't answer the question Evans was asking at all.
rerdavies · 2h ago
The question Evans asked:

    But if AI is so obviously superior … show us the
    code. Where's the receipts? Let's say, where's
    the open source code contributions using AI?
The answer I provided:

    Here: https://rerdavies.github.io/ About (at least)
    30% of the NEW code in my open source repos is
    currently written by AI (older code is
    not).
The point I wanted to make: code currently written by AIs isn't obviously different or segregated from code that is not. That's not how the tool is used.
SpecialistK · 6h ago
It's pretty obvious that AI code generation is not up to snuff for mission-critical projects (yet? ever?). It can prove handy for small hobbyist projects with low stakes, and it may provide time savings and alternative perspectives for those who know how to sniff out BS.

But even airliner autopilot systems, which are much more mature and have a proven track record, are not trusted as a replacement for two trained experts to have final control.

The overall trend I've seen with AI creations (not just in programming) is that the tech is cool and improving, but people have trouble recognizing where it's suitable and where it isn't. I've found chatbots to be fantastic for recipes and cooking advice, or for banal conversations I couldn't have with real people (especially after a drink...), and pretty shoddy for real programming projects or creative endeavors. It isn't a panacea, and we'd benefit a lot from people recognizing that more.

lalith_c · 6h ago
maybe open source projects are prohibiting code generated by AI?
v3ss0n · 6h ago
Where? Name them.
rpdillon · 6h ago
The article is actually pretty interesting. The only one mentioned is Curl, and that's because of abuses by uneducated developers using AI with no idea of what they're doing.

I actually think that's the central thesis of the article, especially the last example that discusses the LLVM compiler project getting raked over the coals after not engaging with a non-developer that had used AI to make pull requests, and admitted he had no idea what the code did.

Buried in the middle of the article is a paragraph that I think sums up the main point well.

> More broadly, the very hardest problem in open source is not code, it’s people — how to work with others. Some AI users just don’t understand the level they simply aren’t working at.

The point being that without a good programmer, AI is not very useful.

johnisgood · 5h ago
> with no idea of what they're doing.

And that is the problem. I made some contributions to projects with the help of LLMs, but I had to know what I was doing.

> The point being that without a good programmer, AI is not very useful.

Exactly!