If AI is so good at coding where are the open source contributions?

83 thm 37 5/15/2025, 6:24:27 PM pivot-to-ai.com ↗

Comments (37)

avbanks · 32d ago

This is exactly what I've been trying to point out, while LLM's and coding agents are certainly helpful, they're extremely over-hyped. We don't see a significant bump in open source contributions, optimizations, and innovation in general.

drob518 · 32d ago

Yep. It’s all frothy hype. It’s all crashing down at some point. The billion dollar valuations. Nvidia’s revenue. All of it. That’s not to say that it’s worthless, but everyone is throwing money around like the dot-com days and LLMs (thus far) are not nearly as transformative as the Internet. I’m even hearing that phrase investors use when it’s about to pop: “this time it’s different.”

api · 32d ago

I lived through the late 90s and the dot.com bubble. This is exactly the same. The Internet was real but a lot of the hype crashed and a lot of ventures from those days failed.

LLMs are real but they are also overhyped and imbued with powers I don’t think they have.

The most interesting stuff with AI will be done after the bubble collapses.

rerdavies · 32d ago

I lived through the 80s. Those who owned Microsoft stock did pretty well. This is exactly the same.

:-P

johnisgood · 32d ago

I have made some "serious" contributions with the help of LLMs. Sadly I cannot share them, but hey, N = 1, so it is not significant.

pavel_lishin · 32d ago

We shall add the anecdote to the bucket.

johnisgood · 32d ago

Thank you! :D Gotta be one of the most useless ones.

There are probably more uses of LLMs than people admit to, though.

jsheard · 32d ago

I keep seeing users of the AI-centric VSCode forks impotently complaining about Microsoft withholding their first-party extensions from those forks, rather than using their newfound AI superpowers to whip up open source replacements. That should be easy, right? Surely there's no need to be spoonfed by Microsoft when we're in the golden age of agenic coding?

rerdavies · 32d ago

How could somebody who has actually used coding assistants be asking for evidence of open source projects that had been 100% written by an AI? That's not what the tools are used for.

Here: https://rerdavies.github.io/ About (at least) 30% of the NEW code in my open source repos are currently written by AI (older code is not). I don't think this is extraordinary. This is the new normal. This is what AI-generated source code looks like. I don't imagine anyone could actually point to a specific AI-generated chunk of code in this codebase and say with certainty that it had been written by an AI.

I can’t help feeling that people who make these sorts of complaints have not used coding assistants. Or perhaps have not used coding assistants recently. Are non-professional programmers writing about AI coding assistants they have never used really any better than non-programmers submitting AI generated pull requests? I think not.

davidgerard · 32d ago

of course, that doesn't answer the question Evans was asking at all.

rerdavies · 32d ago

The question Evans asked:

    But if AI is so obviously superior … show us the 
    code. Where’s the receipts? Let’s say, where’s 
    the open source code contributions using AI?"

The answer I provided:

    Here: https://rerdavies.github.io/ About (at least) 
    30% of the NEW code in my open source repos are 
    currently written by AI (older code is
    not).

The point I wanted to make: code currently written by AIs isn't obviously different or segregated from code that is not. That's not how the tool is used.

0points · 32d ago

About 30% of my new code is completed using intellisense.

davidgerard · 32d ago

His challenge was:

    Share some AI-derived pull requests that deal with non-obvious corner cases or non-trivial bugs from mature F/OSS projects. I’ll also accept high-quality documentation that isn’t just the sort of wasted space and slop that I always tell juniors not to write.

You answered with a personal project.

I keep seeing this from AI fans. They don't read the question and they answer a completely different question they think they have an answer to.

rerdavies · 32d ago

1) That is a mature F/OSS project.

2) That's not what the tool is used for.

kazinator · 32d ago

Open source dev here. I cannot merge something generated by AI, because it is plagiarized, and therefore incompatible with the project license (pretty much any license: BSD, MIT, Apache, GPLn, ...).

A significant contribution to a project requires that the the contributor put a copyright note and license on it. The license has to be compatible with the project being contributed to. (Note: I'm not talking about copyright assignment, but, yes, that's a thing also, e.g. with GNU projects.)

You can't put your copyright notice and license on something that you generated with AI.

Small changes to existing code are in a gray area. They are more based on the existing code that the unauthorized training data, but the latter is hanging over them like a spectre.

I won't merge anything knowing it was done with AI, even a one liner bug fix. If AI were used to describe the issue and propose a fix, and then someone coded it based on that, I think that would be okay; that's something analogous to a clean room approach.

jwitthuhn · 32d ago

There are some but not a lot, current AI is a lot better focusing on smaller more well-defined problems than stuff like "add this feature" or "fix this bug".

A good example is this PR for llama.cpp which the author says was written 99% by AI. https://github.com/ggerganov/llama.cpp/pull/11453

When the problem statement can be narrowed down to "Here is some code that already produces a correct result, optimize it as much as possible" then the AI doesn't need to understand any context specific to your application at all and the chance it will produce something usable is much higher.

Maybe it will eventually be better at big picture stuff but for now I don't see it.

rerdavies · 32d ago

The problem I have is keeping up.

Current generation coding assistants are capable of doing things they could not do mere months ago, and are dramatically more capable than they were a year ago. I recently gave my coding assistant a task that was much larger than I usually trust my coding assistant with; and it produced a very large chunk of code that was remarkably well implemented. (And then rewrote about 2/3 of in order to add an innovative feature. But still).

As things stand at this particular moment, I am unsure of the upper limit of what I can trust this month's AI coding assistant with. Learning to use ones' tools properly is a big part of programming.

MoonGhost · 32d ago

Probably because AI generated code cannot be copyrighted. And the second reason it's not as good as AI sellers tell you.

mcny · 32d ago

> Probably because AI generated code cannot be copyrighted. And the second reason it's not as good as AI sellers tell you.

Is this true? I anal but I have a publicly available GitHub repo which shall go unnamed where most of the heavy lifting is by Claude and Gemini. If I can't copyright this code, does that mean I can't license it under AGPL? If someone copied some code from my repo, does that somehow now taint their code as well?

My application is a simple console app. The original code mostly worked. I have been updating it for the past month or so when I run this application and discover some defect. I have tried hard to avoid the temptation to write much of the code myself with this application and only made minor edits myself, opting to try coaxing an LLM to do the right thing.

This is not for work or even remotely work adjacent but I have been pleasantly surprised by the code quality. It still doesn't replace a programmer but it is impressive for what is essentially a glorified autocorrect auto complete iterated a little.

dragonwriter · 32d ago

> Is this true?

It is true that products of purely a prompt passed through an AI and returned cannot be copyrighted in the US currently, whether a particular body of code that involves AI generation but has a more involved workflow can be copyrighted is a case by case question (for copyright registration you need to document the actual human and AI process and present it to the copyright office and they will make a decision; if you didn't do this at the time, you've probably accidentally rendered it unprotectable even if would have been were the facts documented. Technically, copyright exists at creation without registration, but since you can't sue for infringement without registration, if you make it so you can't register the copyright, you've basically made it unenforceable even if it might abstractly exist.)

> I anal

IANAL should really be kept together and capitalized.

> but I have a publicly available GitHub repo which shall go unnamed where most of the heavy lifting is by Claude and Gemini. If I can't copyright this code, does that mean I can't license it under AGPL?

Absolutely. AGPL is a copyright license. If you have no copyright, you can't enforce any licensing restrictions. You can obviously include uncopyrightable-because-of-how-it-was-generated (and therefore public domain, unless it is violating someone else's copyright) code along with other code that is AGPL licensed, but you can't enforce the AGPL terms on the code you don't have a copyright too, because the copyright is what gives you permission to prohibit uses of the exclusive rights of copyright (copying, derivative works, etc.) that you haven't licensed.

> If someone copied some code from my repo, does that somehow now taint their code as well?

No, if they copied legally-unprotected code from your repo, it has no adverse effect on their repo. They, too, cannot enforce any licensing restrictions on the unprotected code, because neither you nor they have a copyright on it to enforce, but it doesn't adversely impact their ability to enforce their license on any of the code they have that is protected by a copyright.

heavyset_go · 32d ago

> Is this true? I anal but I have a publicly available GitHub repo which shall go unnamed where most of the heavy lifting is by Claude and Gemini. If I can't copyright this code, does that mean I can't license it under AGPL? If someone copied some code from my repo, does that somehow now taint their code as well?

No one knows, it's untested in court.

As far as I understand, you can own and license the code you modify or create yourself, but if someone decides to copy and paste code they can prove came from an LLM that they found in your project, you don't have much of a standing to enforce your license on them.

protocolture · 32d ago

>Is this true?

Depends. Probably not.

Code produced solely by AI with no inputs or modification is not copyrightable.

But theres an amount of human labor and direction you can mix in to pass that hurdle.

Likely:

1. If you never tell anyone its AI created it doesnt matter 2. Someone will test this in court one day and we will discover where the line is actually drawn

Until then you should follow the Terms and Conditions of the tool you use, it probably assigns you all applicable rights to the generated code.

Not a lawyer etc.

dragonwriter · 32d ago

> If you never tell anyone its AI created it doesn't matter

If you try to register the copyright (which you legally must to sue over copyright infringement), you must state whether it was AI generated and, if it was AI generated, provide the information that supports it being copyrightable for the Copyright Office to evaluate. Falsifying this information is a crime.

If it ever goes to court and the code did not provably exist before AI coding became common, you will be probably be questioned about it, and failing to respond truthfully is, again, a crime. So, when talking about the ability to claim and enforce legal rights, "if you never tell anyone about it", means, "if you are willing to actively and criminally lie".

> Until then you should follow the Terms and Conditions of the tool you use, it probably assigns you all applicable rights to the generated code.

It probably explicitly doesn't claim such rights (because the tool owner has no basis for claiming them, especially since they may not even exist), but it can't "assign" you anything the party offering the T&C's doesn't have in the first place, and even if it tried to that assignment would have no effect.

protocolture · 32d ago

>If you try to register the copyright

Which legal jurisdiction? Because I don't have to register shit?

>You don’t need to register for copyright in Australia. The moment an idea or creative concept is documented on paper or electronically it is automatically protected by copyright in Australia. Copyright protection is free and automatic under the Copyright Act 1968.

>It probably explicitly doesn't claim such rights

Most I have read assign everything to the user, Midjourney being an outlier because it reserves the right to distribute and remix your work without a bigger license fee.

They do this for practical reasons, they don't want to be seen to own anything that might be associated with criminal activity.

Sudowrite has their terms reproduced human readable.

> We understand it's super important for you to retain ownership over the content you create, so whether you use Sudowrite to brainstorm, enhance your ideas, or fine-tune your writing, all the writing you generate remains yours. When you input text into Sudowrite and use its features to generate or improve content, the output is based on your initial input and creative direction. This means that the resulting content, enriched or transformed by Sudowrite, is considered your intellectual property. You have initiated the creative process, and therefore, the contributions made by Sudowrite in response to your inputs are yours to own, use, and distribute as you see fit.

kazinator · 32d ago

If you get something upstreamed into a project based on misrepresenting is as human work copyrighted by you, you are doing a great harm to that project. Don't do that.

mcny · 32d ago

It isn't "upstreamed" at all. The project readme clearly says special thanks to Claude. This is my own repo.

What I wanted to avoid is hurting anyone who might come across this code in the future and unwittingly copy paste it but reading the replies, that is not really a concern if the "fear" is the code might be in the public domain. I don't mind the code being in the public domain. What I would mind is if anthropic or Google somehow owned the copyright.

drob518 · 32d ago

You own your prompts. /s

Gigachad · 32d ago

How would this even be enforced? If I privately used AI to generate some code and then post it on github. How could someone verify if it was written by me or AI to decide if copyright applies or not?

heavyset_go · 32d ago

Your modifications are owned by you.

vborovikov · 32d ago

I recently discovered an open source project which I believe was completely vibe coded. 90k lines of code, 135 commits starting from an empty project.

I cloned the repo and tried to run the examples. Nothing worked. The problem was many data structures and variables were initialized empty/zero but the code wasn't designed to work with empty data at all.

ramity · 32d ago

I think it's fair to say AI generated code isn't visibly making a meaningful impact in open source. Absence of evidence is not evidence of absence, but that shouldn't be interpreted as a defense to orgs or the fanciful predictions made by tech CEOs. In its current forms, AI feels comparable to piracy where the real impact is fuzzy and companies claim a number is higher or lower depending on the weather.

Yes, open source projects would be the main place where these claims could be publicly verifiable, but established open source projects aren't just code--they're usually complex, organic, and ever shifting organizations of people. I'd argue the metric of interacting with a large group of people whom have cultivated their own working process and internal communication patterns is closer to AGI than coding assistant, so maybe the goal posts we're using for AI PRs are too grand. I think it's expected to hear claims from within walled gardens, where processes and teams can be upended at will, that AI is making an unverifiable splash, because they're precisely the environments where AI could be the most disruptive.

Additionally, I think we're willfully looking in the wrong places when trying to measure AI impact by looking for AI PRs. Programmers don't flag PRs when they use IntelliJ or confer with X flavor of LLM(tm), and expecting mature open source projects to have AI PRs seems as dubious as expecting then to use blockchain or any other technology that could be construed as disruptive. It just may not be compatible or reasonable with their current process. Calculated change is often incremental and boring, where real progress is only felt by looking away.

I made a really simple project that automatically forwards browser console logs to a central server, programmatically pull the file(s) from the trace, and had an LLM consume a templated prompt + error + file. It'd make a PR with what it thought was the correct fix. Sometimes it was helpful. The problem was it needed to do more than code, because the utility of a one shot prompt to PR is low.

flembat · 31d ago

I let a popular agentic ai refactor some code for me today, it looked really nice, but despite the fact that it was meant to be splitting a file full of functions that already compiled and worked it rewrote them all and broke them comprehensively, it then tried to research how to fix them, even though working code was right there in the file and just kept making the code worse and worse. It also had some limit which meant it left the code partly refactored, ran out of its quota left everything broken and then when restarted suggested refactoring some more classes.

SpecialistK · 32d ago

It's pretty obvious that AI code generation is not up to snuff for mission-critical projects (yet? ever?) - it can prove handy for small hobbyist projects with low stakes and it may provide time-savings and alternative perspectives for those who subsequently know how to sniff out BS.

But even airliner autopilot systems, which are much more mature and have a proven track record, are not trusted as a replacement for two trained experts to have final control.

The overall trend I've seen with AI creations (not just in programming) is that the tech is cool and improving, but people have trouble recognizing where its suitable and where it isn't. I've found chatbots to be fantastic for recipes and cooking advice, or banal conversations I couldn't have with real people (especially after a drink...) and pretty shoddy for real programming projects or creative endeavors. It isn't a panacea and we'd benefit a lot from people recognizing that more.

lalith_c · 32d ago

maybe open source projects are prohibiting code generated by AI?

v3ss0n · 32d ago

Where?, name them

rpdillon · 32d ago

The article is actually pretty interesting. The only one mentioned is Curl, and that's because of abuses by uneducated developers using AI with no idea of what they're doing.

I actually think that's the central thesis of the article, especially the last example that discusses the LLVM compiler project getting raked over the coals after not engaging with a non-developer that had used AI to make pull requests, and admitted he had no idea what the code did.

Buried in the middle of the article is a paragraph that I think sums up the main point well.

> More broadly, the very hardest problem in open source is not code, it’s people — how to work with others. Some AI users just don’t understand the level they simply aren’t working at.

The point being that without a good programmer, AI is not very useful.

johnisgood · 32d ago

> with no idea of what they're doing.

And that is the problem. I made some contributions to projects with the help of LLMs, but I had to know what I am doing.

> The point being that without a good programmer, AI is not very useful.

Exactly!

Tell HN: Help restore the tax deduction for software dev in the US (Section 174)

GCP Outage (status.cloud.google.com)

Frequent reauth doesn't make you more secure (tailscale.com)

A receipt printer cured my procrastination (laurieherault.com)

The last six months in LLMs, illustrated by pelicans on bicycles (simonwillison.net)

Magistral — the first reasoning model by Mistral AI (mistral.ai)

If the moon were only 1 pixel: A tediously accurate solar system model (2014) (joshworth.com)

Apple announces Foundation Models and Containerization frameworks, etc (apple.com)

Working on databases from prison (turso.tech)

Jemalloc Postmortem (jasone.github.io)

Containerization is a Swift package for running Linux containers on macOS (github.com)

Research suggests Big Bang may have taken place inside a black hole (port.ac.uk)

Apple introduces a universal design across platforms (apple.com)

US-backed Israeli company's spyware used to target European journalists (apnews.com)

Marines being mobilized in response to LA protests (cnn.com)

I convinced HP's board to buy Palm and watched them kill it (philmckinney.substack.com)

Chatterbox TTS (github.com)

WhatsApp introduces ads in its app (nytimes.com)

Congratulations on creating the one billionth repository on GitHub (github.com)

Bruteforcing the phone number of any Google user (brutecat.com)

How I program with agents (crawshaw.io)

Honda conducts successful launch and landing of experimental reusable rocket (global.honda)

"Localhost tracking" explained. It could cost Meta €32B (zeropartydata.es)

Launch HN: Vassar Robotics (YC X25) – $219 robot arm that learns new skills

Start your own Internet Resiliency Club (bowshock.nl)

Kagi Reaches 50k Users (kagi.com)

Why SSL was renamed to TLS in late 90s (2014) (tim.dierks.org)

OpenAI dropped the price of o3 by 80% (twitter.com)

Waymo rides cost more than Uber or Lyft and people are paying anyway (techcrunch.com)

Air India flight to London crashes in Ahmedabad with more than 240 onboard (theguardian.com)

Self-Host and Tech Independence: The Joy of Building Your Own (ssp.sh)

I have reimplemented Stable Diffusion 3.5 from scratch in pure PyTorch (github.com)

Meta invests $14.3B in Scale AI to kick-start superintelligence lab (nytimes.com)

We’re secretly winning the war on cancer (vox.com)

Joining Apple Computer (2018) (folklore.org)

Building supercomputers for autocrats probably isn't good for democracy (helentoner.substack.com)

Danish Ministry Replaces Windows and Microsoft Office with Linux and LibreOffice (heise.de)

Convert photos to Atkinson dithering (gazs.github.io)

Rendering Crispy Text on the GPU (osor.io)

Show HN: I made a 3D printed VTOL drone (tsungxu.com)

FSE meets the FBI (blog.freespeechextremist.com)

Low-background Steel: content without AI contamination (blog.jgc.org)

Show HN: Chili3d – A open-source, browser-based 3D CAD application

Successful people set constraints rather than chasing goals (joanwestenberg.com)

Ask HN: How do I give back to people helped me when I was young and had nothing?

Brian Wilson has died (pitchfork.com)

Endometriosis is an interesting disease (owlposting.com)

Finding Shawn Mendes (2019) (ericneyman.wordpress.com)

iPhone 11 emulation done in QEMU (github.com)

macOS Tahoe brings a new disk image format (eclecticlight.co)

If AI is so good at coding where are the open source contributions?

Comments (37)