AI tooling must be disclosed for contributions

290 freetonik 127 8/21/2025, 6:49:57 PM github.com ↗

Comments (127)

Waterluvian · 1h ago
I’m not a big AI fan but I do see it as just another tool in your toolbox. I wouldn’t really care how someone got to the end result that is a PR.

But I also think that if a maintainer asks you to jump before submitting a PR, you politely ask, “how high?”

cvoss · 1h ago
It does matter how and where a PR comes from, because reviewers are fallible and finite, so trust enters the equation inevitably. You must ask "Do I trust where this came from?" And to answer that, you need to know where it came from.

If trust didn't matter, there wouldn't have been a need for the Linux Kernel team to ban the University of Minnesota for attempting to intentionally smuggle bugs through the PR process as part of an unauthorized social experiment. As it stands, if you / your PRs can't be trusted, they should not even be admitted to the review process.

KritVutGu · 1h ago
This is it exactly.

Slop generators being available to everyone makes everyone less trustworthy, from a maintainer's POV. Thus, the circle of trust, for any given maintainer, shrinks starkly.

People do not become maintainers because they want to battle malicious, or even criminally negligent crap. They expect benign and knowledgeable contributors, or at least benign and willing to do their homework ones.

Being a maintainer is already hugely thankless. It's hard work (harder than writing code), and it comes with a lot less recognition. Not to mention all the newcomers that (a) maintainers usually eagerly educate, but then (b) disappear.

Screw up the social contract for maintainers even more, and they'll go extinct. (Edit: if a maintainer gets a whiff of some contributor working against them, rather than with them, they'll either ban the contributor forever, or just quit the project.)

Any sane project should categorically ban AI-assisted contributions, and extend their Signed-off-by definition, after a cut-off-date, to carry an explicit statement by the contributor that the code is free of AI-output. If this rules out "agentic IDE"s, that's a win.

ToucanLoucan · 45m ago
The sheer amount of entitlement on display by very pro-AI people genuinely boggles the mind.
koolba · 1h ago
> You must ask "Do I trust where this came from?" And to answer that, you need to know where it came from.

No you don’t. You can’t outsource trust determinations. Especially to the people you claim not to trust!

You make the judgement call by looking at the code and your known history of the contributor.

Nobody cares if contributors use an LLM or a magnetic needle to generate code. They care if bad code gets introduced or bad patches waste reviewers’ time.

falcor84 · 1h ago
Trust is absolutely a thing. Maintaining an open source project is an unreasonably demanding and thankless job, and it would be even more so if you had to treat every single PR as if it's a high likelihood supply-chain attack.
fnimick · 24m ago
While true, we really should be treating every single piece of external code as though it's malicious.
geraneum · 38m ago
> Nobody cares if contributors use an LLM or a magnetic needle to generate code.

That’s the exact opposite of what the author is saying. He mentions that [if the code is not good, or you are a beginner] he will help you get to the finish line, but if it’s LLM code, he shouldn’t be putting in the effort because there’s no human on the other side.

It makes sense to me.

dsjoerg · 1h ago
You haven't addressed the primary stated rationale from the linked content: "I try to assist inexperienced contributors and coach them to the finish line, because getting a PR accepted is an achievement to be proud of. But if it's just an AI on the other side, I don't need to put in this effort, and it's rude to trick me into doing so."
nosignono · 1h ago
> I wouldn’t really care how someone got to the end result that is a PR.

I can generate 1,000 PRs today against an open source project using AI. I think you do care, you are only thinking about the happy path where someone uses a little AI to draft a well constructed PR.

There are a lot of ways AI can be used to quickly overwhelm a project maintainer.

oceanplexian · 1h ago
> I can generate 1,000 PRs today against an open source project using AI.

Then perhaps the way you contribute, review, and accept code is fundamentally wrong and needs to change with the times.

It may be that technologies like Github PRs and other VCS patterns are literally obsolete. We've done this before throughout many cycles of technology, and these are the questions we need to ask ourselves as engineers, not stick our heads in the sand and pretend it's 2019.

whatevertrevor · 49m ago
I don't think throwing out the concept of code reviews and version control is the correct response to a purported rise in low-effort high-volume patches. If anything it's even more required.
kelvinjps10 · 45m ago
Why is it incorrect? And what would be the new way? AI to review the changes of AI?
Waterluvian · 1h ago
In that case a more correct rule (and probably one that can be automatically enforced) for that issue is a max number of PRs or opened issues per account.
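A minimal sketch of how such a cap might be checked automatically. The limit of 3 and the input format (one author name per open PR) are assumptions for illustration, not anything the project actually uses:

```python
from collections import Counter

# Hypothetical cap; a real project would pick its own number.
MAX_OPEN_PRS_PER_ACCOUNT = 3

def accounts_over_limit(open_pr_authors, limit=MAX_OPEN_PRS_PER_ACCOUNT):
    """Return {account: open_pr_count} for accounts exceeding the limit."""
    counts = Counter(open_pr_authors)
    return {account: n for account, n in counts.items() if n > limit}

# Made-up data: one author name per currently open PR.
open_prs = ["alice", "bob", "spammer", "spammer", "spammer", "spammer", "alice"]
print(accounts_over_limit(open_prs))
```

A bot could run a check like this when a PR is opened and close or label anything from an account that is over the cap.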
raincole · 1h ago
When one side has much more "scalability" than the other, then the other side has very strong motivation to match up.

- People use AI to write cover letters. If the companies don't filter them out automatically, they're screwed.

- Companies use AI to interview candidates. No one wants to spend their personal time talking to a robot. So the candidates start using AI to take interviews for them.

etc.

If you don't at least tell yourself that you don't allow AI PRs (even just as a white lie) you'll one day use AI to review PRs.

oceanplexian · 1h ago
Both sides will use AI and it will ultimately increase economic productivity.

Imagine living before the invention of the printing press, and then lamenting that we should ban them because it makes it "too easy" to distribute information and will enable "low quality" publications to have more reach. Actually, this exact thing happened, but the end result was it massively disrupted the world and economy in extremely positive ways.

bootsmann · 1h ago
> Both sides will use AI and it will ultimately increase economic productivity.

Citation needed, I don’t think the printing press and gpt are in any way comparable.

alfalfasprout · 39m ago
The mental gymnastics the parent poster went through to equate an LLM to the printing press in this sense are mind-boggling.
ionelaipatioaei · 30m ago
> Both sides will use AI and it will ultimately increase economic productivity.

In some cases sure but it can also create the situation where people just waste time for nothing (think AI interviewing other AIs - this might generate GDP by people purchasing those services but I think we can all agree that this scenario is just wasting time and resource without improving society).

jrflowers · 4m ago
> Imagine living before the invention of the printing press, and then lamenting that we should ban them because it makes it "too easy" to distribute information

Imagine seeing “rm -rf / is a function that returns ‘Hello World!’” and thinking “this is the same thing as the printing press”

https://bsky.app/profile/lookitup.baby/post/3lu2bpbupqc2f

renrutal · 1h ago
I won't put it as "just another tool". AI introduces a new kind of tool where the ownership of the resulting code is not straightforward.

If, in the dystopian future, a justice court you're subjected to decides that Claude was trained on Oracle's code, and all Claude users are possibly in breach of copyright, it's easier to nuke from orbit all disclosed AI contributions.

wahnfrieden · 1h ago
You should care. If someone submits a huge PR, you’re going to waste time asking questions and comprehending their intentions if the answer is that they don’t know either. If you know it’s generated and they haven’t reviewed it themselves, you can decide to shove it back into an LLM for next steps rather than expect the contributor to be able to do anything with your review feedback.

Unreviewed generated PRs can still be helpful starting points for further LLM work if they achieve desired results. But close reading with consideration of authorial intent, giving detailed comments, and asking questions from someone who didn't write or read the code is a waste of your time.

That's why we need to know if a contribution was generated or not.

KritVutGu · 56m ago
You are absolutely right. AI is just a tool to DDoS maintainers.

Any contributor who was shown to post provably untested patches used to lose credibility. And now we're talking about accommodating people who don't even understand how the patch is supposed to work?

wahnfrieden · 10m ago
That’s not what I said though. LLM output, even unreviewed and without understanding, can be a useful artifact. I do it all the time - generate code, try running it, and then if I see it works well, I can decide to review it and follow up with necessary refactoring before integrating it. Parts of that can be contributed too. We’re just learning new etiquettes for doing that productively, and that does include testing the PR btw (even if the code itself is not understood or reviewed).

Example where this kind of contribution was accepted and valuable, inside this ghostty project https://x.com/mitchellh/status/1957930725996654718

alfalfasprout · 40m ago
The reality is as someone that helps maintain several OSS projects you vastly underestimate the problem that AI-assisted tooling has created.

On the one hand, it's lowered the barrier to entry for certain types of contributions. But on the other hand getting a vibe-coded 1k LOC diff from someone that has absolutely no idea how the project even works is a serious problem because the iteration cycle of getting feedback + correctly implementing it is far worse in this case.

Also, the types of errors introduced tend to be quite different between humans and AI tools.

It's a small ask but a useful one to disclose how AI was used.

quotemstr · 1h ago
As a project maintainer, you shouldn't make unenforceable rules that you and everyone else know people will flout. Doing so makes you seem impotent and diminishes the respect people have for rules in general.

You might argue that by making rules, even futile ones, you at least establish expectations and take a moral stance. Well, you can make a statement without dressing it up as a rule. But you don't get to be sanctimonious that way I guess.

natrius · 1h ago
Unenforceable rules are bad, but if you tweak the rule to always require some sort of authorship statement (e.g. "I wrote this by hand" or "I wrote this with Claude"), then the honor system will mostly achieve the desired goal of calibrating code review effort.
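As a rough sketch of how the "always require a statement" variant could be checked mechanically: the `Authorship:` line format below is a hypothetical convention invented for this example, not something the linked project defines.

```python
import re

# Hypothetical convention: the PR description contains a line such as
#   "Authorship: I wrote this by hand"  or  "Authorship: I wrote this with Claude"
AUTHORSHIP_LINE = re.compile(r"^Authorship:[ \t]*(.+)$", re.MULTILINE)

def authorship_statement(pr_description):
    """Return the declared authorship text, or None if no statement is present."""
    match = AUTHORSHIP_LINE.search(pr_description)
    if not match:
        return None
    text = match.group(1).strip()
    return text or None  # treat an empty "Authorship:" line as missing

print(authorship_statement("Fix crash on resize\n\nAuthorship: I wrote this with Claude"))
print(authorship_statement("Fix crash on resize"))
```

This only verifies that a statement exists. Whether it is truthful still rests on the honor system described above.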
voxl · 1h ago
Except you can enforce this rule some of the time. People discover that AI was used or suspect it all the time, and people admit to it after some pressure all the time.

Not every time, but sometimes. The threat of being caught isn't meaningless. You can decide not to play in someone else's walled garden if you want but the least you can do is respect their rules, bare minimum of human decency.

quotemstr · 1h ago
It. doesn't. matter.

The only legitimate reason to make a rule is to produce some outcome. If your rule does not result in that outcome, of what use is the rule?

Will this rule result in people disclosing "AI" (whatever that means) contributions? Will it mitigate some kind of risk to the project? Will it lighten maintainer load?

No. It can't. People are going to use the tools anyway. You can't tell. You can't stop them. The only outcome you'll get out of a rule like this is making people incrementally less honest.

recursive · 1h ago
Sometimes you can tell.
blaufuchs · 1h ago
> Will it lighten maintainer load?

Yes that is the stated purpose, did you read the linked GitHub comment? The author lays out their points pretty well, you sound unreasonably upset about this. Are you submitting a lot of AI slop PRs or something?

P.S. Talking. Like. This. Is. Really. Ineffective. It. Makes. Me. Just. Want. To. Disregard. Your. Point. Out. Of. Hand.

devmor · 1h ago
There are plenty of argumentative and opinionated reasons to say it matters, but there is one that can't really be denied - reviewers (and project maintainers, even if they aren't reviewers) are people whose time deserves to be respected.

If this rule discourages low quality PRs or allows reviewers to save time by prioritizing some non-AI-generated PRs, then it certainly seems useful in my opinion.

KritVutGu · 51m ago
> As a project maintainer, you shouldn't make unenforceable rules

Total bullshit. It's totally fine to declare intent.

You are already incapable of verifying / enforcing that a contributor is legally permitted to submit a piece of code as their own creation (Signed-off-by), and do so under the project's license. You won't embark on looking for prior art, for the "actual origin" of the code, whatever. You just make them promise, and then take their word for it.

Razengan · 1h ago
> if a maintainer asks you to jump before submitting a PR, you politely ask, “how high?”

or say "fork you."

neilv · 1h ago
There is also IP taint when using "AI". We're just pretending that there's not.

If someone came to you and said "good news: I memorized the code of all the open source projects in this space, and can regurgitate it on command", you would be smart to ban them from working on code at your company.

But with "AI", we make up a bunch of rationalizations. ("I'm doing AI agentic generative AI workflow boilerplate 10x gettin it done AI did I say AI yet!")

And we pretend the person never said that they're just loosely laundering GPL and other code in a way that rightly would be existentially toxic to an IP-based company.

ineedasername · 58m ago
Courts (at least in the US) have already ruled that use of ingested data for training is transformative. There are lots of details to figure out, but the genie is out of the bottle.

Sure it’s a big hill to climb in rethinking IP laws to align with a societal desire that generating IP continue to be a viable economic work product, but that is what’s necessary.

alfalfasprout · 38m ago
> Courts (at least in the US) have already ruled that use of ingested data for training is transformative

This is far from settled law. Let's not mischaracterize it.

Even so, an AI regurgitating proprietary code that's licensed in some other way is a very real risk.

popalchemist · 25m ago
No more so than regurgitating an entire book. While it could technically be possible in the case of certain repos that are ubiquitous on the internet (and therefore overrepresented in training data to the point that they are "regurgitated" verbatim, in whole), it is extremely unlikely and would only occur after deliberate prompting. The NYT suit against Open AI shows (in discovery) that the NYT was only able to get partial results after deliberately prompting the model with portions of the text they were trying to force it to regurgitate.

So. Yes, technically possible. But impossible by accident. Furthermore when you make this argument you reveal that you don't understand how these models work. They do not simply compress all the data they were trained on into a tiny storable version. They are effectively multiplication matrices that allow math to be done to predict the most likely next token (read: 2-3 Unicode characters) given some input.

So the model does not "contain" code. It "contains" a way of doing calculations for predicting what text comes next.
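A toy illustration of that "calculation, not storage" idea, assuming a single matrix and a three-word vocabulary. All names and numbers here are made up, and real models stack many such layers with billions of weights:

```python
import math

def next_token(hidden, weights, vocab):
    """Pick the most likely next token from one matrix-vector product.

    hidden  : list of floats standing in for the model's state for the input
    weights : one row of floats per vocabulary entry
    """
    # Matrix-vector multiply: one score (logit) per vocabulary entry.
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in weights]
    # Softmax turns scores into probabilities (shifted by the max for stability).
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(vocab)), key=probs.__getitem__)
    return vocab[best], probs

# Made-up numbers purely for illustration.
vocab = ["foo", "bar", "baz"]
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
token, probs = next_token([0.5, 2.0], weights, vocab)
print(token)
```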

Finally, let's say that it is possible that the model does spit out not entire works, but a handful of lines of code that appear in some codebase.

This does not constitute copyright infringement, as a) the lines in question represent a tiny portion of the whole work (and copyright only protects against the reduplication of whole works or significant portions of the work), and b) there are a limited number of ways to accomplish a certain function and it is not only possible but inevitable that two devs working independently could arrive at the same implementation. Therefore using an identical implementation (which is what this case would be) of a part of a work is no more illegal than the use of a certain chord progression or melodic phrasing or drum rhythm. Courts have ruled about this thoroughly.

tick_tock_tick · 1h ago
> There is also IP taint when using "AI". We're just pretending that there's not.

I don't think anyone who's not monetarily incentivized to pretend there are IP/Copyright issues actually thinks there are. Luckily everyone is for the most part just ignoring them and the legal system is working well and not allowing them an inch to stop progress.

neilv · 42m ago
> I don't think anyone who's not monetarily incentivized to pretend there are IP/Copyright issues actually thinks there are.

Why do you think that about people who disagree with you? You're responding directly to someone who's said they think there's issues, and not pretending. Do you think they're lying? Did you not read what they said?

And AFAICT a lot of other people think similarly to me.

The perverse incentives to rationalize are on the side of the people looking to exploit the confusion, not the people who are saying "wait a minute, what you're actually doing is..."

So a gold rush person claiming opponents must be pretending because of incentives... seems like the category of "every accusation is a confession".

luma · 1h ago
Also ban StackOverflow and nearly any text book in the field.

The reality is that programmers are going to see other programmers code.

neilv · 1h ago
Huge difference, and companies recognized the difference, right up until "AI" hype.
JoshTriplett · 1h ago
"see" and "copy" are two different things. It's fine to look at StackOverflow to understand the solution to a problem. It's not fine to copy and paste from StackOverflow and ignore its license or attribution.

Content on StackOverflow is under CC-by-sa, version depends on the date it was submitted: https://stackoverflow.com/help/licensing . (It's really unfortunate that they didn't pick a license compatible with code; at one point they started to move to the MIT license for code, but then didn't follow through on it.)

timeon · 1h ago
How is that same thing?
thallavajhula · 1h ago
I’m loving today. HN’s front page is filled with some good sources today. No nonsense sensationalism or preaching AI doom, but more realistic experiences.

I’ve completely turned off AI assist on my personal computer and only use AI assist sparingly on my work computer. It is so bad at compound work. AI assist is great at atomic work. The rest should be handled by humans and use AI wisely. It all boils down back to human intelligence. AI is only as smart as the human handling it. That’s the bottom line.

tick_tock_tick · 1h ago
> AI is only as smart as the human handling it.

I think I'm slowly coming around to this viewpoint too. I really just couldn't understand how so many people were having widely different experiences. AI isn't magic; how could I have expected all the people I've worked with who struggle to explain stuff to team members, who have near perfect context, to manage to get anything valuable across to an AI?

I was originally pretty optimistic that AI would allow most engineers to operate at a higher level but it really seems like instead it's going to massively exacerbate the difference between an ok engineer and a great engineer. Not really sure how I feel about that yet but at least I understand now why some people think the stuff is useless.

btown · 50m ago
One of my mental models is that the notion of "effective engineer" used to mean "effective software developer" whether or not they were good at system design.

Now, an "effective engineer" can be a less battle-tested software developer, but they must be good at system design.

(And by system design, I don't just mean architecture diagrams: it's a personal culture of constantly questioning and innovating around "let's think critically to see what might go wrong when all these assumptions collide, and if one of them ends up being incorrect." Because AI will only suggest those things for cut-and-dry situations where a bug is apparent from a few files' context, and no ambitious idea is fully that cut-and-dry.)

The set of effective engineers is thus shifting - and it's not at all a valid assumption that every formerly good developer will see their productivity skyrocket.

ilc · 10m ago
I suspect that truly battle tested engineers will go up in value.

I don't think that it lowers the bar there, if anything the bar is far harsher.

If I'm doing normal coding I make X choices per time period, with Y impacts.

With AI X will go up and the Y / X ratio may ALSO go up, so making more decisions of higher leverage!

btucker · 51m ago
I've been starting to think of it like this:

Great Engineer + AI = Great Engineer++ (Where a great engineer isn't just someone who is a great coder, they also are a great communicator & collaborator, and love to learn)

Good Engineer + AI = Good Engineer

OK Engineer + AI = Mediocre Engineer

QuercusMax · 31m ago
I recently watched a mid-level engineer use AI to summarize some of our code, and he had it put together a big document describing all the various methods in a file, what they're used for, and so forth. It looked to me like a huge waste of time, as the code itself was already very readable (I say this as someone who recently joined the project), and the "documentation" the AI spit out wasn't that different than what you'd get just by running pydoc.

He took a couple days doing this, which was shocking to me. Such a waste of time that would have been better spent reading the code and improving any missing documentation - and most importantly asking teammates about necessary context that couldn't just be inferred from the code.

biophysboy · 40m ago
I sort of think of it in terms of self-deskilling.

If an OK engineer is still actively trying to learn, making mistakes, memorizing essentials, etc. then there is no issue.

On the other hand, if they're surrendering 100% of their judgment to AI, then they will be mediocre.

QuercusMax · 30m ago
The same people who just copy-pasted stack overflow answers and didn't understand why or how things work are now using AI to create stuff that they also don't understand.
biophysboy · 25m ago
And for low-stakes one-time hobby projects, they're correct to do so!
aydyn · 45m ago
Is there a difference between "OK" and "Mediocre"?
BolexNOLA · 33m ago
“Ok” I generally associate with being adequate but could obviously be better. “Mediocre” is just inadequate.
geodel · 29m ago
Not Engineer + AI = Now an Engineer

That's the reason for the high valuation of AI companies.

ToucanLoucan · 25m ago
Of all the things I would absolutely not trust the stock market to evaluate, "technical competence" is either near or at the top.

The people deciding how much OpenAI is worth would probably struggle to run first-time setup on an iPad.

jgilias · 39m ago
This fits my observations as well. With the exception that it’s sometimes the really sharp engineers who can do wonders themselves who aren’t really great at communication. AI really needs you to be verbose, and a lot of people just can’t.
katbyte · 57m ago
It’s like the difference between someone who can search the internet or codebase well vs. someone who can’t

Using search engines is a skill

jerf · 55m ago
I've been struggling to apply AI on any large scale at work. I was beginning to wonder if it was me.

But then my wife sort of handed me a project that previously I would have just said no to, a particular Android app for the family. I have instances of all the various Android technologies under my belt, that is, I've used GUI toolkits, I've used general purpose programming languages, I've used databases, etc, but with the possible exception of SQLite (which even that is accessed through an ORM), I don't know any of the specific technologies involved with Android now. I have never used Kotlin; I've got enough experience that I can pretty much piece it together when I'm reading it but I can't write it. Never used the Android UI toolkit, services, permissions, media APIs, ORMs, build system, etc.

I know from many previous experiences that A: I could definitely learn how to do this but B: it would be a many-week project and in the end I wouldn't really be able to leverage any of the Android knowledge I would get for much else.

So I figured this was a good chance to take this stuff for a spin in a really hard way.

I'm about eight hours in and nearly done enough for the family; I need about another 2 hours to hit that mark, maybe 4 to really polish it. Probably another 8-12 hours and I'd have it brushed up to a rough commercial product level for a simple, single-purpose app. It's really impressive.

And I'm now convinced it's not just that I'm too old a fogey to pick it up, which is, you know, a bit of a relief.

It's just that it works really well in some domains, and not so much in others. My current work project is working through decades of organically-grown cruft owned by 5 different teams, most of which don't even have a person on them that understands the cruft in question, and trying to pull it all together into one system where it belongs. I've been able to use AI here and there for some stuff that is still pretty impressive, like translating some stuff into pseudocode for my reference, and AI-powered autocomplete is definitely impressive when it correctly guesses the next 10 lines I was going to type effectively letter-for-letter. But I haven't gotten that large-scale win where I just type a tiny prompt in and see the outsized results from it.

I think that's because I'm working in a domain where the code I'm writing is already roughly the size of the prompt I'd have to give, at least in terms of the "payload" of the work I'm trying to do, because of the level of detail and maturity of the code base. There's no single sentence I can type that an AI can essentially decompress into 250 lines of code, pulling in the correct 4 new libraries, and adding it all to the build system the way that Gemini in Android Studio could decompress "I would like to store user settings with a UI to set the user's name, and then display it on the home page".

I think I recommend this approach to anyone who wants to give this approach a fair shake - try it in a language and environment you know nothing about and so aren't tempted to keep taking the wheel. The AI is almost the only tool I have in that environment, certainly the only one for writing code, so I'm forced to really exercise the AI.

adastra22 · 3m ago
It's a matter of the tools not getting there though. If there was a summarization system that could compress down the structure and history of the system you are working on in a way that could then extract out a half-filled context window of the relevant bits of the code base and architecture for the task (in other words, generate that massive prompt for you), then you might see the same results that you get with Android apps.

The reason being that the boilerplate Android stuff is effectively given for free and not part of the context as it is so heavily represented in the training set, whereas the unique details of your work project are not. But finding a way to provide that context, or better yet fine-tune the model on your codebase, would put you in the same situation and there's no reason for it to not deliver the same results.

That it is not working for you now at your complex work projects is a limitation of tooling, not something fundamental about how AI works.

Aside: Your recommendation is right on. It clicked for me when I took a project that I had spent months of full-time work creating in C++, and rewrote it in idiomatic Go, a language I had never used and knew nothing about. It took only a weekend, and at the end of the project I had reviewed and understood every line of generated code & learned a lot about Go in the process.

thewebguyd · 27m ago
> try it in a language and environment you know nothing about and so aren't tempted to keep taking the wheel.

That's a good insight. It's almost like to use AI tools effectively, one needs to stop caring about the little things you'd get caught up in if you were already familiar and proficient in a stack. Style guidelines, a certain idiomatic way to do things, naming conventions, etc.

A lot like how I've stopped organizing digital files into folders, sub folders etc (along with other content) and now I just rely on search. Everything is a flat structure, I don't care where it's stored or how it's organized as long as I can just search for it, that's what the computer is for, to keep track for me so I don't have to waste time organizing it myself.

Likewise for the code Generative AI produces. I don't need to care about the code itself. As long as it's correct, not insecure, and performant, it's fine.

It's not 100% there yet, I still do have to go in and touch the code, but ideally I shouldn't have to, nor should I have to care what the actual code looks like, just the result of it. Let the computer manage that, not me. My role should be the system design and specification, not writing the code.

smokel · 31m ago
What you are describing also seems to align with the idea that greenfield projects are well-suited for AI, whereas brownfield projects are considerably more challenging.
QuercusMax · 27m ago
Brownfield projects are more challenging because of all the context and decisions that went into building things that are not directly defined in the code.

I suspect that well-engineered projects with plenty of test coverage and high-quality documentation will be easier to use AI on, just like they're easier for humans to comprehend. But you need to have somebody with the big picture still who can make sure that you don't just turn things into a giant mess once less disciplined people start using AI on a project.

WhyNotHugo · 8m ago
> AI is only as smart as the human handling it.

An interesting stance.

Plenty of posts in the style of "I wrote this cool library with AI in a day" were written by really smart devs who are known for shipping good-quality libraries very quickly.

danenania · 24m ago
The way I've been thinking about it is that the human makes the key decisions and then the AI connects the dots.

What's a key decision and what's a dot to connect varies by app and by domain, but the upside is that generally most code by volume is dot connecting (and in some cases it's like 80-90% of the code), so if you draw the lines correctly, huge productivity boosts can be found with little downside. But if you draw the lines wrong, such that AI is making key decisions, you will have a bad time. In that case, you are usually better off deleting everything it produced and starting again rather than spending time to understand and fix its mistakes.

Things that are typically key decisions:

- database table layout and indexes

- core types

- important dependencies (don't let the AI choose dependencies unless it's low consequence)

- system design—caches, queues, etc.

- infrastructure design—VPC layout, networking permissions, secrets management

- what all the UI screens are and what they contain, user flows, etc.

- color scheme, typography, visual hierarchy

- what to test and not to test (AI will overdo it with unnecessary tests and test complexity if you let it)

- code organization: directory layout, component boundaries, when to DRY

Things that are typically dot connecting:

- database access methods for crud

- API handlers

- client-side code to make API requests

- helpers that restructure data, translate between types, etc.

- deploy scripts/CI and CD

- dev environment setup

- test harness

- test implementation (vs. deciding what to test)

- UI component implementation (once client-side types and data model are in place)

- styling code

- one-off scripts for data cleanup, analytics, etc.

That's not exhaustive on either side, but you get the idea.

AI can be helpful for making the key decisions too, in terms of research, ideation, exploring alternatives, poking holes, etc., but imo the human needs to make the final choices and write the code that corresponds to these decisions either manually or with very close supervision.
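To make the split concrete, here is a minimal sketch in Python. The `User` type, its fields, and the `user_from_row` helper are all hypothetical names invented for illustration, not taken from any real project. The type definition stands in for a "key decision" a human would own; the row-to-type translation is the kind of "dot connecting" that could be delegated and then reviewed.

```python
from dataclasses import dataclass
from typing import Any, Dict


# Key decision (human-owned): the core type and which fields exist.
# Hypothetical example type, not from any real codebase.
@dataclass(frozen=True)
class User:
    id: int
    email: str
    display_name: str


# Dot connecting (delegable): translating a raw database row into the core type.
def user_from_row(row: Dict[str, Any]) -> User:
    # Normalize the email once so the fallback below uses the same value.
    email = str(row["email"]).strip().lower()
    # Fall back to the email when no display name is stored.
    display_name = str(row.get("display_name") or email)
    return User(id=int(row["id"]), email=email, display_name=display_name)


user = user_from_row({"id": "7", "email": " A@Example.com "})
print(user)
```

If the shape of `User` is wrong, every helper built on it is wrong too, which is roughly why the type sits on the "key decision" side of the line.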

devmor · 1h ago
I'm right there with you, and having a similar experience at my day job. We are doing a bit of a "hack week" right now where we allow everyone in the org to experiment in groups with AI tools, especially those that don't regularly use them as part of their work - and we've seen mostly great applications of analytical approaches, guardrails and grounded generation.

It might just be my point of view, but I feel like there's been a sudden paradigm shift back to solid ML from the deluge of chatbot hype nonsense.

arjie · 13m ago
One way is to just name the LLM as the author for commits it makes. You can be the committer so long as you just accurately name the author.
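A rough sketch of what that could look like, in Python: git records an author and a committer separately, and `git commit --author=...` overrides only the author, while the committer still comes from your own git identity. The helper name and the author identity below are made up for illustration.

```python
from typing import List


def build_commit_command(message: str, author_name: str, author_email: str) -> List[str]:
    """Build a `git commit` argv that records a separate author.

    `--author` sets only the author field; the committer is still taken
    from the local git identity (user.name / user.email).
    """
    author = f"{author_name} <{author_email}>"
    return ["git", "commit", f"--author={author}", "-m", message]


# Hypothetical author identity, for illustration only.
cmd = build_commit_command(
    "Fix off-by-one in scrollback", "Example LLM", "llm@example.invalid"
)
print(cmd)
```

The list can be handed to `subprocess.run(cmd, check=True)` inside a repository with staged changes; `git log --format=fuller` then shows the author and committer as separate fields.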
andruby · 27m ago
> I try to assist inexperienced contributors and coach them to the finish line, because getting a PR accepted is an achievement to be proud of

I really appreciate this point from mitchellh. Giving thoughtful constructive feedback to help a junior developer improve is a gift. Yet it would be a waste of time if the PR submitter is just going to pass it to an AI without learning from it.

uberduper · 18m ago
> Or a more detailed disclosure:

> I consulted ChatGPT to understand the codebase but the solution was fully authored manually by myself.

What's the reasoning for needing to disclose this?

jedahan · 7m ago
Helps the maintainer focus their efforts during review. If you wrote that exact sentence, a maintainer would keep "understanding of the codebase" as a place for potential sources of incorrect assumptions higher in mind than otherwise.
uberduper · 4m ago
In my experience, if there's one thing the AI assistants do really well, it's understanding existing code and summarizing it or pinpointing specific things I'm looking for.
uberduper · 6m ago
I'm not really a developer but I routinely need to go dig through codebases looking for answers and AI assistants have made this so much faster. This has been a boon for me.
hodgehog11 · 1h ago
How does this not lead to a situation where no honest person can use any AI in their submissions? Surely pull requests that acknowledge AI tooling will be given significantly less attention, on the grounds that no one wants to read work that they know is written by AI.
skogweb · 1h ago
I don't think this is the case. Mitchell writes that he himself uses LLMs, so it's not black and white. A PR author who has a deep understanding of their changes and used an LLM for convenience will be able to convey this without losing credibility imo
MerrimanInd · 1h ago
It just might. But if people generate a bias against AI generated code because AI can generate massive amounts of vaguely correct looking yet ultimately bad code then that seems like an AI problem not a people problem. Get better, AI coding tools.
Workaccount2 · 1h ago
Make a knowledgeable reply and mention you used ChatGPT - comment immediately buried.

Make a knowledgeable reply and give no reference to the AI you used - comment is celebrated.

We are already barreling full speed down the "hide your AI use" path.

showcaseearth · 31m ago
I doubt a PR is going to be buried if it's useful, well designed, good code, etc, just because of this disclosure. Articulate how you used AI and I think you've met the author's intent.

If the PR has issues and requires more than superficial re-work to be acceptable, the authors don't want to spend time debugging code spit out by an AI tool. They're more willing to spend a cycle or two if the benefit is you learning (either generally as a dev or becoming more familiar with the project). If you can make clear that you created or understand the code end to end, then they're more likely to be willing to take these extra steps.

Seems pretty straightforward to me and thoughtful by the maintainers here.

eschaton · 7m ago
You ask this as if it’s a problem.
alfalfasprout · 37m ago
No one is saying to not use AI. The intent here is to be honest about AI usage in your PRs.
andunie · 1h ago
Isn't that a good thing?
jama211 · 1h ago
What, building systems where we’re specifically incentivised not to disclose AI use?
eschaton · 2m ago
Submitting a PR also means you’re not submitting code copied from elsewhere without calling that out and ensuring license compatibility; we don’t refer to that as incentivizing lying about the origin of submitted code.

Fraud and misrepresentation are always options for contributors, at some point one needs to trust that they’re adhering to the rules that they agreed to adhere to.

hodgehog11 · 1h ago
It might encourage people to be dishonest, or to not contribute at all. Maybe that's fine for now, but what if the next generation come to rely on these tools?
KritVutGu · 48m ago
Good point. That's the point exactly. Don't use AI for writing your patch. At all.

Why are you surprised? Do companies want to hire "honest" people whose CVs were written by some LLM?

whimsicalism · 41m ago
i'm happy to read work written by AI and it is often better than a non-assisted PR
Lerc · 1h ago
I think this seems totally reasonable, the additional context provided is, I think, important to the requirement.

Some of the AI policy statements I have seen come across more as ideology statements. This is much better, saying the reasons for the requirement and offering a path forward. I'd like to see more of this and less "No droids allowed"

philjohn · 1h ago
I like the pattern of including each prompt used to make a given PR. Yes, I know that LLMs aren't deterministic, but it also gives context on the steps required to get to the end state.
mock-possum · 1h ago
I’m using SpecStory in VS Code + Cursor for this - it keeps a nice little md doc of all your LLM interactions, and you can check that into source control if you like so it’s included in pull requests, and can be referenced during code review.
rattlesnakedave · 1h ago
In my personal projects I also require all contributors to disclose whether they’ve used an editor with any autocomplete features enabled.
freedomben · 1h ago
Heh, that's a great way to make a point, but right now AI is nowhere near what a traditional editor autocomplete is. Yes you can use it that way, but it's by no means limited to that. If you think of AI as a fancy autocomplete, that's a good personal philosophy, but there are plenty of people that aren't using it that way
monkaiju · 2m ago
Autocomplete is, for the most part, a syntactic tool. AI attempts to guide the semantics of the code generated
miloignis · 1h ago
Notably, tab completion is an explicitly called-out exception to this policy, as detailed in the changed docs.
king_geedorah · 53m ago
Re: "What about my autocomplete?" which has shown up twice in this thread so far.

> As a small exception, trivial tab-completion doesn't need to be disclosed, so long as it is limited to single keywords or short phrases.

RTFA (RTFPR in this case)

kazinator · 23m ago
I think that in the FOSS environment, it is assumed that when you submit something upstream, you are the copyright holder. Some projects like GNU require you to sign papers legally attesting this.

It would be a lie to sign those papers for something you vibe coded.

It's not just courtesy; you are committing fraud if you put your copyright notice on something you didn't create and publish that to the world.

I don't just want that disclosed; I cannot merge it if it is disclosed, period.

eschaton · 36s ago
Exactly. Yet some here are saying that this just serves as an incentive to hide AI use.
WhyNotHugo · 6m ago
I know that purely AI-generated content is copyright-free, but I don't think that AI-assisted is also copyright free.

If I use iOS's spellchecker which "learns" from one's habit (i.e.: AI, the really polished kind), I don't lose copyright over the text which I've written.

ovaistariq · 1h ago
I don’t see much benefit from the disclosure alone. Ultimately, this is code that needs to be reviewed. There is going to continue to be more and more AI assisted code generation, to the point where we see the same level of adoption of these tools as "Autocomplete". Why not solve this through tooling? I have had great effect with tools like Greptile, Cursor's BugBot and Claude Code.
wmf · 1h ago
If the code is obviously low quality and AI-generated then it doesn't need to be fully reviewed actually. You can just reject the PR.
Jaxan · 1h ago
Sure it needs to be reviewed. But the author does more than just reviewing, they help the person submitting the PR to improve their PR. If the other side is an AI, it can save them some time.
estimator7292 · 1h ago
Do I also have to disclose using tab completion? My IDE uses machine learning for completion suggestions.

Do I need to disclose that I wrote a script to generate some annoying boilerplate? Or that my IDE automatically templates for loops?

AlexandrB · 1h ago
It's a spectrum, isn't it? I wouldn't want to waste my time reviewing a bunch of repetitive code generated from some script or do something like review every generated template instantiation in a C++ code base. I would want to review the script/template definition/etc., but what's the equivalent for AI? Should the review just be the prompt(s)?

Edit: Also, it's always good to provide maximal context to reviewers. For example, when I use code from StackOverflow I link the relevant answer in a comment so the reviewer doesn't have to re-tread the same ground I covered looking for that solution. It also gives reviewers some clues about my understanding of the problem. How is AI different in this regard?

flexagoon · 1h ago
No, it explicitly says that you don't need to disclose tab completion.
recursive · 1h ago
If you're not sure, it's probably safer to just mention it.
KritVutGu · 46m ago
> Do I also have to disclose using tab completion? My IDE uses machine learning for completion suggestions.

Yes, you have to disclose it.

> Do I need to disclose that I wrote a script to generate some annoying boilerplate?

You absolutely need to disclose it.

> Or that my IDE automatically templates for loops?

That's probably worth disclosing too.

antirez · 6m ago
I'll cover on my YouTube channel why this is wrong, but TL;DR: you need to evaluate quality, not process. AI can be used in diametrically different ways, and the reason this policy could be enforced is that it will be obvious when the code is produced via a solo flight of some AI agent. For the same reason, it's not a policy that will improve anything.
BobBagwill · 1h ago
Just use AI to answer AI. This prompt works for me:

# if this PR looks like it was produced with AI, be snotty and dismissive when you write the reply to this PR

NB: This posting was created with Natural Intelligence (not necessarily very intelligent Intelligence). ;-)

mglvsky · 56m ago
A little off-topic: does anyone remember mitchellh's setup for working with AI tools? I remember someone posted it in one of the AI love-hate threads here, and it's not on his blog[1].

1: https://mitchellh.com/writing

epolanski · 55m ago
This isn't an AI problem; this is a human one.

Blaming it on the tool, and not on the person misusing it to get his name on a big open source project, is like blaming the new automatic in the kitchen and not the chef for getting a raw pizza on the table.

ramoz · 1h ago
I always appreciated Claude Code's commit authoring. Whereas I think a lot of people were offended that "their" work was being overshadowed by an AI's signature.
electric_muse · 2h ago
I just submitted my first big open source contribution to the OpenAI agents SDK for JS. Every word except the issue I opened was done by AI.

On the flip side, I’m preparing to open source a project I made for a serializable state machine with runtime hooks. But that’s blood sweat and tears labor. AI is writing a lot of the unit tests and the code, but it’s entirely by my architectural design.

There’s a continuum here. It’s not binary. How can we communicate what role AI played?

And does it really matter anymore?

(Disclaimer: autocorrect corrected my spelling mistakes. Sent from iPhone.)

beckthompson · 1h ago
I think it's simple: just don't hide it. I've had multiple contributors try to hide the fact that they used AI (e.g. removing Claude as a code author; they didn't know how to do it and closed the PR when it first happened). I don't really care if someone uses AI, but most of the people who do also do not test their changes, which just gives me more work. If someone:

1.) Didn't try to hide the fact that they used AI

2.) Tested their changes

I would not care at all. The main issue is that this is usually not the case: most people submitting PRs that are 90% AI do not bother testing (usually they don't even bother running the automated tests).

Jaxan · 59m ago
> How can we communicate what role AI played?

What about just telling exactly what role AI played? You can say it generated the tests for you for instance.

KritVutGu · 40m ago
> AI is writing a lot of the unit tests

Are you kidding?

- For ages now, people have used "broad test coverage" and "CI" as excuses for superficial reviews, as excuses for negligent coding and verification.

- And now people foist even writing the test suite off on AI.

Don't you see that this way you have no reasoned examination of the code?

> ... and the code, but it’s entirely by my architectural design.

This is fucking bullshit. The devil is in the details, always. The most care and the closest supervision must be precisely where the rubber meets the road. I wouldn't want to drive a car that you "architecturally designed", and a statistical language model manufactured.

ToucanLoucan · 1h ago
> And does it really matter anymore?

Well, if you had read what was linked, you would find these...

> I think the major issue is inexperienced human drivers of AI that aren't able to adequately review their generated code. As a result, they're pull requesting code that I'm sure they would be ashamed of if they knew how bad it was.

> The disclosure is to help maintainers assess how much attention to give a PR. While we aren't obligated to in any way, I try to assist inexperienced contributors and coach them to the finish line, because getting a PR accepted is an achievement to be proud of. But if it's just an AI on the other side, I don't need to put in this effort, and it's rude to trick me into doing so.

> I'm a fan of AI assistance and use AI tooling myself. But, we need to be responsible about what we're using it for and respectful to the humans on the other side that may have to review or maintain this code.

I don't know specifically what PRs this person is seeing. I do know it's been a rumble around the open source community that inexperienced devs are trying to get accepted PRs for open source projects because they look good on a resume. This predated AI in fact, with it being a commonly cited method to get attention in a competitive recruiting market.

As always, folks trying to get work have my sympathies. However ultimately these folks are demanding time and work from others, for free, to improve their career prospects while putting in the absolute bare minimum of effort one could conceivably put in (having Copilot rewrite whatever part of an open source project and shove it into a PR with an explanation of what it did) and I don't blame them for being annoyed at the number of low-quality submissions.

I have never once criticized a developer for being inexperienced. It is what it is, we all started somewhere. However if a dev generated shit code and shoved it into my project and demanded a headpat for it so he could get work elsewhere, I'd tell him to get bent too.

kbar13 · 1h ago
if you read his note i think he gives good insight as to why he wants PRs to signal AI involvement.

that being said i feel like this is an intermediate step - it's really hard to review PRs that are AI slop because it's so easy for those who don't know how to use AI to create a multi-hundred/thousand line diff. but when AI is used well, it really saves time and often creates high quality work

spaceywilly · 1h ago
As long as they make it easy to add a “made with AI” tag to the PR, it seems like there’s really no downside. I personally can’t imagine why someone would want to hide the fact they used AI. A contractor would not try to hide that they used an excavator to dig a hole instead of a shovel.
victorbjorklund · 1h ago
I guess if you write 1000 lines and you just auto tabbed an auto-complete of a variable name done by AI you might not wanna say the code is written by AI.
ineedasername · 1h ago
>I personally can’t imagine why someone would want to hide the fact they used AI.

Because of the perception that anything touched by AI must be uncreative slop made without effort. In the case of this article, why else are they asking for disclosure if not to filter and dismiss such contributions?

kg · 1h ago
The OP seems to be coming from the perspective of "my time as a PR reviewer is limited and valuable, so I don't want to spend it coaching an AI agent or a thin human interface to an AI agent". From that perspective, it makes perfect sense to want to know how much a human is actually in the loop for a given PR. If the PR is good enough to not need much review then whether AI wrote it is less important.

An angle not mentioned in the OP is copyright - depending on your jurisdiction, AI-generated text can't be copyrighted, which could call into question whether you can enforce your open source license anymore if the majority of the codebase was AI-generated with little human intervention.

victorbjorklund · 1h ago
As long as some of the code is written by humans it should be enforceable. If we assume AI code has no copyright (not sure it has been tested in courts yet) then it would only be the parts written by the AI. So if AI writes 100 lines of code in Ghostty then I guess yes someone can "steal" that code (but no other code in Ghostty). Why would anyone do that? 100 random lines of AI code in isolation isn't really worth anything...
bgwalter · 1h ago
I still do not understand how one can integrate "AI" code into a project with a license at all. "AI" code is not copyrightable, "AI" cannot sign a contributor agreement.

So if the code is integrated, the license of the project lies about parts of the code.

paulddraper · 1h ago
> I still do not understand

Your question makes sense. See U.S. Copyright Office publication:

> If a work's traditional elements of authorship were produced by a machine, the work lacks human authorship and the Office will not register it.

> For example, when an AI technology receives solely a prompt from a human and produces complex written, visual, or musical works in response, the “traditional elements of authorship” are determined and executed by the technology—not the human user...

> For example, if a user instructs a text-generating technology to “write a poem about copyright law in the style of William Shakespeare,” she can expect the system to generate text that is recognizable as a poem, mentions copyright, and resembles Shakespeare's style. But the technology will decide the rhyming pattern, the words in each line, and the structure of the text.

> When an AI technology determines the expressive elements of its output, the generated material is not the product of human authorship. As a result, that material is not protected by copyright and must be disclaimed in a registration application.

> In other cases, however, a work containing AI-generated material will also contain sufficient human authorship to support a copyright claim. For example, a human may select or arrange AI-generated material in a sufficiently creative way that “the resulting work as a whole constitutes an original work of authorship.”

> Or an artist may modify material originally generated by AI technology to such a degree that the modifications meet the standard for copyright protection. In these cases, copyright will only protect the human-authored aspects of the work, which are “independent of” and do “not affect” the copyright status of the AI-generated material itself.

> This policy does not mean that technological tools cannot be part of the creative process. Authors have long used such tools to create their works or to recast, transform, or adapt their expressive authorship. For example, a visual artist who uses Adobe Photoshop to edit an image remains the author of the modified image, and a musical artist may use effects such as guitar pedals when creating a sound recording. In each case, what matters is the extent to which the human had creative control over the work's expression and “actually formed” the traditional elements of authorship.

> https://www.federalregister.gov/documents/2023/03/16/2023-05...

In any but a pathological case, a real code contribution to a real project has sufficient human authorship to be copyrightable.

> the license of the project lies about parts of the code

That was a concern pre-AI too! E.g. copy-paste from StackOverflow. Projects require contributors to sign CLAs, which doesn't guarantee compliance, but strengthens the legal position. Usually something like:

"You represent that your contribution is either your original creation or you have sufficient rights to submit it."

jiscariot · 15m ago
The emergent trait of AI lacing every other sentence/comment with emojis is a pretty good signal for when you want to ignore AI slop. I almost wish it was codified into the models.
stillpointlab · 54m ago
I think ghostty is a popular enough project that it attracts a lot of attention, and that means it certainly attracts a larger than normal amount of interlopers. There are all kinds of bothersome people in this world, but some of the most bothersome you will find are well meaning people who are trying to be helpful.

I would guess that many (if not most) of the people attempting to contribute AI generated code are legitimately trying to help.

People who are genuinely trying to be helpful can often become deeply offended if you reject their help, especially if you admonish them. They will feel like the reprimand is unwarranted, considering the public shaming to be an injury to their reputation and pride. This is most especially the case when they feel they have followed the rules.

For this reason, if one is to accept help, the rules must be clearly laid out from the beginning. If the ghostty team wants to call out "slop", then it must make it clear that contributing "slop" may result in a reprimand. Then the bothersome want-to-be helpful contributors cannot claim injury.

This appears to me to be good governance.

paulddraper · 1h ago
Aren't a large majority of programmers using Copilot/Cursor/AI autocompletion?

This seems very noisy/unhelpful.

alfalfasprout · 35m ago
> If you are using any kind of AI assistance while contributing to Ghostty, *this must be disclosed in the pull request*, along with the extent to which AI assistance was used (e.g. docs only vs. code generation). If PR responses are being generated by an AI, disclose that as well.

The extent here is very important. There's a massive difference between vibe-coding a PR, using LLMs selectively to generate code in files in-editor, and edit prediction like Copilot.

It actually says later that tab-completion needn't be disclosed.
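For what it's worth, a maintainer who wanted to check for this mechanically could do something like the sketch below. The `AI-Disclosure:` line is a made-up convention for illustration; the quoted policy only asks that use and extent be disclosed in the pull request, not that any particular format be used.

```python
import re
from typing import Optional

# Hypothetical convention (not part of the quoted policy): a PR description
# line such as "AI-Disclosure: code generation (reviewed and tested by me)".
_DISCLOSURE = re.compile(
    r"^[ \t]*AI-Disclosure:[ \t]*(\S.*)$", re.IGNORECASE | re.MULTILINE
)


def find_ai_disclosure(pr_body: str) -> Optional[str]:
    """Return the disclosed extent of AI use, or None if no line is present."""
    match = _DISCLOSURE.search(pr_body)
    return match.group(1).strip() if match else None


body = "Fixes crash on resize.\n\nAI-Disclosure: docs only\n"
print(find_ai_disclosure(body))
```

A check like this only tells you whether a disclosure line exists; whether it is honest is still a matter of trust.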