The OP is conflicted: they want to share code under a permissive OSS license but regret how this benefits certain users they dislike. I understand this, but at least it cuts both ways (look at Microsoft's VS Code vs. Cursor and the other forks).
The way "real" OSS licenses (by OSI) allow use for any purpose has been a major reason the movement has succeeded and for that it's worth putting up with some users doing stuff the authors might not like.
jasonthorsness · 9h ago
And to the assertion that Copilot emits copyrighted code: this is in nobody's best interest, not the authors', Microsoft's, or Microsoft's customers'. Microsoft had to promise to legally defend any of its customers that are sued for using code that came from Copilot (https://www.legal.io/articles/5443653/Microsoft-Will-Pay-for...). I've used Copilot in a business setting and sometimes it will cancel a response because it detects that it contained copyrighted material. So Microsoft appears to be trying its best to avoid the main problem the article is concerned about, and they are confident enough that they are succeeding to make legal guarantees.
johnea · 6h ago
I disagree with that summary of the motive.
It seems to me the author is concerned about terms of a free software license not being maintained in versions of the code redistributed, especially by LLM agents.
The GPL requires derivative works to also be GPL licensed.
Will Copilot or other tools abide by this requirement?
jasonthorsness · 6h ago
From what I read, Microsoft claims to be able to ignore the licenses completely, and apparently they don’t consider model training a derivative work. Bold position; maybe we’ll see how that turns out.
y-curious · 10h ago
Is the implication that GitHub can't just go to your website and train on your code? It's open source code, I'm sure they're casting a wider net.
jdiff · 10h ago
Sure, they could. But why would they? It's the wild west out there and there's barely anything you don't already have. Maybe add support for some of the larger code forges, but anything smaller than that is too small to bother with. It's not going to make the difference.
dgrin91 · 9h ago
Because why wouldn't they? Web crawlers are cheap (relatively). More data is good. No one is manually adding or reviewing sites, they just hoover up the whole Internet.
jdiff · 7h ago
Git repos on software forges are structured, polite, easy to process, and are ubiquitous. You can provide additional information very easily with simple existing tools to enhance learning.
Like I said, the web's the wild west. It'd require sifting through a lot of noise and there's no guarantee you can reconstitute a sane codebase that you can learn something practical from. So yeah, they could do it, and if the net is indiscriminate it will probably be in there, but trying to target it is pretty pointless given the incredibly tiny returns.
bgwalter · 7h ago
"If the project is under an open source license, it means that everyone can share a copy – even on GitHub – of the licensed material under certain conditions."
This is disputable. GitHub demands so many additional rights from the uploader, in particular, using the code for AI training. This violates the attribution part of all OSS licenses, so only the original copyright holder can upload to GitHub and give GitHub these rights.
nialse · 10h ago
The underlying reasoning seems to be that agreeing to GitHub TOS may put an uploader of open source code in breach of the license. But, uploading code to GitHub being convenient, this has been ignored. Is this so?
jmclnx · 10h ago
With me you are preaching to the choir. I moved to gitlab + anon ftp.
If gitlab starts doing the same as github, I will delete all my items from gitlab and use only anon ftp.
jsheard · 10h ago
GitLab is already describing itself as an "AI-powered DevSecOps platform", if you want nothing to do with AI then that ship has sailed.
jmclnx · 10h ago
Thanks, I guess it is now only anon-ftp. Luckily I have been expecting this so I still stayed with anon ftp once I moved away from github.
Next week looks a bit busy now.
jsheard · 10h ago
There's still Codeberg, SourceHut, etc which are sticking to the fundamentals. The former is even a non-profit so they're unlikely to get dragged into the VC FOTM.
bigyabai · 10h ago
> since GitHub may not respect the terms of licensed code that is hosted on their servers, not uploading the code of others there is, in fact, a deeply ethical choice.
Feels like a moot point, this goes for anything you upload anywhere (as reprehensible as it is). Perhaps all music and movie clips should be taken off YouTube lest someone train an AI on them? Analog video makes a return to stop the evil consequences of digital training?
guube · 10h ago
I think the main point of the argument is that you should not upload other people’s code to GitHub. Whether you upload your own code is your choice to make. The same goes for uploading to YouTube: you should not upload an entire film to YouTube if you do not have the right to distribute the film.
bonki · 8h ago
The huge difference being that you do have the right to upload to GitHub if the code is released under an OSS license. If a project asks you nicely not to, it would be equally nice to honor that, but there is no legal obligation to do so. Putting a copyrighted movie to which you have no distribution rights on YouTube is not the same thing; it's comparing apples to oranges.
johnea · 6h ago
> you do have the right to upload to GitHub if the code is released under an OSS license
This assumes that uploading to github does not involve granting other rights to M$ that would violate the original license, such as attribution or copyright.
bonki · 5h ago
IANAL but I'm pretty sure that's not how it works. If MS chooses to violate the original license that is between the original author and MS. You as the uploader do not and cannot grant rights to MS which you did not receive in the first place. You upload the code under its original license and if MS does not honor the license that is not on you. Not defending MS here in any way, btw, I do firmly believe in opt-out.
andrepd · 10h ago
The right answer would be regulation and enforcement with teeth, but wishing for that seems as far fetched as wishing for a unicorn.
bediger4000 · 10h ago
I feel extremely cynical about "intellectual property" enforcement. In some circumstances (recorded music ca 1995-2015), IP laws are extremely punitive, and enthusiastically enforced. There's even some folklore around copyright, too. It's easy to get dogpiled on social media if you say copyright on books is too long, or too strictly enforced, or that fair use as enshrined in law and not enforced isn't really fit for purpose.
But as far as The Web goes, apparently all content is very loosely owned, copyright is gauzy and difficult to discern, and oligarchs want to do away with it altogether.
It's not hard to figure out that the "hidden variable" here is that already wealthy people typically held music, movie and book content, while plebes without the money to sue hold web content copyright.