The fact that they captioned this as "The bottom photo is an AI-generated image created in six prompts using OpenAI’s ChatGPT" without actually releasing the six prompts is quite telling, because those prompts would show they were prompting the model to match the original image.
Iconic images (the Mona Lisa, Tank Man, widely reported news stories, styles like Ghibli) would of course be incorporated as styles. It doesn't refute fair use.
So, you can't say "draw this person in the mona lisa pose, in simpsons style" and then act surprised and shocked when the model does exactly that. That's not theft.
skerit · 5h ago
I don't understand the example image. The description says:
> The yellow circles highlight areas of similarity between the original photo and the AI-generated photo
What about the person in the middle, doing the throwing? It's exactly the same. Why don't they have a circle around them? Why highlight some lens flares & distorted faces when the actual subject is the same? Or was this some kind of image-to-image generation?
theknarf · 5h ago
Presumably the guy doing the throwing was part of the actual prompt, while the things circled were not.
andersa · 6h ago
The entire premise is stupid. No theft occurred at any point, the original authors still have their creations.
TrackerFF · 5h ago
Following this logic, anything that is digital can not be stolen? After all, you only steal a digital copy - not the original, and not something that is physical.
In which case, the following is not theft:
- Pirating anything that is digital video
- Pirating anything that is digital audio
- Pirating anything that is digital text
- Pirating any software
etc.
EDIT: For the proponents of "it is not theft", what's your stance on personal data being stolen ("accessed")? If someone can access your medical records, are you fine with that? They can do whatever they want with that?
tialaramex · 5h ago
Yes. Theft explicitly requires "Intent permanently to deprive". Copying things is not theft because the supposed "victim" was never in fact deprived of their thing, not even temporarily.
England has a whole bunch of legislation to prohibit activities that are not theft, because it turns out that sometimes we care about other things. TWOC, "Taking Without Owner's Consent", is an example: it turns out it's also very annoying to have people take your car, drive it somewhere, and abandon it for a laugh ("joyriding"), compared to them, say, stealing it to ship abroad, break into parts, or just set on fire.
Insisting that it's all theft gets us in the same muddle as when we decide that holding up a protest sign is terrorism, or that a billion dollar bribe is speech.
concinds · 5h ago
Correct. Piracy is potentially copyright infringement, not "theft". People insist on using that word, despite it being factually wrong, due to its emotional valence and persuasive value.
The author is a photojournalist, not a lawyer, and not qualified to comment on copyright law beyond simply giving his personal opinion.
amelius · 5h ago
They never said "theft in the legal sense". Also they mean "theft of income" not "theft of data". This is not a legal text obviously.
concinds · 4h ago
> Also they mean "theft of income"
Right. It's the debunked, bad-faith argument that "piracy is theft" because the pirates would have bought a movie were it not for The Pirate Bay, to justify calculating monetary damages that are completely unmoored from reality. We shouldn’t let people revive this nonsense, even in the context of AI.
amelius · 4h ago
In the case of AI it is closer to actual theft because income (of an artist) is taken away, and someone else (AI capitalists) makes money from it. I don't think the comparison is unreasonable here.
xscott · 1h ago
How come it's just an "artist" and not an "artist capitalist"? And isn't this "rent seeking" behavior on the part of the artist?
It would be nice if the world used clinical terms to describe what's actually going on instead of using emotional terms to equate "copying" to "stealing" or "pirating", but people would rather confuse the issue than argue it logically in good faith.
9dev · 5h ago
You can hold that stance, but it completely undermines the source of income for artists or inventors. If you want a world where only rich and privileged people can afford the time to create art, this is the way to go.
Otherwise, we maybe can agree that people who make something should be eligible for some kind of compensation to encourage them to continue making, for our shared benefit.
You can argue that Beyoncé and George Clooney and Stephen King are so rich already they don’t need the money anyway, but that omits how even these people had to make a career from the bottom up on the sole premise that their focus on their art (regardless of your opinion on it) will pay the bills.
So just saying piracy isn’t theft and thus isn’t a problem is a wholly undercooked answer to a difficult problem.
rcxdude · 5h ago
The "therefore isn't a problem" was not, I believe, present in the original post. It's entirely possible to believe that copyright infringement is not theft and yet also still a problem. It's mainly an appeal to not misrepresent the situation, because theft in a literal sense refers to a situation which is generally worse than copyright infringement.
FeepingCreature · 5h ago
If you follow this logic, it gets really close to arguing that open source is price dumping. Nobody has an obligation to an income in a particular field of work.
kgwgk · 5h ago
People have been using the word for thousands of years so they may not stop now because you say so. (Romans were already using “furtum” - theft - and “plagium” - kidnapping - for ideas and not just for things and people.)
sam_lowry_ · 4h ago
Why don't we make a conscious effort and call it "sharing" and not "piracy"?
dandellion · 5h ago
Absolutely, yes. Even as a professional software engineer I think software should be free, and tell everybody I know to pirate software, if they can.
Imagine we discovered a way to generate almost unlimited energy for very cheap, then we told poor people if they want any they have to pay it at the same price per kilowatt hour as current energy or it's stealing. It would be morally wrong. Digital content is the same, current copyright laws are unethical.
hn_throw2025 · 5h ago
> tell everybody I know to pirate software, if they can.
I can see where you’re coming from on a philosophical level.
On a practical level that’s just asking for malware, especially for software that’s been cracked.
I would tell people the opposite.
wiseowise · 5h ago
Better ask them to support OSS when they can.
ghosty141 · 5h ago
I would say yes this is not theft.
What happens is you just copy and/or modify something. Kinda like duplicating money: it's not theft since you don't steal from anybody, and no ownership or anything similar changes hands.
hau · 5h ago
Makes sense. There is very little in common between physical theft and unauthorised information copying. It may be an act worse than theft in ethical or other way, but it's a different thing. Crude theft analogy is good to colour it a particular way and evoke emotions, but harmful for a reasonable discussion.
looofooo0 · 5h ago
No, you copy it; the file remains on the server. The problem is the semantics and all that comes with it:
https://www.gnu.org/philosophy/not-ipr.html.en
Now you’re getting it. Keep going, you’re almost there.
bestouff · 5h ago
Exactly. Copyright infringement (a.k.a. pirating) is not theft. The owner is only deprived of theoretical revenue/leverage/whatever (s)he could have exercised, but (s)he can still enjoy the original work.
gr4vityWall · 5h ago
> Following this logic, anything that is digital can not be stolen?
Yes. Some of us might even prefer to use a positive term such as “sharing information with your neighbor.”
DrScientist · 5h ago
I think the issue here is the theft of livelihoods.
If you spend months crafting something unique, with your livelihood based on then selling access to that creation (whether it be music, software or prose), and somebody copies it in a way that deprives you of that livelihood (and redirects the revenue to the entity that copied it) - is that theft of your income stream?
xscott · 48m ago
Do you feel the same about spreadsheet software reducing the need for accountants? What about textile workers creating fabric for clothing by hand? Or do we only romanticize artists such that they're entitled to an income?
42lux · 6h ago
Can you send me a copy of the front and back of your social security card? You still have your card so no theft happened.
close04 · 6h ago
You wouldn't "steal" that identity, you'd just prove that e.g. banks are bad at validating identity.
I don't feel this is quite the gotcha argument you believe it is. Your last line is indeed correct, he would still have his card, so no theft would have occurred. Just having a copy of said card is not theft. Likewise should a person use said card for a nefarious purpose, that is still not theft, that is fraud.
gwd · 5h ago
There was a good essay I saw a bit ago, talking about how this shift from "fraud" to "identity theft" neatly started shifting the victim from the bank to the individual. E.g., 30 years ago, if someone claiming to be me went to my bank, asked for my money, and the bank gave it to them, then the bank was the victim of fraud. But now, if someone goes to my bank, asks for my money, and the bank gives it to them, then I am the victim of identity theft.
(Mitchell & Webb Sound - Identity Theft)
The difference is subtle, but potentially important. If the bank unfortunately gives money to someone else, that's their problem: I can say to the bank, I'm sorry you were the victim of fraud, but you still owe me my money. If I unfortunately "have my identity stolen", then that makes it seem like it's my problem -- the bank may say, we're sorry you "had your identity stolen" and thus lost your money, but that's not really our problem.
XCabbage · 5h ago
Well, not necessarily, on your last sentence. It might also be theft, depending on the precise nefarious purpose and on the jurisdiction. If you take somebody else's property without their consent, that's typically theft, even if the "property" is money in a bank account and no tangible physical object changed hands, and even if the method of taking involved deception. Fraud and theft overlap.
42lux · 5h ago
You are really close...
wiseowise · 5h ago
No, your argument is completely bogus.
If I steal your ID card, then nothing really happens at all. I can spend years looking at it, draw something on it, cut it into pieces, whatever.
The moment I start trying to impersonate you, it becomes a problem for you.
If Bobby the metalhead downloads a copy of Enter Sandman, then Metallica and their brand don't lose anything at all. If you were to make it industrial, then maybe we can talk about it.
42lux · 5h ago
You are also really close...
wiseowise · 5h ago
Can you do the same trick with my copy of Metallica, please?
rcxdude · 6h ago
"Identity theft" is another misuse of the word to deflect from what is actually an authentication failure.
chii · 6h ago
It's identity fraud, but people use theft to mean fraud here.
close04 · 5h ago
People use "theft" because it's the terminology the companies responsible for the failure insist on using. The word makes it sound like the person failed to protect their identity, and hides that it was the company which failed to validate the identity.
This should be called "identity validation failure".
When scammers impersonate a company to steal your money it's no longer called identity theft.
chii · 4h ago
You are absolutely correct. The idea that it is theft is tinged with victim-blaming connotations.
jillesvangurp · 5h ago
I wouldn't call it stupid but I do expect that the legal cases will die somewhere along their way to the supreme courts.
Several reasons:
1) The cat is out of the bag, AI is a thing now. It's not going back in the bag. So, artists are going to have to adapt to that. And are already adapting.
2) AI affecting artists is no different than photography wiping out the market for portrait painters, or records wiping out the market for selling music on paper (sheet music) that professional musicians would reproduce for an audience. Those things happened a long time ago, of course, and any copyright issues around them were resolved over time. Copyright law hasn't really changed much since before that happened. These are the same kind of questions that exist for AIs being obviously inspired by, but not perfectly copying, songs, images, text, etc. And they have answers in many decades of case law. Judges are going to take all of that into account, and they are historically reluctant to introduce new interpretations.
3) AI companies are big enough to outright buy the larger copyright holders. That doesn't mean they will; but it suggests they might reach some settlement that doesn't involve pleading guilty in a court. In the end this is about money, not about principles. At least not for the parties paying the lawyers.
4) If a settlement doesn't happen, judges will be forced to look at existing cases to assess what is and isn't an infringement. And if you remove all the outrage and moralism from the equation (which judges tend to do), AI companies are simply not distributing copies of original works to anyone. They are using them, for sure. But it's distributing copies that gets you in trouble. Not using copies. That narrows it down to whether those copies, which are freely distributed on the internet, were obtained legally.
BenGosub · 6h ago
Digital music is infinitely reproducible, but that doesn't mean that I am allowed to illegally download it. It is similar with articles. The author decides who can consume their art.
xscott · 40m ago
> doesn't mean that I am allowed to illegally download it
That word "illegally" is carrying some weight there. You only get to have those rules in law because the rest of us agreed to them in some sense. And in this particular case, the laws were created by powerful and wealthy media industries bribing politicians.
There's not some universal truth about fairness here. It's just a set of conventions where people with guns show up and lock you in prison if you bypass someone else seeking rent.
falcor84 · 5h ago
> The author decides who can consume their art,
What?! How would that even happen? Unless you limit your definition of art to performing at salon events, that doesn't make much sense. Typically art is released into the world and at best the authors can get a bit of rent from the people who consume it (typically via a publisher), but they don't have any control over who the consumers would be.
BenGosub · 5h ago
For example, I decide if my music is going to stream on YouTube or Spotify, or it will only sell on Bandcamp.
I decide if I am going to go with a publisher, which publisher and what platforms.
What you label as "typically" has only been the case for the past ≈15 years.
wiseowise · 5h ago
You don’t decide anything on the platforms. The moment you upload your music there, they decide what they do with it.
falcor84 · 5h ago
YouTube and Spotify and Bandcamp are distribution platforms, not your consumers
BenGosub · 5h ago
My point is that just like I decide if I am going to distribute to streaming services, I also should be able to decide if I am going to allow models to train on my work. Isn't this something obvious?
falcor84 · 2h ago
Not only is it not obvious to me, it goes entirely against my mental model.
Staying with your example of music or book authors - a musician or author might choose which distribution platforms they work with, but they don't have the ability to tell a particular shop to not resell their record/book, let alone to tell a particular "consumer" not to listen to it / read it.
BenGosub · 1h ago
Yes, I agree with you: when you give distribution rights to a third party, they have the distribution control. How I imagine this mess being resolved is through a middleman, like the distribution services in music. There need to be companies licensing training data, from which artists could take a small cut.
stereolambda · 5h ago
I would just establish that all references to "theft" and "stealing" in the realm of copyright (with the notable exception of plagiarism) are metaphor and emotional rhetoric. Historically this comes from copyright interest groups who want(ed) to use criminal police to enforce their state-granted copyright privileges[1] against regular people.
Sadly these things are often decided by rhetoric in society, but then again, there's no actual debate if it's just throwing slogans.
Now some of the same rhetoric is used in the AI battle. The only question worth asking here is what's the social benefit, as human culture is by nature all commons and derivation. But in this case, the AI companies are also accumulating power, and LLMs are removing attribution, which could be argued to discourage publishing new works more than piracy does. A "pirate" may learn about you and later buy from you in different ways; an LLM user won't even know that you exist.
[1] Not even discussing how exaggerated these privileges are from what would be reasonable.
jachee · 5h ago
Why the carve-out for plagiarism? And how does an LLM reciting all or part of a work verbatim not qualify as plagiarism?
stereolambda · 5h ago
Because if you present yourself as the author, it follows that the actual author is deprived of attribution. So you are actually taking something from that person.
An LLM could commit plagiarism if authorship of the generated media were claimed for either the LLM or its creators.
socalgal2 · 5h ago
Hopefully I can get you to do my taxes. When you're done I'll just make a copy and then not pay you, because I won't have stolen anything and so I don't owe you any money.
d_k_f · 5h ago
In terms of property theft: exactly. Which is why your tax advisor would insist on a contract with you that outlines compensation for services rendered, time and material, etc.
dartharva · 5h ago
I think they are more scared that the AI models will reproduce their original content in odd contexts that may bring liability to them and cause overall pandemonium.
exe34 · 5h ago
Exactly! That's why digital content piracy is legal and copyright doesn't exist.
eastbound · 5h ago
I find the argument "nothing was stolen" very weak compared to "the infinite endeavour of preventing private people from copying bytes shouldn't be a prerogative of the state": ideas are copiable and can't decently be guaranteed against copying by the public.
That is, public funds, tribunals and lawmaking power shouldn't be used to protect private interests.
Corollary: 1. DRM is ok. 2. If one cracks it, that's ok too. 3. You have to find other ways to conduct business than retaining information (licenses, movies, etc.). 4. For software companies, the cloud is one of those ways, compared to licensing downloads. 5. Netflix is the cloud of movies: you pay for earlier access to a shared experience synchronized with other friends who will watch the same thing. 6. Patents are another stupid attempt by the state to protect corporations against citizens.
gwd · 5h ago
He gives two examples that are meant to prove his point, but they don't really convince me.
The first example is the image: the AI has seen the famous photograph of the Ferguson riots, and (with 6 prompts?!) manages to get something fairly similar. But suppose a human had seen that photo, and then you asked that human to draw you a picture of the riots, and then kept prompting them to make it look more similar. Is it really unrealistic that the human could generate something that looks as similar? Is the human themselves therefore inherently a violation of copyright?
The NYT example looks a bit more damning at first -- except that it appears they prompted the AI directly with the beginning of the article. My son, as a toddler, could for a long time recite nearly the full text of his favorite story books with minimal prompting -- does that mean he's inherently a violation of copyright? Because he can recite The Gruffalo almost verbatim when prompted, is he a walking violation of Julia Donaldson's copyright? What about people with photographic memory, who can recite long sections of books verbatim -- are they inherently violating copyright?
Now sure, in both cases, the output might be a violation of copyright, if it's clearly derived from it -- both for humans and for AI. But I don't think the fact that AI can be prompted to generate copyright-violating material is proof that the AI training itself has violated copyright, any more than the fact that a human can be prompted to generate copyright-violating material is proof that human training has violated copyright.
makeitdouble · 5h ago
Scale matters.
> But suppose a human [...]
A human doesn't ingest half of the web and simultaneously deal with millions of people.
We've been through this time and time again. Justice didn't go after humans copying books by hand, it went after reprints of existing copyrighted material.
Music industry didn't go after people singing tunes in their kitchen, but after wide distribution networks.
Removing scale from the discussion leads to absurd conclusions.
gwd · 4h ago
I agree that we have a new situation developing; but we're not going to get any clarity unless we see clearly what the new situation is. There are several things you're still conflating:
1. An entity (human or AI) that can be prompted to produce copyright-infringing material.
2. Actually producing copyright-infringing material.
Sure, if OpenAI is actually producing copyright-infringing material at scale, unprompted, then that needs to be addressed. If a common way around NYT's paywall were to copy & paste the first few lines into ChatGPT and then read the rest of the article, then yes, that's a hole that needs to be filled. But that's with the production and dissemination, not the training.
Regarding scale, yes, there is a difference here, but it's more subtle than you think. There are probably millions of toddlers who can recite The Gruffalo nearly verbatim. However, each of those toddlers was trained individually. Similarly, there are probably thousands, maybe tens of thousands, of artists who, when prompted, could generate an image similar enough to the presented image to violate copyright. But again, each of those individuals was trained separately.
The difference that modern tech companies have is that they can train their systems once, and then duplicate the same training across millions of instances.
One potential argument to make here would be to say: Training with this material is fair use; but fair use or not, those weights are now a derivative work. You can use exactly one copy of those weights, but you can't copy those weights millions of times, any more than it would be fair use to distribute one copy of that image to everyone in the company. You need to either train millions of copies, or pay licensing fees.
I'm not sure I agree with that argument, but at least it seems to me to bring the actual issues into more clarity.
spwa4 · 4h ago
> Justice didn't go after humans copying books by hand, it went after reprints of existing copyrighted material.
Not sure who "justice" is, but copyright owners most definitely did go after individuals copying even small excerpts of copyrighted material; in fact, that is currently still going on in libraries:
https://www.theguardian.com/commentisfree/2023/oct/09/us-lib...
(yes, they explicitly made taking excerpts impossible and fought against any attempt to change that)
> Music industry didn't go after people singing tunes in their kitchen
Again, yes they did. I'm not aware of kitchen incidents, but this happened:
https://stanforddaily.com/2020/08/04/warner-chappell-music-s...
> individuals copying even small excerpts of copyrighted material
Legal copyright claims (not the YouTube kind) need to justify harm to the original piece.
> libraries
Setting aside my opinion on the situation of public libraries, I wouldn't call libraries "humans copying books by hand"
> this happened: [...]
"This article is purely satirical and fictitious. "
wiseowise · 5h ago
> Intellectual property rights and copyrights be damned
Yes, be damned and fuck them. And stop pursuing individuals too.
You can have omniscient AI without feeding all of the data into it.
findthewords · 5h ago
One of two legal outcomes must follow from OpenAI's piracy.
1. OpenAI must be fined according to law.
2. Piracy is decriminalized.
Failing to do either is an admission that the US has become a corporatocracy: a form of oligarchy where the rule of law is set not by a majority or plurality of people, but by a number of corporations you can count on one hand.
PicassoCTs · 4h ago
3. The law splits and breaks down in a controlled way: one law for corporations and the upper echelons, one for the peasants.
hsbauauvhabzb · 5h ago
> Failing to do either is an admission that the US has become a corporatocracy.
Not sure it really matters unless there’s something that will or can be done about it.
ktallett · 6h ago
I'm not AI's biggest fan, but it is training its virtual brain on images and data, just like a creative human would; it is just far quicker at it.
Do we need to limit scraping and learning to only freely, publicly available data? And even with a watermark, isn't a Reuters photo still publicly available for learning photographic style, composition, and lighting? What you pay for is the license to reproduce the image; you can still look at it beforehand.
nicbou · 5h ago
I think the main issue is that it uses monopolistic behaviour to press its advantage. It feels as if Google et al are scalping information from the actual producers.
There is also the nonconsensual aspect of it. I make things for humans. Just because it’s free for humans does not mean it’s free to be used by corporations, especially when they use my work to kill the economics of my industry. It feels like a parasitic relationship.
hau · 5h ago
What really matters is the society we want to live in. It doesn't matter much what kind of technology allows private entities to reproduce creativity. We can assume these brains are not virtual and are actually organic and more capable than human ones, or that they are magical black boxes.
Since it changes the incentives and mechanics of the creative market so much, it forces us to reassess current approaches. I can't agree that the mechanism behind this tech is of any consequence to the approach to IP. We don't make laws, norms and judgements for the sake of our tools.
It's ok to say that we're not ready to arrange things in a proper way yet, without letting everything slide on some arbitrary technicality.
notachatbot123 · 5h ago
As a human I am not allowed to freely acquire anything I want. I have to pay, rent, license. This makes me unable to learn and train my brain in many ways I would like to.
Why should "AI" companies be allowed more than I am?
wiseowise · 5h ago
They literally employ torrents, like you can.
can16358p · 5h ago
No one was able to summarize exactly what I had in mind about AI this well for years.
Thank you!
As much as I think that what these companies are doing has moral and legal issues, reframing copyright violations as "theft" will cut both ways and make the arguments for open culture more difficult.
verisimi · 6h ago
Yes, all that data that AI is sucking up belongs to Google, Twitter and Facebook! Lol. They are the owners of it. Those corporations didn't commit legal theft!
So ridiculous.
The whole idea of copyright is wrong. Anything that is popular and therefore successful owes that to the crowd that made it popular. The crowd is what gives it its interest.
I personally wouldn't even be averse to some sort of copyright period, say 2 years, with the possibility to extend to, say, 5, but these things are common.
PicassoCTs · 6h ago
My problem with AI is that it has humanity's mediocre, filtered-for-the-masses culture baked into it by definition.
This is machinery that cannot suggest experimental jazz unless it is already mainstream. It cannot go to the fringes and shove the species towards new and wild discoveries.
I find it hilarious, though, that it may destroy the cultural behemoths with their IPs by mash-up. Death by a million Beatles-clone songs; yesterday came suddenly indeed. And after all that is said and done, some of us might even venture out to the fringe and find the weird and wild parts again.
BLKNSLVR · 5h ago
Humanity rarely ventures out to the fringes on its own anyway. Only the committed freaks were out there in the time between the creation of experimental jazz and the mainstreaming of LLMs.
The more it changes, the more it stays the same.
As a sibling commenter said in fewer words: the models will still consume the fringe content and will be able to regurgitate it given the right prompt; suggesting experimental jazz or whatever other fringe art pursuit the committed freaks have an itch for.
I'll still slowly progress my way through John Zorn's catalogue, occasionally re-invigorate my appreciation for HR Giger's bleak works, keep listening (and subscribing) to the local community radio station, and keep seeking out uncomfortable movies, safe in the knowledge that I'll likely be watching, listening, and appreciating them all in solitude.
I do not think that "culture" can be forced, it must be slowly, almost imperceptibly, absorbed.
FeepingCreature · 5h ago
It can absolutely go outside the mainstream, it just doesn't by default. You can push it outside of its median.
BenGosub · 6h ago
If you use it like a search engine it can find some obscure blogs that list some rare jazz though.