Meta's latest legal wheeze is to insist that pirating books is fair use, actually. And it might be working.

artifex@piefed.social · 16 days ago

🇰 🌀 🇱 🇦 🇳 🇦 🇰 🇮 @pawb.social · 16 days ago

So I can use pirated media to train my AI (Actual Intelligence), right?

artifex@piefed.social · 16 days ago

As long as you’re rich enough to hire your own army of lawyers, probably.

That said, it seems like when you’re rich enough to hire your own army of lawyers you can pretty much do whatever you want.

Kailn@lemmy.myserv.one · 16 days ago

Well, that doesn’t sound civil or lawful at all; it sounds more like the kingdoms of the Dark Ages, where the “rules” don’t apply to a chosen few.

If Meta and other bigcorps that support the US government get the special “avoid-judgment” card and you face punishment, then there’s no law, only bigotry.

That would encourage individuals and small groups to keep their activities a secret (go anonymous) and break the law whenever they can,
because the “king and his followers” don’t follow their own “rules”.

The US is not only getting dystopian, they’re committing primitive mistakes.

tehn00bi@lemmy.world · 16 days ago

Should make all journal publications fair use.

ptu@sopuli.xyz · 15 days ago

If only the US were going for a win in that AI

Bazoogle@lemmy.world · 16 days ago

Unfortunately you do have to prove you’re intelligent

Dr. Moose@lemmy.world · 15 days ago

Yes, in fact there’s no framework or legal precedent right now so everyone already is doing it. You can just scrape the web etc and disregard IP ownership because training AI is transformative work - as it should be.

Archangel1313@lemmy.ca · 16 days ago

I absolutely love the fact that all these companies are laying the legal groundwork to destroy intellectual property rights altogether. If they win enough of these cases, then every pirate on the open seas sails under a flag of amnesty.

artifex@piefed.social · 16 days ago

No, I expect they’ll be more like “rules for thee but not for me”

lmmarsano@group.lt · 16 days ago

I wouldn’t be so confident without a legal argument to support your opinion.

Dead_or_Alive@lemmy.world · 15 days ago

No legal argument is necessary. Just look at history. The rich and well connected have always lived by a different set of rules.

See below:

Robert Richards (Du Pont heir): A 2014 Forbes article noted that a Du Pont heir, Robert Richards, pleaded guilty to raping his 3-year-old daughter in 2008 and received probation instead of jail time, which caused public outrage.
August Busch IV: August Busch IV, a former Anheuser-Busch CEO, has been involved in past legal incidents, including a girlfriend’s overdose death at his house in 2010 and a car crash in 1983, but he was not charged with rape in these cases.

jabjoe@feddit.uk · 15 days ago

Not all IP is self-serving. Even CopyRight isn’t always a bad thing, if you think of small artists, for example. My fear is mainly about CopyLeft, as I feel it’s been incredibly successful in pushing openness forward. The megacorps hating it tells you it is doing its job. One of the things they love about LLMs and code is that they can license-wash away CopyLeft.

Paranoid Factoid@lemmy.world · 15 days ago

So meta gets to claim fair use with pure digital duplication, but archive.org doesn’t when they scan physical copies of books and only lend out the same number of copies as they own in warehouses. That’s piracy.

Got it.

WIZARD POPE💫@lemmy.world · 15 days ago

Rules for thee but not for me ahhhh corpo shit.

andybytes@programming.dev · 15 days ago

yuppers

Goodlucksil@lemmy.dbzer0.com · 16 days ago

Classic “the end justifies the means” (bad) defense. If ISPs can send letters for torrenting, and Facebook torrented a lot, Facebook deserves a fair punishment.

GameOverFlow@lemmy.zip · 16 days ago

Not deserves, needs.

Archer@lemmy.world · 16 days ago

lol it would be hilarious if they could order Facebook disconnected from the Internet like a pleb hit with a copyright complaint

Willoughby@piefed.world · 16 days ago

truck full of letters backs up to Meta’s headquarters

“there, that’s more appropriate.”

Etterra@discuss.online · 15 days ago

So when this works for them it’ll be precedent to allow the fair use pirating of all media and software, right?

Oh never mind, I forgot that I don’t have billions of dollars to spend on lawyers. Never mind.

andybytes@programming.dev · 15 days ago

At this point I don’t understand pirating software.

REDACTED@infosec.pub · 14 days ago

He’s ex blizzard dev who opened his own indie studio

MousePotatoDoesStuff@lemmy.world · 14 days ago

And he’s out of mana.

TheObviousSolution@lemmy.ca · 16 days ago

So we can pirate books too, as long as we aren’t able to reproduce them verbatim from memory?

Judge Vince Chhabria either accepts whatever bribes and offers he’s probably getting offered and sides with Meta, or it will eventually go on to the Supreme Court where they most definitely will. That’s the part of this that will work the most under an administration of no accountability.

InternetCitizen2@lemmy.world · 16 days ago

Tell the judge you are training a neural network… it just happens to also be you.

melfie@lemy.lol · 16 days ago

Looking forward to Jellyfin getting an LLM to train locally on movie preferences so everyone’s library is fair use. Wait, is this why LLMs are being shoehorned into everything?

Dr. Moose@lemmy.world · 15 days ago

Honestly I agree with Meta here but this should apply to everyone. I think most people here conflate their hate for Meta with the factual reality of intellectual property.

SpaceMan9000@lemmy.world · 15 days ago

I can hate both.

People can also hate the fact that if you have enough money you can make everything legal.

Dr. Moose@lemmy.world · 15 days ago

What do you mean you can hate both? What’s the other thing you hate? Disregard for copyright absolutism?

Luminous5481 "Lawless Heathen" [they/them]@anarchist.nexus · 16 days ago

Yup, that’s what I’m doing with all those audiobooks I torrented. Helping the US maintain the lead in AI 😂

discocactus@lemmy.world · 16 days ago

Unironically may become a legitimate defense. Although in that case, indiscriminately bombing gas stations in your town and extorting their owners should also be allowed but…

ArbitraryValue@sh.itjust.works · 16 days ago

We’re going to end up in a situation where whatever is necessary to train AI is permitted, and the main question is whether that will be through (re)interpretation of existing law or the passage of a new law.

ctrl_alt_esc@lemmy.ml · 16 days ago

Good thing I have a local model running that’s constantly learning, for precisely this reason

panda_abyss@lemmy.ca · 16 days ago

I’m still collecting media before I can start the training process.

XLE@piefed.social · 16 days ago

If anything, this is proof you should be next in line for a large venture capital infusion!

daggermoon@piefed.world · 16 days ago

Is it fair use if I do it?

artifex@piefed.social · 16 days ago

How rich are you?

daggermoon@piefed.world · 16 days ago

I’m quite poor. I’m thankful every day that my mom and dad still let me live with them.

artifex@piefed.social · 16 days ago

I wouldn’t recommend it then 😞

runner_g@piefed.blahaj.zone · 16 days ago

just claim that you are training an AI for a new startup you are working on, and will soon be looking for VC to fund the project further. be sure to use terms like “revolutionary” and “democratize” liberally.

andybytes@programming.dev · 15 days ago

So we subsidize these baby killing bastards and they pull the broke boy card. The United States is a brutal imperialist capitalist shithole …pffft fuck capitalism

☂️-@lemmy.ml · 16 days ago

sure. thanks meta, anna’s archive will help me with my reading list, thanks.

ChunkMcHorkle@lemmy.world · 15 days ago

anna’s archive

I wish. As someone astutely put it in another conversation, now that the tech companies have pilfered Anna’s Archive, the big publishers are going to try to get it shut down.

☂️-@lemmy.ml · edit-2 15 days ago

we back it up and do it all over again, rinse repeat until we can depose the people so desperate to keep us from information.

rc__buggy@sh.itjust.works · 16 days ago

We can train our NI (Natural Intelligence) models.

ClownStatue@piefed.social · 16 days ago

To demand shrubberies?

ryathal@sh.itjust.works · 16 days ago

Arguing that training models isn’t fair use is going to be a massive uphill battle; it’s basically reading the book but with a computer. It’s not actually a big deal to people, unless you hold the copyright to a ton of works and want to get a percentage of all the AI income these companies have made.

Torrenting the books is likely absolutely copyright infringement, but that has relatively low payout compared to the money these companies are getting for their models. The training being fair use means that rights holders can’t try to take any money from the model’s use. The statutory limits for infringement even at per work levels aren’t significant compared to the legal cost of proving it happened.

OfCourseNot@fedia.io · 16 days ago

There’s an argument to be made that it is, in fact, not ‘reading’. The training of the model could be considered a lossy compression of the data. And streaming movies in a lossy compression format is not fair use, is it?

Fatal@piefed.social · 16 days ago

It’s not the storage of the information that matters as much as the presentation. Google’s search index stores a huge amount of copyrighted material, even losslessly. But they only present small snippets at a time which is not considered copyright infringement. The question really is whether or not the information being presented by the models is in a format which is considered copyright infringement. So far, courts have not found that they are.

ryathal@sh.itjust.works · 16 days ago

The model doesn’t stream out anyone’s content though. The article mentions that the plaintiffs have provided no examples of a prompt that creates anything substantial.

Streaming a lossy compression would generally be infringement, but there is definitely a point where it becomes not infringement if it’s lossy enough.

What a model generally stores is factual information that isn’t copyrightable in the first place. It’s storing word counts, sentence lengths, sentiment analysis, and so on.

FatCrab@slrpnk.net · 16 days ago

Anthropic pirating books for their training corpus resulted in the biggest copyright settlement in history, well over a billion. That is still being quibbled over, I believe, but they settled because they were likely to pay out more if the case went forward. So I’m not really sure where you’re coming from that infringement via torrenting does not result in monstrously large liability.

ryathal@sh.itjust.works · 16 days ago

The judge in that case ruled the training wasn’t fair use for pirated books, which left them on the hook for potentially all revenue (likely a court determined percentage) that the model generated for them in addition to statutory damages. That is well north of 1.5 billion.

artifex@piefed.social · 16 days ago

Which is kind of a pity. Anyone who’s ever written something on the net should be getting royalty checks from these fucks. I’m not exactly famous but I’ve written prolifically in my field of work and have gotten nearly word-for-word reproductions of my articles out of every big model I’ve tested since GPT-3.

FatCrab@slrpnk.net · 15 days ago

Just noticed your reply and want to correct this. Anthropic settled; the 1.5bil was not a judgment against them. Specifically, this covered the literal pirating of the training corpus. It had absolutely nothing to do with the way training on the data handled the training data: they literally torrented an enormous portion of their training corpus.

Anthropic DID try to argue that because they used the pirated material for training a model, it was fair use. The judge correctly decided that doesn’t make any fucking sense. Again, this is not about the models encoding data, it is literally just about the fact that these silly fucks torrented vast portions of their training corpus like college students building a porn library on college broadband.