Meta’s Piracy: Are They Hoarding More Terabytes of Books?

TECH NEWS – Mark Zuckerberg’s company used books to train artificial intelligence models, but Meta didn’t exactly have legal access to the content…

 

A copyright lawsuit has been filed against Meta for using authors’ work to train large language models (LLMs). Dozens of emails, allegedly between Meta employees, indicate that books were pirated in bulk to train the company’s AI models, and that the downloaded torrents were then seeded. In January, court documents revealed that Meta obtained AI training data from LibGen, a large file-sharing database that contains everything from news articles and paywalled academic papers to books.

Meta is accused of downloading more than 80 terabytes of data from LibGen and another “shadow library” called Z-Library. 80 TB is roughly 80 thousand (!) gigabytes. That’s a lot, and it may be piracy on an unprecedented scale. The company emails document Meta’s decision to take copyrighted works it knew were pirated and use them without permission, despite clear ethical concerns. In one email submitted as evidence, a purported Meta employee warns, to no avail, that using pirated material should be beyond the company’s ethical threshold, then adds that LibGen and similar databases are basically like PirateBay, distributing copyrighted and infringing content.

Many emails mention concerns about using LibGen. One Meta researcher suggested that a VPN was the only way to access it, and also joked that it didn’t seem acceptable to torrent from a company laptop. So Meta went into stealth mode, hiding the activity by downloading and seeding the torrents outside of Facebook’s official servers. According to the plaintiffs, this correspondence suggests that Meta executives up to and including Mark Zuckerberg knew that the company was using pirated material to train its AI models. It has also emerged that Meta employees believed OpenAI was using LibGen for its own models, and described the situation as a kind of arms race that eventually pushed them to do the same.

If Meta is found liable, how much will it have to pay in damages? And why is the Internet Archive (archive.org) not allowed to lend books as a digital library?

Source: PCGamer

Anikó, our news editor and communication manager, is more interested in the business side of the gaming industry. She worked at banks, and she has a vast knowledge of business life. Still, she likes puzzle and story-oriented games, like Sherlock Holmes: Crimes & Punishments, which is her favourite title. She also played The Sims 3, but after accidentally killing a whole sim family, swore not to play it again.
