LEAKED: A New List Reveals Top Websites Meta Is Scraping of Copyrighted Content to Train Its AI

cm0002@lemmy.world · 3 days ago

mindbleach@sh.itjust.works · 3 days ago

Correct - only the filesharing is against the law. Training is transformative use.

You can’t cram a billion images into one gigabyte. If the model were storing them, they’d be one byte each. What these models do is very different from the bootlegging you’re making it sound like.
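
Quick back-of-the-envelope sketch of that arithmetic. The one gigabyte and one billion images are just the round figures from this comment, not measurements of any particular model or dataset, and `bytes_per_image` is a made-up helper name for illustration.

```python
# Rough storage budget per training image if a model were a literal archive.
# Both figures are the round numbers from the comment (assumptions),
# not the size of any real model or dataset.

GIGABYTE = 10**9          # decimal gigabyte, in bytes
IMAGE_COUNT = 10**9       # "a billion images"


def bytes_per_image(model_bytes: int, image_count: int) -> float:
    """Average number of bytes available per image."""
    return model_bytes / image_count


if __name__ == "__main__":
    budget = bytes_per_image(GIGABYTE, IMAGE_COUNT)
    print(f"{budget} byte(s) per image")
```

One byte can only take 256 distinct values, so that budget is nowhere near enough to hold a copy of each picture.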