LEAKED: A New List Reveals Top Websites Meta Is Scraping of Copyrighted Content to Train Its AI

cm0002@lemmy.world · 3 days ago

guyincognito@piefed.social · 3 days ago

If I scrape Meta’s AI to develop my own, would that be fair game? I’m genuinely curious about the legality of this.

BrikoX@lemmy.zip · edit-2 2 days ago

Technically you would be breaking the terms of service and the license, but in a legal sense we don’t know if that would be enforceable. The courts still haven’t answered that.

cm0002@lemmy.world · 2 days ago

So far, OpenAI, Anthropic, et al. haven’t sued anyone over it, but they have cut off account access when they discover it’s being used for that purpose.

It’s how early versions of DeepSeek were trained, iirc. It’s called distillation.