What is lemmy doing about bot scrapers?

flango@lemmy.eco.br · 1 day ago

Cocodapuf@lemmy.world · edit-2 19 hours ago

Do you realize how much extra work your browser has to do every time you visit a site that makes money on ads? The number of additional scripts running in the background is astonishing. Trust me, the extra work that Anubis asks of users’ machines is totally insignificant when viewed in the greater context of what we actually do with computers.

Watching a 10 minute YouTube video, that’s your computer doing more work than it would loading a million text based pages running Anubis.

FaceDeer@fedia.io · 18 hours ago

> Do you realize how much extra work your browser has to do every time you visit a site that makes money on ads?

I have uBlock Origin and Ghostery, so very little.

> Watching a 10 minute YouTube video, that’s your computer doing more work than it would loading a million text based pages running Anubis.

Given that AI trainers are training on YouTube videos too, that sounds like Anubis isn’t going to impose meaningful costs on them.

Cocodapuf@lemmy.world · edit-2 13 hours ago

> Given that AI trainers are training on YouTube videos too, that sounds like Anubis isn’t going to impose meaningful costs on them.

Well, does it work?

You don’t need to guess about it; you can simply look at traffic records and see how much they change after installing Anubis. If it works for now, great. As with all things like this, it’s a cat-and-mouse game.
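For what it’s worth, that before/after comparison could look roughly like the sketch below. It assumes you have already pulled request timestamps out of your server logs (how you do that depends on your log format) and that you know when Anubis went live. The names `daily_average`, `traffic_change` and `cutoff` are made up for illustration; none of this comes from Anubis itself.

```python
from datetime import datetime


def daily_average(timestamps):
    """Average requests per calendar day, counting only days that saw traffic."""
    per_day = {}
    for ts in timestamps:
        per_day[ts.date()] = per_day.get(ts.date(), 0) + 1
    return sum(per_day.values()) / len(per_day) if per_day else 0.0


def traffic_change(timestamps, cutoff):
    """Percent change in average daily requests after `cutoff`.

    `cutoff` is the (assumed known) moment Anubis was installed.
    Returns None when there is no traffic before the cutoff to compare against.
    """
    before = [ts for ts in timestamps if ts < cutoff]
    after = [ts for ts in timestamps if ts >= cutoff]
    avg_before = daily_average(before)
    if avg_before == 0:
        return None
    return (daily_average(after) - avg_before) / avg_before * 100.0


if __name__ == "__main__":
    # Made-up timestamps purely to show the call; real ones come from your logs.
    sample = (
        [datetime(2024, 1, 1, h) for h in range(4)]
        + [datetime(2024, 1, 2, h) for h in range(4)]
        + [datetime(2024, 1, 3, h) for h in range(2)]
        + [datetime(2024, 1, 4, h) for h in range(2)]
    )
    print(traffic_change(sample, datetime(2024, 1, 3)))
```

A raw request count lumps humans and bots together, so in practice you would probably also want to split it by user agent or by path before drawing conclusions.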

Also, the way your computer interprets a YouTube video and the way a scraper interprets a YouTube video may well be different. But in general, for a browser, streaming and decoding video is a relatively heavy, high-bandwidth operation. Video needs much more bandwidth and CPU time than audio, which in turn is heavier and higher-bandwidth than text. As a result, video and text barely compare; they differ by orders of magnitude in bandwidth and processing needs. So does an AI scraper have to do all that decoding? I actually have no idea, but there could definitely be shortcuts that avoid it. For instance, they may only care about the audio, or perhaps the transcripts are good enough for them.
