I’ve been following the bearblog developer’s struggle to manage the ongoing war between bot scrapers and people who are trying to keep the internet safe and human-oriented. What is Lemmy doing about bot scrapers?

Some context from the bearblog dev:

The great scrape

https://herman.bearblog.dev/the-great-scrape/

LLMs feed on data. Vast quantities of text are needed to train these models, which are in turn receiving valuations in the billions. This data is scraped from the broader internet, from blogs, websites, and forums, without the author’s permission and all content being opt-in by default.

Needless to say, this is unethical. But as Meta has proven, it’s much easier to ask for forgiveness than permission. It is unlikely they will be ordered to “un-train” their next generation models due to some copyright complaints.

Aggressive bots ruined my weekend

https://herman.bearblog.dev/agressive-bots/

It’s more dangerous than ever to self-host, since simple mistakes in configurations will likely be found and exploited. In the last 24 hours I’ve blocked close to 2 million malicious requests across several hundred blogs.

What’s wild is that these scrapers rotate through thousands of IP addresses during their scrapes, which leads me to suspect that the requests are being tunnelled through apps on mobile devices, since the ASNs tend to be cellular networks. I’m still speculating here, but I think app developers have found another way to monetise their apps by offering them for free, and selling tunnel access to scrapers.
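
To make the pattern in that last quote a bit more concrete, here is a rough Python sketch of how one might tally requests per network when the source IPs keep rotating. Everything named in it is made up for illustration: the prefix table, the AS labels (taken from the ranges reserved for documentation), and the function `requests_per_asn`. A real setup would need an actual IP-to-ASN dataset, and this is not a description of what bearblog runs.

```python
import ipaddress
from collections import Counter

# Hypothetical prefix-to-network table for illustration only.
# The prefixes and AS numbers come from ranges reserved for documentation.
PREFIX_TO_ASN = {
    ipaddress.ip_network("198.51.100.0/24"): "AS64500 (example mobile carrier)",
    ipaddress.ip_network("203.0.113.0/24"): "AS64501 (example hosting provider)",
}


def requests_per_asn(ips):
    """Count requests per network; IPs not covered by the table count as 'unknown'."""
    counts = Counter()
    for raw in ips:
        addr = ipaddress.ip_address(raw)
        # Take the first prefix that contains the address, if any.
        label = next(
            (asn for net, asn in PREFIX_TO_ASN.items() if addr in net),
            "unknown",
        )
        counts[label] += 1
    return counts


# Made-up sample of client IPs, as they might appear in an access log.
sample = [
    "198.51.100.7",
    "198.51.100.42",
    "198.51.100.200",
    "203.0.113.5",
    "192.0.2.1",
]
summary = requests_per_asn(sample)
for label, n in summary.most_common():
    print(label, n)
```

Many distinct IPs collapsing into a handful of cellular networks would be consistent with the tunnelling theory above, though it would not prove it.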

  • Tuukka R@piefed.ee · 20 hours ago

    Since all real people accessing the site will be doing so through a browser, which has JavaScript built in

    Many blind people don’t, because for blind people a text-based interface makes a LOT more sense than a graphical user interface. And text-based browsers don’t exactly excel at JavaScript.

    (But who cares about some blind people anyway?)

    • AMoistGrandpa@lemmy.ca · 19 hours ago

      I didn’t know that. I had assumed people using screen readers would use the same versions of websites as everyone else.

      Off to do some research, to make my own sites more accessible for the blind!

    • irelephant [he/him]@lemmy.dbzer0.com · 20 hours ago

      Most Lemmy frontends don’t work without JavaScript.

      I may be wrong, but I’m pretty sure most blind people just use regular browsers with a screen reader like JAWS or NVDA.

      • Tuukka R@piefed.ee · 20 hours ago

        Most do, many don’t.

        Most blind people are never told that you can use a computer in text form, because most sighted people don’t know you can. The user experience is on a whole other level when you have an interface that is basically tailored to you, instead of something made for people with wildly different abilities from yours! At least when I watch my friend browse the web in those two formats, the difference is striking.

        It’s not okay to block them from using an otherwise much better option, even if not everyone knows about it.