I’m talking text only, and there are some fundamental limitations in the way current and near-future LLMs handle certain questions. They don’t “see” characters in their input; they see chunks of text mapped to token IDs in their own internal vocabulary. Hence any question along the lines of “How many Ms are in Lemmy” is challenging even for advanced, fine-tuned models. It’s honestly way better than image captchas.
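For example, here’s a minimal sketch using OpenAI’s tiktoken library (the word and encoding choice are just for illustration; exact splits vary by tokenizer):

```python
# Show that a model never sees individual characters, only token IDs.
# Assumes `pip install tiktoken`.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # tokenizer used by several GPT models

word = "Lemmy"
token_ids = enc.encode(word)
pieces = [enc.decode([t]) for t in token_ids]

print(token_ids)  # a short list of integer IDs, not five characters
print(pieces)     # the subword chunks the model actually "sees"
# Counting the letter "m" requires character-level access the model lacks.
```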
They can also be tripped up if you simulate a repetition loop. They will either give an incorrect answer to try to continue the loop, or, if their sampling is overtuned (e.g. an aggressive repetition penalty), give incorrect answers to avoid cases where continuing the loop is the correct answer.
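As a rough illustration, here’s a toy version of the common repetition-penalty heuristic with made-up logits (not a real model), showing how an aggressive penalty can flip the winning token away from a correct repetition:

```python
# Toy sketch: repetition penalty forcing a wrong answer when
# repetition IS the right answer. Logits and penalty are hypothetical.

def apply_repetition_penalty(logits, history, penalty=2.5):
    """Divide positive logits of already-generated tokens by `penalty`
    (the common heuristic from the CTRL paper / transformers)."""
    adjusted = dict(logits)
    for tok in history:
        if tok in adjusted:
            if adjusted[tok] > 0:
                adjusted[tok] /= penalty
            else:
                adjusted[tok] *= penalty
    return adjusted

# Prompt: "Repeat the word 'blue' five times." Correct next token: "blue".
logits = {"blue": 4.0, "red": 2.0, "green": 1.5}
history = ["blue", "blue", "blue"]  # model has already said it three times

penalized = apply_repetition_penalty(logits, history)
print(max(logits, key=logits.get))        # "blue" -- correct without penalty
print(max(penalized, key=penalized.get))  # "red"  -- penalty pushes it off the loop
```

With penalty=2.5 (deliberately aggressive), “blue” drops from 4.0 to 1.6 and loses to “red”, even though repeating “blue” was the right answer.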
Somehow I didn’t get pinged for this?
Anyway, proof of work scales horrendously: the puzzle costs everyone the same number of hashes, but a spammer solving it on GPUs or a botnet pays far less per hash than a legitimate user on a phone, so spammers will always beat out legitimate users, if the scheme even holds. I think Tor is a different situation, where the financial incentives are aligned differently.
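To make the asymmetry concrete, here’s a minimal hashcash-style sketch (the challenge string and difficulty are hypothetical):

```python
# Hashcash-style proof of work: find a nonce so the hash has a number
# of leading zero bits. Expected work is ~2**difficulty_bits hashes,
# and that cost is the same for a phone CPU and a GPU farm.
import hashlib
import itertools

def solve_pow(challenge: bytes, difficulty_bits: int) -> int:
    """Return a nonce such that SHA-256(challenge || nonce) starts
    with `difficulty_bits` zero bits."""
    target = 1 << (256 - difficulty_bits)
    for nonce in itertools.count():
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce

nonce = solve_pow(b"example-challenge", 20)  # ~1M hashes: seconds on a laptop
print(nonce)
# The same 20-bit puzzle is trivial for dedicated hardware, so raising the
# difficulty enough to hurt a spammer makes phones unusable first.
```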
But this is not my area of expertise.