More attacks by dumb AI-bots
Achim D. Brucker
adbrucker at 0x5f.org
Mon Nov 17 12:07:10 CET 2025
On 17/11/2025 10:41, Makarius wrote:
> On 17/11/2025 11:28, Achim D. Brucker wrote:
>> I would not call it a "proper" solution, but I am currently using
>> Anubis (https://anubis.techaro.lol/) with quite some success. Of
>> course, it's an arms race: computing the challenges it sets is not
>> that expensive. Hence, once enough websites use it, the crawlers
>> will implement the challenge-solving part ...
>
> Anubis emerged early 2025 as a counter-attack, and I don't like it. An
> "arms race" is war against war, and ultimately won't work.
>
I agree, but at the moment it does work as a workaround that is
relatively easy to install and lowers the load on attacked servers a
lot. It also keeps web access open to humans, for which I have not seen
much better solutions yet. Needing a solution quickly, I had to either
swallow the "Anubis" pill, allow access only to authorised users, or
shut the server down completely. Anubis was (and likely still is) the
least of the three evils.
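
As an illustration of why such challenges are cheap to solve, here is a
minimal sketch of a generic hashcash-style proof of work in Python. It
is not Anubis's actual code: the function names, the challenge string
and the difficulty value are all hypothetical. The client searches for
a nonce whose SHA-256 digest starts with a given number of zero hex
digits, and the server checks the result with a single hash.

```python
import hashlib
import itertools


def solve_challenge(challenge: str, difficulty: int) -> int:
    """Return the smallest nonce whose SHA-256 digest of challenge+nonce
    starts with `difficulty` zero hex digits (hypothetical scheme)."""
    prefix = "0" * difficulty
    for nonce in itertools.count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce


def verify_solution(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server-side check: one hash, regardless of how long solving took."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


if __name__ == "__main__":
    # Illustrative values only; a real deployment picks its own difficulty.
    nonce = solve_challenge("example-challenge", 3)
    print(nonce, verify_solution("example-challenge", nonce, 3))
```

With difficulty d the solver needs about 16^d hashes on average, so the
cost is small for a single visitor, and equally small for a crawler that
implements the same loop.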
> There must be a proper solution. For us it means that our own programs
> (or "daemons") can access the repository servers, without too much
> additional complication.
>
> I am presently thinking of SSH and maybe RSYNC, as well-known non-HTTP
> protocols. There is also an rsync server that hardly anybody remembers
> now (we actually have one to mirror the Isabelle website).
For downloading Isabelle components, rsync seems to be a viable option:
on my servers, I have not seen high crawler loads on "data/software"
archives (tgz/zip, etc.) or on REST APIs. I am using Anubis purely for
the web interfaces designed for human consumption. Again, this will
likely change sooner than we would like.

For purely programmatic access, any form of weak authentication should
work at the moment.
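
One possible reading of "weak authentication" is a shared token that
the client programs send with each request. The Python sketch below
assumes an HTTP-style `Authorization: Bearer ...` header and a
hypothetical token value; neither comes from an existing Isabelle
setup.

```python
import hmac

# Hypothetical shared secret; a real setup would read it from configuration.
SHARED_TOKEN = "example-token"


def is_authorised(headers: dict[str, str]) -> bool:
    """Accept a request only if it carries the expected bearer token."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    # compare_digest avoids leaking the token through timing differences.
    return scheme == "Bearer" and hmac.compare_digest(
        token.encode(), SHARED_TOKEN.encode()
    )


if __name__ == "__main__":
    print(is_authorised({"Authorization": "Bearer example-token"}))
    print(is_authorised({}))
```

A crawler that does not know the token is turned away before any
expensive page is rendered, while a daemon only needs one extra header.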
These AI crawlers are really a threat to the open web ...

Achim