More attacks by dumb AI-bots

Achim D. Brucker adbrucker at 0x5f.org
Mon Nov 17 12:07:10 CET 2025


On 17/11/2025 10:41, Makarius wrote:
> On 17/11/2025 11:28, Achim D. Brucker wrote:
>> I would not call it a "proper" solution, but I am currently using 
>> Anubis (https://anubis.techaro.lol/) with quite some success. Of 
>> course, it's an arms race - computing the challenges set out by it 
>> is not that expensive. Hence, when enough websites use it, the 
>> crawlers will implement the challenge solving part ...
>
> Anubis emerged early 2025 as a counter-attack, and I don't like it. An 
> "arms race" is war against war, and ultimately won't work.
>

I agree - but it does work at the moment as a relatively easy-to-install 
workaround that lowers the load on attacked servers a lot. It also keeps 
web access open for humans, and I have not seen a much better solution 
for that yet. Needing a solution quickly, I had to either swallow the 
"Anubis" pill, allow access only to authorised users, or shut the web 
interface down completely. Anubis was (and likely still is) the least of 
the three evils.


> There must be a proper solution. For us it means that our own programs 
> (or "daemons") can access the repository servers, without too much 
> additional complication.
>
> I am presently thinking of SSH and maybe RSYNC, as well-known non-HTTP 
> protocols. There is also an rsync server that hardly anybody remembers 
> now (we actually have one to mirror the Isabelle website).


For downloading Isabelle components, rsync seems to be a viable option. 
On my servers, I have not seen high crawler load on "data/software" 
archives (tgz, zip, etc.) or on REST APIs. I use Anubis purely for the 
web interfaces designed for human consumption. Again, this will likely 
change sooner than we would all like.

For pure programmatic access, any form of weak authentication should 
work at the moment.
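
To make "weak authentication" concrete, here is a minimal sketch of one 
possible form: a static shared token sent as a bearer token, checked on 
the server side. The names is_authorized and SHARED_TOKEN and the token 
value are hypothetical, chosen only for illustration; this is not what 
any of the servers discussed here actually run.

```python
import hmac

# Hypothetical shared secret; a real setup would read it from configuration.
SHARED_TOKEN = "example-token"


def is_authorized(headers, expected_token=SHARED_TOKEN):
    """Return True if the request headers carry the expected bearer token."""
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(token.encode(), expected_token.encode())
```

A daemon would then send "Authorization: Bearer example-token" with each 
request. Crawlers that do not know the token are rejected cheaply, 
without a per-request challenge.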


These AI crawlers are really a threat to the open web ...


Achim


