Now that we know AI bots will ignore robots.txt and churn residential IP addresses to scrape websites, does anyone know of a method to block them that doesn’t entail handing over your website to Cloudflare?
This is pretty slick, but doesn’t this just mean the bots hammer your server, looping forever? How much processing do you do on those forms, for example?
Best is to redirect them to a 1TB file served from Hetzner’s cache. There are some nginx configs that do this.
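For anyone who wants to see the shape of the redirect idea without nginx, here is a rough Python sketch using only the standard library. Everything specific in it is an assumption for illustration: the user-agent markers, the `BIG_FILE_URL` placeholder, and the choice to detect bots by user agent at all (scrapers that spoof a browser user agent will sail straight past this).

```python
from wsgiref.simple_server import make_server

# Hypothetical values for illustration: substitute your own markers and target.
SUSPECT_MARKERS = ("gptbot", "ccbot", "bytespider")
BIG_FILE_URL = "https://example.com/1TB.bin"  # placeholder, not a real file


def is_suspect(user_agent):
    """Return True if the User-Agent contains one of the suspect markers."""
    ua = user_agent.lower()
    return any(marker in ua for marker in SUSPECT_MARKERS)


def app(environ, start_response):
    """WSGI app: redirect suspect clients to the big file, serve everyone else."""
    if is_suspect(environ.get("HTTP_USER_AGENT", "")):
        start_response(
            "302 Found",
            [("Location", BIG_FILE_URL), ("Content-Length", "0")],
        )
        return [b""]
    body = b"hello\n"
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]


if __name__ == "__main__":
    # Local test server only; a real site would sit behind a proper web server.
    with make_server("127.0.0.1", 8000, app) as server:
        server.serve_forever()
```

The redirect itself costs your server almost nothing, since the large download is served by whoever hosts the target file.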
Yes
None
It costs me nothing to have bots spending bandwidth on me because I’m not on a metered connection and electricity is cheap enough that the tiny overhead of processing their requests might amount to a dollar or two per year.