Amazon’s AI crawler is making my git server unstable
Please, just stop.
EDIT(2025-01-18 19:00 UTC):
I give up. I moved the Gitea server back behind my VPN. I’m working on a proof-of-work reverse proxy to protect my server from bots in the future. I’ll have it back up soon.
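For the curious, the core idea is that the proxy makes each client burn some CPU before it lets a request through: the server issues a random challenge, and the client has to find a nonce such that SHA-256(challenge + nonce) starts with enough zero bits. Verifying costs one hash; solving costs many. Here’s a minimal sketch of that check in Go (the names and difficulty value are illustrative, this isn’t the actual proxy):

// Minimal proof-of-work sketch: the client solves, the server verifies.
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"math/bits"
)

// leadingZeroBits counts the leading zero bits of a SHA-256 digest.
func leadingZeroBits(sum [32]byte) int {
	n := 0
	for _, b := range sum {
		if b == 0 {
			n += 8
			continue
		}
		n += bits.LeadingZeros8(b)
		break
	}
	return n
}

// solve is the client's job: brute-force a nonce for the challenge.
func solve(challenge []byte, difficulty int) uint64 {
	buf := make([]byte, len(challenge)+8)
	copy(buf, challenge)
	for nonce := uint64(0); ; nonce++ {
		binary.BigEndian.PutUint64(buf[len(challenge):], nonce)
		if leadingZeroBits(sha256.Sum256(buf)) >= difficulty {
			return nonce
		}
	}
}

// verify is the proxy's job per request: a single hash.
func verify(challenge []byte, nonce uint64, difficulty int) bool {
	buf := make([]byte, len(challenge)+8)
	copy(buf, challenge)
	binary.BigEndian.PutUint64(buf[len(challenge):], nonce)
	return leadingZeroBits(sha256.Sum256(buf)) >= difficulty
}

func main() {
	challenge := []byte("example-challenge")
	const difficulty = 16 // ~65k hashes on average to solve
	nonce := solve(challenge, difficulty)
	fmt.Println("nonce:", nonce, "valid:", verify(challenge, nonce, difficulty))
}

A human’s browser solves this once in milliseconds; a crawler hammering thousands of URLs pays the cost on every challenge.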
EDIT(2025-01-17 17:50 UTC):
I added this snippet to the ingress config:
nginx.ingress.kubernetes.io/configuration-snippet: |
  if ($http_user_agent ~* "(Amazon)") {
    return 418;
  }
The bots are still hammering away from a different IP every time. About 10% of the requests do not have the amazonbot user agent. I’m at a loss for what to do next.
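One partial mitigation would be matching client IPs against Amazon’s published AWS ranges from https://ip-ranges.amazonaws.com/ip-ranges.json, though that does nothing about traffic proxied through residential IPs. A rough sketch in Go (the client address here is hypothetical):

// Check a client IP against Amazon's published AWS prefixes.
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/netip"
)

type awsRanges struct {
	Prefixes []struct {
		IPPrefix string `json:"ip_prefix"`
	} `json:"prefixes"`
}

// fetchAWSPrefixes downloads and parses Amazon's published IPv4 ranges.
func fetchAWSPrefixes() ([]netip.Prefix, error) {
	resp, err := http.Get("https://ip-ranges.amazonaws.com/ip-ranges.json")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	var data awsRanges
	if err := json.NewDecoder(resp.Body).Decode(&data); err != nil {
		return nil, err
	}
	prefixes := make([]netip.Prefix, 0, len(data.Prefixes))
	for _, p := range data.Prefixes {
		pfx, err := netip.ParsePrefix(p.IPPrefix)
		if err != nil {
			continue // skip malformed entries
		}
		prefixes = append(prefixes, pfx)
	}
	return prefixes, nil
}

func main() {
	prefixes, err := fetchAWSPrefixes()
	if err != nil {
		panic(err)
	}
	addr := netip.MustParseAddr("52.95.0.1") // hypothetical client IP
	for _, pfx := range prefixes {
		if pfx.Contains(addr) {
			fmt.Println(addr, "is in AWS range", pfx)
			return
		}
	}
	fmt.Println(addr, "is not in a published AWS range")
}

Even then, anything coming through residential proxies sails right past this check, which is exactly the problem.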
I hate the future.
Hi all. This is a different kind of post. This is not informative. This is a cry for help.
To whoever runs AmazonBot, please add git.xeserv.us to your list of blocked domains. If you know anyone at Amazon, please forward this to them and ask them to forward it to the AmazonBot team.
Should you want to crawl my git server for some reason, please reach out to me so we can arrange for payment for hardware upgrades commensurate with your egregious resource usage.
I don’t want to have to close off my Gitea server to the public, but I will if I have to. It’s futile to block AI crawler bots because they lie, change their user agent, use residential IP addresses as proxies, and more. I just want the requests to stop.
I have already configured the robots.txt file to block all bots:
User-agent: *
Disallow: /
What else do I need to do?