Long story short, my VPS, which I’m forwarding my servers through Tailscale to, got hammered by thousands of requests per minute from Anthropic’s Claude AI. All of which being from different AWS IPs.

The VPS has a 1TB monthly cap, but it’s still kinda shitty to have huge spikes like the 13GB in just a couple of minutes today.

How do you deal with something like this?
I’m only really running a caddy reverse proxy on the VPS which forwards my home server’s services through Tailscale. "

I’d really like to avoid solutions like Cloudflare, since they f over CGNAT users very frequently and all that. Don’t think a WAF would help with this at all(?), but rate limiting on the reverse proxy might work.

(VPS has fail2ban and I’m using /etc/hosts.deny for manual blocking. There’s a WIP website on my root domain with robots.txt that should be denying AWS bots as well…)

I’m still learning and would really appreciate any suggestions.

      • A good tar pit will reduce your bandwidth. Tarpits aren’t about shoving useless data at bots; they’re about responding as slow as possible to keep the bot connected for as long as possible while giving it nothing.

        Endlessh accepts the connection and then… does nothing. It doesn’t even actually perform the SSL negotiation. It just very… slowly… sends… an endless preamble, until the bot gives up.

        As I write, my Internet-facing SSH tarpit currently has 27 clients trapped in it. A few of these have been connected for weeks. In one particular spike it had 1,378 clients trapped at once, lasting about 20 hours.

        • mholiv@lemmy.world
          link
          fedilink
          English
          arrow-up
          2
          ·
          56 minutes ago

          Fair. But I haven’t seen any anti-ai-scraper tarpits that do that. The ones I’ve seen mostly just pipe 10MB of /dev/urandom out there.

          Also I assume that the programmers working at ai companies are not literally mentally deficient. They certainly would add .timeout(10) or whatever to their scrapers. They probably have something more dynamic than that.