User-agent: * Disallow: / # Rate-limit all bots to 1 req/min (unfortunately, Crawl-Delay is non-standard). User-agent: * Crawl-Delay: 60 # Very aggressive, is AI BS. # "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)" User-agent: ClaudeBot Disallow: / # Not very aggressive, but at this point Anthropic can fuck right off. # "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Claude-SearchBot/1.0; +searchbot@anthropic.com)" User-agent: Claude-SearchBot Disallow: / # Very aggressive, is marketing BS. # "Mozilla/5.0 (compatible; SemrushBot/7~bl; +http://www.semrush.com/bot.html)" User-agent: SemrushBot Disallow: / # Very aggressive, doesn't obey Crawl-Delay. # "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML, like Gecko) Version/8.0.2 Safari/600.2.5 (Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot)" User-agent: Amazonbot Disallow: / # Very aggressive, the company behind TikTok. # "Mozilla/5.0 (Linux; Android 5.0) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; Bytespider; spider-feedback@bytedance.com)" User-agent: Bytespider Disallow: / # Reasonably aggressive (>10k requests in <2 hours), is AI BS. # "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.1; +https://openai.com/gptbot" User-agent: GPTBot Disallow: / # ChatGPT search (I haven't noticed this one, but OpenAI can screw off) # "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36; compatible; OAI-SearchBot/1.3; +https://openai.com/searchbot" User-agent: OAI-SearchBot Disallow: / # ChatGPT custom GPTs (I've only noticed a couple thousand requests from this one, but OpenAI can screw off) # "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot" User-agent: ChatGPT-User Disallow: / # Meta's AI-training bot, 200K-1M requests a week # "meta-externalagent/1.1 (+https://developers.facebook.com/docs/sharing/webmasters/crawler)" User-agent: meta-externalagent Disallow: / # Meta's AI search (i.e. AI BS, not "regular" search), 300K requests in the last 2 days # "The Meta-WebIndexer crawler navigates the web to improve Meta AI search result quality for users." # "meta-webindexer/1.1 (+https://developers.facebook.com/docs/sharing/webmasters/crawler)" User-agent: meta-webindexer Disallow: / # Apple's AI-training bot. I've seen applebot make ~1M # requests a week, but idk how much of that is the AI-training # applebot-extended vs "legitimate" search-crawling. # "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.4 Safari/605.1.15 (Applebot/0.1; +http://www.apple.com/go/applebot)" User-agent: Applebot-Extended Disallow: / # AllenAI # "Mozilla/5.0 (compatible) Ai2Bot-Dolma (+https://www.allenai.org/crawl)" User-agent: Ai2Bot-Dolma Disallow: / # NewsAI # "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/140.0.0.0 newsai/1.0 Safari/537.36" User-agent: newsai Disallow: /