robots.txt for AI crawlers — how to write it
A robots.txt for AI crawlers should name each trusted user agent explicitly and state what it may fetch on your production domain. A broad `Disallow: /` under `User-agent: *`, or a copied legacy template, blocks GPTBot, PerplexityBot, and ClaudeBot whenever they have no group of their own, even when you intend to welcome AI search traffic. The fix is to add a separate user-agent block for each bot, keep unknown scrapers restricted, and verify the live file at /robots.txt after deploy.
Start from your business policy: decide which AI platforms may read public pages for search and citation, and which broad training crawlers you want to limit. Then map each decision to a user-agent group that robots.txt parsers understand. If you use Cloudflare or another edge firewall, its rules must agree with robots.txt; otherwise a bot that robots.txt allows can still be blocked at the edge before it reaches your origin.
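As a rough illustration of mapping policy decisions to user-agent groups, the sketch below renders a robots.txt from a small policy table. The bot names are the ones named in this article; the `/admin/` path, the `ExampleTrainingBot` name, and the `render_robots` helper are hypothetical placeholders, not a recommended policy.

```python
# Hypothetical policy table. Bot names come from this article; the paths and
# "ExampleTrainingBot" are placeholders to replace with your own decisions.
POLICY = {
    "GPTBot": {"allow": ["/"], "disallow": ["/admin/"]},
    "PerplexityBot": {"allow": ["/"], "disallow": ["/admin/"]},
    "ClaudeBot": {"allow": ["/"], "disallow": ["/admin/"]},
    "ExampleTrainingBot": {"allow": [], "disallow": ["/"]},
}


def render_robots(policy, default_disallow=("/",)):
    """Render one user-agent group per bot, then a restrictive catch-all group."""
    lines = []
    for agent, rules in policy.items():
        lines.append(f"User-agent: {agent}")
        # Specific Disallow rules go first so that first-match parsers and
        # longest-match parsers reach the same decision.
        for path in rules.get("disallow", []):
            lines.append(f"Disallow: {path}")
        for path in rules.get("allow", []):
            lines.append(f"Allow: {path}")
        lines.append("")  # a blank line separates groups
    # Bots without their own group fall back to this catch-all.
    lines.append("User-agent: *")
    for path in default_disallow:
        lines.append(f"Disallow: {path}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_robots(POLICY))
```

Keeping the policy in one table makes it easier to compare against edge firewall rules, since both can be reviewed against the same list of bots.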
How to write robots.txt for AI crawlers
- List trusted bots: GPTBot, ChatGPT-User, PerplexityBot, ClaudeBot, Google-Extended, Bingbot.
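- Add one user-agent section per bot, with an explicit Allow rule for each path you want crawled.
- Remove catch-all Disallow rules that still apply to bots you want to allow: a bot with no section of its own falls back to `User-agent: *`, so a blanket `Disallow: /` there blocks it.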
- Deploy and fetch https://yourdomain.com/robots.txt from production, not staging.
- Re-test after CMS or SEO plugin updates that may rewrite robots.txt.
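The verification step above can be sketched with Python's standard `urllib.robotparser`. This is a minimal check under stated assumptions: `SAMPLE` is an illustrative file, `yourdomain.com` and `/pricing` are placeholders, and the `check_agents` and `fetch_live` helpers are hypothetical names. Python's parser applies rules in file order, so its verdict can differ from a given crawler's own parser on files that mix overlapping Allow and Disallow rules.

```python
from urllib.request import urlopen
from urllib.robotparser import RobotFileParser

# The bots listed in this article.
AI_AGENTS = [
    "GPTBot", "ChatGPT-User", "PerplexityBot",
    "ClaudeBot", "Google-Extended", "Bingbot",
]

# Illustrative file: two named bots allowed, everything else blocked.
SAMPLE = """\
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: *
Disallow: /
"""


def check_agents(robots_txt, url, agents=AI_AGENTS):
    """Return {agent: True/False} for whether each agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


def fetch_live(base_url):
    """Download the deployed robots.txt (use production, not staging)."""
    with urlopen(f"{base_url.rstrip('/')}/robots.txt", timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Swap SAMPLE for fetch_live("https://yourdomain.com") to test the live file.
    report = check_agents(SAMPLE, "https://yourdomain.com/pricing")
    for agent, allowed in report.items():
        print(f"{agent}: {'allowed' in str(allowed) or ('allowed' if allowed else 'blocked')}")
```

In the sample, only GPTBot and ClaudeBot have their own groups, so the other four agents fall through to the catch-all and are reported as blocked. Re-running the same check after a CMS or SEO plugin update shows whether the file was rewritten.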
Running an Express Check produces an HTML report showing which AI user agents your live robots.txt allows or blocks.
