Robots.txt for AI Crawlers: How to Allow GPTBot, ClaudeBot, and PerplexityBot
Your robots.txt file controls which AI crawlers can access your site. Here's how to configure it for maximum AI visibility.
The robots.txt file tells web crawlers what they can and can't access on your website. Traditionally, this meant Googlebot and Bingbot. But now there's a new generation of AI crawlers — and many websites are blocking them without realizing it.
If AI crawlers can't access your site, your business won't appear in AI-generated answers from ChatGPT, Claude, Perplexity, or Google AI Overviews. This guide covers every AI crawler you need to know about and how to configure robots.txt correctly.
Check your robots.txt instantly
Foglift scans your robots.txt and checks every AI crawler listed below. Free, no signup required.
Every AI Crawler You Need to Know About
Here's the complete list of AI crawlers as of 2026:
| Crawler | Company | Powers | User-Agent String |
|---|---|---|---|
| GPTBot | OpenAI | ChatGPT training and knowledge | GPTBot |
| ChatGPT-User | OpenAI | ChatGPT web browsing | ChatGPT-User |
| ClaudeBot | Anthropic | Claude AI training and search | ClaudeBot |
| PerplexityBot | Perplexity AI | Perplexity search answers | PerplexityBot |
| Google-Extended | Google | Gemini, AI Overviews | Google-Extended |
| Amazonbot | Amazon | Alexa answers, Amazon AI | Amazonbot |
| anthropic-ai | Anthropic | Claude training data | anthropic-ai |
| cohere-ai | Cohere | Cohere AI products | cohere-ai |
Recommended robots.txt Configuration
For most businesses that want maximum AI visibility, use this robots.txt:
```
# Standard search engine crawlers
User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

# AI crawlers — allow for AI search visibility
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Amazonbot
Allow: /

# Default — allow everything else, but block private/admin areas
User-agent: *
Disallow: /admin/
Disallow: /api/
Disallow: /auth/
Allow: /

Sitemap: https://yoursite.com/sitemap.xml
```
How to Selectively Allow or Block AI Crawlers
Maybe you want ChatGPT visibility but don't want your content used for AI training. Here's a selective approach:
```
# Allow ChatGPT browsing (real-time answers)
User-agent: ChatGPT-User
Allow: /

# Block GPTBot (training data)
User-agent: GPTBot
Disallow: /

# Allow Perplexity (search answers)
User-agent: PerplexityBot
Allow: /

# Block Google AI training but allow search
User-agent: Google-Extended
Disallow: /
```
Note: Blocking training crawlers means AI models won't have up-to-date knowledge of your site. They might still cite you from cached data, but it won't be current.
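Before deploying a selective configuration like the one above, you can sanity-check it with Python's standard-library `urllib.robotparser`. This is a minimal sketch; `https://yoursite.com/` is a placeholder URL, and real crawlers may interpret rules slightly differently than the standard-library parser does.

```python
from urllib import robotparser

# The selective configuration: allow ChatGPT browsing and Perplexity,
# block GPTBot and Google-Extended (AI training).
RULES = """\
User-agent: ChatGPT-User
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Disallow: /
"""

parser = robotparser.RobotFileParser()
parser.parse(RULES.splitlines())

# "https://yoursite.com/" is a placeholder domain.
for agent in ("ChatGPT-User", "GPTBot", "PerplexityBot", "Google-Extended"):
    verdict = "allowed" if parser.can_fetch(agent, "https://yoursite.com/") else "blocked"
    print(f"{agent}: {verdict}")
```

Running this confirms that the browsing agents can fetch the site root while the training agents cannot, before the file ever goes live.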
How to Edit robots.txt on Popular Platforms
WordPress
Edit via Yoast SEO plugin: SEO → Tools → File editor → robots.txt. Or create/edit the file at your site root.
Squarespace
Go to Settings → SEO → scroll to "Additional Robots.txt Rules" and add your AI crawler rules there.
Wix
Go to Dashboard → Settings → SEO (Google) → SEO Tools → Robots.txt Editor.
Shopify
Shopify auto-generates robots.txt. To customize it, add a robots.txt.liquid template to your theme (Online Store → Themes → Edit code), or use a Shopify robots.txt app.
Next.js / Vercel
Create a robots.ts file in your app/ directory or add a static robots.txt in public/.
Common Mistakes
- Using a wildcard Disallow that blocks AI crawlers — `User-agent: *` followed by `Disallow: /` blocks everything, including AI crawlers.
- Not checking platform defaults — Some CMS platforms add AI crawler blocks automatically. Always check after setup.
- Blocking GPTBot but expecting ChatGPT visibility — GPTBot is how ChatGPT learns about your site. Without it, you rely only on Bing indexing.
- Forgetting to add a sitemap reference — Always include `Sitemap: https://yoursite.com/sitemap.xml` at the end of robots.txt.
How to Verify Your Configuration
- Visit `yoursite.com/robots.txt` in your browser
- Check that GPTBot, ClaudeBot, and PerplexityBot are not blocked
- Run a free Foglift scan — it checks all AI crawlers and shows exactly which are blocked
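The manual check above can also be scripted. This sketch uses Python's standard-library `urllib.robotparser` to report which AI crawlers a given robots.txt blocks; the `sample` rules and the `https://yoursite.com/` domain are placeholders, and in practice you would feed in the contents of your live robots.txt.

```python
from urllib import robotparser

AI_CRAWLERS = [
    "GPTBot", "ChatGPT-User", "ClaudeBot",
    "PerplexityBot", "Google-Extended", "Amazonbot",
]

def blocked_crawlers(robots_txt: str, site: str = "https://yoursite.com/") -> list:
    """Return the AI crawlers that the given robots.txt bars from the site root."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_CRAWLERS if not parser.can_fetch(agent, site)]

# Placeholder rules that block only GPTBot.
sample = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nAllow: /\n"
print(blocked_crawlers(sample))  # ['GPTBot']
```

An empty result means every AI crawler in the list can reach your site root.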
Frequently Asked Questions
What is GPTBot?
GPTBot is OpenAI's web crawler that indexes content for ChatGPT and other OpenAI products. Its user-agent string is "GPTBot." Allowing GPTBot access means your content can appear in ChatGPT-generated answers.
Should I allow AI crawlers on my website?
For most businesses, yes. Allowing AI crawlers means your site can appear in AI-generated answers (ChatGPT, Perplexity, Google AI Overviews). Blocking them means you're invisible to the growing number of people using AI search.
Does Squarespace block AI crawlers?
Yes, some Squarespace sites block AI crawlers by default in their robots.txt. Check your site's robots.txt to confirm, and contact Squarespace support if you need to modify it.
What AI crawlers should I allow?
The most important AI crawlers to allow are: GPTBot (ChatGPT), ChatGPT-User (ChatGPT browsing), ClaudeBot (Claude), PerplexityBot (Perplexity AI), Google-Extended (Google AI features), and Amazonbot (Amazon/Alexa).
How do I check if AI crawlers are blocked on my site?
Visit yoursite.com/robots.txt and look for Disallow rules targeting GPTBot, ClaudeBot, or PerplexityBot. Or use Foglift's free scan — it automatically checks AI crawler access as part of the GEO score.
Check your AI crawler status
Instant scan. See which AI crawlers can access your site.
Free AI Crawler Check
Generate your robots.txt
Use our free AI Robots.txt Generator to create an optimized robots.txt with the right AI crawler settings.
AI Robots.txt Generator