
Robots.txt for AI Crawlers: How to Allow GPTBot, ClaudeBot, and PerplexityBot

Your robots.txt file controls which AI crawlers can access your site. Here's how to configure it for maximum AI visibility.

10 min read

The robots.txt file tells web crawlers what they can and can't access on your website. Traditionally, this meant Googlebot and Bingbot. But now there's a new generation of AI crawlers — and many websites are blocking them without realizing it.

If AI crawlers can't access your site, your business won't appear in AI-generated answers from ChatGPT, Claude, Perplexity, or Google AI Overviews. This guide covers every AI crawler you need to know about and how to configure robots.txt correctly.

Check your robots.txt instantly

Foglift scans your robots.txt and checks every AI crawler listed below. Free, no signup required.

Every AI Crawler You Need to Know About

Here are the major AI crawlers to know about as of 2026:

| Crawler | Company | Powers | User-Agent String |
|---|---|---|---|
| GPTBot | OpenAI | ChatGPT training and knowledge | GPTBot |
| ChatGPT-User | OpenAI | ChatGPT web browsing | ChatGPT-User |
| ClaudeBot | Anthropic | Claude AI training and search | ClaudeBot |
| PerplexityBot | Perplexity AI | Perplexity search answers | PerplexityBot |
| Google-Extended | Google | Gemini, AI Overviews | Google-Extended |
| Amazonbot | Amazon | Alexa answers, Amazon AI | Amazonbot |
| anthropic-ai | Anthropic | Claude training data | anthropic-ai |
| cohere-ai | Cohere | Cohere AI products | cohere-ai |

For most businesses that want maximum AI visibility, use this robots.txt:

# Standard search engine crawlers
User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

# AI crawlers — allow for AI search visibility
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Amazonbot
Allow: /

# Default — block private/admin areas, allow everything else
User-agent: *
Disallow: /admin/
Disallow: /api/
Disallow: /auth/
Allow: /

Sitemap: https://yoursite.com/sitemap.xml
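Before deploying, you can sanity-check a configuration like the one above with Python's standard-library urllib.robotparser. A minimal sketch, using an abridged version of the file (yoursite.com and the URLs are placeholders):

```python
from urllib.robotparser import RobotFileParser

# Abridged version of the recommended robots.txt above.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: *
Disallow: /admin/
Allow: /
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# AI crawlers with an explicit Allow group can fetch any page...
print(rp.can_fetch("GPTBot", "https://yoursite.com/blog/post"))          # True
# ...while other crawlers fall through to the wildcard group and
# respect its Disallow rules.
print(rp.can_fetch("SomeOtherBot", "https://yoursite.com/admin/login"))  # False
```

Note that urllib.robotparser evaluates rules in file order, so listing the Disallow lines before the blanket Allow keeps the result consistent across simpler parsers as well.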

How to Selectively Allow or Block AI Crawlers

Maybe you want ChatGPT visibility but don't want your content used for AI training. Here's a selective approach:

# Allow ChatGPT browsing (real-time answers)
User-agent: ChatGPT-User
Allow: /

# Block GPTBot (training data)
User-agent: GPTBot
Disallow: /

# Allow Perplexity (search answers)
User-agent: PerplexityBot
Allow: /

# Block Google AI training but allow search
User-agent: Google-Extended
Disallow: /

Note: Blocking training crawlers means AI models won't have up-to-date knowledge of your site. They might still cite you from cached data, but it won't be current.
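A selective configuration is also easy to verify programmatically. A short sketch with urllib.robotparser, using the browsing-allowed/training-blocked rules above (the page URL is a placeholder):

```python
from urllib.robotparser import RobotFileParser

# The selective rules from above: real-time browsing allowed,
# training crawler blocked.
SELECTIVE = """\
User-agent: ChatGPT-User
Allow: /

User-agent: GPTBot
Disallow: /
"""

rp = RobotFileParser()
rp.parse(SELECTIVE.splitlines())

page = "https://yoursite.com/pricing"       # hypothetical URL
print(rp.can_fetch("ChatGPT-User", page))   # True  (browsing)
print(rp.can_fetch("GPTBot", page))         # False (training)
```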

How to Edit robots.txt on Popular Platforms

WordPress

Edit via Yoast SEO plugin: SEO → Tools → File editor → robots.txt. Or create/edit the file at your site root.

Squarespace

Go to Settings → SEO → scroll to "Additional Robots.txt Rules" and add your AI crawler rules there.

Wix

Go to Dashboard → Settings → SEO (Google) → SEO Tools → Robots.txt Editor.

Shopify

Shopify auto-generates robots.txt. Customize it by adding a robots.txt.liquid template to your theme, or use a Shopify robots.txt app.

Next.js / Vercel

Create a robots.ts file in your app/ directory or add a static robots.txt in public/.

Common Mistakes

  1. Using a wildcard Disallow that blocks AI crawlers — a "User-agent: *" group with "Disallow: /" blocks everything, including AI crawlers.
  2. Not checking platform defaults — Some CMS platforms add AI crawler blocks automatically. Always check after setup.
  3. Blocking GPTBot but expecting ChatGPT visibility — GPTBot is how ChatGPT learns about your site. Without it, you rely only on Bing indexing.
  4. Forgetting to add a sitemap reference — Always include Sitemap: https://yoursite.com/sitemap.xml at the end of robots.txt.
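Mistake 1 is easy to demonstrate: when a site has only a wildcard group, every AI crawler inherits the blanket Disallow. A quick check with urllib.robotparser:

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /"])

# With no per-crawler group, GPTBot falls through to the wildcard
# rule and is blocked along with everything else.
print(rp.can_fetch("GPTBot", "https://yoursite.com/"))  # False
```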

How to Verify Your Configuration

  1. Visit yoursite.com/robots.txt in your browser
  2. Check that GPTBot, ClaudeBot, and PerplexityBot are not blocked
  3. Run a free Foglift scan — it checks all AI crawlers and shows exactly which are blocked
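Step 2 can be automated. A small Python sketch that parses a robots.txt body and reports access for each AI crawler in the table above (the example robots.txt and default URL are placeholders):

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = [
    "GPTBot", "ChatGPT-User", "ClaudeBot", "anthropic-ai",
    "PerplexityBot", "Google-Extended", "Amazonbot", "cohere-ai",
]

def check_ai_crawlers(robots_txt: str, url: str = "https://yoursite.com/") -> dict:
    """Map each AI user-agent to True (allowed) or False (blocked) for url."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return {ua: rp.can_fetch(ua, url) for ua in AI_CRAWLERS}

# Example: a robots.txt that blocks only GPTBot.
status = check_ai_crawlers("User-agent: GPTBot\nDisallow: /\n")
print(status["GPTBot"])     # False
print(status["ClaudeBot"])  # True
```

To check a live site, download the file first — for example with urllib.request.urlopen("https://yoursite.com/robots.txt").read().decode() — and pass the text to check_ai_crawlers.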

Frequently Asked Questions

What is GPTBot?

GPTBot is OpenAI's web crawler that indexes content for ChatGPT and other OpenAI products. Its user-agent string is "GPTBot." Allowing GPTBot access means your content can appear in ChatGPT-generated answers.

Should I allow AI crawlers on my website?

For most businesses, yes. Allowing AI crawlers means your site can appear in AI-generated answers (ChatGPT, Perplexity, Google AI Overviews). Blocking them means you're invisible to the growing number of people using AI search.

Does Squarespace block AI crawlers?

Yes, some Squarespace sites block AI crawlers by default in their robots.txt. Check your site's robots.txt to confirm, and use the "Additional Robots.txt Rules" setting described above if you need to change it.

What AI crawlers should I allow?

The most important AI crawlers to allow are: GPTBot (ChatGPT), ChatGPT-User (ChatGPT browsing), ClaudeBot (Claude), PerplexityBot (Perplexity AI), Google-Extended (Google AI features), and Amazonbot (Amazon/Alexa).

How do I check if AI crawlers are blocked on my site?

Visit yoursite.com/robots.txt and look for Disallow rules targeting GPTBot, ClaudeBot, or PerplexityBot. Or use Foglift's free scan — it automatically checks AI crawler access as part of the GEO score.

Check your AI crawler status

Instant scan. See which AI crawlers can access your site.

Free AI Crawler Check

Generate your robots.txt

Use our free AI Robots.txt Generator to create an optimized robots.txt with the right AI crawler settings.

AI Robots.txt Generator

Free tool

Check your website's SEO + GEO score

Scan any URL in 30 seconds. See scores for SEO, AI search readiness, performance, security, and accessibility.

Scan Your Site Free

No signup. 5 free scans/day. Results in 30 seconds.