
Robots.txt / AI Crawler Allowlist

The server file that tells crawlers, including AI crawlers, which parts of your site they may access. Older configurations may need updating to allow modern AI bots.

Plain Language Definition

Your robots.txt file is a plain-text file at the root of your site that tells crawlers, including AI bots, which parts of your website they may crawl and read. Newer AI crawlers such as PerplexityBot and GPTBot may be blocked by legacy robots.txt configurations that were written with only Google and Bing in mind. Reviewing and updating your allowlist ensures AI systems can access the content you want cited.

Technical Definition

Server-level configuration file containing per-user-agent crawling directives, including allow/disallow rules and crawl-rate parameters. In modern AI-readiness contexts, it is extended to explicitly permit specific AI crawler user-agents (GPTBot, PerplexityBot, ClaudeBot, etc.).
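
As a rough illustration of per-user-agent directives, the sketch below parses a sample robots.txt with Python's standard urllib.robotparser module. The file contents, the example.com URLs, the /private/ path, and the crawl-delay value are all assumptions made up for this example, not a recommended configuration. Crawl-delay is a non-standard directive that not every crawler honors.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one group that explicitly permits three AI
# crawlers (except under /private/), and a catch-all group that blocks
# every other user-agent. Disallow is listed before Allow because
# Python's parser applies the first matching rule in a group.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: PerplexityBot
User-agent: ClaudeBot
Disallow: /private/
Allow: /
Crawl-delay: 5

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(SAMPLE_ROBOTS_TXT.splitlines())

# Ask the parser how each user-agent would be treated for a given URL.
for agent in ("GPTBot", "PerplexityBot", "ClaudeBot", "SomeOtherBot"):
    allowed = parser.can_fetch(agent, "https://example.com/blog/post")
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")

# Crawl-rate parameter for a named agent (None if the group sets none).
print("ClaudeBot crawl delay:", parser.crawl_delay("ClaudeBot"))
```

Rule precedence can differ between parsers, so the result from this module is a useful first check rather than a guarantee of how any particular crawler will behave.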

Why This Matters for AI Search Visibility

If AI crawlers are blocked from your site, those platforms cannot retrieve or cite your content, no matter how well optimized it is. Many sites block AI crawlers inadvertently through broad wildcard rules, such as a catch-all "User-agent: *" group that disallows everything while only a few named search engines are allowed. Auditing your robots.txt for AI crawler permissions is a foundational first step.
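
One way such an audit might look is sketched below, again using Python's standard urllib.robotparser. The helper name blocked_ai_crawlers, the list of crawler names, and both sample files are hypothetical and chosen only to show how a legacy wildcard block can catch AI crawlers unintentionally.

```python
from urllib.robotparser import RobotFileParser

# Assumed list of AI crawler user-agents to audit; extend as needed.
AI_CRAWLERS = ["GPTBot", "PerplexityBot", "ClaudeBot"]


def blocked_ai_crawlers(robots_txt: str, url: str = "/") -> list[str]:
    """Return the AI crawlers that this robots.txt text blocks for `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_CRAWLERS if not parser.can_fetch(agent, url)]


# Hypothetical legacy file: only Google and Bing are allowed, and the
# wildcard group blocks everyone else, including the AI crawlers.
LEGACY_ROBOTS_TXT = """\
User-agent: Googlebot
User-agent: Bingbot
Allow: /

User-agent: *
Disallow: /
"""

# Hypothetical updated file: the AI crawlers get their own allow group.
UPDATED_ROBOTS_TXT = """\
User-agent: Googlebot
User-agent: Bingbot
User-agent: GPTBot
User-agent: PerplexityBot
User-agent: ClaudeBot
Allow: /

User-agent: *
Disallow: /
"""

print("Blocked by legacy file:", blocked_ai_crawlers(LEGACY_ROBOTS_TXT))
print("Blocked by updated file:", blocked_ai_crawlers(UPDATED_ROBOTS_TXT))
```

To audit a live site, the same helper could be fed the text of the site's own robots.txt file instead of the sample strings.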
