    SEO

    robots.txt

    Short definition

    robots.txt is a text file at the root of a website that tells search-engine and AI crawlers which URLs they can or cannot fetch.

    In depth

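    robots.txt is the oldest standard in crawler control: it dates from 1994 and was formalized as RFC 9309 in 2022. It is voluntary. Well-behaved bots obey it and malicious ones ignore it, but every major search engine respects it. For GEO it is critical because AI engines identify their crawlers with their own user-agent tokens (GPTBot, ClaudeBot, PerplexityBot, OAI-SearchBot, etc.). Some vendors also define control-only tokens, such as Google-Extended and Applebot-Extended; these are not separate crawlers but tell Google and Apple whether content fetched by their regular crawlers may be used for AI. Blocking an engine's crawler makes it unlikely that the engine can cite the site; allowing it and listing a sitemap makes pages easier to discover, though it does not guarantee indexing. The file lives at /robots.txt on each host. Compliant crawlers fetch it before requesting other URLs and then cache it for a time rather than re-fetching it on every request.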
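
    The per-crawler behavior described above can be sketched with Python's standard-library parser. This is a minimal illustration, not any engine's actual implementation: the policy text, the `is_allowed` helper, and the example.com URL are all assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: keep GPTBot out, leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder host, not a real deployment.
    for bot in ("GPTBot", "ClaudeBot", "PerplexityBot"):
        print(bot, is_allowed(ROBOTS_TXT, bot, "https://example.com/pricing"))
```

    The parser only answers the question; nothing stops a client from fetching the URL anyway, which is why the protocol depends on crawlers choosing to comply.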

    Example

    A site's robots.txt explicitly allows GPTBot, ClaudeBot, and PerplexityBot and points to /sitemap.xml. Within months, the site's pages appear in answers across all three engines.
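
    A file like the one in this example might look as follows. The sketch embeds a hypothetical robots.txt (the example.com host and the `summarize` helper are assumptions) and uses Python's standard-library parser to confirm that the three crawlers are allowed and that the sitemap reference is picked up.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the example: one group naming three AI crawlers,
# plus a sitemap reference. The Sitemap line takes an absolute URL.
EXAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: PerplexityBot
Allow: /

Sitemap: https://example.com/sitemap.xml
"""


def summarize(robots_txt: str, bots: list[str], url: str):
    """Return ({bot: allowed?}, sitemap URLs) for the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    allowed = {bot: parser.can_fetch(bot, url) for bot in bots}
    # site_maps() returns a list of sitemap URLs, or None if none are listed.
    return allowed, parser.site_maps()


if __name__ == "__main__":
    bots = ["GPTBot", "ClaudeBot", "PerplexityBot"]
    allowed, sitemaps = summarize(EXAMPLE_ROBOTS_TXT, bots, "https://example.com/")
    print(allowed)
    print(sitemaps)
```

    Crawlers not named in the file are also allowed here, because robots.txt permits anything it does not disallow; the explicit group mainly documents intent.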

    Related terms