Short definition
robots.txt is a plain-text file at the root of a website that tells search-engine and AI crawlers which URLs they may and may not fetch.
In depth
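robots.txt is the oldest mechanism for crawler control: the Robots Exclusion Protocol dates to 1994 and was standardized as RFC 9309 in 2022. Compliance is voluntary (well-behaved bots obey it, malicious ones ignore it), but every major engine respects it. For GEO it is critical because AI engines are addressed by their own user-agent tokens, separate from classic search crawlers: GPTBot, ClaudeBot, PerplexityBot, OAI-SearchBot, Google-Extended, Applebot-Extended, and others. Not all of these are crawlers in their own right. Google-Extended and Applebot-Extended are control tokens: the fetching is still done by Googlebot and Applebot, and the token only governs whether that content may be used for AI purposes such as model training. Blocking a crawler that feeds an engine's search or answer index generally means the site cannot be cited by that engine; allowing it and listing a sitemap makes pages easier to discover, though it does not guarantee indexing. The file must live at /robots.txt at the root of each host. Crawlers fetch it before requesting other URLs and usually cache it, so changes may take a day or so to be picked up.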
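Because the file is tied to the root of a host, each subdomain needs its own copy. A minimal Python sketch of how a crawler might derive the robots.txt location from any page URL follows; the function name robots_url and the example.com addresses are illustrative assumptions, not part of any standard API.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the robots.txt URL that governs page_url (hypothetical helper).

    Only the scheme and host (plus port, if any) are kept; the path,
    query string, and fragment of the page are discarded.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_url("https://example.com/blog/post?id=1"))
    print(robots_url("https://shop.example.com/cart"))
```

The two calls resolve to different files, which is why rules published on the main domain do not carry over to a subdomain.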
Example
A site's robots.txt explicitly allows GPTBot, ClaudeBot, and PerplexityBot and includes a Sitemap line pointing to its /sitemap.xml. Within months, the site's pages appear in answers from all three engines.
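One way such a file might look, together with a quick check of what it permits, is sketched below using Python's standard urllib.robotparser module. The domain example.com, the blocked agent ExampleScraperBot, and the helper names check_agents and list_sitemaps are assumptions added for illustration; they are not taken from a real site. Python's parser is also only an approximation of how any particular crawler interprets the rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the example site. example.com and
# ExampleScraperBot are placeholders chosen for this sketch.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: PerplexityBot
Allow: /

User-agent: ExampleScraperBot
Disallow: /

Sitemap: https://example.com/sitemap.xml
"""


def check_agents(robots_txt, agents, url):
    """Return a dict mapping each agent to whether it may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


def list_sitemaps(robots_txt):
    """Return the sitemap URLs declared in the file (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.site_maps() or []


if __name__ == "__main__":
    agents = ["GPTBot", "ClaudeBot", "PerplexityBot", "ExampleScraperBot"]
    for agent, allowed in check_agents(
        ROBOTS_TXT, agents, "https://example.com/guide"
    ).items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
    print(list_sitemaps(ROBOTS_TXT))
```

Running a check like this before deploying a change is a cheap way to catch a rule that accidentally blocks a crawler the site wants to admit. Note that the Sitemap line takes a full URL, not a bare path.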