Free AI Tool

robots.txt Generator.
Control AI Crawlers.

Block AI training bots. Allow AI search bots. Generate a clean robots.txt in seconds — tell GPTBot, CCBot, and Google-Extended to stay out while keeping Perplexity and ChatGPT browsing in. No signup. Preview updates live.

Free tool. No signup. Copy or download instantly.

Why your robots.txt needs an AI update

  • 10+ AI crawlers you can control with one file
  • 2 categories: training bots vs. search bots
  • < 2 min to generate, download and deploy
  • 100% free: no account, no limits

What is robots.txt — and why do AI bots change everything?

robots.txt is a text file at the root of your domain (yourdomain.com/robots.txt) that tells crawlers which pages they may and may not access. Well-behaved bots respect it. AI has created two distinct crawler categories: training bots (scrape your content to build AI models) and search bots (index your content so AI can cite you in live answers). This generator helps you manage both — block one, allow the other.

What robots.txt does

  • Blocks AI training bots from using your content to build commercial models.
  • Allows AI search bots to index your site so you appear in AI-generated answers.
  • Restricts private areas (admin, login, checkout) from all crawlers.
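
As a rough illustration of the three jobs listed above, the sketch below assembles a robots.txt and checks it with Python's standard-library parser. The bot names are taken from this page; the helper `build_robots_txt`, its parameters, and the example bot name `SomeOtherBot` are hypothetical and are not the generator's actual code.

```python
from urllib.robotparser import RobotFileParser

# Bot names come from the lists on this page; everything else here is
# a hypothetical sketch, not the generator's real implementation.
TRAINING_BOTS = ["GPTBot", "Google-Extended", "CCBot", "anthropic-ai"]
SEARCH_BOTS = ["PerplexityBot", "ChatGPT-User", "OAI-SearchBot"]
PRIVATE_PATHS = ["/admin/", "/login/", "/checkout/"]


def build_robots_txt(blocked_bots, allowed_bots, private_paths):
    """Return robots.txt text with one group per policy."""
    private_rules = [f"Disallow: {path}" for path in private_paths]
    groups = [
        # Training bots: shut out of the whole site.
        [f"User-agent: {bot}" for bot in blocked_bots] + ["Disallow: /"],
        # Search bots: private paths first, then everything else allowed.
        [f"User-agent: {bot}" for bot in allowed_bots]
        + private_rules
        + ["Allow: /"],
        # Every other crawler: only the private paths are off limits.
        ["User-agent: *"] + private_rules,
    ]
    # Groups are separated by a blank line.
    return "\n\n".join("\n".join(group) for group in groups) + "\n"


robots_txt = build_robots_txt(TRAINING_BOTS, SEARCH_BOTS, PRIVATE_PATHS)
print(robots_txt)

# Check the result with the standard-library parser.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
print(parser.can_fetch("GPTBot", "https://yourdomain.com/blog/"))
print(parser.can_fetch("PerplexityBot", "https://yourdomain.com/blog/"))
print(parser.can_fetch("PerplexityBot", "https://yourdomain.com/admin/"))
```

The Disallow lines are placed before `Allow: /` on purpose: some parsers apply the first matching rule and others the most specific one, and this ordering gives the same result under both.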

What robots.txt does not do

  • It does not block bad-faith scrapers: malicious actors ignore it.
  • It does not guarantee AI citations or prevent use of already-scraped content.
  • It does not replace llms.txt. Use both for full GEO (generative engine optimization) control.

Basics

Enter your domain — it will appear as a reference comment in the generated file.

Policy preset

Choose a preset or configure each bot manually below.

AI Training Bots

These bots scrape your content to build AI training datasets. Block them to protect your content from commercial model training.

GPTBot OpenAI

Scrapes content to train ChatGPT and OpenAI models.

Google-Extended Google

Controls whether content crawled by Google may be used to train Gemini and other Google AI models. Different from Googlebot.

CCBot Common Crawl

Builds open datasets used to train many LLMs including GPT.

anthropic-ai Anthropic

Used to train Claude AI models.

Bytespider ByteDance

ByteDance crawler for AI training data collection.

Diffbot Diffbot

AI-powered data extraction and training dataset builder.

AI Search & Answer Bots

These bots index your content so AI engines can cite you in live answers. Allow them to improve your GEO visibility.

PerplexityBot Perplexity

Powers AI-generated answers in Perplexity search.

ChatGPT-User OpenAI

ChatGPT browsing mode — real-time web access for answers.

OAI-SearchBot OpenAI

OpenAI search indexing for AI-powered answer engines.

Meta-ExternalAgent Meta

Meta AI search, discovery, and answer generation.

Amazonbot Amazon

Powers Amazon AI features, Alexa, and product answers.

YouBot You.com

You.com AI search engine crawler and indexer.

Classic Search Bots

Traditional search engine crawlers. Keeping them allowed is essential for SEO — only block with good reason.

Googlebot Google

Google Search crawler — essential for SEO rankings.

Bingbot Microsoft

Bing Search and Microsoft AI search indexer.

Applebot Apple

Powers Apple Search, Siri suggestions, and Spotlight.

DuckDuckBot DuckDuckGo

DuckDuckGo privacy-focused search indexing.

Blocked Paths

Paths blocked from all crawlers via User-agent: *. Use for private, admin, and non-indexable areas.
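
One caveat is worth checking in any generated file. Under the Robots Exclusion Protocol (RFC 9309), a crawler follows the group that names it and falls back to User-agent: * only when no named group matches. The sketch below shows the effect with Python's standard-library parser; `SomeOtherBot` is a hypothetical bot name and the rules are illustrative only.

```python
from urllib.robotparser import RobotFileParser

# PerplexityBot has its own group, so it does not fall back to "*".
robots_txt = """\
User-agent: PerplexityBot
Allow: /

User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

url = "https://yourdomain.com/admin/"
named_bot_allowed = parser.can_fetch("PerplexityBot", url)
other_bot_allowed = parser.can_fetch("SomeOtherBot", url)

print(named_bot_allowed)  # the named group applies, so /admin/ is allowed
print(other_bot_allowed)  # no named group, so the "*" rules block /admin/
```

If a bot with its own group should also stay out of private areas, the Disallow lines need to be repeated inside that group.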

robots.txt vs llms.txt

robots.txt (Enforcement): Controls crawler access. Blocks or allows specific bots and paths. Honoured by well-behaved crawlers.

llms.txt (Advisory): Guides AI prioritization. Suggests which pages are most authoritative. No enforcement mechanism.

robots.txt is enforcement: it controls crawler access. llms.txt is advisory: it guides AI prioritization. Use robots.txt to block training bots and restrict private pages. Use llms.txt to tell AI search bots which pages represent you best. For full GEO control you need both.


How to deploy robots.txt

  1. Choose a policy preset or configure bots manually. Add any private paths to block.
  2. Copy or download the generated robots.txt file.
  3. Place the file at the root of your domain: https://yourdomain.com/robots.txt. In Astro and Next.js projects it goes in the public/ folder; on static hosts such as Netlify and Cloudflare Pages it must end up at the top level of the published output.

Where to place the file

The file must be reachable at https://yourdomain.com/robots.txt, never inside a subfolder. In Astro and Next.js projects it goes into the public/ directory; on Netlify and Cloudflare Pages it must sit at the top level of the published output. On WordPress, place it in the WordPress installation root.
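
To make the "domain root" rule concrete, here is a small sketch that derives the robots.txt location governing any page URL. The helper name `robots_url` and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the robots.txt URL that governs page_url.

    Only the scheme and host are kept; the path, query and fragment
    are dropped because the file always lives at the root.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://yourdomain.com/blog/post-1?ref=x"))
print(robots_url("https://shop.yourdomain.com/cart"))
```

The second call shows that a subdomain resolves to its own robots.txt, so each subdomain you serve needs its own copy of the file.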

Want AI engines to actually recommend you?

robots.txt is one piece of the puzzle. EchoDestiny monitors how ChatGPT, Perplexity, Gemini and Claude talk about your brand — and turns the findings into prioritised actions.

Frequently asked questions about robots.txt and AI crawlers

Is this robots.txt generator really free?

Yes — completely free. No signup, no email, no account required. Everything is generated inside your browser. Nothing you type is transmitted to any server.

What is the difference between AI training bots and AI search bots?

AI training bots (GPTBot, Google-Extended, CCBot, anthropic-ai) scrape your content to build training datasets — they make the AI model smarter, but do not help you get cited in live answers. AI search bots (PerplexityBot, ChatGPT-User, OAI-SearchBot) index your content so AI engines can reference you in real-time answers. You generally want to block training bots and allow search bots.

Does blocking GPTBot prevent ChatGPT from citing my site?

No — and this is an important distinction. GPTBot is the training crawler. Blocking it prevents your content from entering OpenAI's training datasets. ChatGPT-User and OAI-SearchBot are separate crawlers used for live browsing and search indexing. Blocking GPTBot does not block ChatGPT from citing you in real-time answers.
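
A quick way to see the distinction is to parse a file that blocks only GPTBot and ask what each OpenAI user agent may fetch. This is a sketch using Python's standard-library parser; the URL is a placeholder, and real crawlers may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Only the training crawler is named; nothing else is restricted.
robots_txt = """\
User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

page = "https://yourdomain.com/pricing/"
access = {
    bot: parser.can_fetch(bot, page)
    for bot in ("GPTBot", "ChatGPT-User", "OAI-SearchBot")
}

for bot, allowed in access.items():
    print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

Only GPTBot is reported as blocked; the other two agents have no matching group and no fallback rules, so they remain allowed.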

What does Google-Extended do? Is it the same as Googlebot?

No. Googlebot crawls content for Google Search rankings. Google-Extended is a separate robots.txt token rather than a separate crawler: Google's existing crawlers fetch the pages, and the token controls whether that content may be used to train Gemini and other Google AI products. Blocking Google-Extended has no effect on your Google Search rankings.

What is the difference between robots.txt and llms.txt?

robots.txt is enforcement — it tells crawlers what they may and may not access. llms.txt is advisory — it suggests which pages are most authoritative for AI context. robots.txt controls access; llms.txt guides prioritization. You need both for full GEO visibility management.

Will blocking AI training bots affect my Google rankings?

No. Blocking AI training bots (GPTBot, Google-Extended, CCBot, anthropic-ai) does not affect Google Search rankings. These use different user agents from Googlebot, which handles Search. Classic SEO is entirely unaffected.

Should I block all AI bots to protect my content?

Blocking all AI bots protects your content from training datasets, but also prevents AI search engines (Perplexity, ChatGPT browsing, Meta AI) from indexing and citing you. If GEO visibility matters to your business, block only training bots and keep search bots allowed. The "Maximum GEO" preset does exactly this.

What paths should I add to the Blocked Paths section?

Add paths for content that should not be indexed by any crawler: /admin/, /login/, /checkout/, /api/ (if it exposes sensitive data), /wp-admin/ for WordPress sites, and any staging or private areas. Never block your main content pages — that will hurt both SEO and GEO visibility.

Where do I place the robots.txt file?

At https://yourdomain.com/robots.txt: always at the domain root, never in a subfolder. In Astro and Next.js projects, place it in the public/ folder; on Netlify and Cloudflare Pages it must sit at the top level of the published output. On WordPress, place it in the installation root directory.

What is Crawl-delay and should I use it?

Crawl-delay tells bots how many seconds to wait between requests. Useful if crawlers are causing server load. It is not honoured by Googlebot (Google has its own crawl rate settings), but it is respected by Bingbot, Yandex, and several other crawlers.
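
For illustration, the sketch below sets a Crawl-delay for Bingbot and reads it back with Python's standard-library parser. The 5-second value and the /admin/ rule are arbitrary examples, not recommendations.

```python
from urllib.robotparser import RobotFileParser

# Crawl-delay belongs inside the group of the bot it applies to.
robots_txt = """\
User-agent: Bingbot
Crawl-delay: 5
Disallow: /admin/

User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

bing_delay = parser.crawl_delay("Bingbot")
default_delay = parser.crawl_delay("Googlebot")

print(bing_delay)     # seconds requested in the Bingbot group
print(default_delay)  # None: the "*" group sets no delay
```

The parser only reports what the file requests. Whether a given crawler actually waits is up to that crawler.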