How to Use This Tool

Creating an `llm.txt` file is simple with our generator. Follow these steps to stop unwanted AI scraping and control how your data is used:

  • Step 1: Configure User-Agent Rules: Each rule block starts with a `User-Agent`. This specifies which AI bot the rules apply to. Use `*` to apply the rules to all LLM bots, or specify a particular bot like `Google-Extended` or `ChatGPT-User`.
  • Step 2: Add Allow & Disallow Paths: In the `Allow` text box, add the URL paths you want LLMs to access. In the `Disallow` box, add paths you want to block. Each path should be on a new line (e.g., `/blog/` or `/private/`).
  • Step 3: Add More Rules (Optional): Click the "Add User-Agent Rule" button to create a new, separate block of rules for a different LLM. This allows for granular control.
  • Step 4: Include Your Sitemap (Optional): If you have an XML sitemap, enable the toggle and paste the full URL. This helps LLMs discover all your allowed content efficiently.
  • Step 5: Copy or Download: Once configured, the complete `llm.txt` content appears on the right. You can click "Copy" to add it to an existing file or "Download llm.txt" to save a new file, which you can then upload to the root directory of your website. A sample of the generated output is shown after these steps.
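
For illustration, here is the kind of file the generator might produce for a site that allows most content, blocks a private area, gives one specific bot its own rules, and lists a sitemap. The bot names, paths, and sitemap URL are placeholders, and the `#` comment syntax follows the familiar `robots.txt` convention:

```
# Rules for all LLM bots
User-Agent: *
Allow: /
Disallow: /private/

# Separate rules for a specific bot
User-Agent: Google-Extended
Disallow: /internal-docs/

# Optional: point bots to your sitemap
Sitemap: https://www.example.com/sitemap.xml
```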

A Guide to llm.txt for AI SEO

What is an llm.txt file?

An `llm.txt` file is a plain text file you place in the root directory of your website. It provides instructions to Large Language Models (LLMs) and AI bots, telling them which parts of your site they are permitted to access for training and content generation purposes. It's an emerging standard, similar to `robots.txt`, designed to give site owners critical control over how their content is used by AI.
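
As a concrete illustration (the domain is a placeholder), a site served at `https://www.example.com` would expose the file at `https://www.example.com/llm.txt`, and a minimal file that opts the entire site out of AI training could be as short as this:

```
User-Agent: *
Disallow: /
```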

Key Directives Explained

  • User-Agent: This directive specifies the AI bot the following rules apply to. A `User-Agent: *` line means the rules are for all bots that recognize this standard.
  • Allow: This directive explicitly grants access to a specific directory or path. It can be used to override a broader `Disallow` rule, giving you fine-tuned control (see the example after this list).
  • Disallow: This is the most common directive. It tells an AI bot not to crawl or process a specific URL path, file, or directory for its training data.
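
As a sketch of how these directives combine (the bot name and paths are hypothetical), the following block keeps a bot out of `/docs/` while still allowing a public subfolder, because the more specific `Allow` overrides the broader `Disallow`:

```
User-Agent: ChatGPT-User
Disallow: /docs/
Allow: /docs/public/
```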

Frequently Asked Questions (FAQs)

Is `llm.txt` the same as `robots.txt`?

They are similar but serve different purposes. `robots.txt` is a long-standing standard for controlling search engine crawlers (like Googlebot) for indexing purposes. `llm.txt` is a newer, proposed standard focused specifically on controlling data collection by AI and LLM bots for training purposes. While some bots may respect `robots.txt` rules, `llm.txt` offers more explicit and dedicated control for AI.
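
As a rough sketch of how the two files divide responsibilities (bot names and paths are illustrative), both can sit in your site's root directory, with `robots.txt` steering search indexing and `llm.txt` steering AI training access:

```
# robots.txt — controls search engine crawling
User-agent: Googlebot
Disallow: /admin/

# llm.txt — controls AI/LLM training access
User-Agent: ChatGPT-User
Disallow: /research/
```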

Which LLMs currently support this file?

The `llm.txt` standard is still emerging, and adoption is growing. Major AI developers have started to look for and respect these files. By implementing `llm.txt` now, you are future-proofing your site and signaling your intent clearly, even before official support becomes universal.

Do I really need an `llm.txt` file?

If you want to control how AI models use your content, then yes. Without an `llm.txt` file, you have no dedicated way to tell AI bots that your proprietary articles, user data, or private sections should not be ingested into training datasets. It's a critical tool for data privacy and content ownership in the age of AI.

Is my information saved by this tool?

No. This tool is built for privacy. We do not save, store, or view any information you enter. All processing happens directly in your browser, and your data never leaves your device.