llms.txt

llms.txt is a proposed standard: a Markdown file placed at a website's root (/llms.txt) that gives large language models a curated summary of the site and links to its most important documents, so that they can understand the site at inference time. Jeremy Howard of Answer.AI proposed it on September 3, 2024. Unlike robots.txt, it does not control crawler access; its goal is to gather LLM-friendly content in one place and point models to it.

  • llms.txt is a proposed standard — a Markdown file at the site root /llms.txt that helps LLMs grasp the essentials quickly instead of parsing sprawling HTML.
  • It was proposed by Jeremy Howard of Answer.AI on September 3, 2024, and is designed to coexist with robots.txt and sitemap.xml.
  • The format is a fixed Markdown spec: an H1 title (the only required element), a blockquote summary, and H2 sections containing curated link lists.
  • It serves a different purpose than robots.txt (controlling crawler access) or sitemap.xml (listing pages for indexing): llms.txt is used mainly at inference time, when a user asks for information.
  • llms.txt is an informal proposal, and major search and AI providers such as Google do not guarantee official support — so it does not replace standard SEO foundations like robots.txt, sitemaps, or structured data.

Overview and background

llms.txt is a proposed standard: a Markdown file placed at a website's root (/llms.txt) that helps large language models (LLMs) understand a site's content accurately and efficiently at inference time. Jeremy Howard of Answer.AI introduced it on September 3, 2024, via llmstxt.org.

The motivation behind the proposal is the limited context window of LLMs. According to the official proposal, models increasingly rely on website information, yet most sites are far too large to fit into a context window all at once, and converting complex HTML — laden with navigation, ads, and JavaScript — into clean text that an LLM can read is both difficult and error-prone. llms.txt attempts to ease this by gathering concise, expert-level information in a single place.

File format

The official spec is written in Markdown and fixes the order of its elements. After an optional BOM (byte-order mark), the file contains, in order: an H1 holding the project or site name (the only required element); a blockquote summarizing the essentials; optional Markdown body content of any kind except headings; and zero or more H2 "file list" sections holding URLs to additional information. Each file-list entry is a Markdown link in the form [title](URL), optionally followed by a colon and a description.

The basic format from the official proposal looks like this:

# Title

> Optional description goes here

Optional details go here

## Section name

- [Link title](https://link_url): Optional link details

## Optional

- [Link title](https://link_url)

Here the ## Optional section carries special meaning: it collects secondary information that can be skipped when a shorter context is needed.
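
To make the structure concrete, here is a minimal Python sketch that reads the basic format above into its parts and then drops the ## Optional section for a shorter context. The names parse_llms_txt, without_optional, LlmsTxt, and Link are hypothetical helpers made up for this illustration, not part of any official tooling, and the parser handles only the simple line-per-entry layout shown above.

```python
import re
from dataclasses import dataclass, field

# Matches "- [title](url)" with an optional ": description" suffix.
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)

SAMPLE = """# Title

> Optional description goes here

Optional details go here

## Section name

- [Link title](https://link_url): Optional link details

## Optional

- [Link title](https://link_url)
"""


@dataclass
class Link:
    title: str
    url: str
    desc: str = ""


@dataclass
class LlmsTxt:
    title: str = ""
    summary: str = ""
    details: list[str] = field(default_factory=list)
    sections: dict[str, list[Link]] = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    """Split an llms.txt document into title, summary, details, and H2 link lists."""
    doc = LlmsTxt()
    current = None  # name of the H2 section being read, if any
    for raw in text.lstrip("\ufeff").splitlines():  # tolerate a leading BOM
        line = raw.strip()
        if not line:
            continue
        if line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif current is not None:
            # Inside a file-list section only link entries are collected.
            match = LINK_RE.match(line)
            if match:
                doc.sections[current].append(
                    Link(match["title"], match["url"], match["desc"] or "")
                )
        elif line.startswith(">"):
            doc.summary = (doc.summary + " " + line[1:].strip()).strip()
        else:
            doc.details.append(line)
    if not doc.title:
        raise ValueError("llms.txt needs an H1 title (the only required element)")
    return doc


def without_optional(doc: LlmsTxt) -> LlmsTxt:
    """Return a copy that omits the skippable '## Optional' section."""
    kept = {name: links for name, links in doc.sections.items() if name != "Optional"}
    return LlmsTxt(doc.title, doc.summary, list(doc.details), kept)
```

Calling without_optional(parse_llms_txt(SAMPLE)) leaves only the "Section name" link list, which is the behavior the spec intends when a shorter context is needed.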

As a real-world example, Anthropic provides an llms.txt for its developer documentation (platform.claude.com/llms.txt, redirected from docs.anthropic.com/llms.txt). It opens with the H1 # Anthropic Developer Documentation, followed by a summary paragraph and H2 sections listing the .md versions of each document — following the spec exactly.

For reference, the official proposal includes a second recommendation: for any page that may be useful to an LLM, also provide a clean Markdown version at the same URL with .md appended (URLs without a filename get index.html.md). The FastHTML project is cited as a leading example that follows both recommendations.
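
The URL rule in this second recommendation can be sketched in a few lines of Python. The function name markdown_url is hypothetical, and treating a path that ends in a slash as "a URL without a filename" is an assumption made for this illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_url(page_url: str) -> str:
    """Derive the Markdown companion URL for a page (illustrative helper)."""
    parts = urlsplit(page_url)
    path = parts.path or "/"
    if path.endswith("/"):
        # Assumption: a trailing slash means the URL has no filename.
        path += "index.html.md"
    else:
        path += ".md"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(markdown_url("https://example.com/docs/intro.html"))
print(markdown_url("https://example.com/docs/"))
```

Under these assumptions the first call yields https://example.com/docs/intro.html.md and the second yields https://example.com/docs/index.html.md.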

How it differs from robots.txt and sitemap.xml

llms.txt is designed to coexist with existing web standards rather than replace them. The three files have clearly distinct purposes.

| Aspect | llms.txt | robots.txt | sitemap.xml |
| --- | --- | --- | --- |
| Primary purpose | Curate and present a site's key information and documents to LLMs | Convey rules allowing or blocking crawler access | Provide a full list of pages to be indexed |
| Primary consumer | LLMs and AI agents | Search and crawling bots | Search engine indexers |
| Format | Markdown | Plain-text rules (User-agent/Allow/Disallow) | XML |
| External site links | Can be included | Not applicable | Generally not included |
| When it's used | Mainly at inference time (on user request) | At crawl time | At index time |
| Location | Root /llms.txt | Root /robots.txt | Usually root /sitemap.xml |

The official proposal explains why sitemap.xml cannot serve as a substitute for llms.txt: sitemaps often don't include LLM-friendly versions of pages, don't list helpful external URLs, and, taken together, are too large and full of unnecessary detail to fit into a context window. robots.txt and llms.txt both follow the root-path convention, but their roles diverge — the former governs access, the latter delivers information.
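
The "governs access" half of that contrast can be illustrated with Python's standard urllib.robotparser module. The robots.txt rules, the example.com URLs, and the ExampleBot user agent below are all made up for this sketch. The point is only that robots.txt answers "may this URL be fetched?", while llms.txt carries no such rules and simply lists content.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: access rules, no information about the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

rules = RobotFileParser()
rules.parse(ROBOTS_TXT.splitlines())


def may_fetch(url: str, agent: str = "ExampleBot") -> bool:
    """Ask the parsed robots.txt rules whether `agent` may fetch `url`."""
    return rules.can_fetch(agent, url)


print(may_fetch("https://example.com/llms.txt"))
print(may_fetch("https://example.com/private/report.html"))
```

With these rules, the llms.txt URL is allowed and the /private/ URL is disallowed. Nothing in llms.txt itself changes either answer.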

Current state and limitations

llms.txt is an informal proposal, and the official proposal itself describes it as "a specification open to community feedback." Version control and public discussion happen in a GitHub repository, and directories such as llmstxt.site and directory.llmstxt.cloud collect adopting sites. Generation tooling has also emerged, including plugins for VitePress, Docusaurus, and Drupal.

Unlike robots.txt or sitemaps, however, llms.txt comes with no guarantee that major search and AI providers read it or factor it into citations. It is therefore safer to treat llms.txt not as a replacement for standard SEO and technical foundations, but as a complementary option for serving AI-friendly documentation. It fits most naturally where documentation and API references are routinely handed to an LLM, as on developer documentation sites.

Implementation checklist

  • Place a Markdown file at the root path /llms.txt.
  • Put the site or project name in the H1, with a key summary in the blockquote (>) right below it.
  • Group documents into H2 sections, writing each entry as [title](URL): description.
  • Collect skippable, secondary material under an ## Optional section.
  • Where possible, also serve a clean Markdown version of key pages at the same URL with .md appended.
  • Avoid vague phrasing and unexplained jargon, and attach a concise description to every link.
  • Keep existing standards such as robots.txt, sitemap.xml, and structured data in place — llms.txt is a complement, not a substitute.
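
As a rough illustration of the checklist, the following Python sketch assembles an llms.txt body from structured data. The function name render_llms_txt, its parameters, and the example.com links are hypothetical. Writing the result to the site root and keeping robots.txt and sitemap.xml in place remain separate steps.

```python
def render_llms_txt(title, summary, sections, optional=()):
    """Build llms.txt text: H1, blockquote summary, then H2 link lists.

    `sections` maps a section name to (link_title, url, description) tuples;
    `optional` holds entries for the skippable '## Optional' section.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    groups = dict(sections)
    if optional:
        groups["Optional"] = list(optional)  # secondary material goes last
    for name, links in groups.items():
        lines.append(f"## {name}")
        lines.append("")
        for link_title, url, desc in links:
            entry = f"- [{link_title}]({url})"
            if desc:
                entry += f": {desc}"
            lines.append(entry)
        lines.append("")
    return "\n".join(lines)


text = render_llms_txt(
    "Example Docs",
    "Hypothetical API documentation.",
    {"Docs": [("Quickstart", "https://example.com/quickstart.md", "Install and first call")]},
    optional=[("Changelog", "https://example.com/changelog.md", "")],
)
print(text)
```

Generating the file from one data structure keeps the ordering rules (H1 first, blockquote next, Optional last) in a single place instead of relying on hand editing.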

References