# AI Readiness Standards Guide

> A comprehensive guide to the four key signals that make a website discoverable and readable by AI search engines, LLMs, and autonomous agents.

**Source:** [https://ai-ready.space/ai-readiness](https://ai-ready.space/ai-readiness)  
**Published by:** [AI Ready Index](https://ai-ready.space)

---

## Overview

AI search engines and LLM crawlers look for specific files and configurations to efficiently index, summarise, and cite website content. Each of the four signals below contributes **25 points** to a site's AI Readiness Score (maximum: 100/100).

| Signal | Max Points | Standard | Location |
| --- | --- | --- | --- |
| `llms.txt` | 25 | Proposed (llmstxt.org) | `/llms.txt` |
| `llms-full.txt` | 25 | Proposed (llmstxt.org) | `/llms-full.txt` |
| `robots.txt` | 25 | RFC 9309 | `/robots.txt` |
| `sitemap.xml` | 25 | sitemaps.org | `/sitemap.xml` |
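
The scoring rule above is simple addition. Here is a minimal Python sketch, assuming each signal is scored purely on presence; how AI Ready Index actually detects or validates each file is not specified here. The names `readiness_score` and `present` are hypothetical and used only for illustration:

```python
# Signal names mirror the table above; each present signal is worth 25 points.
SIGNALS = ("llms.txt", "llms-full.txt", "robots.txt", "sitemap.xml")
POINTS_PER_SIGNAL = 25


def readiness_score(present: set[str]) -> int:
    """Return the 0-100 score for the set of signals found on a site."""
    # Unknown names in `present` are ignored; only the four signals count.
    return sum(POINTS_PER_SIGNAL for name in SIGNALS if name in present)


print(readiness_score({"robots.txt", "sitemap.xml"}))
```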

---

## 1. llms.txt (+25 Points)

**Standard:** Proposed by llmstxt.org  
**Location:** `https://yourdomain.com/llms.txt`

### What it is

A concise, structured Markdown file placed in the root of a website. It provides a curated summary of the site's purpose, key sections, and important links — specifically designed for LLMs to ingest in a single request.

### Why it matters

Traditional search crawlers index full HTML pages, including markup for styles, scripts, and media. LLM-based tools work better with concise, curated text. An `llms.txt` file:
- Reduces token overhead for AI agents
- Gives models an accurate, first-party description of your brand
- Helps AI cite and summarise your site more accurately in responses

### Format

```markdown
# Website Name

> One-line description of the website and its purpose.

## Key Sections

- [Page Title](/path): Short description of what this page contains.
- [API Docs](/docs/api): Developer reference for integrating the service.
- [About](/about): Who we are and what we do.

## Optional

- [GitHub](https://github.com/example): Source code repository.
```
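
If the site's page list already lives in code or a CMS, the file can be generated rather than written by hand. The sketch below assumes the layout shown above (H1 title, blockquote summary, H2 sections of link lists). `render_llms_txt` and its parameters are hypothetical helpers, not part of the llms.txt proposal:

```python
def render_llms_txt(
    name: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render an llms.txt document from a site name, summary, and link sections."""
    lines = [f"# {name}", "", f"> {summary}"]
    for heading, links in sections.items():
        lines += ["", f"## {heading}", ""]
        # Each link is a (title, url, description) tuple.
        lines += [f"- [{title}]({url}): {desc}" for title, url, desc in links]
    return "\n".join(lines) + "\n"


example = render_llms_txt(
    "Website Name",
    "One-line description of the website and its purpose.",
    {"Key Sections": [("About", "/about", "Who we are and what we do.")]},
)
print(example)
```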

---

## 2. llms-full.txt (+25 Points)

**Standard:** Proposed by llmstxt.org  
**Location:** `https://yourdomain.com/llms-full.txt`

### What it is

The full, comprehensive textual content of a website's key pages, compiled into a single Markdown file. While `llms.txt` is a concise index, `llms-full.txt` contains the actual content.

### Why it matters

- Allows AI systems to read your entire documentation or site in **one network request**
- Eliminates the need for AI agents to navigate and parse dozens of individual HTML pages
- Ideal for documentation-heavy sites, knowledge bases, and content-rich domains
- Can reduce crawl latency and improve citation quality in AI responses

### Format

```markdown
# Website Name — Full Context

## About

Complete description of the company, product, or service...

## Getting Started

Full content of the Getting Started page...

## API Reference

Full content of the API Reference page...
```
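
One way to produce this file is to concatenate page content that is already available as Markdown. The sketch below assumes exactly that; converting HTML pages to Markdown is out of scope. `build_llms_full`, `title`, and `pages` are hypothetical names chosen for illustration:

```python
def build_llms_full(title: str, pages: list[tuple[str, str]]) -> str:
    """Join (heading, markdown_body) pairs into a single llms-full.txt document."""
    parts = [f"# {title}"]
    for heading, body in pages:
        # Strip stray whitespace so sections are separated by one blank line.
        parts.append(f"## {heading}\n\n{body.strip()}")
    return "\n\n".join(parts) + "\n"


document = build_llms_full(
    "Website Name",
    [
        ("About", "Complete description of the company.\n"),
        ("Getting Started", "Full content of the Getting Started page."),
    ],
)
print(document)
```

Keeping the page order explicit (a list rather than a dict or directory listing) lets the most important sections appear first in the file.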

---

## 3. robots.txt (+25 Points)

**Standard:** RFC 9309 (Proposed Standard)  
**Location:** `https://yourdomain.com/robots.txt`

### What it is

A plain text file that tells web crawlers which pages or sections of a website they are allowed or not allowed to access. For AI-readiness, this means explicitly configuring rules for known AI crawlers.

### Why it matters

- A broad rule such as `Disallow: /` under `User-agent: *` also blocks AI crawlers unless they have their own group with explicit rules
- Allowing reputable AI crawlers increases the chance that your content appears in chat interfaces
- You can selectively protect private or sensitive pages while allowing AI to index public content

### Known AI Crawlers

| Crawler | Company | User-Agent |
| --- | --- | --- |
| GPTBot | OpenAI | `GPTBot` |
| PerplexityBot | Perplexity AI | `PerplexityBot` |
| ClaudeBot | Anthropic | `ClaudeBot` |
| anthropic-ai | Anthropic | `anthropic-ai` |
| Google-Extended | Google | `Google-Extended` |
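| Applebot-Extended | Apple | `Applebot-Extended` |

`Google-Extended` and `Applebot-Extended` are robots.txt control tokens rather than separate crawlers: Google and Apple fetch pages with their regular crawlers and consult these tokens to decide whether the content may be used for AI.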

### Example AI-Friendly robots.txt

```
User-agent: *
Allow: /

User-agent: GPTBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Applebot-Extended
Allow: /

Sitemap: https://yourdomain.com/sitemap.xml
```
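
To see how a given robots.txt treats the AI user-agents listed above, the rules can be evaluated locally with Python's standard `urllib.robotparser`. This is a rough sketch: `ai_access` and the sample rules are hypothetical. Python's parser applies the first matching rule in file order rather than RFC 9309's longest-match rule, so results may differ for files with overlapping `Allow` and `Disallow` lines:

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the "Known AI Crawlers" table.
AI_AGENTS = [
    "GPTBot", "PerplexityBot", "ClaudeBot",
    "anthropic-ai", "Google-Extended", "Applebot-Extended",
]

# Sample rules for illustration: everyone is kept out of /private/,
# but GPTBot has its own group that allows everything.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: GPTBot
Allow: /
"""


def ai_access(robots_txt: str, url: str) -> dict[str, bool]:
    """Map each AI user-agent to whether the rules allow it to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_AGENTS}


for agent, allowed in ai_access(ROBOTS_TXT, "https://yourdomain.com/private/report").items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```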

---

## 4. sitemap.xml (+25 Points)

**Standard:** sitemaps.org protocol  
**Location:** `https://yourdomain.com/sitemap.xml`

### What it is

An XML file that lists all pages, posts, images, and other resources on a website, along with metadata like last-modified dates and update frequency. Search engines and AI crawlers use this to systematically discover all content.

### Why it matters

- Helps LLM indexers understand your site's **full structure** without crawling blindly
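- The `lastmod` tag signals freshness, which helps crawlers prioritise recently changed pages; `changefreq` and `priority` are hints that some crawlers, including Google's, ignore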
- Prevents deep or orphaned pages from being missed by crawlers
- Enables more accurate, up-to-date citations in AI-generated responses

### Example sitemap.xml

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://yourdomain.com/</loc>
    <lastmod>2026-06-17</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://yourdomain.com/about</loc>
    <lastmod>2026-06-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
```
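
A sitemap like the one above can be generated from a list of URLs with Python's standard `xml.etree.ElementTree`. The sketch below is one possible approach, emitting only `loc` and `lastmod`; `build_sitemap` and `pages` are hypothetical names, and the dates are placeholders:

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages: list[tuple[str, str]]) -> bytes:
    """Build a UTF-8 sitemap from (absolute_url, lastmod_date) pairs."""
    # Setting xmlns as a plain attribute declares the default namespace on output.
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod  # W3C date, e.g. YYYY-MM-DD
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


sitemap = build_sitemap([
    ("https://yourdomain.com/", "2026-06-17"),
    ("https://yourdomain.com/about", "2026-06-01"),
])
print(sitemap.decode("utf-8"))
```

The function returns bytes so the result can be written directly with `open("sitemap.xml", "wb")`, keeping the declared encoding and the file's actual encoding in sync.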

---

## AI Readiness Score Interpretation

| Score | Rating | Meaning |
| --- | --- | --- |
| 100/100 | 🏆 Fully AI-Ready | All four signals present. Optimal for AI discoverability. |
| 75/100 | ✅ AI-Ready | Three signals present. Minor improvements possible. |
| 50/100 | ⚠️ Partially Ready | Two signals present. Significant room for improvement. |
| 25/100 | 🔶 Minimal | One signal present. Mostly invisible to AI search. |
| 0/100 | ❌ Not AI-Ready | No signals detected. Site is not optimised for AI. |
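
The table maps directly to a lookup. Here is a small sketch, assuming scores only ever take the five values shown (multiples of 25); `rating_for` is a hypothetical helper name:

```python
# Rating labels copied from the interpretation table above.
RATINGS = {
    100: "Fully AI-Ready",
    75: "AI-Ready",
    50: "Partially Ready",
    25: "Minimal",
    0: "Not AI-Ready",
}


def rating_for(score: int) -> str:
    """Return the rating label for a score, rejecting values outside the table."""
    if score not in RATINGS:
        raise ValueError(f"score must be one of {sorted(RATINGS)}, got {score}")
    return RATINGS[score]


print(rating_for(75))
```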

---

## Resources

- [AI Ready Index Directory](https://ai-ready.space/directory) — Browse all ranked websites
- [Check Your Website](https://ai-ready.space/) — Get your AI Readiness Score instantly
- [llms.txt Standard](https://llmstxt.org) — llms.txt proposed standard
- [Sitemaps Protocol](https://www.sitemaps.org/protocol.html) — Sitemap XML reference
- [robots.txt RFC 9309](https://www.rfc-editor.org/rfc/rfc9309) — Official robots.txt standard

---
_Generated by [AI Ready Index](https://ai-ready.space). Last updated: 2026-06-17._
