Question 1

How widely adopted is llms.txt?

Accepted Answer

Community directories list hundreds of implementations, but cross-referencing against the Majestic Million shows only 105 sites (0.011% of the top million) have llms.txt files, with zero in the top 1,000. Adoption is concentrated almost entirely in developer documentation and tech companies.

Question 2

Do AI systems use llms.txt at inference time?

Accepted Answer

No major AI provider has publicly confirmed using llms.txt at inference time. Google explicitly rejected the standard, and server log evidence is more consistent with training-time data collection than real-time retrieval. The spec was designed for inference, but the data doesn't show that happening.

Question 3

Why do Cloudflare's AI crawler settings conflict with each other?

Accepted Answer

Cloudflare has three overlapping control layers: the AI Audit dashboard, AI Crawl Control categories, and WAF Custom Rules. WAF custom rules execute before AI Crawl Control settings, meaning a security rule can override Cloudflare's own AI-specific toggles without the site operator realizing it.

Question 4

Is the llms.txt Access Paradox just a configuration error?

Accepted Answer

No, it's a structural problem. The llms.txt standard assumes frictionless access between 'file published' and 'file consumed by AI.' But Cloudflare alone sits in front of 20% of all websites, and all major WAF providers treat AI crawlers as threats by default. The burden falls entirely on site operators to become WAF experts.

2 posts tagged with "LlmsTxtKit"

The llms.txt Access Paradox: The Data Nobody Wants to Hear

I Tried to Help AI Read My Website. My Own Firewall Said No.