LlmsTxtKit
LlmsTxtKit is a C#/.NET library that provides a complete pipeline for working with the llms.txt standard: parsing, fetching, validation, caching, and context generation. It also ships as an MCP server, so AI agents can use its capabilities as tools directly.
The Problem It Solves
The llms.txt standard was proposed in late 2024 and has gained real traction among developer-documentation sites--notable adopters include Anthropic, Cloudflare, Stripe, and Vercel. Implementations exist for Python, JavaScript, VitePress, PHP, and Drupal. When LlmsTxtKit was conceived, the entire .NET ecosystem was absent from that list.
Beyond filling the gap, LlmsTxtKit handles a problem most existing implementations ignore: the WAF blocking paradox. AI tools that try to fetch llms.txt files are routinely blocked by the same security infrastructure protecting the sites that publish them. LlmsTxtKit handles this gracefully--configurable retry strategies, user-agent management, degradation paths--rather than throwing an exception and calling it done.
What It Does
The library covers five capabilities, each designed to work standalone or as part of the full pipeline:
Parsing takes raw llms.txt content (a Markdown file with a specific structure) and produces a strongly-typed C# object model. It handles well-formed files, malformed files, and the edge cases you encounter in the wild--which are more creative than you'd expect.
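The shape of that task can be sketched with a minimal, self-contained parser. The `LlmsDocument` and `LlmsLink` types below are illustrative only--they are not LlmsTxtKit's actual object model--but they follow the spec's structure: one H1 title, an optional blockquote summary, and H2 sections containing link lists.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Illustrative types -- not LlmsTxtKit's actual object model.
public record LlmsLink(string Title, string Url, string? Notes);

public class LlmsDocument
{
    public string? Title;                                       // from the single H1
    public string? Summary;                                     // from the blockquote
    public Dictionary<string, List<LlmsLink>> Sections = new(); // H2 name -> links
}

public static class MiniParser
{
    // Matches spec-style link lines: "- [title](url): optional notes"
    static readonly Regex LinkLine = new(
        @"^-\s*\[(?<title>[^\]]+)\]\((?<url>[^)]+)\)(?::\s*(?<notes>.*))?$");

    public static LlmsDocument Parse(string text)
    {
        var doc = new LlmsDocument();
        string? section = null;
        foreach (var raw in text.Split('\n'))
        {
            var line = raw.TrimEnd();
            if (line.StartsWith("# ") && doc.Title is null)
                doc.Title = line[2..].Trim();
            else if (line.StartsWith("> ") && doc.Summary is null)
                doc.Summary = line[2..].Trim();
            else if (line.StartsWith("## "))
            {
                section = line[3..].Trim();
                doc.Sections[section] = new List<LlmsLink>();
            }
            else if (section is not null && LinkLine.Match(line) is { Success: true } m)
                doc.Sections[section].Add(new LlmsLink(
                    m.Groups["title"].Value,
                    m.Groups["url"].Value,
                    m.Groups["notes"].Success ? m.Groups["notes"].Value : null));
        }
        return doc;
    }
}
```

A real parser also has to decide what to do with the creative cases--links outside any section, duplicate headings, prose between list items--which is where most of the actual work lives.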
Fetching retrieves llms.txt files from the web, handling HTTP redirects, WAF challenges, timeouts, and rate limiting along the way. The implementation is designed around the reality that a significant percentage of fetches will be blocked or degraded by security infrastructure. That's not an edge case. It's the default.
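A minimal sketch of what retry-aware fetching looks like, using `HttpClient` with exponential backoff. This is illustrative, not LlmsTxtKit's implementation--the real library makes the retry strategy and user agent configurable, and the status codes treated as "probably a WAF" below are an assumption:

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Sketch only: retry on WAF-ish status codes and transient failures,
// back off exponentially, and degrade to null instead of throwing.
public static class LlmsFetcher
{
    public static async Task<string?> FetchAsync(string url, int maxAttempts = 3)
    {
        using var client = new HttpClient { Timeout = TimeSpan.FromSeconds(10) };
        client.DefaultRequestHeaders.UserAgent.ParseAdd("LlmsTxtKit-Sketch/0.1");

        for (var attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                using var response = await client.GetAsync(url);
                if (response.StatusCode is HttpStatusCode.TooManyRequests
                    or HttpStatusCode.Forbidden
                    or HttpStatusCode.ServiceUnavailable)
                {
                    // Likely a WAF challenge or rate limit: back off and retry.
                    if (attempt == maxAttempts) break;
                    await Task.Delay(TimeSpan.FromSeconds(Math.Pow(2, attempt)));
                    continue;
                }
                response.EnsureSuccessStatusCode();
                return await response.Content.ReadAsStringAsync();
            }
            catch (Exception e) when (e is HttpRequestException or TaskCanceledException)
            {
                // Transient network failure or timeout: retry unless exhausted.
                if (attempt == maxAttempts) break;
                await Task.Delay(TimeSpan.FromSeconds(Math.Pow(2, attempt)));
            }
        }
        return null; // Degradation path: caller decides how to proceed without the file.
    }
}
```

Returning `null` rather than throwing is the point: a blocked fetch is an expected outcome that the caller plans for, not an exceptional one.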
Validation checks a parsed file against the specification and reports compliance issues. This overlaps with DocStratum's functionality but is integrated for use in automated pipelines--validation as a gate, not as a standalone analysis.
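A gate-style check can be sketched as a function that returns a list of issues, empty meaning "pass." The rules below paraphrase the spec's structural requirements; LlmsTxtKit's real rule set and reporting API are not shown here:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative spec-level checks run against raw llms.txt text,
// suitable as a pipeline gate. Not LlmsTxtKit's actual validator.
public static class MiniValidator
{
    public static List<string> Validate(string text)
    {
        var issues = new List<string>();
        var lines = text.Split('\n').Select(l => l.TrimEnd()).ToList();

        var h1Count = lines.Count(l => l.StartsWith("# "));
        if (h1Count == 0) issues.Add("Missing required H1 title.");
        if (h1Count > 1) issues.Add("Multiple H1 headings; the spec allows one.");

        if (!lines.Any(l => l.StartsWith("> ")))
            issues.Add("No blockquote summary after the title (recommended).");

        if (lines.Any(l => l.StartsWith("### ")))
            issues.Add("H3 headings found; sections are delimited by H2 only.");

        return issues;
    }
}
```

In a CI pipeline, a non-empty list fails the build; DocStratum is the tool to reach for when you want analysis beyond pass/fail.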
Caching stores fetched and parsed results with configurable TTL. Particularly important for MCP server usage, where an agent might reference the same site's llms.txt file multiple times during a single task.
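The core of a TTL cache is small enough to sketch in full. LlmsTxtKit's cache configuration isn't public yet, so the shape below--a thread-safe dictionary with lazy expiry--is an assumption, not the library's design:

```csharp
using System;
using System.Collections.Concurrent;

// Minimal thread-safe TTL cache sketch. Expired entries are dropped
// lazily on lookup rather than by a background sweep.
public class TtlCache<TKey, TValue> where TKey : notnull
{
    private record Entry(TValue Value, DateTimeOffset ExpiresAt);
    private readonly ConcurrentDictionary<TKey, Entry> _entries = new();
    private readonly TimeSpan _ttl;

    public TtlCache(TimeSpan ttl) => _ttl = ttl;

    public void Set(TKey key, TValue value) =>
        _entries[key] = new Entry(value, DateTimeOffset.UtcNow + _ttl);

    public bool TryGet(TKey key, out TValue? value)
    {
        if (_entries.TryGetValue(key, out var entry) &&
            entry.ExpiresAt > DateTimeOffset.UtcNow)
        {
            value = entry.Value;
            return true;
        }
        _entries.TryRemove(key, out _); // Drop the entry if absent or expired.
        value = default;
        return false;
    }
}
```

For the MCP scenario, the cache key would naturally be the site's llms.txt URL, so repeated tool calls within one agent task hit the cache instead of the network.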
Context generation transforms a parsed llms.txt file into structured content optimized for an LLM's context window. The last mile of the pipeline: turning a data structure into something an AI agent can actually use.
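The last mile can be sketched as a flattening step with a budget: sections and links go in, a single bounded text block comes out. The input shapes and the character budget here are illustrative assumptions, not LlmsTxtKit's API:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Sketch: flatten parsed sections into one context block and truncate
// at a rough character budget standing in for a token budget.
public static class ContextBuilder
{
    public static string Build(
        string title,
        IReadOnlyDictionary<string, List<(string Name, string Url)>> sections,
        int maxChars = 4000)
    {
        var sb = new StringBuilder();
        sb.AppendLine($"Source: {title}");
        foreach (var (section, links) in sections)
        {
            sb.AppendLine($"\n{section}:");
            foreach (var (name, url) in links)
                sb.AppendLine($"  - {name} <{url}>");
            if (sb.Length > maxChars) break; // Respect the context budget.
        }
        return sb.Length <= maxChars ? sb.ToString() : sb.ToString(0, maxChars);
    }
}
```

A production version would count tokens rather than characters and prioritize sections instead of truncating in file order, but the shape of the problem is the same.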
MCP Server
LlmsTxtKit ships as an MCP server, exposing its capabilities as tools any MCP-compatible agent can discover and invoke. Fetch, validate, generate context, cache results--all available as tool calls.
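In practice that means registering the server in an MCP client's configuration. The command, project path, and server name below are placeholders--the published entry point hasn't been announced yet--but the registration shape is standard for MCP clients:

```json
{
  "mcpServers": {
    "llmstxt": {
      "command": "dotnet",
      "args": ["run", "--project", "path/to/LlmsTxtKit.McpServer"]
    }
  }
}
```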
The MCP server is the primary way AI agents interact with LlmsTxtKit. Human developers use the library directly via NuGet.
The Research Connection
LlmsTxtKit is one of three projects in the llms.txt research initiative:
- LlmsTxtKit provides the tooling
- DocStratum provides standalone validation with deeper analysis
- The Context Collapse Mitigation Benchmark uses LlmsTxtKit to test whether curated llms.txt content actually produces better AI responses than raw HTML
The blog documents the research findings as they emerge.
Where to Find It
- GitHub: southpawriter02/llmstxtkit (link will update when repo is public)
- NuGet: (coming when the library reaches stable release)
- Related blog posts: I Write the Docs Before the Code