

Haiku Protocol

Haiku Protocol is a semantic compression system for LLM context windows. It transforms verbose technical documentation into dense, structured shorthand--a Controlled Natural Language (CNL)--while preserving meaning. Think of it as minifying JavaScript, but for prose.

The Problem It Solves

Context windows are finite. Every token spent on background documentation, system instructions, or reference material is a token unavailable for the actual task. When you're building RAG pipelines or orchestration systems that need to pack multiple documents into a single prompt, space is a real constraint.

[Figure: verbose text funneling through a torii gate; 847 tokens compressed to 127, an 85% reduction.]

Current approaches are blunt: truncate documents, summarize them (losing detail), or use larger models with bigger context windows (increasing cost). Haiku Protocol proposes something different: compress the documentation into a machine-readable shorthand that an LLM can be taught to decode, preserving semantic content at a fraction of the token count.
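To make the idea concrete, here is a toy before/after sketch. The shorthand grammar shown is invented for illustration and is not Haiku Protocol's actual CNL format; real token counts would come from a model tokenizer, so whitespace splitting is only a rough proxy.

```python
# Hypothetical illustration: this shorthand grammar is invented for the
# example and is NOT Haiku Protocol's actual CNL format.
original = (
    "To restart the service, first stop the running process, "
    "then clear the cache directory, and finally start the service again."
)
compressed = "restart_service: stop_process > clear_cache > start_service"

# Rough token proxy: whitespace-split word count (a real measurement
# would use the target model's tokenizer).
ratio = 1 - len(compressed.split()) / len(original.split())
print(f"compression: {ratio:.0%}")  # → compression: 70%
```

Even this naive rewrite drops the connective prose while keeping the procedure's entities and ordering intact, which is the property the decoder relies on.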

How It Works

Two stages:

[Figure: semantic density comparison, sparse verbose tokens vs. concentrated compressed output.]

Encoding. Documents are processed into semantic chunks. Entities and relationships are extracted. The content is then synthesized into CNL statements: structured, compressed representations that preserve the factual and relational content while eliminating the prose scaffolding--transition sentences, redundant phrasing, stylistic flourishes--that humans need but LLMs don't.
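The encoding stage can be sketched as a pipeline of the three steps above. This is a minimal stand-in, not the project's implementation: real chunking and entity/relationship extraction would use an LLM or NLP pipeline, whereas this toy version treats each sentence as a chunk and strips filler words.

```python
import re

# Illustrative filler list; a real encoder would extract entities and
# relations rather than delete stopwords.
FILLER = {"the", "a", "an", "then", "also", "very", "just", "really"}

def encode(document: str) -> list[str]:
    """Toy sketch of the encoding stage: chunk -> extract -> synthesize."""
    statements = []
    # Chunk: split on sentence boundaries.
    for chunk in re.split(r"(?<=[.!?])\s+", document.strip()):
        # Extract/synthesize: keep content words, drop prose scaffolding.
        words = [w for w in re.findall(r"\w+", chunk.lower()) if w not in FILLER]
        if words:
            statements.append(" ".join(words))
    return statements

encode("The cache is cleared. Then the service restarts.")
# → ["cache is cleared", "service restarts"]
```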

Decoding. LLMs are taught to interpret the compressed format through examples and instruction. Given a CNL-compressed document and a query, the model expands the compressed content back into natural-language reasoning. The compression is transparent to the end user: they ask a question and get a normal answer.
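A decoder prompt along these lines might look like the following sketch. The function name, wording, and in-context example are all illustrative assumptions, not Haiku Protocol's actual prompt.

```python
def build_decode_prompt(cnl_doc: str, query: str) -> str:
    """Hypothetical few-shot prompt teaching a model to expand CNL
    (names and example shorthand are invented for illustration)."""
    return (
        "The context below is written in a compressed shorthand (CNL). "
        "Each statement encodes entities and relations with prose removed.\n"
        "Example: 'restart_service: stop_process > clear_cache' means "
        "'To restart the service, stop the process, then clear the cache.'\n\n"
        f"Context:\n{cnl_doc}\n\n"
        f"Question: {query}\n"
        "Answer in plain English:"
    )
```

The resulting string would be sent as the prompt for an ordinary completion call, so the compression never surfaces in the user-facing answer.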

Performance Targets

  • 50%+ compression on procedural documentation
  • Semantic similarity above 0.85 between original and decoded output
  • Performance competitive with existing prompt compression approaches such as LLMLingua
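The first two targets can be checked with simple metrics. The sketch below is an assumption about how evaluation might work: it measures token savings by whitespace split (a real harness would use the model tokenizer) and stands in a bag-of-words cosine for the embedding-based semantic similarity the 0.85 target actually refers to.

```python
from collections import Counter
from math import sqrt

def compression_ratio(original: str, compressed: str) -> float:
    """Fraction of (whitespace-split) tokens saved; a real evaluation
    would count model tokenizer tokens instead."""
    return 1 - len(compressed.split()) / len(original.split())

def cosine_similarity(a: str, b: str) -> float:
    """Stand-in for embedding similarity: bag-of-words cosine.
    The 0.85 target refers to embedding-based semantic similarity."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = sqrt(sum(v * v for v in va.values())) * sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0
```

Comparing the decoded output (not the compressed form) against the original is what matters: compression is measured original-vs-CNL, similarity original-vs-decoded.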

[Figure: progressive compression passes shrinking content through four stages.]

How It Connects

Haiku Protocol sits at the intersection of several other projects:

  • It directly addresses the context window stuffing problem that RAG pipelines face
  • It's the llms.txt philosophy of curating content for AI consumption, taken to its logical extreme
  • Lexichord's orchestration layer could use compressed reference material to fit more context into each AI call
  • The grammar design is informed by technical writing principles: the compression preserves the information architecture that matters for accurate retrieval

Technology

  • Language: Python
  • License: MIT
  • Status: Research and development, spec-first methodology

Where to Find It