Metadata Filtering

Using structured metadata (tags, categories, dates, document types) to narrow retrieval results before or after similarity search. Instead of searching your entire knowledge base for "deployment procedure," you might first restrict the search to documents tagged type: procedure and product: v3.0, then run similarity search over only that subset. This turns a needle-in-a-haystack problem into a needle-in-a-much-smaller-haystack problem.
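To make the "before or after" distinction concrete, here is a minimal sketch in plain Python. Everything in it is an illustrative assumption: the `DOCS` records, the three-dimensional vectors, and the function names are made up for the example, and a real vector store would handle the filtering and ranking itself.

```python
import math

# Toy knowledge base. The IDs, tags, and vectors are invented for illustration.
DOCS = [
    {"id": "deploy-v3", "meta": {"type": "procedure", "product": "v3.0"}, "vec": [0.9, 0.1, 0.0]},
    {"id": "deploy-v2", "meta": {"type": "procedure", "product": "v2.0"}, "vec": [1.0, 0.0, 0.0]},
    {"id": "faq-v3", "meta": {"type": "faq", "product": "v3.0"}, "vec": [0.8, 0.2, 0.1]},
    {"id": "rollback-v3", "meta": {"type": "procedure", "product": "v3.0"}, "vec": [0.2, 0.9, 0.1]},
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def matches(doc, filters):
    """True when every filter key/value pair appears in the document's metadata."""
    return all(doc["meta"].get(key) == value for key, value in filters.items())


def prefilter_search(query_vec, filters, k=2):
    """Filter by metadata first, then rank only the surviving documents."""
    candidates = [d for d in DOCS if matches(d, filters)]
    ranked = sorted(candidates, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["id"] for d in ranked[:k]]


def postfilter_search(query_vec, filters, k=2):
    """Rank everything, keep the top k, then drop results that fail the filter."""
    ranked = sorted(DOCS, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["id"] for d in ranked[:k] if matches(d, filters)]


query = [1.0, 0.0, 0.0]
filters = {"type": "procedure", "product": "v3.0"}
print(prefilter_search(query, filters))   # ['deploy-v3', 'rollback-v3']
print(postfilter_search(query, filters))  # ['deploy-v3']
```

In this toy data, post-filtering returns only one result because a v2.0 document takes one of the two top slots before the filter runs. Pre-filtering never considers that document, so both slots go to v3.0 procedures.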

Why it matters for writers: Most RAG pipelines treat metadata as a post-retrieval filter, retrieving the top 20 results by similarity and only then filtering by metadata. FractalRecall's core thesis is that this is backwards: metadata should inform how content gets embedded in the first place, not just how results get filtered afterward. It's the difference between organizing your files before you search and dumping everything into one folder, then adding tags after the fact. Whether you're building a RAG system or writing content that feeds one, consistent, structured metadata makes retrieval significantly more reliable.
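One simple way to let metadata "inform how content gets embedded" is to prepend it to the chunk text before that text goes to an embedding model. The sketch below shows only that text-building step. It is not FractalRecall's implementation, which this entry does not describe; the function name `build_embedding_text` and the bracketed header format are hypothetical choices made for illustration.

```python
def build_embedding_text(chunk: str, metadata: dict) -> str:
    """Prefix a chunk with its metadata so an embedding model sees both.

    Keys are sorted so the same metadata always produces the same header.
    The "[key: value | key: value]" layout is an arbitrary illustrative format.
    """
    header = " | ".join(f"{key}: {metadata[key]}" for key in sorted(metadata))
    return f"[{header}]\n{chunk}" if header else chunk


text = build_embedding_text(
    "Run the deploy script from the release branch.",
    {"type": "procedure", "product": "v3.0"},
)
print(text)
# [product: v3.0 | type: procedure]
# Run the deploy script from the release branch.
```

This only works if the metadata is consistent across documents: a chunk tagged product: v3.0 and another tagged product: 3 would produce different headers for the same product.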

Related terms: Retrieval-Augmented Generation · Vector Store · Chunking