Context Window
The maximum number of tokens an LLM can process in a single interaction, including both the input (your prompt, any documents, system instructions) and the output (the model's response). Think of it as the model's working memory for that conversation. When the conversation's over, so is the memory.
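Because input and output share the same window, a practical habit is to budget tokens before sending a long prompt. The sketch below uses the common rough heuristic of ~4 characters per English token (real tokenizers, such as OpenAI's tiktoken, give exact counts); the function names and the 1,024-token output reserve are illustrative assumptions, not any particular API.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per English token.
    Real tokenizers vary by model and language."""
    return max(1, len(text) // 4)

def fits_in_context(prompt: str, context_window: int,
                    reserved_for_output: int = 1024) -> bool:
    """True if the prompt leaves room for the model's response.
    The window must hold input AND output together."""
    return estimate_tokens(prompt) + reserved_for_output <= context_window

# A request padded with lots of reference material:
prompt = "Edit this draft for tone. " + ("reference material " * 2000)
print(fits_in_context(prompt, context_window=8192))    # → False
print(fits_in_context(prompt, context_window=128000))  # → True
```

The point of the reserve: a prompt that exactly fills the window leaves the model no room to answer, so you always want headroom on the output side.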
Why it matters for writers: A larger context window means you can include more reference material (style guides, terminology databases, prior drafts) alongside your actual request. But there's a catch: models don't attend equally to all parts of the context. Information in the middle of a very long context often receives less attention than information at the beginning or end. Researchers call this the "lost in the middle" problem. I call it "burying the lede, but for robots."
Related terms: Token · Context Window Stuffing · Large Language Model