Token

The fundamental unit of text that an LLM processes. A token is not the same as a word; it's a chunk of text determined by the model's tokenizer. Common English words are usually a single token. Less common words, technical terms, and non-English text get split into multiple tokens. As a rough approximation, one token ≈ ¾ of a word in English, so 1,000 tokens is roughly 750 words.
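The ¾-of-a-word rule of thumb can be turned into a quick back-of-the-envelope estimator. This is a minimal sketch, not a real tokenizer: the function name `estimate_tokens` is made up for illustration, and the ratio is an approximation that varies by model and by text.

```python
def estimate_tokens(text: str) -> int:
    """Roughly estimate the token count of English prose.

    Uses the heuristic that one token is about 3/4 of a word,
    i.e. 1,000 tokens is roughly 750 words. Real tokenizers will
    differ, especially for technical terms and non-English text.
    """
    words = len(text.split())
    return round(words / 0.75)

# 9 words -> an estimated 12 tokens
print(estimate_tokens("The quick brown fox jumps over the lazy dog"))
```

For an exact count you would use the model's own tokenizer, since the split points are model-specific; a heuristic like this is only useful for budgeting.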

Think of it like this: if words are coins, tokens are the bills the model's vending machine actually accepts. Sometimes a coin happens to be a bill. Sometimes it takes three bills to buy one coin. The exchange rate is unpredictable and slightly annoying.

Why it matters for writers: Token counts determine how much text you can send to and receive from an LLM in a single interaction. When documentation says a model has a "128K context window," that means it can process approximately 128,000 tokens (~96,000 words) at once. That sounds generous until you include the system prompt, your style guide, three reference documents, and the actual question.
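The budgeting arithmetic is worth making concrete. The figures below are illustrative assumptions, not measurements of any particular model or document set:

```python
# A 128K context window sounds roomy until you itemize it.
CONTEXT_WINDOW = 128_000  # tokens (illustrative model limit)

# Hypothetical token costs for one documentation query
budget = {
    "system prompt": 1_500,
    "style guide": 8_000,
    "reference documents": 45_000,  # three docs
    "your question": 200,
}

used = sum(budget.values())
remaining = CONTEXT_WINDOW - used  # what's left for the model's answer

print(f"Used {used:,} of {CONTEXT_WINDOW:,} tokens; {remaining:,} remaining")
```

Note that the model's response also comes out of the same window, so "remaining" is the ceiling on the answer, not just slack.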

Related terms: Context Window · Large Language Model · Embedding