The maximum number of tokens an LLM can process in a single inference call. Modern models support windows from 8K to over 1M tokens.
How much the AI can remember during one conversation, like a whiteboard that can only fit so many words.
The amount of text an AI can "see" at once during a conversation. Bigger context windows mean the AI can handle longer documents and remember more.
The fixed-length input buffer of an LLM, determining how much text can be jointly attended to. Longer context windows enable multi-document reasoning, but the compute cost of standard self-attention grows quadratically with sequence length.
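The quadratic growth can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes dense self-attention, a single head in a single layer, and a naive implementation that stores every score at 2 bytes each. The function names are hypothetical, and real systems often avoid materializing the full score matrix.

```python
def attention_pairs(seq_len: int) -> int:
    """Query-key pairs scored by dense self-attention over seq_len tokens."""
    return seq_len * seq_len


def score_matrix_bytes(seq_len: int, bytes_per_score: int = 2) -> int:
    """Memory for one head's full score matrix (assumed 2 bytes per score)."""
    return attention_pairs(seq_len) * bytes_per_score


if __name__ == "__main__":
    for n in (8_000, 32_000, 128_000):
        gib = score_matrix_bytes(n) / 2**30
        print(f"{n:>7} tokens: {attention_pairs(n):>17,} pairs, {gib:8.2f} GiB per head")
```

Doubling the sequence length quadruples the number of pairs, which is why long context windows are expensive under standard attention.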
The maximum sequence length over which the attention mechanism computes pairwise interactions. It is bounded by the positional encoding scheme and by memory; recent advances in sparse attention, ring attention, and RoPE extrapolation extend the effective context.
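As a rough illustration of why sparse attention helps, the sketch below compares the number of pairwise interactions under full causal attention with a sliding-window pattern, one common sparse scheme. It assumes each token attends to itself and to at most `window - 1` preceding tokens; the function names are hypothetical and no specific model's pattern is implied.

```python
def causal_pairs(seq_len: int) -> int:
    """Pairs under full causal attention: token i attends to tokens 0..i."""
    return seq_len * (seq_len + 1) // 2


def sliding_window_pairs(seq_len: int, window: int) -> int:
    """Pairs when each token attends only to the last `window` tokens, itself included."""
    return sum(min(i + 1, window) for i in range(seq_len))


if __name__ == "__main__":
    n, w = 128_000, 4_096
    full = causal_pairs(n)
    sparse = sliding_window_pairs(n, w)
    print(f"full causal:    {full:,} pairs")
    print(f"sliding window: {sparse:,} pairs ({sparse / full:.1%} of full)")
```

With a fixed window, the pair count grows roughly linearly in sequence length rather than quadratically, at the price of each token seeing only nearby context directly.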