AI Context Window
filza · Feb 20, 2026 · 2 min read
An AI context window is the maximum amount of data a model can process and hold in its "working memory" during a single interaction, measured in tokens (words, subwords, or encoded chunks of images and video). It defines the limit of what the AI "sees" at once: the prompt, the conversation history, system instructions, and any retrieved documents.
How it Works: The model treats the context window as short-term memory, much like human working memory: once the limit is reached, new information displaces the oldest data.
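The displacement behavior above can be sketched as a sliding window over a message history. This is a minimal illustration, not any provider's actual implementation; in particular, it approximates token counts by whitespace word count, where real models use subword tokenizers.

```python
# Sketch of a context window as "working memory": once the token budget
# is exceeded, the oldest messages fall out of the window.
from collections import deque


def count_tokens(text: str) -> int:
    # Crude approximation: one token per whitespace-separated word.
    # Real models use subword tokenizers (e.g. BPE), so counts differ.
    return len(text.split())


def fit_to_window(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages that fit within max_tokens."""
    window: deque[str] = deque()
    total = 0
    for msg in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(msg)
        if total + cost > max_tokens:
            break  # older messages no longer fit and are "forgotten"
        window.appendleft(msg)
        total += cost
    return list(window)


history = [
    "system: be concise",
    "user: hello there",
    "assistant: hi, how can I help you today",
]
# With a tight budget, only the newest message survives.
print(fit_to_window(history, max_tokens=10))
```

Production systems use smarter strategies than blind truncation (summarizing older turns, pinning the system prompt), but the budget constraint works the same way.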
Key Characteristics:
Capacity: Ranges from a few thousand tokens (earlier models) to over 1-2 million tokens (e.g., Gemini 1.5 Pro).
Performance: A larger window allows the model to process massive documents, entire codebases, or long, coherent conversations without "forgetting."
Limitation: When the input exceeds the window, earlier context is truncated or lost, which can lead to inconsistent answers or hallucinations.
Significance: It is crucial for AI agents, as it allows them to maintain focus, understand complex, multi-step queries, and keep track of long-term task objectives.
Lost-in-the-Middle: Even with large windows, models can suffer from a "lost-in-the-middle" problem: they recall information placed at the beginning or end of a very long prompt more reliably than information buried in the middle.
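A common way to probe this effect is a "needle-in-a-haystack" test: the same fact (the needle) is planted at different relative positions in a long filler context, and each variant is sent to the model to see where recall degrades. The sketch below only builds such prompts; the filler sentence and the needle are illustrative placeholders, and the model call itself is left out.

```python
# Toy needle-in-a-haystack prompt builder for probing lost-in-the-middle.
FILLER = "The quick brown fox jumps over the lazy dog. " * 200
NEEDLE = "The secret code is 7421."  # hypothetical fact to retrieve


def plant_needle(position: float) -> str:
    """Insert the needle at a relative position in [0, 1] of the filler."""
    cut = int(len(FILLER) * position)
    return FILLER[:cut] + NEEDLE + " " + FILLER[cut:]


# Same content, three positions: start, middle, end of the context.
prompts = {pos: plant_needle(pos) for pos in (0.0, 0.5, 1.0)}
for pos, prompt in prompts.items():
    print(f"needle at {pos:.0%}, prompt length {len(prompt)} chars")
```

In a real evaluation, each prompt would be followed by a question like "What is the secret code?" and accuracy would be plotted against needle position.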