The context window is the total number of tokens your model can process in one call: input and output combined. It's not just a limit; it's a budget. Every token you spend on instructions is a token you can't spend on reasoning.
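To make the arithmetic concrete, here is a minimal sketch using the tiktoken tokenizer. The 128k window, the 4k output reservation, and the `input_budget_left` helper are illustrative assumptions, not figures or APIs from any particular model:

```python
import tiktoken

# Assumed figures: a hypothetical 128k-token model, reserving 4k tokens
# for the reply. Adjust both to your actual model.
CONTEXT_WINDOW = 128_000
RESERVED_OUTPUT = 4_096

enc = tiktoken.get_encoding("cl100k_base")

def input_budget_left(prompt: str) -> int:
    """Tokens still available for additional input after this prompt."""
    spent = len(enc.encode(prompt))
    # Input and output draw from one shared pool, so the reply's
    # reservation comes out of the same budget as the instructions.
    return CONTEXT_WINDOW - RESERVED_OUTPUT - spent

print(input_budget_left("Summarize the attached report in three bullets."))
```

The key point the code makes explicit: the reply's token reservation is subtracted up front, because input and output are not separate allowances.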
Why this matters
Context engineering, the discipline of deciding what goes into the window and what stays out, is the new prompt engineering. Teams that manage this budget well get 3-5x better results from the same model.
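One simple form that discipline can take is deciding which conversation turns make the cut. The sketch below keeps only the newest messages that fit a token budget; the `pack_recent` name and its recency-only policy are illustrative assumptions, not a method named in this article:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def pack_recent(messages: list[str], budget: int) -> list[str]:
    """Keep the newest messages that fit within `budget` tokens."""
    kept: list[str] = []
    spent = 0
    for msg in reversed(messages):   # walk newest-first
        cost = len(enc.encode(msg))
        if spent + cost > budget:
            break                    # older messages stay out
        kept.append(msg)
        spent += cost
    kept.reverse()                   # restore chronological order
    return kept
```

Recency-only packing is the crudest possible policy, but it illustrates the shift: what enters the window becomes an explicit decision made against the budget, not an accident of prompt length.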