Key Concepts: Tokens, Prompts, Context Window
To understand how Generative AI (especially Large Language Models like ChatGPT) works, you need to know three foundational concepts: tokens, prompts, and the context window. These are the building blocks of how LLMs process and generate language.
🔹 1. What Are Tokens?
Tokens are the basic units of text that a language model reads and writes.
A token might be a word, part of a word, or even punctuation.
For example, the sentence:
"I love Generative AI!" might be broken into tokens like:
"I"," love"," Gener","ative"," AI","!"
LLMs don’t work with full sentences — they work with sequences of tokens. Every prompt you type is first tokenized before it enters the model.
🧠 Tip:
For most models (like GPT-3.5/4), one token is roughly 4 characters of English text, or about ¾ of a word — so 100 tokens ≈ 75 words.
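You can inspect tokenization yourself with OpenAI's tiktoken library. Here is a minimal sketch using the sentence from above; the exact split may differ slightly from the hand-written example, since it depends on the encoding:

```python
# pip install tiktoken
import tiktoken

# cl100k_base is the encoding used by the GPT-3.5/4 family
encoding = tiktoken.get_encoding("cl100k_base")

text = "I love Generative AI!"
token_ids = encoding.encode(text)                   # text -> integer token IDs
pieces = [encoding.decode([t]) for t in token_ids]  # each ID -> its text piece

print(pieces)  # something like ['I', ' love', ' Gener', 'ative', ' AI', '!']
print(len(token_ids), "tokens for", len(text.split()), "words")
```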
🔹 2. What Is a Prompt?
A prompt is the input you give to a language model — it's how you "talk" to it.
A prompt can be a question, a command, or just a few words.
Examples:
"Write a poem about the ocean."
"Translate this to French: Hello, how are you?"
The model takes your prompt (in token form), understands the context, and generates a completion — its response.
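As a concrete illustration, here is how one of the example prompts could be sent through the OpenAI Python SDK. This is a minimal sketch: the model name is a placeholder, and it assumes an OPENAI_API_KEY is set in your environment:

```python
# pip install openai
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; any chat model works
    messages=[
        # The prompt, sent as a user message (tokenized before the model sees it)
        {"role": "user", "content": "Write a poem about the ocean."}
    ],
)

# The completion: the model's response to your prompt
print(response.choices[0].message.content)
```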
🔹 3. What Is the Context Window?
The context window refers to the maximum number of tokens a model can process at once — including both your prompt and the model’s response.
Example:
If a model has a 4,000-token context window, and your prompt is 1,000 tokens long, the model can generate up to 3,000 tokens in response.
GPT-4 Turbo, for example, supports up to 128,000 tokens, which allows it to read long documents or entire conversations in one go.
If you exceed the limit, older parts of the conversation may be forgotten or dropped.
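In practice, applications count tokens before each call so that the prompt plus the expected reply fits inside the window. Below is a minimal sketch of that bookkeeping using tiktoken, assuming the 4,000-token window from the example; the history and the reply reserve are illustrative:

```python
import tiktoken

CONTEXT_WINDOW = 4000       # illustrative limit from the example above
RESERVED_FOR_REPLY = 1000   # tokens left free for the model's response

encoding = tiktoken.get_encoding("cl100k_base")

def count_tokens(text: str) -> int:
    return len(encoding.encode(text))

# Conversation history, oldest first (illustrative)
history = [
    "You are a helpful assistant.",
    "Summarize the plot of Moby-Dick.",
    "Now compare it to The Old Man and the Sea.",
]

# Drop the oldest messages until the prompt fits the remaining budget
budget = CONTEXT_WINDOW - RESERVED_FOR_REPLY
while history and sum(count_tokens(m) for m in history) > budget:
    history.pop(0)  # older parts of the conversation are "forgotten"

print(sum(count_tokens(m) for m in history), "prompt tokens;",
      RESERVED_FOR_REPLY, "reserved for the reply")
```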
🧭 Why These Concepts Matter
Token: affects how much you can say and how you are billed
Prompt: guides the model's behavior and creativity
Context window: limits how much memory the model has
🧠 Summary
Tokens = Pieces of text the model understands
Prompts = Instructions or questions you give the model
Context window = How much input/output the model can handle at once
Understanding these helps you write better prompts, optimize token usage, and avoid issues like the model “forgetting” earlier parts of the conversation.