OpenAI Token Counter
OpenAI Token Counter estimates how many tokens your text will use with GPT-5.x, GPT-4o, GPT-4.1, and other OpenAI models. All current OpenAI models use the o200k_base encoding (~4 characters per token for English).
This counter estimates raw token counts and does not account for prompt caching. OpenAI offers reduced pricing for cached prompt prefixes on supported models -- check the OpenAI pricing page for details.
OpenAI Model Comparison
All current OpenAI models use the o200k_base encoding with a vocabulary of approximately 200,000 tokens.
| Model | Context | Input / 1M | Output / 1M | Encoding |
|---|---|---|---|---|
| GPT-5.3-Codex | ~400K | TBD* | TBD* | o200k_base |
| GPT-5.2 | 400K | $1.75 | $14.00 | o200k_base |
| GPT-5.2-Codex | 400K | $1.75 | $14.00 | o200k_base |
| GPT-5.1 | 400K | $1.25 | $10.00 | o200k_base |
| GPT-5.1-Codex | 400K | $1.25 | $10.00 | o200k_base |
| GPT-5 | 400K | $1.25 | $10.00 | o200k_base |
| GPT-5 Mini | 400K | $0.25 | $2.00 | o200k_base |
| GPT-4o | 128K | $2.50 | $10.00 | o200k_base |
| GPT-4o mini | 128K | $0.15 | $0.60 | o200k_base |
| GPT-4.1 | 1M | $2.00 | $8.00 | o200k_base |
| GPT-4.1 Mini | 1M | $0.40 | $1.60 | o200k_base |
| GPT-4.1 Nano | 1M | $0.10 | $0.40 | o200k_base |
*GPT-5.3-Codex released Feb 5, 2026 -- API pricing not yet announced; currently ChatGPT-only.
How OpenAI Tokenization Works
OpenAI models use Byte Pair Encoding (BPE) to split text into tokens. The current tokenizer is called o200k_base, which has a vocabulary of approximately 200,000 tokens. This is the same encoding used by all GPT-5.x, GPT-4o, and GPT-4.1 models.
What is BPE?
Byte Pair Encoding starts with individual bytes and iteratively merges the most frequent pairs into new tokens. This creates a vocabulary where common words like "the" or "and" are single tokens, while rare words get split into subword pieces.
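The merge loop can be illustrated with a toy trainer (a simplified sketch, not the actual o200k_base algorithm, which operates on raw bytes with a learned 200K-entry merge table; the word list and merge count here are made up for illustration):

```python
from collections import Counter

def bpe_train(words, num_merges):
    """Toy BPE: repeatedly merge the most frequent adjacent symbol pair."""
    # Start with each word as a sequence of single-character symbols.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        merged = best[0] + best[1]
        # Rewrite every word, replacing occurrences of the best pair.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(merged)
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges

merges = bpe_train(["the", "then", "them", "and", "hand"], num_merges=3)
print(merges)  # [('t', 'h'), ('th', 'e'), ('a', 'n')]
```

Frequent sequences like "th" and "the" are merged first, which is why common English words end up as single tokens in the real vocabulary.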
The tiktoken Library
OpenAI provides an open-source Python library called tiktoken for exact token counting. To use it:
```python
import tiktoken

# encoding_for_model resolves "gpt-4o" to the o200k_base encoding
enc = tiktoken.encoding_for_model("gpt-4o")
tokens = enc.encode("Hello, world!")
print(len(tokens))  # 4
```
Token Estimation Rules
- English text: ~4 characters per token (1 word = ~1.3 tokens)
- Code: ~3-3.5 characters per token (symbols split into separate tokens)
- CJK text: ~1.5-2 characters per token (a rare character may split into multiple tokens)
- Numbers: each digit or small group is typically 1 token
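These heuristics can be folded into a rough estimator (a hypothetical helper using the character ratios above; only tiktoken gives exact counts):

```python
def estimate_tokens(text: str, kind: str = "english") -> int:
    """Rough token estimate from character count; not a substitute for tiktoken."""
    # Approximate characters-per-token ratios from the rules above.
    chars_per_token = {"english": 4.0, "code": 3.25, "cjk": 1.75}
    return max(1, round(len(text) / chars_per_token[kind]))

print(estimate_tokens("The quick brown fox jumps over the lazy dog."))  # 44 chars -> 11
```

Expect the estimate to drift by 10-20% on real text; use tiktoken when the count matters for billing or context limits.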
Frequently Asked Questions
How many tokens does GPT-5 use per word?
GPT-5 uses the o200k_base tokenizer, which averages about 1.3 tokens per English word (or roughly 4 characters per token). Technical terms and code use more tokens per word. Non-English languages, especially CJK, use significantly more tokens.
What encoding does GPT-4o use?
GPT-4o uses the o200k_base encoding with a vocabulary of approximately 200,000 tokens. This is the same encoding used across all current OpenAI models including GPT-5.x and GPT-4.1.
How to count tokens for the OpenAI API?
You can use this free online counter for estimates. For exact counts, use the official tiktoken Python library: pip install tiktoken, then use tiktoken.encoding_for_model('gpt-4o') to get the encoder.
What is tiktoken?
tiktoken is OpenAI's open-source tokenizer library for Python. It provides exact token counts for all OpenAI models. It supports multiple encodings including o200k_base (current models) and cl100k_base (legacy models like GPT-4 Turbo).
What is the difference between GPT-5 and GPT-5.1?
GPT-5.1 is an updated version of GPT-5 with improved reasoning and instruction following. Both use the same o200k_base tokenizer and 400K context window, and have the same input pricing ($1.25/1M tokens).
How much does GPT-5 cost per 1000 tokens?
GPT-5 costs $0.00125 per 1K input tokens ($1.25 per 1M) and $0.01 per 1K output tokens ($10.00 per 1M). GPT-5 Mini is significantly cheaper at $0.00025 per 1K input tokens. Across OpenAI's current lineup, output tokens cost several times more than input tokens.
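The arithmetic above is easy to sketch (a hypothetical helper; the default rates are GPT-5's list prices from the table, which may change):

```python
def openai_cost(input_tokens: int, output_tokens: int,
                in_per_m: float = 1.25, out_per_m: float = 10.00) -> float:
    """Cost in USD given per-1M-token rates (defaults: GPT-5 list prices)."""
    return input_tokens / 1_000_000 * in_per_m + output_tokens / 1_000_000 * out_per_m

print(openai_cost(1000, 1000))  # 0.00125 + 0.01 = 0.01125
```

Passing in the rates from the comparison table lets the same helper cover GPT-5 Mini, GPT-4o, and the GPT-4.1 family.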
Token Counters by Provider
Pricing data as of February 7, 2026. Prices change frequently -- always verify with the official provider documentation: OpenAI | Anthropic | Google Gemini | Groq | Together AI
Privacy & Limitations
- All calculations run entirely in your browser -- nothing is sent to any server.
- Results are estimates and may vary based on actual conditions.
Related Tools
- Llama Token Counter -- Count tokens and estimate costs for Meta Llama 4, 3.3, and open-source LLM models
- AI Token Counter -- Estimate tokens and characters for a prompt
- OpenAI Cost Calculator -- Estimate API cost from token counts
- Claude Token Counter -- Count tokens and estimate costs for Claude Opus 4.6, Sonnet, and Anthropic models
- Gemini Token Counter -- Count tokens and estimate costs for Google Gemini 3 Pro, 2.5 Pro and Flash models