OpenAI Token Counter -- GPT-5, GPT-4o Tokens

Count tokens and estimate API costs for all OpenAI models

OpenAI Token Counter

OpenAI Token Counter estimates how many tokens your text will use with GPT-5.x, GPT-4o, GPT-4.1, and other OpenAI models. All current OpenAI models use the o200k_base encoding (~4 characters per token for English).


This counter estimates raw token counts and does not account for prompt caching. OpenAI offers reduced input pricing for cached prompt prefixes on supported models; check the OpenAI pricing page for details.

OpenAI Model Comparison

All current OpenAI models use the o200k_base encoding with a vocabulary of approximately 200,000 tokens.

Model           Context   Input / 1M   Output / 1M   Encoding
GPT-5.3-Codex   ~400K     TBD*         TBD*          o200k_base
GPT-5.2         400K      $1.75        $14.00        o200k_base
GPT-5.2-Codex   400K      $1.75        $14.00        o200k_base
GPT-5.1         400K      $1.25        $10.00        o200k_base
GPT-5.1-Codex   400K      $1.25        $10.00        o200k_base
GPT-5           400K      $1.25        $10.00        o200k_base
GPT-5 Mini      400K      $0.25        $2.00         o200k_base
GPT-4o          128K      $2.50        $10.00        o200k_base
GPT-4o mini     128K      $0.15        $0.60         o200k_base
GPT-4.1         1M        $2.00        $8.00         o200k_base
GPT-4.1 Mini    1M        $0.40       $1.60          o200k_base
GPT-4.1 Nano    1M        $0.10       $0.40          o200k_base

*GPT-5.3-Codex released Feb 5, 2026 -- API pricing not yet announced; currently ChatGPT-only.

How OpenAI Tokenization Works

OpenAI models use Byte Pair Encoding (BPE) to split text into tokens. The current tokenizer is called o200k_base, which has a vocabulary of approximately 200,000 tokens. This is the same encoding used by all GPT-5.x, GPT-4o, and GPT-4.1 models.

What is BPE?

Byte Pair Encoding starts with individual bytes and iteratively merges the most frequent pairs into new tokens. This creates a vocabulary where common words like "the" or "and" are single tokens, while rare words get split into subword pieces.
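The merge step above can be sketched in a few lines of Python. This is a toy illustration of the BPE idea, not OpenAI's actual tokenizer: it starts from individual characters rather than bytes, and the helper names (`most_frequent_pair`, `merge_pair`) are invented for this example.

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Count adjacent token pairs and return the most frequent one."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return max(pairs, key=pairs.get)

def merge_pair(tokens, pair):
    """Replace every occurrence of `pair` with a single merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

# Start from individual characters (real BPE starts from raw bytes)
# and apply a handful of merges; frequent sequences like "the"
# gradually collapse into single tokens.
tokens = list("the theme and the anthem")
for _ in range(5):
    tokens = merge_pair(tokens, most_frequent_pair(tokens))
```

Each merge shortens the token sequence without changing the underlying text, which is exactly why common words end up as single tokens after enough training merges.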

The tiktoken Library

OpenAI provides an open-source Python library called tiktoken for exact token counting. To use it:

import tiktoken

enc = tiktoken.encoding_for_model("gpt-4o")
tokens = enc.encode("Hello, world!")
print(len(tokens))  # 4

Token Estimation Rules

  • English text: ~4 characters per token (1 word = ~1.3 tokens)
  • Code: ~3-3.5 characters per token (symbols split into separate tokens)
  • CJK text: ~1.5-2 characters per token on average (rare characters can split into multiple tokens)
  • Numbers: each digit or small group is typically 1 token
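The rules above can be turned into a quick back-of-the-envelope estimator. This is a rough sketch using the chars-per-token ratios listed here (the function name and category keys are invented for this example); for exact counts, use tiktoken.

```python
def estimate_tokens(text: str, kind: str = "english") -> int:
    """Rough token estimate from character count, using the
    chars-per-token heuristics above (midpoints for ranges)."""
    chars_per_token = {
        "english": 4.0,   # ~4 chars/token
        "code": 3.25,     # ~3-3.5 chars/token
        "cjk": 1.75,      # ~1.5-2 chars/token
    }
    return max(1, round(len(text) / chars_per_token[kind]))
```

For example, a 400-character English paragraph estimates to about 100 tokens, while the same number of characters of code estimates to roughly 123.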

Frequently Asked Questions

How many tokens does GPT-5 use per word?

GPT-5 uses the o200k_base tokenizer, which averages about 1.3 tokens per English word (or roughly 4 characters per token). Technical terms and code use more tokens per word. Non-English languages, especially CJK, use significantly more tokens.

What encoding does GPT-4o use?

GPT-4o uses the o200k_base encoding with a vocabulary of approximately 200,000 tokens. This is the same encoding used across all current OpenAI models including GPT-5.x and GPT-4.1.

How do I count tokens for the OpenAI API?

You can use this free online counter for estimates. For exact counts, use the official tiktoken Python library: pip install tiktoken, then use tiktoken.encoding_for_model('gpt-4o') to get the encoder.

What is tiktoken?

tiktoken is OpenAI's open-source tokenizer library for Python. It provides exact token counts for all OpenAI models. It supports multiple encodings including o200k_base (current models) and cl100k_base (legacy models like GPT-4 Turbo).

What is the difference between GPT-5 and GPT-5.1?

GPT-5.1 is an updated version of GPT-5 with improved reasoning and instruction following. Both use the same o200k_base tokenizer and 400K context window, and have the same input pricing ($1.25/1M tokens).

How much does GPT-5 cost per 1000 tokens?

GPT-5 costs $0.00125 per 1K input tokens ($1.25 per 1M) and $0.01 per 1K output tokens ($10.00 per 1M). GPT-5 Mini is significantly cheaper at $0.00025 per 1K input tokens. Output tokens always cost more than input tokens.
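The per-request arithmetic is straightforward to script. A minimal sketch, using the GPT-5 rates quoted above (the function name is invented for this example; always verify current prices against OpenAI's pricing page):

```python
# GPT-5 pricing from the comparison table above (USD per 1M tokens).
GPT5_INPUT_PER_M = 1.25
GPT5_OUTPUT_PER_M = 10.00

def gpt5_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated GPT-5 API cost in USD for one request."""
    return (input_tokens * GPT5_INPUT_PER_M
            + output_tokens * GPT5_OUTPUT_PER_M) / 1_000_000

# 1,000 input tokens + 1,000 output tokens:
# 0.00125 + 0.01 = 0.01125 USD
```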

Token Counters by Provider

Pricing data as of February 7, 2026. Prices change frequently -- always verify with the official provider documentation: OpenAI | Anthropic | Google Gemini | Groq | Together AI

Privacy & Limitations

  • All calculations run entirely in your browser -- nothing is sent to any server.
  • Results are estimates and may vary based on actual conditions.


