OpenAI Cost Calculator - Estimate API Token Pricing

Calculate OpenAI API cost from input, cached input, and output tokens


Reference Pricing (per 1M tokens)

These are the standard API prices used by this calculator. Sources: OpenAI API Pricing, GPT-4.1 pricing, and GPT-4.1 model docs.

Model | Input | Cached input | Output

What This Tool Does

OpenAI Cost Calculator is a token pricing calculator that estimates API cost from input, cached input, and output tokens using per-1M token rates.

Use it when you need quick spend estimates before launching prompts in production or when comparing model-rate tradeoffs for the same workload.

Quick Answer

API cost is the sum of three components: input token cost, cached input token cost, and output token cost. Output often contributes the largest share when responses are long.

Formula: (input / 1,000,000 × input_rate) + (cached / 1,000,000 × cached_rate) + (output / 1,000,000 × output_rate).
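The formula can be sketched as a small Python helper. The rates in the usage line are hypothetical placeholders, not real prices; check the pricing table for current rates.

```python
def estimate_cost(input_tokens, output_tokens, cached_tokens=0,
                  *, input_rate, output_rate, cached_rate=0.0):
    """Estimate API cost in dollars from token counts and per-1M-token rates."""
    PER_MILLION = 1_000_000
    return (input_tokens / PER_MILLION * input_rate
            + cached_tokens / PER_MILLION * cached_rate
            + output_tokens / PER_MILLION * output_rate)

# Hypothetical rates per 1M tokens: $2.00 input, $0.50 cached input, $8.00 output.
cost = estimate_cost(4_000, 500, cached_tokens=2_500,
                     input_rate=2.00, cached_rate=0.50, output_rate=8.00)
print(f"${cost:.6f}")  # → $0.013250
```

Keeping the three components separate, as the calculator does, makes it easy to see which bucket dominates a given workload.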

How to Use

  1. Select a model from the dropdown.
  2. Enter input, optional cached input, and output token counts.
  3. Read the per-part costs and the total estimate, then multiply by request volume for monthly planning.

Tip: Use AI Token Counter first when you only have raw text and need token estimates.

Inputs and Outputs

Field | What it means | Used in math
Input tokens | Tokens sent in the request prompt | input / 1,000,000 × input_rate
Cached input tokens | Prompt tokens billed at the cached rate, if the model supports it | cached / 1,000,000 × cached_rate
Output tokens | Tokens generated in the response | output / 1,000,000 × output_rate
Total cost | Estimated per-request API spend | Sum of all components

Worked Examples

Scenario | Token pattern | What changes cost most
Short assistant reply | 300 input, 0 cached, 120 output | Raising max output tokens has the biggest impact.
Long document Q&A with repeated system prompt | 4,000 input, 2,500 cached, 500 output | A higher cache-hit share lowers total cost.
Edge case: output-heavy generation | 600 input, 0 cached, 4,000 output | Output pricing dominates total spend.

For monthly planning: cost_per_request × requests_per_day × days_per_month.
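With hypothetical numbers (a $0.01 per-request estimate, 5,000 requests per day, a 30-day month), the monthly formula works out as:

```python
cost_per_request = 0.01    # hypothetical per-request estimate in dollars
requests_per_day = 5_000   # hypothetical traffic
days_per_month = 30

monthly_spend = cost_per_request * requests_per_day * days_per_month
print(f"${monthly_spend:,.2f}")  # → $1,500.00
```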

Common Mistakes

  • Treating words as tokens one-to-one. Token counts vary by tokenizer and text style.
  • Ignoring output length. Long responses can cost more than the prompt.
  • Assuming cached pricing applies to all models.
  • Budgeting from stale rates without checking provider pricing updates.


Trust and Limitations

This tool runs in your browser and is designed for planning and comparison. It does not process payments or represent a billing statement. Pricing references are listed in the rate table and should be re-checked when API providers update pricing.

Related tools: AI Token Counter · JSON Size Estimator

Privacy & Limitations

  • All calculations run entirely in your browser; nothing is sent to any server.
  • Results are estimates and may vary based on actual conditions.


OpenAI Cost Calculator FAQ

How do I calculate OpenAI API cost from tokens?

Multiply each token bucket by its rate, then add the results. Cost = (input_tokens / 1,000,000 × input_rate) + (cached_tokens / 1,000,000 × cached_rate) + (output_tokens / 1,000,000 × output_rate).

What are cached input tokens?

Cached input tokens are prompt tokens that the API can reuse from previous requests when the prefix matches. These tokens are usually billed at a lower cached-input rate for models that support caching.
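A sketch of the savings from prompt caching, assuming hypothetical rates of $2.00 per 1M input tokens and $0.50 per 1M cached input tokens:

```python
PER_MILLION = 1_000_000
input_rate, cached_rate = 2.00, 0.50  # hypothetical per-1M rates

prompt_tokens = 4_000
cached_tokens = 2_500  # prefix reused from earlier requests

# Without caching, every prompt token is billed at the full input rate.
no_cache = prompt_tokens / PER_MILLION * input_rate

# With caching, the reused prefix is billed at the lower cached rate.
with_cache = ((prompt_tokens - cached_tokens) / PER_MILLION * input_rate
              + cached_tokens / PER_MILLION * cached_rate)

print(f"saved ${no_cache - with_cache:.6f} per request")  # → saved $0.003750 per request
```

The savings scale with the cache-hit share, which is why the long-document scenario above benefits so much from a stable system prompt.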

Does this calculator use official OpenAI rates?

This calculator uses published per-1M token API rates listed in the pricing table on this page and links to OpenAI pricing sources. Pricing can change, so always verify before budgeting.

Why is output often more expensive than input?

Many models bill output tokens at a higher rate than input tokens, so long responses can dominate total cost even with modest input size.

Can I estimate monthly spend with this tool?

Yes. First calculate cost per request, then multiply by your expected number of requests per day and by the number of days in a month.

What if my model has no cached input rate?

If cached pricing is not available for a selected model, cached tokens are treated as zero in this calculator and only input and output pricing are applied.

Are token estimates the same as word counts?

No. Tokens are not words. English text is often near 4 characters per token on average, but the exact count depends on tokenizer behavior and text content.
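That heuristic can serve as a rough pre-check when no tokenizer is at hand; it is an approximation only, and a real tokenizer (such as tiktoken for OpenAI models) gives accurate counts:

```python
def rough_token_estimate(text: str) -> int:
    """Very rough token estimate using the ~4-characters-per-token
    heuristic for English text. Not a substitute for a real tokenizer."""
    return max(1, round(len(text) / 4))

print(rough_token_estimate("Estimate my OpenAI API cost before launch."))
```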

Is this tool a billing invoice?

No. This tool provides an estimate for planning and comparison. Final billed amounts depend on actual API usage and provider pricing at the time of requests.
