LLM Cost Calculator

How LLM pricing works

Most LLM providers charge per million tokens, with separate rates for input (your prompt) and output (the model's response). Output tokens are usually 3–5x more expensive than input tokens because generation is computationally heavier: the model produces its response one token at a time.

The formula is simple:

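cost_per_request = (input_tokens / 1M) * input_price + (output_tokens / 1M) * output_price

Here input_tokens and output_tokens are the token counts for a single request, and both prices are quoted per million tokens. Multiply by your monthly request volume to get your total bill.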
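
As a rough illustration, the calculation above can be sketched in a few lines of Python. The function names, token counts, prices, and request volume below are hypothetical placeholders chosen for the example, not any provider's actual rates.

```python
def cost_per_request(input_tokens, output_tokens, input_price, output_price):
    """Cost of one request; prices are quoted per million tokens."""
    input_cost = (input_tokens / 1_000_000) * input_price
    output_cost = (output_tokens / 1_000_000) * output_price
    return input_cost + output_cost


def monthly_cost(requests_per_month, input_tokens, output_tokens,
                 input_price, output_price):
    """Total monthly bill, assuming every request has the same token counts."""
    return requests_per_month * cost_per_request(
        input_tokens, output_tokens, input_price, output_price
    )


# Hypothetical workload: 2,000 input and 500 output tokens per request,
# priced at 3.00 (input) and 15.00 (output) per million tokens,
# with 100,000 requests per month.
per_request = cost_per_request(2_000, 500, 3.00, 15.00)
monthly = monthly_cost(100_000, 2_000, 500, 3.00, 15.00)

print(f"Per request: {per_request:.4f}")  # about 0.0135
print(f"Per month:   {monthly:,.2f}")     # about 1,350.00
```

In this example the 500 output tokens cost more than the 2,000 input tokens, which shows why the input/output split matters as much as the total token count.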

Key cost factors

Input vs output ratio: Prompts with short outputs (summarization, classification) are cheaper than open-ended generation
Model tier: Frontier models (Claude 3 Opus, GPT-4 Turbo) cost 10–100x more than small models (Haiku, GPT-4o Mini)
Prompt caching: Anthropic and OpenAI offer caching discounts of up to 90% on repeated context
Batch mode: Many providers offer 50% discounts for non-real-time batch workloads
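
To see how caching and batch discounts change the estimate, here is a minimal Python sketch. It makes several simplifying assumptions that may not match a given provider's billing: a fixed fraction of input tokens is served from cache at a discounted rate, cache writes cost nothing extra, and the batch discount applies to the whole request and stacks with caching. The function name, parameters, and prices are hypothetical.

```python
def discounted_request_cost(input_tokens, output_tokens, input_price, output_price,
                            cached_fraction=0.0, cache_discount=0.9,
                            batch=False, batch_discount=0.5):
    """Estimate one request's cost with optional caching and batch discounts.

    Prices are per million tokens. The discount rates are assumptions taken
    from the "up to 90%" and "50%" figures above, not guaranteed values.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")

    cached_tokens = input_tokens * cached_fraction
    fresh_tokens = input_tokens - cached_tokens

    # Fresh input is billed at full price; cached input at the discounted rate.
    input_cost = (fresh_tokens / 1_000_000) * input_price
    input_cost += (cached_tokens / 1_000_000) * input_price * (1 - cache_discount)
    output_cost = (output_tokens / 1_000_000) * output_price

    total = input_cost + output_cost
    if batch:
        # Assumption: the batch discount applies to the entire request.
        total *= 1 - batch_discount
    return total


# Hypothetical request: 10,000 input tokens, 500 output tokens,
# priced at 3.00 (input) and 15.00 (output) per million tokens.
baseline = discounted_request_cost(10_000, 500, 3.00, 15.00)
with_cache = discounted_request_cost(10_000, 500, 3.00, 15.00, cached_fraction=0.8)
with_batch = discounted_request_cost(10_000, 500, 3.00, 15.00, batch=True)

print(f"Baseline:   {baseline:.5f}")    # about 0.03750
print(f"80% cached: {with_cache:.5f}")  # about 0.01590
print(f"Batch mode: {with_batch:.5f}")  # about 0.01875
```

Under these assumptions, caching helps most when prompts are long and repetitive, because it only reduces the input side of the bill; output tokens are still charged at full price.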

Related tools

AI Token Calculator — Count tokens for any model before estimating costs
Context Window Calculator — Visualize your context usage

