AI Token Cost Calculator

Compare API pricing across 43+ LLMs. Estimate your monthly costs for Gemini 3.1, Claude Opus 4.7, GPT-5.4, Gemma 4, GLM-5.1, Canva AI 2.0, DeepSeek, Llama 4, and more.

Your Usage

~750 words = 1,000 tokens
Average response length
Quick presets: Cheapest Option · Best Value (Quality/Price) · Most Expensive
Total Tokens / Period

Price Comparison

Model | Input $/1M | Output $/1M | Total Cost | Context | Quality

Frequently Asked Questions

How are AI token costs calculated?

AI APIs charge per token processed. Input tokens (your prompt) and output tokens (the response) are priced separately, and most providers price per 1 million tokens. Your total cost = ((input tokens × input price) + (output tokens × output price)) × number of requests.
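The formula above can be sketched in a few lines of Python. The token counts and per-million prices in the example are hypothetical, chosen only to illustrate the arithmetic:

```python
def monthly_cost(input_tokens, output_tokens,
                 input_price_per_m, output_price_per_m, requests):
    """Total cost: per-request token cost times number of requests.

    Prices are quoted per 1 million tokens, as most providers do.
    """
    per_request = (input_tokens * input_price_per_m
                   + output_tokens * output_price_per_m) / 1_000_000
    return per_request * requests

# Hypothetical workload: 2,000 input + 500 output tokens per request,
# 10,000 requests/month, at $2/M input and $12/M output pricing.
cost = monthly_cost(2_000, 500, 2.00, 12.00, 10_000)
print(f"${cost:.2f}")  # → $100.00
```

Note that the request count multiplies the *sum* of input and output cost, not just the output term.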

What is a token?

A token is roughly 3/4 of a word in English. 1,000 tokens ≈ 750 words. Code tends to use more tokens per character than natural language. Different models use different tokenizers, so exact counts vary slightly.

Which AI model is cheapest?

As of April 2026, DeepSeek V3 and Google Gemini 2.0 Flash offer the lowest per-token pricing. However, the cheapest model may not be the best value — consider quality, speed, and context window size for your use case.

Claude vs GPT-4o: which costs less?

Claude Sonnet 4 is generally cheaper than GPT-4o for most workloads, especially for output-heavy tasks. Claude Opus 4 costs more than GPT-4o. Use the calculator above to compare with your specific usage pattern.

How can I reduce my AI API costs?

Key strategies: 1) Use prompt caching to avoid re-processing repeated context. 2) Choose the right model tier — use smaller models for simple tasks. 3) Optimize prompts to reduce input tokens. 4) Batch requests where possible. 5) Use cached/discounted input pricing when available.

What is prompt caching?

Prompt caching lets you reuse previously processed input tokens at a discount (typically 50-90% off). Anthropic, OpenAI, and Google all offer caching. It's especially valuable for RAG applications or system prompts that repeat across requests.
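The savings from caching depend on how much of your input is repeated. A sketch of the arithmetic, using a hypothetical 80% cache-hit rate, $5/M base input price, and a 90% cache discount:

```python
def cached_input_cost(total_input_tokens, cache_hit_fraction,
                      base_price_per_m, cache_discount):
    """Input cost when a fraction of tokens hit the cache at a discounted rate.

    cache_hit_fraction: share of input tokens served from cache (0.0–1.0)
    cache_discount: discount on cached tokens (e.g. 0.9 = 90% off)
    """
    hit = total_input_tokens * cache_hit_fraction
    miss = total_input_tokens - hit
    return (miss * base_price_per_m
            + hit * base_price_per_m * (1 - cache_discount)) / 1_000_000

# Hypothetical: 10M input tokens/month, 80% cache hits, $5/M, 90% discount.
# Without caching this input would cost $50; with caching:
print(cached_input_cost(10_000_000, 0.8, 5.00, 0.90))  # → ~$14
```

With a system prompt or RAG context repeated across most requests, the hit fraction is high and the discount dominates the bill.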

How much does Gemma 4 cost vs Gemini?

Gemma 4 is Google's open-weight model family released April 2026, with 31B and 26B MoE variants. Since the weights are open, you can self-host it at no license cost (you pay only for compute). Via API providers like Google AI Studio, the 31B model costs roughly $0.15/M input and $0.60/M output tokens — significantly cheaper than Gemini 2.5 Pro. The 26B MoE variant is even cheaper at $0.06/M input. Quality is lower than the Gemini Pro models, but it's ideal for edge/local deployment or budget-constrained applications.

How much does Gemini 3.1 Pro cost?

Gemini 3.1 Pro, Google's latest flagship released April 2026, costs $2.00 per million input tokens and $12.00 per million output tokens. It features a massive 2M token context window, real-time voice and vision processing, and strong multi-task reasoning. The Flash-Lite variant starts at just $0.25/M input — ideal for high-volume, cost-sensitive applications.

Gemini 3.1 Pro vs Claude Opus 4.7: which is better value?

Gemini 3.1 Pro ($2/M input, $12/M output) is cheaper than Claude Opus 4.7 ($5/$25) — 2.5x cheaper on input and roughly 2x on output. Gemini 3.1 offers a 2M context window vs Opus's 1M. Both excel at complex reasoning. Claude Opus 4.7 leads on autonomous agent tasks and self-verification; Gemini 3.1 Pro leads on multimodal real-time processing. For cost-sensitive workloads, Gemini 3.1 Pro delivers better value.

How much does Claude Opus 4.7 cost?

Claude Opus 4.7, the latest Anthropic flagship released April 16, 2026, costs $5 per million input tokens and $25 per million output tokens — same as Opus 4.6. It supports 1M+ token context with prompt caching at $0.50/M for cache hits (up to 90% savings). Batch API offers 50% off. Opus 4.7 adds improved vision (3x image resolution), better self-verification, and stronger performance on complex reasoning and autonomous coding tasks.

Claude Opus 4.7 vs GPT-4.1: which is better for coding?

Claude Opus 4.7 ($5/$25 per M tokens) and GPT-4.1 ($2/$8) are both top coding models in April 2026. Opus 4.7 scores higher on complex reasoning, autonomous agent tasks, and self-verification, while GPT-4.1 is cheaper and has a 1M context window. For heavy coding workloads, GPT-4.1 costs ~60% less; for hard reasoning and agentic tasks, Opus 4.7 delivers superior results.
