Prices updated April 2026

AI API Cost Calculator

Estimate your monthly AI API spend across OpenAI, Anthropic Claude, Google Gemini, Mistral, DeepSeek and more. Compare 15+ models instantly — free, no signup required.

✦ 15 models · April 2026 pricing · Batch + caching support
[Interactive calculator: enter monthly requests, average input tokens, and average output tokens per request, then select a model to see estimated monthly cost, input/output cost breakdown, cache savings, per-request cost, and an annual estimate. Batch discounts are applied automatically — OpenAI: 50%, Anthropic: 50%, others: standard rate. Output tokens cost 3–5× more than input; prices are taken from provider docs.]
Full Model Comparison — Your Workload

[Interactive comparison table — click any row to select a model.]

How to Use This AI API Cost Calculator

Enter your expected monthly request volume, average input tokens per request (your prompt + any context), and average output tokens (the AI's response length). The calculator instantly computes your estimated monthly spend and ranks all 15+ models from cheapest to most expensive for your exact workload.
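The calculator's core math can be sketched in a few lines. This is an illustrative sketch only — the model list and prices below are sample figures quoted elsewhere on this page, not a live pricing feed, and the function names are our own:

```python
# Sample per-million-token prices (input, output) in USD, as quoted on this page.
PRICES = {
    "GPT-4o": (2.50, 10.00),
    "Claude Sonnet 4": (3.00, 15.00),
    "Gemini Flash Lite": (0.075, 0.30),
    "DeepSeek V3": (0.27, 1.10),
}

def monthly_cost(requests, in_tok, out_tok, in_price, out_price):
    """Estimated USD spend for one month of traffic at the given token prices."""
    input_cost = (requests * in_tok / 1e6) * in_price
    output_cost = (requests * out_tok / 1e6) * out_price
    return input_cost + output_cost

def rank_models(requests=10_000, in_tok=500, out_tok=300):
    """Rank all models from cheapest to most expensive for this workload."""
    costs = {name: monthly_cost(requests, in_tok, out_tok, *p)
             for name, p in PRICES.items()}
    return sorted(costs.items(), key=lambda kv: kv[1])
```

With the default workload (10,000 requests × 500 input + 300 output tokens), GPT-4o works out to 5M × $2.50 + 3M × $10.00 = $42.50/month.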

What Are Input vs Output Tokens?

Tokens are chunks of text — roughly 0.75 words per token in English. Input tokens are everything you send to the model: the system prompt, user message, and any conversation history or retrieved context. Output tokens are the model's response. Output tokens cost significantly more — typically 3–5× the input price — because generating text requires more compute than reading it.

Understanding Prompt Caching

If your prompts include a large, repeated system prompt or static context (common in RAG, agents, and chatbots), prompt caching can reduce input costs by 50–90%. Anthropic Claude and OpenAI both support caching. Use the Advanced tab to model this saving.
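The effect of caching on input cost can be modeled as a blend of cached and uncached token prices. A minimal sketch, assuming a single cache hit rate across your traffic (the hit rate and the $1.25/M cached GPT-4o price used in the example are illustrative figures from this page):

```python
def input_cost_with_cache(million_tokens, base_price, cached_price, hit_rate):
    """Blend cached and uncached input-token pricing.

    hit_rate: fraction of input tokens served from the prompt cache (0..1).
    """
    blended_price = hit_rate * cached_price + (1 - hit_rate) * base_price
    return million_tokens * blended_price
```

For 5M input tokens/month at GPT-4o rates with 80% of each prompt cacheable, input cost drops from $12.50 to $7.50.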

When to Use Batch API

If your workload is not latency-sensitive — bulk data processing, evaluation runs, overnight jobs — OpenAI's Batch API and Anthropic's Message Batches both offer 50% off standard pricing. Switch to the Batch Mode tab to calculate your discounted cost.
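Since both batch APIs quoted on this page take a flat 50% off standard token prices, batch cost is just the standard formula with the discount applied. A self-contained sketch (function name is our own):

```python
BATCH_DISCOUNT = 0.50  # 50% off, per OpenAI Batch API / Anthropic Message Batches

def batch_monthly_cost(requests, in_tok, out_tok, in_price, out_price):
    """Standard monthly cost with the flat batch discount applied."""
    standard = ((requests * in_tok / 1e6) * in_price
                + (requests * out_tok / 1e6) * out_price)
    return standard * (1 - BATCH_DISCOUNT)
```

The default workload on GPT-4o drops from $42.50 to $21.25/month in batch mode.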

Which AI API Is the Cheapest in 2026?

For high-volume, cost-sensitive workloads: Google Gemini Flash Lite ($0.075/$0.30 per million tokens) and Mistral Small 3.1 ($0.10/$0.30) are the most affordable capable options. DeepSeek V3 ($0.27/$1.10) is excellent value for complex tasks. For maximum capability, GPT-4.1 and Claude Sonnet 4 offer the best quality-to-cost ratio.

Frequently Asked Questions

How much does GPT-4o cost?

GPT-4o costs $2.50 per million input tokens and $10.00 per million output tokens as of April 2026. With prompt caching enabled, cached input tokens drop to $1.25/M. The Batch API reduces both by 50%.

How much does Claude cost?

Claude Sonnet 4 costs $3.00/M input and $15.00/M output. Claude Haiku 3.5 is $0.80/$4.00 — ideal for high-volume tasks. Claude Opus 4 is $15.00/$75.00 for maximum capability. All support 50% batch discounts and aggressive prompt caching.

What is the cheapest AI API?

As of April 2026: Google Gemini Flash Lite at $0.075/$0.30, Mistral Small 3.1 at $0.10/$0.30, and DeepSeek V3 at $0.27/$1.10 per million tokens are the cheapest capable options. Google AI Studio also offers a generous free tier on Gemini models.

How do you calculate AI API costs?

Formula: (monthly_requests × avg_input_tokens / 1,000,000 × input_price) + (monthly_requests × avg_output_tokens / 1,000,000 × output_price). For example: 10,000 requests × 500 input tokens = 5M input tokens. At GPT-4o pricing ($2.50/M): 5 × $2.50 = $12.50 for input alone.

Does OpenAI have a free API tier?

OpenAI does not offer a persistent free API tier — you pay per token from the start, though new accounts receive trial credits. Google AI Studio offers the most generous free tier in 2026, with free access to Gemini Flash at moderate rate limits.

What's the difference between GPT-4.1 and GPT-4o?

GPT-4.1 has a 1M token context window vs GPT-4o's 128k, improved instruction following, and lower pricing ($2.00/$8.00 vs $2.50/$10.00). GPT-4.1 is generally the better choice for new projects in 2026.