ℹ Batch discounts applied automatically — OpenAI: 50%, Anthropic: 50%, others: standard rate.
Estimate your monthly AI API spend across OpenAI, Anthropic Claude, Google Gemini, Mistral, DeepSeek and more. Compare 15+ models instantly — free, no signup required.
✦ 15 models · April 2026 pricing · Batch + caching support
Enter your expected monthly request volume, average input tokens per request (your prompt + any context), and average output tokens (the AI's response length). The calculator instantly computes your estimated monthly spend and ranks all 15+ models from cheapest to most expensive for your exact workload.
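The core arithmetic behind the ranking can be sketched as follows. This is a minimal illustration, not the calculator's actual code; the model names and per-million-token prices shown are the examples quoted later on this page.

```python
# Illustrative prices in USD per million tokens: (input, output).
PRICES = {
    "Gemini Flash Lite": (0.075, 0.30),
    "Mistral Small 3.1": (0.10, 0.30),
    "DeepSeek V3": (0.27, 1.10),
}

def monthly_cost(requests, in_tokens, out_tokens, in_price, out_price):
    """Monthly spend = requests x (input token cost + output token cost)."""
    per_request = in_tokens / 1e6 * in_price + out_tokens / 1e6 * out_price
    return requests * per_request

# Example workload: 100k requests/month, 2,000 input and 500 output tokens each.
ranked = sorted(
    (monthly_cost(100_000, 2000, 500, *p), name) for name, p in PRICES.items()
)
for cost, name in ranked:
    print(f"{name}: ${cost:,.2f}/mo")
```

For this workload, Gemini Flash Lite comes to $30/month and DeepSeek V3 to $109/month, which is the kind of cheapest-to-most-expensive ordering the calculator produces.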
Tokens are chunks of text — roughly 0.75 words per token in English. Input tokens are everything you send to the model: the system prompt, user message, and any conversation history or retrieved context. Output tokens are the model's response. Output tokens cost significantly more — typically 3–5× the input price — because generating text requires more compute than reading it.
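The 0.75-words-per-token rule of thumb gives a quick way to estimate token counts from a word count, as in this small sketch:

```python
# Rough token estimate for English text, assuming ~0.75 words per token
# (equivalently, ~1.33 tokens per word). Real tokenizers vary by model.
def estimate_tokens(word_count):
    return round(word_count / 0.75)

# A 300-word prompt works out to roughly 400 tokens.
print(estimate_tokens(300))  # → 400
```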
If your prompts include a large, repeated system prompt or static context (common in RAG, agents, and chatbots), prompt caching can reduce input costs by 50–90%. Anthropic Claude and OpenAI both support caching. Use the Advanced tab to model this saving.
If your workload is not latency-sensitive — bulk data processing, evaluation runs, overnight jobs — OpenAI's Batch API and Anthropic's Message Batches both offer 50% off standard pricing. Switch to the Batch Mode tab to calculate your discounted cost.
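The batch discount itself is a flat multiplier on your standard monthly cost, per the 50% figure above:

```python
# Batch pricing = standard pricing minus the flat 50% discount offered by
# the OpenAI Batch API and Anthropic Message Batches.
BATCH_DISCOUNT = 0.50

def batch_cost(standard_monthly_cost, discount=BATCH_DISCOUNT):
    return standard_monthly_cost * (1 - discount)

# A $240/month real-time workload drops to $120/month in batch mode.
print(batch_cost(240.00))  # → 120.0
```

The trade-off is turnaround time: batch jobs are processed asynchronously, so this only suits workloads that can wait.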
For high-volume, cost-sensitive workloads: Google Gemini Flash Lite ($0.075/$0.30 per million tokens) and Mistral Small 3.1 ($0.10/$0.30) are the most affordable capable options. DeepSeek V3 ($0.27/$1.10) is excellent value for complex tasks. For maximum capability, GPT-4.1 and Claude Sonnet 4 offer the best quality-to-cost ratio.