Cheapest LLM API (2026): OpenAI vs Anthropic vs Google Gemini
Short answer: The cheapest major LLM API in 2026 is Google Gemini 2.5 Flash-Lite at $0.10 / $0.40 per million input/output tokens — cheaper than OpenAI's GPT-5.4-nano ($0.20 / $1.25) and well below Anthropic's cheapest, Claude Haiku 4.5 ($1 / $5). For mid-tier quality, Gemini 2.5 Flash ($0.30 / $2.50) is the value leader, ahead of GPT-5.4-mini ($0.75 / $4.50). At the flagship tier the providers converge — OpenAI GPT-5.5 and Claude Opus 4.8 both charge $5 input. All three offer a 50% Batch discount and ~90% off cached input. Verified against official pricing pages, June 2026.
Cheapest model per provider (the budget tier)
| Provider | Cheapest model | Input /1M | Output /1M |
|---|---|---|---|
| Google Gemini 🏆 | Gemini 2.5 Flash-Lite | $0.10 | $0.40 |
| OpenAI | GPT-5.4-nano | $0.20 | $1.25 |
| Anthropic | Claude Haiku 4.5 | $1.00 | $5.00 |
On the budget tier, Gemini 2.5 Flash-Lite is the cheapest LLM API outright — half the input cost of GPT-5.4-nano and a tenth of Claude Haiku's input. For high-volume, low-complexity work (classification, routing, extraction, tagging), this is the single biggest cost lever you have.
Full cross-provider per-token comparison
Prices are USD per million tokens (MTok), standard synchronous tier, paid plans.
| Tier | Model | Input | Output |
|---|---|---|---|
| Budget | Gemini 2.5 Flash-Lite | $0.10 | $0.40 |
| GPT-5.4-nano | $0.20 | $1.25 | |
| Claude Haiku 4.5 | $1.00 | $5.00 | |
| Mid | Gemini 2.5 Flash | $0.30 | $2.50 |
| GPT-5.4-mini | $0.75 | $4.50 | |
| Gemini 2.5 Pro (≤200K) | $1.25 | $10.00 | |
| GPT-5.4 | $2.50 | $15.00 | |
| Flagship | Claude Sonnet 4.6 | $3.00 | $15.00 |
| Claude Opus 4.8 | $5.00 | $25.00 | |
| OpenAI GPT-5.5 | $5.00 | $30.00 |
Gemini 2.5 Pro also has a long-context tier: above 200K tokens it rises to $2.50 / $15. OpenAI's GPT-5.5-pro ($30 / $180) and Gemini's 3.1 Pro Preview ($2 / $12) sit above the general flagships for the hardest reasoning.
Winner: which is cheapest for what
- Cheapest API overall: Google Gemini 2.5 Flash-Lite ($0.10 / $0.40) — nothing from OpenAI or Anthropic undercuts it.
- Best value mid-tier: Gemini 2.5 Flash ($0.30 / $2.50) — strong quality at a fraction of flagship cost.
- Cheapest from OpenAI: GPT-5.4-nano ($0.20 / $1.25), with GPT-5.4-mini ($0.75 / $4.50) the mid pick.
- Cheapest from Anthropic: Claude Haiku 4.5 ($1 / $5) — pricier per token, but often chosen for output quality.
- Cheapest flagship reasoning: Gemini 2.5 Pro ($1.25 / $10) is materially cheaper than GPT-5.5 ($5 / $30) and Claude Opus 4.8 ($5 / $25).
The headline: Google is the price leader across budget and mid tiers in 2026, and even its flagship undercuts OpenAI and Anthropic on input. Pick Claude or GPT-5.5 when output quality justifies the premium; pick Gemini when cost-per-token is the deciding factor.
Cut costs further (works on all three)
- Right-size the model: route simple tasks to the budget tier; reserve flagships for genuinely hard reasoning. Biggest single lever.
- Batch API: a flat 50% discount on OpenAI, Anthropic, and Gemini for asynchronous (non-time-sensitive) jobs.
- Prompt / context caching: repeated context (system prompts, long documents) is discounted ~90% on all three.
Worked example: 1M input + 200K output on Gemini 2.5 Flash-Lite is (1,000,000 × $0.10 + 200,000 × $0.40) / 1,000,000 = $0.18. The same workload on Claude Haiku 4.5 is $2.00, and on GPT-5.4-nano $0.45. With the Batch API, the Gemini cost halves to $0.09.
Frequently asked
What is the cheapest LLM API in 2026?
Google Gemini 2.5 Flash-Lite at $0.10 input / $0.40 output per million tokens. It is cheaper than OpenAI's GPT-5.4-nano ($0.20 / $1.25) and Anthropic's Claude Haiku 4.5 ($1 / $5), and drops to $0.05 / $0.20 via the Batch API.
Is Gemini cheaper than OpenAI and Claude?
At the budget and mid tiers, clearly yes. At the flagship tier the gap narrows but Gemini still leads on input price: Gemini 2.5 Pro is $1.25 input vs $5 for both GPT-5.5 and Claude Opus 4.8.
Is the cheapest API the right choice?
Not always. Budget models trade some reasoning quality for price. For classification, routing, extraction, and high-volume simple tasks, the cheapest tier is usually fine. For nuanced writing, complex agents, and hard reasoning, a mid or flagship model often pays for itself in fewer retries.
Where can I see each provider's full pricing?
See our dedicated references: Anthropic API pricing, OpenAI API pricing, and the head-to-head ChatGPT vs Claude API cost. For alternatives and routing strategy, see best OpenAI & Claude API alternatives.