Google Gemini API pricing (2026): the per-token cost table

Short answer: Per million tokens, the Gemini API costs $0.10 in / $0.40 out for Gemini 2.5 Flash-Lite (the cheapest model from any major provider), $0.30 / $2.50 for Gemini 2.5 Flash, and $1.25 / $10 for Gemini 2.5 Pro (up to 200K tokens; $2.50 / $15 above that). Newer options: Gemini 3.5 Flash $1.50 / $9 and Gemini 3.1 Pro Preview $2 / $12. The Batch API takes 50% off and context caching cuts cached input ~90%. Verified against Google's official pricing page, June 2026.

Gemini API model pricing (per 1M tokens)

ModelInputOutputBest for
Gemini 2.5 Flash-Lite (cheapest)$0.10$0.40High-volume classification, routing, extraction
Gemini 2.5 Flash$0.30$2.50Best value workhorse for most tasks
Gemini 3.1 Flash-Lite$0.25$1.50Newer lite tier, stronger reasoning
Gemini 3.5 Flash$1.50$9.00Latest Flash; higher quality, native thinking
Gemini 2.5 Pro (≤200K)$1.25$10.00Flagship reasoning at low cost
Gemini 2.5 Pro (>200K)$2.50$15.00Long-context flagship
Gemini 3.1 Pro Preview (≤200K)$2.00$12.00Newest Pro preview; hardest reasoning

Prices are USD per million tokens (MTok), standard paid tier. Flash and Flash-Lite use a single flat rate; Gemini 2.5 Pro and 3.1 Pro Preview charge more above 200K tokens. Note: Gemini 2.0 Flash and 2.0 Flash-Lite were shut down on June 1, 2026 — migrate to the 2.5 line.

What is the cheapest Gemini model?

Gemini 2.5 Flash-Lite, at $0.10 / $0.40 per million tokens, is the cheapest Gemini model — and the cheapest production model from any major LLM provider. Three stacking levers cut it further:

For how Gemini stacks up against the other providers, see cheapest LLM API: OpenAI vs Anthropic vs Gemini.

Batch API and context caching discounts

FeatureDiscountApplies to
Batch API50% off in & outAsync jobs (24-hour SLA), all models
Context caching~90% off cached inputRepeated context, plus per-hour storage fee
Free tierNo charge (rate-limited)Testing and low-volume use

Worked example: cost of a typical Gemini 2.5 Flash request

A request that sends 20,000 input tokens and generates 2,000 output tokens on Gemini 2.5 Flash:

The same request on Gemini 2.5 Flash-Lite costs (20,000 × $0.10 + 2,000 × $0.40) / 1,000,000 = $0.0028 — about 4x cheaper, which is why Flash-Lite is the right pick for high-volume, low-complexity work.

How Gemini API pricing compares

Gemini is the price leader across the budget and mid tiers in 2026. Gemini 2.5 Flash-Lite ($0.10/$0.40) undercuts OpenAI's GPT-5.4-nano ($0.20/$1.25) and Anthropic's Claude Haiku 4.5 ($1/$5); Gemini 2.5 Flash ($0.30/$2.50) beats GPT-5.4-mini ($0.75/$4.50). Even the flagship Gemini 2.5 Pro ($1.25 input) is cheaper than GPT-5.5 and Claude Opus 4.8 (both $5 input). Full breakdown: Cheapest LLM API, OpenAI API pricing, and Anthropic API pricing.