How much does the Google Gemini API cost?

As of June 2026, Gemini API prices per million tokens are: Gemini 2.5 Flash-Lite at $0.10 input / $0.40 output (the cheapest); Gemini 2.5 Flash at $0.30 / $2.50; Gemini 2.5 Pro at $1.25 / $10 for prompts up to 200K tokens ($2.50 / $15 above 200K); Gemini 3.5 Flash at $1.50 / $9; and Gemini 3.1 Pro Preview at $2 / $12. The Batch API gives a 50% discount and context caching cuts cached input by roughly 90%.

Does Gemini charge more for long context?

Yes, for some models. Gemini 2.5 Pro and Gemini 3.1 Pro Preview use tiered pricing: prompts up to 200K tokens are billed at the standard rate, and prompts above 200K tokens are billed at a higher rate (for 2.5 Pro, $2.50 input / $15 output instead of $1.25 / $10). The Flash and Flash-Lite models use a single flat rate regardless of context length.

Does the Gemini API have a free tier?

Yes. The Gemini API offers a free tier with rate-limited access to several models, which is more generous than OpenAI's or Anthropic's API trials. Beyond the free tier, usage is billed per token at the paid rates above. The Batch API (50% off) and context caching (~90% off cached input) apply on the paid tier.

Google Gemini API pricing (2026): the per-token cost table

Q: What is the cheapest Gemini model?

Gemini 2.5 Flash-Lite is the cheapest Gemini model at $0.10 per million input tokens and $0.40 per million output tokens. With the Batch API it drops to $0.05 / $0.20, and cached input is about $0.01 per million. This is the cheapest production model from any major provider, undercutting OpenAI's GPT-5.4-nano ($0.20/$1.25) and Anthropic's Claude Haiku 4.5 ($1/$5).

Short answer: Per million tokens, the Gemini API costs $0.10 in / $0.40 out for Gemini 2.5 Flash-Lite (the cheapest model from any major provider), $0.30 / $2.50 for Gemini 2.5 Flash, and $1.25 / $10 for Gemini 2.5 Pro (up to 200K tokens; $2.50 / $15 above that). Newer options: Gemini 3.5 Flash $1.50 / $9 and Gemini 3.1 Pro Preview $2 / $12. The Batch API takes 50% off and context caching cuts cached input ~90%. Verified against Google's official pricing page, June 2026.

Gemini API model pricing (per 1M tokens)

Model	Input	Output	Best for
Gemini 2.5 Flash-Lite (cheapest)	$0.10	$0.40	High-volume classification, routing, extraction
Gemini 2.5 Flash	$0.30	$2.50	Best value workhorse for most tasks
Gemini 3.1 Flash-Lite	$0.25	$1.50	Newer lite tier, stronger reasoning
Gemini 3.5 Flash	$1.50	$9.00	Latest Flash; higher quality, native thinking
Gemini 2.5 Pro (≤200K)	$1.25	$10.00	Flagship reasoning at low cost
Gemini 2.5 Pro (>200K)	$2.50	$15.00	Long-context flagship
Gemini 3.1 Pro Preview (≤200K)	$2.00	$12.00	Newest Pro preview; hardest reasoning

Prices are USD per million tokens (MTok), standard paid tier. Flash and Flash-Lite use a single flat rate; Gemini 2.5 Pro and 3.1 Pro Preview charge more above 200K tokens. Note: Gemini 2.0 Flash and 2.0 Flash-Lite were shut down on June 1, 2026 — migrate to the 2.5 line.

What is the cheapest Gemini model?

Gemini 2.5 Flash-Lite, at $0.10 / $0.40 per million tokens, is the cheapest Gemini model — and the cheapest production model from any major LLM provider. Three stacking levers cut it further:

Batch API: a flat 50% discount — $0.05 in / $0.20 out per million.
Context caching: cached input is about $0.01 per million (roughly 90% off), plus a small per-hour storage charge.
Right-sizing: reserve 2.5 Pro for hard reasoning and route everything else to Flash-Lite or Flash.

For how Gemini stacks up against the other providers, see cheapest LLM API: OpenAI vs Anthropic vs Gemini.

Batch API and context caching discounts

Feature	Discount	Applies to
Batch API	50% off in & out	Async jobs (24-hour SLA), all models
Context caching	~90% off cached input	Repeated context, plus per-hour storage fee
Free tier	No charge (rate-limited)	Testing and low-volume use

Worked example: cost of a typical Gemini 2.5 Flash request

A request that sends 20,000 input tokens and generates 2,000 output tokens on Gemini 2.5 Flash:

Input: 20,000 × $0.30 / 1,000,000 = $0.0060
Output: 2,000 × $2.50 / 1,000,000 = $0.0050
Total: $0.011 per request

The same request on Gemini 2.5 Flash-Lite costs (20,000 × $0.10 + 2,000 × $0.40) / 1,000,000 = $0.0028 — about 4x cheaper, which is why Flash-Lite is the right pick for high-volume, low-complexity work.

How Gemini API pricing compares

Gemini is the price leader across the budget and mid tiers in 2026. Gemini 2.5 Flash-Lite ($0.10/$0.40) undercuts OpenAI's GPT-5.4-nano ($0.20/$1.25) and Anthropic's Claude Haiku 4.5 ($1/$5); Gemini 2.5 Flash ($0.30/$2.50) beats GPT-5.4-mini ($0.75/$4.50). Even the flagship Gemini 2.5 Pro ($1.25 input) is cheaper than GPT-5.5 and Claude Opus 4.8 (both $5 input). Full breakdown: Cheapest LLM API, OpenAI API pricing, and Anthropic API pricing.