Anthropic API pricing (2026): the Claude per-token cost table
Short answer: Per million tokens, the Claude API costs $5 in / $25 out for Opus (4.8, 4.7, and 4.6), $3 in / $15 out for Sonnet 4.6, and $1 in / $5 out for Haiku 4.5 — the cheapest current model. A prompt-cache read costs 10% of the input rate, and the Batch API takes 50% off both input and output. Verified against Anthropic's official pricing page, June 2026.
Claude API model pricing (per 1M tokens)
| Model | Input | Output | Best for |
|---|---|---|---|
| Claude Opus 4.8 (latest) | $5 | $25 | Hardest reasoning, agentic and critical work |
| Claude Opus 4.7 / 4.6 | $5 | $25 | Same price as 4.8; prior Opus releases |
| Claude Sonnet 4.6 | $3 | $15 | Production default; best quality-to-cost balance |
| Claude Haiku 4.5 (cheapest) | $1 | $5 | Routing, classification, high-volume simple tasks |
Prices are USD per million tokens (MTok) for the first-party Claude API with default global routing. Opus 4.8, 4.7, 4.6 and Sonnet 4.6 include the full 1M-token context window at standard per-token pricing. Older models (Opus 4.1 and earlier, at the legacy $15/$75 Opus rate) are deprecated.
What is the cheapest Claude API model?
Haiku 4.5, at $1 / $5 per million tokens, is the cheapest current Claude model. You can push the effective cost lower in three ways, which stack:
- Prompt caching: a cache read is 0.1x input — $0.10 per million tokens on Haiku.
- Batch API: 50% off both directions — $0.50 in / $2.50 out per million.
- Right-sizing: route easy requests to Haiku and reserve Opus for genuinely hard reasoning.
If you need to go cheaper than Haiku for trivial tasks, Google's Gemini Flash tier undercuts it — see OpenAI & Claude API alternatives.
Prompt caching and Batch API discounts
| Feature | Multiplier vs base input | Effect |
|---|---|---|
| 5-minute cache write | 1.25x | Pays off after one cache read |
| 1-hour cache write | 2x | Pays off after two cache reads |
| Cache read (hit) | 0.1x | 90% off repeated context |
| Batch API | 0.5x in & out | 50% off async (non-time-sensitive) jobs |
Web search as a server tool is billed separately at $10 per 1,000 searches plus token costs. Caching and batching discounts stack with each other.
Worked example: cost of a typical Sonnet 4.6 request
A request that sends 20,000 input tokens and generates 2,000 output tokens on Sonnet 4.6:
- Input: 20,000 × $3 / 1,000,000 = $0.060
- Output: 2,000 × $15 / 1,000,000 = $0.030
- Total: $0.090 per request
If 16,000 of the input tokens are served from a prompt cache (0.1x), the input drops to (4,000 × $3 + 16,000 × $0.30) / 1,000,000 = $0.0168, taking the request to about $0.047 — roughly half.
How Claude API pricing compares to OpenAI
At the flagship tier the two are close: Claude Opus 4.8 and OpenAI GPT-5.5 both charge $5 input, with Opus at $25 output vs GPT-5.5 at $30. OpenAI tends to win at the cheap and mid tiers — GPT-5.4-nano ($0.20 / $1.25) undercuts Haiku 4.5. For the full side-by-side, see ChatGPT vs Claude API cost and the OpenAI API pricing table.