What image formats does the Claude API support?

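The Claude API supports JPEG, PNG, GIF, and WebP. Anthropic's vision documentation lists a maximum of 5 MB per image for API requests, within a 32 MB overall request size limit. Images whose long edge exceeds 1568 pixels (or that exceed roughly 1.15 megapixels) are downscaled before processing, so there is no benefit to sending larger images. The documented cap is 100 images per API request; the 20-image cap applies to the claude.ai app.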

Should I use base64 or URL for Claude image input?

Use base64 for images you have locally or host privately. It is the most reliable option because the image bytes travel inside the request itself, so nothing has to be fetched. Use the URL source type for publicly accessible images, but note that Anthropic fetches the URL at request time, so it must be reachable without authentication. For production workloads, base64 is more predictable.

How much does image input cost with the Claude API?

Images are billed as input tokens. The token count depends on pixel dimensions: Anthropic's vision documentation gives the estimate tokens ≈ (width × height) / 750, so a 512 × 512 image is roughly 350 tokens and an image at the downscaling threshold (about 1092 × 1092) is roughly 1,600 tokens. For cost-sensitive applications, resize images to the minimum useful resolution before sending; there is no quality benefit to sending more pixels than the model can use.

How to send images to the Claude API (vision): Python guide

Claude's vision capability lets you include images alongside text in API requests using multimodal content blocks. This guide covers both source types (base64 and URL), the correct message structure, multiple images in one request, and practical limits you need to know before building image-processing pipelines.

The 30-second answer

Message structure: set content to a list of blocks — each block is either {"type": "image", "source": {...}} or {"type": "text", "text": "..."}.
Base64 source: {"type": "base64", "media_type": "image/jpeg", "data": "<b64string>"}
URL source: {"type": "url", "url": "https://..."} — URL must be publicly accessible.
Limits: JPEG/PNG/GIF/WebP only; Anthropic's vision documentation lists 5 MB per image and up to 100 images per API request (20 on claude.ai). Images are billed as input tokens.

Base64 image input

Read the image file, base64-encode it, and pass it in the source object. This is the most reliable approach for images you control:

import anthropic
import base64

client = anthropic.Anthropic()

# Load and encode the image
with open("screenshot.png", "rb") as f:
    image_data = base64.standard_b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                },
                {
                    "type": "text",
                    "text": "Describe what you see in this screenshot."
                }
            ]
        }
    ]
)

print(message.content[0].text)

The media_type must match the actual image format — use "image/jpeg", "image/png", "image/gif", or "image/webp". Claude does not auto-detect format from the binary data; providing the wrong media_type will cause an error.
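Because the media_type has to agree with the real file format, one option is to derive it from the file's leading bytes instead of trusting the extension. The following is a minimal sketch under that assumption; `sniff_media_type` and `image_block_from_bytes` are hypothetical helper names, not part of the Anthropic SDK, and the check only looks at well-known file signatures rather than validating the whole image.

```python
import base64

# Leading "magic" bytes for the four formats this guide lists as supported.
_SIGNATURES = [
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def sniff_media_type(data: bytes) -> str:
    """Return the media_type implied by the leading bytes (hypothetical helper)."""
    for prefix, media_type in _SIGNATURES:
        if data.startswith(prefix):
            return media_type
    # WebP is a RIFF container: "RIFF", a 4-byte size field, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    raise ValueError("Not a JPEG, PNG, GIF, or WebP image")


def image_block_from_bytes(data: bytes) -> dict:
    """Build a base64 image content block whose media_type matches the bytes."""
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": sniff_media_type(data),
            "data": base64.standard_b64encode(data).decode("utf-8"),
        },
    }
```

The returned dict can be placed directly in the content list in place of the hand-written image block, for example `image_block_from_bytes(open("screenshot.png", "rb").read())`.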

URL image input

For publicly accessible images, pass the URL directly. Anthropic fetches the image at request time:

# Reuses the client created in the base64 example above.
message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "url",
                        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/4/47/PNG_transparency_demonstration_1.png/240px-PNG_transparency_demonstration_1.png"
                    }
                },
                {
                    "type": "text",
                    "text": "What is in this image?"
                }
            ]
        }
    ]
)

print(message.content[0].text)

URL sources must be publicly reachable without authentication. Signed URLs with short expiry times can fail if the link expires before Anthropic's servers fetch it. For anything behind auth, use base64.
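For an image that sits behind authentication, one way to follow the "use base64" advice is to download it yourself and send the bytes inline. This is a minimal sketch using only the standard library; `fetch_image_as_base64_block` is a hypothetical helper name, the `opener` parameter exists only so the download step can be swapped out in tests, and the caller is assumed to know the correct media_type.

```python
import base64
import urllib.request


def fetch_image_as_base64_block(url, media_type, headers=None,
                                opener=urllib.request.urlopen):
    """Download an image with our own credentials and wrap it as a base64 block.

    Hypothetical helper: `headers` might carry an Authorization header that
    Anthropic's servers would not have if they fetched the URL themselves.
    """
    request = urllib.request.Request(url, headers=headers or {})
    with opener(request) as response:
        raw = response.read()
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.standard_b64encode(raw).decode("utf-8"),
        },
    }
```

The trade-off is that your own service pays the download cost and the request body grows by roughly a third from base64 encoding, in exchange for not depending on a fetch you cannot observe.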

Multiple images in one request

Include multiple image blocks in the same content list. Anthropic's vision documentation lists a cap of 100 images per API request (the 20-image cap applies to claude.ai). You can interleave text and image blocks freely:

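# image_1_b64 and image_2_b64 are base64 strings, encoded the same way as image_data above.
# Pass this list as the messages argument to client.messages.create().
messages = [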
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Compare these two product screenshots:"
            },
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/png",
                    "data": image_1_b64
                }
            },
            {
                "type": "text",
                "text": "vs."
            },
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/png",
                    "data": image_2_b64
                }
            },
            {
                "type": "text",
                "text": "Which UI is cleaner? List the differences."
            }
        ]
    }
]

Claude maintains positional context — it can distinguish "the first image" from "the second image" in its response. For document workflows, you can send multiple page screenshots in a single request.
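When the number of images varies, the interleaved content list can be generated rather than written by hand. The sketch below is one possible way to do that: it puts a short text label before each image so that references such as "Image 2" are unambiguous. `build_labeled_content` is a hypothetical helper name, and it assumes every image shares one media_type.

```python
def build_labeled_content(images_b64, question, media_type="image/png"):
    """Interleave 'Image N:' labels with base64 image blocks, then add the question.

    Hypothetical helper: images_b64 is a list of already-encoded base64 strings.
    """
    content = []
    for index, data in enumerate(images_b64, start=1):
        content.append({"type": "text", "text": f"Image {index}:"})
        content.append({
            "type": "image",
            "source": {"type": "base64", "media_type": media_type, "data": data},
        })
    # The instruction goes last, after all the images it refers to.
    content.append({"type": "text", "text": question})
    return content
```

The result is used as the content value of a single user message, for example `{"role": "user", "content": build_labeled_content(pages, "Summarise each page.")}`.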

Practical limits and cost

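Limit	Value
Max images per request	100 via the API (20 on claude.ai)
Max image file size	5 MB per image via the API
Supported formats	JPEG, PNG, GIF, WebP
Max useful resolution	long edge ~1568 px, roughly 1.15 megapixels (larger images are downscaled)
Approximate token cost (formula)	(width × height) / 750 input tokens
Approximate token cost (small image, 512 × 512 px)	~350 input tokens
Approximate token cost (large image, ~1092 × 1092 px)	~1,600 input tokens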

Images are billed as input tokens at the same rate as text. At claude-sonnet-4-5 pricing ($3/MTok input), an image at the ~1,600-token ceiling costs roughly $0.005. For high-volume image processing, resize images before sending: oversized images are downscaled anyway, so extra pixels add upload size and latency without improving results.
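To budget a pipeline before sending anything, the token count can be estimated from pixel dimensions. The sketch below assumes the rule of thumb from Anthropic's vision documentation, tokens ≈ (width × height) / 750, together with a 1568 px long-edge limit and a ceiling of about 1,600 tokens per image. The function names and the $3/MTok default are illustrative assumptions, and the result is an estimate, not the billed figure.

```python
import math

MAX_LONG_EDGE_PX = 1568   # assumed downscaling threshold for the long edge
PIXELS_PER_TOKEN = 750    # assumed divisor from the documented rule of thumb
MAX_IMAGE_TOKENS = 1600   # assumed approximate per-image ceiling


def fit_within(width, height, max_long_edge=MAX_LONG_EDGE_PX):
    """Return (width, height) scaled down, if needed, so the long edge fits."""
    scale = min(1.0, max_long_edge / max(width, height))
    return max(1, round(width * scale)), max(1, round(height * scale))


def estimate_image_tokens(width, height):
    """Estimate input tokens for one image after the assumed downscaling."""
    w, h = fit_within(width, height)
    return min(math.ceil(w * h / PIXELS_PER_TOKEN), MAX_IMAGE_TOKENS)


def estimate_cost_usd(tokens, usd_per_mtok=3.0):
    """Convert a token count to dollars at a given input price per million tokens."""
    return tokens * usd_per_mtok / 1_000_000


# Worked examples under the assumptions above:
# estimate_image_tokens(200, 200)   -> 54
# estimate_image_tokens(1000, 1000) -> 1334
# estimate_image_tokens(1092, 1092) -> 1590
```

`fit_within` also gives the target size to pass to whatever resizing tool you use, so images are shrunk locally instead of being uploaded at full size.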

Working with PDFs (base64)

For PDF analysis, use the document block type rather than the image block type:

# Reuses the base64 import and the client from the base64 image example above.
with open("report.pdf", "rb") as f:
    pdf_data = base64.standard_b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=2048,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "base64",
                        "media_type": "application/pdf",
                        "data": pdf_data
                    }
                },
                {
                    "type": "text",
                    "text": "Summarise the key findings."
                }
            ]
        }
    ]
)

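PDF support uses the document block type, not the image block type. Each page is converted to an image internally (alongside its extracted text). Anthropic's PDF documentation lists a 32 MB maximum request size and a 100-page cap rather than a per-image limit. Multi-page PDFs work: Claude reads all pages in sequence.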
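Since base64 inflates a file by about a third, it can help to check a PDF before building the request. The following sketch verifies the `%PDF-` file header and compares the encoded size against a budget the caller supplies. `pdf_document_block` and `max_encoded_bytes` are hypothetical names, and the right budget depends on the request size limit that applies to your account.

```python
import base64


def base64_encoded_size(raw_size: int) -> int:
    """Length of standard padded base64 output for raw_size input bytes."""
    return 4 * ((raw_size + 2) // 3)


def pdf_document_block(data: bytes, max_encoded_bytes: int) -> dict:
    """Build a base64 document block after two cheap sanity checks (hypothetical helper)."""
    if not data.startswith(b"%PDF-"):
        raise ValueError("File does not start with the %PDF- header")
    if base64_encoded_size(len(data)) > max_encoded_bytes:
        raise ValueError("Encoded PDF exceeds the caller-supplied size budget")
    return {
        "type": "document",
        "source": {
            "type": "base64",
            "media_type": "application/pdf",
            "data": base64.standard_b64encode(data).decode("utf-8"),
        },
    }
```

Leave headroom in the budget for the rest of the request (prompt text, other attachments), because the size limit covers the whole request body and not the PDF alone.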

FAQ

What image formats does Claude support? JPEG, PNG, GIF, WebP. Anthropic's vision documentation lists a maximum of 5 MB per image and up to 100 images per API request (20 on claude.ai).

Base64 vs URL? Use base64 for images you control and for production. Use URL for publicly accessible images where fetching latency is acceptable. Signed/auth-gated URLs are unreliable.

How much does image input cost? Images are billed as input tokens, estimated as (width × height) / 750: roughly 350 tokens for a 512 × 512 image and up to about 1,600 for an image at the downscaling threshold. Resize so the long edge is at most ~1568px before sending, since oversized images are downscaled anyway.

Last updated May 28, 2026. Code examples verified against the Anthropic Python SDK and Claude API documentation. API behaviour may change — confirm against the official docs before deploying to production.