What does the Claude API 529 overloaded_error mean?

A 529 overloaded_error means Anthropic's API is temporarily overloaded, usually because of high traffic across all users. It is a transient, server-side condition, not a problem with your request — the correct response is to wait briefly and retry with exponential backoff.

Is a 529 the same as a 429 rate-limit error?

No. A 429 rate_limit_error means your account hit its own rate limit; a 529 overloaded_error means Anthropic's systems are overloaded across all users. Note that a sharp spike in your usage can also trigger 429s due to acceleration limits, so ramping traffic gradually helps avoid both.

Should I retry on a 529 error?

Yes. Because a 529 is temporary, retrying with exponential backoff and jitter is the recommended approach. The official Anthropic SDKs also retry failed requests automatically, and the retry behavior is configurable.

Claude API `overloaded_error` (HTTP 529): why it happens and how to handle it

If your Claude API call just came back with HTTP status 529 and an overloaded_error, the short version is: it's not your fault, and it's not your rate limit. Anthropic's API is temporarily overloaded across all users, and the request should be retried. This page explains exactly what the error is, how it differs from a 429 rate-limit error, and the retry pattern that handles it cleanly.

The 30-second answer

What it is: 529 overloaded_error = Anthropic's API is temporarily overloaded (high traffic across all users). Transient and server-side.
What to do: retry the request with exponential backoff + jitter. Don't hammer it with immediate retries.
It's not a 429: a 429 rate_limit_error is your account hitting its limit; a 529 is Anthropic-wide overload. Different cause, different fix.
Easiest path: the official Anthropic SDKs already retry failed requests automatically (and the behavior is configurable) — so a lot of the time, using the SDK's built-in retries is the whole fix.

What the 529 `overloaded_error` actually is

Anthropic's API uses a predictable set of HTTP error codes. The 529 is the last entry in that list:

Status	Error type	Meaning
400	`invalid_request_error`	Problem with the format or content of your request
401	`authentication_error`	Problem with your API key
403	`permission_error`	Your key can't use the requested resource
404	`not_found_error`	Resource not found
413	`request_too_large`	Request exceeds the size limit (32 MB on the Messages API)
429	`rate_limit_error`	Your account hit a rate limit
500	`api_error`	Unexpected internal error on Anthropic's side
529	`overloaded_error`	The API is temporarily overloaded

The key word in the 529 definition is temporarily. It's a capacity signal, not a correctness signal: your request was well-formed and authenticated, but Anthropic's systems were too busy to serve it at that moment. Per Anthropic's docs, 529s occur "when APIs experience high traffic across all users" — so it tends to come in waves during peak demand, and it clears on its own.
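To make the capacity-versus-correctness split concrete, here is a minimal sketch that sorts an error response into "retry" or "fix the request". It assumes the JSON error body carries the error type under error.type, matching the shape shown in Anthropic's error documentation. The helper name classify_error and the sample body are illustrative and not part of any SDK.

```python
import json

# Error types from the table above that are worth retrying after a wait;
# everything else points at the request itself or at the API key.
RETRYABLE_TYPES = {"overloaded_error", "api_error", "rate_limit_error"}


def classify_error(status_code: int, body: str) -> str:
    """Return "retry" for retryable errors, "fix_request" otherwise.

    Hypothetical helper: assumes `body` is a JSON error payload with the
    error type at body["error"]["type"].
    """
    try:
        error_type = json.loads(body)["error"]["type"]
    except (ValueError, KeyError, TypeError):
        # Unparseable or unexpected body: fall back to the status code alone.
        return "retry" if status_code in (429, 500, 529) else "fix_request"
    return "retry" if error_type in RETRYABLE_TYPES else "fix_request"


sample = '{"type": "error", "error": {"type": "overloaded_error", "message": "Overloaded"}}'
print(classify_error(529, sample))  # prints: retry
```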

529 vs. 429: don't confuse overload with your own rate limit

These two get conflated constantly, and the fix is different for each.

429 rate_limit_error — your account or organization hit its rate limit (requests/minute or tokens/minute for your tier). The fix is to slow down, raise your limits, or spread the load.
529 overloaded_error — Anthropic's infrastructure is overloaded across all users. The fix is to back off and retry; there's nothing wrong with your account.

There's one important overlap worth knowing in 2026: per Anthropic's current docs, if your organization has a sharp spike in usage, you can now see 429 errors (because of acceleration limits on the API) in situations that previously returned 529. The practical takeaway is the same for both: ramp traffic up gradually and keep usage patterns consistent rather than spiking.
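Because the two errors call for different waits, a retry wrapper can branch on the status code. The sketch below is one way to do that, under two assumptions: that a 429 may carry a retry-after header giving a wait in seconds, and that header names arrive as lowercase keys. The function name choose_delay is hypothetical.

```python
import random


def choose_delay(status_code: int, headers: dict, attempt: int) -> float:
    """Pick how long to sleep before retry number `attempt` (0-based).

    Hypothetical helper. For a 429 it prefers the server's `retry-after`
    hint when one is present; for everything else (including 529) it
    falls back to exponential backoff plus jitter.
    """
    if status_code == 429:
        retry_after = headers.get("retry-after")
        if retry_after is not None:
            try:
                return max(0.0, float(retry_after))
            except (TypeError, ValueError):
                pass  # Not a plain number of seconds; use backoff instead.
    return (2 ** attempt) + random.uniform(0, 1)


print(choose_delay(429, {"retry-after": "7"}, 0))  # prints: 7.0
```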

How to handle a 529 in code

The correct pattern is exponential backoff with jitter: wait a short, growing interval between retries, with a little randomness so many clients don't retry in lockstep. A minimal example:

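import random
import time

import anthropic

# max_retries=0 turns off the SDK's own automatic retries so that the loop
# below is the only retry layer; without it the two would stack.
client = anthropic.Anthropic(max_retries=0)

def create_with_retry(max_attempts=5, **kwargs):
    """Call messages.create, retrying transient failures with backoff."""
    for attempt in range(max_attempts):
        try:
            return client.messages.create(**kwargs)
        except anthropic.APIStatusError as e:
            # Retry on transient overload / server / rate-limit conditions,
            # unless this was the final attempt.
            if e.status_code in (429, 500, 529) and attempt < max_attempts - 1:
                delay = (2 ** attempt) + random.uniform(0, 1)  # 1s, 2s, 4s... + jitter
                time.sleep(delay)
                continue
            raise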

Three things make this robust:

Exponential growth (1s, 2s, 4s, 8s…) gives the overload time to clear instead of piling on more requests.
Jitter (the random fraction) prevents a "thundering herd" where every client retries at the same instant and re-overloads the API.
A retry cap so a sustained outage fails loudly instead of hanging forever.
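To see what the retry cap means in wall-clock terms, the schedule can be computed without calling the API. This is a small sketch rather than part of the example above: the cap parameter (a ceiling on any single wait) is an added assumption, and backoff_schedule is a hypothetical name.

```python
def backoff_schedule(max_attempts: int = 5, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Deterministic part of the wait before each retry (jitter excluded).

    `cap` is an added assumption: it bounds any single wait so that a large
    `max_attempts` cannot produce multi-minute sleeps.
    """
    # The final attempt raises instead of sleeping, so there are
    # max_attempts - 1 waits in total.
    return [min(cap, base * 2 ** attempt) for attempt in range(max_attempts - 1)]


waits = backoff_schedule()
print(waits)       # [1.0, 2.0, 4.0, 8.0]
print(sum(waits))  # 15.0, plus up to 1s of jitter per wait
```

With five attempts, then, a request that never succeeds fails after roughly 15 to 19 seconds of waiting, which is a useful number to compare against your own request timeout.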

The easier path: let the SDK retry

You often don't need to write the loop above at all. The official Anthropic SDKs include built-in support for retries and error handling, and the retry behavior is configurable. If you're already using the Python or TypeScript SDK, lean on its automatic retries first and only add a custom wrapper if you need behavior the SDK doesn't cover.
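In the Python SDK the retry count is set with max_retries, either on the client or per request through with_options. The sketch below assumes the anthropic package is installed and an API key is available in the environment; the function names and the specific counts (5 and 8) are illustrative choices, not recommendations from Anthropic.

```python
def make_client(max_retries: int = 5):
    """Build a client that retries more often than the SDK default."""
    # Imported lazily so this sketch can be loaded without the SDK installed.
    import anthropic

    return anthropic.Anthropic(max_retries=max_retries)


def create_patiently(client, **kwargs):
    """Send one request with a higher, per-request retry count."""
    # with_options returns a copy of the client with the override applied,
    # leaving the original client's settings untouched.
    return client.with_options(max_retries=8).messages.create(**kwargs)
```

A per-request override is handy when most calls are interactive and should fail fast, while a few background calls can afford to wait longer.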

Watch for overload errors mid-stream

If you're using the streaming Messages API over SSE, an error can occur after a 200 response has already started — including an overloaded_error delivered as a stream event rather than an HTTP status. Handle error events inside your stream loop, not just around the initial request, or a mid-stream overload will look like a truncated response.
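If you read the SSE stream yourself, the check might look like the sketch below. It is a simplified parser that assumes each event has a single data line, and that an error arrives as an event named error whose payload holds the type under error.type. StreamError and iter_events are hypothetical names. The official SDKs' streaming helpers may already surface these errors for you, so treat this as the raw-stream case only.

```python
import json


class StreamError(Exception):
    """Raised when an `error` event arrives in the middle of a stream."""

    def __init__(self, error_type: str, message: str):
        super().__init__(f"{error_type}: {message}")
        self.error_type = error_type


def iter_events(lines):
    """Yield (event_name, data) pairs from raw SSE lines; raise on `error`."""
    event_name = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data = json.loads(line[len("data:"):].strip())
            if event_name == "error":
                err = data.get("error", {})
                raise StreamError(err.get("type", "unknown"), err.get("message", ""))
            yield event_name, data
        elif line == "":
            event_name = None  # A blank line ends the current event.


# A stream that starts normally and then reports an overload.
stream = [
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Hel"}}',
    "",
    "event: error",
    'data: {"type": "error", "error": {"type": "overloaded_error", "message": "Overloaded"}}',
    "",
]

try:
    for name, data in iter_events(stream):
        print(name)
except StreamError as exc:
    # The partial text received so far should be discarded or retried.
    print("stream failed:", exc.error_type)
```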

How to reduce how often you hit 529s

Ramp gradually, stay consistent. Sudden spikes are the most likely thing to trip both 529 overloads and 429 acceleration limits. Smooth, predictable traffic is less likely to be rejected.
Move bulk work to the Batch API. If you're firing thousands of non-urgent requests, the Message Batches API is built for that and lets you poll for results instead of holding live connections — far less likely to collide with peak-load overload.
Use streaming for long requests. For anything that might run over ~10 minutes, stream the response (or batch it). This also avoids idle-connection timeouts, a separate failure mode that's easy to mistake for an overload.
Add idempotency at your layer. Because retries are expected, make sure a retried request can't double-charge a user or duplicate a side effect in your app.
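The idempotency point can be sketched as a small guard keyed on an ID your application chooses. Everything here is hypothetical: the class name, the in-memory dictionary, and the "order-42" key are stand-ins, and a real service would more likely keep the keys in a shared store.

```python
import threading


class IdempotentRunner:
    """Run a side-effecting action at most once per idempotency key.

    Hypothetical in-memory sketch. An action that raises is not recorded,
    so it can safely be attempted again.
    """

    def __init__(self):
        self._results = {}
        self._lock = threading.Lock()

    def run(self, key: str, action):
        with self._lock:
            if key in self._results:
                # A retry of work that already completed: reuse the result.
                return self._results[key]
            result = action()
            self._results[key] = result
            return result


charges = []
runner = IdempotentRunner()
for _ in range(3):  # Simulate the same request being retried three times.
    runner.run("order-42", lambda: charges.append("charged") or "ok")
print(len(charges))  # prints: 1
```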

Is it me or is it Anthropic?

A 529 is Anthropic-side by definition, but if you're seeing a sustained wall of them (not the occasional blip), check Anthropic's status page to confirm a broader incident before spending hours debugging your own code. Every API response also includes a unique request-id header (e.g. req_011CSHoEeqs5C35K2UUqR7Fy); the official SDKs expose it as a property on the response object. Capture it in your logs — if you open a support ticket, that ID is the fastest way to get a specific request looked at.
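Capturing the ID can be as simple as the sketch below, which pulls the request-id header out of a response's headers and logs it next to the status code. The helper name, the logger name, and the assumption that header names arrive as lowercase keys are all illustrative.

```python
import logging
from typing import Mapping, Optional

logger = logging.getLogger("claude_api")


def log_request_id(headers: Mapping[str, str], status_code: int) -> Optional[str]:
    """Log and return the `request-id` header from a response, if present.

    Hypothetical helper; `headers` is any mapping of response header names
    to values, with lowercase keys assumed.
    """
    request_id = headers.get("request-id")
    if request_id is None:
        logger.warning("status=%s with no request-id header", status_code)
    else:
        logger.warning("status=%s request_id=%s", status_code, request_id)
    return request_id
```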

FAQ

Does a 529 cost me anything? No — a failed request that returns an error isn't a successful generation, so you're not billed for output tokens on a 529.

How many times should I retry? There's no fixed magic number; a handful of attempts (e.g. 4–5) with exponential backoff is typical. If overloads persist past that, it's better to surface the failure than to keep hammering.

Will switching models help? Not for a 529 — overload is infrastructure-wide, not model-specific. Backoff is the answer, not changing from, say, claude-opus-4-7 to claude-sonnet-4-6.

Last updated May 27, 2026. Error codes and limits verified against Anthropic's official API documentation. Anthropic may change these over time — confirm specifics in the current docs before relying on them in production.
