Claude API overloaded_error (HTTP 529): why it happens and how to handle it
If your Claude API call just came back with HTTP status 529 and an overloaded_error, the short version is: it's not your fault, and it's not your rate limit. Anthropic's API is temporarily overloaded across all users, and the request should be retried. This page explains exactly what the error is, how it differs from a 429 rate-limit error, and the retry pattern that handles it cleanly.
The 30-second answer
- What it is: 529 overloaded_error = Anthropic's API is temporarily overloaded (high traffic across all users). Transient and server-side.
- What to do: retry the request with exponential backoff + jitter. Don't hammer it with immediate retries.
- It's not a 429: a 429 rate_limit_error is your account hitting its limit; a 529 is Anthropic-wide overload. Different cause, different fix.
- Easiest path: the official Anthropic SDKs already retry failed requests automatically (and the behavior is configurable), so a lot of the time, using the SDK's built-in retries is the whole fix.
What the 529 overloaded_error actually is
Anthropic's API uses a predictable set of HTTP error codes. The 529 sits at the bottom of that list:
| Status | Error type | Meaning |
|---|---|---|
| 400 | invalid_request_error | Problem with the format or content of your request |
| 401 | authentication_error | Problem with your API key |
| 403 | permission_error | Your key can't use the requested resource |
| 404 | not_found_error | Resource not found |
| 413 | request_too_large | Request exceeds the size limit (32 MB on the Messages API) |
| 429 | rate_limit_error | Your account hit a rate limit |
| 500 | api_error | Unexpected internal error on Anthropic's side |
| 529 | overloaded_error | The API is temporarily overloaded |
The key word in the 529 definition is temporarily. It's a capacity signal, not a correctness signal: your request was well-formed and authenticated, but Anthropic's systems were too busy to serve it at that moment. Per Anthropic's docs, 529s occur "when APIs experience high traffic across all users" — so it tends to come in waves during peak demand, and it clears on its own.
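The table above can be turned into a small retryability check. The following is a minimal sketch, assuming the error body has the JSON shape shown in Anthropic's docs (a top-level "type": "error" with a nested "error" object carrying "type" and "message"). The function name `classify_error` and the set of statuses treated as retryable are illustrative choices, not part of any SDK.

```python
import json

# Statuses this sketch treats as transient. The selection is an assumption
# drawn from the table above, not an official list.
RETRYABLE_STATUSES = {429, 500, 529}


def classify_error(status_code: int, body: str) -> dict:
    """Return the error type, message, and whether a retry makes sense."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = {}  # non-JSON body (e.g. from a proxy); fall back to defaults
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        error = {}
    return {
        "status": status_code,
        "type": error.get("type", "unknown"),
        "message": error.get("message", ""),
        "retryable": status_code in RETRYABLE_STATUSES,
    }


sample = '{"type": "error", "error": {"type": "overloaded_error", "message": "Overloaded"}}'
result = classify_error(529, sample)
```

Keying the retry decision on the status code rather than the message text keeps the check stable if the wording of the message changes.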
529 vs. 429: don't confuse overload with your own rate limit
These two get conflated constantly, and the fix is different for each.
- 429 rate_limit_error: your account or organization hit its rate limit (requests/minute or tokens/minute for your tier). The fix is to slow down, raise your limits, or spread the load.
- 529 overloaded_error: Anthropic's infrastructure is overloaded across all users. The fix is to back off and retry; there's nothing wrong with your account.
There's one important overlap worth knowing in 2026: per Anthropic's current docs, if your organization has a sharp spike in usage, you can now see 429 errors (because of acceleration limits on the API) in situations that previously returned 529. The practical takeaway is the same for both: ramp traffic up gradually and keep usage patterns consistent rather than spiking.
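Because the two errors have different causes, the wait time can be chosen differently too. Anthropic's docs describe a retry-after header on 429 responses, so one reasonable approach is to honor it when present and fall back to exponential backoff otherwise. The sketch below assumes the header value is a number of seconds; `retry_delay` is a hypothetical helper name, and whether a 529 ever carries the header is not something this sketch relies on.

```python
import random
from typing import Mapping, Optional


def retry_delay(status_code: int, headers: Mapping[str, str], attempt: int,
                rng: Optional[random.Random] = None) -> float:
    """Pick a wait time in seconds before the next retry."""
    rng = rng if rng is not None else random.Random()
    if status_code == 429:
        # Header names are case-insensitive, so compare in lowercase.
        raw = next((v for k, v in headers.items() if k.lower() == "retry-after"), None)
        if raw is not None:
            try:
                return max(0.0, float(raw))
            except ValueError:
                pass  # not a plain number of seconds; fall through to backoff
    # Default: exponential backoff plus up to one second of jitter.
    return (2 ** attempt) + rng.uniform(0, 1)
```

For example, `retry_delay(429, {"retry-after": "7"}, attempt=0)` waits the 7 seconds the server asked for, while a 529 on the third attempt waits between 4 and 5 seconds.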
How to handle a 529 in code
The correct pattern is exponential backoff with jitter: wait a short, growing interval between retries, with a little randomness so many clients don't retry in lockstep. A minimal example:
```python
import random
import time

import anthropic

client = anthropic.Anthropic()


def create_with_retry(max_attempts=5, **kwargs):
    for attempt in range(max_attempts):
        try:
            return client.messages.create(**kwargs)
        except anthropic.APIStatusError as e:
            # Retry on transient overload / server / rate-limit conditions
            if e.status_code in (429, 500, 529) and attempt < max_attempts - 1:
                delay = (2 ** attempt) + random.uniform(0, 1)  # 1s, 2s, 4s... + jitter
                time.sleep(delay)
                continue
            raise  # non-retryable status, or out of attempts
```
Three things make this robust:
- Exponential growth (1s, 2s, 4s, 8s…) gives the overload time to clear instead of piling on more requests.
- Jitter (the random fraction) prevents a "thundering herd" where every client retries at the same instant and re-overloads the API.
- A retry cap so a sustained outage fails loudly instead of hanging forever.
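The three properties above can also be seen in isolation by computing the delay schedule without making any requests. The sketch below uses a variant often called "full jitter" (each wait is drawn uniformly between zero and the exponential ceiling) and adds a per-delay cap. Both are swapped-in choices rather than the pattern from the example above, and the names `backoff_delays`, `base`, and `cap` are illustrative.

```python
import random
from typing import List, Optional


def backoff_delays(max_attempts: int = 5, base: float = 1.0, cap: float = 30.0,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Return the waits (in seconds) between attempts, using full jitter."""
    rng = rng if rng is not None else random.Random()
    delays = []
    # There is one fewer wait than attempts: nothing to wait for after the last try.
    for attempt in range(max_attempts - 1):
        ceiling = min(cap, base * (2 ** attempt))  # 1, 2, 4, 8... capped at `cap`
        delays.append(rng.uniform(0, ceiling))
    return delays


delays = backoff_delays(max_attempts=5, rng=random.Random(0))
```

Passing a seeded `random.Random` makes the schedule reproducible in tests, while production code can omit it and get a fresh generator.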
The easier path: let the SDK retry
You often don't need to write the loop above at all. The official Anthropic SDKs include built-in support for retries and error handling, and the retry behavior is configurable. If you're already using the Python or TypeScript SDK, lean on its automatic retries first and only add a custom wrapper if you need behavior the SDK doesn't cover.
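As a rough illustration of leaning on the SDK, the sketch below raises the retry count on the Python client and overrides it for a single call. It assumes the `max_retries` constructor argument and the `with_options` method as found in the Python SDK; check the SDK README for the current default count and which statuses it retries. The helper names `build_client` and `create_once` are hypothetical.

```python
def build_client(max_retries: int = 5):
    """Create a client that retries failed requests up to `max_retries` times."""
    import anthropic  # imported here so the sketch loads without the SDK installed

    # The client reads ANTHROPIC_API_KEY from the environment.
    return anthropic.Anthropic(max_retries=max_retries)


def create_once(client, **kwargs):
    """Per-request override: turn SDK retries off for this one call."""
    return client.with_options(max_retries=0).messages.create(**kwargs)
```

If you keep a custom retry loop as well, note that the two layers multiply: a loop of 5 attempts around a client with 2 built-in retries can send up to 15 requests. Setting `max_retries=0` on one of the layers avoids that.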
Watch for overload errors mid-stream
If you're using the streaming Messages API over SSE, an error can occur after a 200 response has already started — including an overloaded_error delivered as a stream event rather than an HTTP status. Handle error events inside your stream loop, not just around the initial request, or a mid-stream overload will look like a truncated response.
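For anyone reading the raw SSE stream rather than using an SDK helper, the check can look roughly like this. It is a minimal sketch that assumes each event arrives as an "event:" line followed by a single-line "data:" JSON payload, and that an error event carries the same nested "error" object as an HTTP error body. `StreamError` and `iter_events` are hypothetical names, and the sample lines are hand-written for illustration.

```python
import json
from typing import Iterable, Iterator


class StreamError(Exception):
    """Raised when an error event arrives after the stream has started."""

    def __init__(self, error_type: str, message: str):
        super().__init__(f"{error_type}: {message}")
        self.error_type = error_type


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded data payloads; raise StreamError on an error event."""
    event_name = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data = json.loads(line[len("data:"):].strip())
            if event_name == "error" or data.get("type") == "error":
                err = data.get("error", {})
                raise StreamError(err.get("type", "unknown"), err.get("message", ""))
            yield data
        elif line == "":
            event_name = None  # a blank line ends the current event


# Hand-written sample: one text delta, then an overload arriving mid-stream.
sample = [
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hel"}}',
    "",
    "event: error",
    'data: {"type": "error", "error": {"type": "overloaded_error", "message": "Overloaded"}}',
    "",
]
```

Wrapping the loop over `iter_events(...)` in a try/except for `StreamError` lets the caller discard the partial text and retry, instead of treating "Hel" as a finished response.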
How to reduce how often you hit 529s
- Ramp gradually, stay consistent. Sudden spikes are the most likely thing to trip both 529 overloads and 429 acceleration limits. Smooth, predictable traffic is treated better.
- Move bulk work to the Batch API. If you're firing thousands of non-urgent requests, the Message Batches API is built for that and lets you poll for results instead of holding live connections — far less likely to collide with peak-load overload.
- Use streaming for long requests. For anything that might run over ~10 minutes, stream the response (or batch it). This also avoids idle-connection timeouts, a separate failure mode that's easy to mistake for an overload.
- Add idempotency at your layer. Because retries are expected, make sure a retried request can't double-charge a user or duplicate a side effect in your app.
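The idempotency point in the list above can be sketched as a guard keyed on an identifier your application already has, such as an order or job ID. This is an in-memory illustration only: `IdempotentRunner` is a hypothetical helper, and a real deployment would likely need a shared store (a database row or cache entry) so the guard survives restarts and works across processes.

```python
from typing import Any, Callable, Dict


class IdempotentRunner:
    """Run a side effect at most once per key, returning the cached result after."""

    def __init__(self) -> None:
        self._results: Dict[str, Any] = {}

    def run(self, key: str, action: Callable[[], Any]) -> Any:
        if key in self._results:
            return self._results[key]  # already done: skip the side effect
        result = action()
        self._results[key] = result
        return result


charges = []


def charge_order() -> str:
    """Stand-in for a side effect that must not happen twice."""
    charges.append("order-1234")
    return "charged"


runner = IdempotentRunner()
for _ in range(3):  # simulate the same job being retried three times
    runner.run("order-1234", charge_order)
```

After the loop, `charges` holds a single entry even though the job ran three times.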
Is it me or is it Anthropic?
A 529 is Anthropic-side by definition, but if you're seeing a sustained wall of them (not the occasional blip), check Anthropic's status page to confirm a broader incident before spending hours debugging your own code. Every API response also includes a unique request-id header (e.g. req_011CSHoEeqs5C35K2UUqR7Fy); the official SDKs expose it as a property on the response object. Capture it in your logs — if you open a support ticket, that ID is the fastest way to get a specific request looked at.
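If you are working with raw HTTP responses rather than an SDK response object, capturing the ID can be as small as the sketch below. It assumes only what is stated above, that the header is named request-id; the logger name and the helpers `request_id_from_headers` and `log_failure` are illustrative.

```python
import logging
from typing import Mapping, Optional

logger = logging.getLogger("claude_api")


def request_id_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Find the request-id header, ignoring case in the header name."""
    for name, value in headers.items():
        if name.lower() == "request-id":
            return value
    return None


def log_failure(status_code: int, headers: Mapping[str, str]) -> Optional[str]:
    """Log a failed call together with its request ID and return the ID."""
    request_id = request_id_from_headers(headers)
    logger.warning("Claude API error status=%s request_id=%s", status_code, request_id)
    return request_id
```

Logging the ID on every failure, not only the ones you plan to escalate, means it is already there when a pattern of 529s turns into a support ticket.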
FAQ
Does a 529 cost me anything? No — a failed request that returns an error isn't a successful generation, so you're not billed for output tokens on a 529.
How many times should I retry? There's no fixed magic number; a handful of attempts (e.g. 4–5) with exponential backoff is typical. If overloads persist past that, it's better to surface the failure than to keep hammering.
Will switching models help? Not for a 529 — overload is infrastructure-wide, not model-specific. Backoff is the answer, not changing from, say, claude-opus-4-7 to claude-sonnet-4-6.
Last updated May 27, 2026. Error codes and limits verified against Anthropic's official API documentation. Anthropic may change these over time — confirm specifics in the current docs before relying on them in production.