OpenAI API 429 error: rate limit vs. quota (and how to fix each)

A 429 from the OpenAI API is one of the most commonly misdiagnosed errors on the platform, because the same status code covers two different problems with opposite fixes. Before you add a retry loop, figure out which one you have: retrying the wrong kind wastes time and money.

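The 30-second answer

If the error code is rate_limit_exceeded, you are sending requests or tokens too fast: slow down and retry with exponential backoff. If it is insufficient_quota, you are out of credits or have hit a spending cap: fix billing, because retrying will not help.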

Step 1 — read the error body, not just the status code

Every OpenAI error response is JSON with an error object. The code field inside it tells you which 429 you've got:

{
  "error": {
    "message": "Rate limit reached for ...",
    "type": "requests",            // or "tokens"
    "code": "rate_limit_exceeded"  // vs "insufficient_quota"
  }
}

The code field decides everything below.
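As a rough sketch of that check, the helper below takes the already-parsed JSON body of a 429 and reports which case applies. The name classify_429 and the labels it returns are illustrative, not part of any SDK, and it assumes the body has the shape shown above.

```python
def classify_429(body: dict) -> str:
    """Return 'quota', 'rate_limit', or 'unknown' for a parsed 429 error body.

    Assumes the body looks like {"error": {"type": ..., "code": ...}}.
    """
    error = body.get("error") or {}
    code = error.get("code")
    err_type = error.get("type")
    if code == "insufficient_quota" or err_type == "insufficient_quota":
        return "quota"        # billing problem: retrying will not help
    if code == "rate_limit_exceeded":
        return "rate_limit"   # going too fast: back off and retry
    return "unknown"          # unexpected shape: log it and inspect by hand


print(classify_429({"error": {"type": "tokens", "code": "rate_limit_exceeded"}}))
# prints: rate_limit
```

Returning "unknown" rather than guessing keeps an unexpected body from being silently retried.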

Case A: rate_limit_exceeded (you're going too fast)

Your account has per-minute limits on requests (RPM) and tokens (TPM), set by your usage tier. You tripped one of them. Note that the token limit counts the max_tokens you request, not just what you use — so an oversized max_tokens can trigger a TPM 429 even on a short reply.
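To see why an oversized max_tokens matters, here is a back-of-the-envelope calculation. It assumes, as a simplification, that each request is counted as prompt tokens plus max_tokens. The 30,000 TPM limit and the token counts are made-up numbers, and the real accounting may differ from this model.

```python
def requests_per_minute_budget(tpm_limit: int, prompt_tokens: int, max_tokens: int) -> int:
    """How many requests fit in one minute if each counts prompt + max_tokens."""
    counted_per_request = prompt_tokens + max_tokens
    return tpm_limit // counted_per_request


# Hypothetical numbers: a 30,000 TPM limit and 500-token prompts.
generous = requests_per_minute_budget(30_000, 500, 4_000)  # 30000 // 4500
trimmed = requests_per_minute_budget(30_000, 500, 500)     # 30000 // 1000
print(generous, trimmed)
# prints: 6 30
```

Under these assumptions, trimming max_tokens from 4,000 to 500 raises the per-minute budget from 6 requests to 30 without changing anything else.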

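Fix it: back off and retry with exponential delays plus jitter, and re-raise immediately if the error turns out to be a quota problem: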

import time, random
from openai import OpenAI, RateLimitError

client = OpenAI()

def create_with_retry(max_attempts=5, **kwargs):
    for attempt in range(max_attempts):
        try:
            return client.chat.completions.create(**kwargs)
        except RateLimitError as e:
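            # Backoff only helps rate limits, so re-raise quota errors at once
            if getattr(e, "code", "") == "insufficient_quota":
                raise
            if attempt < max_attempts - 1:
                # Exponential backoff (1s, 2s, 4s, ...) plus up to 1s of jitter
                time.sleep((2 ** attempt) + random.uniform(0, 1))
                continue
            raise  # out of attempts: surface the last rate-limit error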

Case B: insufficient_quota (you're out of credits / hit a cap)

This 429 has nothing to do with speed. It means the account has no available credit balance, no valid payment method, or has hit a usage limit you (or your org admin) set. Retrying does nothing — it will return 429 on every attempt until billing is sorted.

Fix it: add or update a payment method, top up your credit balance, and check the usage-limit (budget) settings in your account — a soft/hard monthly cap that's been reached produces exactly this error. If you manage a team, confirm the org-level limit hasn't been exhausted by another project.
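Since every call will keep failing until billing is fixed, it can help to stop sending requests after the first quota error. The sketch below is one way to do that. QuotaExhausted and QuotaLatch are hypothetical names, and it assumes the exception exposes the error code as a code attribute, mirroring the getattr check in the retry function earlier.

```python
class QuotaExhausted(RuntimeError):
    """Hypothetical error meaning: stop calling the API until billing is fixed."""


class QuotaLatch:
    """Remembers the first quota failure so later calls fail fast locally."""

    def __init__(self):
        self.tripped = False

    def call(self, fn, *args, **kwargs):
        if self.tripped:
            # Skip the network round trip entirely once quota is known to be gone
            raise QuotaExhausted("quota exhausted earlier; fix billing, then reset")
        try:
            return fn(*args, **kwargs)
        except Exception as exc:
            # Assumption: the error code is available as `.code` on the exception
            if getattr(exc, "code", None) == "insufficient_quota":
                self.tripped = True
                raise QuotaExhausted(
                    "insufficient_quota: check billing and usage limits"
                ) from exc
            raise  # anything else is passed through unchanged

    def reset(self):
        """Call this once billing has been sorted out."""
        self.tripped = False
```

You would wrap each API call as latch.call(client.chat.completions.create, ...) and alert a human when QuotaExhausted is raised, rather than looping.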

How to stop hitting 429s in the first place

FAQ

Is a 429 the same as being "overloaded"? No — that's a server-side condition on some APIs (e.g. Anthropic's 529 overloaded_error). A 429 is about your account's limit or quota, not the provider being busy.

How many retries? A handful (4–5) with exponential backoff is typical for rate_limit_exceeded. For insufficient_quota, zero — fix billing instead.
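For a sense of what those retries cost in wall-clock time, the small calculation below adds up the sleeps from the backoff formula used in Case A (2 ** attempt plus up to one second of jitter). The function name backoff_window is illustrative.

```python
def backoff_window(max_attempts: int = 5, max_jitter: float = 1.0):
    """Return (min_total, max_total) seconds slept before the final attempt."""
    # A sleep follows every failed attempt except the last one.
    base = [2 ** attempt for attempt in range(max_attempts - 1)]  # 1, 2, 4, 8
    min_total = float(sum(base))
    max_total = min_total + max_jitter * len(base)
    return min_total, max_total


print(backoff_window())
# prints: (15.0, 19.0)
```

So five attempts with this schedule means waiting somewhere between 15 and 19 seconds in the worst case, which is worth knowing if the caller has its own timeout.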


Last updated May 27, 2026. Behavior verified against OpenAI's rate-limit and error-code documentation. Providers change limits and headers over time — confirm in the current docs before relying on specifics in production.