How to set a hard spend limit on the OpenAI API

The fear with any usage-based API is the runaway bill: an agent that loops, a retry storm, a stray 1M-token paste. The good news is that on the OpenAI API you can set a ceiling that cannot be exceeded. The trick is knowing which setting is a true hard cap and which is just an alert.

The 30-second answer

Option 1 — the real hard ceiling: prepaid credits, auto-recharge off

OpenAI's API runs on prepaid credits: you buy a balance, and usage is deducted from it. With auto-recharge turned off, that balance is an absolute wall. When it's gone, the API rejects further requests with 429 insufficient_quota until you add credit. There is no overage to surprise you on next month's invoice.

Set it up:

Caveats worth knowing: purchased credits expire after one year and are non-refundable, and each trust tier caps how much credit you can hold at once. But for "I never want to be billed more than $X," prepaid-with-auto-recharge-off is the answer.

Option 2 — auto-recharge with a monthly recharge limit (convenience + a softer cap)

If you don't want production to halt the instant credits run dry, turn auto-recharge ON but set a monthly recharge limit. Configure three values: the recharge amount (how much to buy each time), the threshold (the balance that triggers a recharge), and the optional monthly recharge limit (the most it will auto-buy per month). The monthly recharge limit is what stops auto-recharge from quietly turning a bug into a four-figure bill — without it, auto-recharge has no ceiling.
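
To see how the three values interact, here is a minimal simulation. It is a sketch under stated assumptions, not OpenAI's billing engine: the names RechargeConfig and simulate_month are hypothetical, the $1-per-request cost is invented for illustration, and it assumes a recharge fires as soon as the balance drops below the threshold and is skipped once it would push the month's auto-purchases past the limit.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class RechargeConfig:
    recharge_amount: float          # dollars bought per auto-recharge
    threshold: float                # a balance below this triggers a recharge
    monthly_limit: Optional[float]  # most auto-bought per month; None = no ceiling


def simulate_month(start_balance: float, charges: Iterable[float],
                   cfg: RechargeConfig) -> Tuple[float, float, int]:
    """Return (final_balance, auto_purchased, requests_served) for one month."""
    balance = start_balance
    purchased = 0.0
    served = 0
    for cost in charges:
        if balance < cost:
            break  # credits exhausted: this is where insufficient_quota would appear
        balance -= cost
        served += 1
        if balance < cfg.threshold:
            # Assumption: a recharge that would exceed the monthly limit is skipped.
            over_limit = (cfg.monthly_limit is not None
                          and purchased + cfg.recharge_amount > cfg.monthly_limit)
            if not over_limit:
                balance += cfg.recharge_amount
                purchased += cfg.recharge_amount
    return balance, purchased, served


# A runaway loop: 1,000 requests at a hypothetical $1 each.
runaway = [1.0] * 1000
capped = simulate_month(10.0, runaway, RechargeConfig(10.0, 5.0, 50.0))
uncapped = simulate_month(10.0, runaway, RechargeConfig(10.0, 5.0, None))
print(capped)    # (final_balance, auto_purchased, requests_served)
print(uncapped)
```

With a $50 monthly recharge limit, the loop is cut off after 60 requests: the $10 starting balance plus $50 of recharges. With no limit, all 1,000 requests are served and $1,000 is auto-purchased. Under these assumptions, the worst case for a month is the starting balance plus the monthly recharge limit.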

Option 3 — usage limits & alerts (early warning, not a wall)

In your organization's limits settings you can set usage thresholds that email you as spend climbs. Treat these as a smoke alarm, not a circuit breaker: they tell you something's wrong, but on their own they don't stop requests. Pair them with Option 1 or 2 so you get both the warning and the stop.
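
The smoke-alarm behavior can be mirrored in your own code if you already track spend yourself. This is a minimal sketch with a hypothetical helper, newly_crossed, and made-up dollar figures; it is not part of OpenAI's API. Note what it does not do: it only reports which thresholds were passed, and nothing in it blocks a request.

```python
from typing import Iterable, List


def newly_crossed(previous_spend: float, current_spend: float,
                  thresholds: Iterable[float]) -> List[float]:
    """Return the thresholds passed between two spend readings, in ascending order."""
    return sorted(t for t in thresholds if previous_spend < t <= current_spend)


# Hypothetical alert levels in dollars for one month.
alert_levels = [50.0, 80.0, 100.0]

for level in newly_crossed(40.0, 85.0, alert_levels):
    # An alert is only a notification; requests keep flowing regardless.
    print(f"Spend alert: passed ${level:.0f}")
```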

The bill-spike trap to defend against

Most "how did I spend that much?!" stories come from one of these:

The defense is layered: a prepaid hard cap so the worst case is bounded, usage alerts so you hear about it early, and a hard retry cap in your code so a loop can't run forever.
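
The retry cap in that last layer could look something like this. It is a minimal sketch, not a drop-in for any particular SDK: call_with_retry_cap and its parameters are hypothetical names, fn stands in for whatever function makes your API call, and the exponential backoff schedule is one reasonable choice among several.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry_cap(fn: Callable[[], T],
                        max_attempts: int = 3,
                        base_delay: float = 1.0,
                        is_retryable: Callable[[Exception], bool] = lambda exc: True,
                        sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying at most max_attempts times in total, then give up."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            # Stop immediately on non-retryable errors or once the cap is reached.
            if not is_retryable(exc) or attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")  # the loop always returns or raises
```

Because the loop is a bounded for rather than a while True, the number of billable calls per logical request can never exceed max_attempts, whatever the failure mode. The is_retryable hook is where you would exclude errors that retrying cannot fix.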

FAQ

Can I set a per-key or per-project limit? Organization/project-level limits and budgets are configurable in settings; the account-wide hard ceiling is still your prepaid balance. Use project limits to divide a budget, not to replace the balance cap.

Will hitting the cap corrupt anything? No. Requests just fail cleanly with insufficient_quota. Make sure your app handles that error gracefully (don't retry it indefinitely; see the 429 guide).
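
Handling the error gracefully starts with telling an exhausted balance apart from an ordinary rate limit, since both arrive as HTTP 429. Here is a minimal sketch that works on the raw status code and parsed JSON body. The helper name is_quota_exhausted is hypothetical, and the body shape (an "error" object carrying "code" and "type" fields) is an assumption you should check against the responses your own client receives.

```python
from typing import Any


def is_quota_exhausted(status_code: int, body: Any) -> bool:
    """True if a response looks like a 429 caused by insufficient_quota."""
    if status_code != 429 or not isinstance(body, dict):
        return False
    error = body.get("error")
    if not isinstance(error, dict):
        return False
    # Assumed body shape: {"error": {"code": "...", "type": "..."}}
    return "insufficient_quota" in (error.get("code"), error.get("type"))


# Hypothetical response bodies for illustration.
out_of_credit = {"error": {"code": "insufficient_quota", "type": "insufficient_quota"}}
rate_limited = {"error": {"code": "rate_limit_exceeded", "type": "requests"}}

print(is_quota_exhausted(429, out_of_credit))
print(is_quota_exhausted(429, rate_limited))
```

When this returns True, waiting and retrying will not help, because the balance stays empty until someone adds credit. The sensible reaction is to stop, surface the failure, and alert a human.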


Last updated May 27, 2026. Billing behavior verified against OpenAI's prepaid-billing and usage-limit documentation. OpenAI may change billing controls — confirm in your account settings before relying on specifics.