OpenAI structured outputs: Python guide (json_schema + Pydantic)

OpenAI structured outputs guarantee that the model's response matches a JSON Schema you define: not just valid JSON, but the right shape. They have replaced JSON mode (called json_mode in this guide, i.e. a response_format of type json_object) for most production use cases. This guide covers the raw json_schema format, Pydantic integration via the SDK's parse() method, strict mode requirements, and the refusal case you need to handle.

The 30-second answer

Raw json_schema approach

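Define the schema inline in response_format. The outer json_schema object needs a name (an identifier for the schema: letters, digits, underscores, and dashes, up to 64 characters), the schema itself, and strict set to true (True in Python):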

from openai import OpenAI
import json

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Extract: name, role, company from: 'Sarah Chen, VP of Engineering at Stripe'"}
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "contact_extraction",
            "strict": True,
            "schema": {
                "type": "object",
                "additionalProperties": False,
                "properties": {
                    "name": {"type": "string"},
                    "role": {"type": "string"},
                    "company": {"type": ["string", "null"]}
                },
                "required": ["name", "role", "company"]
            }
        }
    }
)

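message = response.choices[0].message
if message.refusal:
    # On a refusal, content is None and json.loads would raise a TypeError
    print(f"Refused: {message.refusal}")
else:
    data = json.loads(message.content)
    print(data)  # {'name': 'Sarah Chen', 'role': 'VP of Engineering', 'company': 'Stripe'}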

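Key constraints for strict mode: every object must have "additionalProperties": false, and every property must appear in "required". A key can never be left out of the output in strict mode, so to make a field optional in practice, allow null with a ["<type>", "null"] union (as with company above) rather than omitting it from required. The model then returns null when the value is absent.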
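The two constraints above are easy to miss in a hand-written schema, so it can help to check them locally before sending a request. The following is a minimal sketch, not part of the OpenAI SDK: strict_violations is a hypothetical helper name, and it only checks the two rules named above (it does not cover every strict-mode restriction).

```python
def strict_violations(schema, path="$"):
    """Return a list of strict-mode problems found in a JSON Schema dict.

    Hypothetical helper: checks only that every object sets
    additionalProperties to False and lists all properties in required.
    """
    problems = []
    if not isinstance(schema, dict):
        return problems
    types = schema.get("type")
    is_object = types == "object" or (isinstance(types, list) and "object" in types)
    if is_object:
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        props = schema.get("properties", {})
        missing = sorted(set(props) - set(schema.get("required", [])))
        if missing:
            problems.append(f"{path}: not in required: {', '.join(missing)}")
        # Recurse into nested property schemas
        for name, sub in props.items():
            problems.extend(strict_violations(sub, f"{path}.{name}"))
    # Recurse into array item schemas
    if "items" in schema:
        problems.extend(strict_violations(schema["items"], f"{path}[]"))
    return problems


good = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "name": {"type": "string"},
        "company": {"type": ["string", "null"]},
    },
    "required": ["name", "company"],
}
bad = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "company": {"type": "string"}},
    "required": ["name"],
}

print(strict_violations(good))  # []
print(strict_violations(bad))
```

Running a check like this in a unit test catches schema mistakes before they surface as API errors.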

Nested schemas and arrays

Nested objects and arrays work the same way — each nested object needs its own "additionalProperties": false:

schema = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "title": {"type": "string"},
        "authors": {
            "type": "array",
            "items": {
                "type": "object",
                "additionalProperties": False,
                "properties": {
                    "name": {"type": "string"},
                    "affiliation": {"type": ["string", "null"]}
                },
                "required": ["name", "affiliation"]
            }
        },
        "year": {"type": "integer"},
        "abstract": {"type": ["string", "null"]}
    },
    "required": ["title", "authors", "year", "abstract"]
}

Arrays of primitives don't need the additionalProperties constraint — that only applies to objects. Enums work normally: "type": "string", "enum": ["pending", "active", "cancelled"].
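As an illustration of the last two points, here is a small schema that combines an enum with an array of primitives. The field names (status, tags) and the schema name subscription are made up for this example, and the sample JSON string stands in for a model response rather than being real API output.

```python
import json

# Illustrative schema: an enum field plus an array of plain strings.
subscription_schema = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "status": {"type": "string", "enum": ["pending", "active", "cancelled"]},
        # Array of primitives: no additionalProperties needed on the items
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["status", "tags"],
}

# The wrapper you would pass as response_format
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "subscription",
        "strict": True,
        "schema": subscription_schema,
    },
}

# Stand-in for message.content from a conforming response
sample = json.loads('{"status": "active", "tags": ["trial", "eu"]}')
allowed = subscription_schema["properties"]["status"]["enum"]
print(sample["status"] in allowed)
```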

Pydantic integration with parse()

The beta.chat.completions.parse() method converts a Pydantic model to a JSON Schema automatically and returns a parsed model instance:

from openai import OpenAI
from pydantic import BaseModel
from typing import Optional

client = OpenAI()

class ContactInfo(BaseModel):
    name: str
    role: str
    company: Optional[str] = None

response = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Extract: 'Sarah Chen, VP of Engineering at Stripe'"}
    ],
    response_format=ContactInfo
)

message = response.choices[0].message

# Check for refusals first
if message.refusal:
    print(f"Refused: {message.refusal}")
else:
    contact = message.parsed  # ContactInfo instance
    print(contact.name)       # 'Sarah Chen'
    print(contact.company)    # 'Stripe'

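Note the beta namespace: this guide calls parse() through the beta client. It's stable enough for production use, and recent SDK releases also expose it as client.chat.completions.parse(), so check which path your installed version provides. parse() calls the same Chat Completions endpoint as create(); the SDK converts the Pydantic model into a strict json_schema response_format before sending, then validates the response into a model instance on your side.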
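To make the "schema generation" and "response parsing" halves concrete, here is a stdlib-only stand-in that does both for a flat dataclass. This is a simplified sketch, not the SDK's or Pydantic's actual converter: strict_schema, field_schema, and parse_into are hypothetical names, only primitive and Optional[primitive] fields are handled, and Pydantic's real output typically differs in detail (for example, it expresses nullable fields with anyOf).

```python
import json
from dataclasses import dataclass, fields
from typing import Optional, Union, get_args, get_origin, get_type_hints

JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def field_schema(tp):
    """Map a primitive or Optional[primitive] annotation to a JSON Schema fragment."""
    if tp in JSON_TYPES:
        return {"type": JSON_TYPES[tp]}
    if get_origin(tp) is Union:
        args = get_args(tp)
        inner = [a for a in args if a is not type(None)]
        if len(args) == 2 and len(inner) == 1 and inner[0] in JSON_TYPES:
            return {"type": [JSON_TYPES[inner[0]], "null"]}
    raise TypeError(f"unsupported annotation: {tp!r}")


def strict_schema(cls):
    """Build a strict-mode object schema: all fields required, no extras."""
    hints = get_type_hints(cls)
    props = {f.name: field_schema(hints[f.name]) for f in fields(cls)}
    return {
        "type": "object",
        "additionalProperties": False,
        "properties": props,
        "required": list(props),
    }


def parse_into(cls, raw):
    """Turn a JSON string (e.g. message.content) into an instance of cls."""
    return cls(**json.loads(raw))


@dataclass
class ContactInfo:
    name: str
    role: str
    company: Optional[str] = None


print(json.dumps(strict_schema(ContactInfo), indent=2))
contact = parse_into(
    ContactInfo,
    '{"name": "Sarah Chen", "role": "VP of Engineering", "company": "Stripe"}',
)
print(contact.company)
```

Note that company ends up in required even though it has a default, which mirrors the strict-mode rule that optional fields are expressed as nullable rather than omitted.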

Handling the refusal case

When the model determines it cannot safely fulfill a structured output request (e.g. the user asks for personal data extraction in a way that triggers safety guidelines), it sets message.refusal instead of populating message.content. Always check both:

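import json

message = response.choices[0].message

if message.refusal:
    # Model declined: content is None, and parsed (parse() path) is None
    print(f"Model refused: {message.refusal}")
elif getattr(message, "parsed", None) is not None:
    # Pydantic path: only messages returned by parse() carry .parsed
    result = message.parsed
else:
    # Raw json_schema path: create() returns the JSON as a string
    result = json.loads(message.content)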

If you use the raw json_schema approach (not parse()), check message.refusal in the same way: the field name is the same on both paths, and message.content is empty when the model refuses. A refused request still uses tokens; it just won't give you the schema-conformant output.
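In a larger codebase it can be convenient to turn a refusal into an exception so callers cannot forget the check. The sketch below shows one way to do that; ModelRefusal and unwrap are hypothetical names, and the SimpleNamespace objects are stand-ins for SDK message objects so the example runs without an API call.

```python
import json
from types import SimpleNamespace


class ModelRefusal(Exception):
    """Raised when the model declines instead of returning structured output."""


def unwrap(message):
    """Return the structured result from a message, or raise on refusal."""
    if getattr(message, "refusal", None):
        raise ModelRefusal(message.refusal)
    # parse() path: the SDK attaches a validated model instance
    parsed = getattr(message, "parsed", None)
    if parsed is not None:
        return parsed
    # create() path: content holds the JSON string
    return json.loads(message.content)


# Stand-ins for response.choices[0].message (assumed attribute layout)
ok = SimpleNamespace(refusal=None, content='{"name": "Sarah Chen"}')
refused = SimpleNamespace(refusal="I can't help with that.", content=None)

print(unwrap(ok))
try:
    unwrap(refused)
except ModelRefusal as exc:
    print(f"Refused: {exc}")
```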

Structured outputs vs json_mode

Feature                                | json_mode      | Structured outputs
---------------------------------------|----------------|--------------------------------
Output guaranteed to be valid JSON     | Yes            | Yes
Output guaranteed to match your schema | No             | Yes (strict mode)
Schema definition required             | No             | Yes
Refusal case                           | No             | Yes (check message.refusal)
Pydantic integration                   | Manual         | Native via parse()
Use case                               | Legacy / loose | Production extraction/parsing

For any new code, use structured outputs. JSON mode (response_format={"type": "json_object"}) is still available, but it offers no schema enforcement: it only constrains the model to emit valid JSON, requires that your messages mention JSON, and can still return a truncated, unparseable object if the response hits the token limit.
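Side by side, the difference comes down to what you pass as response_format. The sketch below only builds the two payloads; the schema is the contact example from earlier in this guide, and the variable names are illustrative.

```python
# JSON mode: valid JSON only, no schema. The messages must mention JSON.
json_mode_format = {"type": "json_object"}

# Structured outputs: a named, strict schema the output must match.
structured_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "contact_extraction",
        "strict": True,
        "schema": {
            "type": "object",
            "additionalProperties": False,
            "properties": {
                "name": {"type": "string"},
                "role": {"type": "string"},
                "company": {"type": ["string", "null"]},
            },
            "required": ["name", "role", "company"],
        },
    },
}

# Either dict is passed as response_format=... to chat.completions.create()
for fmt in (json_mode_format, structured_format):
    print(fmt["type"])
```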

FAQ

What's the difference between json_mode and structured outputs? json_mode only guarantees valid JSON. Structured outputs guarantee the JSON matches your exact schema. Use structured outputs for any production extraction pipeline.

Can I use Pydantic with structured outputs? Yes — client.beta.chat.completions.parse(response_format=YourModel, ...) handles schema generation and parsing automatically. Access the result via message.parsed.

What does strict mode require? All objects need "additionalProperties": false, all properties must be in "required", and optional fields use ["<type>", "null"] unions. Strict mode adds minor latency on the first call with a given schema; after that the processed schema is cached.

Last updated May 28, 2026. Code examples verified against the OpenAI Python SDK v1.x and OpenAI structured outputs documentation. API behaviour may change — confirm against the official docs before deploying to production.