Can I use Pydantic with OpenAI structured outputs?

Yes. The OpenAI Python SDK v1.x provides client.beta.chat.completions.parse(), which accepts a Pydantic model as the response_format. The SDK converts your Pydantic model to a JSON Schema, sends the request, and parses the response back into a model instance. Access the parsed object via response.choices[0].message.parsed.

What does strict mode do in structured outputs?

Setting strict: true in the json_schema definition forces the model to exactly match your schema — no extra fields, no missing required fields. All object types in the schema must include additionalProperties: false, and all properties must appear in required. Strict mode adds minor latency on the first request while the schema is processed, but subsequent requests to the same schema are fast due to caching.

OpenAI structured outputs: Python guide (json_schema + Pydantic)

What is the difference between json_mode and structured outputs?

json_mode (response_format: {type: 'json_object'}) only guarantees the output is valid JSON — it does not enforce any schema. Structured outputs (response_format: {type: 'json_schema', json_schema: {...}}) guarantee the output matches a specific JSON Schema you provide, including required fields, types, and the strict no-extra-properties constraint. Use structured outputs for production; json_mode is now mostly superseded.

OpenAI structured outputs guarantee that the model's response matches a JSON Schema you define exactly — not just valid JSON, but the right shape. This replaced json_mode for most production use cases. This guide covers the raw json_schema format, Pydantic integration via the SDK's parse() method, strict mode requirements, and the refusal case you need to handle.

The 30-second answer

Use response_format with type: "json_schema" and provide a json_schema object containing your schema name, the schema itself, and "strict": true.
Strict mode requires "additionalProperties": false on all objects, and all properties in "required". Optional fields use ["type", "null"] union types.
Pydantic shortcut: use client.beta.chat.completions.parse(response_format=YourModel, ...) — the SDK handles schema generation and parsing.
Always check for refusals: when the model declines the request, message.refusal is set and message.parsed is None.

Raw json_schema approach

Define the schema inline in response_format. The outer object needs a name (used for caching), the schema itself, and strict: true:

from openai import OpenAI
import json

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Extract: name, role, company from: 'Sarah Chen, VP of Engineering at Stripe'"}
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "contact_extraction",
            "strict": True,
            "schema": {
                "type": "object",
                "additionalProperties": False,
                "properties": {
                    "name": {"type": "string"},
                    "role": {"type": "string"},
                    "company": {"type": ["string", "null"]}
                },
                "required": ["name", "role", "company"]
            }
        }
    }
)

data = json.loads(response.choices[0].message.content)
print(data)  # {'name': 'Sarah Chen', 'role': 'VP of Engineering', 'company': 'Stripe'}

Key constraints for strict mode: every object must have "additionalProperties": false; every property must appear in "required". To make a field optional, use a ["type", "null"] union (as with company above) rather than omitting it from required.
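The two constraints above are easy to miss in a hand-written schema, so it can help to check them locally before sending a request. The following is a minimal sketch, not part of the OpenAI SDK: find_strict_violations is a hypothetical helper name, and it only checks the two rules described in this section (additionalProperties and required), not every rule the API may enforce.

```python
from typing import Any


def find_strict_violations(schema: dict[str, Any], path: str = "$") -> list[str]:
    """Return readable descriptions of places where the two strict-mode rules are broken."""
    problems: list[str] = []
    types = schema.get("type")
    # "type" may be a single string or a list such as ["object", "null"].
    type_list = types if isinstance(types, list) else [types]

    if "object" in type_list:
        props = schema.get("properties", {})
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        missing = sorted(set(props) - set(schema.get("required", [])))
        if missing:
            problems.append(f"{path}: not listed in required: {missing}")
        # Recurse into every property so nested objects are checked too.
        for name, sub in props.items():
            problems.extend(find_strict_violations(sub, f"{path}.{name}"))

    if "array" in type_list and isinstance(schema.get("items"), dict):
        problems.extend(find_strict_violations(schema["items"], f"{path}[]"))

    return problems


good = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "name": {"type": "string"},
        "company": {"type": ["string", "null"]},
    },
    "required": ["name", "company"],
}
bad = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "company": {"type": "string"}},
    "required": ["name"],
}

print(find_strict_violations(good))
for problem in find_strict_violations(bad):
    print(problem)
```

The good schema produces an empty list; the bad one reports the missing additionalProperties flag and the property left out of required.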

Nested schemas and arrays

Nested objects and arrays work the same way — each nested object needs its own "additionalProperties": false:

schema = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "title": {"type": "string"},
        "authors": {
            "type": "array",
            "items": {
                "type": "object",
                "additionalProperties": False,
                "properties": {
                    "name": {"type": "string"},
                    "affiliation": {"type": ["string", "null"]}
                },
                "required": ["name", "affiliation"]
            }
        },
        "year": {"type": "integer"},
        "abstract": {"type": ["string", "null"]}
    },
    "required": ["title", "authors", "year", "abstract"]
}

Arrays of primitives don't need the additionalProperties constraint — that only applies to objects. Enums work normally: "type": "string", "enum": ["pending", "active", "cancelled"].
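Adding the same two keys to every nested object by hand gets repetitive. One possible approach is a small recursive helper that applies them for you. This is a sketch under assumptions: strictify is a hypothetical name, it only walks "properties" and array "items", and it does not decide which fields should become nullable. Any field that used to be optional still needs a ["type", "null"] union added by you.

```python
import copy
from typing import Any


def strictify(schema: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of schema with the strict-mode keys set on every object."""
    out = copy.deepcopy(schema)
    _apply(out)
    return out


def _apply(node: dict[str, Any]) -> None:
    types = node.get("type")
    type_list = types if isinstance(types, list) else [types]
    if "object" in type_list:
        props = node.setdefault("properties", {})
        node["additionalProperties"] = False
        node["required"] = list(props)  # every property must be listed
        for sub in props.values():
            _apply(sub)
    # Arrays of primitives are left alone; only object items are modified.
    if "array" in type_list and isinstance(node.get("items"), dict):
        _apply(node["items"])


loose = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "status": {"type": "string", "enum": ["pending", "active", "cancelled"]},
        "tags": {"type": "array", "items": {"type": "string"}},
        "authors": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {"name": {"type": "string"}},
            },
        },
    },
}

strict = strictify(loose)
print(strict["required"])
print(strict["properties"]["authors"]["items"])
```

The input dictionary is left unchanged, which makes it easier to keep a loose version of the schema for other uses.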

Pydantic integration with `parse()`

The beta.chat.completions.parse() method converts a Pydantic model to a JSON Schema automatically and returns a parsed model instance:

from openai import OpenAI
from pydantic import BaseModel
from typing import Optional

client = OpenAI()

class ContactInfo(BaseModel):
    name: str
    role: str
    company: Optional[str] = None

response = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Extract: 'Sarah Chen, VP of Engineering at Stripe'"}
    ],
    response_format=ContactInfo
)

message = response.choices[0].message

# Check for refusals first
if message.refusal:
    print(f"Refused: {message.refusal}")
else:
    contact = message.parsed  # ContactInfo instance
    print(contact.name)       # 'Sarah Chen'
    print(contact.company)    # 'Stripe'

Note the beta namespace: this method is in the beta client for now. It's stable enough for production use. The parse() method does not change the request format; it sends the same json_schema response_format you would otherwise write by hand, generated from your model, and parses the response on your side.
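To make "schema generation" concrete, here is a rough stand-in that derives a strict schema from type hints. It deliberately uses standard-library dataclasses instead of Pydantic, and it is not how the SDK or Pydantic implement the conversion; schema_for is a hypothetical helper that handles only primitives, lists, Optional primitives, and nested dataclasses. It also assumes annotations are real types rather than strings.

```python
import dataclasses
from typing import Any, Optional, Union, get_args, get_origin

_PRIMITIVES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def schema_for(tp: Any) -> dict[str, Any]:
    """Map a type hint to a strict-mode JSON Schema fragment (illustrative only)."""
    if tp in _PRIMITIVES:
        return {"type": _PRIMITIVES[tp]}
    origin = get_origin(tp)
    if origin is list:
        (item,) = get_args(tp)
        return {"type": "array", "items": schema_for(item)}
    if origin is Union:
        # Only Optional[primitive] is supported in this sketch.
        non_null = [a for a in get_args(tp) if a is not type(None)]
        if len(non_null) == 1 and non_null[0] in _PRIMITIVES:
            return {"type": [_PRIMITIVES[non_null[0]], "null"]}
        raise TypeError(f"unsupported union: {tp!r}")
    if dataclasses.is_dataclass(tp):
        props = {f.name: schema_for(f.type) for f in dataclasses.fields(tp)}
        return {
            "type": "object",
            "additionalProperties": False,
            "properties": props,
            "required": list(props),  # strict mode: every property is required
        }
    raise TypeError(f"unsupported type: {tp!r}")


@dataclasses.dataclass
class ContactInfo:
    name: str
    role: str
    company: Optional[str] = None


schema = schema_for(ContactInfo)
print(schema)
```

The result has the same shape as a hand-written strict schema for this model: company stays in required and becomes a ["string", "null"] union, even though the Python field has a default.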

Handling the refusal case

When the model determines it cannot safely fulfill a structured output request (e.g. the user asks for personal data extraction in a way that triggers safety guidelines), it sets message.refusal instead of populating message.content. Always check both:

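import json

message = response.choices[0].message

if message.refusal:
    # Model declined; there is no schema-conformant output to read
    print(f"Model refused: {message.refusal}")
elif getattr(message, "parsed", None) is not None:
    # Pydantic path: parse() attaches the model instance as message.parsed
    result = message.parsed
else:
    # Raw json_schema path: messages returned by create() have no parsed attribute
    result = json.loads(message.content)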

If you use the raw json_schema approach (not parse()), check message.refusal in the same way; the field name is the same. A refused request still uses tokens; it just won't give you schema-conformant output.
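If both paths are used in one codebase, the branching can be wrapped in a single function so callers cannot forget the refusal check. This is a sketch under assumptions: extract_result and ModelRefusal are hypothetical names, and the message objects below are simple stand-ins built with SimpleNamespace rather than real SDK responses.

```python
import json
from types import SimpleNamespace
from typing import Any


class ModelRefusal(Exception):
    """Raised when the message carries a refusal instead of data."""


def extract_result(message: Any) -> Any:
    """Return the structured result from either path, or raise on refusal."""
    refusal = getattr(message, "refusal", None)
    if refusal:
        raise ModelRefusal(refusal)
    # Present only on messages returned by parse(); absent on the raw path.
    parsed = getattr(message, "parsed", None)
    if parsed is not None:
        return parsed
    return json.loads(message.content)


# Stand-in messages (hypothetical; real ones come from the SDK response).
raw_msg = SimpleNamespace(refusal=None, content='{"name": "Sarah Chen"}')
parsed_msg = SimpleNamespace(refusal=None, parsed={"name": "Sarah Chen"}, content=None)
refused_msg = SimpleNamespace(refusal="I can't help with that.", content=None)

print(extract_result(raw_msg))
print(extract_result(parsed_msg))
try:
    extract_result(refused_msg)
except ModelRefusal as exc:
    print(f"Refused: {exc}")
```

Raising an exception rather than returning None keeps a refusal from being mistaken for an empty result further down the pipeline.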

Structured outputs vs `json_mode`

Feature	json_mode	Structured outputs
Output guaranteed to be valid JSON	Yes	Yes
Output guaranteed to match your schema	No	Yes (strict mode)
Schema definition required	No	Yes
Refusal case	No	Yes — check `message.refusal`
Pydantic integration	Manual	Native via `parse()`
Use case	Legacy / loose	Production extraction/parsing

For any new code, use structured outputs. json_mode (response_format type json_object) is still available, but it offers no schema enforcement: it guarantees valid JSON and nothing about which fields or types that JSON contains, so any shape checking is left to your code.
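To illustrate what "no schema enforcement" means in practice, the sketch below shows the kind of shape check you would have to write yourself around json_mode output. check_shape is a hypothetical helper, the input string is made-up sample output, and the check assumes the top-level JSON value is an object.

```python
import json


def check_shape(raw: str, required: set[str]) -> tuple[set[str], set[str]]:
    """Return (missing keys, unexpected keys) for a JSON object string."""
    data = json.loads(raw)
    keys = set(data)
    return required - keys, keys - required


# Valid JSON, but not the shape the caller wanted.
sample = '{"name": "Sarah Chen", "title": "VP of Engineering"}'
missing, extra = check_shape(sample, {"name", "role", "company"})

print(sorted(missing))
print(sorted(extra))
```

Here the output parses cleanly yet lacks role and company and adds an unrequested title field, which is the class of problem strict structured outputs are designed to rule out.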

FAQ

What's the difference between json_mode and structured outputs? json_mode only guarantees valid JSON. Structured outputs guarantee the JSON matches your exact schema. Use structured outputs for any production extraction pipeline.

Can I use Pydantic with structured outputs? Yes — client.beta.chat.completions.parse(response_format=YourModel, ...) handles schema generation and parsing automatically. Access the result via message.parsed.

What does strict mode require? All objects need "additionalProperties": false, all properties must be in "required", and optional fields use ["type", "null"] unions. Strict mode adds minor latency on the first call to a given schema, then caches.

Last updated May 28, 2026. Code examples verified against the OpenAI Python SDK v1.x and OpenAI structured outputs documentation. API behaviour may change — confirm against the official docs before deploying to production.