Structured Output
Intent
Constrain LLM outputs to follow a specific schema (JSON, XML, typed objects), ensuring reliable downstream processing.
Problem
LLMs generate free-form text by default. When you need to parse the output programmatically — to feed it to another system, store it in a database, or pass it to the next step in a chain — free-form text is unreliable. The model might format things differently each time, include unexpected fields, or wrap the data in conversational text.
Solution
Specify an output schema (JSON Schema, Pydantic model, TypeScript type) and instruct the model to conform to it. Modern LLM APIs support constrained decoding that guarantees valid JSON matching your schema. For older models, use parsing with retry logic. This is critical for agentic systems where tool calls, decisions, and intermediate results all need to be machine-readable.
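As one concrete illustration of constrained decoding, here is what a JSON Schema request payload might look like, shaped after OpenAI's `json_schema` response format. The schema matches the thermostat example in the diagram below; the wrapper field names (`name`, `strict`, `schema`) follow OpenAI's API, so check your own provider's documentation before relying on this exact shape.

```python
import json

# JSON Schema for the thermostat action: { action, parameters: { target, value } }.
action_schema = {
    "type": "object",
    "properties": {
        "action": {"type": "string"},
        "parameters": {
            "type": "object",
            "properties": {
                "target": {"type": "string"},
                "value": {"type": "number"},
            },
            "required": ["target", "value"],
            "additionalProperties": False,
        },
    },
    "required": ["action", "parameters"],
    "additionalProperties": False,
}

# Request wrapper, per OpenAI's "json_schema" response_format (an assumption:
# other providers use different wrappers for the same underlying idea).
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "device_action",
        "strict": True,  # ask the API to enforce the schema exactly
        "schema": action_schema,
    },
}

print(json.dumps(response_format, indent=2))
```

With `strict` enforcement, the model cannot emit extra fields or wrap the JSON in conversational text; the decoder only samples tokens that keep the output schema-valid.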
Diagram
Schema: { action: string, parameters: { target: string, value: number } }
Without structured output:
LLM → "I'll set the temperature to 72 degrees for the living room."
(Good luck parsing that reliably)
With structured output:
LLM → { "action": "set_temperature", "parameters": { "target": "living_room", "value": 72 } }
(Directly usable by code)
When to Use
- Any time LLM output needs to be parsed by code
- Function/tool calling responses
- Data extraction and classification tasks
- Multi-step pipelines where output format must be consistent
When NOT to Use
- Free-form creative writing or conversation
- When the output is presented directly to humans
Pros & Cons
Pros
- Guaranteed valid, parseable output (with constrained decoding)
- Eliminates output parsing failures
- Self-documenting: the schema IS the documentation
- Enables reliable multi-step pipelines
Cons
- Constrained decoding can slightly reduce output quality
- Complex schemas may confuse the model
- Not all providers support full schema enforcement
- Schema design requires upfront thought
Implementation Steps
1. Define your output schema (JSON Schema, Pydantic, Zod, etc.)
2. Use your LLM provider's structured output feature if available
3. If not available, include the schema in the prompt with clear instructions
4. Add validation: verify the output matches the schema even with enforcement
5. Implement retry with feedback for schema violations
6. Keep schemas simple — complex nested schemas increase error rates
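Steps 3-5 can be sketched as a validate-and-retry loop. This is a minimal stdlib-only sketch: the `call_llm` callable and the schema hint below are illustrative stand-ins, not any specific provider's API.

```python
import json

# Illustrative schema instruction embedded in the prompt (step 3).
SCHEMA_HINT = (
    'Reply with JSON only: {"category": "support|sales|spam", '
    '"urgency": "low|medium|high"}'
)

def validate(raw: str) -> dict:
    """Parse and check one reply; raise ValueError with a specific message (step 4)."""
    data = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if data.get("category") not in {"support", "sales", "spam"}:
        raise ValueError(f"invalid category: {data.get('category')!r}")
    if data.get("urgency") not in {"low", "medium", "high"}:
        raise ValueError(f"invalid urgency: {data.get('urgency')!r}")
    return data

def extract_with_retry(call_llm, message: str, max_retries: int = 2) -> dict:
    """Retry with feedback on schema violations (step 5)."""
    prompt = f"{SCHEMA_HINT}\n\n{message}"
    for _ in range(max_retries + 1):
        raw = call_llm(prompt)
        try:
            return validate(raw)
        except ValueError as err:
            # Feed the violation back so the model can correct itself.
            prompt = (
                f"{SCHEMA_HINT}\n\nYour last reply was invalid ({err}). "
                f"Try again.\n\n{message}"
            )
    raise RuntimeError("schema violations persisted after retries")
```

Passing the validation error back in the retry prompt matters: a bare "try again" gives the model nothing to correct, while "invalid category: 'billing'" usually fixes the problem in one extra round trip.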
Real-World Example
Email Classification Pipeline
Schema requires: { category: 'support' | 'sales' | 'spam', urgency: 'low' | 'medium' | 'high', summary: string, action_needed: boolean }. Every email is classified into this exact structure, enabling automated routing and prioritization without parsing ambiguity.
from openai import OpenAI
from pydantic import BaseModel, Field

client = OpenAI()

class TaskExtraction(BaseModel):
    tasks: list[str] = Field(description="Actionable tasks extracted from the message")
    priority: str = Field(description="Overall priority: low, medium, high")
    deadline: str | None = Field(description="Deadline if mentioned, ISO format")

def extract_tasks(message: str) -> TaskExtraction:
    completion = client.beta.chat.completions.parse(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Extract tasks, priority, and deadline from messages."},
            {"role": "user", "content": message},
        ],
        response_format=TaskExtraction,
    )
    return completion.choices[0].message.parsed

result = extract_tasks("Finish the report by Friday and review the budget — high priority!")
# result.tasks -> ["Finish the report", "Review the budget"]
# result.priority -> "high"
# result.deadline -> "2025-03-14"
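The email classification schema from the pipeline above can also be modeled without third-party dependencies, using stdlib `Literal` types to pin the allowed values. This is a hedged sketch: `EmailClassification` and `parse_classification` are illustrative names, not part of any library.

```python
from dataclasses import dataclass
from typing import Literal, get_args
import json

# Allowed values mirror the pipeline schema:
# { category: 'support' | 'sales' | 'spam', urgency: 'low' | 'medium' | 'high', ... }
Category = Literal["support", "sales", "spam"]
Urgency = Literal["low", "medium", "high"]

@dataclass
class EmailClassification:
    category: Category
    urgency: Urgency
    summary: str
    action_needed: bool

def parse_classification(raw: str) -> EmailClassification:
    """Parse the model's JSON reply, rejecting values outside the Literal sets."""
    data = json.loads(raw)
    if data.get("category") not in get_args(Category):
        raise ValueError(f"bad category: {data.get('category')!r}")
    if data.get("urgency") not in get_args(Urgency):
        raise ValueError(f"bad urgency: {data.get('urgency')!r}")
    return EmailClassification(**data)

reply = (
    '{"category": "support", "urgency": "high", '
    '"summary": "User cannot log in", "action_needed": true}'
)
classified = parse_classification(reply)
# classified.category -> "support"
# classified.action_needed -> True
```

The same `Literal` types work directly as Pydantic field annotations, so this model drops into the `client.beta.chat.completions.parse` call above unchanged if you prefer provider-side enforcement.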