Validation libraries — pydantic in Python, zod in TypeScript, similar tools elsewhere — are the enforcement layer that catches LLM-output variance before it reaches downstream code that doesn't tolerate variance. Most teams know they should use them. The teams that ship reliably actually do.
Schema layering
The validation layer sits between the LLM and the consumer:
- LLM produces output.
- Validation parses against schema.
- Valid: pass to consumer.
- Invalid: log, retry, fallback, or escalate.
The consumer sees only valid outputs. The variance is contained.
Error surface
Validation errors are observability events:
- What field failed?
- What was expected vs. received?
- What was the original LLM output?
These let the team diagnose issues. Common patterns:
- A specific field is consistently wrong → prompt issue.
- Outputs occasionally have extra fields → prompt or model issue.
- Confidence scores out of range → prompt issue.
Each pattern informs a fix.
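All three questions are answerable from pydantic's `ValidationError.errors()`; a minimal sketch of turning one into an observability event (the schema and event shape are assumptions):

```python
from pydantic import BaseModel, ValidationError


class Extraction(BaseModel):
    title: str
    confidence: float


raw = '{"title": "Q3 report", "confidence": "high"}'  # wrong type for confidence
try:
    Extraction.model_validate_json(raw)
except ValidationError as exc:
    for err in exc.errors():
        event = {
            "field": ".".join(str(p) for p in err["loc"]),  # what field failed
            "expected": err["type"],   # machine-readable error type
            "received": err["input"],  # the offending value
            "raw_output": raw,         # original LLM output, for diagnosis
        }
        print(event)
```

Logging the structured event rather than the stringified exception is what makes the pattern-spotting below possible.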
Reviewer ritual
Validation failures are reviewed:
- Daily summary of validation rates.
- Drilled into for patterns.
- Fixes shipped.
Without review, validation becomes silent fallback that hides real issues. With review, validation becomes a quality signal.
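The daily summary needs nothing fancy; aggregating logged failure events with the standard library is enough (the log shape and call count here are invented for illustration):

```python
from collections import Counter

# Hypothetical day of logged validation-failure events.
failures = [
    {"field": "confidence", "error": "float_parsing"},
    {"field": "confidence", "error": "float_parsing"},
    {"field": "tags", "error": "extra_forbidden"},
]
total_calls = 1_000

failure_rate = len(failures) / total_calls
by_field = Counter(f["field"] for f in failures)

print(f"validation failure rate: {failure_rate:.2%}")
for field, count in by_field.most_common():
    print(f"  {field}: {count}")  # a consistently failing field points at the prompt
```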
Performance cost
Validation is fast (typically sub-millisecond) but not free. For high-volume features, the cumulative cost can add up. The right approach is to validate at the boundary, not at every internal step.
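A quick way to sanity-check the cost on your own schemas (the schema is illustrative and the numbers vary by machine):

```python
import time

from pydantic import BaseModel


class Extraction(BaseModel):
    title: str
    confidence: float


raw = '{"title": "Q3 report", "confidence": 0.9}'
n = 10_000
start = time.perf_counter()
for _ in range(n):
    Extraction.model_validate_json(raw)
per_call_ms = (time.perf_counter() - start) / n * 1000
print(f"{per_call_ms:.4f} ms per validation")  # typically well under a millisecond
```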
A real stack
A team's setup:
- LLM call with structured output mode.
- Pydantic model defines the shape.
- `model.model_validate_json(response.text)` at the boundary.
- ValidationError → log + retry once + fallback to default.
- Fallback rate tracked in observability.
Simple. Reliable. Works.
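Under those assumptions, the whole stack fits in a few lines; the model, the default, and the stubbed `call_llm` are illustrative stand-ins, not the team's actual code:

```python
from pydantic import BaseModel, ValidationError


class Summary(BaseModel):
    headline: str
    confidence: float


DEFAULT = Summary(headline="(unavailable)", confidence=0.0)
fallback_count = 0  # exported to observability as a fallback rate


def call_llm() -> str:
    # Stand-in for an LLM call in structured output mode.
    return '{"headline": "Revenue up 8%", "confidence": 0.8}'


def summarize() -> Summary:
    global fallback_count
    for attempt in range(2):  # original call plus one retry
        try:
            return Summary.model_validate_json(call_llm())
        except ValidationError as exc:
            print(f"attempt {attempt + 1} failed: {exc.error_count()} errors")
    fallback_count += 1
    return DEFAULT


result = summarize()
```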
What we won't ship
LLM outputs crossing system boundaries without validation.
Validation that swallows errors silently.
Schemas that are too permissive (if anything passes, the validation isn't doing any work).
Schemas that are too restrictive (rejecting valid outputs).
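The two schema failure modes are easy to see side by side in pydantic (the field choices are illustrative):

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError


class Strict(BaseModel):
    # extra="forbid" rejects unexpected fields; Field(ge=..., le=...) bounds values.
    model_config = ConfigDict(extra="forbid")
    confidence: float = Field(ge=0.0, le=1.0)


class Permissive(BaseModel):
    # extra="allow" plus an unconstrained float: almost anything passes,
    # so the validation isn't doing work.
    model_config = ConfigDict(extra="allow")
    confidence: float


bad = '{"confidence": 1.7, "surprise": true}'
Permissive.model_validate_json(bad)  # accepted: too permissive
try:
    Strict.model_validate_json(bad)
except ValidationError:
    rejected = True  # out-of-range value and extra field both flagged
```

The opposite risk is just as real: a schema that forbids fields the model legitimately produces will reject valid outputs, which shows up as an inflated fallback rate.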
Close
Validation libraries are the practical layer that turns LLM outputs into trustworthy data. Schema. Validate. Handle failures. Review patterns. Ship. The library is mature; the discipline is what makes it work.
Related reading
- Structured output — preceding layer.
- Probabilistic with deterministic contracts — surrounding pattern.
- Constrained decoding — alternative approach.
We build AI-enabled software and help businesses put AI to work. If you're tightening output validation, we'd love to hear about it. Get in touch.