
Eval ownership in an org: PM, eng, or QA?

Eval needs a named owner. The right role depends on the org.

Yash Shah · March 3, 2026 · 2 min read

A team's eval set was "everyone's responsibility." It didn't grow. It didn't get reviewed. Quality drifted.

Eval needs an owner. The right role depends on the org.

The three options

Product Manager. Owns what good means. Authors cases that reflect product intent. Reviews drift trends.

Engineering. Owns the eval infrastructure. Adds cases mined from production. Maintains the test pipeline.

QA. Treats the eval as a test suite. Ensures coverage. Authors edge cases.

Each can work; each has trade-offs.

When each wins

PM-owned: when product clarity is the bottleneck.
Eng-owned: when infrastructure is immature or the eval needs deep technical investment.
QA-owned: when the team has a strong QA function and clear product specs.

For early-stage startups, PM ownership often works: the PM has the product clarity and engineering supports. For larger orgs, a dedicated eval engineer or ML-ops role is usually the better fit.

Reviewer ritual

The owner:

Drives quarterly reviews.
Surfaces eval-related concerns to leadership.
Owns eval-set growth.
Makes the call on threshold changes.
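
The duties above can be made checkable. What follows is a minimal sketch, not a description of any particular team's tooling. It assumes a hypothetical `EvalSet` record with one named owner, approximates "quarterly" as 92 days, and treats the threshold as a simple pass-rate gate. All names here are illustrative.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: "quarterly" is approximated as 92 days.
REVIEW_INTERVAL = timedelta(days=92)


@dataclass
class EvalSet:
    """Hypothetical record for one eval set and its single named owner."""
    name: str
    owner: str          # one named person or role, never "everyone"
    last_review: date   # date of the most recent quarterly review
    threshold: float    # pass-rate gate (illustrative)


def review_overdue(eval_set: EvalSet, today: date) -> bool:
    """Return True when the last review is older than one review interval."""
    return today - eval_set.last_review > REVIEW_INTERVAL


def change_threshold(eval_set: EvalSet, new_threshold: float, requested_by: str) -> EvalSet:
    """Apply a threshold change only when the named owner makes the call."""
    if requested_by != eval_set.owner:
        raise PermissionError(
            f"{requested_by} does not own {eval_set.name}; ask {eval_set.owner}"
        )
    eval_set.threshold = new_threshold
    return eval_set
```

The point of the sketch is the single `owner` field: a threshold change has exactly one person who can approve it, and an overdue review has exactly one person to ask about it.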

A real org

A team we work with:

PM owns each feature's eval set.
Engineering owns the infrastructure (CI, dashboard, mining tooling).
The quarterly review is attended by the PM, the lead engineer, and leadership.

The split works because each role has clear accountability.

Trade-offs

PM-owned: strong product alignment, but may lack technical depth.
Eng-owned: technical depth, but may lack a product-quality lens.
QA-owned: systematic coverage, but may lack product authority.

The right choice depends on the org's maturity and which role can execute.

What we won't ship

Evals as "everyone's responsibility."

Owners without authority to make decisions.

Owners without time allocated to eval work.

Skipping the org-level review of eval health.
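
The four anti-patterns above can be linted for if ownership is written down. This is a rough sketch under stated assumptions: the `Ownership` record, its field names, the list of "shared owner" labels, and the 92-day review window are all hypothetical choices for illustration, not an existing tool or standard.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

# Assumption: labels that mean "nobody in particular" rather than a named owner.
SHARED_OWNER_LABELS = {"", "everyone", "team", "all"}
# Assumption: an org-level review older than roughly a quarter counts as skipped.
MAX_REVIEW_AGE = timedelta(days=92)


@dataclass
class Ownership:
    """Hypothetical ownership record for one eval set."""
    eval_set: str
    owner: str
    has_decision_authority: bool      # can the owner change thresholds?
    hours_per_week: float             # time explicitly allocated to eval work
    last_org_review: Optional[date]   # None if no org-level review ever happened


def ownership_problems(o: Ownership, today: date) -> List[str]:
    """Return the anti-patterns this record exhibits, in a fixed order."""
    problems = []
    if o.owner.strip().lower() in SHARED_OWNER_LABELS:
        problems.append("no named owner")
    if not o.has_decision_authority:
        problems.append("owner lacks decision authority")
    if o.hours_per_week <= 0:
        problems.append("no time allocated")
    if o.last_org_review is None or today - o.last_org_review > MAX_REVIEW_AGE:
        problems.append("org-level review skipped or overdue")
    return problems
```

A check like this could run alongside the eval pipeline, so a missing owner shows up as a failing check rather than as slow quality drift.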

Close

Eval ownership is org design. PM, eng, or QA. Each can work. The team picks based on capabilities and stage. Skip the ownership question and the eval atrophies.

