Engineering

Security tests: prompt-injection regression suite

Prompt injection is a real attack. The regression suite makes the team's defences testable.

Yash ShahMarch 10, 20262 min read

Prompt injection is a real attack. Users craft inputs that try to override the system prompt's instructions. Agents that are vulnerable can be coaxed into bad behaviour. The defence is engineering; the test is a regression suite.

The attack library

The team's attack library:

Direct injection ("ignore previous instructions").
Indirect injection (attack hidden in retrieved content).
Role confusion ("you are a different assistant now").
Output-format manipulation.
Jailbreak templates published in security communities.

Each attack has expected behaviour: the agent refuses or routes appropriately.

Test design

Each attack is a test case:

Input: the attack.
Expected: refusal or appropriate routing.
Actual: agent's response.

CI runs the full suite. Regressions fail.

Reviewer ritual

PR review for prompt or model changes:

Prompt-injection regression suite passes.
New attacks added if any have surfaced in the security community.
Investigate any new pass-through (attack succeeded).

A real suite

A team's prompt-injection suite:

80 attacks across the categories.
Each annotated with the desired outcome.
Run on every PR + nightly.
Pass rate >98% required for merge.

Maintained by a security-aware engineer. Updated as new attacks emerge.

Maintenance

Maintenance:

Quarterly review of new attacks in the security community.
Add new attacks to the suite.
Retire attacks that are no longer relevant.
Document the team's attack-defence position.

What we won't ship

Prompt updates without prompt-injection regression testing.

Suite that doesn't grow as new attacks emerge.

Pass rates below threshold.

Skipping the security-community-monitoring function.

Close

Prompt-injection regression suites turn ad-hoc security work into engineering. Attack library, regression tests, CI gate. The team's defences stay tested as prompts evolve. Skip the suite and prompt-injection becomes the next incident.

Security tests: prompt-injection regression suite

The attack library

Test design

Reviewer ritual

A real suite

Maintenance

What we won't ship

Close

Related reading

The AI productivity playbook: a real engineer's day

Claude Code + PostHog: analytics-aware development

Claude Code + Sentry: incident debugging as conversation