Tagged · AI Engineering

Field notes, AI Engineering.

46 articles in this tag — part of the Jaypore Labs journal.

01
Engineering
Determinism harnesses for non-deterministic systems
Apr 30, 2026 · 2 min read
02
Engineering
What makes an eval good
Apr 29, 2026 · 7 min read
03
Engineering
Integration tests for AI features: contract or behavioural?
Apr 27, 2026 · 3 min read
04
Engineering
CI strategy: smoke vs. full suite for LLM apps
Apr 24, 2026 · 2 min read
05
Engineering
End-to-end tests for AI workflows: scope and survival
Apr 23, 2026 · 2 min read
06
Engineering
The post-launch test plan: what runs forever
Apr 15, 2026 · 3 min read
07
Engineering
LLM evals are restaurant health inspections
Apr 14, 2026 · 4 min read
08
Engineering
Tests for retrieval pipelines
Apr 10, 2026 · 2 min read
09
Engineering
Cost tests: catching the prompt that doubled spend
Apr 9, 2026 · 2 min read
10
Engineering
UX tests for AI-generated content
Apr 8, 2026 · 2 min read
11
Engineering
Property-based testing for LLM features
Apr 1, 2026 · 2 min read
12
Engineering
Building your first eval set from scratch
Mar 31, 2026 · 8 min read
13
Engineering
Regression cohorts: catching what evals miss
Mar 26, 2026 · 3 min read
14
Engineering
Drift tests vs. functional tests: separate lanes
Mar 25, 2026 · 3 min read
15
Engineering
Privacy tests: PII redaction assertions
Mar 24, 2026 · 2 min read
16
AI
RAG is a public library (and Dewey was right)
Mar 24, 2026 · 4 min read
17
Engineering
Golden-set discipline
Mar 20, 2026 · 3 min read
18
Engineering
Test-data management for AI: synthetic vs. real
Mar 19, 2026 · 2 min read
19
Engineering
Behavioural assertions: testing 'should-ness'
Mar 18, 2026 · 2 min read
20
Engineering
Eval taxonomy: golden, behavioural, drift, safety
Mar 18, 2026 · 3 min read
21
Engineering
PII in test fixtures: the boring legal slope
Mar 16, 2026 · 3 min read
22
Engineering
Mock LLMs in tests: when to fake, when to call
Mar 12, 2026 · 3 min read
23
Engineering
The new test pyramid for AI products
Mar 11, 2026 · 7 min read
24
Engineering
Security tests: prompt-injection regression suite
Mar 10, 2026 · 2 min read
25
Engineering
Authoring eval cases
Mar 5, 2026 · 2 min read
26
Engineering
Snapshot tests: where they help, where they trap
Mar 5, 2026 · 2 min read
27
Engineering
Tests for streaming responses
Mar 5, 2026 · 2 min read
28
Engineering
Accessibility tests for AI surfaces
Mar 3, 2026 · 2 min read
29
Engineering
Eval-driven development
Mar 3, 2026 · 3 min read
30
Engineering
Performance tests: token budgets and latency SLAs
Mar 3, 2026 · 2 min read
31
AI
Small models are underrated: a case for boring infrastructure
Feb 24, 2026 · 4 min read
32
Engineering
Multi-model routing: the dispatcher pattern for LLMs
Feb 20, 2026 · 4 min read
33
Engineering
AI latency budgets: borrowing from network engineering
Feb 9, 2026 · 4 min read
34
Engineering
AI feature flags: a model rollout looks like a deployment
Feb 5, 2026 · 4 min read
35
Engineering
AI canary deployments: 1% traffic, 100% paranoia
Feb 2, 2026 · 4 min read
36
Engineering
Embedding model selection: the 5-minute decision tree
Jan 29, 2026 · 4 min read
37
Engineering
Vector DB architecture: pgvector, managed, or homemade
Jan 26, 2026 · 4 min read
38
Engineering
RAG vs. fine-tuning: a 90% decision tree
Jan 22, 2026 · 4 min read
39
Engineering
Token economics: what your unit cost actually is
Jan 19, 2026 · 4 min read
40
AI
AI for product managers: the new PRD looks like an eval set
Jan 14, 2026 · 4 min read
41
Leadership
AI team scaling: 1, 3, and 10 engineers
Jan 12, 2026 · 5 min read
42
Engineering
An AI-aware pull request template
Jan 8, 2026 · 5 min read
43
Engineering
Self-healing pipelines: the night shift you don't have to pay
Jan 5, 2026 · 4 min read
44
Engineering
Agent supervision loops: the OODA loop, re-implemented
Jan 2, 2026 · 4 min read
45
Engineering
EU AI Act: what changes in your engineering process
Dec 30, 2025 · 4 min read
46
AI
AI and air traffic control: a 70-year-old playbook for safe autonomy
Dec 18, 2025 · 4 min read
