Tagged · AI Engineering
Field notes, AI Engineering.
46 articles in this tag — part of the Jaypore Labs journal.
- 01 Engineering
Determinism harnesses for non-deterministic systems
Apr 30, 2026 · 2 min read
- 02 Engineering
What makes an eval good
Apr 29, 2026 · 7 min read
- 03 Engineering
Integration tests for AI features: contract or behavioural?
Apr 27, 2026 · 3 min read
- 04 Engineering
CI strategy: smoke vs. full suite for LLM apps
Apr 24, 2026 · 2 min read
- 05 Engineering
End-to-end tests for AI workflows: scope and survival
Apr 23, 2026 · 2 min read
- 06 Engineering
The post-launch test plan: what runs forever
Apr 15, 2026 · 3 min read
- 07 Engineering
LLM evals are restaurant health inspections
Apr 14, 2026 · 4 min read
- 08 Engineering
Tests for retrieval pipelines
Apr 10, 2026 · 2 min read
- 09 Engineering
Cost tests: catching the prompt that doubled spend
Apr 9, 2026 · 2 min read
- 10 Engineering
UX tests for AI-generated content
Apr 8, 2026 · 2 min read
- 11 Engineering
Property-based testing for LLM features
Apr 1, 2026 · 2 min read
- 12 Engineering
Building your first eval set from scratch
Mar 31, 2026 · 8 min read
- 13 Engineering
Regression cohorts: catching what evals miss
Mar 26, 2026 · 3 min read
- 14 Engineering
Drift tests vs. functional tests: separate lanes
Mar 25, 2026 · 3 min read
- 15 Engineering
Privacy tests: PII redaction assertions
Mar 24, 2026 · 2 min read
- 16 AI
RAG is a public library (and Dewey was right)
Mar 24, 2026 · 4 min read
- 17 Engineering
Golden-set discipline
Mar 20, 2026 · 3 min read
- 18 Engineering
Test-data management for AI: synthetic vs. real
Mar 19, 2026 · 2 min read
- 19 Engineering
Behavioural assertions: testing 'should-ness'
Mar 18, 2026 · 2 min read
- 20 Engineering
Eval taxonomy: golden, behavioural, drift, safety
Mar 18, 2026 · 3 min read
- 21 Engineering
PII in test fixtures: the boring legal slope
Mar 16, 2026 · 3 min read
- 22 Engineering
Mock LLMs in tests: when to fake, when to call
Mar 12, 2026 · 3 min read
- 23 Engineering
The new test pyramid for AI products
Mar 11, 2026 · 7 min read
- 24 Engineering
Security tests: prompt-injection regression suite
Mar 10, 2026 · 2 min read
- 25 Engineering
Authoring eval cases
Mar 5, 2026 · 2 min read
- 26 Engineering
Snapshot tests: where they help, where they trap
Mar 5, 2026 · 2 min read
- 27 Engineering
Tests for streaming responses
Mar 5, 2026 · 2 min read
- 28 Engineering
Accessibility tests for AI surfaces
Mar 3, 2026 · 2 min read
- 29 Engineering
Eval-driven development
Mar 3, 2026 · 3 min read
- 30 Engineering
Performance tests: token budgets and latency SLAs
Mar 3, 2026 · 2 min read
- 31 AI
Small models are underrated: a case for boring infrastructure
Feb 24, 2026 · 4 min read
- 32 Engineering
Multi-model routing: the dispatcher pattern for LLMs
Feb 20, 2026 · 4 min read
- 33 Engineering
AI latency budgets: borrowing from network engineering
Feb 9, 2026 · 4 min read
- 34 Engineering
AI feature flags: a model rollout looks like a deployment
Feb 5, 2026 · 4 min read
- 35 Engineering
AI canary deployments: 1% traffic, 100% paranoia
Feb 2, 2026 · 4 min read
- 36 Engineering
Embedding model selection: the 5-minute decision tree
Jan 29, 2026 · 4 min read
- 37 Engineering
Vector DB architecture: pgvector, managed, or homemade
Jan 26, 2026 · 4 min read
- 38 Engineering
RAG vs. fine-tuning: a 90% decision tree
Jan 22, 2026 · 4 min read
- 39 Engineering
Token economics: what your unit cost actually is
Jan 19, 2026 · 4 min read
- 40 AI
AI for product managers: the new PRD looks like an eval set
Jan 14, 2026 · 4 min read
- 41 Leadership
AI team scaling: 1, 3, and 10 engineers
Jan 12, 2026 · 5 min read
- 42 Engineering
An AI-aware pull request template
Jan 8, 2026 · 5 min read
- 43 Engineering
Self-healing pipelines: the night shift you don't have to pay
Jan 5, 2026 · 4 min read
- 44 Engineering
Agent supervision loops: the OODA loop, re-implemented
Jan 2, 2026 · 4 min read
- 45 Engineering
EU AI Act: what changes in your engineering process
Dec 30, 2025 · 4 min read
- 46 AI
AI and air traffic control: a 70-year-old playbook for safe autonomy
Dec 18, 2025 · 4 min read