Field notes
The studio, out loud.
Lessons from shipping AI products, building with small teams, and the occasional strong opinion about software.
- 01 Engineering
The AI productivity playbook: a real engineer's day
May 15, 2026 · 8 min read
- 02 Engineering
Claude Code + PostHog: analytics-aware development
May 14, 2026 · 7 min read
- 03 Engineering
Claude Code + Sentry: incident debugging as conversation
May 13, 2026 · 7 min read
- 04 Engineering
Claude Code + Supabase: a working integration via MCP
May 12, 2026 · 7 min read
- 05 Engineering
Effective MCP patterns: keeping AI tools safe at scale
May 11, 2026 · 8 min read
- 06 Engineering
MCP fundamentals: connecting your AI tools to your team's stack
May 8, 2026 · 8 min read
- 07 Engineering
Claude Code vs. Codex: which to reach for
May 7, 2026 · 7 min read
- 08 Engineering
Getting started with Codex: install to first real task
May 6, 2026 · 7 min read
- 09 Engineering
Getting started with Claude Code: install to first real task
May 5, 2026 · 8 min read
- 10 Engineering
AI tools for software engineers: a practical orientation
May 4, 2026 · 7 min read
- 11 Engineering
Determinism harnesses for non-deterministic systems
Apr 30, 2026 · 2 min read
- 12 Engineering
Multi-agent orchestration: from kitchen brigade to opera
Apr 30, 2026 · 3 min read
- 13 Engineering
Retry strategies that don't compound errors
Apr 30, 2026 · 3 min read
- 14 Engineering
Tech lead: PR reviews deeper than 'lgtm'
Apr 30, 2026 · 4 min read
- 15 Engineering
Your first MCP server (Node)
Apr 29, 2026 · 8 min read
- 16 Engineering
MCP error handling: tell the model what went wrong
Apr 29, 2026 · 2 min read
- 17 Engineering
Security: threat-model first draft from architecture
Apr 29, 2026 · 4 min read
- 18 Engineering
What makes an eval good
Apr 29, 2026 · 7 min read
- 19 Strategy
Sales: discovery summariser that keeps the human
Apr 28, 2026 · 5 min read
- 20 Engineering
Data: pipeline DAG explainer + drift detector
Apr 28, 2026 · 5 min read
- 21 Engineering
MCP for CI/CD: build-system tools as agent inputs
Apr 28, 2026 · 2 min read
- 22 Engineering
Trend evals vs. threshold evals
Apr 28, 2026 · 2 min read
- 23 Engineering
Backend: API design + endpoint scaffolding
Apr 27, 2026 · 9 min read
- 24 Engineering
Data: SQL refactors and lineage maps
Apr 27, 2026 · 5 min read
- 25 Engineering
Fall-back chains: cheap → expensive → human
Apr 27, 2026 · 3 min read
- 26 Engineering
Integration tests for AI features: contract or behavioural?
Apr 27, 2026 · 3 min read
- 27 Engineering
CI strategy: smoke vs. full suite for LLM apps
Apr 24, 2026 · 2 min read
- 28 Engineering
Self-consistency: when N=3 beats a smarter prompt
Apr 24, 2026 · 3 min read
- 29 Engineering
SRE: postmortem first drafts that don't blame
Apr 24, 2026 · 5 min read
- 30 Engineering
Tech writer: doc audits that catch what humans miss
Apr 24, 2026 · 4 min read
- 31 AI
Agents in government: constituent services with public-records care
Apr 23, 2026 · 5 min read
- 32 Engineering
Cost guardrails: stop runaway agents before billing does
Apr 23, 2026 · 6 min read
- 33 Engineering
End-to-end tests for AI workflows: scope and survival
Apr 23, 2026 · 2 min read
- 34 Engineering
MCP for actioning tools (PR creator, ticket closer)
Apr 23, 2026 · 2 min read
- 35 Engineering
Frontend: accessibility passes that finally get done
Apr 22, 2026 · 4 min read
- 36 Engineering
MCP and the Claude Code workflow specifically
Apr 22, 2026 · 2 min read
- 37 Engineering
Pairwise judges: A/B agreement at scale
Apr 22, 2026 · 2 min read
- 38 Engineering
Pinning model versions through provider migrations
Apr 22, 2026 · 2 min read
- 39 Engineering
Drift catchers: detecting style shifts
Apr 21, 2026 · 2 min read
- 40 Engineering
Eval CI: the pass/fail gate that's actually useful
Apr 21, 2026 · 2 min read
- 41 Engineering
Prompt invariance: prompts that survive paraphrase
Apr 21, 2026 · 3 min read
- 42 Engineering
Tool failure modes: timeouts, retries, idempotency
Apr 21, 2026 · 4 min read
- 43 Engineering
Context engineering: what to load, what to defer
Apr 20, 2026 · 4 min read
- 44 Engineering
Output validation: pydantic, zod, and friends in production
Apr 20, 2026 · 2 min read
- 45 Engineering
Versioning model + prompt as a unit
Apr 20, 2026 · 3 min read
- 46 AI Development
Why AI-First Development Matters for Modern SaaS Products
Apr 20, 2026 · 2 min read
- 47 AI
Agents in hospitality: reservations + recovery
Apr 17, 2026 · 5 min read
- 48 AI
Agents in HR: recruiting agents and the bias receipts they leave behind
Apr 17, 2026 · 5 min read
- 49 Engineering
Backend: database migrations without fear
Apr 17, 2026 · 5 min read
- 50 Engineering
ML: feature-store query rewrites
Apr 17, 2026 · 4 min read
- 51 Engineering
Building agents that explain themselves
Apr 16, 2026 · 3 min read
- 52 Engineering
Constrained decoding: the underrated lever
Apr 16, 2026 · 3 min read
- 53 Engineering
Mobile (Android): Compose rollout audits
Apr 16, 2026 · 4 min read
- 54 Engineering
Safety guardrails: refusal patterns that don't make agents useless
Apr 16, 2026 · 3 min read
- 55 Strategy
Finance: variance commentary that reads like the CFO wrote it
Apr 15, 2026 · 5 min read
- 56 Engineering
Confidence calibration: when 'I don't know' is the answer
Apr 15, 2026 · 3 min read
- 57 Engineering
Counter-example mining
Apr 15, 2026 · 3 min read
- 58 Engineering
The post-launch test plan: what runs forever
Apr 15, 2026 · 3 min read
- 59 Engineering
SRE: runbook generation that captures the response
Apr 15, 2026 · 5 min read
- 60 AI
Agents in legal: contract review with receipts
Apr 14, 2026 · 5 min read
- 61 AI
Agents on the factory floor
Apr 14, 2026 · 5 min read
- 62 Strategy
The hand-off contract — turning an AI employee from prototype to permanent
Apr 14, 2026 · 5 min read
- 63 Engineering
LLM evals are restaurant health inspections
Apr 14, 2026 · 4 min read
- 64 Engineering
Retiring an agent
Apr 14, 2026 · 3 min read
- 65 Engineering
Long-horizon tasks: keeping an agent on rails for hours
Apr 13, 2026 · 4 min read
- 66 Engineering
MCP authorization: per-user permissions
Apr 13, 2026 · 2 min read
- 67 Engineering
MCP composition: when one server should call another
Apr 13, 2026 · 2 min read
- 68 Engineering
MCP server versioning: shipping breaking changes safely
Apr 13, 2026 · 2 min read
- 69 Engineering
MCP transport: stdio vs. HTTP vs. SSE
Apr 13, 2026 · 2 min read
- 70 Engineering
Deploying agents in CI: scoped, audited, repeatable
Apr 10, 2026 · 7 min read
- 71 AI
Agents in media: news summary with a corrections workflow
Apr 10, 2026 · 4 min read
- 72 Engineering
Caching deterministic prefixes
Apr 10, 2026 · 3 min read
- 73 Engineering
Eval result storage and versioning
Apr 10, 2026 · 2 min read
- 74 Engineering
Tests for retrieval pipelines
Apr 10, 2026 · 2 min read
- 75 Engineering
Beyond MCP: tool-use specs in major models
Apr 9, 2026 · 2 min read
- 76 Engineering
Cost tests: catching the prompt that doubled spend
Apr 9, 2026 · 2 min read
- 77 Engineering
The judge pattern for confidence
Apr 9, 2026 · 3 min read
- 78 Engineering
MCP in 10 minutes
Apr 9, 2026 · 6 min read
- 79 Engineering
QA: test-plan generation from acceptance criteria
Apr 9, 2026 · 5 min read
- 80 Engineering
Versioning agent behaviour: prompts as source code
Apr 8, 2026 · 3 min read
- 81 Strategy
Founder ops: board-deck content from raw metrics
Apr 8, 2026 · 5 min read
- 82 Strategy
HR: performance-review draft assistant
Apr 8, 2026 · 5 min read
- 83 Strategy
EM: PR reviewer that flags scope creep
Apr 8, 2026 · 5 min read
- 84 Engineering
UX tests for AI-generated content
Apr 8, 2026 · 2 min read
- 85 Engineering
Agent observability: traces that tell you what happened
Apr 7, 2026 · 6 min read
- 86 AI
Agents in construction: estimator copilots in margin-thin work
Apr 7, 2026 · 4 min read
- 87 AI
Agents in energy: grid monitoring with a safety case
Apr 7, 2026 · 4 min read
- 88 Strategy
An AI employee isn't a bot — it's a teammate with a desk
Apr 7, 2026 · 9 min read
- 89 Engineering
Eval anti-patterns: when evals make products worse
Apr 7, 2026 · 3 min read
- 90 AI
Agents for non-profits: donor research on a tight budget
Apr 6, 2026 · 5 min read
- 91 Engineering
Browsing agents: scraping vs. structured tools
Apr 6, 2026 · 3 min read
- 92 Engineering
Eval-driven prompt iteration
Apr 6, 2026 · 2 min read
- 93 Engineering
Tool-use evals: right tool, right order
Apr 6, 2026 · 2 min read
- 94 Engineering
Voice-first agents: the latency budget you live within
Apr 6, 2026 · 3 min read
- 95 Engineering
Agent memory: what to write down, what to forget
Apr 3, 2026 · 3 min read
- 96 Engineering
Hallucination checks: cite-or-it-didn't-happen
Apr 3, 2026 · 3 min read
- 97 Engineering
MCP server observability
Apr 3, 2026 · 2 min read
- 98 Engineering
Prompt evolution: how agents get worse without you noticing
Apr 3, 2026 · 3 min read
- 99 Engineering
Red-teaming your own prompt
Apr 3, 2026 · 3 min read
- 100 Development
Building Electron Apps That Scale: Lessons from Healthcare Software
Apr 2, 2026 · 3 min read
- 101 Engineering
EM: 1:1 prep + roadmap sanity check
Apr 2, 2026 · 4 min read
- 102 Engineering
Frontend: component scaffolding + state machines
Apr 2, 2026 · 4 min read
- 103 Engineering
Full-stack: a real feature in an afternoon
Apr 2, 2026 · 5 min read
- 104 Engineering
Tests for tool-using agents: trace assertions
Apr 2, 2026 · 3 min read
- 105 Strategy
Product: synthesising 1,000 tickets into 7 themes
Apr 1, 2026 · 5 min read
- 106 Strategy
Sales: pipeline reviewer + forecast challenger
Apr 1, 2026 · 5 min read
- 107 Engineering
MCP authentication: tokens, scopes, OAuth
Apr 1, 2026 · 2 min read
- 108 Engineering
MCP server rate limits: the polite-rejection pattern
Apr 1, 2026 · 2 min read
- 109 Engineering
Property-based testing for LLM features
Apr 1, 2026 · 2 min read
- 110 Engineering
Building your first eval set from scratch
Mar 31, 2026 · 8 min read
- 111 Engineering
Evals for agents: trajectory + outcome
Mar 31, 2026 · 7 min read
- 112 Engineering
MCP and secrets management
Mar 31, 2026 · 2 min read
- 113 Engineering
MCP server hosting: local, sidecar, remote
Mar 31, 2026 · 2 min read
- 114 Engineering
MCP tool naming: making tools discoverable
Mar 31, 2026 · 2 min read
- 115 AI
Agents in telecom: diagnostics that route faster than tier-1
Mar 30, 2026 · 5 min read
- 116 Engineering
LLM-as-judge: when to trust it, when not
Mar 30, 2026 · 7 min read
- 117 Engineering
MCP for data tools (Postgres, BigQuery, S3)
Mar 30, 2026 · 2 min read
- 118 Healthcare
What medieval scribes teach us about AI scribes
Mar 30, 2026 · 5 min read
- 119 Engineering
Structured output: JSON mode, schemas, why one beats the other
Mar 30, 2026 · 7 min read
- 120 AI
Agents in sales: SDR copilots that don't get you blocked
Mar 27, 2026 · 5 min read
- 121 Engineering
Idempotency keys for LLM calls
Mar 27, 2026 · 3 min read
- 122 Engineering
OSS maintainer: triage + contributor-guide updates
Mar 27, 2026 · 4 min read
- 123 Engineering
Prompts are recipes, not spells
Mar 27, 2026 · 4 min read
- 124 Engineering
Why we need MCP at all
Mar 27, 2026 · 2 min read
- 125 Engineering
Human eval workflows: instructions that don't vary
Mar 26, 2026 · 2 min read
- 126 Engineering
Judging open-ended output without a rubric
Mar 26, 2026 · 2 min read
- 127 AI
MCP servers are USB-C for AI
Mar 26, 2026 · 5 min read
- 128 Engineering
MCP tool schemas: arg shapes that help
Mar 26, 2026 · 2 min read
- 129 Engineering
Regression cohorts: catching what evals miss
Mar 26, 2026 · 3 min read
- 130 AI
Agents in support: tier-1 deflection without tier-1 backlash
Mar 25, 2026 · 5 min read
- 131 AI
Agents in insurance: claims processing, speed vs. accuracy
Mar 25, 2026 · 4 min read
- 132 Engineering
Code-writing agents: the test-first discipline
Mar 25, 2026 · 3 min read
- 133 Engineering
Drift tests vs. functional tests: separate lanes
Mar 25, 2026 · 3 min read
- 134 Engineering
Plan vs. act: the agent loop everyone gets wrong
Mar 25, 2026 · 6 min read
- 135 AI
Agents in retail: shoppers that don't feel creepy
Mar 24, 2026 · 5 min read
- 136 Strategy
Comms: crisis-comms first-draft drafter
Mar 24, 2026 · 5 min read
- 137 Engineering
Privacy tests: PII redaction assertions
Mar 24, 2026 · 2 min read
- 138 AI
RAG is a public library (and Dewey was right)
Mar 24, 2026 · 4 min read
- 139 Engineering
Sub-agents: when 1+1 actually equals 2
Mar 24, 2026 · 4 min read
- 140 AI
Your AI agent should plan like a kitchen brigade
Mar 23, 2026 · 5 min read
- 141 Engineering
Calibrating your judge: meta-evals
Mar 23, 2026 · 2 min read
- 142 Strategy
Product: PRD draft from a discovery transcript
Mar 23, 2026 · 5 min read
- 143 Engineering
Security: code-pattern audits and CVE sweeps
Mar 23, 2026 · 4 min read
- 144 Engineering
Tool design: write tools the way you write APIs
Mar 23, 2026 · 8 min read
- 145 AI
Agents in logistics: route planning with a human in the loop
Mar 20, 2026 · 5 min read
- 146 Strategy
Comms: internal newsletter that captures the actual week
Mar 20, 2026 · 5 min read
- 147 Engineering
Golden-set discipline
Mar 20, 2026 · 3 min read
- 148 Engineering
Why probabilistic systems still need deterministic contracts
Mar 20, 2026 · 7 min read
- 149 Engineering
Refusal grammars: predictable, not surprising
Mar 20, 2026 · 3 min read
- 150 AI
Agents for research synthesis
Mar 19, 2026 · 5 min read
- 151 Engineering
MCP for internal tools (Linear, Notion, Slack analogues)
Mar 19, 2026 · 2 min read
- 152 Engineering
ML: eval harness from a spec
Mar 19, 2026 · 4 min read
- 153 Engineering
Multimodal agents: when adding vision actually helps
Mar 19, 2026 · 4 min read
- 154 Engineering
Test-data management for AI: synthetic vs. real
Mar 19, 2026 · 2 min read
- 155 AI
Agents in healthcare: scribe yes, nurse no
Mar 18, 2026 · 8 min read
- 156 Engineering
Behavioural assertions: testing 'should-ness'
Mar 18, 2026 · 2 min read
- 157 Engineering
Eval taxonomy: golden, behavioural, drift, safety
Mar 18, 2026 · 3 min read
- 158 Engineering
Evals for retrieval: separating retrieval from synthesis
Mar 18, 2026 · 2 min read
- 159 Engineering
Your first MCP server (Python)
Mar 18, 2026 · 2 min read
- 160 Engineering
Agent A/B tests: comparing without confusing your users
Mar 17, 2026 · 3 min read
- 161 AI Tools
Your AI coding assistant is a midwife, not a genius
Mar 17, 2026 · 5 min read
- 162 Strategy
CS: onboarding playbook generator per customer
Mar 17, 2026 · 5 min read
- 163 Engineering
The deterministic-envelope pattern
Mar 17, 2026 · 3 min read
- 164 Engineering
MCP and prompt injection: ambient instructions
Mar 17, 2026 · 2 min read
- 165 AI
Agents in finance: compliance with an audit trail
Mar 16, 2026 · 4 min read
- 166 Strategy
Recruiting: JD writer + screening-question generator
Mar 16, 2026 · 5 min read
- 167 Engineering
Few-shot drift: why golden examples poison new versions
Mar 16, 2026 · 3 min read
- 168 Engineering
The judge pattern: agents that grade other agents
Mar 16, 2026 · 4 min read
- 169 Engineering
PII in test fixtures: the boring legal slope
Mar 16, 2026 · 3 min read
- 170 Engineering
Architect: vendor-comparison architecture doc
Mar 13, 2026 · 3 min read
- 171 Strategy
Finance: AP-invoice auditor on a small ops team
Mar 13, 2026 · 5 min read
- 172 Strategy
Legal-ops: internal policy answerer with citation discipline
Mar 13, 2026 · 5 min read
- 173 Engineering
A senior engineer's day with Claude Code
Mar 13, 2026 · 9 min read
- 174 Engineering
Skills files: recipes the model can call
Mar 13, 2026 · 4 min read
- 175 Strategy
Marketing: SEO audits as a Claude Code workflow
Mar 12, 2026 · 5 min read
- 176 Engineering
Evals that survive a model bump
Mar 12, 2026 · 3 min read
- 177 Engineering
Managed agents: when to reach for them
Mar 12, 2026 · 4 min read
- 178 Engineering
Mock LLMs in tests: when to fake, when to call
Mar 12, 2026 · 3 min read
- 179 Engineering
The red set: adversarial cases you're allowed to fail
Mar 12, 2026 · 2 min read
- 180 AI
Agents in agriculture: yield prediction with weather data
Mar 11, 2026 · 5 min read
- 181 AI
Agents in pharma: literature review with citation discipline
Mar 11, 2026 · 5 min read
- 182 Engineering
The new test pyramid for AI products
Mar 11, 2026 · 7 min read
- 183 Engineering
Per-feature evals vs. per-model evals
Mar 11, 2026 · 2 min read
- 184 Engineering
Sampling production traffic for eval
Mar 11, 2026 · 2 min read
- 185 AI
In-product agents that earn renewal
Mar 10, 2026 · 5 min read
- 186 Strategy
Marketing: the campaign-brief copilot
Mar 10, 2026 · 5 min read
- 187 Strategy
Operations: the SOP writer that updates itself
Mar 10, 2026 · 5 min read
- 188 Engineering
Security tests: prompt-injection regression suite
Mar 10, 2026 · 2 min read
- 189 Engineering
Temperature, top-p, and the production tradeoff
Mar 10, 2026 · 3 min read
- 190 Strategy
HR: onboarding-buddy automation
Mar 9, 2026 · 5 min read
- 191 Strategy
EM: sprint-planning copilot for managers
Mar 9, 2026 · 5 min read
- 192 Strategy
Operations: vendor-comparison briefs in 30 minutes
Mar 9, 2026 · 5 min read
- 193 SaaS
From MVP to Scale: A Practical Guide for SaaS Founders
Mar 9, 2026 · 4 min read
- 194 Engineering
QA: flaky test triage at scale
Mar 9, 2026 · 5 min read
- 195 Engineering
DevOps: CI pipeline diagnosis at 2am
Mar 6, 2026 · 4 min read
- 196 Engineering
DevOps: Terraform refactor with a watchful copilot
Mar 6, 2026 · 5 min read
- 197 Engineering
The future of MCP
Mar 6, 2026 · 2 min read
- 198 Engineering
MCP testing: harnesses, fixtures, regressions
Mar 6, 2026 · 2 min read
- 199 Engineering
Output post-processors that don't hide the truth
Mar 6, 2026 · 3 min read
- 200 AI
Agents in education: tutor agents and the assessment problem
Mar 5, 2026 · 5 min read
- 201 Engineering
Authoring eval cases
Mar 5, 2026 · 2 min read
- 202 Strategy
Founder ops: investor-update auto-drafter
Mar 5, 2026 · 5 min read
- 203 Engineering
Snapshot tests: where they help, where they trap
Mar 5, 2026 · 2 min read
- 204 Engineering
Tests for streaming responses
Mar 5, 2026 · 2 min read
- 205 Engineering
Agent rollback: kill switches on day one
Mar 4, 2026 · 3 min read
- 206 AI
Agents in marketing: campaign agents and the brand-voice problem
Mar 4, 2026 · 5 min read
- 207 Engineering
Determinism for tool calls: keys, ordering, side-effects
Mar 4, 2026 · 2 min read
- 208 Engineering
Output diffing in CI
Mar 4, 2026 · 3 min read
- 209 Engineering
Reading an eval dashboard
Mar 4, 2026 · 2 min read
- 210 Engineering
Accessibility tests for AI surfaces
Mar 3, 2026 · 2 min read
- 211 AI
Agents in real estate: lead qualification at speed
Mar 3, 2026 · 4 min read
- 212 Engineering
Eval-driven development
Mar 3, 2026 · 3 min read
- 213 Engineering
Eval ownership in an org: PM, eng, or QA?
Mar 3, 2026 · 2 min read
- 214 Engineering
Performance tests: token budgets and latency SLAs
Mar 3, 2026 · 2 min read
- 215 AI
The agent maturity curve
Mar 2, 2026 · 9 min read
- 216 Engineering
Auto-generated eval cases from production logs
Mar 2, 2026 · 2 min read
- 217 Strategy
CS: renewal-risk scoring that explains itself
Mar 2, 2026 · 5 min read
- 218 Strategy
Recruiting: sourcing brief from a single role description
Mar 2, 2026 · 5 min read
- 219 Engineering
Eval cost management
Mar 2, 2026 · 2 min read
- 220 Engineering
Mobile (iOS): UIKit-to-SwiftUI translation
Mar 2, 2026 · 4 min read
- 221 AI
Agents in fashion: stylist yes, designer no
Feb 27, 2026 · 4 min read
- 222 Engineering
AI-native debugging: the rubber duck got smarter
Feb 26, 2026 · 4 min read
- 223 Engineering
Claude Code + Jira: standups without the standing
Feb 25, 2026 · 3 min read
- 224 AI
Small models are underrated: a case for boring infrastructure
Feb 24, 2026 · 4 min read
- 225 AI
Agents in gaming: from live-ops support to dynamic NPCs
Feb 23, 2026 · 4 min read
- 226 Engineering
Multi-model routing: the dispatcher pattern for LLMs
Feb 20, 2026 · 4 min read
- 227 Engineering
Claude Code + Linear: where work lives, the agent lives
Feb 19, 2026 · 3 min read
- 228 AI
Agents in music: the producer's new intern
Feb 18, 2026 · 4 min read
- 229 Engineering
Semantic caching: why your top 1% of queries cost 60% of your bill
Feb 17, 2026 · 4 min read
- 230 Engineering
Claude Code + Notion: docs become structured data
Feb 16, 2026 · 4 min read
- 231 AI
Agents in podcasting: editor yes, host no
Feb 13, 2026 · 4 min read
- 232 Engineering
AI cost attribution: a chargeback model for LLM spend
Feb 12, 2026 · 4 min read
- 233 Engineering
Claude Code + Slack: standups, escalations, and the back-channel
Feb 11, 2026 · 3 min read
- 234 AI
Agents in publishing: from slush pile to acquisition
Feb 10, 2026 · 4 min read
- 235 Engineering
AI latency budgets: borrowing from network engineering
Feb 9, 2026 · 4 min read
- 236 AI
Agents in libraries: Dewey's heir is silicon
Feb 6, 2026 · 4 min read
- 237 Engineering
AI feature flags: a model rollout looks like a deployment
Feb 5, 2026 · 4 min read
- 238 Engineering
Claude Code + Datadog: 2 a.m. is for the agent now
Feb 4, 2026 · 4 min read
- 239 AI
Agents in museums: catalog, conservation, and the curator
Feb 3, 2026 · 3 min read
- 240 Engineering
AI canary deployments: 1% traffic, 100% paranoia
Feb 2, 2026 · 4 min read
- 241 AI
Agents in veterinary medicine: scribe for the species that can't talk
Jan 30, 2026 · 4 min read
- 242 Engineering
Embedding model selection: the 5-minute decision tree
Jan 29, 2026 · 4 min read
- 243 Engineering
Claude Code + Stripe: revenue-aware development
Jan 28, 2026 · 4 min read
- 244 AI
Agents in aviation: copilot in the literal sense
Jan 27, 2026 · 4 min read
- 245 Engineering
Vector DB architecture: pgvector, managed, or homemade
Jan 26, 2026 · 4 min read
- 246 AI
Agents in maritime: the wheelhouse and the warehouse
Jan 23, 2026 · 4 min read
- 247 Engineering
RAG vs. fine-tuning: a 90% decision tree
Jan 22, 2026 · 4 min read
- 248 Engineering
Claude Code + Figma: design handoff in one prompt
Jan 21, 2026 · 4 min read
- 249 AI
Agents in utilities: the boring grid is the perfect customer
Jan 20, 2026 · 3 min read
- 250 Engineering
Token economics: what your unit cost actually is
Jan 19, 2026 · 4 min read
- 251 AI
Agents in mental health: triage yes, therapy no
Jan 16, 2026 · 4 min read
- 252 Engineering
AI incident response: the postmortem template you'll wish you had
Jan 15, 2026 · 4 min read
- 253 AI
AI for product managers: the new PRD looks like an eval set
Jan 14, 2026 · 4 min read
- 254 AI
Agents in fitness: a coach who reads your data
Jan 13, 2026 · 4 min read
- 255 Leadership
AI team scaling: 1, 3, and 10 engineers
Jan 12, 2026 · 5 min read
- 256 AI
Agents in dental practice: chart, code, and call
Jan 9, 2026 · 4 min read
- 257 Engineering
An AI-aware pull request template
Jan 8, 2026 · 5 min read
- 258 AI
AI for designers: from mood board to motion principles
Jan 7, 2026 · 4 min read
- 259 AI
Agents in architecture firms: drafting is the first 10%
Jan 6, 2026 · 4 min read
- 260 Engineering
Self-healing pipelines: the night shift you don't have to pay
Jan 5, 2026 · 4 min read
- 261 Engineering
Agent supervision loops: the OODA loop, re-implemented
Jan 2, 2026 · 4 min read
- 262 AI
AI for data scientists: notebooks are the new IDE
Dec 31, 2025 · 4 min read
- 263 Engineering
EU AI Act: what changes in your engineering process
Dec 30, 2025 · 4 min read
- 264 Leadership
AI for the CTO: a sixty-minute audit
Dec 29, 2025 · 4 min read
- 265 Engineering
HIPAA and AI: the BAA is the first conversation
Dec 26, 2025 · 4 min read
- 266 Leadership
Twelve AI procurement questions every buyer should ask
Dec 24, 2025 · 5 min read
- 267 Leadership
AI for founders: the 'is this a product?' filter
Dec 23, 2025 · 5 min read
- 268 AI
Agents in sports analytics: the scout's new tablet
Dec 22, 2025 · 4 min read
- 269 AI
AI and the symphony conductor: orchestration is older than software
Dec 19, 2025 · 4 min read
- 270 AI
AI and air traffic control: a 70-year-old playbook for safe autonomy
Dec 18, 2025 · 4 min read