This site is about building AI systems that actually work in production — not demos. Here’s a map of the writing and how the pieces connect.
On GPU Infrastructure
GPU infrastructure decisions look like price comparisons. They’re not — they’re configuration problems, data placement problems, and interconnect problems that $/hr doesn’t capture.
- GPU Infrastructure: The Five Calculations That Actually Matter (New) — A framework built while building a GPU recommendation engine, with a full worked example on a 70B fine-tuning scenario.
On Enterprise RAG
What it takes to build a RAG system a compliance officer or clinical analyst can actually rely on — deterministic retrieval, evidence gating, and the gap between working and trustworthy.
- The Trust Layer: What Separates Good RAG from Enterprise RAG (New) — Four bugs found while stress-testing a RAG system for regulated industries, and the architectural properties they reveal.
On AI PC Evaluation
Why enterprise AI PC procurement is harder than vendor benchmarks suggest — and what it takes to measure the right things.
- The AI PC Buying Problem Every Enterprise Needs to Solve (New) — Three user groups, one procurement decision, and why vendor metrics don’t answer the question IT leaders actually need answered.
On Agentic Architecture
The foundation. What agentic systems actually look like when you move past the demo and into something deployable.
- Why Your AI Agent Demo Looks Great and Your Production System Doesn’t — The gap between demos and production, the four buckets most “crushing it” claims fall into, and the architecture pattern I keep returning to.
- Designing a Professional Digital Twin: The Architecture — What it looks like when you model professional expertise as an agent specification: personas, tools, skills, rules, and memory. Includes the full Sri System.
On MCP in Production
A ground-up account of using the Model Context Protocol as a service-to-service layer in a regulated KYC system — the tradeoffs of the pattern and what it takes to run it reliably.
- I Used MCP as a Service-to-Service Protocol. Here’s What I Learned. — Why I used MCP as a transport layer between a LangGraph orchestrator and four integration servers, and the tradeoffs that come with it.
- MCP in Production, Part 1: Persistent Sessions, Pooling, and Fault Tolerance — Five transport-layer decisions, each driven by a real failure: session pooling, dead connection eviction, cancel scope isolation, timeouts, and heartbeat design.
- MCP in Production, Part 2: Authentication, Observability, and Operational Design — Bearer token auth at the transport layer, correlation IDs across four servers, lazy session init, and clean shutdown.
New posts go to /posts/, organized by topic as the archive grows.
Subscribe to get new posts by email.