Architecture

MCP in Production, Part 1: Persistent Sessions, Pooling, and Fault Tolerance

Most MCP client examples open a session, call a tool, and close the session. That pattern is fine for demos. It breaks in production in ways that aren’t obvious until you’re staring at a hung process or a spike in latency. This is Part 1 of a two-part series on what it takes to run an MCP client reliably. I’ll cover the transport layer: sessions, pooling, dead connection recovery, timeouts, and the heartbeat. Part 2 covers the system layer: authentication, observability, and operational design. ...

MCP in Production, Part 2: Authentication, Observability, and Operational Design

Part 1 covered the transport layer: keeping sessions alive, recovering from failures, and a few edge cases that only surface when you’re running a real pool under real failure conditions. This part covers what I’d call system readiness: the things that separate a working prototype from something I could hand to a client and say “deploy this.” ...

Designing a Professional Digital Twin: The Architecture

Over the last year, I’ve been building production-grade agentic AI systems — LangGraph state machines, multi-agent orchestration, deterministic validation pipelines designed for regulated environments. And somewhere in that work, I noticed something: the architecture I was using to build reliable AI agents was a pretty accurate model of how I actually operate professionally. So I mapped it out. Not as a second brain or a structured resume. As an agent specification — a design exercise in making professional expertise explicit, structured, and transferable. ...

I Used MCP as a Service-to-Service Protocol. Here's What I Learned.

When I designed the architecture for my KYC onboarding orchestrator, I made a deliberate choice: use MCP not as an LLM-to-tool protocol, the way it was originally designed, but as a service-to-service protocol between a LangGraph orchestrator and a set of independently deployable integration servers. It worked. But it came with real tradeoffs I want to document, because I don’t think this pattern is well understood yet.
Background: What I Built
The system onboards corporate clients through a fixed sequence of checks: entity profile retrieval, credit rating, sanctions screening, PEP check, CRM update, document generation. Each of those integrations runs as a separate MCP server. A LangGraph graph orchestrates the sequence by calling MCP tools directly from its nodes. ...

Why Your AI Agent Demo Looks Great and Your Production System Doesn't

I’ve spent the last several months building agentic AI systems — not demoing them, building them. And I want to share something that took me longer than I’d like to admit to fully internalize. The hype is real. The gap is also real. And the gap is closing — but not in the way most people think. This reflects where I am in March 2026, building on roughly 18 months of hands-on agentic work. The field is moving fast and I expect some of this to age. ...