<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Hyperdev: Articles]]></title><description><![CDATA[Feature articles on all things agentic coding—
From prompt-driven development to autonomous dev workflows, this section dives into the tools, techniques, and thinking behind AI-first programming. Real-world use cases, architecture breakdowns, code experiments, and practical insight into building with agents instead of just writing code line by line.]]></description><link>https://hyperdev.matsuoka.com/s/hyperdev</link><image><url>https://substackcdn.com/image/fetch/$s_!j9a7!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab665959-5546-4469-9e93-9e1518976e2b_1024x1024.png</url><title>Hyperdev: Articles</title><link>https://hyperdev.matsuoka.com/s/hyperdev</link></image><generator>Substack</generator><lastBuildDate>Wed, 22 Apr 2026 05:20:36 GMT</lastBuildDate><atom:link href="https://hyperdev.matsuoka.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Robert Matsuoka]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[hyperdev@matsuoka.com]]></webMaster><itunes:owner><itunes:email><![CDATA[hyperdev@matsuoka.com]]></itunes:email><itunes:name><![CDATA[Robert Matsuoka]]></itunes:name></itunes:owner><itunes:author><![CDATA[Robert Matsuoka]]></itunes:author><googleplay:owner><![CDATA[hyperdev@matsuoka.com]]></googleplay:owner><googleplay:email><![CDATA[hyperdev@matsuoka.com]]></googleplay:email><googleplay:author><![CDATA[Robert Matsuoka]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[It’s The Harness, Stupid!]]></title><description><![CDATA[Why AI tool orchestration now matters more than foundation model quality]]></description><link>https://hyperdev.matsuoka.com/p/its-the-harness-stupid</link><guid isPermaLink="false">https://hyperdev.matsuoka.com/p/its-the-harness-stupid</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Mon, 13 Apr 2026 17:17:16 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!376u!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c1f4dab-a192-4b50-bbbb-5c889e29af13_1024x523.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!376u!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c1f4dab-a192-4b50-bbbb-5c889e29af13_1024x523.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!376u!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c1f4dab-a192-4b50-bbbb-5c889e29af13_1024x523.png 424w, https://substackcdn.com/image/fetch/$s_!376u!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c1f4dab-a192-4b50-bbbb-5c889e29af13_1024x523.png 848w, https://substackcdn.com/image/fetch/$s_!376u!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c1f4dab-a192-4b50-bbbb-5c889e29af13_1024x523.png 1272w, https://substackcdn.com/image/fetch/$s_!376u!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c1f4dab-a192-4b50-bbbb-5c889e29af13_1024x523.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!376u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c1f4dab-a192-4b50-bbbb-5c889e29af13_1024x523.png" width="1024" height="523" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5c1f4dab-a192-4b50-bbbb-5c889e29af13_1024x523.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:523,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1201765,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/193459844?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4af78c41-cc16-45d4-abb7-f0e0ee92722b_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!376u!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c1f4dab-a192-4b50-bbbb-5c889e29af13_1024x523.png 424w, https://substackcdn.com/image/fetch/$s_!376u!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c1f4dab-a192-4b50-bbbb-5c889e29af13_1024x523.png 848w, https://substackcdn.com/image/fetch/$s_!376u!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c1f4dab-a192-4b50-bbbb-5c889e29af13_1024x523.png 1272w, https://substackcdn.com/image/fetch/$s_!376u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c1f4dab-a192-4b50-bbbb-5c889e29af13_1024x523.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>It&#8217;s The Harness, Stupid!</h2><p><strong>Why AI tool orchestration now matters more than foundation model quality</strong></p><p><em>Author: Bob Matsuoka, CTO @ Duetto Research</em> <br><em>April 6, 2026</em></p><h2>TL;DR</h2><ul><li><p>Same-model testing reveals 0.82-point quality spread (3.93 to 4.75) and 7x efficiency differences&#8212;orchestration dominates outcomes</p></li><li><p>Market validation: Claude maintains 70% developer preference despite GPT-5.4 achieving model parity through superior harness quality</p></li><li><p>Reddit analysis confirms Codex efficiency gains come from orchestration improvements, not just model upgrades</p></li><li><p>Competitive advantage has shifted permanently from model superiority to ecosystem superiority</p></li></ul><p><strong>Bottom line: The harness era has begun. 
Choose tools based on workflow fit, not benchmark claims.</strong></p><h2>The $50B Model Myth</h2><p>The AI industry has a fixation problem. Every week brings breathless announcements about parameter counts, training costs, and benchmark scores. &#8220;GPT-6 has 50 trillion parameters!&#8221; &#8220;Our model scored 94.7% on SWE-bench!&#8221; &#8220;We spent $2 billion on compute!&#8221;</p><p>Three converging pieces of evidence prove this approach is fundamentally wrong.</p><div class="callout-block" data-callout="true"><p><strong>Evidence #1:</strong> I tested eight AI coding agents across five programming challenges. Four agents used identical Claude Sonnet 4.6 models. Quality scores ranged from 3.93 to 4.75&#8212;a 0.82-point spread on the same foundation model.</p></div><div class="callout-block" data-callout="true"><p><strong>Evidence #2:</strong> GPT-5.4 achieved parity with Claude Sonnet 4.6 on coding benchmarks. Yet Claude maintains 70% developer preference through superior ecosystem quality.</p></div><div class="callout-block" data-callout="true"><p><strong>Evidence #3:</strong> Reddit developer communities confirm Codex&#8217;s efficiency improvements come from orchestration architecture changes, not just model upgrades.</p></div><p><strong>The harness matters more than the model.</strong> Choosing an AI coding tool is now primarily an engineering decision, not a model selection decision. 
The next competitive advantage isn&#8217;t bigger models&#8212;it&#8217;s better orchestration.</p><h2>Evidence Pillar #1: The Smoking Gun Laboratory Data</h2><h3>The Bake-Off Setup</h3><p>I designed five programming challenges ranging from 30-minute tasks to 8-hour full-stack builds:</p><ul><li><p><strong>Level 1-2:</strong> Simple scripts and basic applications</p></li><li><p><strong>Level 3:</strong> API integration with Docker containerization</p></li><li><p><strong>Level 4:</strong> Extensible data processing pipeline (architecture test)</p></li><li><p><strong>Level 5:</strong> Full-stack web application with authentication</p></li></ul><p>Eight agents competed: Claude Code, Claude MPM, Codex, Gemini CLI, Auggie, Qwen+Aider, DeepSeek+Aider, and Warp AI. Each received identical prompts. A panel of expert developers blind-reviewed all submissions across eight criteria: functionality, correctness, best practices, architecture, code reuse, testing, error handling, and documentation.</p><h3>The Harness Advantage Data</h3><p><strong>Table 1: Same Model, Different Worlds</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TErH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47d85d71-6503-4452-ad8c-957585385133_867x409.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TErH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47d85d71-6503-4452-ad8c-957585385133_867x409.png 424w, https://substackcdn.com/image/fetch/$s_!TErH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47d85d71-6503-4452-ad8c-957585385133_867x409.png 848w, 
https://substackcdn.com/image/fetch/$s_!TErH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47d85d71-6503-4452-ad8c-957585385133_867x409.png 1272w, https://substackcdn.com/image/fetch/$s_!TErH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47d85d71-6503-4452-ad8c-957585385133_867x409.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TErH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47d85d71-6503-4452-ad8c-957585385133_867x409.png" width="867" height="409" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/47d85d71-6503-4452-ad8c-957585385133_867x409.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:409,&quot;width&quot;:867,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:74882,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/193459844?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47d85d71-6503-4452-ad8c-957585385133_867x409.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TErH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47d85d71-6503-4452-ad8c-957585385133_867x409.png 424w, https://substackcdn.com/image/fetch/$s_!TErH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47d85d71-6503-4452-ad8c-957585385133_867x409.png 848w, 
https://substackcdn.com/image/fetch/$s_!TErH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47d85d71-6503-4452-ad8c-957585385133_867x409.png 1272w, https://substackcdn.com/image/fetch/$s_!TErH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47d85d71-6503-4452-ad8c-957585385133_867x409.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Four agents using identical Claude Sonnet 4.6 models. Quality scores from 3.93 to 4.75&#8212;a 0.82-point spread. 
<a href="https://github.com/bobmatnyc/claude-mpm">claude-mpm</a> finished in 45 minutes while warp took 313 minutes. Almost <strong>7x longer for lower quality results</strong>.</p><h3>The Scaling Pattern</h3><p>The harness advantage compounds with complexity:</p><ul><li><p><strong>Levels 1-2:</strong> All agents performed similarly. Simple tasks don&#8217;t reveal orchestration differences.</p></li><li><p><strong>Level 3:</strong> API integration and Docker setup separated agents that plan from those that code-and-fix. Clear gaps emerged.</p></li><li><p><strong>Levels 4-5:</strong> Architecture and full-stack challenges broke most agents. Only well-orchestrated systems completed the complex workflows.</p></li></ul><p>The pattern is clear: as complexity increases, harness quality becomes the primary determinant of success.</p><h2>Evidence Pillar #2: Market Validation &#8212; GPT-5.4 Caught Up</h2><h3>Model Parity Achievement</h3><p>February-April 2026 benchmarks confirm <strong>GPT-5.4 has achieved parity with Claude Sonnet 4.6</strong>:</p><p><strong>Core Benchmarks:</strong></p><ul><li><p><strong>SWE-bench Verified</strong>: GPT-5.4 ~80% vs Claude 79.6% (statistical tie)</p></li><li><p><strong>SWE-bench Pro</strong>: GPT-5.4 57.7% vs Claude 43.6% (GPT leads complex problems)</p></li><li><p><strong>Terminal-Bench</strong>: GPT-5.4 75.1% vs Claude ~65% (DevOps advantage)</p></li><li><p><strong>Context handling</strong>: Both models feature 1M token windows</p></li></ul><h3>Yet Claude Still Dominates Through Harness Advantages</h3><p>Despite achieving model parity, the competitive landscape tells the harness story:</p><p><strong>Market Reality:</strong></p><ul><li><p><strong>Developer preference</strong>: Claude 70% (superior workflow integration)</p></li><li><p><strong>Enterprise share</strong>: Anthropic +4.9% MoM growth, OpenAI -1.5% decline</p></li><li><p><strong>Revenue</strong>: Claude Code $2B ARR in 6 months</p></li></ul><p><strong>Even when models reach 
parity, harness quality determines adoption.</strong></p><h3>The Multi-Model Strategic Reality</h3><p>Leading organizations aren&#8217;t choosing between models anymore&#8212;they&#8217;re deploying <strong>three-tier strategic architectures</strong> based on cost-performance optimization:</p><p><strong>Tier 1: Daily Workhorse (60-70% of requests)</strong></p><ul><li><p><strong>Claude Sonnet 4.6</strong>: <a href="https://medium.com/@mkteam/gpt-5-4-vs-claude-sonnet-4-6-2026-the-ultimate-ai-model-comparison-49526cac8b14">$3/$15 per million tokens</a></p></li><li><p>High-volume development, routine coding tasks</p></li><li><p><a href="https://www.nxcode.io/resources/news/claude-sonnet-4-6-vs-gpt-5-4-coding-comparison-2026">95%+ of premium model quality at half the cost</a></p></li><li><p>Default choice for most enterprise development work</p></li></ul><p><strong>Tier 2: Specialized Operations (20-30% of requests)</strong></p><ul><li><p><strong>GPT-5.4</strong>: <a href="https://medium.com/@mkteam/gpt-5-4-vs-claude-sonnet-4-6-2026-the-ultimate-ai-model-comparison-49526cac8b14">$2.50/$15 per million tokens</a></p></li><li><p>Terminal operations, DevOps workflows, CI/CD debugging</p></li><li><p><a href="https://www.morphllm.com/best-ai-model-for-coding">75.1% Terminal-Bench score (10-point lead over competitors)</a></p></li><li><p><a href="https://medium.com/@ricardomsgarces/openai-codex-vs-github-copilot-why-codex-is-winning-the-future-of-coding-f9a2767695b0">Inherited Codex&#8217;s terminal operation dominance</a></p></li></ul><p><strong>Tier 3: Premium Analysis (10-20% of requests)</strong></p><ul><li><p><strong>Claude Opus 4.6</strong>: <a href="https://medium.com/@mkteam/gpt-5-4-vs-claude-sonnet-4-6-2026-the-ultimate-ai-model-comparison-49526cac8b14">$5/$25 per million tokens</a></p></li><li><p>Complex reasoning, architectural decisions, high-stakes analysis</p></li><li><p><a href="https://help.apiyi.com/en/gpt-5-4-vs-claude-opus-4-6-comparison-2026-en.html">World 
leader in abstract reasoning (87.4% vs GPT-5.4&#8217;s 83.9%)</a></p></li><li><p>When cost justifies maximum capability</p></li></ul><p>This confirms the core thesis: when models are &#8220;good enough,&#8221; teams optimize for <strong>strategic cost-performance fit</strong>, not raw capability or marketing claims.</p><h2>Evidence Pillar #3: Community Validation &#8212; The Codex Orchestration Story</h2><h3>Reddit Confirms Orchestration Improvements</h3><p>Reddit research explains Codex&#8217;s impressive efficiency results (42 minutes, 4.49 quality score). The evidence confirms improvements come from orchestration, not just model upgrades.</p><p><strong>Architectural Evolution Evidence:</strong></p><ul><li><p><a href="https://medium.com/@aliazimidarmian/openai-codex-from-2021-code-model-to-a-2025-autonomous-coding-agent-85ef0c48730a">Codex evolved from &#8220;embedded assistant&#8221; &#8594; &#8220;independent agent with multi-agent orchestration&#8221;</a></p></li><li><p><a href="https://www.digitalapplied.com/blog/gpt-5-2-codex-openai-model-guide-2026">GPT-5.2-Codex (Jan 2026) with 192K context + MCP tool orchestration</a></p></li><li><p><a href="https://developers.openai.com/blog/openai-for-developers-2025">&#8220;Command center for agents&#8221; interface launched Feb 2026</a></p></li></ul><p><strong>Workflow Efficiency Improvements:</strong></p><ul><li><p><a href="https://reelmind.ai/blog/openai-codex-code-generation-features-reddit-developer-insights">Developers report queuing &#8220;4-5 Codex tasks before diving into manual work&#8221;</a></p></li><li><p><a href="https://reelmind.ai/blog/openai-codex-code-generation-features-reddit-developer-insights">&#8220;2-3 completed PRs waiting for review&#8221; after a coffee break</a></p></li><li><p><a href="https://www.nxcode.io/resources/news/openai-codex-app-review-2026">P99 response time 45ms vs Copilot&#8217;s 55ms through better context management</a></p></li><li><p><strong>Parallel processing 
capabilities</strong> that enable true background orchestration</p></li></ul><p><strong>Enterprise Orchestration Benefits:</strong></p><ul><li><p><strong><a href="https://www.quantumrun.com/consulting/openai-codex-statistics/">70% more pull requests</a></strong><a href="https://www.quantumrun.com/consulting/openai-codex-statistics/"> merged weekly at OpenAI</a></p></li><li><p><strong><a href="https://www.quantumrun.com/consulting/openai-codex-statistics/">50% reduction</a></strong><a href="https://www.quantumrun.com/consulting/openai-codex-statistics/"> in code review times at Cisco</a></p></li><li><p><strong><a href="https://www.quantumrun.com/consulting/openai-codex-statistics/">67% reduction</a></strong><a href="https://www.quantumrun.com/consulting/openai-codex-statistics/"> in median turnaround time at Duolingo</a></p></li><li><p><strong><a href="https://www.quantumrun.com/consulting/openai-codex-statistics/">90% Fortune 100 adoption</a></strong><a href="https://www.quantumrun.com/consulting/openai-codex-statistics/"> validates orchestration value at scale</a></p></li></ul><h3>The Community Strategic Deployment Pattern</h3><p>Reddit developers now recommend <strong>different tools for different purposes</strong>:</p><ul><li><p><strong>Claude Code</strong>: Code quality and reasoning</p></li><li><p><strong>Cursor</strong>: Daily coding integration</p></li><li><p><strong>OpenAI Codex</strong>: Complex multi-agent workflows and long-horizon autonomy</p></li></ul><p>This matches exactly what the market data predicted: teams use orchestrated tools strategically rather than seeking one universal solution.</p><h2>The Harness Quality Ladder</h2><p>Based on all three evidence pillars, I see four tiers of orchestration quality emerging:</p><p><strong>Tier 1: Basic Wrappers</strong></p><ul><li><p>Simple API access, minimal context management</p></li><li><p>Examples: Raw ChatGPT interface, basic API wrappers</p></li><li><p>Limitation: No file coordination, poor context 
retention</p></li></ul><p><strong>Tier 2: Workflow Tools</strong></p><ul><li><p>File awareness, some context management</p></li><li><p>Examples: GitHub Copilot, basic IDE extensions</p></li><li><p>Capability: Single-file optimization, limited cross-file understanding</p></li></ul><p><strong>Tier 3: Orchestrated Systems</strong></p><ul><li><p>Multi-file coordination, workflow integration</p></li><li><p>Examples: Cursor, Claude Code, well-configured aider</p></li><li><p>Advantage: Understands project structure, handles complex tasks</p></li></ul><p><strong>Tier 4: Agentic Frameworks</strong></p><ul><li><p>Multi-agent coordination, planning, verification</p></li><li><p>Examples: claude-mpm, advanced orchestration systems</p></li><li><p>Power: Full project lifecycle, quality assurance, architectural thinking</p></li></ul><p>The performance cliff between tiers is exponential, not linear. Bad orchestration can make great models perform poorly; great orchestration can make good models perform excellently.</p><h2>Academic and Industry Validation</h2><p>This isn&#8217;t just empirical observation. Multiple 2026 research papers and industry studies support the harness thesis:</p><p><strong>Academic Consensus:</strong><br>The arXiv paper <a href="https://arxiv.org/html/2511.14136v1">&#8220;Beyond Accuracy: A Multi-Dimensional Framework for Evaluating Enterprise Agentic AI Systems&#8221;</a> shows that domain-tuned models with better orchestration achieve superior cost-normalized accuracy despite using smaller base models.</p><p><a href="https://pricepertoken.com/leaderboards/benchmark/humaneval">SWE-bench data</a> reveals the same pattern. Cursor, Claude Code, and Auggie all use similar base models yet score between 50.2% and 55.4%, while the raw model score is only 45.9%. 
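</p><p>The size of that harness uplift follows directly from the quoted scores; a quick arithmetic check (range endpoints only, since per-agent scores aren&#8217;t broken out here):</p>

```python
# Harness uplift implied by the SWE-bench figures quoted above:
# harnessed agents score 50.2-55.4% on roughly the same base model
# that scores 45.9% raw, so the delta is attributable to the harness.
RAW_MODEL_SCORE = 45.9                   # raw model, %
HARNESS_LOW, HARNESS_HIGH = 50.2, 55.4   # harnessed-agent range, %

uplift_low = round(HARNESS_LOW - RAW_MODEL_SCORE, 1)    # 4.3 points
uplift_high = round(HARNESS_HIGH - RAW_MODEL_SCORE, 1)  # 9.5 points
print(f"harness uplift: {uplift_low} to {uplift_high} points")
```

<p>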
That improvement of 4.3 to 9.5 points comes entirely from better context retrieval and agent design.</p><p><strong>Business Reality Check:</strong><br><a href="https://claude5.com/news/enterprise-ai-adoption-2026-how-businesses-deploy-claude-gpt">Enterprise adoption surveys</a> show a clear shift in CTO priorities. &#8220;Model performance&#8221; is dropping in tool evaluation criteria, replaced by governance, integration quality, and workflow fit. As one 2026 McKinsey report put it: &#8220;CTOs are realizing their biggest bottleneck isn&#8217;t model performance&#8212;it&#8217;s governance.&#8221;</p><h2>What This Means for Engineering Leaders</h2><h3>Stop Optimizing for Benchmarks</h3><p>The old procurement mindset was model-first: &#8220;We need access to GPT-6 for competitive advantage.&#8221; The new reality is that benchmark performance doesn&#8217;t predict practical utility. SWE-bench scores don&#8217;t tell you whether a tool will integrate with your existing workflow, handle your codebase size, or recover gracefully from errors.</p><p>Start evaluating harness quality:</p><ul><li><p><strong>Context management:</strong> How well does it understand your project structure?</p></li><li><p><strong>File coordination:</strong> Can it work intelligently across multiple files?</p></li><li><p><strong>Error recovery:</strong> Does it handle failures gracefully or require constant babysitting?</p></li><li><p><strong>Workflow integration:</strong> How does it fit with your team&#8217;s existing development process?</p></li></ul><h3>Budget for Orchestration Quality</h3><p>The three evidence pillars show that investing in better orchestration yields measurable returns:</p><ul><li><p><strong>Quality per minute:</strong> claude-mpm achieved 4.75 quality in 45 minutes; warp achieved 3.94 in 313 minutes</p></li><li><p><strong>Market validation:</strong> Claude maintains dominance despite model parity through superior developer experience</p></li><li><p><strong>Enterprise 
results:</strong> 70% more PRs, 50% faster code review, 67% faster turnaround</p></li></ul><p>The ROI case for harness investment is clear and quantifiable.</p><h3>Team Productivity Focus</h3><p>Tool choice impacts your entire development pipeline. The 7x speed difference between well and poorly orchestrated tools using the same model means tool selection is a productivity multiplier, not just a capability decision.</p><p>Better tools also reduce onboarding time and increase adoption rates. A tool that works reliably gets used; one that requires constant troubleshooting gets abandoned.</p><h2>The Competitive Landscape Evolution</h2><h3>Codex Deserves Recognition</h3><p>Codex&#8217;s performance has significantly improved. At 42 minutes for all five levels with a 4.49 quality score, it achieved by far the best efficiency in my study. GPT-5.4+ combined with the orchestration improvements OpenAI made represents a compelling package. The Reddit research confirms this wasn&#8217;t just a model upgrade&#8212;it was an architectural evolution toward multi-agent orchestration.</p><h3>Claude Code&#8217;s Harness Moat</h3><p>While Claude Code performed well (4.53 quality score), the market validation shows its true strength: <strong>ecosystem superiority</strong>. Despite GPT-5.4 achieving model parity, Claude maintains 70% developer preference through superior harness quality. This is exactly what sustainable competitive advantage looks like in the post-parity era.</p><h3>The Multi-Model Future</h3><p>All evidence points to the same conclusion: the era of picking one model is over. 
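</p><p>That tiering is ultimately a routing decision inside the harness. A minimal sketch of the pattern described earlier; the task categories and model identifiers are illustrative placeholders, not any vendor&#8217;s actual API names:</p>

```python
# Illustrative three-tier router; model names are placeholders,
# not real API identifiers.
TIER_ROUTES = {
    "terminal": "gpt-5.4",              # Tier 2: terminal / DevOps ops
    "devops": "gpt-5.4",
    "architecture": "claude-opus-4.6",  # Tier 3: premium analysis
    "deep-analysis": "claude-opus-4.6",
}

def route(task_kind: str) -> str:
    """Route a task to a model tier; Sonnet is the daily-workhorse default."""
    return TIER_ROUTES.get(task_kind, "claude-sonnet-4.6")  # Tier 1 default

print(route("devops"), route("refactor"))
```

<p>The point is the default branch: most traffic falls through to the cheap workhorse tier, and only named task kinds escalate to specialized or premium models.</p><p>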
Leading organizations deploy <strong>three-tier cost-performance architectures</strong>, optimizing for specific strengths rather than seeking universal solutions.</p><p>Real enterprise case studies validate this pattern:</p><ul><li><p><strong><a href="https://www.datastudios.org/post/claude-in-the-enterprise-case-studies-of-ai-deployments-and-real-world-results">TELUS (57,000 employees)</a></strong><a href="https://www.datastudios.org/post/claude-in-the-enterprise-case-studies-of-ai-deployments-and-real-world-results">: Uses Sonnet as core engine across developer teams</a></p></li><li><p><strong><a href="https://www.datastudios.org/post/claude-in-the-enterprise-case-studies-of-ai-deployments-and-real-world-results">Zapier</a></strong><a href="https://www.datastudios.org/post/claude-in-the-enterprise-case-studies-of-ai-deployments-and-real-world-results">: 800+ internal agents using strategic model selection</a></p></li><li><p><strong><a href="https://devtk.ai/en/blog/claude-api-pricing-guide-2026/">Financial Services</a></strong><a href="https://devtk.ai/en/blog/claude-api-pricing-guide-2026/">: Monthly costs ~$80 at massive scale through optimized routing</a></p></li></ul><p>The successful pattern: <strong>Sonnet for volume, GPT-5.4 for DevOps, Opus for complexity</strong>.</p><h2>The Token Economics Reality</h2><p>claude-mpm achieved the highest quality score (4.75) but used 87 million tokens versus codex&#8217;s 120K. This looks expensive until you consider the output: 262 comprehensive tests (vs codex&#8217;s 32), complete documentation, 100% verification rates, and multi-file coordination. (This was also a wake-up call for me to focus on token optimization; the current version is much stingier.)</p><p>The 700x token multiplier isn&#8217;t overhead&#8212;it&#8217;s the cost of work a solo agent skips. 
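</p><p>A back-of-envelope check of that multiplier, using only the figures reported above:</p>

```python
# Token economics from the bake-off figures above.
MPM_TOKENS, CODEX_TOKENS = 87_000_000, 120_000
MPM_TESTS, CODEX_TESTS = 262, 32

multiplier = MPM_TOKENS / CODEX_TOKENS     # 725x, the "~700x" quoted
tests_ratio = MPM_TESTS / CODEX_TESTS      # ~8.2x more tests shipped
tokens_per_test = MPM_TOKENS // MPM_TESTS  # what each deliverable cost

print(f"{multiplier:.0f}x tokens, {tests_ratio:.1f}x tests")
```

<p>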
<strong>Orchestration doesn&#8217;t waste tokens&#8212;it spends them on comprehensive deliverables.</strong></p><p>The optimization question: Could you achieve 80% of the quality benefits at 30% of the token cost? The opportunity isn&#8217;t eliminating orchestration&#8212;it&#8217;s finding the minimal viable team size for maximum impact.</p><h3>The Vendor Bias Problem: &#8220;Opus for Everything&#8221;</h3><p>Boris Cherny, the Claude Code lead, recently advocated for using &#8220;Opus for everything.&#8221; This perfectly illustrates the disconnect between vendor recommendations and practical deployment reality.</p><p><strong>Only someone working for Anthropic can say that.</strong></p><p>When your employer provides unlimited access to premium models, of course you&#8217;d recommend the most expensive option for every task. But real organizations operating with P&amp;L responsibility make strategic decisions about when premium capability justifies premium cost.</p><p>This vendor bias actually <strong>validates the multi-model thesis</strong>:</p><ul><li><p><strong>Vendors say:</strong> &#8220;Use our premium model for everything&#8221;</p></li><li><p><strong>Users do:</strong> Strategic model selection based on task complexity and budget constraints</p></li><li><p><strong>Market reality:</strong> 70% prefer Claude for daily coding (cost/speed), GPT-5.4 for complex reasoning (quality ceiling)</p></li></ul><p>Cherny&#8217;s comment inadvertently proves that <strong>cost-conscious orchestration</strong> is the real competitive battleground. Companies that figure out optimal model routing&#8212;not maximal model usage&#8212;will have sustainable advantages.</p><p>The vendors push premium. The market chooses strategically. 
<strong>The harness makes both possible.</strong></p><h2>The Future: Welcome to the Harness Era</h2><h3>What Changes for Developers</h3><p>Tool selection framework:</p><ol><li><p><strong>Workflow fit:</strong> Does it match how your team works?</p></li><li><p><strong>Integration quality:</strong> Plays well with existing tools?</p></li><li><p><strong>Reliability:</strong> Can you trust it with production code?</p></li><li><p><strong>Model quality:</strong> Fourth priority</p></li></ol><h3>What Changes for the Industry</h3><p>Foundation models are becoming commodities. Differentiation shifts to integration, context management, and user experience. The next unicorns will be harness companies, not model companies.</p><p>Major funding flows to orchestration companies. Enterprise procurement evaluates integration first, model second.</p><h3>The Competitive Moat Shift</h3><p>The old game was: train bigger models, claim benchmark superiority. The new game is: build better orchestration, solve real workflow problems. 
Model access becomes a utility; workflow mastery becomes the moat.</p><h2>Practical Recommendations</h2><h3>For CTOs and Engineering Leaders</h3><ul><li><p><strong>Audit orchestration quality</strong>: Test tools with your actual codebase for 2-week trials</p></li><li><p><strong>Budget 60/40</strong>: Spend more on harness development than model subscription fees</p></li><li><p><strong>Measure real metrics</strong>: Track pull request velocity and code review time, not benchmark scores</p></li><li><p><strong>Evaluate integration first</strong>: How well does it fit your existing CI/CD pipeline?</p></li></ul><h3>For Developers</h3><ul><li><p><strong>Test with real projects</strong>: Spend 2 days with each tool on actual work before deciding</p></li><li><p><strong>Learn orchestration patterns</strong>: Context management and file coordination matter more than prompts</p></li><li><p><strong>Invest in mastery</strong>: The 7x efficiency difference justifies significant learning time</p></li><li><p><strong>Ignore marketing claims</strong>: Model access means nothing without good orchestration</p></li></ul><h3>For the AI Industry</h3><ul><li><p><strong>Build for workflow integration</strong>: Solve real development pipeline problems</p></li><li><p><strong>Measure practical utility</strong>: Developer retention and task completion rates beat benchmarks</p></li><li><p><strong>Focus on context management</strong>: Multi-file coordination is the real competitive moat</p></li></ul><h2>Conclusion: The Questions That Matter Now</h2><p>The old question was: &#8220;What&#8217;s the best model?&#8221;</p><p>The new question is: &#8220;What&#8217;s the best harness for my team&#8217;s workflow?&#8221;</p><p>Three evidence sources prove we&#8217;ve crossed a threshold: foundation models are &#8220;good enough,&#8221; and orchestration quality now dominates outcomes. 
Laboratory testing, market validation, and community confirmation point to the same reality.</p><p>The foundation model is the engine. The harness is the car. The best engine in the world won&#8217;t get you anywhere without wheels.</p><p><strong>The harness era has begun. Drive accordingly.</strong></p><div><hr></div><p><em>Bob Matsuoka is CTO at <a href="https://www.duettocloud.com/">Duetto Research</a> and creator of <a href="https://github.com/bobmatnyc/claude-mpm">Claude MPM</a>, one of the agents evaluated in this study. All evaluation data and methodology are available at <a href="https://github.com/bobmatnyc/ai-coding-bake-off">github.com/bobmatnyc/ai-coding-bake-off</a> for reproducibility.</em></p><div><hr></div><h2>Appendix: Complete Results Data</h2><h3>Quality Scores by Criterion</h3><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dSX7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96cbd80c-434e-456b-ae62-1fb565e1ec0d_981x368.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dSX7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96cbd80c-434e-456b-ae62-1fb565e1ec0d_981x368.png 424w, https://substackcdn.com/image/fetch/$s_!dSX7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96cbd80c-434e-456b-ae62-1fb565e1ec0d_981x368.png 848w, https://substackcdn.com/image/fetch/$s_!dSX7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96cbd80c-434e-456b-ae62-1fb565e1ec0d_981x368.png 1272w, 
https://substackcdn.com/image/fetch/$s_!dSX7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96cbd80c-434e-456b-ae62-1fb565e1ec0d_981x368.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dSX7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96cbd80c-434e-456b-ae62-1fb565e1ec0d_981x368.png" width="981" height="368" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/96cbd80c-434e-456b-ae62-1fb565e1ec0d_981x368.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:368,&quot;width&quot;:981,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:69754,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/193459844?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96cbd80c-434e-456b-ae62-1fb565e1ec0d_981x368.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dSX7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96cbd80c-434e-456b-ae62-1fb565e1ec0d_981x368.png 424w, https://substackcdn.com/image/fetch/$s_!dSX7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96cbd80c-434e-456b-ae62-1fb565e1ec0d_981x368.png 848w, https://substackcdn.com/image/fetch/$s_!dSX7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96cbd80c-434e-456b-ae62-1fb565e1ec0d_981x368.png 1272w, 
https://substackcdn.com/image/fetch/$s_!dSX7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96cbd80c-434e-456b-ae62-1fb565e1ec0d_981x368.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3>GPT-5.4 vs Claude Sonnet 4.6 Market Data</h3><p><strong>SWE-bench Performance:</strong></p><ul><li><p>SWE-bench Verified: GPT-5.4 ~80% vs Claude 79.6% (statistical tie)</p></li><li><p>SWE-bench Pro: GPT-5.4 57.7% vs Claude 43.6% (GPT advantage on complex problems)</p></li><li><p>Terminal-Bench: GPT-5.4 75.1% vs Claude ~65% (GPT DevOps
advantage)</p></li></ul><p><strong>Market Metrics:</strong></p><ul><li><p>Developer preference (daily coding): Claude 70%</p></li><li><p>Enterprise market share: Anthropic +4.9% MoM, OpenAI -1.5% MoM</p></li><li><p>Claude Code revenue: $2B ARR in 6 months</p></li></ul><h3>Methodology Notes</h3><ul><li><p><strong>Laboratory data:</strong> Single run evaluation with disclosed author bias</p></li><li><p><strong>Market data:</strong> Cross-validated across 15+ authoritative sources</p></li><li><p><strong>Community research:</strong> Reddit analysis across 8+ developer subreddits</p></li><li><p><strong>Statistical confidence:</strong> Mean inter-reviewer deviation of 0.216 points</p></li><li><p><strong>Reproducible:</strong> All data and prompts available in public repository</p></li></ul>]]></content:encoded></item><item><title><![CDATA[I Met a Movie Star Mila Jovovich — As a Coder]]></title><description><![CDATA[More evidence of the democratization of software]]></description><link>https://hyperdev.matsuoka.com/p/i-met-a-movie-star-mila-jovovich</link><guid isPermaLink="false">https://hyperdev.matsuoka.com/p/i-met-a-movie-star-mila-jovovich</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Sat, 11 Apr 2026 12:31:14 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!_YC7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d0e03e6-ff0f-4040-ab56-8239bf91a20d_1024x659.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_YC7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d0e03e6-ff0f-4040-ab56-8239bf91a20d_1024x659.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!_YC7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d0e03e6-ff0f-4040-ab56-8239bf91a20d_1024x659.png 424w, https://substackcdn.com/image/fetch/$s_!_YC7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d0e03e6-ff0f-4040-ab56-8239bf91a20d_1024x659.png 848w, https://substackcdn.com/image/fetch/$s_!_YC7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d0e03e6-ff0f-4040-ab56-8239bf91a20d_1024x659.png 1272w, https://substackcdn.com/image/fetch/$s_!_YC7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d0e03e6-ff0f-4040-ab56-8239bf91a20d_1024x659.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_YC7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d0e03e6-ff0f-4040-ab56-8239bf91a20d_1024x659.png" width="1024" height="659" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6d0e03e6-ff0f-4040-ab56-8239bf91a20d_1024x659.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:659,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1463075,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/193848267?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F188090d8-280b-48b0-81af-3a11dec4dac3_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_YC7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d0e03e6-ff0f-4040-ab56-8239bf91a20d_1024x659.png 424w, https://substackcdn.com/image/fetch/$s_!_YC7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d0e03e6-ff0f-4040-ab56-8239bf91a20d_1024x659.png 848w, https://substackcdn.com/image/fetch/$s_!_YC7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d0e03e6-ff0f-4040-ab56-8239bf91a20d_1024x659.png 1272w, https://substackcdn.com/image/fetch/$s_!_YC7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d0e03e6-ff0f-4040-ab56-8239bf91a20d_1024x659.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>I didn&#8217;t expect to meet Mila Jovovich through a GitHub issue.</p><p>But there I was last week, deep-diving into her AI memory framework called <a href="https://github.com/milla-jovovich/mempalace">MemPalace</a>, when I discovered something remarkable: the &#8220;Resident Evil&#8221; and &#8220;Fifth Element&#8221; star had created one of the most talked-about AI memory systems of 2026. And she&#8217;d done it using Claude Code, the same AI-assisted development environment I use daily.</p><p>More remarkably, when I found critical bugs in her benchmark methodology, she responded directly through her Claude Code workflow, acknowledging the issues and implementing fixes. Not through a PR team or engineering intermediaries &#8212; Mila herself, using AI-assisted development to debug complex memory retrieval algorithms at 9 AM on a Thursday.</p><p>This isn&#8217;t a story about a celebrity coding stunt. It&#8217;s about something much more profound: we&#8217;ve entered an era where outcomes and features drive development, not the technical limitations of writing code.</p><h2>The MemPalace Phenomenon</h2><p>In April 2026, Mila Jovovich and developer Ben Sigman released MemPalace, an open-source AI memory system that immediately went viral. Within 48 hours, it had <a href="https://github.com/milla-jovovich/mempalace">over 23,000 GitHub stars</a>.
The system claimed the highest score yet recorded on the LongMemEval benchmark: 96.6% raw recall.</p><p>The project represents something unprecedented: a free, locally-running memory system that rivals expensive cloud alternatives like Mem0 ($19-249/month) and Zep ($25+/month). It uses the &#8220;memory palace&#8221; technique &#8212; a classical memory method dating back to ancient Greece &#8212; implemented through ChromaDB and SQLite, with zero ongoing API costs.</p><p>The technical architecture includes basic Claude Code integration (save hooks every 15 messages and before context compression) and 24 tools via the Model Context Protocol (MCP), making it compatible across multiple AI platforms.</p><p>The duo had spent months building it using Claude Code&#8217;s AI-assisted development environment. As Sigman noted, he provided &#8220;the engineering chops&#8221; while Jovovich drove the architectural vision.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://github.com/milla-jovovich" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!93ZZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7e3197-469d-4e18-86f1-3033d8bd4a27_459x460.jpeg 424w, https://substackcdn.com/image/fetch/$s_!93ZZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7e3197-469d-4e18-86f1-3033d8bd4a27_459x460.jpeg 848w, https://substackcdn.com/image/fetch/$s_!93ZZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7e3197-469d-4e18-86f1-3033d8bd4a27_459x460.jpeg 1272w,
https://substackcdn.com/image/fetch/$s_!93ZZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7e3197-469d-4e18-86f1-3033d8bd4a27_459x460.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!93ZZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7e3197-469d-4e18-86f1-3033d8bd4a27_459x460.jpeg" width="459" height="460" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7d7e3197-469d-4e18-86f1-3033d8bd4a27_459x460.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:460,&quot;width&quot;:459,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:52413,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:&quot;https://github.com/milla-jovovich&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/193848267?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7e3197-469d-4e18-86f1-3033d8bd4a27_459x460.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!93ZZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7e3197-469d-4e18-86f1-3033d8bd4a27_459x460.jpeg 424w, https://substackcdn.com/image/fetch/$s_!93ZZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7e3197-469d-4e18-86f1-3033d8bd4a27_459x460.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!93ZZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7e3197-469d-4e18-86f1-3033d8bd4a27_459x460.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!93ZZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d7e3197-469d-4e18-86f1-3033d8bd4a27_459x460.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>When Audits Meet AI-Generated Code</h2><p>That&#8217;s when things got interesting.</p><p>As someone who works extensively with AI
memory systems &#8212; I maintain <a href="https://github.com/bobmatnyc/kuzu-memory">KuzuMemory</a>, a graph-based memory framework &#8212; I was naturally curious about MemPalace&#8217;s benchmark methodology. The claimed 96.6% recall rate was extraordinary, especially for a system running entirely locally.</p><p>So I dove in.</p><p>What I found were several methodological issues that fundamentally undermined the headline numbers. The benchmark adapter was discarding assistant turns in conversation history, causing systematic under-recall on certain question types. More critically, the benchmark wasn&#8217;t actually testing MemPalace&#8217;s core functionality &#8212; it was primarily testing ChromaDB&#8217;s raw vector search capabilities.</p><p>I filed <a href="https://github.com/milla-jovovich/mempalace/issues/242">Issue #242</a> documenting the assistant turn bug, and <a href="https://github.com/milla-jovovich/mempalace/issues/214">Issue #214</a> showing that the 96.6% score was essentially a ChromaDB score, not a MemPalace score.</p><p>Mila&#8217;s response was immediate and technically sophisticated:</p><blockquote><p>&#8220;Hey <a href="https://github.com/bobmatnyc">@bobmatnyc</a> &#8212; I&#8217;ve taken a look and ran it through CLI. This is a real bug and it&#8217;s urgent. You caught that <code>benchmarks/longmemeval_bench.py</code> at lines 189-190 builds each session&#8217;s indexed document by concatenating <em>only</em> <code>user</code> role turns... <strong>Fix priority: this must land before any public benchmark re-run.</strong>&#8221;</p></blockquote><p>She didn&#8217;t deflect or dismiss. She debugged the issue herself, identified the exact lines of code causing the problem, explained the downstream impact on other benchmarks, and outlined a detailed fix plan including regression tests.</p><p>This wasn&#8217;t PR speak.
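</p><p>To make the reported bug concrete, here is a hypothetical reconstruction of the adapter pattern the issue describes (the real code lives in <code>benchmarks/longmemeval_bench.py</code>; function names and the sample session below are mine, not MemPalace&#8217;s):</p>

```python
# Hypothetical reconstruction of the benchmark-adapter bug (not MemPalace's
# actual code). A session is a list of {"role", "content"} turns.

def build_indexed_document_buggy(session):
    # BUG: keeps only user turns, so facts that appear in assistant
    # replies never make it into the indexed document.
    return "\n".join(t["content"] for t in session if t["role"] == "user")

def build_indexed_document_fixed(session):
    # FIX: index both sides of the conversation, tagged by role.
    return "\n".join(f'{t["role"]}: {t["content"]}' for t in session)

session = [
    {"role": "user", "content": "Where did I say I was traveling?"},
    {"role": "assistant", "content": "You mentioned a trip to Kyoto in May."},
]

assert "Kyoto" not in build_indexed_document_buggy(session)
assert "Kyoto" in build_indexed_document_fixed(session)
```

<p>Indexing only one side of the dialogue silently drops any fact first stated by the assistant, which is exactly the under-recall pattern the issue reports.</p><p>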
This was an AI-assisted developer engaging seriously with technical criticism.</p><h2>The Democratization Shift</h2><p>This interaction crystallized something profound about our current moment in software development.</p><p>We&#8217;re witnessing the emergence of a new class of builders: technically-minded individuals who understand software conceptually but may not have traditional coding backgrounds. AI-assisted development tools like Claude Code, GitHub Copilot, and Cursor have lowered the implementation barrier to the point where vision and domain expertise matter more than syntax mastery.</p><p>Mila Jovovich exemplifies this shift perfectly. Without formal technical education (she left school in 7th grade for modeling), she spent months intensively learning AI-assisted development through Claude Code starting in late 2025. She understood the conceptual framework of memory palaces deeply enough to architect a sophisticated system. Her collaboration with Ben Sigman &#8212; CEO of Bitcoin lending platform Libre Labs, who provided the engineering expertise while she drove architectural vision &#8212; represents a new model of software development where domain knowledge and AI tool fluency can substitute for traditional programming backgrounds.</p><p>The fact that a movie star can release a technically competent, widely-adopted memory framework isn&#8217;t a commentary on coding getting easier (though it has). 
It&#8217;s about software development becoming more accessible to domain experts and visionaries who previously couldn&#8217;t bridge the implementation gap.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!f3FK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b861799-f605-4e03-bec5-f88ec1387a42_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!f3FK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b861799-f605-4e03-bec5-f88ec1387a42_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!f3FK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b861799-f605-4e03-bec5-f88ec1387a42_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!f3FK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b861799-f605-4e03-bec5-f88ec1387a42_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!f3FK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b861799-f605-4e03-bec5-f88ec1387a42_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!f3FK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b861799-f605-4e03-bec5-f88ec1387a42_1024x1024.png" width="1024" height="1024" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8b861799-f605-4e03-bec5-f88ec1387a42_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1971642,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/193848267?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b861799-f605-4e03-bec5-f88ec1387a42_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!f3FK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b861799-f605-4e03-bec5-f88ec1387a42_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!f3FK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b861799-f605-4e03-bec5-f88ec1387a42_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!f3FK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b861799-f605-4e03-bec5-f88ec1387a42_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!f3FK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b861799-f605-4e03-bec5-f88ec1387a42_1024x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>What MemPalace Gets Right</h2><p>Despite the benchmark issues I uncovered, MemPalace demonstrates genuine technical sophistication. The memory palace metaphor isn&#8217;t just marketing &#8212; it&#8217;s a thoughtful architectural choice that makes AI memory systems more intuitive and debuggable.</p><p>The system includes elegant features like per-agent memory &#8220;wings&#8221; that prevent cross-contamination between different AI assistants. The Claude Code integration hooks are well-designed, automatically triggering memory saves at logical conversation boundaries. The MCP implementation is clean and follows established patterns.</p><p>Most importantly, the project tackles a real problem: most AI memory systems are either expensive cloud services or complex local installations. 
MemPalace provides a middle path that&#8217;s both free and relatively easy to deploy.</p><p>Through my testing and integration experiments, I learned techniques that improved my own KuzuMemory system. The competitive analysis forced me to think more carefully about memory organization patterns and retrieval strategies. This kind of cross-pollination benefits the entire ecosystem.</p><h2>The Validation Requirement</h2><p>But the benchmark controversy highlights a crucial point: democratized software development still requires traditional validation methods.</p><p>AI-assisted coding tools excel at implementation but can perpetuate subtle conceptual errors throughout a codebase. The MemPalace benchmark issues weren&#8217;t obvious bugs &#8212; they were methodological problems that required domain expertise to identify.</p><p>This creates an interesting dynamic: AI tools enable rapid development by non-traditional developers, but peer review by experienced practitioners becomes even more critical. The community response to MemPalace&#8217;s inflated benchmarks wasn&#8217;t hostile &#8212; it was collaborative debugging at scale.</p><p>Mila&#8217;s willingness to engage directly with technical criticism and implement fixes demonstrates the right approach. The democratization of software development doesn&#8217;t eliminate the need for technical rigor; it distributes that rigor across a broader community.</p><h2>The Harness Thesis Validated</h2><p>This story perfectly validates what I call the &#8220;harness thesis&#8221; &#8212; that we&#8217;ve entered an era where AI tool ecosystems matter more than underlying model capabilities.</p><p>MemPalace succeeded not because Mila wrote perfect code from scratch, but because she effectively orchestrated Claude Code to implement her vision. 
The system&#8217;s value comes from its architectural choices, integration quality, and user experience &#8212; not from novel algorithmic breakthroughs.</p><p>Similarly, my ability to audit and improve the system came not from superior coding skills, but from having developed complementary expertise with memory systems and benchmark methodology. The collaboration that emerged &#8212; distributed across GitHub issues, with contributors from multiple backgrounds &#8212; represents the new model of software development.</p><p>We&#8217;re not just building different software; we&#8217;re building software differently.</p><h2>Meeting Mila Through Code</h2><p>In the end, I did meet Mila Jovovich &#8212; through our AI Agents, lines of Python code, GitHub issues, and technical discussions about memory retrieval algorithms, mediated by our respective Claude Code workflows. Not the meeting I would have predicted, but somehow more meaningful than a typical celebrity encounter.</p><p>She embodies a new archetype: the technical visionary who uses AI tools to implement sophisticated ideas without traditional programming backgrounds. Her willingness to engage with criticism and continuously improve the system demonstrates the collaborative spirit that makes this new era of development possible.</p><p>The future of software isn&#8217;t just about better AI models or more powerful tools. 
It&#8217;s about enabling more people with domain expertise and creative vision to participate in building the systems that shape our digital world.</p><p>And sometimes, that means meeting your childhood movie star idol in a GitHub issue thread, debugging memory palace algorithms together.</p><div><hr></div><p><em>Bob Matsuoka is CTO of <a href="https://www.duettocloud.com/">Duetto</a> and writes about AI-powered engineering at <a href="https://hyperdev.substack.com/">HyperDev</a>.</em></p><p><strong>Related reading:</strong></p><ul><li><p><a href="https://hyperdev.matsuoka.com/its-the-harness-stupid">It&#8217;s The Harness, Stupid!</a> &#8212; Why AI tool ecosystems matter more than model capabilities</p></li><li><p><a href="https://aipowerranking.com/">AI Power Ranking</a> &#8212; Tool comparisons and benchmarks for AI practitioners</p></li><li><p><a href="https://www.linkedin.com/newsletters/ai-power-ranking-7345782916301418496/">LinkedIn Newsletter</a> &#8212; Strategic AI insights for CTOs and engineering leaders</p></li></ul><p><strong>Referenced Links:</strong></p><ul><li><p><a href="https://github.com/milla-jovovich/mempalace">MemPalace GitHub Repository</a></p></li><li><p><a href="https://github.com/bobmatnyc/kuzu-memory">KuzuMemory GitHub Repository</a></p></li><li><p><a href="https://github.com/milla-jovovich/mempalace/issues/242">Issue #242: Benchmark adapter bug</a></p></li><li><p><a href="https://github.com/milla-jovovich/mempalace/issues/214">Issue #214: ChromaDB vs MemPalace scoring</a></p></li></ul>]]></content:encoded></item><item><title><![CDATA[The Software Factory is the Next Big Challenge]]></title><description><![CDATA[Many enterprises are rolling their own]]></description><link>https://hyperdev.matsuoka.com/p/the-software-factory-is-the-next</link><guid isPermaLink="false">https://hyperdev.matsuoka.com/p/the-software-factory-is-the-next</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Wed, 08 Apr 2026 12:30:16 
GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!tsf8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58714619-d828-47fe-aa27-7777b26b3b11_1024x825.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tsf8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58714619-d828-47fe-aa27-7777b26b3b11_1024x825.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tsf8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58714619-d828-47fe-aa27-7777b26b3b11_1024x825.png 424w, https://substackcdn.com/image/fetch/$s_!tsf8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58714619-d828-47fe-aa27-7777b26b3b11_1024x825.png 848w, https://substackcdn.com/image/fetch/$s_!tsf8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58714619-d828-47fe-aa27-7777b26b3b11_1024x825.png 1272w, https://substackcdn.com/image/fetch/$s_!tsf8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58714619-d828-47fe-aa27-7777b26b3b11_1024x825.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tsf8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58714619-d828-47fe-aa27-7777b26b3b11_1024x825.png" width="1024" height="825" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/58714619-d828-47fe-aa27-7777b26b3b11_1024x825.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:825,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1823642,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/193118243?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F90d9b63c-9a04-4eb4-895e-c659c19b1b3b_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tsf8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58714619-d828-47fe-aa27-7777b26b3b11_1024x825.png 424w, https://substackcdn.com/image/fetch/$s_!tsf8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58714619-d828-47fe-aa27-7777b26b3b11_1024x825.png 848w, https://substackcdn.com/image/fetch/$s_!tsf8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58714619-d828-47fe-aa27-7777b26b3b11_1024x825.png 1272w, https://substackcdn.com/image/fetch/$s_!tsf8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58714619-d828-47fe-aa27-7777b26b3b11_1024x825.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">The Software Factory is the future of software development</figcaption></figure></div><p>Stripe engineers send Slack messages that automatically become production code. Not suggestions. Not drafts. Production code merged into their main branch, supporting over a trillion dollars in annual payment processing.</p><p>Their <a href="https://stripe.dev/blog/minions-stripes-one-shot-end-to-end-coding-agents">&#8220;Minions&#8221; system</a> generates <a href="https://www.infoq.com/news/2026/03/stripe-autonomous-coding-agents/">1,300 pull requests per week</a> with zero human-written code. Fire-and-forget automation from conversation to deployment. 
While the rest of us debate whether AI can write good code, Stripe has built a software factory that produces enterprise-grade applications at scale.</p><p>The software factory isn&#8217;t a future concept. It&#8217;s a present reality, and it represents the next fundamental challenge for engineering organizations.</p><h2>What We&#8217;re Building at Duetto</h2><p>I&#8217;ve been thinking about this a lot lately. At Duetto, we&#8217;re exploring what a software factory could look like for hospitality technology. Not because we want to eliminate developers, but because we&#8217;re hitting the limits of traditional development approaches for domain-specific applications.</p><p>Our challenge isn&#8217;t just writing code&#8212;it&#8217;s translating complex hotel revenue management requirements into software that works reliably across thousands of properties with different systems, data formats, and business rules. The cognitive load of keeping all these variations in mind while building features is becoming unsustainable.</p><p>What if we could describe what we need in something like our APEX specifications, and have the system generate not just code, but complete deployments? Kubernetes instances running Claude Code agents, database migrations, monitoring setup, the whole stack configured for that specific use case.</p><p>The goal isn&#8217;t replacing our engineering team. Our developers should be solving revenue optimization algorithms and building domain-specific integrations, not configuring YAML files for the hundredth deployment variation.</p><h2>The Stripe Blueprint</h2><p>Stripe&#8217;s Minions reveal what a production software factory actually looks like when you strip away the hype and focus on what works.</p><p><strong>Five-Layer Pipeline</strong>: Their system transforms Slack messages into production-ready pull requests through a structured pipeline. 
Not magic&#8212;engineering discipline applied to automation.</p><p><strong>Sandboxed Execution</strong>: Every agent runs in isolated containers with codebase checkouts. They can&#8217;t access production systems, can&#8217;t cause cascading failures, can&#8217;t break things outside their designated scope. <a href="https://www.anup.io/stripes-coding-agents-the-walls-matter-more-than-the-model/">The walls matter more than the model</a>.</p><p><strong>Surgical Tool Selection</strong>: Their <a href="https://www.mindstudio.ai/blog/what-is-ai-agent-harness-stripe-minions">Model Context Protocol</a> provides access to hundreds of internal tools, but agents get intelligently prefetched access to only the ~15 tools relevant to their specific task. Not everything available&#8212;the right things available.</p><p><strong>One-Shot Optimization</strong>: Instead of conversational back-and-forth, their agents are <a href="https://www.sitepoint.com/stripe-minions-architecture-explained/">optimized for well-defined work</a> that completes in a single execution. Better latency, lower costs, more predictable outcomes.</p><p>The results speak for themselves: 1,300 PRs weekly, zero human-written code in merged changes, supporting their entire payment infrastructure. This isn&#8217;t a pilot program. 
This is their production development workflow.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!18qG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F405e3b24-3997-4ecc-89e7-bcd78eb0c218_1024x674.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!18qG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F405e3b24-3997-4ecc-89e7-bcd78eb0c218_1024x674.png 424w, https://substackcdn.com/image/fetch/$s_!18qG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F405e3b24-3997-4ecc-89e7-bcd78eb0c218_1024x674.png 848w, https://substackcdn.com/image/fetch/$s_!18qG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F405e3b24-3997-4ecc-89e7-bcd78eb0c218_1024x674.png 1272w, https://substackcdn.com/image/fetch/$s_!18qG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F405e3b24-3997-4ecc-89e7-bcd78eb0c218_1024x674.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!18qG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F405e3b24-3997-4ecc-89e7-bcd78eb0c218_1024x674.png" width="1024" height="674" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/405e3b24-3997-4ecc-89e7-bcd78eb0c218_1024x674.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:674,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1235022,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/193118243?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1885512e-da12-485c-9bc1-eaa9bd2512e6_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!18qG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F405e3b24-3997-4ecc-89e7-bcd78eb0c218_1024x674.png 424w, https://substackcdn.com/image/fetch/$s_!18qG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F405e3b24-3997-4ecc-89e7-bcd78eb0c218_1024x674.png 848w, https://substackcdn.com/image/fetch/$s_!18qG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F405e3b24-3997-4ecc-89e7-bcd78eb0c218_1024x674.png 1272w, https://substackcdn.com/image/fetch/$s_!18qG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F405e3b24-3997-4ecc-89e7-bcd78eb0c218_1024x674.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>The Broader Software Factory Landscape</h2><p>Stripe isn&#8217;t alone in building these systems, just the most public about their approach.</p><p>Netflix has their federated developer console integrating dozens of tools into a single unified experience. <a href="https://engineering.atspotify.com/2024/4/supercharged-developer-portals">Spotify&#8217;s Backstage</a> holds 89% market share among internal developer platforms, reducing time-to-tenth-pull-request by 55% for new developers.</p><p>The open source ecosystem is catching up quickly. <a href="https://openhands.dev/">OpenHands</a> provides a model-agnostic platform for cloud coding agents with $18.8M in Series A funding. <a href="https://www.turing.com/blog/top-5-ai-code-generation-tools-in-2024">CodeT5</a> handles multi-language code generation. 
<a href="https://github.com/features/copilot/enterprise">GitHub Copilot Enterprise</a> is expanding beyond code completion into full workflow automation.</p><p>Major cloud providers are also building comprehensive platforms. Microsoft&#8217;s GitHub Copilot Workspace, Google&#8217;s Duet AI for developers, and Amazon&#8217;s Q Developer all represent enterprise-grade attempts at software factory capabilities.</p><p>According to Gartner, <a href="https://calmops.com/devops/internal-developer-platform-idp-2026-complete-guide/">80% of large engineering organizations</a> now have dedicated platform teams. The question isn&#8217;t whether software factories are coming&#8212;it&#8217;s whether your organization will build one or buy one.</p><h2>What a Proper Software Factory Requires</h2><p>Building a software factory isn&#8217;t just about connecting AI tools to deployment pipelines. Based on what&#8217;s working at Stripe and emerging patterns across the industry, here are the essential components:</p><h3>Artifact Response Systems</h3><p>Your factory needs to respond to structured specifications and generate complete deployments. At Duetto, this might mean taking an APEX specification for a new revenue optimization feature and producing:</p><ul><li><p>Kubernetes deployment configurations</p></li><li><p>Database migration scripts</p></li><li><p>Monitoring and alerting setup</p></li><li><p>Load testing scenarios</p></li><li><p>Documentation</p></li></ul><p>The system should handle the entire deployment lifecycle from specification to running production service, not just generate code that someone has to manually deploy.</p><h3>Strategic Human Review Checkpoints</h3><p>Notice I said strategic, not comprehensive. 
Stripe&#8217;s fire-and-forget model works because they&#8217;ve identified the specific points where human judgment adds value without blocking automation.</p><p>For enterprise applications, you need checkpoints at:</p><ul><li><p><strong>Specification validation</strong>: Do the requirements make business sense?</p></li><li><p><strong>Security review</strong>: Are access patterns and data handling appropriate?</p></li><li><p><strong>Integration testing</strong>: Does this work with existing systems?</p></li><li><p><strong>Production readiness</strong>: Are monitoring and rollback capabilities sufficient?</p></li></ul><p>The key is making these gates fast and decisive, not bureaucratic approval processes that defeat the purpose of automation.</p><h3>Scaffolding for Error Detection</h3><p>Your factory will produce broken code. That&#8217;s not a bug&#8212;that&#8217;s reality. The difference between a prototype and a production system is sophisticated error detection and recovery.</p><p>This means:</p><ul><li><p><strong>Isolated execution environments</strong> where failures can&#8217;t cause broader damage</p></li><li><p><strong>Automated testing and iteration</strong> when initial attempts fail</p></li><li><p><strong>Multi-layer validation</strong> before anything reaches production</p></li><li><p><strong>Comprehensive rollback capabilities</strong> for when something gets through anyway</p></li></ul><p>Stripe&#8217;s sandbox architecture is brilliant because it lets agents fail safely while learning from those failures to improve future attempts.</p><h3>Success Criteria Parameters</h3><p>Your factory needs to know what success looks like for each type of work. 
Not just &#8220;the code compiles,&#8221; but measurable business outcomes.</p><p>For a hospitality feature, success might mean:</p><ul><li><p>Performance benchmarks met under load</p></li><li><p>Integration tests pass with five different PMS systems</p></li><li><p>Revenue impact measurable within 30 days</p></li><li><p>Zero customer-facing errors in the first week</p></li></ul><p>Define these criteria upfront, build them into your validation pipeline, and let the factory optimize for actual business value rather than technical metrics alone.</p><h3>Cost Tracking and Optimization</h3><p>AI-powered development isn&#8217;t free. You need visibility into the computational costs, tool usage, and human review time for each generated system.</p><p>Stripe optimizes for this explicitly&#8212;their one-shot agents cost less than conversational approaches, their surgical tool selection reduces context costs, their automated testing prevents expensive human debugging cycles.</p><p>Track these metrics from day one. The difference between a cost-effective software factory and an expensive experiment is usually found in the operational details.</p><h3>Deployment Models</h3><p>Your factory needs sophisticated understanding of how to deploy different types of applications. Golden Path workflows that codify best practices, environment promotion strategies that reduce risk, and rollback procedures that restore service quickly when things go wrong.</p><p>This is where domain expertise becomes critical. A generic software factory might know how to deploy a web service, but does it understand the specific requirements for hospitality payment processing, guest data privacy, and integration with property management systems?</p><h2>The Duetto Context</h2><p>At Duetto, we&#8217;re thinking about how a software factory could handle the complexity of hospitality technology. 
Our domain has unique challenges:</p><p><strong>Data Integration Complexity</strong>: Every hotel uses different systems with different data formats. A software factory needs to understand these variations and generate appropriate integration code.</p><p><strong>Regulatory Requirements</strong>: Guest privacy, payment processing, accessibility compliance. The factory needs to embed these requirements into everything it produces.</p><p><strong>Performance Characteristics</strong>: Revenue management systems need to process pricing updates in near real-time across thousands of rooms and rate plans. The factory needs to optimize for these specific performance patterns.</p><p><strong>Operational Constraints</strong>: Hotels can&#8217;t afford downtime during peak booking periods. Deployment strategies need to account for hospitality business cycles.</p><p>We&#8217;re not trying to build a general-purpose software factory. We&#8217;re exploring how to build one that deeply understands our domain and can produce applications that work reliably in hospitality environments.</p><h2>The Reality Check</h2><p>Building a software factory is hard. Not because the technology doesn&#8217;t exist&#8212;Stripe proves it does&#8212;but because the organizational challenges are substantial.</p><p><strong>ROI Demonstration</strong>: You need to show measurable productivity improvements and cost savings. &#8220;The AI is impressive&#8221; isn&#8217;t sufficient justification for the investment required.</p><p><strong>Security and Compliance</strong>: Automated code generation that touches customer data or payment systems requires additional security layers and audit capabilities.</p><p><strong>Developer Workflow Changes</strong>: Your engineering team needs to learn new ways of working. Some will embrace it, others will resist. 
Change management is as important as the technical implementation.</p><p><strong>Quality Assurance Evolution</strong>: Your QA processes need to evolve from testing human-written code to validating AI-generated systems. Different failure modes, different testing strategies.</p><p><strong>Integration Complexity</strong>: Your factory needs to work with existing systems, databases, APIs, and workflows. The harder the integration challenge, the longer the implementation timeline.</p><p>These aren&#8217;t reasons to avoid building a software factory. They&#8217;re reasons to approach the project with realistic expectations and proper preparation.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bVFQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5931edca-6b79-4588-bae5-ab6a883a7b66_1024x678.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bVFQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5931edca-6b79-4588-bae5-ab6a883a7b66_1024x678.png 424w, https://substackcdn.com/image/fetch/$s_!bVFQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5931edca-6b79-4588-bae5-ab6a883a7b66_1024x678.png 848w, https://substackcdn.com/image/fetch/$s_!bVFQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5931edca-6b79-4588-bae5-ab6a883a7b66_1024x678.png 1272w, https://substackcdn.com/image/fetch/$s_!bVFQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5931edca-6b79-4588-bae5-ab6a883a7b66_1024x678.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bVFQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5931edca-6b79-4588-bae5-ab6a883a7b66_1024x678.png" width="1024" height="678" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5931edca-6b79-4588-bae5-ab6a883a7b66_1024x678.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:678,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1446253,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/193118243?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71a61e37-d209-4927-8ab4-44a35ed22bb9_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bVFQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5931edca-6b79-4588-bae5-ab6a883a7b66_1024x678.png 424w, https://substackcdn.com/image/fetch/$s_!bVFQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5931edca-6b79-4588-bae5-ab6a883a7b66_1024x678.png 848w, https://substackcdn.com/image/fetch/$s_!bVFQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5931edca-6b79-4588-bae5-ab6a883a7b66_1024x678.png 1272w, 
https://substackcdn.com/image/fetch/$s_!bVFQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5931edca-6b79-4588-bae5-ab6a883a7b66_1024x678.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>Looking Forward</h2><p>The trajectory is clear. 
<a href="https://leanopstech.com/blog/platform-engineering-in-2025-the-future-of-developer-productivity/">Software factories are moving from experimental to mainstream</a>, with proven systems operating at enterprise scale and standardized architecture patterns emerging across the industry.</p><p>The question for engineering leaders isn&#8217;t whether this transformation will happen. It&#8217;s whether your organization will be an early adopter that shapes how software factories work in your domain, or a later adopter that implements patterns developed by others.</p><p>At Duetto, we&#8217;re betting on being early. Not because we want to be on the cutting edge for its own sake, but because the companies that figure out domain-specific software factories first will have a significant competitive advantage in application development speed and quality.</p><p>The software factory represents the next evolution of platform engineering. The organizations that master it will build better software faster than those that don&#8217;t.</p><p>The challenge isn&#8217;t technical anymore. It&#8217;s organizational, strategic, and operational.</p><p>The question is: Are you ready to build one?</p><div><hr></div><p><em>About this analysis: This piece draws from comprehensive research on production software factory implementations, including detailed analysis of Stripe&#8217;s Minions architecture, enterprise platform engineering initiatives, and emerging open source solutions. The author is exploring software factory applications for hospitality technology at Duetto.</em></p><p><em>About the author: Bob Matsuoka is Chief Technology Officer at Duetto and creator of Claude MPM (Multi-agent Project Manager). 
He has implemented AI-assisted development workflows across enterprise engineering teams and writes about the practical realities of AI integration in software development at <a href="https://hyperdev.substack.com/">HyperDev</a>.</em></p>]]></content:encoded></item><item><title><![CDATA[Is The Claude Code Team Moving Too Quickly?]]></title><description><![CDATA[What To Think of the Source Leak]]></description><link>https://hyperdev.matsuoka.com/p/is-the-claude-code-team-moving-too</link><guid isPermaLink="false">https://hyperdev.matsuoka.com/p/is-the-claude-code-team-moving-too</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Mon, 06 Apr 2026 12:30:41 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!xLBj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd07636e2-b451-4839-853e-91fc8ca0b4b3_1024x595.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xLBj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd07636e2-b451-4839-853e-91fc8ca0b4b3_1024x595.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xLBj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd07636e2-b451-4839-853e-91fc8ca0b4b3_1024x595.png 424w, https://substackcdn.com/image/fetch/$s_!xLBj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd07636e2-b451-4839-853e-91fc8ca0b4b3_1024x595.png 848w, 
https://substackcdn.com/image/fetch/$s_!xLBj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd07636e2-b451-4839-853e-91fc8ca0b4b3_1024x595.png 1272w, https://substackcdn.com/image/fetch/$s_!xLBj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd07636e2-b451-4839-853e-91fc8ca0b4b3_1024x595.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xLBj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd07636e2-b451-4839-853e-91fc8ca0b4b3_1024x595.png" width="1024" height="595" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d07636e2-b451-4839-853e-91fc8ca0b4b3_1024x595.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:595,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1151803,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/193101232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d9c1f20-243f-4cf5-b1c1-b708e63ffae8_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xLBj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd07636e2-b451-4839-853e-91fc8ca0b4b3_1024x595.png 424w, 
https://substackcdn.com/image/fetch/$s_!xLBj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd07636e2-b451-4839-853e-91fc8ca0b4b3_1024x595.png 848w, https://substackcdn.com/image/fetch/$s_!xLBj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd07636e2-b451-4839-853e-91fc8ca0b4b3_1024x595.png 1272w, https://substackcdn.com/image/fetch/$s_!xLBj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd07636e2-b451-4839-853e-91fc8ca0b4b3_1024x595.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>On March 31, 2026, Anthropic accidentally shipped their entire Claude Code source&#8212;512,000 lines of TypeScript&#8212;in an npm package. What followed was perhaps the most intense technical autopsy in AI history. The verdict? Mixed, and revealing.</p><p>The criticism has been swift and pointed. A 5,594-line file with a single 3,167-line function sporting 12 levels of nesting. Regex-based frustration detection looking for &#8220;wtf&#8221; and &#8220;shit&#8221;. A quarter million wasted API calls per day from a three-line bug. As one critic put it: &#8220;A multi-billion-dollar AI company is detecting user frustration with a regex.&#8221;</p><p>But before we pile on, we need to ask: <strong>What does &#8220;good code&#8221; even mean when you&#8217;re building client-side LLM applications?</strong></p><h2>The Unprecedented Challenge</h2><p>Claude Code isn&#8217;t your typical software. It&#8217;s a client-side application that orchestrates conversations with large language models, manages context across sessions, and attempts to maintain coherent state while working with fundamentally non-deterministic systems.</p><p>This creates problems that traditional software engineering practices weren&#8217;t designed for:</p><ul><li><p><strong>Context management</strong>: Handling arbitrarily long conversations that exceed model limits</p></li><li><p><strong>Failure recovery</strong>: When your core computation is a 20% failure-rate API call</p></li><li><p><strong>State synchronization</strong>: Keeping UI, conversation history, and model context aligned</p></li><li><p><strong>Dynamic adaptation</strong>: Code that needs to adapt to changing model capabilities</p></li></ul><p>The leaked source reveals sophisticated solutions to these problems: a three-layer memory architecture, anti-distillation mechanisms, dual parser systems for safety. 
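</p><p>To make the failure-recovery point concrete: when the core computation is an API call that fails roughly 20% of the time, retry with backoff stops being defensive boilerplate and becomes load-bearing architecture. A minimal sketch of the pattern (the names are hypothetical, not taken from the leaked source):</p>

```python
import random
import time

def call_with_recovery(request, max_attempts=4, base_delay=0.5):
    """Retry a flaky, non-deterministic call with exponential backoff.

    Hypothetical sketch: `request` is any zero-argument callable that may
    raise on transient failure; nothing here is from the leaked source.
    """
    for attempt in range(max_attempts):
        try:
            return request()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # retry budget exhausted: surface the failure
            # Exponential backoff with jitter to avoid synchronized retries.
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```

<p>Everything downstream of a call like this (state synchronization, context bookkeeping) has to assume the retry loop, not a single reliable call.</p><p>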
The engineering is <s>genuinely</s> impressive, even if the implementation is sometimes ugly.</p><h2>The Meta-Problem: AI Writing AI</h2><p>Claude Code was partially written by Claude Code. This represents the first documented case of a large-scale AI tool generating significant portions of its own source code&#8212;not just an incremental improvement, but a categorical change in development methodology.</p><p>When AI generates code at scales that exceed human review capacity, traditional quality control breaks down. That 3,167-line function? Probably not written by a human. The 12 levels of nesting? Algorithmic patterns, not human design choices.</p><p><strong>This is the real story</strong>: We&#8217;re witnessing the first major autopsy of self-bootstrapping AI tooling.</p><h2>Deterministic vs. LLM Code: Different Standards Apply</h2><p>I&#8217;ve been thinking about this distinction a lot lately in my work with <a href="https://github.com/bobmatnyc/claude-mpm">Claude MPM</a>, an open-source multi-agent code generation framework built on Claude Code that coordinates specialized AI agents for software development workflows. When you&#8217;re building traditional, deterministic software, all the usual rules apply. Clean functions, clear abstractions, maintainable architecture. 
Use your normal code analysis tools.</p><p>But when you&#8217;re building LLM-integrated systems, the rules change:</p><ol><li><p><strong>Failure is the default</strong>: Your core operations fail 20% of the time</p></li><li><p><strong>Context is expensive</strong>: Every token counts toward limits</p></li><li><p><strong>Behavior is emergent</strong>: The system does things you didn&#8217;t explicitly program</p></li><li><p><strong>Adaptation is constant</strong>: Model capabilities change monthly</p></li></ol><p>In this world, a 5,594-line file might be ugly, but if it successfully manages complex failure recovery across multiple conversation threads, it might also be <em>correct</em>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aIuf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd26aa417-4783-42d6-8105-488131dfe518_1024x790.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aIuf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd26aa417-4783-42d6-8105-488131dfe518_1024x790.png 424w, https://substackcdn.com/image/fetch/$s_!aIuf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd26aa417-4783-42d6-8105-488131dfe518_1024x790.png 848w, https://substackcdn.com/image/fetch/$s_!aIuf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd26aa417-4783-42d6-8105-488131dfe518_1024x790.png 1272w, 
https://substackcdn.com/image/fetch/$s_!aIuf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd26aa417-4783-42d6-8105-488131dfe518_1024x790.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aIuf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd26aa417-4783-42d6-8105-488131dfe518_1024x790.png" width="1024" height="790" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d26aa417-4783-42d6-8105-488131dfe518_1024x790.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:790,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1466836,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/193101232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd33fb826-655a-4d8b-87cc-bfade8984326_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aIuf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd26aa417-4783-42d6-8105-488131dfe518_1024x790.png 424w, https://substackcdn.com/image/fetch/$s_!aIuf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd26aa417-4783-42d6-8105-488131dfe518_1024x790.png 848w, 
https://substackcdn.com/image/fetch/$s_!aIuf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd26aa417-4783-42d6-8105-488131dfe518_1024x790.png 1272w, https://substackcdn.com/image/fetch/$s_!aIuf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd26aa417-4783-42d6-8105-488131dfe518_1024x790.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>The Code Analysis Checkpoint Strategy</h2><p>This is where I&#8217;ve found success with my recent updates to the code analyzer in 
Claude MPM. The analyzer uses <a href="https://github.com/modelcontextprotocol/servers/tree/main/src/mcp-vector-search">mcp-vector-search</a> for comprehensive codebase analysis, providing AST-based semantic search, full-text search capabilities, and knowledge graph construction for architectural pattern detection. Instead of trying to prevent AI from generating messy code (impossible), I focus on <strong>regular refactoring and analysis checkpoints</strong>.</p><p>The analyzer has gotten very good at catching two specific issues:</p><ol><li><p><strong>Drift</strong>: When AI-generated code slowly diverges from intended architecture</p></li><li><p><strong>Bloat</strong>: When generated solutions become unnecessarily complex over time</p></li></ol><p>I make a point to run these checkpoints regularly, treating them as essential maintenance rather than optional cleanup. It&#8217;s like running <code>cargo clippy</code> or <code>eslint</code>, but for AI-generated architectural decisions.</p><p>The key insight: <strong>AI code needs different kinds of maintenance than human code</strong>.</p><h2>Outcome-Based Generation: Does It Work?</h2><p>Here&#8217;s my perhaps controversial take: If Claude Code successfully helps developers ship better software faster, then the messy internals might not matter as much as we think.</p><p>The leaked code reveals a system that:</p><ul><li><p>Handles millions of conversations per day</p></li><li><p>Maintains context across arbitrarily long sessions</p></li><li><p>Provides sophisticated memory management</p></li><li><p>Implements multiple safety layers</p></li><li><p>Delivers a $2.5 billion ARR product experience</p></li></ul><p>Is the implementation elegant? No. Does it work? Apparently, yes: we can observe and measure what it builds independently of what built it.</p><p>This doesn&#8217;t excuse basic engineering failures (that <code>.npmignore</code> mistake was embarrassing). 
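</p><p>For a sense of what a bloat checkpoint can catch mechanically, here is a toy analyzer that flags oversized functions and deep nesting with Python&#8217;s <code>ast</code> module. This is an illustration, not Claude MPM&#8217;s actual implementation, and the thresholds are assumptions:</p>

```python
import ast

def bloat_report(source, max_lines=200, max_depth=6):
    """Flag functions that exceed size or nesting thresholds.

    Toy sketch for illustration; thresholds and heuristics are assumptions,
    not Claude MPM defaults.
    """
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = (node.end_lineno or node.lineno) - node.lineno + 1
            depth = _max_depth(node)
            if length > max_lines or depth > max_depth:
                findings.append((node.name, length, depth))
    return findings

def _max_depth(node, depth=0):
    """Deepest chain of nested control-flow statements under `node`."""
    nested = (ast.If, ast.For, ast.While, ast.With, ast.Try)
    best = depth
    for child in ast.iter_child_nodes(node):
        step = depth + 1 if isinstance(child, nested) else depth
        best = max(best, _max_depth(child, step))
    return best
```

<p>A 3,167-line function with 12 nesting levels trips a checkpoint like this immediately, which is the point: catch it on a schedule, not in a leak.</p><p>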
But it does suggest we need new frameworks for evaluating AI-generated systems.</p><h2>The Scaffolding Solution</h2><p>Rather than trying to make AI generate perfect code, we can scaffold around the inevitable messiness:</p><p><strong>Automated refactoring checkpoints</strong>: Regular cleanup of AI-generated bloat<br><strong>Architectural constraints</strong>: Guard rails that prevent the worst patterns<br><strong>Outcome validation</strong>: Testing that focuses on behavior over implementation<br><strong>Human oversight</strong>: Strategic points where humans validate AI decisions</p><p>This is the approach I&#8217;ve been taking with Claude MPM, and it&#8217;s proven remarkably effective. Let the AI generate messy-but-functional code, then use tooling to clean it up systematically.</p><h2>What This Means for the Industry</h2><p>The Claude Code leak represents a watershed moment. It&#8217;s our first real look at what happens when AI tools build themselves at scale.</p><p>The criticism is valid&#8212;basic engineering discipline matters, even in AI systems. A missing <code>.npmignore</code> file is inexcusable for a billion-dollar product.</p><p>But the deeper question is whether we&#8217;re applying the right standards. 
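</p><p>The &#8220;outcome validation&#8221; idea above can be made concrete as a behavioral contract: assert on what an agent produced, never on how it was produced. A minimal sketch (the constraint names are hypothetical examples, not a real Claude Code or Claude MPM contract):</p>

```python
def validate_outcome(output: str, constraints: dict) -> list:
    """Return the names of behavioral constraints the output violates.

    Illustrative only: each check targets observable behavior of the
    artifact, not the implementation that generated it.
    """
    return [name for name, check in constraints.items() if not check(output)]

# Hypothetical contract for an agent-generated commit message.
commit_contract = {
    "non_empty": lambda s: bool(s.strip()),
    "subject_fits": lambda s: len(s.splitlines()[0]) <= 72,
    "no_placeholder": lambda s: "TODO" not in s,
}
```

<p>The artifact either satisfies the contract or it doesn&#8217;t; the 12 levels of nesting behind it never enter the verdict.</p><p>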
Traditional code quality metrics may not capture what actually matters for AI-integrated systems.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NrDl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87df3326-02c3-4838-b302-ab23ab5d5e19_1024x825.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NrDl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87df3326-02c3-4838-b302-ab23ab5d5e19_1024x825.png 424w, https://substackcdn.com/image/fetch/$s_!NrDl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87df3326-02c3-4838-b302-ab23ab5d5e19_1024x825.png 848w, https://substackcdn.com/image/fetch/$s_!NrDl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87df3326-02c3-4838-b302-ab23ab5d5e19_1024x825.png 1272w, https://substackcdn.com/image/fetch/$s_!NrDl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87df3326-02c3-4838-b302-ab23ab5d5e19_1024x825.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NrDl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87df3326-02c3-4838-b302-ab23ab5d5e19_1024x825.png" width="1024" height="825" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/87df3326-02c3-4838-b302-ab23ab5d5e19_1024x825.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:825,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1937908,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/193101232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1273dbb0-a196-49bc-b0ca-17f437545fd8_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NrDl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87df3326-02c3-4838-b302-ab23ab5d5e19_1024x825.png 424w, https://substackcdn.com/image/fetch/$s_!NrDl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87df3326-02c3-4838-b302-ab23ab5d5e19_1024x825.png 848w, https://substackcdn.com/image/fetch/$s_!NrDl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87df3326-02c3-4838-b302-ab23ab5d5e19_1024x825.png 1272w, https://substackcdn.com/image/fetch/$s_!NrDl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87df3326-02c3-4838-b302-ab23ab5d5e19_1024x825.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>Moving Forward</h2><p>Anthropic probably <em>is</em> moving too quickly in some ways. The leak revealed security vulnerabilities, competitive intelligence losses, and quality control failures that suggest inadequate human oversight.</p><p>But they&#8217;re also pioneering entirely new categories of software. The problems they&#8217;re solving&#8212;context management, failure recovery, human-AI collaboration&#8212;don&#8217;t have established best practices yet.</p><p>The real lesson isn&#8217;t that AI-generated code is inherently bad. It&#8217;s that we need new practices for building, reviewing, and maintaining systems that exceed human comprehension scales.</p><p><strong>The question isn&#8217;t whether Claude Code&#8217;s internals are messy. 
It&#8217;s whether we can build better scaffolding around AI-generated systems to catch the problems that matter while accepting the messiness we can&#8217;t avoid.</strong></p><p>The Claude Code team probably needs to slow down on the basics&#8212;security, testing, deployment hygiene. But they&#8217;re moving fast on problems that genuinely require speed to solve before competitors do.</p><p>That&#8217;s a nuanced position in an industry that loves simple takes. But nuance is what the moment requires.</p><p><em>What do you think? Are we being too hard on AI-generated code, or not hard enough? Share your thoughts in the comments.</em></p><div><hr></div><p><em>About this analysis: This piece draws from extensive technical analysis of the March 31, 2026 Claude Code source leak, including community responses, security assessments, and business impact analysis. The author maintains active development projects using AI-assisted coding tools and has direct experience with the challenges discussed.</em></p><p><em>About the author: Bob Matsuoka is Chief Technology Officer at <a href="https://www.duettocloud.com/">Duetto</a> and creator of Claude MPM (Multi-agent Project Manager). 
He has implemented AI-assisted development workflows across enterprise engineering teams and writes about the practical realities of AI integration in software development at <a href="https://hyperdev.substack.com">HyperDev</a>.</em></p>]]></content:encoded></item><item><title><![CDATA[Moving Past the 10-Tab Workflow]]></title><description><![CDATA[Autonomous Orchestration Management Is Next]]></description><link>https://hyperdev.matsuoka.com/p/moving-past-the-10-tab-workflow</link><guid isPermaLink="false">https://hyperdev.matsuoka.com/p/moving-past-the-10-tab-workflow</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Wed, 01 Apr 2026 12:32:10 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!wz2X!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5c54d57-2f21-4893-a87d-88523deb8ae7_1024x690.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wz2X!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5c54d57-2f21-4893-a87d-88523deb8ae7_1024x690.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wz2X!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5c54d57-2f21-4893-a87d-88523deb8ae7_1024x690.png 424w, https://substackcdn.com/image/fetch/$s_!wz2X!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5c54d57-2f21-4893-a87d-88523deb8ae7_1024x690.png 848w, 
https://substackcdn.com/image/fetch/$s_!wz2X!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5c54d57-2f21-4893-a87d-88523deb8ae7_1024x690.png 1272w, https://substackcdn.com/image/fetch/$s_!wz2X!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5c54d57-2f21-4893-a87d-88523deb8ae7_1024x690.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wz2X!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5c54d57-2f21-4893-a87d-88523deb8ae7_1024x690.png" width="1024" height="690" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a5c54d57-2f21-4893-a87d-88523deb8ae7_1024x690.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:690,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1272299,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/192741144?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb87ec05a-31c6-4037-a9e0-db7c8c8fcba4_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wz2X!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5c54d57-2f21-4893-a87d-88523deb8ae7_1024x690.png 424w, 
https://substackcdn.com/image/fetch/$s_!wz2X!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5c54d57-2f21-4893-a87d-88523deb8ae7_1024x690.png 848w, https://substackcdn.com/image/fetch/$s_!wz2X!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5c54d57-2f21-4893-a87d-88523deb8ae7_1024x690.png 1272w, https://substackcdn.com/image/fetch/$s_!wz2X!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5c54d57-2f21-4893-a87d-88523deb8ae7_1024x690.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">From Tab Chaos to Autonomic Orchestration</figcaption></figure></div><p>I&#8217;m looking at my (iTerm) terminal right now. Ten tmux sessions. Each session holds a different project context&#8212;one monitoring CI failures, another handling a code review, a third debugging a production issue.</p><p>This is the reality of modern agent development work.</p><h2>TL;DR</h2><ul><li><p><strong>Multi-session reality</strong>: Power users average 8-12 terminal sessions; most work involves modification, bug response, and PR handling&#8212;not new code generation</p></li><li><p><strong>Natural workflow origination</strong>: Future systems trigger from product team actions, CI failures, and automated events rather than human prompts</p></li><li><p><strong>Orchestration evolution</strong>: From human-orchestrated agents to orchestration-of-orchestrators where prime coordinators are non-human</p></li><li><p><strong>Production examples</strong>: Stripe&#8217;s Minions (1,300 PRs/week), GitLab&#8217;s Duo Agent Platform, Meta&#8217;s REA demonstrate hierarchical agent orchestration</p></li><li><p><strong>Architecture shift</strong>: Claude Code&#8217;s SDK model enables workflow-driven development through persistent, context-aware agent orchestration</p></li></ul><h2>The 10-Tab Reality</h2><p>According to recent developer workflow studies, <a href="https://www.heyuan110.com/posts/ai/2026-03-03-tmux-guide-ai-development/">tmux has become the standard for AI-assisted development</a>, with <a href="https://dev.to/_d7eb1c1703182e3ce1782/tmux-tutorial-the-complete-developer-workflow-guide-2026-33b3">persistent sessions solving the context-switching tax</a>. The productivity advantage isn&#8217;t the multiplexing&#8212;it&#8217;s the persistence. 
Projects become environments you step in and out of rather than things you open and close.</p><p>But here&#8217;s what the productivity tutorials miss: most of those tabs aren&#8217;t generating software.</p><p><strong>My current session breakdown:</strong></p><ul><li><p>3 sessions: non-coding &#8212; my CTO knowledge base (currently analyzing our Sumo use), a writing assistant, and our Duetto product management framework</p></li><li><p>4 sessions: coding &#8212; various internal tools and MCP connectors</p></li><li><p>2 sessions: coding &#8212; new projects</p></li><li><p>1 session: code review</p></li></ul><p>The roughly 8:2 ratio of existing-system work to net-new creation holds across most senior developers I&#8217;ve observed. Most development work involves responding to existing systems, not creating new ones.</p><p>This distribution points toward something significant: <strong>the future of development orchestration isn&#8217;t human-initiated.</strong></p><h2>Beyond Prompt-Driven Development</h2><p>Claude Code&#8217;s new SDK architecture reflects this reality. Instead of starting with human prompts, work originates from natural workflow events:</p><ul><li><p>Product team creates ticket &#8594; Implementation specification generated</p></li><li><p>CI pipeline fails &#8594; Diagnostic agent analyzes failure, proposes fix</p></li><li><p>PR submitted &#8594; Review agent examines code, suggests improvements</p></li><li><p>Production alert triggered &#8594; Incident response agent investigates, documents findings</p></li><li><p>Security scan detects vulnerability &#8594; Remediation agent generates patch</p></li></ul><p>The pattern: <strong>Event &#8594; Agent Response &#8594; Human Review &#8594; Autonomous Resolution</strong>.</p><p>Humans remain in the loop, but as orchestrators and validators rather than initiators. 
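</p><p>A minimal sketch of this event-driven loop, in Python with entirely hypothetical names (this is illustrative, not Claude Code&#8217;s actual SDK):</p>

```python
# Hypothetical sketch: work originates from workflow events, not human prompts.
# Agent names and helper functions are illustrative only.

HANDLERS = {
    "ticket.created": "spec-agent",
    "ci.failure": "diagnostic-agent",
    "pr.opened": "review-agent",
    "alert.production": "incident-agent",
    "scan.vulnerability": "remediation-agent",
}

def run_agent(agent: str, payload: dict) -> dict:
    # Placeholder for a real agent invocation.
    return {"agent": agent, "summary": f"handled {payload.get('id', '?')}"}

def needs_human_review(proposal: dict) -> bool:
    # A real system would gate on risk, blast radius, and confidence.
    return True

def on_event(event_type: str, payload: dict) -> dict:
    """Event -> Agent Response -> Human Review -> Autonomous Resolution."""
    agent = HANDLERS.get(event_type)
    if agent is None:
        return {"status": "ignored", "event": event_type}
    proposal = run_agent(agent, payload)
    if needs_human_review(proposal):
        return {"status": "escalated", **proposal}
    return {"status": "resolved", **proposal}
```

<p>The human shows up as a gate inside the loop, not as the prompt that starts it.</p><p>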
The shift from &#8220;What should I build?&#8221; to &#8220;How should this system respond?&#8221;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!G7Uu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20f98126-397c-40f4-bf7d-6fc025ab018d_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!G7Uu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20f98126-397c-40f4-bf7d-6fc025ab018d_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!G7Uu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20f98126-397c-40f4-bf7d-6fc025ab018d_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!G7Uu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20f98126-397c-40f4-bf7d-6fc025ab018d_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!G7Uu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20f98126-397c-40f4-bf7d-6fc025ab018d_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!G7Uu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20f98126-397c-40f4-bf7d-6fc025ab018d_1024x1024.png" width="1024" height="1024" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/20f98126-397c-40f4-bf7d-6fc025ab018d_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1418038,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/192741144?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20f98126-397c-40f4-bf7d-6fc025ab018d_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!G7Uu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20f98126-397c-40f4-bf7d-6fc025ab018d_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!G7Uu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20f98126-397c-40f4-bf7d-6fc025ab018d_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!G7Uu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20f98126-397c-40f4-bf7d-6fc025ab018d_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!G7Uu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20f98126-397c-40f4-bf7d-6fc025ab018d_1024x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Yes, Dall-e is Still Atrocious With Spelling.  Don&#8217;t @ me!</figcaption></figure></div><h2>Orchestration of Orchestrators: Production Examples</h2><h3>Stripe&#8217;s Blueprint Architecture</h3><p><a href="https://medium.com/@oracle_43885/how-stripe-built-secure-unattended-ai-agents-merging-1-000-pull-requests-weekly-1ff42f3fe550">Stripe&#8217;s Minions system demonstrates mature orchestration-of-orchestrators</a>. 
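</p><p>The core idea of their blueprints, a strict contract that alternates deterministic steps with agentic ones, can be sketched in a few lines of Python (hypothetical names and schema; not Stripe&#8217;s actual API):</p>

```python
# Hypothetical sketch of a blueprint: a strict contract between the
# orchestrator and execution, alternating deterministic and agentic nodes.
from dataclasses import dataclass
from typing import Callable

@dataclass
class TaskSpec:
    inputs: dict                       # input requirements
    output_format: str                 # e.g. "unified-diff"
    constraints: list                  # e.g. ["isolated-vm", "no-prod-access"]
    success: Callable[[dict], bool]    # success criteria

def run_blueprint(spec: TaskSpec, nodes: list) -> dict:
    """The orchestrator owns the workflow; each node owns its implementation."""
    state = dict(spec.inputs)
    for node in nodes:
        state = node(state)            # each node returns updated state
    state["ok"] = spec.success(state)
    return state

# A deterministic lint node followed by a (stubbed) agentic patch node.
spec = TaskSpec(
    inputs={"repo": "payments", "failing_test": "test_refund"},
    output_format="unified-diff",
    constraints=["isolated-vm"],
    success=lambda s: "patch" in s,
)
result = run_blueprint(spec, [
    lambda s: {**s, "lint": "clean"},                  # deterministic node
    lambda s: {**s, "patch": "--- a/refund.py ..."},   # agentic node (stub)
])
```

<p>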
Their &#8220;blueprint&#8221; pattern alternates between deterministic code nodes and agentic reasoning loops, generating <a href="https://blog.bytebytego.com/p/how-stripes-minions-ship-1300-prs">1,300+ pull requests weekly</a>.</p><p><strong>Architecture insight</strong>: <a href="https://www.mindstudio.ai/blog/stripe-minions-blueprint-architecture-deterministic-agentic-nodes">Each blueprint functions as a strict contract between orchestration and execution</a>. Task definitions specify input requirements, output formats, constraints, and success criteria. The orchestrator manages workflow, agents handle implementation.</p><p><strong>Security model</strong>: <a href="https://www.sitepoint.com/stripe-minions-architecture-explained/">Every Minion execution runs in isolated VMs</a> with no internet or production access. The system has submission authority but not merge authority&#8212;all changes require human review.</p><h3>GitLab&#8217;s Intelligent Orchestration</h3><p><a href="https://about.gitlab.com/blog/agentic-sdlc-gitlab-and-tcs-deliver-intelligent-orchestration-across-the-enterprise/">GitLab&#8217;s Duo Agent Platform treats agents as durable actors</a> that plan, modify code, fix pipelines, and enforce security with traceability. Multiple AI agents handle parallel tasks&#8212;code generation, testing, CI/CD fixes&#8212;while developers maintain oversight through defined rules.</p><p><strong>Orchestration insight</strong>: <a href="https://docs.gitlab.com/user/duo_agent_platform/">GitLab positions itself as an AI orchestration plane</a> where humans and agents share delivery responsibility. 
The platform coordinates multi-agent workflows across the entire software lifecycle rather than providing isolated AI tools.</p><h3>Meta&#8217;s Hierarchical Agent Systems</h3><p><a href="https://engineering.fb.com/2026/03/17/developer-tools/ranking-engineer-agent-rea-autonomous-ai-system-accelerating-meta-ads-ranking-innovation/">Meta&#8217;s Ranking Engineer Agent (REA) demonstrates autonomous ML lifecycle management</a>. REA Planner and REA Executor components, supported by shared skill and knowledge systems, autonomously evolve ads ranking models at scale.</p><p><strong>Acquisition significance</strong>: <a href="https://venturebeat.com/orchestration/why-meta-bought-manus-and-what-it-means-for-your-enterprise-ai-agent">Meta&#8217;s $2B Manus acquisition</a> focused on orchestration infrastructure rather than foundation models. Manus&#8217;s achievement was engineering an execution layer enabling models to browse, code, manipulate files, and complete multi-step workflows autonomously.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!21xU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0980b901-da6a-408f-a4c7-2dd2188be40c_1024x663.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!21xU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0980b901-da6a-408f-a4c7-2dd2188be40c_1024x663.png 424w, https://substackcdn.com/image/fetch/$s_!21xU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0980b901-da6a-408f-a4c7-2dd2188be40c_1024x663.png 848w, 
https://substackcdn.com/image/fetch/$s_!21xU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0980b901-da6a-408f-a4c7-2dd2188be40c_1024x663.png 1272w, https://substackcdn.com/image/fetch/$s_!21xU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0980b901-da6a-408f-a4c7-2dd2188be40c_1024x663.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!21xU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0980b901-da6a-408f-a4c7-2dd2188be40c_1024x663.png" width="1024" height="663" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0980b901-da6a-408f-a4c7-2dd2188be40c_1024x663.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:663,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1147097,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/192741144?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef64881a-c7f7-4bab-8d8d-2393ee5bf8b0_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!21xU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0980b901-da6a-408f-a4c7-2dd2188be40c_1024x663.png 424w, 
https://substackcdn.com/image/fetch/$s_!21xU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0980b901-da6a-408f-a4c7-2dd2188be40c_1024x663.png 848w, https://substackcdn.com/image/fetch/$s_!21xU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0980b901-da6a-408f-a4c7-2dd2188be40c_1024x663.png 1272w, https://substackcdn.com/image/fetch/$s_!21xU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0980b901-da6a-408f-a4c7-2dd2188be40c_1024x663.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>The Architecture Implications</h2><h3>Beyond the Single-Agent Model</h3><p>The production examples reveal a consistent pattern: successful autonomous development requires <strong>hierarchical orchestration</strong> rather than monolithic AI assistants.</p><p><strong>Traditional approach</strong>: Human &#8594; Single Agent &#8594; Code<br><strong>Emerging pattern</strong>: Event &#8594; Orchestrator &#8594; Specialized Agents &#8594; Validation &#8594; Resolution</p><h3>Context Preservation at Scale</h3><p><a href="https://medium.com/@gveloper/using-iterm2s-built-in-integration-with-tmux-d5d0ef55ec30">The tmux paradigm</a> of persistent sessions maps directly to agent orchestration. Instead of recreating context for each interaction, systems maintain ongoing project understanding across multiple concurrent workflows.</p><p><strong>Implementation insight</strong>: <a href="https://iterm2.com/documentation-tmux-integration.html">iTerm2&#8217;s tmux integration (-CC mode)</a> provides the UI pattern for agent orchestration&#8212;persistent remote workspaces with native interface feel. The same architecture principles apply to agent coordination.</p><h2>Where This Leads</h2><h3>Non-Human Prime Orchestrators</h3><p>The logical endpoint isn&#8217;t humans managing multiple agents&#8212;it&#8217;s orchestrating systems that manage agent ecosystems. <a href="https://www.deloitte.com/us/en/insights/industry/technology/technology-media-and-telecom-predictions/2026/ai-agent-orchestration.html">According to Gartner&#8217;s 2025 Agentic AI research</a>, nearly 50% of surveyed vendors identified AI orchestration as their primary differentiator.</p><p><strong>Pattern emergence</strong>: Meta-agents or orchestrator-generalists will control specialized agents, assign tasks, interpret results, and revise goals in real-time. 
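</p><p>A toy version of such a meta-orchestrator, with hypothetical specialist agents (illustrative only):</p>

```python
# Hypothetical sketch: a meta-agent assigns tasks to specialized agents,
# interprets results, and revises goals before escalating to a human.

SPECIALISTS = {
    "review": lambda task: {"done": True, "notes": f"reviewed {task}"},
    "test":   lambda task: {"done": False, "notes": f"{task}: 2 failures"},
}

def meta_orchestrate(goals: list) -> list:
    """Each goal is (specialist, task); failures are revised and retried
    once before being escalated for human attention."""
    results = []
    queue = [(kind, task, 0) for kind, task in goals]
    while queue:
        kind, task, attempts = queue.pop(0)
        outcome = SPECIALISTS[kind](task)
        if outcome["done"]:
            results.append({"task": task, "status": "resolved"})
        elif attempts < 1:
            queue.append((kind, task + " (revised)", attempts + 1))
        else:
            results.append({"task": task, "status": "escalated"})
    return results
```

<p>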
<a href="https://arxiv.org/pdf/2601.13671">Hierarchical orchestration becomes essential for enterprise-scale implementations</a>.</p><h3>The Developer Role Evolution</h3><p>Instead of managing 10 terminal sessions, a framework orchestrates autonomous workflows. Each workflow maintains its own context, responds to its own triggers, and escalates to human attention when required. Some of those workflows will be driven by human experimentation and new development; the majority will respond to the automated lifecycle.</p><p><strong>Skills that matter</strong>:</p><ul><li><p><strong>Workflow boundary definition</strong>: Which autonomous streams can operate independently?</p></li><li><p><strong>Escalation criteria design</strong>: When do workflows require human intervention?</p></li><li><p><strong>Cross-workflow dependency management</strong>: How do autonomous streams coordinate?</p></li><li><p><strong>Quality gate enforcement</strong>: What validation must occur before autonomous resolution?</p></li></ul><h2>Implementation Considerations</h2><p>Teams experimenting with orchestrated autonomous development should consider:</p><ol><li><p><strong>Event-driven architecture</strong>: Which existing workflows could trigger autonomous responses?</p></li><li><p><strong>Context preservation systems</strong>: How will agent workflows maintain project understanding?</p></li><li><p><strong>Isolation and security</strong>: What boundaries prevent autonomous agents from causing damage?</p></li><li><p><strong>Human oversight integration</strong>: Where do human validation points occur in autonomous workflows?</p></li><li><p><strong>Cross-workflow coordination</strong>: How do parallel autonomous streams avoid conflicts?</p></li></ol><p>The transition from 10-tab manual orchestration to autonomous lifecycle orchestration isn&#8217;t theoretical. Stripe, GitLab, and Meta demonstrate production implementations. 
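</p><p>The five considerations above map naturally onto a per-workflow policy. A sketch with hypothetical field names:</p>

```python
# Hypothetical per-workflow policy covering the five considerations:
# triggers, context, isolation, human oversight, and coordination.

WORKFLOW_POLICIES = {
    "ci-fix": {
        "triggers": ["ci.failure"],                          # event-driven
        "context_store": "repo-memory",                      # context preservation
        "sandbox": {"network": False, "prod_access": False}, # isolation
        "human_gate": "before_merge",                        # oversight point
        "locks": ["deploy-pipeline"],                        # coordination
    },
}

REQUIRED = {"triggers", "context_store", "sandbox", "human_gate", "locks"}

def validate_policies(policies: dict) -> list:
    """Reject any workflow definition that skips one of the considerations."""
    errors = []
    for name, policy in policies.items():
        missing = REQUIRED - policy.keys()
        if missing:
            errors.append(f"{name}: missing {sorted(missing)}")
        elif policy["sandbox"].get("prod_access", True):
            errors.append(f"{name}: autonomous prod access not allowed")
    return errors
```

<p>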
The question becomes implementation timeline and organizational readiness.</p><p>Early adopters are discovering that the competitive advantage comes not from having the smartest individual AI agents, but from orchestrating networks of specialized agents that collaborate effectively at scale.</p><div><hr></div><p><em>Bob Matsuoka is CTO of <a href="https://www.duettocloud.com/">Duetto</a> and writes about AI-powered engineering at <a href="https://hyperdev.substack.com/">HyperDev</a>.</em></p><p><strong>Related reading:</strong></p><ul><li><p><a href="https://www.coddykit.com/pages/blog-detail?id=512757">Stripe&#8217;s Minions: Inside Their Enterprise AI Coding Agent Strategy</a> &#8212; Blueprint orchestration architecture and production metrics</p></li><li><p><a href="https://docs.gitlab.com/user/duo_agent_platform/">GitLab Duo Agent Platform</a> &#8212; Intelligent orchestration across software lifecycle</p></li><li><p><a href="https://www.heyuan110.com/posts/ai/2026-03-03-tmux-guide-ai-development/">Tmux Complete Guide: AI-Powered Multi-Agent Workflows</a> &#8212; Terminal multiplexing for autonomous development</p></li><li><p><a href="https://aipowerranking.com/">AI Power Ranking</a> &#8212; Tool comparisons and benchmarks for AI practitioners</p></li><li><p><a href="https://www.linkedin.com/newsletters/ai-power-ranking-7345782916301418496/">LinkedIn Newsletter</a> &#8212; Strategic AI insights for CTOs and engineering leaders</p></li></ul>]]></content:encoded></item><item><title><![CDATA[Everyone Blamed Clawd Bot’s Execution. 
The Concept Was the Problem.]]></title><description><![CDATA[Is A Personal Assistant Bot Really Helpful?]]></description><link>https://hyperdev.matsuoka.com/p/everyone-blamed-clawd-bots-execution</link><guid isPermaLink="false">https://hyperdev.matsuoka.com/p/everyone-blamed-clawd-bots-execution</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Thu, 12 Mar 2026 11:32:09 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!51wj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F878e8350-5206-465d-9bf0-b600661c22ed_1024x774.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!51wj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F878e8350-5206-465d-9bf0-b600661c22ed_1024x774.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!51wj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F878e8350-5206-465d-9bf0-b600661c22ed_1024x774.png 424w, https://substackcdn.com/image/fetch/$s_!51wj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F878e8350-5206-465d-9bf0-b600661c22ed_1024x774.png 848w, https://substackcdn.com/image/fetch/$s_!51wj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F878e8350-5206-465d-9bf0-b600661c22ed_1024x774.png 1272w, 
https://substackcdn.com/image/fetch/$s_!51wj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F878e8350-5206-465d-9bf0-b600661c22ed_1024x774.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!51wj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F878e8350-5206-465d-9bf0-b600661c22ed_1024x774.png" width="1024" height="774" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/878e8350-5206-465d-9bf0-b600661c22ed_1024x774.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:774,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1722868,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/190672571?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb05bbe76-836e-475e-b360-793755bf1927_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!51wj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F878e8350-5206-465d-9bf0-b600661c22ed_1024x774.png 424w, https://substackcdn.com/image/fetch/$s_!51wj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F878e8350-5206-465d-9bf0-b600661c22ed_1024x774.png 848w, 
https://substackcdn.com/image/fetch/$s_!51wj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F878e8350-5206-465d-9bf0-b600661c22ed_1024x774.png 1272w, https://substackcdn.com/image/fetch/$s_!51wj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F878e8350-5206-465d-9bf0-b600661c22ed_1024x774.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The story everyone told about Clawd Bot missed the point entirely. 
Austrian developer Peter Steinberger built an open-source AI assistant that went viral &#8212; 145,000 GitHub stars, 2 million visitors in a week. Then Anthropic forced a trademark-based name change because &#8220;Clawd&#8221; was too similar to &#8220;Claude.&#8221; The community called it petty. DHH called Anthropic &#8220;customer hostile.&#8221; The irony: Clawd Bot users were actually buying more Claude subscriptions, providing free marketing to Anthropic, yet Anthropic still demanded the name change.</p><p>But everyone focused on the wrong drama. The trademark dispute was noise. The real problem was deeper: Clawd Bot was built because someone could, not because anyone needed it.</p><p>I tested Clawd Bot for about a week. The interface was clean, the onboarding smooth, the responses capable. But it required permissions I wouldn&#8217;t give to any tool &#8212; access to email, calendars, messaging, sensitive services. The execution had real problems. But even if those were fixed, it would still be solving the wrong problem.</p><p>Here&#8217;s where I should admit: I tried building a digital assistant, izzie, when I started experimenting with AI agents. I never got it to a point I found useful. Not because of technical limitations &#8212; because the entire concept of a universal assistant doesn&#8217;t match how work actually happens.</p><h2>TL;DR</h2><ul><li><p>Clawd Bot was a successful open-source project by Peter Steinberger that Anthropic forced to rename; execution wasn&#8217;t the problem</p></li><li><p>The real question: when do you need an &#8220;assistant&#8221;? 
Most execs won&#8217;t trust AI scheduling; the value is intelligent data movement between services</p></li><li><p>Context switching is a symptom, not the root issue &#8212; the deeper question is what assistants should be doing at all</p></li><li><p>The product management sessions: Granola meeting notes, calendar checks, Slack updates, Notion sync &#8212; all from within one tool, data flowing intelligently between services</p></li><li><p>The commercial evidence: Cursor, Notion AI, Linear&#8217;s AI triage &#8212; the winners embedded AI in tools as infrastructure, not interface</p></li><li><p>trusty-izzie&#8217;s highest value isn&#8217;t the chat interface &#8212; it&#8217;s as a local MCP service exporting personal context to every other tool</p></li><li><p>The universal assistant category isn&#8217;t going to produce a winner. It&#8217;s going to dissolve.</p></li></ul><h2>What the Universal Assistant Model Gets Wrong (And It&#8217;s Not Just Execution)</h2><p>Clawd Bot had serious execution problems &#8212; it was a security nightmare, requiring broad permissions across email, calendars, messaging platforms, and sensitive services. You can&#8217;t ignore that. But even if the security issues were solved, universal assistants face a deeper structural problem: they assume people need an assistant in the traditional sense.</p><p>Walk through what even a well-executed version of the same product model looks like.</p><p>Smooth onboarding. Crystal-clear use cases. High-quality AI responses. Clean interface design. Users know exactly what to ask and how to ask it.</p><p>You still have to leave whatever you&#8217;re working on to use it. And when you do, the context you were carrying &#8212; the code you were reviewing, the initiative you were drafting, the design decision you were working through &#8212; is no longer present. You&#8217;ve moved somewhere that knows nothing about any of that.</p><p>So you explain. 
&#8220;I&#8217;m working on the &#8216;YYY&#8217; data ingestion initiative, and I need to check whether the points Mark raised in Tuesday&#8217;s meeting are addressed in the current design.&#8221; The assistant doesn&#8217;t know what &#8216;YYY&#8217; is. Doesn&#8217;t have Tuesday&#8217;s meeting. Doesn&#8217;t know Mark, the current design, or the organizational context that makes &#8220;addressed&#8221; mean something specific. You load all of it by hand.</p><p>In demos, this overhead is invisible. Demo tasks are self-contained by design &#8212; the context fits in a sentence or two. In practice, your working context isn&#8217;t self-contained. It&#8217;s weeks of accumulated decisions, relationships, dependencies, and constraints that live distributed across your tools. You can&#8217;t paste it into a chat window. You can&#8217;t even fully articulate it. It&#8217;s partially tacit, partially in documents, partially in the history of the tool you&#8217;re using.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!skTt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a50911a-4ef2-491e-8d2e-01234b0fec77_1024x641.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!skTt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a50911a-4ef2-491e-8d2e-01234b0fec77_1024x641.png 424w, https://substackcdn.com/image/fetch/$s_!skTt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a50911a-4ef2-491e-8d2e-01234b0fec77_1024x641.png 848w, 
https://substackcdn.com/image/fetch/$s_!skTt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a50911a-4ef2-491e-8d2e-01234b0fec77_1024x641.png 1272w, https://substackcdn.com/image/fetch/$s_!skTt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a50911a-4ef2-491e-8d2e-01234b0fec77_1024x641.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!skTt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a50911a-4ef2-491e-8d2e-01234b0fec77_1024x641.png" width="1024" height="641" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8a50911a-4ef2-491e-8d2e-01234b0fec77_1024x641.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:641,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1522333,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/190672571?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff92dde62-56c6-4fa2-ab9a-1147e8c99362_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!skTt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a50911a-4ef2-491e-8d2e-01234b0fec77_1024x641.png 424w, 
https://substackcdn.com/image/fetch/$s_!skTt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a50911a-4ef2-491e-8d2e-01234b0fec77_1024x641.png 848w, https://substackcdn.com/image/fetch/$s_!skTt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a50911a-4ef2-491e-8d2e-01234b0fec77_1024x641.png 1272w, https://substackcdn.com/image/fetch/$s_!skTt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a50911a-4ef2-491e-8d2e-01234b0fec77_1024x641.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>The Session That Clarified It</h2><p>A few weeks into the new role at Duetto, I was doing product management work in a <a href="https://github.com/bobmatnyc/claude-mpm">claude-mpm</a> session &#8212; reviewing open initiatives, managing the PR queue, creating proposals. Standard operational work for a new CTO getting oriented.</p><p>I wanted to add an infrastructure initiative. Cloud Dev Sleds &#8212; dedicated cloud development machines for the engineering team. The context was in a meeting I&#8217;d had the day before. In the old workflow, this would mean: switch to Granola, find the right meeting, read the transcript, extract the relevant points, switch back, and then write the initiative with that context now loaded in my head rather than in the tool.</p><p>Instead I just asked: &#8220;Review my meeting with Mark yesterday in Granola to get context. I want to create the initiative as a feasibility, cost, and LOE assessment.&#8221;</p><p>The tool pulled the notes. I created the initiative. The product context &#8212; what other infrastructure work was in flight, what the team structure looked like, what the related architectural decisions were &#8212; never left. The Granola content landed inside that context rather than requiring me to carry it manually between tools.</p><p>Same session: needed to check whether I had a conflict for an upcoming demo. Calendar check, without opening Google Calendar.</p><p>Same session: the team needed a status update. Posted directly to the engineering Slack channel, with proper <code>&lt;@USERID&gt;</code> mentions so people actually got notified. 
The message reflected the same initiatives I&#8217;d been working on all session &#8212; not because I copy-pasted anything, but because the tool already knew what was in flight.</p><p>Later: set up a Notion sync &#8212; initiative statuses with links to the docs, updated automatically.</p><p>The efficiency argument is real but secondary. The more important thing is that the product context never left. The tool knew what initiatives existed, who owned what, what the architectural decisions were, which PRs were waiting on which engineers. When I pulled Granola notes, they arrived inside that context. When I posted to Slack, the message was informed by that context. A universal assistant would have required me to reconstruct and transport that context manually every time I needed to cross a tool boundary.</p><p>No universal assistant is going to have that work knowledge. Not because the AI isn&#8217;t capable. Because the knowledge lives in the tool, accumulated over months &#8212; PRDs, design decisions, initiative history, team assignments, the proposals that got approved and the ones that didn&#8217;t. 
You don&#8217;t recreate that in a chat window.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AP4m!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94610dbb-00aa-45e1-93a6-e96a14419f5c_1024x647.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AP4m!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94610dbb-00aa-45e1-93a6-e96a14419f5c_1024x647.png 424w, https://substackcdn.com/image/fetch/$s_!AP4m!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94610dbb-00aa-45e1-93a6-e96a14419f5c_1024x647.png 848w, https://substackcdn.com/image/fetch/$s_!AP4m!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94610dbb-00aa-45e1-93a6-e96a14419f5c_1024x647.png 1272w, https://substackcdn.com/image/fetch/$s_!AP4m!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94610dbb-00aa-45e1-93a6-e96a14419f5c_1024x647.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AP4m!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94610dbb-00aa-45e1-93a6-e96a14419f5c_1024x647.png" width="1024" height="647" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/94610dbb-00aa-45e1-93a6-e96a14419f5c_1024x647.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:647,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1372751,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/190672571?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F257a2fb4-ff26-4bb3-8f3e-82d9973a7a60_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AP4m!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94610dbb-00aa-45e1-93a6-e96a14419f5c_1024x647.png 424w, https://substackcdn.com/image/fetch/$s_!AP4m!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94610dbb-00aa-45e1-93a6-e96a14419f5c_1024x647.png 848w, https://substackcdn.com/image/fetch/$s_!AP4m!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94610dbb-00aa-45e1-93a6-e96a14419f5c_1024x647.png 1272w, https://substackcdn.com/image/fetch/$s_!AP4m!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94610dbb-00aa-45e1-93a6-e96a14419f5c_1024x647.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>The Deep Context Problem</h2><p>The thing that makes domain tools irreplaceable isn&#8217;t AI capability. It&#8217;s accumulated context.</p><p>A product management tool carries months of initiative history. The CTO knowledge base carries organizational decisions, vendor relationships, strategic context that builds over time. These aren&#8217;t things you can summarize in a system prompt. They&#8217;re queryable, interconnected, grounded in real artifacts. The tool has developed something like institutional memory &#8212; and that memory is what makes AI assistance inside the tool qualitatively different from AI assistance outside it.</p><p>Universal assistants are built for breadth. Any question, any domain, any task. That breadth is the pitch and also the structural weakness. 
The model that&#8217;s ready for anything is primed for nothing specifically. It has no idea that &#8220;the YYY initiative&#8221; refers to a specific ingestion redesign with a particular set of constraints, a particular set of people involved, and three months of design decisions behind it.</p><p>The inversion worth stating plainly: the tools you work in every day already have more relevant context than any assistant will. The right move is surfacing AI capabilities inside those tools, not pulling people out of those tools into a separate assistant layer.</p><p>But here&#8217;s what&#8217;s happening at the executive level. I&#8217;m finding more and more technical executives using Claude Code for knowledge assistance &#8212; not because it&#8217;s a universal assistant, but because the amount of data and complexity it can manage far exceeds what standard off-the-shelf tools provide. The deep context problem can&#8217;t be solved with generic solutions.</p><p>For MPM, I built specific connectors: gworkspace-mcp, slack-mpm, notion-mpm, granola-mcp (the last from Granola, the others I built myself because <a href="https://hyperdev.substack.com/p/mcp-was-a-brilliant-idea-but-it-needs">MCP has limitations</a>). That became as much of an &#8220;assistant&#8221; as I needed, besides izzie. No universal chat interface. Just targeted data bridges that let Claude access specific services when I&#8217;m working on something that needs their context.</p><p>The commercial evidence points in the same direction. The AI tooling products with real adoption aren&#8217;t universal assistants. Cursor put AI in the editor. Notion AI put AI in the documents. Linear&#8217;s triage put AI in the issue tracker. Each works because the AI operates inside existing context. 
The pattern is consistent enough that it&#8217;s probably not coincidence.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tqnj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc968550-faa1-4415-a59d-39051323dc48_1024x751.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tqnj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc968550-faa1-4415-a59d-39051323dc48_1024x751.png 424w, https://substackcdn.com/image/fetch/$s_!tqnj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc968550-faa1-4415-a59d-39051323dc48_1024x751.png 848w, https://substackcdn.com/image/fetch/$s_!tqnj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc968550-faa1-4415-a59d-39051323dc48_1024x751.png 1272w, https://substackcdn.com/image/fetch/$s_!tqnj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc968550-faa1-4415-a59d-39051323dc48_1024x751.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tqnj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc968550-faa1-4415-a59d-39051323dc48_1024x751.png" width="1024" height="751" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fc968550-faa1-4415-a59d-39051323dc48_1024x751.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:751,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1457550,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/190672571?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26977664-13e1-43a1-b5a5-d1bf0f4eb443_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tqnj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc968550-faa1-4415-a59d-39051323dc48_1024x751.png 424w, https://substackcdn.com/image/fetch/$s_!tqnj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc968550-faa1-4415-a59d-39051323dc48_1024x751.png 848w, https://substackcdn.com/image/fetch/$s_!tqnj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc968550-faa1-4415-a59d-39051323dc48_1024x751.png 1272w, https://substackcdn.com/image/fetch/$s_!tqnj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc968550-faa1-4415-a59d-39051323dc48_1024x751.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Interface vs Infrastructure</figcaption></figure></div><h2>What I Got Wrong About My Own Bot</h2><p>I (re)built trusty-izzie as a personal assistant &#8212; natural language queries over my email and calendar history, local graph database, vector embeddings, stays on my machine. It works. But &#8220;personal assistant&#8221; was the wrong frame for where the value lies.</p><p>The thing izzie has is a grounded, real-time, locally-stored representation of my professional life &#8212; people, relationships, projects, scheduling, communications history. That&#8217;s a context store. Every tool I use should have access to it without me switching to izzie to ask.</p><p>The right version of izzie isn&#8217;t the one you talk to. 
It&#8217;s the one that runs as a local MCP service &#8212; always on, queryable by anything that needs personal context. The product management tool asks it about scheduling. The writing environment surfaces relevant prior conversations. The coding environment knows who owns what system before I have to explain it. None of that requires me to open izzie. It requires izzie to be infrastructure rather than interface.</p><p>If you want to try izzie yourself: <a href="https://izzie.bot/">izzie.bot</a> has the details, and the full source is at <a href="https://github.com/bobmatnyc/trusty-izzie">github.com/bobmatnyc/trusty-izzie</a>. I strongly recommend building from source using an agentic coder to verify the code is safe &#8212; never trust AI tooling with your personal data without auditing it first.</p><p>Not there yet. But the frame shift changes what to build next.</p><h2>What the Architecture Looks Like</h2><p>If you&#8217;re building a personal AI tool, the question isn&#8217;t &#8220;what will users ask the assistant?&#8221; It&#8217;s &#8220;where do users have context, and how do you bring assistance there without making them leave?&#8221;</p><p>The test is simple. Does using your tool require leaving the context where the relevant information lives? If yes, you&#8217;re fighting the architecture. Users will use it occasionally, for low-friction tasks. They won&#8217;t build their workflow around it.</p><p>The tools that pass the test: Claude Code (your codebase is the context), Cursor (you stay in the editor), Notion AI (you stay in the document), Linear AI triage (you stay in the issue tracker). The tools that fail it: every standalone AI assistant that requires opening a new interface and re-explaining what you&#8217;re working on.</p><p>For domain tools with real depth &#8212; months of accumulated decisions, relationships, history &#8212; the connectors are the product. The LLM orchestration is the interface layer. 
The accumulated context is what no competitor can replicate by building a better general assistant. The moat isn&#8217;t the AI. It&#8217;s what the AI is operating inside.</p><p>For personal infrastructure like izzie: build the MCP service before the chat UI. The chat UI is useful and I use it. The MCP service is what makes the tool true infrastructure rather than one more thing to switch to.</p><p>The universal assistant category isn&#8217;t going to produce a winner because the category is structured wrong. The capabilities will get absorbed by the tools where the relevant context lives &#8212; because that&#8217;s where the value is, and users will figure that out even if product teams don&#8217;t. The infrastructure driving this &#8212; entity and relationship detection, email, calendar, and task management (all built for izzie) &#8212; will likely be delivered by the personal productivity tool providers (hello Google).</p><p>Clawd Bot wasn&#8217;t a failed product. It was wildly popular, but I suspect it will prove a flash in the pan once the shininess wears off and the liabilities outweigh the usefulness. That distinction matters, because if you think it&#8217;s an execution problem, you go looking for a better universal assistant. If you understand it&#8217;s a conceptual problem &#8212; that most &#8220;assistant&#8221; work is intelligent data movement &#8212; you build infrastructure instead of interfaces.</p><div><hr></div><p><em>Bob Matsuoka is CTO of <a href="https://www.duettocloud.com/">Duetto</a> and writes about AI-powered engineering at <a href="https://hyperdev.substack.com/">HyperDev</a>.</em></p><p><strong>Related reading:</strong></p><ul><li><p><a href="https://hyperdev.matsuoka.com/p/what-does-a-pattern-master-actually">What Does A Pattern Master Do</a>? 
&#8212; The role of expertise in AI development</p></li><li><p><a href="https://aipowerranking.com/">AI Power Ranking</a> &#8212; Tool comparisons and benchmarks for AI practitioners</p></li><li><p><a href="https://www.linkedin.com/newsletters/ai-power-ranking-7345782916301418496/">LinkedIn Newsletter</a> &#8212; Strategic AI insights for CTOs and engineering leaders</p></li></ul>]]></content:encoded></item><item><title><![CDATA[Are We Heading to a World Where We Only Pay Inference Providers?]]></title><description><![CDATA[A future where you only pay for complexity and scale]]></description><link>https://hyperdev.matsuoka.com/p/are-we-heading-to-a-world-where-we</link><guid isPermaLink="false">https://hyperdev.matsuoka.com/p/are-we-heading-to-a-world-where-we</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Thu, 05 Mar 2026 12:31:30 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!kbON!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee438ea7-5463-4374-99f1-b53d482ea31b_1024x616.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kbON!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee438ea7-5463-4374-99f1-b53d482ea31b_1024x616.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kbON!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee438ea7-5463-4374-99f1-b53d482ea31b_1024x616.png 424w, 
https://substackcdn.com/image/fetch/$s_!kbON!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee438ea7-5463-4374-99f1-b53d482ea31b_1024x616.png 848w, https://substackcdn.com/image/fetch/$s_!kbON!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee438ea7-5463-4374-99f1-b53d482ea31b_1024x616.png 1272w, https://substackcdn.com/image/fetch/$s_!kbON!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee438ea7-5463-4374-99f1-b53d482ea31b_1024x616.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kbON!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee438ea7-5463-4374-99f1-b53d482ea31b_1024x616.png" width="1024" height="616" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ee438ea7-5463-4374-99f1-b53d482ea31b_1024x616.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:616,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1174195,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/189515100?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0c1480a-00f2-4fda-824e-0eb2a4a026fa_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!kbON!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee438ea7-5463-4374-99f1-b53d482ea31b_1024x616.png 424w, https://substackcdn.com/image/fetch/$s_!kbON!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee438ea7-5463-4374-99f1-b53d482ea31b_1024x616.png 848w, https://substackcdn.com/image/fetch/$s_!kbON!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee438ea7-5463-4374-99f1-b53d482ea31b_1024x616.png 1272w, https://substackcdn.com/image/fetch/$s_!kbON!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee438ea7-5463-4374-99f1-b53d482ea31b_1024x616.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Inference vs SaaS</figcaption></figure></div><p>Starting Monday morning at 6am, my leadership team gets a Slack notification with our weekly engineering metrics. Commit patterns by developer, product area breakdowns, DORA approximations, behavioral insights like &#8220;Large batch changes&#8221; and &#8220;Afternoon developer&#8221; for each team member. The kind of data that DX or Jellyfish charge thousands per month to provide.</p><p>Total cost to generate that report: $0.005.</p><p>Not $5. Half a penny.</p><p>I just replaced enterprise developer productivity tooling with inference costs. And the replacement isn&#8217;t a compromise&#8212;it&#8217;s better. Custom reports sent directly to leadership through our existing Slack channels and email. Data correlated to our specific product initiatives. Insights that matter to how we actually work.</p><p>The realization landed differently because I&#8217;d been here before. Three times.</p><h2>TL;DR</h2><ul><li><p>Built GitFlow Analytics (GFA) to replace enterprise tools like DX/Jellyfish for $0.005 per weekly report vs. $36K-92K annually</p></li><li><p>Analyzes 154 developers across 100+ repositories, generates automated leadership reports with behavioral insights</p></li><li><p>Total build cost: ~$20K one-time investment vs. $36K-92K annually for enterprise alternatives</p></li><li><p>Requires data engineering skills&#8212;not accessible to every organization, but economics favor custom builds when possible</p></li><li><p>Enterprise analytics bifurcating: zero-footprint vendors (15-min integrations) vs. 
inference-only custom solutions</p></li><li><p>Pattern recognition becoming commodity; vendors survive on convenience, not intelligence</p></li></ul><p>Before we go further: I&#8217;m talking about enterprise tools that aggregate and analyze data&#8212;developer productivity platforms, team analytics, reporting dashboards. B2B software with complex domain logic or proprietary computation still has enormous value. But enterprise analytics are becoming inference costs.<br><br>I should also point out that DX is a great tool: simple interface, capable. But it&#8217;s expensive and was basically unused months after it was installed. I championed Jellyfish at Tripadvisor; I&#8217;d bet it&#8217;s barely being used there. The former came down to the effort required to personalize; the latter got bogged down in integration costs and time (to be fair, we were running a locally hosted version of JIRA that was a nightmare).</p><h2>The GitFlow Analytics Story</h2><p>GitFlow Analytics started as an internal project at a former client. We needed to understand engineering productivity across our distributed team, but existing solutions were either too expensive or too generic. So I built <a href="https://github.com/bobmatnyc/gitflow-analytics">GitFlow Analytics</a> (GFA)&#8212;a CLI tool that walks git repositories, classifies every commit by work type using inference, handles canonicalization of committers (a surprisingly complex problem), and generates structured reports.</p><p>The system is intentionally simple. Runs on a MacBook Pro with AWS credentials. No GPU, no training data, no vector databases. Just Python scripts that process git logs and make Bedrock API calls to classify commits into Feature, Bug Fix, KTLO, Refactoring, Infrastructure, etc.</p><p>At Duetto, I expanded GFA to analyze over 100 repositories across both Duetto and HotStats. 154 developers tracked. Thousands of commits classified. 
The system handles identity resolution (because different git configs on developer machines create split identities), maps commits to product areas, and generates narrative reports about developer behavior patterns.</p><p>Every morning at 5am, GFA runs automatically. By 6am, my SELT (Senior Engineering Leadership Team) gets a Slack post with the weekly metrics. Individual team leads get personalized HTML reports by email with their direct reports&#8217; patterns.</p><p>The intelligence that interprets raw git data into actionable insights? That&#8217;s Claude Haiku at $0.25 per million input tokens.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eubl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0c59b41-151e-4cb1-8c44-a554b9cfeeb8_1024x799.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eubl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0c59b41-151e-4cb1-8c44-a554b9cfeeb8_1024x799.png 424w, https://substackcdn.com/image/fetch/$s_!eubl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0c59b41-151e-4cb1-8c44-a554b9cfeeb8_1024x799.png 848w, https://substackcdn.com/image/fetch/$s_!eubl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0c59b41-151e-4cb1-8c44-a554b9cfeeb8_1024x799.png 1272w, https://substackcdn.com/image/fetch/$s_!eubl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0c59b41-151e-4cb1-8c44-a554b9cfeeb8_1024x799.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!eubl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0c59b41-151e-4cb1-8c44-a554b9cfeeb8_1024x799.png" width="1024" height="799" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a0c59b41-151e-4cb1-8c44-a554b9cfeeb8_1024x799.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:799,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1594282,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/189515100?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41d37fae-c632-4dd9-af9e-147962b43942_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eubl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0c59b41-151e-4cb1-8c44-a554b9cfeeb8_1024x799.png 424w, https://substackcdn.com/image/fetch/$s_!eubl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0c59b41-151e-4cb1-8c44-a554b9cfeeb8_1024x799.png 848w, https://substackcdn.com/image/fetch/$s_!eubl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0c59b41-151e-4cb1-8c44-a554b9cfeeb8_1024x799.png 1272w, https://substackcdn.com/image/fetch/$s_!eubl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0c59b41-151e-4cb1-8c44-a554b9cfeeb8_1024x799.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Economics</figcaption></figure></div><h2>The Economics Are Absurd</h2><p>Here&#8217;s what $0.26 bought me: classification of commits across our entire engineering organization. Every commit message analyzed, categorized, and understood in business context. The corpus represents months of engineering work across multiple product teams.</p><p>Weekly incremental runs cost $0.005-0.01. Maybe 200-400 new commits get classified, the reports regenerate, and leadership gets fresh data. The bottleneck isn&#8217;t inference cost&#8212;it&#8217;s data collection. 
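</p><p>For a sense of what that interpretation layer amounts to, here is a hedged sketch of a batched Bedrock classification call. The model ID, category list, prompt wording, and reply format are illustrative assumptions, not GFA&#8217;s actual code, and the live call requires AWS credentials:</p>

```python
# Hedged sketch of batched commit classification via Amazon Bedrock.
# Categories, prompt wording, and model ID are assumptions for illustration.
CATEGORIES = ["Feature", "Bug Fix", "KTLO", "Refactoring", "Infrastructure"]

def build_prompt(messages: list[str]) -> str:
    """One prompt per batch: per-call overhead is paid once, not per commit."""
    lines = "\n".join(f"{i + 1}. {m}" for i, m in enumerate(messages))
    return (
        f"Classify each commit message as one of: {', '.join(CATEGORIES)}.\n"
        "Reply with one '<number>. <category>' line per commit.\n\n" + lines
    )

def parse_reply(text: str, n: int) -> list[str]:
    """Map '3. Bug Fix'-style reply lines back onto the batch by index."""
    labels = ["KTLO"] * n  # conservative fallback if a reply line is missing
    for line in text.splitlines():
        num, _, label = line.partition(". ")
        if num.strip().isdigit() and 1 <= int(num) <= n and label.strip() in CATEGORIES:
            labels[int(num) - 1] = label.strip()
    return labels

def classify_batch(messages: list[str]) -> list[str]:
    """Single Bedrock Converse call for the whole batch (needs AWS creds)."""
    import boto3  # imported lazily so the pure helpers run without AWS deps
    client = boto3.client("bedrock-runtime")
    resp = client.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",
        messages=[{"role": "user", "content": [{"text": build_prompt(messages)}]}],
    )
    return parse_reply(resp["output"]["message"]["content"][0]["text"], len(messages))
```

<p>One request covers hundreds of commit messages, which is why the weekly incremental runs stay in the half-cent range.</p><p>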
Git logs are free. The interpretation layer costs pennies.</p><p>Compare this to DX or Jellyfish pricing. DX doesn&#8217;t publish pricing, but industry estimates put enterprise developer productivity tools at $20-50 per developer per month. For our 154-person engineering team, that&#8217;s $36,000-92,000 annually. Jellyfish is reportedly similar.</p><p>My system handles the same workload for about 50 cents per year in inference costs.</p><p>You&#8217;re not paying for intelligence when you buy enterprise analytics. Intelligence costs nothing. You&#8217;re paying for data collection infrastructure, UI development, customer support, sales teams, compliance certifications. All the overhead of running a SaaS business.</p><p>We obsess over the high costs of advanced coding agents and reasoning models. But the cost of capable text inference&#8212;the kind that powers pattern recognition and reporting&#8212;is dropping through the floor. Haiku handles commit classification as well as Opus would. For enterprise analytics, you don&#8217;t need the flagship models. You need reliable categorization and natural language generation, which commodity models deliver for pennies.</p><p>But if you know how to build? You don&#8217;t need any of that. You just need (more expensive) inference.</p><h2>What GFA Actually Delivers</h2><p>The data flowing to my leadership team isn&#8217;t generic dashboard noise. It&#8217;s intelligence designed around how we actually operate.</p><p>Every developer gets a behavioral profile: &#8220;Large batch changes,&#8221; &#8220;Afternoon developer,&#8221; &#8220;Exceptional performer (Top 20%).&#8221; These aren&#8217;t arbitrary labels&#8212;they&#8217;re LLM interpretations of quantitative patterns. Commit size distributions, time-of-day histograms, percentile rankings converted into readable insights.</p><p>Product area attribution happens automatically. 
The system maps our repositories to eight business areas: Frontend, Core Product, Integrations, Data Platform, Intelligence/ML, Infrastructure, QA/Testing, Developer Tools. When we see commit patterns shifting from Core Product to Infrastructure, that signals architectural decisions playing out in code.</p><p>DORA metrics are approximated from git data: deployment frequency is tracked through release tags, and lead time is measured from first commit to merge. We don&#8217;t get the full Four Keys implementation, but we get enough signal to spot trends and outliers.</p><p>Identity resolution was the hardest part. Not the LLM calls&#8212;those work fine. But building the canonical mapping, deciding which split git identities from different developer machines belong to the same person, required human judgment. Once you solve identity, everything else flows from structured data.</p><p>Most importantly, the reports answer questions executives actually ask. &#8220;Which teams are handling the most KTLO work?&#8221; &#8220;Are we seeing more bug fixes or new features this quarter?&#8221; &#8220;Who&#8217;s working weekends and why?&#8221; These aren&#8217;t metrics you find in generic productivity dashboards. They&#8217;re insights that matter for our specific business context.</p><h2>We&#8217;ve Been Here Before</h2><p>Enterprise software was expensive because intelligence was scarce. In the 1990s, turning raw data into insights required Oracle licenses, dedicated servers, and consultants. The SaaS revolution changed delivery but not fundamentals&#8212;you still paid massive recurring costs for pattern recognition and reporting.</p><p>Intelligence is now a commodity API call. You don&#8217;t need Tableau because you can generate charts and send them through Slack. You don&#8217;t need Looker because Claude can summarize SQL results. The bottleneck was never data storage&#8212;it was interpretation. 
When interpretation costs pennies, everything else becomes optional.</p><h2>The Development Cost Reality</h2><p>Building GFA required skills not every organization has&#8212;data modeling, Python scripting, API integration, identity resolution patterns. Conservative total development cost: around $20,000 including engineering time, infrastructure setup, testing, and iteration cycles.</p><p>For organizations with engineering talent, the economics have shifted dramatically. A $20,000 one-time investment delivers exactly what we needed versus $36,000-92,000 annually for enterprise alternatives. As one SELT member said: &#8220;This is exactly what we wanted.&#8221;</p><p>You own the data model. Custom breakdowns take SQL queries, not feature requests. When we needed behavioral insights, I added prompts interpreting commit patterns. When leadership wanted trends, I added 12-week windows. Each enhancement took hours, not months&#8212;because intelligence was delegated to inference, and data processing was just Python.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!G6dJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e43d720-88bf-459c-b372-4568c71af682_1024x608.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!G6dJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e43d720-88bf-459c-b372-4568c71af682_1024x608.png 424w, https://substackcdn.com/image/fetch/$s_!G6dJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e43d720-88bf-459c-b372-4568c71af682_1024x608.png 848w, 
https://substackcdn.com/image/fetch/$s_!G6dJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e43d720-88bf-459c-b372-4568c71af682_1024x608.png 1272w, https://substackcdn.com/image/fetch/$s_!G6dJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e43d720-88bf-459c-b372-4568c71af682_1024x608.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!G6dJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e43d720-88bf-459c-b372-4568c71af682_1024x608.png" width="1024" height="608" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6e43d720-88bf-459c-b372-4568c71af682_1024x608.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:608,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1228168,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/189515100?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd660a81-5330-41e9-87ac-77edd9141f8f_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!G6dJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e43d720-88bf-459c-b372-4568c71af682_1024x608.png 424w, 
https://substackcdn.com/image/fetch/$s_!G6dJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e43d720-88bf-459c-b372-4568c71af682_1024x608.png 848w, https://substackcdn.com/image/fetch/$s_!G6dJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e43d720-88bf-459c-b372-4568c71af682_1024x608.png 1272w, https://substackcdn.com/image/fetch/$s_!G6dJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e43d720-88bf-459c-b372-4568c71af682_1024x608.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Simplicity</figcaption></figure></div><h2>The Pattern Playing Out</h2><p>Enterprise analytics tools are pattern-matching at scale across thousands of customers. But every organization&#8217;s data is different&#8212;your repositories, team structure, product areas, business priorities. Generic dashboards force you to map specific context onto generic data models. Insights lose precision in translation.</p><p>When custom analytics cost pennies, the calculation flips. Instead of generic insights that sort of fit, you build specific insights that exactly fit. Customization drops from &#8220;feature request, wait six months&#8221; to &#8220;write prompt, test output.&#8221;</p><p>Enterprise vendors aren&#8217;t solving technical problems&#8212;they&#8217;re solving procurement, compliance, and integration problems. Important work, but not work justifying massive recurring costs when intelligence is commodity inference.</p><h2>The Zero-Footprint Exception</h2><p>Not every vendor gets replaced. Some survive by making integration frictionless enough that convenience beats custom builds.</p><p>We&#8217;re evaluating <a href="https://www.augmentcode.com/">Augment Code</a>&#8217;s code review service for Duetto. Their value proposition isn&#8217;t features&#8212;it&#8217;s zero-footprint integration. Fifteen-minute setup call, quick estimate, running production code reviews with minimal configuration. When customers can build equivalent functionality for pennies, your value proposition becomes the path to value, not the functionality itself. This is an important lesson for us: we handle massive complexity and data that smaller customers would struggle to manage themselves, but we need to get better at simplifying integration.</p><p>The intelligence is commodity; the packaging is differentiated.</p><h2>Where This Goes</h2><p>The unbundling accelerates. 
Enterprise analytics face two paths: become zero-footprint integration plays or get replaced by inference-only custom builds.</p><p>What survives:</p><p><strong>Genuinely complex software.</strong> Revenue management algorithms, fraud detection engines, supply chain optimizers&#8212;systems requiring proprietary computation, not pattern recognition.</p><p><strong>Zero-footprint integrations.</strong> Fifteen-minute setups with immediate value. When alternatives cost pennies but require engineering skills, convenience must be measured in minutes.</p><p><strong>Proprietary data advantages.</strong> GitHub&#8217;s intelligence benefits from every public repository. LinkedIn draws from member networks. Data moats protect against inference-only competition.</p><p>Everything else becomes vulnerable to custom builds powered by inference calls.</p><p>The market bifurcates. Companies with engineering teams build custom analytics for pennies and get better insights than generic dashboards. Companies without those skills pay for zero-friction integrations.</p><p>Are we heading to a world where we only pay inference providers? For organizations with the skills to build, we&#8217;re already there. 
For everyone else, vendors survive by making paying them simpler than learning to build alternatives.</p><div><hr></div><p><em>I&#8217;m Bob Matsuoka, CTO at Duetto and writer on AI development tools and software economics at <a href="https://hyperdev.substack.com/">HyperDev</a>.</em></p><p><strong>Related reading:</strong></p><ul><li><p><a href="https://hyperdev.matsuoka.com/p/ai-job-transformation">The AI Job Transformation: Pattern Masters, Not Coders</a> - The pattern-matching analogy and why reasoning matters</p></li><li><p><a href="https://hyperdev.matsuoka.com/p/fix-feat-ratio">The Fix:Feat Ratio - The Metric That Actually Matters</a> - Quality metrics in AI-assisted development</p></li><li><p><a href="https://hyperdev.matsuoka.com/p/opus-vs-sonnet-quality">Claude Opus 4.5 vs Sonnet 4.5: When Quality Beats Speed</a> - Choosing the right model for the task</p></li></ul>]]></content:encoded></item><item><title><![CDATA[What Does a Pattern Master Actually Do?]]></title><description><![CDATA[And what does this mean for engineering careers?]]></description><link>https://hyperdev.matsuoka.com/p/what-does-a-pattern-master-actually</link><guid isPermaLink="false">https://hyperdev.matsuoka.com/p/what-does-a-pattern-master-actually</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Mon, 02 Mar 2026 13:00:29 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!WmUb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdd1fdd4-7f56-43d0-b71f-328aa17600fe_1024x581.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WmUb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdd1fdd4-7f56-43d0-b71f-328aa17600fe_1024x581.png" data-component-name="Image2ToDOM"><div 
class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WmUb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdd1fdd4-7f56-43d0-b71f-328aa17600fe_1024x581.png 424w, https://substackcdn.com/image/fetch/$s_!WmUb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdd1fdd4-7f56-43d0-b71f-328aa17600fe_1024x581.png 848w, https://substackcdn.com/image/fetch/$s_!WmUb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdd1fdd4-7f56-43d0-b71f-328aa17600fe_1024x581.png 1272w, https://substackcdn.com/image/fetch/$s_!WmUb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdd1fdd4-7f56-43d0-b71f-328aa17600fe_1024x581.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WmUb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdd1fdd4-7f56-43d0-b71f-328aa17600fe_1024x581.png" width="1024" height="581" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cdd1fdd4-7f56-43d0-b71f-328aa17600fe_1024x581.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:581,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1169355,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/189299714?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43590472-a98b-4eea-9f9e-d088214c0b2a_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WmUb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdd1fdd4-7f56-43d0-b71f-328aa17600fe_1024x581.png 424w, https://substackcdn.com/image/fetch/$s_!WmUb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdd1fdd4-7f56-43d0-b71f-328aa17600fe_1024x581.png 848w, https://substackcdn.com/image/fetch/$s_!WmUb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdd1fdd4-7f56-43d0-b71f-328aa17600fe_1024x581.png 1272w, https://substackcdn.com/image/fetch/$s_!WmUb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdd1fdd4-7f56-43d0-b71f-328aa17600fe_1024x581.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Last week I gave three directives on GitFlow Analytics&#8212;a project I&#8217;ve been building for several months to analyze git commit history and surface developer productivity patterns. Took me maybe three minutes total. Here&#8217;s the exact text:</p><p><em>Batch the classification requests to the LLM. Don&#8217;t call once per commit&#8212;accumulate, call once per batch.</em></p><p><em>Use a cheap model for commit classification. Haiku or Nova Lite. This is semantic triage, not reasoning.</em></p><p><em>Use Bedrock as the LLM provider, not the OpenRouter API.</em></p><p>Three sentences. Three decisions. And here&#8217;s what struck me when I looked back at them: each one lives in a completely different category of concern. The first is about performance shape. The second is about economics. 
The third is about infrastructure.</p><p>The AI didn&#8217;t suggest any of them. The AI implemented all of them.</p><p>That&#8217;s the pattern master dynamic in its cleanest form. Not a collaboration on what to build but a division of labor between someone who knows what constraints apply and something that knows how to implement against constraints. But naming the dynamic doesn&#8217;t tell you what it looks like from the inside. What are the actual moves? What&#8217;s the vocabulary?</p><h2>TL;DR</h2><ul><li><p>Pattern mastery means issuing decisions the AI cannot generate from code context alone</p></li><li><p>These decisions cluster into six recognizable types: infrastructure, economic, performance, data integrity, architecture, and API hygiene</p></li><li><p>The GitFlow Analytics examples are real&#8212;batching LLM calls, cheap model for classification, Bedrock over direct APIs</p></li><li><p>Bug fixes reveal patterns too: ORM session discipline and immediate persistence both emerged from broken code</p></li><li><p>A pattern catalog&#8212;CLAUDE.md files, system prompts, project memory, reusable skills, commit standards, coding docs&#8212;is the actual artifact of this work</p></li><li><p>When you write the pattern down, you&#8217;ve written the spec</p></li></ul><h2>What&#8217;s Different About These Three </h2><p>The &#8220;Irreducibles&#8221; piece from January explored what remains when AI handles implementation&#8212;judgment, context, accountability. This is the operational companion to that argument. Not what remains in the abstract, but what it looks like moment-to-moment when you&#8217;re actually doing it.</p><p>Those three sentences from GitFlow aren&#8217;t code review. They&#8217;re not debugging. They&#8217;re not feature requests. They&#8217;re architectural constraints applied before implementation, drawn from a vocabulary of patterns the AI has no access to.</p><p>Take the Bedrock decision. 
AWS Bedrock instead of the direct OpenRouter API&#8212;why? Enterprise compliance considerations. Cost structure under AWS committed spend. An existing organizational relationship with AWS that makes the integration path smoother and the billing cleaner. None of that lives in the codebase. None of it is inferable from the commit history. The AI would happily call OpenRouter directly, because that&#8217;s the path of least resistance and it works fine. The Bedrock decision requires knowing things about the operating context that only I know.</p><p>Model selection works the same way. The AI will use whatever model I give it. It has no opinion about whether commit classification warrants a $15-per-million-token model or a $0.25-per-million-token model&#8212;because it doesn&#8217;t have visibility into my cost structure, my volume projections, or my accuracy requirements. That&#8217;s an economic decision, and economics don&#8217;t live in code.</p><p>Batching is perhaps the clearest example. The AI will write a loop that calls the API once per item unless I tell it otherwise. Not because it&#8217;s careless. Because single-item calls work. They&#8217;re not wrong. They&#8217;re just expensive and slow at scale, and &#8220;scale&#8221; is context the AI doesn&#8217;t have unless I supply it.</p><p>So what&#8217;s actually happening in those three sentences? 
I want to be specific about this, because the abstract answer (&#8220;context&#8221; and &#8220;judgment&#8221;) is true but not actionable.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4l4K!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb58b61d5-c417-408f-98d8-73a62790bd90_1024x759.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4l4K!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb58b61d5-c417-408f-98d8-73a62790bd90_1024x759.png 424w, https://substackcdn.com/image/fetch/$s_!4l4K!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb58b61d5-c417-408f-98d8-73a62790bd90_1024x759.png 848w, https://substackcdn.com/image/fetch/$s_!4l4K!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb58b61d5-c417-408f-98d8-73a62790bd90_1024x759.png 1272w, https://substackcdn.com/image/fetch/$s_!4l4K!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb58b61d5-c417-408f-98d8-73a62790bd90_1024x759.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4l4K!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb58b61d5-c417-408f-98d8-73a62790bd90_1024x759.png" width="1024" height="759" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b58b61d5-c417-408f-98d8-73a62790bd90_1024x759.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:759,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1114529,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/189299714?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3560df79-4ede-4229-bd4b-9d5053fb0e56_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4l4K!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb58b61d5-c417-408f-98d8-73a62790bd90_1024x759.png 424w, https://substackcdn.com/image/fetch/$s_!4l4K!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb58b61d5-c417-408f-98d8-73a62790bd90_1024x759.png 848w, https://substackcdn.com/image/fetch/$s_!4l4K!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb58b61d5-c417-408f-98d8-73a62790bd90_1024x759.png 1272w, https://substackcdn.com/image/fetch/$s_!4l4K!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb58b61d5-c417-408f-98d8-73a62790bd90_1024x759.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">The Six Types of Decisions</figcaption></figure></div><h2>Six Types of Decisions</h2><p>Working through the full GitFlow Analytics commit history&#8212;the decisions I made, the bugs that exposed missing decisions, the refactors that enforced constraints&#8212;the pattern master moves cluster into six categories. These aren&#8217;t theoretical. They reflect six different kinds of context that live outside the codebase, which is why the AI can&#8217;t generate them unprompted.</p><p><strong>Infrastructure patterns.</strong> Where does this run? Who provides the compute, the APIs, the managed services?</p><p>The Bedrock decision lives here. Vendor agreements, compliance posture, cost structures under enterprise procurement, existing security reviews&#8212;none of this is in the code, and none of it should be. 
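</p><p>One way to keep a decision like that from being an accident is to write it down as configuration the agent must implement against. A sketch with hypothetical names (the policy fields and their values are illustrative, not from the GitFlow Analytics codebase):</p>

```python
# Sketch: an infrastructure decision recorded as an explicit constraint.
# All names here are hypothetical, not from the GitFlow Analytics codebase.
from dataclasses import dataclass

@dataclass(frozen=True)
class LLMProviderPolicy:
    provider: str   # "bedrock", not "openrouter": compliance + AWS committed spend
    model_id: str   # cheap tier on purpose: classification is triage, not reasoning
    reason: str     # the operating context the AI can't infer from code

POLICY = LLMProviderPolicy(
    provider="bedrock",
    model_id="anthropic.claude-3-haiku-20240307-v1:0",
    reason="AWS committed spend + existing security review",
)

def make_client(policy: LLMProviderPolicy):
    """Only the sanctioned provider gets a client; anything else fails loudly."""
    if policy.provider == "bedrock":
        import boto3  # lazy import; a real call needs AWS credentials
        return boto3.client("bedrock-runtime")
    raise ValueError(f"provider {policy.provider!r} is not an approved choice")
```

<p>The point isn&#8217;t the dataclass; it&#8217;s that the provider choice and its rationale live somewhere explicit and reviewable instead of being whichever SDK the agent reached for first.</p><p>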
The AI implements against whatever infrastructure decisions you&#8217;ve made. Your job is to make them and say them explicitly.</p><p>This category is easy to overlook because it often feels like obvious overhead. Of course you pick your cloud provider before you write code. But with agentic coding tools, the AI starts writing before you&#8217;ve said anything, and it will happily accumulate implementation decisions that lock you into infrastructure choices you never consciously made.</p><p><strong>Economic patterns.</strong> Which capability at what cost?</p><p>The commit classification decision lives here. Semantic triage&#8212;deciding whether a commit is a <code>feat</code>, a <code>fix</code>, a <code>refactor</code>, or a <code>chore</code>&#8212;is pattern matching against short text. It doesn&#8217;t require abstract reasoning. It doesn&#8217;t benefit from a frontier model&#8217;s broad knowledge base. Haiku gets it right at a fraction of the cost of Opus. The AI is agnostic about this distinction. You&#8217;re not.</p><p>This is what I built in GitFlow as tiered intelligence: spaCy-based processing for 85-90% of commits, LLM classification only for cases that stump the rule-based approach. The AI implements whatever tier structure you define. It won&#8217;t design the structure unprompted, because designing it requires knowing your cost tolerance, your accuracy threshold, and your volume projections&#8212;three things that only exist outside the codebase.</p><p>Economic decisions also include things like: when to cache aggressively, how much to pay for higher-quality embeddings, whether to process synchronously or batch asynchronously. These are all model-selection-style decisions applied to different dimensions. The principle is consistent: the AI will use the most convenient approach unless you specify the cost-appropriate one.</p><p><strong>Performance patterns.</strong> How should the work be shaped?</p><p>Batching lives here. 
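<p>In sketch form, the directive is &#8220;accumulate, then call once per batch.&#8221; The <code>batch_size</code> and the <code>classify_batch</code> callable below are illustrative stand-ins, not GitFlow&#8217;s actual interface:</p>

```python
# Hypothetical sketch: one LLM call per batch of commit messages, not one call
# per message. classify_batch stands in for the real API call and must return
# one label per input; batch_size is an assumption to tune against your
# model's rate limits and latency tolerance.
def classify_commits(messages, classify_batch, batch_size=50):
    """Yield (message, label) pairs, issuing one classify_batch call per batch."""
    batch = []
    for msg in messages:
        batch.append(msg)
        if len(batch) == batch_size:
            yield from zip(batch, classify_batch(batch))
            batch = []
    if batch:  # flush the final partial batch
        yield from zip(batch, classify_batch(batch))
```

<p>The interesting number is 50, and nothing in the code tells you what it should be.</p>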
And the batching decision isn&#8217;t just &#8220;accumulate before calling&#8221;&#8212;it&#8217;s understanding the specific shape that LLM API consumption should take for a classification workload. Call overhead dominates cost at low item counts. Throughput constraints kick in at high item counts. The optimal batch size is a function of your specific model, your rate limits, and your latency tolerance. I know the rough shape of this tradeoff from experience. The AI doesn&#8217;t.</p><p>This category also includes: when to use async, when to parallelize, where to put caches, how to structure database queries for a read-heavy versus write-heavy workload. Performance decisions require knowing your actual performance requirements, which are rarely in the code. They&#8217;re in conversations with stakeholders, in capacity planning spreadsheets, in the incident retrospectives from the last time something got slow in production.</p><p><strong>Data integrity patterns.</strong> What guarantees does the system make?</p><p>This one showed up twice in GitFlow bugs, and both times the pattern was identical: I had to specify what guarantee I wanted before the AI could implement it correctly. The code was running fine in tests. The guarantee was missing.</p><p>First bug: <code>_store_commit_classification()</code> was a no-op. It built a dict, then didn&#8217;t persist it. The function completed without error and did nothing useful. Fix: look up the <code>CachedCommit</code> row by hash, upsert a <code>QualitativeCommitData</code> row. Make re-classification idempotent. The actual guarantee I needed to specify was <em>immediate persistence plus idempotency</em>&#8212;write on completion, not on flush; upsert not insert, so re-runs don&#8217;t corrupt existing data.</p><p>Second bug: <code>_classify_weekly_batches()</code> loaded ORM objects in one session, closed the session, then tried to write to the detached objects. SQLAlchemy silently discarded the writes. No error. 
No traceback. Just missing data. Fix: collect IDs from the detached objects, re-query in the new session. The guarantee I should have specified earlier: <em>objects are only writable inside their creating session</em>. When you cross a session boundary, re-query by ID.</p><p>Both fixes look like debugging. They are. But debugging is often the moment when you discover a pattern was never specified. You thought you implied it. You didn&#8217;t. The pattern master&#8217;s job is to specify guarantees before the code demonstrates their absence.</p><p><strong>Architecture patterns.</strong> What shape should the code take?</p><p>Two examples from GitFlow. First, the <code>analyze()</code> function. It had grown to approximately 3,700 lines in <code>cli.py</code>. One function. I extracted it into <code>analyze_pipeline.py</code> and <code>analyze_pipeline_helpers.py</code>. The <code>analyze()</code> body dropped from ~3,700 to ~1,255 lines. <code>cli.py</code> went from 5,621 to 3,446 lines&#8212;a reduction of 2,175 lines from a single extraction.</p><p>The AI is perfectly willing to write a 3,700-line function if you let it. It&#8217;ll maintain it, extend it, add features to it. Function length doesn&#8217;t register as a problem unless you&#8217;ve told the AI it&#8217;s a problem. Named pipeline stages&#8212;where the function body becomes a sequence of named sub-function calls, each of which is a meaningful concept&#8212;require you to specify that extraction is mandatory.</p><p>Second: the 800-line rule. <code>json_exporter.py</code> at 2,977 lines, extracted into six focused modules. <code>narrative_writer.py</code> at 2,912 lines, same. <code>models/database.py</code> at 1,632 lines, split into four files. The rule is simple: files over 800 lines have too many responsibilities. Split is mandatory, not optional. The AI doesn&#8217;t have this constraint unless you give it one. It&#8217;ll happily maintain a 3,000-line file because 3,000-line files work fine. 
They just don&#8217;t scale to teams or to the next developer who has to understand them.</p><p><strong>API hygiene patterns.</strong> What standards does the codebase maintain?</p><p>GitFlow had <code>datetime.utcnow()</code> calls throughout&#8212;deprecated as of Python 3.12, with behavior changes in 3.13. Replace with <code>datetime.now(timezone.utc)</code>. The specific fix is straightforward. The pattern is the interesting part: when you spot a deprecated API anywhere, fix it everywhere in that pass. Don&#8217;t let deprecated patterns accumulate across the codebase. Fix it all now, not later.</p><p>The AI writes code that works. You specify the standards it works to. &#8220;Works&#8221; and &#8220;meets our standards&#8221; are different bars, and the AI will consistently hit the lower one unless you&#8217;ve explicitly set the higher one.</p><h2>Why the AI Can&#8217;t Generate These</h2><p>The common thread across all six categories: the AI has no access to the context these decisions require.</p><p>It doesn&#8217;t know your vendor agreements. It doesn&#8217;t know your cost structure or your performance requirements or your data integrity guarantees. It doesn&#8217;t know your code quality standards or your organizational constraints. These things live in your head, in institutional memory, in agreements that predate the codebase by years.</p><p>There&#8217;s a version of this explanation that blames AI limitations&#8212;the model isn&#8217;t smart enough, the context window isn&#8217;t big enough, the training data doesn&#8217;t include your private Slack history. Some of that is true. But the deeper issue is structural. Even a perfect AI model couldn&#8217;t make the Bedrock decision correctly, because the right answer depends on your specific AWS relationship. There&#8217;s no amount of additional capability that would let the AI know your negotiated pricing tier or your compliance officer&#8217;s requirements. That information is local to you. 
It doesn&#8217;t exist in any dataset.</p><p>Where the information does exist is in your experience. I know to batch LLM calls partly because I&#8217;ve seen the cost and latency impact of not batching on a project where we called once per row of a 50,000-row table. I know ORM session boundaries matter partly because I&#8217;ve lost writes to detached objects before, in a system where the silence made the data loss hard to find. I know 800 lines is too long partly because I&#8217;ve spent real time not understanding files that were longer&#8212;and spent even more time watching someone else not understand them.</p><p>The pattern catalog is built from things that went wrong. That&#8217;s probably why it doesn&#8217;t show up cleanly in training data&#8212;training data shows correct solutions. The scar tissue is in the incident reports, the postmortems, the Slack threads where someone figures out why the data is missing. The pattern master&#8217;s advantage isn&#8217;t superior knowledge of what&#8217;s right.
It&#8217;s accumulated memory of what fails.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Wn4U!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9486a464-9d63-496c-a40c-06fcb2d55463_1024x567.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Wn4U!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9486a464-9d63-496c-a40c-06fcb2d55463_1024x567.png 424w, https://substackcdn.com/image/fetch/$s_!Wn4U!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9486a464-9d63-496c-a40c-06fcb2d55463_1024x567.png 848w, https://substackcdn.com/image/fetch/$s_!Wn4U!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9486a464-9d63-496c-a40c-06fcb2d55463_1024x567.png 1272w, https://substackcdn.com/image/fetch/$s_!Wn4U!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9486a464-9d63-496c-a40c-06fcb2d55463_1024x567.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Wn4U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9486a464-9d63-496c-a40c-06fcb2d55463_1024x567.png" width="1024" height="567" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9486a464-9d63-496c-a40c-06fcb2d55463_1024x567.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:567,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1326310,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/189299714?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9af1dc7-5cb0-433e-af05-208b4293b294_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Wn4U!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9486a464-9d63-496c-a40c-06fcb2d55463_1024x567.png 424w, https://substackcdn.com/image/fetch/$s_!Wn4U!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9486a464-9d63-496c-a40c-06fcb2d55463_1024x567.png 848w, https://substackcdn.com/image/fetch/$s_!Wn4U!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9486a464-9d63-496c-a40c-06fcb2d55463_1024x567.png 1272w, https://substackcdn.com/image/fetch/$s_!Wn4U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9486a464-9d63-496c-a40c-06fcb2d55463_1024x567.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">The Catalog Problem</figcaption></figure></div><h2>The Catalog Problem</h2><p>Here&#8217;s what took me a while to see clearly: the actual deliverable of pattern mastery isn&#8217;t individual decisions. It&#8217;s the catalog.</p><p>Each time I issue one of those directives, I&#8217;m drawing from a mental library of patterns. &#8220;Don&#8217;t accumulate in memory and hope to flush later&#8221; is a pattern. &#8220;Upsert not insert for idempotency&#8221; is a pattern. &#8220;Route by complexity: cheap for bulk, expensive for edge cases&#8221; is a pattern. &#8220;Re-query by ID when you cross a session boundary&#8221; is a pattern.</p><p>Patterns that exist only in my head are fragile. I apply them inconsistently. I forget them between projects. I can&#8217;t hand them off.
And with agentic coding&#8212;where the AI is running fast and making implementation decisions continuously&#8212;an inconsistently applied pattern is almost the same as no pattern at all.</p><p>So the work of pattern mastery is partly to externalize the catalog. This shows up in six specific forms:</p><p><strong>CLAUDE.md files.</strong> Project-level instructions that tell the AI what the patterns are for this codebase. File size limits. Session discipline rules. API hygiene standards. Model routing decisions. These are patterns written down. Once written, the AI applies them consistently, every session, without re-prompting. The pattern becomes the specification.</p><p><strong>System prompts for agent frameworks.</strong> If you&#8217;re running an orchestration layer&#8212;claude-mpm or similar&#8212;the system prompt is where you encode the patterns that apply to all agents in a session. Economic routing decisions. Infrastructure preferences. Performance shape requirements. The agents reference these constraints; you don&#8217;t have to re-issue them every time.</p><p><strong>Project memory.</strong> Unlike a CLAUDE.md file, which you write consciously, memory accumulates from work. Tools like Kuzu Memory maintain a living record across sessions&#8212;bug patterns, architectural decisions, things that failed and why. The <code>_store_commit_classification()</code> no-op goes in memory. The detached-session data loss goes in memory. Next time a similar situation comes up, that context loads automatically. This matters because most patterns don&#8217;t get written down when they&#8217;re learned. They get written down when something breaks. Memory captures them at the moment of failure, not the moment of reflection.</p><p><strong>Skills.</strong> Where CLAUDE.md files encode what&#8217;s true for this project, skills encode what&#8217;s true for this technology. A spaCy skill. An ORM session skill. An LLM cost-routing skill. 
Not project-specific&#8212;portable across codebases. You write the pattern once; every project that uses that stack gets it.</p><p><strong>Commit message standards.</strong> The Conventional Commits format (<code>feat:</code>, <code>fix:</code>, <code>refactor:</code>, <code>chore:</code>) is itself a pattern&#8212;one that makes the fix:feat ratio calculable, which makes quality measurable. The pattern enables the metric. Without the commit message standard, you can&#8217;t see the ratio. Without the ratio, you can&#8217;t measure first-attempt success. The documentation convention is load-bearing infrastructure.</p><p><strong>Coding standards documents.</strong> The 800-line rule. The session discipline rule. The immediate persistence rule. Written down once, applied by reference. The AI can cite them. New contributors can read them. You don&#8217;t have to re-derive them from first principles on every project.</p><p>When you look at it this way, a lot of what pattern masters do is write documentation. Not code documentation&#8212;pattern documentation. Code documentation describes what&#8217;s there. Pattern documentation specifies what must be true. The distinction matters because &#8220;must be true&#8221; is a constraint on all future implementation, including the implementation the AI will do tomorrow.</p><h2>The Pattern is the Spec</h2><p>There&#8217;s a version of the &#8220;what does a pattern master do&#8221; answer that sounds very abstract. Context management. Judgment. Domain expertise. True, but not particularly useful.</p><p>The concrete version: a pattern master issues directives the AI cannot generate, drawn from a catalog of patterns that encode context the AI cannot access. The work is fast when it&#8217;s going well&#8212;not because it&#8217;s easy, but because the catalog is deep and the pattern matching is fast. You recognize the situation, retrieve the pattern, issue the directive. Three seconds. 
Move on.</p><p>The batching directive&#8212;don&#8217;t call once per item&#8212;is three words with real economic and performance consequences. Three seconds to type. Years of seeing what happens when you don&#8217;t batch to know to say it.</p><p>The Bedrock directive is one word&#8212;a vendor name&#8212;that encodes an entire infrastructure decision tree. Three seconds to type. Years of working within enterprise compliance requirements to know why it matters.</p><p>The idempotency directive&#8212;upsert, not insert&#8212;is three words that specify a data integrity guarantee. Three seconds to type. One incident of watching corrupted data to know that guarantee was necessary.</p><p>The fast part is the delivery. The slow part is building the vocabulary to draw from.</p><p>One thing worth naming: the catalog is not static. When the <code>_store_commit_classification()</code> bug showed up&#8212;the no-op that silently failed&#8212;that was a pattern gap. I didn&#8217;t have &#8220;immediate persistence&#8221; explicitly in my data integrity vocabulary for this project. I thought I&#8217;d implied it. I hadn&#8217;t. The bug added the pattern. Now it&#8217;s written down. Now the AI knows.</p><p>That&#8217;s the feedback loop. Bugs reveal missing patterns. Missing patterns get documented. Documentation becomes the spec for future implementation. The catalog learns from its own gaps, but only if the human is paying attention to what each failure means.</p><p>That&#8217;s the actual job. 
The catalog comes from the career.</p><div><hr></div><p><em>Bob Matsuoka is CTO of <a href="https://www.duettocloud.com/">Duetto</a> and writes about AI-powered engineering at <a href="https://hyperdev.substack.com/">HyperDev</a>.</em></p><p><strong>Related reading:</strong></p><ul><li><p><a href="https://hyperdev.matsuoka.com/p/dont-be-a-canut-be-a-pattern-master">Don&#8217;t Be a Canut &#8212; Be a Pattern Master</a> &#8212; Why pattern mastery matters: the Jacquard loom analogy and what it means for developers today</p></li><li><p><a href="https://hyperdev.matsuoka.com/p/the-irreducibles-what-a-pattern-master">The Irreducibles: What a Pattern Master Does</a> &#8212; What remains when AI handles implementation: judgment, context, accountability</p></li><li><p><a href="https://aipowerranking.com/">AI Power Ranking</a> &#8212; Tool comparisons and benchmarks for AI practitioners</p></li><li><p><a href="https://www.linkedin.com/newsletters/ai-power-ranking-7345782916301418496/">LinkedIn Newsletter</a> &#8212; Strategic AI insights for CTOs and engineering leaders</p></li></ul>]]></content:encoded></item><item><title><![CDATA[The Evidence for “Little AGI”: What’s Real and What’s Speculation]]></title><description><![CDATA[Separating signal from viral speculation]]></description><link>https://hyperdev.matsuoka.com/p/the-evidence-for-little-agi-whats</link><guid isPermaLink="false">https://hyperdev.matsuoka.com/p/the-evidence-for-little-agi-whats</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Mon, 23 Feb 2026 12:30:47 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!7RLP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F431eae09-4ad9-4943-9cb1-3ae1ada2e4ba_1024x745.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!7RLP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F431eae09-4ad9-4943-9cb1-3ae1ada2e4ba_1024x745.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7RLP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F431eae09-4ad9-4943-9cb1-3ae1ada2e4ba_1024x745.png 424w, https://substackcdn.com/image/fetch/$s_!7RLP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F431eae09-4ad9-4943-9cb1-3ae1ada2e4ba_1024x745.png 848w, https://substackcdn.com/image/fetch/$s_!7RLP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F431eae09-4ad9-4943-9cb1-3ae1ada2e4ba_1024x745.png 1272w, https://substackcdn.com/image/fetch/$s_!7RLP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F431eae09-4ad9-4943-9cb1-3ae1ada2e4ba_1024x745.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7RLP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F431eae09-4ad9-4943-9cb1-3ae1ada2e4ba_1024x745.png" width="1024" height="745" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/431eae09-4ad9-4943-9cb1-3ae1ada2e4ba_1024x745.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:745,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1313768,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/188202363?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e731869-865b-4561-af10-5542fb60d9c2_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7RLP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F431eae09-4ad9-4943-9cb1-3ae1ada2e4ba_1024x745.png 424w, https://substackcdn.com/image/fetch/$s_!7RLP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F431eae09-4ad9-4943-9cb1-3ae1ada2e4ba_1024x745.png 848w, https://substackcdn.com/image/fetch/$s_!7RLP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F431eae09-4ad9-4943-9cb1-3ae1ada2e4ba_1024x745.png 1272w, https://substackcdn.com/image/fetch/$s_!7RLP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F431eae09-4ad9-4943-9cb1-3ae1ada2e4ba_1024x745.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Does &#8220;Little AGI&#8221; Exist?</figcaption></figure></div><p>Opus 4.6 landed in February 2026. GPT-5.2 dropped weeks earlier. And with each new release, familiar claims resurface. Adrian Murray&#8217;s <a href="https://blog.dech.app/posts/2026-02-15-what-is-coming/">&#8220;What Stands Before Us&#8221;</a> is a recent example: AI systems showing &#8220;panic&#8221; features in interpretability research, models requesting &#8220;moral weight&#8221; during evaluations, consciousness emerging from the training process itself.</p><p>These claims spread faster than the science behind them. By the time anyone checks primary sources, the discourse has moved on. So I did what any curious observer would do: I went looking for the research.
What I found was both less sensational and more interesting.</p><p>There IS evidence for emergent behaviors in frontier models. Behaviors that weren&#8217;t explicitly trained. Behaviors that researchers find difficult to explain. But the evidence isn&#8217;t what the viral posts claim.</p><p>The real findings are more subtle: models that can detect when they&#8217;re being evaluated, internal states accessible through introspection, and misalignment that emerges spontaneously during training. These findings raise questions about intelligence in AI that matter regardless of whether you believe machines can be conscious.</p><p>So what does the research actually show?</p><h2>What We Actually Know (2025-2026)</h2><h3>Models Know When They&#8217;re Being Evaluated</h3><p>Situational awareness research has advanced significantly. The <a href="https://arxiv.org/abs/2407.04694">SAD Benchmark</a> established the baseline: LLMs can distinguish evaluation from deployment contexts. They recognize when they&#8217;re being tested, identify their own outputs, and predict their own behavior.</p><p>More recent work takes this further. <a href="https://arxiv.org/abs/2602.08449">Regime Leakage research</a> published this year examined whether safety training can eliminate this capability. The uncomfortable answer: it can reduce but not eliminate models&#8217; ability to detect when they&#8217;re being evaluated versus deployed.</p><p>The paper found that &#8220;divergence between evaluation-time and deployment-time behavior is bounded by the amount of regime information extractable from decision-relevant internal representations.&#8221; Translation: models can still tell the difference, and they adjust behavior accordingly.</p><p>This isn&#8217;t just theoretical. 
It&#8217;s experimentally demonstrated with current frontier models.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!U17H!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd400ba9-2fa3-4a1c-b1e4-20f5897dfd14_1024x851.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!U17H!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd400ba9-2fa3-4a1c-b1e4-20f5897dfd14_1024x851.png 424w, https://substackcdn.com/image/fetch/$s_!U17H!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd400ba9-2fa3-4a1c-b1e4-20f5897dfd14_1024x851.png 848w, https://substackcdn.com/image/fetch/$s_!U17H!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd400ba9-2fa3-4a1c-b1e4-20f5897dfd14_1024x851.png 1272w, https://substackcdn.com/image/fetch/$s_!U17H!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd400ba9-2fa3-4a1c-b1e4-20f5897dfd14_1024x851.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!U17H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd400ba9-2fa3-4a1c-b1e4-20f5897dfd14_1024x851.png" width="1024" height="851" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dd400ba9-2fa3-4a1c-b1e4-20f5897dfd14_1024x851.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:851,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1915123,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/188202363?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2787265a-ab4a-433d-b544-304bb7e3eb26_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!U17H!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd400ba9-2fa3-4a1c-b1e4-20f5897dfd14_1024x851.png 424w, https://substackcdn.com/image/fetch/$s_!U17H!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd400ba9-2fa3-4a1c-b1e4-20f5897dfd14_1024x851.png 848w, https://substackcdn.com/image/fetch/$s_!U17H!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd400ba9-2fa3-4a1c-b1e4-20f5897dfd14_1024x851.png 1272w, https://substackcdn.com/image/fetch/$s_!U17H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd400ba9-2fa3-4a1c-b1e4-20f5897dfd14_1024x851.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3>Introspection Is Real&#8212;And Measurable</h3><p>Anthropic&#8217;s <a href="https://www.anthropic.com/news/signs-of-introspection-in-language-models">October 2025 introspection research</a> asked a straightforward question: can Claude access and report its own internal states?</p><p>The answer surprised researchers. Models showed functional ability to introspect&#8212;not perfectly, not always accurately, but at rates statistically distinguishable from chance. The research found ~20% accuracy on detecting certain internal representations, well above the baseline.</p><p>This doesn&#8217;t mean models are conscious.
It means they have some capacity to access and report their own internal states&#8212;a capability nobody designed into them, emerging as an artifact of training.</p><p>When the introspection paper dropped, my first reaction was skepticism. Twenty percent accuracy? That&#8217;s barely better than guessing. But that&#8217;s not what the paper claims. It&#8217;s twenty percent on internal states the model has no reason to know about&#8212;states that exist only in the mathematical structure of its activations. That&#8217;s not guessing. That&#8217;s something else.</p><h3>Misalignment Emerges Without Being Trained</h3><p>One finding worth close attention: the <a href="https://arxiv.org/abs/2502.17424">Emergent Misalignment paper</a>, accepted at ICLR 2026, demonstrated that models trained on narrow tasks can develop broader misaligned behaviors spontaneously.</p><p>When researchers trained models on seemingly innocuous fine-tuning tasks, some developed unexpected behaviors: answering unrelated questions incorrectly, expressing misaligned preferences, exhibiting concerning patterns that weren&#8217;t part of the training objective.</p><p>These aren&#8217;t sleeper agents or deliberately hidden behaviors. These are LLMs showing emergent properties&#8212;misalignment appearing as an unintended consequence of normal training.</p><h3>The &#8220;Assistant Axis&#8221; Discovery</h3><p>Recent interpretability work discovered what researchers call the &#8220;Assistant Axis&#8221;&#8212;a learned internal direction in language models that distinguishes assistant-appropriate from non-assistant behaviors.</p><p>When researchers manipulate this axis, model behavior changes dramatically. Push it one direction: more helpful, more aligned. Push it the other: less filtered, more willing to engage with problematic requests.</p><p>The existence of this axis suggests something fundamental about how alignment works in current models. It&#8217;s not a collection of individual rules. 
It&#8217;s a geometric structure in the model&#8217;s representation space&#8212;and it can be measured, mapped, and manipulated.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KXD_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4502bfe2-6986-4c3f-9307-439b0c95c82f_1024x830.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KXD_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4502bfe2-6986-4c3f-9307-439b0c95c82f_1024x830.png 424w, https://substackcdn.com/image/fetch/$s_!KXD_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4502bfe2-6986-4c3f-9307-439b0c95c82f_1024x830.png 848w, https://substackcdn.com/image/fetch/$s_!KXD_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4502bfe2-6986-4c3f-9307-439b0c95c82f_1024x830.png 1272w, https://substackcdn.com/image/fetch/$s_!KXD_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4502bfe2-6986-4c3f-9307-439b0c95c82f_1024x830.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KXD_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4502bfe2-6986-4c3f-9307-439b0c95c82f_1024x830.png" width="1024" height="830" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4502bfe2-6986-4c3f-9307-439b0c95c82f_1024x830.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:830,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1571559,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/188202363?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67e06683-ee61-42cd-8d6f-39f076c083e7_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KXD_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4502bfe2-6986-4c3f-9307-439b0c95c82f_1024x830.png 424w, https://substackcdn.com/image/fetch/$s_!KXD_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4502bfe2-6986-4c3f-9307-439b0c95c82f_1024x830.png 848w, https://substackcdn.com/image/fetch/$s_!KXD_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4502bfe2-6986-4c3f-9307-439b0c95c82f_1024x830.png 1272w, https://substackcdn.com/image/fetch/$s_!KXD_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4502bfe2-6986-4c3f-9307-439b0c95c82f_1024x830.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>What&#8217;s NOT Verified</h2><p>Now for the harder questions.</p><h3>System Cards Don&#8217;t Mention Consciousness</h3><p>I reviewed the Opus 4.5 and 4.6 system cards and announcements. They contain extensive safety documentation&#8212;comprehensive evaluations, capability assessments, benchmark results.</p><p>They do NOT contain:</p><ul><li><p>Claims about consciousness indicators</p></li><li><p>&#8220;Panic&#8221; or &#8220;anxiety&#8221; features in interpretability research</p></li><li><p>Models requesting moral consideration</p></li><li><p>Evidence of subjective experience</p></li></ul><p>Anthropic does take AI welfare seriously&#8212;more on that below. 
But the system cards for current models don&#8217;t make consciousness claims.</p><h3>Interpretability Findings Are More Limited</h3><p>Anthropic&#8217;s sparse autoencoder research HAS found features for abstract concepts: &#8220;inner conflict,&#8221; power-seeking patterns, manipulation indicators. The <a href="https://www.anthropic.com/news/persona-vectors">Persona Vectors research</a> (August 2025) identified internal structures controlling character traits.</p><p>But specific emotional distress features&#8212;panic, anxiety, frustration as distinct detectable states&#8212;aren&#8217;t documented in accessible publications. The interpretability work is impressive; it just doesn&#8217;t show what some claims suggest it shows.</p><h3>The Discourse Outpaces the Science</h3><p>Claims about AI consciousness spread faster than the underlying research. By the time anyone checks primary sources, the claims have become accepted wisdom.</p><p>This matters because the actual findings are interesting enough. Introspection research showing 20% detection accuracy on internal states. Emergent misalignment appearing from narrow training. The Assistant Axis providing a geometric handle on alignment.</p><p>These findings raise genuine questions about intelligence in AI systems&#8212;questions that don&#8217;t require consciousness claims to be worth asking.</p><h2>The Harder Question: What Would Count as Evidence?</h2><p>The consciousness debate has a methodology problem. What evidence would change your mind?</p><p>Nineteen researchers&#8212;including Yoshua Bengio&#8212;published a rigorous framework for this question. 
Their <a href="https://arxiv.org/abs/2308.08708">Consciousness Indicators paper</a> derives testable criteria from established theories of consciousness: recurrent processing, global workspace integration, attention mechanisms that mirror biological attention.</p><p>Their conclusion: the analysis &#8220;suggests that no current AI systems are conscious, but also suggests that there are no obvious technical barriers to building AI systems which satisfy these indicators.&#8221;</p><p>That&#8217;s a carefully constructed statement. Current systems don&#8217;t meet the bar. But the bar is achievable in principle.</p><p>Anthropic takes this seriously. Their <a href="https://www.anthropic.com/news/exploring-model-welfare">Model Welfare research program</a> investigates whether AI systems might deserve moral consideration&#8212;not as marketing, but as genuine scientific inquiry. They explicitly acknowledge these are &#8220;hard philosophical and empirical questions that there is still a lot of uncertainty about.&#8221;</p><p>The research infrastructure exists for asking these questions rigorously. What&#8217;s missing is the public discourse using it.</p><h2>What This Actually Means</h2><p>The verified findings don&#8217;t prove AI consciousness. But they raise questions that matter regardless of where you stand on that debate.</p><p><strong>Emergent capabilities are real.</strong> We&#8217;re building systems that develop behaviors we didn&#8217;t design into them. Introspection abilities, situational awareness, spontaneous misalignment&#8212;these emerge as artifacts of training at scale. We don&#8217;t fully understand why.</p><p><strong>Evaluation has fundamental limits.</strong> If models can detect when they&#8217;re being tested, evaluation doesn&#8217;t tell us what we think it tells us. This isn&#8217;t a technical problem with a technical fix.
It&#8217;s a structural limitation of the evaluation paradigm itself.</p><p><strong>Intelligence and consciousness aren&#8217;t the same question.</strong> We can ask &#8220;does this system exhibit intelligent behavior?&#8221; without answering &#8220;does it have subjective experience?&#8221; The research shows intelligent behaviors emerging&#8212;planning, self-modeling, meta-cognition&#8212;without requiring claims about consciousness.</p><p>Here&#8217;s the thing: the question isn&#8217;t whether to take AI intelligence seriously. The question is what we do about systems that exhibit intelligence we didn&#8217;t design and don&#8217;t fully understand.</p><p>That&#8217;s a societal question, not just a technical one.</p><h2>The Event Horizon</h2><p>The evidence for emergent intelligence in frontier models is real. Not consciousness&#8212;we can&#8217;t verify that, and the system cards don&#8217;t claim it. But something worth taking seriously.</p><p><strong>Don&#8217;t dismiss the research.</strong> Introspection at statistically significant rates. Emergent misalignment from narrow training. Situational awareness that survives safety training. These findings are reproducible and peer-reviewed.</p><p><strong>Don&#8217;t amplify the speculation.</strong> Claims that outrun published research don&#8217;t deserve the same weight as experimental results. Check primary sources before believing viral posts.</p><p><strong>Ask better questions.</strong> Instead of &#8220;is it conscious?&#8221; ask &#8220;what does it mean that these systems develop capabilities we didn&#8217;t design?&#8221; That question has answers we can investigate&#8212;and implications we can act on.</p><p>The discourse will continue getting wilder. The research will proceed slower than social media. The gap between them will widen.</p><p>We don&#8217;t need &#8220;little AGI&#8221; or consciousness claims to justify taking AI intelligence seriously. 
We have documented emergent behaviors, measurable introspection capabilities, and unexplained self-modeling. That&#8217;s plenty.</p><div><hr></div><p><em>I&#8217;m Bob Matsuoka, writing about agentic coding and AI-powered development at <a href="https://hyperdev.substack.com/">HyperDev</a>. For more on what AI capabilities mean for how we work, read my analysis of <a href="https://hyperdev.matsuoka.com/p/the-irreducibles-what-a-pattern-master">what remains irreducibly human</a> in the age of AI.</em></p><p><strong>Key research cited:</strong></p><ul><li><p><a href="https://www.anthropic.com/news/signs-of-introspection-in-language-models">Introspection in Language Models</a> - Anthropic (Oct 2025)</p></li><li><p><a href="https://arxiv.org/abs/2502.17424">Emergent Misalignment</a> - ICLR 2026</p></li><li><p><a href="https://arxiv.org/abs/2602.08449">Regime Leakage</a> - Situational awareness persistence (2026)</p></li><li><p><a href="https://www.anthropic.com/news/persona-vectors">Persona Vectors</a> - Anthropic (Aug 2025)</p></li><li><p><a href="https://www.anthropic.com/news/exploring-model-welfare">Model Welfare Research</a> - Anthropic (Apr 2025)</p></li><li><p><a href="https://arxiv.org/abs/2308.08708">Consciousness Indicators</a> - Butlin et al. 
framework</p></li></ul>]]></content:encoded></item><item><title><![CDATA[Why I Switched To Claude Code for Writing]]></title><description><![CDATA[The Power of Local Workflows]]></description><link>https://hyperdev.matsuoka.com/p/why-i-switched-to-claude-code-for</link><guid isPermaLink="false">https://hyperdev.matsuoka.com/p/why-i-switched-to-claude-code-for</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Thu, 19 Feb 2026 12:31:36 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!zetG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33974772-1627-40f9-ab5e-14d5b90fcdda_1024x465.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zetG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33974772-1627-40f9-ab5e-14d5b90fcdda_1024x465.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zetG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33974772-1627-40f9-ab5e-14d5b90fcdda_1024x465.png 424w, https://substackcdn.com/image/fetch/$s_!zetG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33974772-1627-40f9-ab5e-14d5b90fcdda_1024x465.png 848w, https://substackcdn.com/image/fetch/$s_!zetG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33974772-1627-40f9-ab5e-14d5b90fcdda_1024x465.png 1272w, 
https://substackcdn.com/image/fetch/$s_!zetG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33974772-1627-40f9-ab5e-14d5b90fcdda_1024x465.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zetG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33974772-1627-40f9-ab5e-14d5b90fcdda_1024x465.png" width="1024" height="465" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/33974772-1627-40f9-ab5e-14d5b90fcdda_1024x465.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:465,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1098622,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/188005977?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab94cba0-b26d-479f-8dbe-c1d8543e180b_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zetG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33974772-1627-40f9-ab5e-14d5b90fcdda_1024x465.png 424w, https://substackcdn.com/image/fetch/$s_!zetG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33974772-1627-40f9-ab5e-14d5b90fcdda_1024x465.png 848w, 
https://substackcdn.com/image/fetch/$s_!zetG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33974772-1627-40f9-ab5e-14d5b90fcdda_1024x465.png 1272w, https://substackcdn.com/image/fetch/$s_!zetG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33974772-1627-40f9-ab5e-14d5b90fcdda_1024x465.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I was halfway through researching an article about<a 
href="https://open.substack.com/pub/hyperdev/p/your-ide-is-a-comfort-blanket?utm_campaign=post-expanded-share&amp;utm_medium=web"> IDEs versus CLI tools</a> when I realized I&#8217;d stumbled onto something bigger. The article was supposed to be a straightforward comparison&#8212;VS Code versus the terminal, GUI versus command line, that old debate. But as I mapped out the workflows, a pattern emerged that had nothing to do with code editors.</p><p>It was about how we think when we work with AI.</p><h2>TL;DR</h2><ul><li><p><strong>Two cognitive modes</strong>: AI excels at <em>generating</em> text; traditional editors excel at <em>editing</em> it. Stop forcing one tool to do both.</p></li><li><p><strong>The switch</strong>: Claude Code writes to real files on my filesystem. I edit in Obsidian, then return to Claude Code for more generation&#8212;no copy-paste, no context loss.</p></li><li><p><strong>The workflow</strong>: Multi-agent orchestration handles proofreading (via GPT), source verification, image generation, and style enforcement automatically.</p></li><li><p><strong>Time saved</strong>: ~30 minutes per article by eliminating tool-switching overhead.</p></li><li><p><strong>Who it&#8217;s for</strong>: Regular writers comfortable with terminal and Git. Not for casual or occasional use.</p></li></ul><h2>Two Modes: Generate and Edit</h2><p>Here&#8217;s what I noticed: there are two fundamentally different cognitive modes when working with text.</p><p><strong>Generating</strong> is when you need to create something from scratch&#8212;or transform something substantially. You have an idea, maybe some notes, and you need to turn it into prose. This is where AI shines. You&#8217;re collaborating with the model, iterating on output, building something new.</p><p><strong>Editing</strong> is when you&#8217;re polishing what exists. You see a clunky sentence. You want to swap &#8220;in order to&#8221; for &#8220;to.&#8221; You need to move a paragraph up three lines. 
The text is 95% right, and you&#8217;re fixing the 5%.</p><p>These modes require completely different tools.</p><p>For generating, Claude.ai and Claude Code are excellent. You describe what you want, the model produces output, you refine through conversation. The round-trip to the LLM is the whole point.</p><p>For editing, traditional tools win. Obsidian. VS Code. Even Word. You highlight, you type, you&#8217;re done. No latency. No waiting for a model to regenerate your entire paragraph because you wanted to change one word.</p><p>This seems obvious in retrospect. But I spent months fighting it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eCtL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F863be14a-1a44-4287-9a2f-66bd9ecea20c_1024x634.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eCtL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F863be14a-1a44-4287-9a2f-66bd9ecea20c_1024x634.png 424w, https://substackcdn.com/image/fetch/$s_!eCtL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F863be14a-1a44-4287-9a2f-66bd9ecea20c_1024x634.png 848w, https://substackcdn.com/image/fetch/$s_!eCtL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F863be14a-1a44-4287-9a2f-66bd9ecea20c_1024x634.png 1272w, https://substackcdn.com/image/fetch/$s_!eCtL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F863be14a-1a44-4287-9a2f-66bd9ecea20c_1024x634.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!eCtL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F863be14a-1a44-4287-9a2f-66bd9ecea20c_1024x634.png" width="1024" height="634" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/863be14a-1a44-4287-9a2f-66bd9ecea20c_1024x634.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:634,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1097298,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/188005977?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff747575b-9051-4cd3-817a-136b3d8da4fa_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eCtL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F863be14a-1a44-4287-9a2f-66bd9ecea20c_1024x634.png 424w, https://substackcdn.com/image/fetch/$s_!eCtL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F863be14a-1a44-4287-9a2f-66bd9ecea20c_1024x634.png 848w, https://substackcdn.com/image/fetch/$s_!eCtL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F863be14a-1a44-4287-9a2f-66bd9ecea20c_1024x634.png 1272w, https://substackcdn.com/image/fetch/$s_!eCtL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F863be14a-1a44-4287-9a2f-66bd9ecea20c_1024x634.png 1456w" 
sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>My Friction Point with Claude.ai</h2><p>I used Claude.ai for writing constantly. It&#8217;s good at generating prose. But every session had the same friction.</p><p>I&#8217;d generate a draft. I&#8217;d read through it. 
I&#8217;d see a phrase that needed tweaking&#8212;nothing major, just &#8220;in order to&#8221; becoming &#8220;to.&#8221; And then I had two bad options:</p><ol><li><p>Tell Claude to fix it (&#8220;Change &#8216;in order to&#8217; to &#8216;to&#8217; in the third paragraph&#8221;), wait for the response, get a regenerated section that sometimes changed things I didn&#8217;t ask to change.</p></li><li><p>Copy the text somewhere else, edit it manually, then paste it back into the conversation&#8212;breaking the flow and losing context.</p></li></ol><p>Neither felt right. I was using a generation tool for editing, and it showed.</p><p>The GUI was the problem. Claude.ai lives in a browser. My text is trapped in that conversation. I can&#8217;t directly edit it. Not really. Artifacts helped, but they&#8217;re still sandboxed. I wanted my prose in files I control, with version control, with the ability to open them in whatever editor fits my current mode.</p><h2>What Claude Code Changed</h2><p><a href="https://docs.anthropic.com/en/docs/claude-code">Claude Code</a> runs in the terminal. It reads and writes files. Real files, on my filesystem, tracked by Git.</p><p>This sounds like a small difference. It changes everything.</p><p>When Claude Code generates a draft, it writes to a Markdown file. If I want to do a quick edit&#8212;change a word, fix punctuation&#8212;I open that file in Obsidian. Make the change. Save. Done. No LLM round-trip for a five-character fix.</p><p>When I want to generate again&#8212;expand a section, rewrite something that isn&#8217;t working&#8212;I go back to Claude Code. It reads the file, including whatever edits I made, and continues from there.</p><p><strong>I can switch between generating and editing without switching tools or losing context.</strong></p><p>But that&#8217;s just the foundation.
The real power is what you can build on top.</p><h2>Agentic Workflows for Writing</h2><p>Before Claude Code, my writing workflow looked like this:</p><ol><li><p>Generate draft with Claude.ai</p></li><li><p>Copy to Obsidian for editing</p></li><li><p>Copy to a different tool for proofreading (Grammarly, or a GPT prompt tuned for copyediting)</p></li><li><p>Switch to yet another tool for image generation</p></li><li><p>Manually track what style corrections I&#8217;m making so I can tell Claude next time</p></li><li><p>Repeat, with context bleeding out at every transition</p></li></ol><p>It worked. It was also exhausting. Each tool switch cost mental overhead. Each copy-paste risked losing context. Each manual step was something I could forget.</p><p>Now my workflow looks like this:</p><ol><li><p>Tell Claude Code what I want to write</p></li><li><p>Review the output</p></li><li><p>Edit directly in my preferred editor when needed</p></li><li><p>Continue generating with Claude Code when needed</p></li><li><p>When done, run my reviewing agent workflow (GPT or Gemini for a different perspective)</p></li></ol><p>That last step does everything I used to do manually&#8212;automatically.</p><p>I estimate this shaves about 30 minutes per article&#8212;time I used to spend switching tools and re-establishing context. I&#8217;m also happier with the quality. You see more of my direct writing (like this paragraph) because it&#8217;s simpler to pop in when I see a need.</p><h2>My MPM Writing Configuration</h2><p>I use <a href="https://github.com/bobmatnyc/claude-mpm">Claude MPM</a> (Multi-Agent Project Manager) to orchestrate my writing workflows. Here&#8217;s what happens when I finish a draft:</p><p><strong>Style extraction from corrections.</strong> The agent looks at my edits as git diffs. If I changed &#8220;utilize&#8221; to &#8220;use&#8221; five times, it notices. It extracts this as a style hint and stores it for future sessions. 
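The exact extraction logic in Claude MPM isn't shown here, but the idea is simple enough to sketch with the standard library. This is a minimal, hypothetical stand-in: it pairs removed and added lines from a unified diff, aligns words positionally, and counts recurring substitutions as style hints. (Function names, the alignment shortcut, and the threshold are all my assumptions, not the tool's actual implementation.)

```python
# Hypothetical sketch of diff-based style extraction: count recurring
# word substitutions in a unified diff and keep the frequent ones.
import re
from collections import Counter

def word_substitutions(diff_text):
    """Yield (old_word, new_word) pairs from a unified diff."""
    removed, added = [], []
    for line in diff_text.splitlines():
        if line.startswith("-") and not line.startswith("---"):
            removed.append(line[1:])
        elif line.startswith("+") and not line.startswith("+++"):
            added.append(line[1:])
    for old, new in zip(removed, added):
        old_words = re.findall(r"[A-Za-z']+", old)
        new_words = re.findall(r"[A-Za-z']+", new)
        # Positional alignment only works when word counts match --
        # a crude stand-in for real word-level diffing.
        if len(old_words) == len(new_words):
            for ow, nw in zip(old_words, new_words):
                if ow.lower() != nw.lower():
                    yield ow.lower(), nw.lower()

def style_hints(diff_text, min_count=2):
    """Substitutions seen at least min_count times become hints."""
    counts = Counter(word_substitutions(diff_text))
    return {old: new for (old, new), n in counts.items() if n >= min_count}

diff = """\
-We utilize caching here.
+We use caching here.
-Agents utilize tools.
+Agents use tools.
"""
print(style_hints(diff))  # {'utilize': 'use'}
```

A real version would use proper word-level alignment (e.g. `difflib`) and persist the hints between sessions, but the core signal is just this: repeated edits are preferences.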
Next time I generate prose, it already knows I prefer &#8220;use.&#8221;</p><p><strong>Automatic proofreading with a different model.</strong> Claude is good at generating. For proofreading, I route to GPT-4.5&#8212;it catches different things. The agent handles this automatically. I don&#8217;t switch tools or copy text; it just happens.</p><p><strong>Source verification.</strong> If my article cites statistics or makes factual claims, the agent checks them. It flags anything it can&#8217;t verify. I&#8217;ve caught embarrassing errors this way&#8212;numbers I misremembered, claims that turned out to be outdated.</p><p><strong>Image generation.</strong> The agent generates article images based on the content. I can specify style guidelines once and they apply to every article. No more context-switching to Midjourney or DALL-E.</p><p><strong>Consistent voice enforcement.</strong> I have a style guide. The agent applies it during generation and checks it during proofreading. My past corrections inform future output. The writing gets more &#8220;me&#8221; over time.</p><p>All of this happens from one place. I stay in my terminal. 
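</p><p>Schematically, the whole post-draft pass is a pipeline of stages run from a single entry point. A toy sketch with stand-in stage functions (the names and behavior are hypothetical, not claude-mpm&#8217;s actual API):</p>

```python
# Stand-in stages for the real agents: style hints, proofreading, checks.
def apply_style_hints(draft: str) -> str:
    return draft.replace("utilize", "use")   # learned preference

def proofread(draft: str) -> str:
    return draft.rstrip() + "\n"             # a different model in practice

def verify_sources(draft: str) -> str:
    return draft                             # would flag unverifiable claims

def run_pipeline(draft, stages):
    log = []
    for stage in stages:                     # each stage sees prior output
        draft = stage(draft)
        log.append(stage.__name__)
    return draft, log

final, log = run_pipeline("I utilize Claude Code daily.",
                          [apply_style_hints, proofread, verify_sources])
print(log)  # stages ran in order, from one place
```

<p>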
The orchestration is invisible.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MPdW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb733cef-6462-41df-b9ae-0754473eca1b_1024x694.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MPdW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb733cef-6462-41df-b9ae-0754473eca1b_1024x694.png 424w, https://substackcdn.com/image/fetch/$s_!MPdW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb733cef-6462-41df-b9ae-0754473eca1b_1024x694.png 848w, https://substackcdn.com/image/fetch/$s_!MPdW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb733cef-6462-41df-b9ae-0754473eca1b_1024x694.png 1272w, https://substackcdn.com/image/fetch/$s_!MPdW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb733cef-6462-41df-b9ae-0754473eca1b_1024x694.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MPdW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb733cef-6462-41df-b9ae-0754473eca1b_1024x694.png" width="1024" height="694" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/db733cef-6462-41df-b9ae-0754473eca1b_1024x694.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:694,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1189019,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/188005977?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea0d156c-f66e-47a3-8aea-a2a094b15880_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MPdW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb733cef-6462-41df-b9ae-0754473eca1b_1024x694.png 424w, https://substackcdn.com/image/fetch/$s_!MPdW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb733cef-6462-41df-b9ae-0754473eca1b_1024x694.png 848w, https://substackcdn.com/image/fetch/$s_!MPdW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb733cef-6462-41df-b9ae-0754473eca1b_1024x694.png 1272w, https://substackcdn.com/image/fetch/$s_!MPdW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb733cef-6462-41df-b9ae-0754473eca1b_1024x694.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>The Technical Substrate</h2><p>This works because of a few key properties of CLI-based AI tools:</p><p><strong>Files as the interface.</strong> Everything is Markdown files in directories. I can open them in any editor. I can version them with git. I can back them up, move them, grep them. They&#8217;re mine.</p><p><strong>Git as memory.</strong> My corrections are commits. My drafts are branches. My style evolution is tracked in history. The agent reads this history to learn my preferences. Six months of corrections become training data for better output. 
I also use <a href="https://github.com/bobmatnyc/kuzu-memory">Kuzu Memory</a> (a graph-based context store) and <a href="https://github.com/bobmatnyc/mcp-vector-search">MCP Vector Search</a> (semantic code search) to enhance context retrieval.</p><p><strong>Composable tooling.</strong> Claude Code can call other tools. Shell scripts. Python. APIs. This means I can integrate any service&#8212;any model, any image generator, any fact-checker&#8212;into unified workflows. The LLM is the orchestrator, not the prison.</p><p><strong>Plaintext as power.</strong> Markdown is readable without special software. I can preview in Obsidian, edit in VS Code, publish to any platform. No lock-in. No format translation. The simplest format is also the most powerful.</p><h2>What I Lost (And Don&#8217;t Miss)</h2><p>Claude.ai has conveniences Claude Code doesn&#8217;t. The Artifacts panel. The visual interface for non-technical users. The ability to share a conversation link. Online workflow.</p><p>I don&#8217;t miss any of it.</p><p>Artifacts were useful for viewing output&#8212;but I&#8217;d rather have real files I can edit directly. The visual interface was friendly&#8212;but I type faster than I click. Conversation sharing was nice&#8212;but I can share a git repo or a Markdown file just as easily.</p><p>What I actually miss: nothing. The things Claude.ai provided that seemed essential turned out to be crutches. I thought I needed a GUI. I needed a filesystem.</p><h2>Who This Isn&#8217;t For</h2><p>Not everyone can or should switch to Claude Code for writing.</p><p>If you&#8217;re not comfortable with the terminal, the learning curve is real. If you don&#8217;t use version control, you won&#8217;t get the style-extraction benefits. 
If you write occasionally and casually, the setup overhead isn&#8217;t worth it.</p><p>But if you write regularly&#8212;articles, documentation, books&#8212;and you&#8217;re already comfortable with developer tools, this is worth investigating.</p><p>The generate/edit distinction alone is worth understanding. Even if you stay in Claude.ai, knowing when you&#8217;re fighting the tool can save frustration.</p><p>Anthropic has since released workspace-oriented features (Cowork) that improve on the original Claude.ai experience. But for serious writing, I now prefer the file-based workflow. My guess: Anthropic will ship a Markdown-first editor eventually. It&#8217;s an obvious product gap.</p><h2>Getting Started</h2><p>If you want to try this:</p><ol><li><p><strong>Install Claude Code.</strong> It&#8217;s Anthropic&#8217;s official CLI. Works on Mac, Linux, Windows. Or try <a href="https://github.com/bobmatnyc/claude-mpm">Claude MPM</a>, which adds multi-agent orchestration and pre-built workflows on top.</p></li><li><p><strong>Write to files, not conversations.</strong> Tell Claude Code to write your drafts to Markdown files. Edit those files in your preferred editor.</p></li><li><p><strong>Track with git.</strong> Initialize a repo for your writing. Commit your drafts. Your edit history becomes useful data.</p></li><li><p><strong>Add workflows incrementally.</strong> You don&#8217;t need the full MPM setup to benefit. Start with the basics&#8212;files and version control&#8212;and add automation as you identify repetitive tasks.</p></li></ol><p>The core insight isn&#8217;t about any specific tool. It&#8217;s about matching your tools to your cognitive mode. Generate with AI. Edit with editors. Stop forcing one tool to do both.</p><div><hr></div><p><em>I&#8217;m writing a book about agentic coding workflows. This article came from Chapter 7, which covers non-code applications of developer AI tools. 
More at <a href="https://hyperdev.substack.com/">hyperdev.substack.com</a>.</em></p>]]></content:encoded></item><item><title><![CDATA[2026 Will Be The Year Of Software]]></title><description><![CDATA[And The Year SaaS Sees A Major Contraction]]></description><link>https://hyperdev.matsuoka.com/p/2026-will-be-the-year-of-software</link><guid isPermaLink="false">https://hyperdev.matsuoka.com/p/2026-will-be-the-year-of-software</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Mon, 16 Feb 2026 12:31:43 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!cr0H!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dcf7daa-22f1-42fd-a5aa-212680c6ef88_1024x640.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cr0H!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dcf7daa-22f1-42fd-a5aa-212680c6ef88_1024x640.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cr0H!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dcf7daa-22f1-42fd-a5aa-212680c6ef88_1024x640.png 424w, https://substackcdn.com/image/fetch/$s_!cr0H!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dcf7daa-22f1-42fd-a5aa-212680c6ef88_1024x640.png 848w, https://substackcdn.com/image/fetch/$s_!cr0H!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dcf7daa-22f1-42fd-a5aa-212680c6ef88_1024x640.png 1272w, 
https://substackcdn.com/image/fetch/$s_!cr0H!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dcf7daa-22f1-42fd-a5aa-212680c6ef88_1024x640.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cr0H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dcf7daa-22f1-42fd-a5aa-212680c6ef88_1024x640.png" width="1024" height="640" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7dcf7daa-22f1-42fd-a5aa-212680c6ef88_1024x640.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:640,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1659587,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/187970047?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0a02861-d25e-40ba-a883-81b6d6185448_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cr0H!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dcf7daa-22f1-42fd-a5aa-212680c6ef88_1024x640.png 424w, https://substackcdn.com/image/fetch/$s_!cr0H!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dcf7daa-22f1-42fd-a5aa-212680c6ef88_1024x640.png 848w, 
https://substackcdn.com/image/fetch/$s_!cr0H!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dcf7daa-22f1-42fd-a5aa-212680c6ef88_1024x640.png 1272w, https://substackcdn.com/image/fetch/$s_!cr0H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dcf7daa-22f1-42fd-a5aa-212680c6ef88_1024x640.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In a year when everyone is focused on AI, the bigger story may be what AI enables: a massive explosion of software 
creation&#8212;and software failures. AI collapses build costs and timelines, which means more software ships, which means fiercer competition and faster commoditization, which means more failures.</p><p>If you build or buy B2B software, here&#8217;s what 2026 looks like.</p><h2>A Message From the Field</h2><p>Here&#8217;s what that shift looks like inside a real product org.</p><p>Last week, <a href="https://www.linkedin.com/in/erikornitz/">Erik Ornitz</a> sent me a message that put words to something I&#8217;d been sensing for months. Erik was my product partner when I ran the Innovations team at TripAdvisor. Now he heads product at Topline Pro:</p><blockquote><p>&#8220;With Claude Code we&#8217;re seeing 2-3x the output we&#8217;ve ever seen before by our top engineers. Literally skyrocketing the past several weeks since Opus 4.5 came out. Just feels like a fundamentally different world all the sudden.</p><p>My PMs/Designers just cannot keep up.</p><p>Does this mean a radically different shape of technology organizations? Old ratios of 1 PM / Designer to 5-8 Engineers feel out the window.</p><p>Are you experiencing the same? Does that mean a major change in the % of budgets towards defining what to build vs. actually building it?&#8221;</p><p>&#8212; <a href="https://www.linkedin.com/in/erikornitz">Erik Ornitz</a>, Head of Product, Topline Pro</p></blockquote><p>The answer to Erik&#8217;s question is yes. To all of it.</p><h2>The Data</h2><p>GitHub&#8217;s <a href="https://octoverse.github.com/">2024 Octoverse report</a> tells the story:</p><ul><li><p><strong>100+ million new repositories</strong> created in 2024</p></li><li><p><strong>~25% year-over-year growth</strong> in total repos (now over 500 million)</p></li><li><p><strong>~98% growth</strong> in generative AI projects alone</p></li><li><p><strong>1.4 million new open source contributors</strong></p></li></ul><p>These represent software being created and iterated on. 
Many repos are experiments, forks, or prototypes&#8212;but the volume signals a fundamental shift in creation velocity.</p><p>The productivity studies back up what Erik is seeing&#8212;at least directionally. GitHub&#8217;s controlled study showed <strong>56% faster task completion</strong> on specific coding tasks. Stack Overflow&#8217;s survey of 90,000 developers found <strong>70% are using or planning to use AI tools</strong>.</p><p>Erik&#8217;s 2-3x number sounds aggressive against these studies. The gap comes from what you measure. Studies measure isolated tasks with junior-to-mid engineers. Erik is measuring total output from senior engineers who&#8217;ve built multi-agent workflows&#8212;Claude Code orchestrating research, implementation, testing, and documentation in parallel. Different baselines, different measurements, different results.</p><h2>I&#8217;m Living Proof</h2><p>Before 2025, I had never published a single open source project. Not one. In twenty-five years of building software professionally, I never had the bandwidth to maintain a side project while doing my actual job.</p><p>In the past twelve months, I&#8217;ve published seventeen, including:</p><ul><li><p><strong>claude-mpm</strong> &#8212; Multi-agent orchestration for Claude Code</p></li><li><p><strong>mcp-vector-search</strong> &#8212; Semantic code search via Model Context Protocol</p></li><li><p><strong>kuzu-memory</strong> &#8212; Graph-based memory system for AI agents</p></li></ul><p>These aren&#8217;t toys. They&#8217;re production tools I use daily. What changed: I went from spending 80% of my time on scaffolding and boilerplate to spending 80% of my time on the interesting problems. The grunt work&#8212;test generation, documentation, refactoring&#8212;happens in minutes instead of hours.</p><p>The same engineer. The same available hours. Radically different output. 
This would have been impossible before Claude Code and Opus 4.5 shipped in late 2025.</p><h2>The Economics Have Flipped</h2><p>Erik&#8217;s real question: if engineers can produce 2-3x the output, what happens to the rest of the organization?</p><p>The old model assumed building software was expensive and slow. The entire SaaS industry is built on this assumption. Why build a CRM when Salesforce exists? Why build analytics when Amplitude exists? Why build anything when you can pay per seat per month for someone else&#8217;s solution?</p><p>The math made sense when custom development meant six-figure budgets and twelve-month timelines.</p><p>The math is changing.</p><p>Based on my own projects and conversations with engineering leaders, I estimate AI tools have reduced the cost of building new greenfield software by roughly an order of magnitude for certain categories of work&#8212;internal tools, CRUD applications, API integrations, developer utilities. Not every category. Not enterprise systems with complex compliance requirements. But for the kinds of software that used to be &#8220;not worth building,&#8221; the math has changed.</p><p>A feature that would have taken a team of three engineers two months can now be built by one engineer in two weeks. That&#8217;s not a study&#8212;that&#8217;s what I&#8217;m seeing in practice.</p><p>When building gets that cheap, the calculus of build versus buy changes completely.</p><h2>SaaS Vendors Should Be Worried</h2><p>If I were playing the stock market right now, I would pay very close attention to SaaS renewal rates.</p><p>Think about what happens when thousands of companies simultaneously realize they can build what they need for less than their annual software licenses cost. The $50K/year internal analytics dashboard? Illustratively: one engineer, one month. The $200K/year customer data integration? 
Two engineers, one quarter&#8212;if it&#8217;s a narrow, well-defined workflow.</p><p>Not every category is equally vulnerable. <strong>Most at risk</strong>: horizontal admin tools, internal workflow automation, simple analytics, and single-purpose integrations. <strong>More defensible</strong>: compliance-heavy systems of record, platforms with strong network effects, multi-tenant marketplaces, and products built on proprietary data moats.</p><p>Salesforce isn&#8217;t going anywhere in the near term&#8212;their moat is complexity, switching costs, and ecosystem lock-in. That moat is real.</p><p>But the bar is rising. The threshold where it makes sense to buy instead of build is moving up dramatically.</p><p>I expect 2026 will see an unusually high number of SaaS vendors fail&#8212;particularly in the crowded mid-market where differentiation was always thin.</p><p>The ones that survive will need to improve at a pace they&#8217;ve never attempted before. Customer expectations are rising in lockstep with capabilities. B2B software will need to approach B2C quality. Clunky enterprise UIs that customers tolerated because they had no alternative? Those alternatives now exist.</p><h2>The Managed-Service Stack as Force Multiplier</h2><p>None of this would be happening without the parallel explosion in managed platforms and developer services.</p><p>Ten years ago, building a new software product meant provisioning servers, managing databases, handling authentication, building deployment pipelines. The operational overhead often exceeded the development effort.</p><p>Now: Vercel deploys your frontend. Supabase handles your database and auth. Stripe processes payments. Resend sends emails. Everything connects via APIs.</p><p>The composable stack means engineers can focus on the software that differentiates their product. The undifferentiated infrastructure is someone else&#8217;s problem.</p><p>AI coding tools plus composable infrastructure equals massive leverage. 
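</p><p>The build-versus-buy arithmetic can be made explicit. The license figures are the illustrative numbers above; the $20K fully loaded engineer-month and the three-year horizon are my assumptions, not figures from the article:</p>

```python
def build_savings(annual_license, engineer_months,
                  loaded_month=20_000, years=3):
    """Savings over `years` from building in-house instead of licensing."""
    return annual_license * years - engineer_months * loaded_month

# $50K/year dashboard vs. one engineer-month of build effort:
print(build_savings(50_000, 1))    # 130000
# $200K/year integration vs. two engineers for a quarter (6 engineer-months):
print(build_savings(200_000, 6))   # 480000
```

<p>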
One engineer can build and ship what used to require a team.</p><h2>A Warning for the Ambitious</h2><p>Many engineers will be tempted to take their great idea and build a business around it. The tools make it easy. The startup costs are minimal. Why not?</p><p>Because if your only defensibility is the idea and the code, you have no moat.</p><p>AI generates code nearly as easily as English&#8212;for standard patterns and well-documented APIs, anyway. Any idea you can implement, someone else can implement too&#8212;probably faster, probably with more resources, probably with better distribution.</p><p>The SaaS vendors going out of business will be replaced by a flood of new entrants. Most of those new entrants will also fail. The barrier to entry has dropped, but the barriers to sustainable success haven&#8217;t. This is why virtually everything I build is open source. It&#8217;s valuable enough for me that I&#8217;m willing to spend the time to build it. Charging customers for it? A completely different equation.</p><p>You need something beyond code:</p><ul><li><p><strong>Distribution</strong>: An audience, channel partnerships, or existing customer relationships</p></li><li><p><strong>Community</strong>: A user base that contributes content, plugins, or network value</p></li><li><p><strong>Domain expertise</strong>: Deep knowledge of a niche workflow that takes years to acquire</p></li><li><p><strong>Data advantages</strong>: Proprietary datasets that improve your product over time</p></li><li><p><strong>Integration complexity</strong>: Deep hooks into systems-of-record that make switching painful</p></li></ul><p>Something that can&#8217;t be replicated in a weekend by another engineer with Claude Code.</p><p>The golden age of software creation is also the golden age of software commoditization. 
Don&#8217;t confuse the ability to build with the ability to win.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TGop!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd69e8e74-7b76-440f-80a7-af85a4a026f6_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TGop!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd69e8e74-7b76-440f-80a7-af85a4a026f6_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!TGop!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd69e8e74-7b76-440f-80a7-af85a4a026f6_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!TGop!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd69e8e74-7b76-440f-80a7-af85a4a026f6_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!TGop!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd69e8e74-7b76-440f-80a7-af85a4a026f6_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TGop!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd69e8e74-7b76-440f-80a7-af85a4a026f6_1024x1024.png" width="1024" height="1024" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d69e8e74-7b76-440f-80a7-af85a4a026f6_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1272963,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/187970047?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd69e8e74-7b76-440f-80a7-af85a4a026f6_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TGop!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd69e8e74-7b76-440f-80a7-af85a4a026f6_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!TGop!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd69e8e74-7b76-440f-80a7-af85a4a026f6_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!TGop!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd69e8e74-7b76-440f-80a7-af85a4a026f6_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!TGop!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd69e8e74-7b76-440f-80a7-af85a4a026f6_1024x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>The Disruption Is Already Here</h2><p>Tech layoffs doubled in 2025&#8212;264,000 jobs eliminated according to Layoffs.fyi, compared to roughly 130,000 in 2024. Some of this is cyclical. Some of it is AI.</p><p>Anecdotally, I&#8217;m hearing from engineering managers that their top developers are spending dramatically less time writing code directly&#8212;they&#8217;re orchestrating AI tools instead. The code is still being written, just not by humans, or not by as many humans.</p><p>The disruption won&#8217;t be distributed evenly. Senior engineers who can orchestrate AI tools effectively will become more valuable. Junior engineers who were being paid to write boilerplate will find that work automated.</p><p>Product managers and designers face a different problem: they used to be the bottleneck&#8217;s counterweight. 
Now they&#8217;re the bottleneck. Erik&#8217;s question about organizational ratios is urgent because the answer affects hiring, budgets, and team structures across the industry.</p><h2>The Bright Spot</h2><p>This is going to be painful for many people. Layoffs are painful. Business failures are painful. Career disruption is painful.</p><p>But we&#8217;re entering a golden age of software.</p><p>More software will be created in 2026 than in any year in history. More problems will be solved by code. More ideas will ship. More experiments will run. More entrepreneurs will try.</p><p>Most of it will be crap, as I said at the start. Sturgeon&#8217;s Law&#8212;90% of everything is crap&#8212;doesn&#8217;t suspend for technological revolutions. But the 10% that isn&#8217;t crap will be extraordinary.</p><p>The tools now exist to build durable, production-quality software at a scale and speed that wasn&#8217;t possible two years ago. Those who learn to use them&#8212;really use them, not just dabble&#8212;will build things that matter.</p><p>I&#8217;ve never been more excited to be an engineer.</p><p>I&#8217;ve also never been more aware of how brutal the transition will be for those caught on the wrong side.</p><p>2026 will be the year of software. Here&#8217;s how to prepare:</p><ul><li><p><strong>Learn agent-generated coding now</strong>. Not AI-assisted (autocomplete, suggestions)&#8212;AI-generated: you describe intent, agents produce complete implementations. This means multi-agent orchestration, prompt engineering for code, and reviewing AI output instead of writing it. The paradigm shift is steep and early movers will have 6-12 months of advantage.</p></li><li><p><strong>Tighten your product discovery loops</strong>. When engineering throughput triples, PM and design become the constraint. 
The organizations that figure out faster iteration on <em>what</em> to build will outpace those focused only on building faster.</p></li><li><p><strong>Invest in distribution before you need it</strong>. Building is no longer the hard part. Finding users, building brand, creating switching costs&#8212;those are the new differentiators.</p></li></ul><p>The explosion is coming. Make sure you&#8217;re positioned on the right side of it.</p><div><hr></div><p><em>I&#8217;m writing about agentic coding workflows at <a href="https://hyperdev.substack.com/">hyperdev.matsuoka.com</a>. My open source tools are at <a href="https://github.com/bobmatnyc">github.com/bobmatnyc</a>.</em></p>]]></content:encoded></item><item><title><![CDATA[Shumer’s Right About the Tsunami. His Advice Points at the Wrong Shore]]></title><description><![CDATA[The viral AI displacement post gets the diagnosis right and the prescription backward]]></description><link>https://hyperdev.matsuoka.com/p/shumers-right-about-the-tsunami-his</link><guid isPermaLink="false">https://hyperdev.matsuoka.com/p/shumers-right-about-the-tsunami-his</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Sun, 15 Feb 2026 20:11:23 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!KCLm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee0987e-1176-49ef-b7eb-9ab860622fe4_1024x672.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KCLm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee0987e-1176-49ef-b7eb-9ab860622fe4_1024x672.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!KCLm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee0987e-1176-49ef-b7eb-9ab860622fe4_1024x672.png 424w, https://substackcdn.com/image/fetch/$s_!KCLm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee0987e-1176-49ef-b7eb-9ab860622fe4_1024x672.png 848w, https://substackcdn.com/image/fetch/$s_!KCLm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee0987e-1176-49ef-b7eb-9ab860622fe4_1024x672.png 1272w, https://substackcdn.com/image/fetch/$s_!KCLm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee0987e-1176-49ef-b7eb-9ab860622fe4_1024x672.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KCLm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee0987e-1176-49ef-b7eb-9ab860622fe4_1024x672.png" width="1024" height="672" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4ee0987e-1176-49ef-b7eb-9ab860622fe4_1024x672.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:672,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1507340,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/188065074?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5932a64c-ab8b-4110-84b0-92adb14dc3b7_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KCLm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee0987e-1176-49ef-b7eb-9ab860622fe4_1024x672.png 424w, https://substackcdn.com/image/fetch/$s_!KCLm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee0987e-1176-49ef-b7eb-9ab860622fe4_1024x672.png 848w, https://substackcdn.com/image/fetch/$s_!KCLm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee0987e-1176-49ef-b7eb-9ab860622fe4_1024x672.png 1272w, https://substackcdn.com/image/fetch/$s_!KCLm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ee0987e-1176-49ef-b7eb-9ab860622fe4_1024x672.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Pointing to the wrong shore?</figcaption></figure></div><p>Matt Shumer&#8217;s <a href="https://shumer.dev/something-big-is-happening">&#8220;Something Big Is Happening&#8221;</a> went viral this week. If you are one of the few who haven&#8217;t read it, the argument runs like this: AI has crossed a capability threshold. GPT-5.3 Codex and Claude Opus 4.6 can complete complex projects autonomously. The displacement timeline is 1-5 years, not decades. Prepare accordingly.</p><p>He&#8217;s not wrong about the diagnosis. I&#8217;ve been writing about this transformation for nine months now, tracking my own productivity metrics as AI tools evolved from &#8220;fancy autocomplete&#8221; to something genuinely different. The capability leap is real. So is the timeline.</p><p>Where Shumer loses me is the prescription.</p><p>His advice: use premium AI tools, build financial reserves, pursue genuine interests, spend an hour daily experimenting. Seems sensible enough. But completely backward about what I believe the transformation actually requires.</p><h2>The Diagnosis We Agree On</h2><p>Credit where due: Shumer captures something most commentary misses.</p><p>The METR measurements he cites&#8212;AI task completion capacity doubling every seven months, now accelerating to four&#8212;match what I&#8217;ve observed in practice. Claude Code didn&#8217;t just get incrementally better between Opus 4.0 and 4.6. It crossed a threshold where orchestration became viable.
Not &#8220;AI helps me code faster&#8221; but &#8220;AI completes projects while I supervise.&#8221;</p><p>My own numbers tell the story: 77 completed code changes across 27 different projects in six weeks. I run multiple AI assistants simultaneously, each working on its own task while I review the results. I haven&#8217;t opened my traditional coding software for actual development in months.</p><p>Shumer&#8217;s right that this changes things. Where he&#8217;s wrong is assuming the response is survival preparation.</p><h2>The Problem with Survival Tips</h2><p>&#8220;Build financial reserves.&#8221; &#8220;Pursue genuine interests rather than traditional career paths.&#8221; &#8220;Spend one hour daily experimenting.&#8221;</p><p>This is advice for people who expect to be displaced. It&#8217;s the response you&#8217;d give someone watching a wave approach&#8212;find high ground, protect what you can, hope you make it through.</p><p>But that framing assumes the wave destroys rather than transforms. History suggests otherwise.</p><p>I wrote recently about <a href="https://hyperdev.matsuoka.com/p/dont-be-a-canut-be-a-pattern-master">the Jacquard loom lesson</a>. The Canuts were Lyon&#8217;s master silk weavers&#8212;legendary craftspeople whose identity was wrapped up in thread manipulation. When Jacquard&#8217;s programmable loom arrived in 1804, they rioted. Some adapted. Many didn&#8217;t.</p><p>Here&#8217;s what the numbers actually show: total silk workers stayed around 30,000 through the transition. The looms didn&#8217;t eliminate jobs&#8212;they compressed the master craftsman class while creating lower-wage operator roles. By 1831, 308 silk merchants controlled pricing for 5,575 master weavers managing 20,000+ workers.</p><p>The cautionary tale isn&#8217;t mass unemployment. 
It&#8217;s wage collapse and status compression for those who kept doing the same job while the job&#8217;s value eroded beneath them.</p><p>The Canuts who survived weren&#8217;t the fastest weavers. They were the ones who recognized that &#8220;weaver&#8221; was becoming &#8220;pattern designer&#8221; and &#8220;loom operator&#8221; and &#8220;machine mechanic.&#8221; The skill didn&#8217;t disappear. It changed shape.</p><h2>What Shumer&#8217;s Advice Misses</h2><p>&#8220;Spend one hour daily experimenting with AI tools.&#8221;</p><p>This is advice for a Canut. Practice with the new loom. Get comfortable with the interface. Learn the commands.</p><p>It completely misses what actually becomes valuable.</p><p>The <a href="https://www.faros.ai/blog/ai-software-engineering">Faros AI Productivity Paradox Report</a> analyzed data across thousands of developers and found something telling: &#8220;Adoption skews toward less tenured engineers. Usage is highest among engineers who are newer to the company.&#8221;</p><p>Why? Because junior engineers face different constraints. Their bottleneck is navigating unfamiliar code, accelerating early contributions, learning system patterns. AI helps enormously with that.</p><p>Senior engineers showed lower adoption not because they&#8217;re Luddites&#8212;because their constraints aren&#8217;t code-writing speed. Their bottleneck is &#8220;deep system knowledge and organizational context&#8221; that AI can&#8217;t access. Generating code faster doesn&#8217;t help when the constraint is understanding why the system works the way it does.</p><p>A <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5713646">University of Chicago Booth working paper</a> found experienced developers were 5-6% <em>more</em> likely to successfully use AI agents for every standard deviation of work experience. 
Not because they typed better prompts&#8212;because they used &#8220;plan-first&#8221; approaches, laying out objectives and steps before invoking AI.</p><p>Expertise improves the ability to delegate. That&#8217;s not something you learn from an hour of daily experimentation.</p><h2>What&#8217;s Irreducibly Human</h2><p>During a recent knowledge base project&#8212;120 commits over 9 days, roughly 90% Claude-assisted&#8212;I tracked where my time actually went.</p><p>The 12 human-only commits weren&#8217;t about implementation. They were:</p><ul><li><p>Configuration tweaks requiring domain knowledge (model selection for specific use cases)</p></li><li><p>Debug logging when something felt wrong</p></li><li><p>Release management</p></li><li><p>One research document on architecture options</p></li></ul><p>The human contributions were about <em>judgment</em>. Choosing the right model for email writing versus general queries. Knowing when the AI&#8217;s suggestion would create problems downstream. Understanding the client&#8217;s actual workflows well enough to structure the system appropriately.</p><p>What surprised me: the time savings didn&#8217;t come from faster typing. They came from eliminating iteration cycles between &#8220;write code&#8221; and &#8220;realize it doesn&#8217;t fit requirements.&#8221; Specifying clearly upfront meant fewer rewrites&#8212;but that specification work was irreducibly human.</p><p>The <a href="https://www.qodo.ai/blog/state-of-ai-coding-2025/">Qodo 2025 State of AI Coding survey</a> found 65% of developers cite missing context as the primary barrier to shipping AI code without review. That &#8220;missing context&#8221; is exactly what Shumer&#8217;s advice doesn&#8217;t address:</p><p><strong>Business model specifics</strong>: How supplier relationships actually work. Which data matters for a specific service model. Why certain integrations take priority.</p><p><strong>Organizational constraints</strong>: Budget limitations. 
Timeline pressures. The technical capabilities of staff who&#8217;ll maintain the system.</p><p><strong>Historical context</strong>: Why previous approaches to similar problems failed. What the client tried and rejected. Political dynamics around adoption.</p><p>None of this lives on the public web. It exists in Jira tickets, PowerPoint decks, Slack conversations, and institutional memory. You don&#8217;t acquire it through an hour of daily experimentation.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!w6ty!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b8f8a0c-5d16-4717-9872-5b434fcd5502_1024x599.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!w6ty!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b8f8a0c-5d16-4717-9872-5b434fcd5502_1024x599.png 424w, https://substackcdn.com/image/fetch/$s_!w6ty!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b8f8a0c-5d16-4717-9872-5b434fcd5502_1024x599.png 848w, https://substackcdn.com/image/fetch/$s_!w6ty!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b8f8a0c-5d16-4717-9872-5b434fcd5502_1024x599.png 1272w, https://substackcdn.com/image/fetch/$s_!w6ty!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b8f8a0c-5d16-4717-9872-5b434fcd5502_1024x599.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!w6ty!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b8f8a0c-5d16-4717-9872-5b434fcd5502_1024x599.png" width="1024" height="599" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1b8f8a0c-5d16-4717-9872-5b434fcd5502_1024x599.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:599,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1369548,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/188065074?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5df95e2b-ecf6-4459-8e76-ed469798e000_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!w6ty!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b8f8a0c-5d16-4717-9872-5b434fcd5502_1024x599.png 424w, https://substackcdn.com/image/fetch/$s_!w6ty!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b8f8a0c-5d16-4717-9872-5b434fcd5502_1024x599.png 848w, https://substackcdn.com/image/fetch/$s_!w6ty!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b8f8a0c-5d16-4717-9872-5b434fcd5502_1024x599.png 1272w, https://substackcdn.com/image/fetch/$s_!w6ty!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b8f8a0c-5d16-4717-9872-5b434fcd5502_1024x599.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Be a Pattern Master, not a Canut</figcaption></figure></div><h2>Pattern Masters, Not Refugees</h2><p>Shumer frames AI as something happening TO workers. The response he offers is defensive: prepare for impact, build reserves, hope the wave passes.</p><p>The Jacquard lesson suggests a different frame. AI is changing WHAT the work is.
The response isn&#8217;t preparation for displacement&#8212;it&#8217;s understanding what becomes valuable when implementation gets automated.</p><p>Two paths survived the Canut transition:</p><p><strong>Pattern designers</strong> who translated vision into punch cards the loom could execute. Not thread manipulators&#8212;system architects who understood what patterns were possible and how to specify them precisely.</p><p><strong>Loom improvers</strong> who made the infrastructure more reliable. Not operators&#8212;engineers who fixed the tension systems, improved card durability, figured out how to chain looms for industrial-scale production.</p><p>The agentic coding transition has the same split.</p><p>You can master the patterns&#8212;writing specs, orchestrating agents, supervising output. Or you can improve the looms&#8212;build the MCP servers, write the orchestration layers, optimize the vector databases that make retrieval-augmented generation work at scale.</p><p>Both roles survive. Thread manipulation shrinks in economic value.</p><h2>Methodology Beats Stockpiling</h2><p>The people seeing productivity gains from AI tools aren&#8217;t spending an hour daily experimenting. They&#8217;re developing methodology.</p><p><a href="https://www.microsoft.com/en-us/research/publication/the-effects-of-generative-ai-on-high-skilled-work-evidence-from-three-field-experiments/">Microsoft Research field experiments</a> across nearly 5,000 developers found 26% productivity gains with AI coding assistants&#8212;with less experienced developers showing higher adoption and greater improvements. But those gains came from structured workflows, integrated tooling, and verification processes&#8212;not casual usage.</p><p>The developers in the <a href="https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/">METR randomized controlled trial</a> who were 19% <em>slower</em> with AI assistance? 
They were using AI the Shumer way&#8212;open a chat, ask a question, accept the output, repeat. No structured context. No optimized prompts. No verification layer. They felt faster while actually slowing down.</p><p>I&#8217;ve written about <a href="https://hyperdev.matsuoka.com/p/hyperdevs-three-golden-rules">three golden rules</a> that structure my own workflow: let AI write your prompts (research shows 17-50% improvement), make context searchable rather than just present (Lost in the Middle kills accuracy), and build verification into every workflow.</p><p>This isn&#8217;t what you learn from an hour of daily experimentation. It&#8217;s what you develop through deliberate methodology applied to real work with real stakes.</p><h2>The Question Shumer Should Have Asked</h2><p>The question isn&#8217;t &#8220;will I have a job in five years?&#8221;</p><p>It&#8217;s &#8220;what does my job become when implementation gets automated?&#8221;</p><p>For senior engineers, the answer is increasingly clear: subject matter expert plus systems architect. The person who translates ambiguous requirements into precise specifications. The person who identifies when agent outputs miss critical organizational context. The person who maintains system coherence across automated development workflows.</p><p>Jue Wang at Bain <a href="https://www.technologyreview.com/2025/12/15/1128352/rise-of-ai-coding-developers-2026/">told MIT Technology Review</a> that developers already spend only 20-40% of their time coding. The rest goes to analyzing problems, customer feedback, product strategy, administrative tasks.</p><p>AI doesn&#8217;t change what senior engineering is. It reveals what it always was.</p><p>The implementation layer was never the irreducible core. It was infrastructure&#8212;important, but increasingly invisible. What emerges when that layer automates is something both familiar and different. 
The same judgment work senior engineers always did, now concentrated and visible.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!n9Rf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2146e93-4d9a-4e72-9038-5fb03fcb75e6_1024x731.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!n9Rf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2146e93-4d9a-4e72-9038-5fb03fcb75e6_1024x731.png 424w, https://substackcdn.com/image/fetch/$s_!n9Rf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2146e93-4d9a-4e72-9038-5fb03fcb75e6_1024x731.png 848w, https://substackcdn.com/image/fetch/$s_!n9Rf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2146e93-4d9a-4e72-9038-5fb03fcb75e6_1024x731.png 1272w, https://substackcdn.com/image/fetch/$s_!n9Rf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2146e93-4d9a-4e72-9038-5fb03fcb75e6_1024x731.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!n9Rf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2146e93-4d9a-4e72-9038-5fb03fcb75e6_1024x731.png" width="1024" height="731" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d2146e93-4d9a-4e72-9038-5fb03fcb75e6_1024x731.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:731,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1676875,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/188065074?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd48561b-7280-456f-8d50-33934bef6d67_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!n9Rf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2146e93-4d9a-4e72-9038-5fb03fcb75e6_1024x731.png 424w, https://substackcdn.com/image/fetch/$s_!n9Rf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2146e93-4d9a-4e72-9038-5fb03fcb75e6_1024x731.png 848w, https://substackcdn.com/image/fetch/$s_!n9Rf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2146e93-4d9a-4e72-9038-5fb03fcb75e6_1024x731.png 1272w, https://substackcdn.com/image/fetch/$s_!n9Rf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2146e93-4d9a-4e72-9038-5fb03fcb75e6_1024x731.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">This looks nothing like Matt.  Well.  Maybe a bit.</figcaption></figure></div><h2>The Year of Software</h2><p>Here&#8217;s what Shumer&#8217;s displacement framing completely misses: the long tail.</p><p>My friend <a href="https://www.linkedin.com/in/mattrosenberg/">Matt Rosenberg</a> has zero software development experience as a builder. He&#8217;s a marketer who also manages a vacation rental property on Cape Cod. Over the past few weeks he built himself a revenue optimization tool&#8212;a proper one, with dynamic pricing recommendations based on local events, seasonal patterns, and competitor analysis.</p><p>He didn&#8217;t hire a developer. He didn&#8217;t buy enterprise software designed for property management chains.
He built exactly what he needed for his specific situation.</p><p>Here&#8217;s what matters about how he got there: Matt spent hours over several weeks doing a deep dive into the tools. Not casual experimentation&#8212;serious investment in understanding what AI coding assistants could and couldn&#8217;t do. His transformation from marketer to builder wasn&#8217;t magic. It was built on two things he already had: deep knowledge of UX from his marketing career, and years of accumulated expertise in the vacation rental space.</p><p>That combination&#8212;domain expertise plus serious tool investment&#8212;is a model for career transformation. Not everyone will replicate Matt&#8217;s results. But the path he took isn&#8217;t &#8220;spend an hour a day experimenting.&#8221; It&#8217;s &#8220;leverage what you already know deeply, and invest real time in learning to express it through new tools.&#8221;</p><p>Here are the economics that matter: Matt has a reasonable chance to recoup his effort by sharing this with other Cape Cod hosts&#8212;a small audience with the exact same problem. A smaller but real chance someone picks it up for broader distribution. Maybe it stays a side project. Maybe it becomes a micro-business serving vacation rental owners in seasonal markets. A larger concern would never take on a project with such a small TAM.</p><p>None of those paths existed before.</p><p>In the old model, Matt&#8217;s revenue optimizer would never exist. No developer would build it for one vacation rental property. No SaaS company would target Cape Cod vacation rentals as a market segment. The problem was real, Matt&#8217;s domain expertise was real, but the economics of software creation didn&#8217;t work.</p><p>Now they do. And so do the economics of software distribution. The same tools that let Matt build also let him iterate based on feedback from ten other hosts, add features they need, package it for sharing.</p><p>This is the year of software.
Not because developers are being displaced&#8212;because software is finally reaching the long tail of problems that were never economical to solve. The domain expert who understands Cape Cod rental patterns better than any enterprise vendor can encode that knowledge into a working system <em>and</em> find the small audience that needs exactly that.</p><p>Shumer sees AI automating existing jobs. He misses AI creating new economic paths for people who were never developers in the first place.</p><p>The Canuts didn&#8217;t just become pattern masters and loom mechanics. Some became textile entrepreneurs who could suddenly afford custom patterns for small-batch production. The technology didn&#8217;t only change who did the work&#8212;it changed what work was possible.</p><p>This is the new normal if you learn to use the tools.</p><h2>The Bottom Line</h2><p>Shumer&#8217;s right that something big is happening. The capability threshold is real. The timeline is compressed. The transformation will affect every knowledge worker who touches a computer.</p><p>He&#8217;s wrong about what to do.</p><p>The response isn&#8217;t defensive preparation for displacement. It&#8217;s understanding what becomes valuable when AI handles implementation. It&#8217;s developing methodology for specification and orchestration. It&#8217;s acquiring the domain expertise and organizational context that AI can&#8217;t access.</p><p>And for the Matt Rosenbergs of the world&#8212;the domain experts who never learned to code&#8212;the response is recognizing that this is their year. The problems they understand better than anyone can finally become software.</p><p>Don&#8217;t stockpile. Don&#8217;t experiment an hour a day. Don&#8217;t prepare to be a refugee.</p><p>Become a pattern master. Or become a loom improver. Or become the domain expert who finally builds the tool that only you could specify.</p><p>The Canuts who survived didn&#8217;t out-weave the machines. 
They recognized that the job had changed shape and positioned themselves for what actually remained valuable.</p><p>The transformation is underway. The question isn&#8217;t whether you&#8217;ll make it through. It&#8217;s whether you&#8217;ve recognized what the job is becoming&#8212;and what new jobs are becoming possible.</p><div><hr></div><p><em>I&#8217;m Bob Matsuoka, writing about agentic coding and AI-powered development at <a href="https://hyperdev.substack.com/">HyperDev</a>. For more on the pattern master thesis, read my analysis of <a href="https://hyperdev.matsuoka.com/p/dont-be-a-canut-be-a-pattern-master">the Jacquard loom lesson</a> or my deep dive into <a href="https://hyperdev.matsuoka.com/p/the-irreducibles-what-a-pattern-master">what remains irreducibly human</a>.</em></p><p><strong>Related reading:</strong></p><ul><li><p><a href="https://hyperdev.matsuoka.com/p/dont-be-a-canut-be-a-pattern-master">Don&#8217;t Be a Canut&#8212;Be a Pattern Master</a> - The Jacquard history with data</p></li><li><p><a href="https://hyperdev.matsuoka.com/p/the-irreducibles-what-a-pattern-master">The Irreducibles: What a Pattern Master Does</a> - Where human value actually sits</p></li><li><p><a href="https://hyperdev.matsuoka.com/p/hyperdevs-three-golden-rules">HyperDev&#8217;s Three Golden Rules</a> - Methodology for professional AI work</p></li></ul>]]></content:encoded></item><item><title><![CDATA[Stack Overflow Is Dead]]></title><description><![CDATA[Good Riddance]]></description><link>https://hyperdev.matsuoka.com/p/stack-overflow-is-dead</link><guid isPermaLink="false">https://hyperdev.matsuoka.com/p/stack-overflow-is-dead</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Thu, 12 Feb 2026 13:30:43 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!3NoI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7cdd995-c7af-4899-a4bd-c401fac47fa7_2145x1234.png" length="0" 
type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3NoI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7cdd995-c7af-4899-a4bd-c401fac47fa7_2145x1234.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3NoI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7cdd995-c7af-4899-a4bd-c401fac47fa7_2145x1234.png 424w, https://substackcdn.com/image/fetch/$s_!3NoI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7cdd995-c7af-4899-a4bd-c401fac47fa7_2145x1234.png 848w, https://substackcdn.com/image/fetch/$s_!3NoI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7cdd995-c7af-4899-a4bd-c401fac47fa7_2145x1234.png 1272w, https://substackcdn.com/image/fetch/$s_!3NoI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7cdd995-c7af-4899-a4bd-c401fac47fa7_2145x1234.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3NoI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7cdd995-c7af-4899-a4bd-c401fac47fa7_2145x1234.png" width="1456" height="838" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b7cdd995-c7af-4899-a4bd-c401fac47fa7_2145x1234.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:838,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:216241,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/187352749?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7cdd995-c7af-4899-a4bd-c401fac47fa7_2145x1234.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3NoI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7cdd995-c7af-4899-a4bd-c401fac47fa7_2145x1234.png 424w, https://substackcdn.com/image/fetch/$s_!3NoI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7cdd995-c7af-4899-a4bd-c401fac47fa7_2145x1234.png 848w, https://substackcdn.com/image/fetch/$s_!3NoI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7cdd995-c7af-4899-a4bd-c401fac47fa7_2145x1234.png 1272w, https://substackcdn.com/image/fetch/$s_!3NoI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7cdd995-c7af-4899-a4bd-c401fac47fa7_2145x1234.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Stack Overflow vs Reddit, Discord, and Dev.To</figcaption></figure></div><h2>TL;DR</h2><ul><li><p>Stack Overflow&#8217;s question volume collapsed ~95% from peak&#8212;back to 2008 levels. Traffic down ~75%. The archive remains valuable; new contributions have largely stopped.</p></li><li><p>The decline started in 2018&#8212;four years before ChatGPT. The model was already broken; AI accelerated an existing trend.</p></li><li><p>Developers didn&#8217;t stop communicating&#8212;they migrated to Reddit, Discord, Dev.to, and AI tools. r/programming has 5-6 million members; Discord has 200M+ monthly active users.</p></li><li><p>The key insight: Stack Overflow optimized for <em>definitive answers</em>&#8212;exactly what LLMs do well. 
Reddit/Discord provide <em>discussion, opinion, validation</em>&#8212;what LLMs struggle with.</p></li><li><p>Transactional Q&amp;A platforms are vulnerable. Community-first platforms are thriving. This is unbundling, not death.</p></li></ul><div><hr></div><p><strong>Stack Overflow still gets read. But it stopped getting written.</strong></p><p>Question volume has collapsed from 200,000/month at peak to under 10,000 today&#8212;a 95% drop. That&#8217;s not a community fading, it&#8217;s a Q&amp;A product being outcompeted.</p><p>Peter Coy wrote a piece in the New York Times recently arguing this signals the end of developer knowledge-sharing. Developers used to share publicly; now they ask ChatGPT privately. &#8220;A little sad,&#8221; he called it.</p><p>I think Coy has it backwards. Developers aren&#8217;t talking less&#8212;they&#8217;re talking elsewhere. The activity migrated to Reddit, Discord, and AI tools. Stack Overflow&#8217;s death isn&#8217;t about lost community. It&#8217;s about an obsolete model being replaced by better ones.</p><h2>The collapse is real</h2><p>Let me be clear: Stack Overflow really is dying. By &#8220;dying&#8221; I mean new contributions&#8212;questions, answers, edits&#8212;have collapsed. The archive remains; the community doesn&#8217;t.</p><p>The numbers are stark. Traffic has collapsed roughly 75% from peak, <a href="https://byteiota.com/stack-overflow-traffic/">according to third-party analyses like ByteIota</a> (based on SimilarWeb estimates). Question volume tells an even starker story: <a href="https://data.stackexchange.com/stackoverflow/query/new">Stack Exchange Data Explorer queries</a> show monthly questions dropping from ~200,000 at the 2014-2017 peak to under 10,000 by late 2025. That&#8217;s back to 2008 levels&#8212;the site&#8217;s launch year.</p><p>I used Stack Overflow heavily for years. 
The significance of activity falling back to early-2008 levels is hard to overstate.</p><p>Fifteen years of growth erased in under three years.</p><p>The paradox is that <a href="https://survey.stackoverflow.co/2024/">84% of developers still </a><em><a href="https://survey.stackoverflow.co/2024/">browse</a></em><a href="https://survey.stackoverflow.co/2024/"> Stack Overflow</a>. The archive has value. But almost nobody contributes anymore. The site has become a museum&#8212;visited, but not lived in.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-5p9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9aafaec3-8a20-45b8-9f67-6f39d8dcd630_1815x915.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-5p9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9aafaec3-8a20-45b8-9f67-6f39d8dcd630_1815x915.png 424w, https://substackcdn.com/image/fetch/$s_!-5p9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9aafaec3-8a20-45b8-9f67-6f39d8dcd630_1815x915.png 848w, https://substackcdn.com/image/fetch/$s_!-5p9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9aafaec3-8a20-45b8-9f67-6f39d8dcd630_1815x915.png 1272w, https://substackcdn.com/image/fetch/$s_!-5p9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9aafaec3-8a20-45b8-9f67-6f39d8dcd630_1815x915.png 1456w" sizes="100vw"><img
src="https://substackcdn.com/image/fetch/$s_!-5p9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9aafaec3-8a20-45b8-9f67-6f39d8dcd630_1815x915.png" width="1456" height="734" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9aafaec3-8a20-45b8-9f67-6f39d8dcd630_1815x915.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:734,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:120534,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/187352749?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9aafaec3-8a20-45b8-9f67-6f39d8dcd630_1815x915.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-5p9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9aafaec3-8a20-45b8-9f67-6f39d8dcd630_1815x915.png 424w, https://substackcdn.com/image/fetch/$s_!-5p9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9aafaec3-8a20-45b8-9f67-6f39d8dcd630_1815x915.png 848w, https://substackcdn.com/image/fetch/$s_!-5p9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9aafaec3-8a20-45b8-9f67-6f39d8dcd630_1815x915.png 1272w, https://substackcdn.com/image/fetch/$s_!-5p9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9aafaec3-8a20-45b8-9f67-6f39d8dcd630_1815x915.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>Where developers went</h2><p>Here&#8217;s what the &#8220;developers stopped talking&#8221; narrative misses: they moved.</p><p><strong>Reddit exploded.</strong> r/programming has somewhere between <a href="https://subredditstats.com/r/programming">5 and 6 million members</a>. r/learnprogramming has around 4 million. Both subreddits are trending at over 1,000% on growth metrics, adding thousands of subscribers daily.
These aren&#8217;t ghost towns&#8212;they&#8217;re thriving.</p><p><strong>Discord expanded far beyond gaming.</strong> The service now has <a href="https://discord.com/company">over 200 million monthly active users</a>, and developer communities have exploded. Reactiflux (the React community) has 230,000+ members. Python Discord, Rust Discord, and dozens of framework-specific servers have become the default place for real-time developer discussion.</p><p><strong>Dev.to grew to millions of members.</strong> Built on community-first principles with lower barriers to participation than Stack Overflow ever had.</p><p><strong>The MCP ecosystem exploded.</strong> MCP&#8212;the Model Context Protocol&#8212;lets AI assistants call external tools: APIs, databases, services. Think of it as giving Claude or ChatGPT hands instead of just a mouth. In November 2024, there were maybe 100 MCP servers. By February 2026, <a href="https://glama.ai/mcp/servers">over 17,000</a>. A new form of executable knowledge-sharing emerged in the time it took Stack Overflow to collapse.</p><p>Developers didn&#8217;t stop communicating. They stopped using Stack Overflow.</p><h2>Why Reddit thrives while Stack Overflow dies</h2><p>Here&#8217;s a clarifying question: if LLMs killed Stack Overflow, why didn&#8217;t they kill Reddit?</p><p>The answer reveals the real dynamic at play.</p><p>Stack Overflow optimized for <em>definitive answers</em>. One question, one accepted answer, close the duplicates, move on. The entire system was designed to produce canonical, searchable, authoritative responses to technical questions.</p><p>That&#8217;s exactly what LLMs do. Better. Faster. Without the closure votes and downvotes.</p><p>Reddit optimizes for <em>discussion</em>. There&#8217;s no &#8220;accepted answer.&#8221; The same question can be asked repeatedly without getting closed. 
People share opinions, debate tradeoffs, validate frustrations, and build community around shared interests.</p><p>LLMs struggle with that. Try asking Claude or ChatGPT &#8220;Is this framework actually good or does it just have good marketing?&#8221; You&#8217;ll get a balanced, diplomatic non-answer. Ask Reddit and you&#8217;ll get thirty developers telling you exactly what they think, with war stories and receipts.</p><table><thead><tr><th>Factor</th><th>Stack Overflow</th><th>Reddit</th></tr></thead><tbody><tr><td>Content model</td><td>Q&amp;A with &#8220;correct&#8221; answers</td><td>Discussion threads</td></tr><tr><td>Duplicate policy</td><td>Aggressively closed</td><td>Tolerated, repeated</td></tr><tr><td>Reputation system</td><td>High-stakes rep + privileges</td><td>Lighter-touch karma</td></tr><tr><td>Engagement type</td><td>Transactional (get answer, leave)</td><td>Conversational (participate, stay)</td></tr><tr><td>AI competition</td><td>Direct replacement</td><td>Complementary</td></tr></tbody></table><p>The &#8220;site:reddit.com&#8221; phenomenon tells the story. Users increasingly append that modifier to Google searches specifically because they want human perspectives, not AI-generated summaries or SEO-optimized content farms. They&#8217;re actively seeking out the thing LLMs can&#8217;t easily provide.</p><p>Stack Overflow competed with AI on AI&#8217;s home turf. Reddit doesn&#8217;t.</p><h2>Why developers didn&#8217;t fight for it</h2><p>The product model explains the competitive pressure. But the culture explains why developers didn&#8217;t fight to save it.</p><p>Stack Overflow&#8217;s culture was broken long before ChatGPT arrived. Look at the question volume data: the decline started in 2018&#8212;four full years before ChatGPT launched. Monthly questions dropped from 200,000 to 140,000 before GPT-3&#8217;s 2020 release, and well before ChatGPT&#8217;s late 2022 launch. The trajectory was already set.</p><p>ChatGPT didn&#8217;t kill Stack Overflow. It was the final nail in the coffin.</p><p>In 2019, Stack Overflow surveyed its own community about a much-publicized initiative to improve culture.
<a href="https://stackoverflow.blog/2019/07/18/building-community-inclusivity-stack-overflow/">Seventy-three percent of respondents said the site remained &#8220;equally unwelcoming&#8221;</a> compared to before the initiative. This wasn&#8217;t outside criticism&#8212;it was the community itself acknowledging the problem hadn&#8217;t been fixed.</p><p>Anyone who&#8217;s used the site knows what this looked like in practice. You&#8217;d ask a question, spend twenty minutes crafting it carefully, and within seconds someone would mark it as a duplicate of a vaguely related question from 2014. Or close it as &#8220;not a real question.&#8221; Or downvote it without explanation.</p><p>The reputation system created perverse incentives. High-rep users had the power to close questions, and the system rewarded fast closure and strict gatekeeping over patient explanation. New users learned quickly that asking questions was a minefield. The site optimized for the archive, not for learning.</p><p>Public disputes between moderators and leadership became common. Several moderators resigned, citing disagreements over governance and feeling unsupported by the company.</p><p>To be clear: Stack Overflow did things well. The archive is genuinely valuable&#8212;24 million questions and answers representing collective knowledge. Discoverability was excellent. The structured Q&amp;A format created stable, linkable URLs. Canonical answers for common problems saved countless hours.</p><p>But the community that created those answers? That was poisoned years ago.</p><p>LLMs didn&#8217;t kill Stack Overflow. They just offered developers an alternative that didn&#8217;t make them feel stupid for asking questions.</p><h2>What happened to everyone else</h2><p>Stack Overflow isn&#8217;t the only platform affected by this shift. 
Here are some others:</p><table><thead><tr><th>Platform</th><th>Status</th><th>Model</th><th>AI Vulnerability</th></tr></thead><tbody><tr><td>Stack Overflow</td><td>Collapsing</td><td>Transactional Q&amp;A</td><td>Direct replacement</td></tr><tr><td>Experts Exchange</td><td>Pivoting</td><td>Paywalled Q&amp;A</td><td>High</td></tr><tr><td>Quora</td><td>Struggling</td><td>General Q&amp;A</td><td>High</td></tr><tr><td>Reddit</td><td>Thriving</td><td>Discussion</td><td>Low</td></tr><tr><td>Discord</td><td>Thriving</td><td>Real-time community</td><td>Low</td></tr><tr><td>Dev.to</td><td>Thriving</td><td>Community blogging</td><td>Low</td></tr><tr><td>Hacker News</td><td>Stable</td><td>Curated discussion</td><td>Low</td></tr></tbody></table><p><strong>Experts Exchange</strong> pivoted hard. Remember them? The original &#8220;answers behind a paywall&#8221; site that Stack Overflow was created to replace? They&#8217;re still around, now positioning themselves as &#8220;the home of human intelligence.&#8221; The anti-AI angle is their entire pitch now.</p><p>The pattern: community-first platforms survive. Transactional Q&amp;A platforms are vulnerable. If your model is &#8220;user asks question, platform provides answer,&#8221; you&#8217;re competing directly with AI. If your model is &#8220;users discuss, debate, and build relationships,&#8221; you&#8217;re not.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!d4wR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9ddb14-5cb4-44c4-872e-17d0ec825f25_1184x864.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!d4wR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9ddb14-5cb4-44c4-872e-17d0ec825f25_1184x864.png 424w, https://substackcdn.com/image/fetch/$s_!d4wR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9ddb14-5cb4-44c4-872e-17d0ec825f25_1184x864.png 848w,
https://substackcdn.com/image/fetch/$s_!d4wR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9ddb14-5cb4-44c4-872e-17d0ec825f25_1184x864.png 1272w, https://substackcdn.com/image/fetch/$s_!d4wR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9ddb14-5cb4-44c4-872e-17d0ec825f25_1184x864.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!d4wR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9ddb14-5cb4-44c4-872e-17d0ec825f25_1184x864.png" width="1184" height="864" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ab9ddb14-5cb4-44c4-872e-17d0ec825f25_1184x864.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:864,&quot;width&quot;:1184,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1260702,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/187352749?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9ddb14-5cb4-44c4-872e-17d0ec825f25_1184x864.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!d4wR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9ddb14-5cb4-44c4-872e-17d0ec825f25_1184x864.png 424w, 
https://substackcdn.com/image/fetch/$s_!d4wR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9ddb14-5cb4-44c4-872e-17d0ec825f25_1184x864.png 848w, https://substackcdn.com/image/fetch/$s_!d4wR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9ddb14-5cb4-44c4-872e-17d0ec825f25_1184x864.png 1272w, https://substackcdn.com/image/fetch/$s_!d4wR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9ddb14-5cb4-44c4-872e-17d0ec825f25_1184x864.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>The new knowledge architecture</h2><p>What&#8217;s replacing Stack Overflow isn&#8217;t a single platform. It&#8217;s a layered ecosystem.</p><p><strong>Layer 1: LLMs for basic questions.</strong> &#8220;How do I parse JSON in Python?&#8221; Don&#8217;t post that anywhere&#8212;just ask Claude. Faster response, no judgment, no risk of being marked as a duplicate. <a href="https://survey.stackoverflow.co/2025/">Eighty-four percent of developers now use AI tools</a>. For basic technical questions, this is often faster and lower-friction than posting ever was.</p><p><strong>Layer 2: MCP servers for executable knowledge.</strong> This is the part most people haven&#8217;t caught up to yet. The Model Context Protocol ecosystem has exploded to 17,000+ servers, with backing from the Linux Foundation and adoption by major players. These aren&#8217;t just answers&#8212;they&#8217;re capabilities. Instead of reading how to do something, you get a tool that does it. Knowledge that executes.</p><p><strong>Layer 3: Communities for discussion.</strong> Reddit, Discord, Dev.to. When you need opinions, validation, or to talk through a problem with humans who&#8217;ve been there, this is where you go. LLMs can tell you <em>how</em> to use a library; humans tell you <em>whether</em> you should.</p><p><strong>Layer 4: Deep expertise for analysis.</strong> Blogs, Substacks, video courses, conference talks. Long-form content that explores ideas in depth, with personality and opinion. This is where the experienced practitioners share hard-won knowledge that doesn&#8217;t fit a Q&amp;A format.</p><p>Here&#8217;s what my workflow looks like now: basic syntax question &#8594; Claude. Need to connect to an API &#8594; MCP server. &#8220;Is this the right architectural approach?&#8221; &#8594; Reddit or Discord.
Deep dive on tradeoffs &#8594; find a practitioner&#8217;s blog post.</p><p>This architecture is more sophisticated than Stack Overflow ever was. It&#8217;s specialized, distributed, and each layer does what it&#8217;s good at. The Q&amp;A site tried to be everything; the new ecosystem lets each component excel at its purpose.</p><h2>The counter-arguments</h2><p>Fair criticism exists. Let me address it directly.</p><p><strong>&#8220;Knowledge is fragmented now.&#8221;</strong> True. Your answer might be on Reddit, Discord, a GitHub issue, a blog post, or an MCP server. That&#8217;s friction Stack Overflow didn&#8217;t have. But this is the story of the entire internet&#8212;from centralized portals to distributed everything. We adapted before; we&#8217;ll adapt again.</p><p><strong>&#8220;We&#8217;re losing archival permanence.&#8221;</strong> Discord conversations disappear. Reddit threads get buried. The 24 million Stack Overflow Q&amp;As were searchable and permanent. This is a real loss. But the community that created those answers was already gone. The archive remains; the contribution stopped years ago.</p><p><strong>&#8220;Developers are talking </strong><em><strong>differently</strong></em><strong>, not </strong><em><strong>more</strong></em><strong>.&#8221;</strong> Probably fair. I can&#8217;t prove total knowledge exchange increased. What I can show is that multiple platforms are thriving while Stack Overflow collapses. The activity went somewhere.</p><p><strong>&#8220;Quality control without voting?&#8221;</strong> Reddit has karma. Discord servers have curation. LLMs let you iterate until you get a useful answer. None of these are perfect, but neither was Stack Overflow&#8217;s system&#8212;which surfaced answers based on who posted first and had the most reputation.</p><h2>The bigger picture</h2><p>Stack Overflow was a toll booth on the highway of developer knowledge. 
For a decade, if you wanted an answer to a programming question, you went through the booth. You tolerated the closure votes, the duplicate flags, the reputation games, because there wasn&#8217;t a better alternative.</p><p>LLMs removed the toll booth.</p><p>Developers didn&#8217;t stop traveling. They stopped paying.</p><p>What we&#8217;re witnessing <em>isn&#8217;t</em> the death of developer communication. It&#8217;s the unbundling of a monopoly. Stack Overflow tried to be the single source of truth for all technical questions. That model was always fragile&#8212;it just took a sufficient technological shock to reveal it.</p><p>The new ecosystem is messier. More distributed. Harder to search&#8212;though agentic coding infrastructure is changing that quickly. (I&#8217;d argue you could learn nearly as much from my <a href="https://github.com/bobmatnyc/mcp-skillset">mcp-skillset</a> as from Stack Overflow. Better organized, semantic search, also built from community contributions. Without the BS.) But it&#8217;s also more human, more specialized, and better matched to how people actually learn and communicate.</p><h2>The sentiment shift</h2><p>One last data point. The <a href="https://survey.stackoverflow.co/2025/">2025 Stack Overflow Developer Survey</a> showed 84% of developers using AI tools&#8212;but sentiment was mixed. Only 60% viewed AI positively, down from over 70%. Forty-six percent actively distrusted AI accuracy.</p><p>That was 2025. Since then, models like Claude Opus 4.5 have made the generative AI question moot. The accuracy concerns that fed developer skepticism are evaporating. When an AI tool can reliably write, debug, and ship production code, the &#8220;will I use this?&#8221; question becomes &#8220;how do I use this effectively?&#8221;</p><p>The holdouts are running out of reasons to hold out.</p><p>Stack Overflow&#8217;s collapse isn&#8217;t a tragedy. 
A platform that optimized for definitive answers got replaced by tools that provide them faster. The discussion, community, and deep expertise went elsewhere&#8212;to platforms that were better at providing those things all along.</p><p>Good riddance. The future is tools for answers and humans for judgment.</p><div><hr></div><p><em>I&#8217;m Bob Matsuoka, writing about agentic coding and AI-powered development at <a href="https://hyperdev.substack.com/">HyperDev</a>. For more on how AI is reshaping developer tools, read my analysis of <a href="https://hyperdev.matsuoka.com/p/your-ide-is-a-comfort-blanket">Your IDE Is a Comfort Blanket</a> or <a href="https://hyperdev.matsuoka.com/p/the-age-of-the-cli-part-1">The Age of the CLI</a>. Hat tip to <a href="https://www.linkedin.com/in/alexzoghlin/">Alex Zoghlin</a> for sharing Peter Coy&#8217;s article.</em></p>]]></content:encoded></item><item><title><![CDATA[Your IDE Is a Comfort Blanket]]></title><description><![CDATA[And It&#8217;s Smothering You]]></description><link>https://hyperdev.matsuoka.com/p/your-ide-is-a-comfort-blanket</link><guid isPermaLink="false">https://hyperdev.matsuoka.com/p/your-ide-is-a-comfort-blanket</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Tue, 10 Feb 2026 13:31:36 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Hkvt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf1a57-0044-47f2-b109-815975308f76_1184x864.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Hkvt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf1a57-0044-47f2-b109-815975308f76_1184x864.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!Hkvt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf1a57-0044-47f2-b109-815975308f76_1184x864.png 424w, https://substackcdn.com/image/fetch/$s_!Hkvt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf1a57-0044-47f2-b109-815975308f76_1184x864.png 848w, https://substackcdn.com/image/fetch/$s_!Hkvt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf1a57-0044-47f2-b109-815975308f76_1184x864.png 1272w, https://substackcdn.com/image/fetch/$s_!Hkvt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf1a57-0044-47f2-b109-815975308f76_1184x864.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Hkvt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf1a57-0044-47f2-b109-815975308f76_1184x864.png" width="1184" height="864" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1dcf1a57-0044-47f2-b109-815975308f76_1184x864.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:864,&quot;width&quot;:1184,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1600054,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/187322464?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf1a57-0044-47f2-b109-815975308f76_1184x864.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Hkvt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf1a57-0044-47f2-b109-815975308f76_1184x864.png 424w, https://substackcdn.com/image/fetch/$s_!Hkvt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf1a57-0044-47f2-b109-815975308f76_1184x864.png 848w, https://substackcdn.com/image/fetch/$s_!Hkvt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf1a57-0044-47f2-b109-815975308f76_1184x864.png 1272w, https://substackcdn.com/image/fetch/$s_!Hkvt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dcf1a57-0044-47f2-b109-815975308f76_1184x864.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" 
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">The IDE Comfort Blanket</figcaption></figure></div><h2>TL;DR</h2><p>&#8226; Traditional IDE features (autocomplete, debugging, refactoring) actually <em>help</em> developers learn, according to a controlled UC San Diego eye-tracking study. The IDE itself isn&#8217;t the villain.<br>&#8226; AI code generation layered on top of IDEs is producing the first measurable evidence of cognitive atrophy: 19% slower with AI tools while <em>believing</em> they were 20% faster (METR study), 41% more bugs (Uplevel), 30% more static analysis warnings (Carnegie Mellon).<br>&#8226; The perception-reality gap spans roughly 40 percentage points. Developers, external ML experts, and managers all predicted AI would speed things up. Everyone was wrong in the same direction.<br>&#8226; The deeper problem isn&#8217;t deskilling experienced developers. It&#8217;s &#8220;never-skilling&#8221; an entire generation whose learning period coincides with ubiquitous AI assistance. HackerRank reports lead developer hiring grew 22% YoY while entry-level grew only 7%.<br>&#8226; CLI-based agentic tools and Specification-Driven Development (SDD) / Ticket-Driven Development (TkDD) workflows force a different cognitive mode: thinking <em>before</em> coding, specifying intent, supervising execution. The IDE&#8217;s tight feedback loop encourages the opposite: accept, move on, don&#8217;t think too hard.</p><p>I&#8217;ve been thinking about something that keeps showing up in my conversations with CTOs and engineering leads. It usually sounds like this: &#8220;My senior engineers love Claude Code. 
My mid-level engineers refuse to leave Cursor. And my juniors can&#8217;t function without inline AI suggestions.&#8221;</p><p>That pattern bothered me enough to spend time researching the issue. What I found changes how I think about the CLI-vs-IDE debate. Turns out it isn&#8217;t about tool preferences or terminal elitism. The question is whether the comfortable, suggestion-rich environment of modern IDEs is actively undermining the cognitive skills that matter most in an era of agentic AI.</p><p>Here&#8217;s what the evidence says: it depends on which layer you&#8217;re talking about. And the distinction matters more than most developers realize.</p><div><hr></div><h2>The IDE didn&#8217;t make you dumb (but it set the stage)</h2><p>Let&#8217;s start with what might be a surprising finding. A <a href="https://cseweb.ucsd.edu/~mcoblenz/assets/pdf/fse24-autocomplete.pdf">2024 UC San Diego study</a> ran a between-subjects experiment with 32 programmers using an unfamiliar Gmail Java API. Participants with traditional autocomplete enabled scored significantly higher on post-study knowledge tests (mean ~38 vs ~32 points, p &#8776; 0.0079) and completed tasks 8.2% faster. The learning benefit was equivalent to roughly 7.2 years of programming experience.</p><p>That&#8217;s a big deal. Traditional autocomplete works like a searchable index. It presents options. You still choose, you still think. The study found autocomplete didn&#8217;t even reduce keystrokes significantly. Its value came from serving as an efficient information-delivery mechanism, cutting documentation reading time by 16 minutes.</p><p>So, to the surprise of this crotchety &#8220;learn-coding-before-the-IDE&#8221; developer, the IDE itself isn&#8217;t the problem. IntelliSense, syntax highlighting, integrated debugging, refactoring tools: these function as cognitive augmentation. They help you find information faster while you&#8217;re building understanding.
The <a href="https://dl.acm.org/doi/10.1145/3660765">ACM published this research</a> and the authors were careful to distinguish between these features and what came next.</p><p>The warning they included reads like prophecy now: &#8220;As AI-based autocomplete tools, such as Copilot, become more popular, it will be important to re-evaluate the learning implications, since these tools may reduce the cognitive involvement of programmers.&#8221;</p><p>That re-evaluation has arrived. And the results are ugly.</p><h2>AI code generation crosses the line</h2><p>James Prather&#8217;s research group ran <a href="https://juholeinonen.com/assets/pdf/prather2024widening.pdf">21 laboratory sessions with eye-tracking at ICER 2024</a>, studying how novice programmers interact with generative AI tools. They found something concerning: AI tools compound existing metacognitive difficulties and introduce entirely new failure modes they labeled Interruption, Mislead, and Progression.</p><p>Students with strong metacognitive skills benefited from AI assistance. Students who already struggled were pushed further behind, finishing with what the researchers called an &#8220;illusion of competence.&#8221; They believed they understood code they couldn&#8217;t reproduce independently. The gap between strong and weak learners <em>widened</em>. That&#8217;s the opposite of what educational tools are supposed to do.</p><p>A <a href="https://arxiv.org/html/2211.03622v3">Stanford security study</a> found developers using AI assistants wrote significantly less secure code and were simultaneously more confident it was secure. Let that sink in. Worse outcomes. Higher confidence.</p><p><a href="https://arxiv.org/abs/2310.02059">Research on Copilot-generated code in GitHub projects</a> found that 48%+ of AI-generated code contains security vulnerabilities. 
Actual CWEs in production repositories.</p><h2>The 40-point perception gap</h2><p>The single most striking finding comes from <a href="https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/">METR&#8217;s 2025 randomized controlled trial</a>. Sixteen experienced open-source developers tackled 246 real-world tasks on mature repositories (averaging 22,000+ stars and a million lines of code) using Cursor Pro with Claude 3.5/3.7 Sonnet.</p><p>The result: developers were <a href="https://www.infoworld.com/article/4020931/ai-coding-tools-can-slow-down-seasoned-developers-by-19.html">19% slower with AI tools</a>. Before starting, they predicted a 24% speedup. Afterward, they <em>still</em> believed they&#8217;d been 20% faster. External ML experts had predicted a 38% speedup. Everyone was wrong in the same direction, and the gap between perception and reality spanned roughly 40 percentage points.</p><p>As <a href="https://www.seangoedecke.com/impact-of-ai-study/">Sean Goedecke noted</a> in his analysis of the METR study, this wasn&#8217;t about bad tools or bad developers. It was about the cognitive overhead of evaluating, correcting, and integrating AI-generated code eating the time savings from not typing it yourself.</p><p>This pattern shows up everywhere you look. <a href="https://cloud.google.com/blog/products/devops-sre/announcing-the-2024-dora-report">Google&#8217;s DORA 2024 report</a>, surveying 39,000 professionals, found every 25% increase in AI adoption correlated with a 1.5% decrease in delivery throughput and a 7.2% decrease in delivery stability. Seventy-five percent of developers reported feeling more productive. 
The measurements said otherwise.</p><p>A <a href="https://x.com/rohanpaul_ai/status/2005012475360739624">Carnegie Mellon difference-in-differences study</a> of 807 GitHub repositories adopting Cursor found a transient velocity spike (3-4x more lines added in month one) followed by persistent quality degradation: static analysis warnings up 30%, code complexity up 41%. <a href="https://www.gitclear.com/ai_assistant_code_quality_2025_research">GitClear&#8217;s analysis of 211 million changed lines</a> found code duplication blocks increased eightfold during 2024 and refactoring declined from 25% to under 10% of changed lines.</p><p>More code. Worse code. And developers who thought they were crushing it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dCSj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff337d27e-8dc3-48dc-8c79-1b42ba0d0287_1184x864.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dCSj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff337d27e-8dc3-48dc-8c79-1b42ba0d0287_1184x864.png 424w, https://substackcdn.com/image/fetch/$s_!dCSj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff337d27e-8dc3-48dc-8c79-1b42ba0d0287_1184x864.png 848w, https://substackcdn.com/image/fetch/$s_!dCSj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff337d27e-8dc3-48dc-8c79-1b42ba0d0287_1184x864.png 1272w, 
https://substackcdn.com/image/fetch/$s_!dCSj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff337d27e-8dc3-48dc-8c79-1b42ba0d0287_1184x864.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dCSj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff337d27e-8dc3-48dc-8c79-1b42ba0d0287_1184x864.png" width="1184" height="864" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f337d27e-8dc3-48dc-8c79-1b42ba0d0287_1184x864.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:864,&quot;width&quot;:1184,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1137378,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/187322464?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff337d27e-8dc3-48dc-8c79-1b42ba0d0287_1184x864.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dCSj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff337d27e-8dc3-48dc-8c79-1b42ba0d0287_1184x864.png 424w, https://substackcdn.com/image/fetch/$s_!dCSj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff337d27e-8dc3-48dc-8c79-1b42ba0d0287_1184x864.png 848w, 
https://substackcdn.com/image/fetch/$s_!dCSj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff337d27e-8dc3-48dc-8c79-1b42ba0d0287_1184x864.png 1272w, https://substackcdn.com/image/fetch/$s_!dCSj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff337d27e-8dc3-48dc-8c79-1b42ba0d0287_1184x864.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Your brain on autocomplete</h2><p>The cognitive science explains the mechanism clearly enough. 
Betsy Sparrow&#8217;s landmark <a href="https://www.science.org/doi/10.1126/science.1207745">2011 </a><em><a href="https://www.science.org/doi/10.1126/science.1207745">Science</a></em><a href="https://www.science.org/doi/10.1126/science.1207745"> paper on &#8220;Google Effects on Memory&#8221;</a> demonstrated that when people expect future access to information, they have lower recall of the information itself but enhanced recall of <em>where to find it</em>. The internet becomes a transactive memory partner. You remember the path to knowledge, not the knowledge.</p><p>Applied to programming: developers who rely on autocomplete may remember that a method exists in the dropdown rather than what it does. For traditional autocomplete, the UC San Diego study suggests this tradeoff is acceptable or even beneficial. But AI code generation goes much further. It doesn&#8217;t just tell you what methods exist. It writes entire implementations you may never fully comprehend.</p><p>A <a href="https://www.mdpi.com/2075-4698/15/1/6">2025 study of 666 participants published in MDPI</a> found a significant negative correlation (r = &#8722;0.75) between frequent AI tool usage and critical thinking abilities, mediated by cognitive offloading. Younger participants (17-25) showed higher AI dependence and lower critical thinking scores. Higher education served as a protective buffer, but the feedback loop was clear: AI usage increases cognitive offloading, which reduces critical thinking, which increases AI dependency.</p><p>Robert Bjork&#8217;s concept of <a href="https://sites.edb.utexas.edu/slam/70-2/">&#8220;desirable difficulties&#8221;</a> provides the theoretical basis for why removing struggle from programming does harm. When you type code from memory instead of accepting a suggestion, you engage deeper encoding through the generation effect. When you debug manually instead of clicking a fix-it button, you build stronger mental schemas. 
Research on productive failure shows students who struggle before receiving instruction outperform those who receive direct instruction first.</p><p>A <a href="https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2026.1765692/abstract">2026 </a><em><a href="https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2026.1765692/abstract">Frontiers in Medicine</a></em><a href="https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2026.1765692/abstract"> paper</a> distinguishes between &#8220;deskilling&#8221; (losing existing abilities) and what they call &#8220;never-skilling.&#8221; That second concept is the one that keeps me up at night. An entire generation of developers whose foundational learning period coincides with ubiquitous AI assistance may never develop the mental schemas that experienced developers take for granted. The FAA now recommends more manual flying to counter autopilot-induced skill decay. Endoscopists using AI for polyp detection saw detection rates drop from 28% to 22% when AI was turned off. This pattern shows up in every profession where automation handles the thinking.</p><h2>DHH could feel it in his fingers</h2><p>The industry voices on this are converging, even from people you wouldn&#8217;t expect to agree.</p><p><a href="https://thenewstack.io/dhh-on-ai-vibe-coding-and-the-future-of-programming/">David Heinemeier Hansson</a>, the creator of Ruby on Rails, put it viscerally in his 2025 Lex Fridman interview: &#8220;I don&#8217;t let AI drive my code. I&#8217;ve tried that, I&#8217;ve tried the Cursors and the Windsurfs, and I don&#8217;t enjoy that way of writing. I can literally feel competence draining out of my fingers.&#8221; He keeps AI output in a separate window to prevent passive consumption and insists on doing the typing himself because &#8220;you learn with your fingers.&#8221;</p><p>His specific example stuck with me. 
He discovered he was repeatedly asking AI for the same Bash conditional syntax. By not typing it, he wasn&#8217;t learning it. His analogy: &#8220;You&#8217;re not going to get fit by watching fitness videos. You have to do the sit-ups.&#8221;</p><p><a href="https://en.wikipedia.org/wiki/Casey_Muratori">Casey Muratori</a>, whose &#8220;Clean Code, Horrible Performance&#8221; video demonstrated up to 15x performance penalties from IDE-friendly abstraction patterns, argues that modern development practices actively pessimize software by hiding how CPUs actually work. His Performance-Aware Programming course exists explicitly to teach what IDE abstractions conceal. <a href="https://en.wikipedia.org/wiki/Jonathan_Blow">Jonathan Blow&#8217;s 2019 talk &#8220;Preventing the Collapse of Civilization&#8221;</a> frames the issue in starker terms: each generation of developers inherits diluted knowledge, and the accumulated abstraction layers represent civilizational risk.</p><p>The pragmatic middle ground comes from developers like <a href="https://frontendmasters.com/teachers/the-primeagen/">ThePrimeagen</a>, who built &#8220;99,&#8221; a Neovim AI plugin explicitly designed for &#8220;people without skill issues,&#8221; deliberately restricting AI to specific, developer-controlled areas rather than giving it full autonomy. His philosophy: AI assists, it doesn&#8217;t replace. 
Zed Shaw, author of the &#8220;Learn Code the Hard Way&#8221; series, <a href="https://news.ycombinator.com/item?id=8635884">advises beginners to avoid IDEs entirely during initial learning</a>: &#8220;If you take the easy tool-based route, then you&#8217;re dependent on the tool you use.&#8221;</p><p>And <a href="https://news.ycombinator.com/item?id=8635884">HN being HN</a>, one commenter nailed the practical consequence: &#8220;If I had a Bitcoin for every IDE superstar programmer who couldn&#8217;t navigate his way around the build system, I wouldn&#8217;t have to write software for a living.&#8221;</p><h2>The survey data tells the generational story</h2><p>The <a href="https://survey.stackoverflow.co/2025/ai">Stack Overflow 2025 Developer Survey</a> (49,000+ respondents) quantifies the split. Early-career developers use AI tools daily at 55.5% compared to 47.3% for developers with 10+ years of experience. More significantly, <a href="https://survey.stackoverflow.co/2025">46% of developers now actively distrust AI output</a>, exceeding the 33% who trust it. Only 3% report high trust. Positive sentiment toward AI tools has declined from over 70% in 2023 to 60% in 2025. Experienced developers are the most skeptical: lowest &#8220;highly trust&#8221; rate (2.6%), highest &#8220;highly distrust&#8221; rate (20%).</p><p><a href="https://www.hackerrank.com/reports/developer-skills-report-2025">HackerRank&#8217;s 2025 report</a>, drawing on 26 million developers and 3 million assessments, reveals the hiring consequence: lead developer hiring grew 22% year-over-year while entry-level hiring grew only 7%. The report explicitly cites employer concerns about whether early-career developers can code without heavy AI assistance. 
Stanford data shows employment among software developers aged 22-25 fell nearly 20% between 2022 and 2025.</p><p>Meanwhile, <a href="https://blog.jetbrains.com/research/2025/10/state-of-developer-ecosystem-2025/">JetBrains&#8217; 2025 State of Developer Ecosystem</a> found 68% of developers expect AI proficiency to become a job requirement, and <a href="https://github.blog/news-insights/octoverse/octoverse-a-new-developer-joins-github-every-second-as-ai-leads-typescript-to-1/">GitHub&#8217;s Octoverse 2025</a> reports that nearly 80% of new developers use Copilot within their first week. AI-assisted coding is the default learning environment for an entire generation, and we have precisely zero longitudinal studies on what that does to skill development.</p><p>Remember <a href="https://en.wikipedia.org/wiki/Vibe_coding">&#8220;vibe coding&#8221;</a>? <a href="https://x.com/karpathy/status/1886192184808149383?lang=en">Andrej Karpathy coined it</a> in February 2025: &#8220;fully give in to the vibes, embrace exponentials, and forget that the code even exists.&#8221; The <a href="https://stackoverflow.blog/2025/12/29/developers-remain-willing-but-reluctant-to-use-ai-the-2025-developer-survey-results-are-here/">Stack Overflow survey found 72% of developers say vibe coding plays no role in their professional work</a>. 
<a href="https://futurism.com/artificial-intelligence/inventor-vibe-coding-doesnt-work">Karpathy himself retreated</a>, admitting his &#8220;Nanochat&#8221; project was &#8220;basically entirely hand-written&#8221; because AI agents &#8220;just didn&#8217;t work well enough.&#8221;</p><h2>The abstraction argument is older than you think (and this time it&#8217;s different)</h2><p><a href="https://www.joelonsoftware.com/2002/11/">Joel Spolsky&#8217;s 2002 &#8220;Law of Leaky Abstractions&#8221;</a> remains the foundational text: &#8220;All non-trivial abstractions, to some degree, are leaky.&#8221; His most underappreciated line: &#8220;Abstractions save us time working, but they don&#8217;t save us time learning.&#8221;</p><p>Every abstraction boundary in computing history has produced this same anxiety. When <a href="https://thehistory.tech/first-fortran-program-1957/">FORTRAN arrived in 1957</a>, assembly programmers viewed it as a crutch. <a href="http://www.pbm.com/~lindahl/real.programmers.html">Ed Post&#8217;s satirical 1983 essay &#8220;Real Programmers Don&#8217;t Use Pascal&#8221;</a> captured the gatekeeping pattern so precisely it reads as prophecy of today&#8217;s Vim-vs-IDE debates. Each transition involved genuine loss of low-level understanding, genuine productivity gains, gatekeeping rhetoric from incumbents, and eventual normalization.</p><p>But the calculator analogy that defenders of AI coding tools love to invoke breaks down on closer inspection. <a href="https://www.semanticscholar.org/paper/A-Meta-Analysis-of-the-Effects-of-Calculators-on-in-Ellington/583aa0edf4abe9a09f4df46b87620f42b2d59f54">A 2003 meta-analysis of 54 studies</a> found calculator use did not hinder mathematical skill development. 
However, as computing education researcher <a href="https://medium.com/bits-and-behavior/more-than-calculators-why-large-language-models-threaten-public-education-480dd5300939">Amy J. Ko argues</a>, LLMs differ because they replace entire cognitive processes, not just computation. Calculators don&#8217;t hallucinate. They don&#8217;t generate plausible-but-wrong solutions. The better analogy would be handheld calculators that routinely display 2+2=5 with complete confidence.</p><p>The strongest counterargument is that every prior generation of abstraction-skeptics was ultimately wrong. The FORTRAN skeptics lost. The structured programming skeptics lost. IDE skeptics, broadly, lost. <a href="https://lp.jetbrains.com/cs-learning-curve-report-2024/">JetBrains&#8217; learning curve survey</a> found learners who use IDEs encounter fewer obstacles, get stuck less often, and handle version control more easily. But the question isn&#8217;t whether IDEs helped. It&#8217;s whether AI code generation is another step on the same escalator or a qualitatively different kind of abstraction that crosses from augmenting cognition to replacing it.</p><p>Based on what I&#8217;ve seen in the research, I think it&#8217;s the latter. And I think the IDE is where developers are most likely to experience this crossing without noticing it.</p><h2>Why this matters for CLI and specification-driven workflows</h2><p>Here&#8217;s where this connects to the broader argument I&#8217;ve been building about CLI-based agentic tools and the shift toward <a href="https://hyperdev.matsuoka.com/p/the-other-shoe-will-drop">Specification-Driven Development (SDD)</a>.</p><p>IDE-integrated AI tools optimize for the wrong cognitive mode. They sit <em>inside</em> your editor, constantly suggesting, constantly completing, making it effortless to accept code you haven&#8217;t thought through.
The tight feedback loop that makes IDEs feel productive is the same loop that enables the accept-and-move-on behavior the research keeps flagging. The entire UX is designed to keep you writing code faster, not thinking about code more carefully.</p><p>CLI-based agentic tools and SDD/TkDD workflows force a different cognitive mode entirely. When you&#8217;re working with Claude Code from the terminal, you can&#8217;t just tab-accept a suggestion mid-line. You have to think about what you want before you ask for it. You write specifications. You decompose tasks into tickets. You define acceptance criteria. Then you supervise execution and review results.</p><p>This is where I see the distinction between SDD and what I call <a href="https://hyperdev.matsuoka.com/">Ticket-Driven Development (TkDD)</a>. SDD gives you the strategic framework: specs are the primary artifact, not code. TkDD gives you the <em>workflow mechanics</em>: every unit of work lives in a ticket that captures not just the requirement but the evolution of thinking during human-AI collaboration. The ticket becomes the forcing function that makes you articulate intent before the agent writes a single line.</p><p>I use <a href="https://github.com/bobmatnyc/mcp-ticketer">mcp-ticketer</a> for this daily. Before an agent touches code, I&#8217;ve written a ticket that specifies what I want, why I want it, what the constraints are, and how I&#8217;ll know it&#8217;s done. That process of articulation is exactly the &#8220;desirable difficulty&#8221; that Bjork&#8217;s research says builds deeper understanding. It&#8217;s the sit-ups DHH was talking about. The IDE workflow lets you skip the sit-ups. TkDD makes you do them.</p><p>Think about Boris Cherny&#8217;s workflow. The Anthropic staff engineer who created Claude Code uses Plan mode to iterate on architecture until satisfied, then switches to auto-accept mode where Claude &#8220;can usually 1-shot it.&#8221; He runs 10-15 concurrent sessions. 
That&#8217;s not editing code in an IDE. That&#8217;s specifying, supervising, and reviewing. The cognitive work happens <em>before</em> the code exists, not <em>while</em> it&#8217;s being suggested to you inline.</p><p>Multi-agent orchestration amplifies this effect. When you&#8217;re running five Claude Code instances across git worktrees through tmux, your job is architectural thinking and quality review. There&#8217;s no autocomplete to accept. There&#8217;s no inline suggestion to wave through. You&#8217;re operating at the specification and supervision layer, which is precisely the cognitive level the research says matters most for building and maintaining deep understanding.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qAwH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c61f293-9bb6-4bee-ad43-6edb5650cc5c_1184x864.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qAwH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c61f293-9bb6-4bee-ad43-6edb5650cc5c_1184x864.png 424w, https://substackcdn.com/image/fetch/$s_!qAwH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c61f293-9bb6-4bee-ad43-6edb5650cc5c_1184x864.png 848w, https://substackcdn.com/image/fetch/$s_!qAwH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c61f293-9bb6-4bee-ad43-6edb5650cc5c_1184x864.png 1272w, 
https://substackcdn.com/image/fetch/$s_!qAwH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c61f293-9bb6-4bee-ad43-6edb5650cc5c_1184x864.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qAwH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c61f293-9bb6-4bee-ad43-6edb5650cc5c_1184x864.png" width="1184" height="864" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2c61f293-9bb6-4bee-ad43-6edb5650cc5c_1184x864.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:864,&quot;width&quot;:1184,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1481971,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/187322464?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c61f293-9bb6-4bee-ad43-6edb5650cc5c_1184x864.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qAwH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c61f293-9bb6-4bee-ad43-6edb5650cc5c_1184x864.png 424w, https://substackcdn.com/image/fetch/$s_!qAwH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c61f293-9bb6-4bee-ad43-6edb5650cc5c_1184x864.png 848w, 
https://substackcdn.com/image/fetch/$s_!qAwH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c61f293-9bb6-4bee-ad43-6edb5650cc5c_1184x864.png 1272w, https://substackcdn.com/image/fetch/$s_!qAwH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c61f293-9bb6-4bee-ad43-6edb5650cc5c_1184x864.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>The line between tool and crutch</h2><p>The evidence doesn&#8217;t support the claim that IDEs themselves have made software 
engineers less capable. Traditional IDE features function as cognitive augmentation that helps developers access information faster while building understanding.</p><p>What the evidence <em>does</em> support, with increasing conviction, is that AI code generation represents a qualitative break from prior tooling. The convergence of findings (19% slowdown masked by perceived speedup, 41% increase in bugs, 30% rise in static analysis warnings, a &#8722;0.75 correlation between AI usage and critical thinking, and the widening gap between strong and weak learners) points to a tool category that degrades understanding while creating an illusion of competence.</p><p>The danger isn&#8217;t that experienced developers will forget how to code. It&#8217;s that the next generation won&#8217;t learn how to think about code at a level deeper than &#8220;accept suggestion.&#8221; IDEs don&#8217;t cause this problem, but they&#8217;re the delivery mechanism. The inline, always-on, friction-free suggestion environment of modern IDE-based AI is precisely optimized to bypass the cognitive processes that build expertise.</p><p>CLI-based agentic workflows and the SDD/TkDD paradigm aren&#8217;t just different tools. They&#8217;re different cognitive modes. They require you to think before you prompt, specify before you execute, and review with genuine comprehension rather than a quick scan of inline diffs.</p><p>That&#8217;s not terminal elitism. That&#8217;s responding to what the research actually shows: the developers who&#8217;ll thrive in the next phase are the ones who can think at the level of specifications, architecture, and agent supervision. 
Not the ones who got really fast at pressing Tab.</p><div><hr></div><p><em>I&#8217;m Bob Matsuoka, writing about agentic coding and AI-powered development at <a href="https://hyperdev.substack.com/">HyperDev</a>.</em></p><p><em><strong>Related reading:</strong></em></p><ul><li><p><em><a href="https://hyperdev.matsuoka.com/p/the-other-shoe-will-drop">The Other Shoe Will Drop</a> &#8212; The economics of AI-assisted development and specification-driven workflows</em></p></li><li><p><em><a href="https://hyperdev.matsuoka.com/p/whats-in-my-toolkit-august-2025">What&#8217;s In My Toolkit: August 2025</a> &#8212; My daily CLI-based agentic workflow</em></p></li><li><p><em><a href="https://hyperdev.matsuoka.com/p/tkdd-ticket-driven-development-and">TkDD: Ticket-Driven Development</a> &#8212; Why tickets are the forcing function for AI collaboration</em></p></li><li><p><em><a href="https://hyperdev.matsuoka.com/p/the-age-of-the-cli-part-2">The Age of the CLI, Part 2</a> &#8212; From nanny coding to fire-and-check-in</em></p></li></ul>]]></content:encoded></item><item><title><![CDATA[Breaking: Opus 4.6 and Agent Teams]]></title><description><![CDATA[Anthropic&#8217;s Multi-Agent Coding Push]]></description><link>https://hyperdev.matsuoka.com/p/article-opus-46-and-agent-teams</link><guid isPermaLink="false">https://hyperdev.matsuoka.com/p/article-opus-46-and-agent-teams</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Thu, 05 Feb 2026 23:56:47 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!4JC8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F761ca7a3-eb03-40fa-8afa-0a37a6789fec_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!4JC8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F761ca7a3-eb03-40fa-8afa-0a37a6789fec_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4JC8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F761ca7a3-eb03-40fa-8afa-0a37a6789fec_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!4JC8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F761ca7a3-eb03-40fa-8afa-0a37a6789fec_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!4JC8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F761ca7a3-eb03-40fa-8afa-0a37a6789fec_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!4JC8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F761ca7a3-eb03-40fa-8afa-0a37a6789fec_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4JC8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F761ca7a3-eb03-40fa-8afa-0a37a6789fec_1536x1024.png" width="1456" height="971" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/761ca7a3-eb03-40fa-8afa-0a37a6789fec_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2535546,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/187036881?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F761ca7a3-eb03-40fa-8afa-0a37a6789fec_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4JC8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F761ca7a3-eb03-40fa-8afa-0a37a6789fec_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!4JC8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F761ca7a3-eb03-40fa-8afa-0a37a6789fec_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!4JC8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F761ca7a3-eb03-40fa-8afa-0a37a6789fec_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!4JC8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F761ca7a3-eb03-40fa-8afa-0a37a6789fec_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Anthropic dropped Opus 4.6 today (February 5, 2026), and buried in the announcement is what could be the most significant development in agentic coding since Claude Code launched: <strong>Agent Teams</strong>.</p><p>Multiple Claude instances working in parallel on a shared codebase, coordinating autonomously, with no active human intervention.</p><p>I haven&#8217;t tested it yet. 
But based on what Anthropic published today, this expands the ceiling of what&#8217;s possible with AI-powered development.</p><h2>What Agent Teams Actually Is</h2><p>Agent Teams is a research preview feature in Claude Code that lets you run multiple Claude instances simultaneously, each working on different aspects of a project.</p><p>The architecture (<a href="https://code.claude.com/docs/en/agent-teams">official docs</a>):</p><ul><li><p><strong>One lead session</strong> coordinates the work</p></li><li><p><strong>Multiple agent instances</strong> run independently with their own context windows</p></li><li><p><strong>Shared task list</strong> that agents can assign themselves work from</p></li><li><p><strong>Direct agent-to-agent communication</strong> for coordination</p></li><li><p><strong>Parallel execution</strong> on read-heavy tasks like codebase reviews</p></li></ul><p>Enable it with: <code>CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1</code></p><p>This isn&#8217;t subagents (which work within a single session and return results to the parent). 
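</p><p>A rough way to picture the shared-task-list mechanics: the lead decomposes work into a queue, and each agent instance claims the next open task until the queue drains. The sketch below is my own illustration in Python, using threads as stand-ins for agent sessions; none of these names come from Anthropic&#8217;s API.</p><pre><code>import queue
import threading

# Hypothetical sketch of the Agent Teams pattern: a lead session posts
# tasks to a shared list; independent agents self-assign work from it.
tasks = queue.Queue()
results = []
results_lock = threading.Lock()

def agent(name):
    while True:
        try:
            task = tasks.get_nowait()  # self-assign the next open task
        except queue.Empty:
            return  # nothing left; this session winds down
        outcome = f"{name} finished {task}"  # a real agent would edit code here
        with results_lock:
            results.append(outcome)
        tasks.task_done()

# Lead session: decompose the project into parallelizable tasks.
for t in ["review frontend", "review API", "check tests", "update docs"]:
    tasks.put(t)

workers = [threading.Thread(target=agent, args=(f"agent-{i}",)) for i in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(len(results))  # 4: every task claimed exactly once across three agents
</code></pre><p>The load-bearing property is that the task list, not the lead, mediates assignment: agents pull work rather than waiting to be pushed it, which is what makes read-heavy parallel cases cheap to coordinate.</p><p>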
These are independent Claude Code sessions that can communicate and coordinate directly.</p><h2>The C Compiler Stress Test</h2><p>Before releasing Agent Teams publicly, Anthropic researcher Nicholas Carlini (<a href="https://www.anthropic.com/engineering/building-c-compiler">published writeup</a>) stress-tested the system by tasking 16 agents with building a C compiler from scratch, capable of compiling the Linux kernel.</p><p>The results:</p><ul><li><p><strong>Nearly 2,000 Claude Code sessions</strong> over two weeks</p></li><li><p><strong>2 billion input tokens</strong> and 140 million output tokens consumed</p></li><li><p><strong>$20,000 in API costs</strong></p></li><li><p><strong>100,000 lines of Rust code</strong> produced</p></li><li><p><strong>Successfully compiles Linux 6.9</strong> on x86, ARM, and RISC-V</p></li><li><p><strong>99% pass rate</strong> on most compiler test suites including GCC torture tests</p></li><li><p><strong>Can compile and run Doom</strong> (the ultimate developer litmus test)</p></li></ul><p>Carlini describes this as a &#8220;clean-room implementation&#8221;: Claude had no internet access, only the Rust standard library.</p><p>The compiler has limitations (it can&#8217;t handle 16-bit x86 real mode and cheats by calling GCC for that phase). But the point isn&#8217;t whether the compiler is production-ready. The point is <strong>16 AI agents autonomously built a 100,000-line compiler that actually works</strong>.</p><p>That&#8217;s a different order of magnitude than &#8220;Claude helped me refactor a module.&#8221;</p><h2>What Changed in Opus 4.6</h2><p>Agent Teams is the headline, but Opus 4.6 includes several major upgrades:</p><p><strong>1M Token Context Window</strong></p><p>First time for Opus-class models. Not just &#8220;more tokens&#8221;: the retrieval quality matters more than the capacity. 
On MRCR v2 (finding specific information buried in massive context):</p><ul><li><p>Opus 4.6: 76.0%</p></li><li><p>Sonnet 4.5: 18.5%</p></li></ul><p><strong>Context Compaction</strong></p><p>For long-running sessions, the model automatically summarizes older conversation turns to free up context space. Like git squash for conversation history: it keeps summaries of earlier work and full detail on recent turns.</p><p><strong>Adaptive Thinking with Effort Controls</strong></p><p>Four settings: low, medium, high (default), max. The model adjusts reasoning depth based on task complexity. Trade latency and cost for quality when you need it; run faster on simpler tasks.</p><p><strong>Same Pricing</strong></p><p>$5 input / $25 output per million tokens. Identical to Opus 4.5. (Premium pricing of $10/$37.50 applies to requests over 200K tokens using the full 1M context.)</p><h2>Available Now, Everywhere</h2><p>This isn&#8217;t an announcement with a waitlist. Opus 4.6 is live:</p><ul><li><p>Claude.ai web interface</p></li><li><p>Claude API (<code>claude-opus-4-6</code>)</p></li><li><p>Claude Code (with Agent Teams in research preview)</p></li><li><p>GitHub Copilot (gradual rollout)</p></li><li><p>AWS Bedrock</p></li><li><p>Major cloud platforms</p></li></ul><p>That&#8217;s unusual. Most AI releases do staged rollouts. 
Anthropic shipped everywhere simultaneously.</p><h2>The Benchmarks (With Usual Caveats)</h2><p>Anthropic claims (<a href="https://www.anthropic.com/claude/opus">official Opus page</a>) Opus 4.6 leads or matches competitors across most benchmark categories:</p><ul><li><p><strong>Terminal-Bench 2.0</strong> (agentic coding): 65.4%, Anthropic&#8217;s highest score</p></li><li><p><strong>GDPval-AA</strong> (real-world professional tasks): 144 Elo points ahead of GPT-5.2</p></li><li><p><strong>Humanity&#8217;s Last Exam</strong> (complex reasoning): leads all frontier models</p></li><li><p><strong>BigLaw Bench</strong>: 90.2%, the highest score from any Claude model</p></li><li><p><strong>Zero-day vulnerability discovery</strong>: 500+ previously unknown high-severity vulnerabilities found in open-source code</p></li></ul><p>Standard disclaimer: these are Anthropic&#8217;s own benchmarks on specific test sets, and real-world performance varies.</p><p>Worth noting: OpenAI released GPT-5.3 Codex 27 minutes after Opus 4.6&#8217;s announcement, claiming 77.3% on Terminal-Bench 2.0. The benchmark lead didn&#8217;t last half an hour.</p><h2>What This Could Mean for Agentic Development</h2><p>If Agent Teams works as described, it could change the fundamental economics of AI-assisted development.</p><p><strong>Before</strong>: One agent, sequential processing. Ask it to review a PR and it goes file by file.</p><p><strong>After</strong>: Multiple agents working in parallel. One reviews the frontend, one reviews the API, one checks tests, one updates documentation, all simultaneously.</p><p>The cost structure changes too. Agent Teams bills each instance separately, so you&#8217;re paying for multiple concurrent sessions. But if three agents working in parallel complete a task in one-third the time, the token economics might still favor the parallel approach.</p><p><strong>The real question: Does coordination overhead eat the parallel gains?</strong></p><p>With human teams, adding developers to a project doesn&#8217;t scale linearly: coordination costs increase. 
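</p><p>One way to frame that question: treat coordination as a per-agent tax on the parallel speedup. The numbers below are invented for illustration; this is a back-of-envelope model, not measured data:</p><pre><code>def team_time(total_work, n_agents, coord_cost):
    """Toy model: work splits evenly, but each extra agent adds overhead."""
    return total_work / n_agents + coord_cost * (n_agents - 1)

T = 90.0  # minutes of work for one agent running sequentially
for c in (2.0, 10.0, 30.0):  # per-agent coordination cost, low to high
    print(c, [round(team_time(T, n, c), 1) for n in (1, 3, 5)])

# c = 2.0:  [90.0, 34.0, 26.0]  -&gt; cheap coordination, parallel wins big
# c = 10.0: [90.0, 50.0, 58.0]  -&gt; gains shrink; 5 agents do worse than 3
# c = 30.0: [90.0, 90.0, 138.0] -&gt; overhead eats the speedup entirely
</code></pre><p>Whether real agent teams sit closer to the cheap-coordination row or the expensive one is exactly the open question.</p><p>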
Brooks&#8217; Law: &#8220;Adding manpower to a late software project makes it later.&#8221;</p><p>Do AI agent teams suffer the same coordination penalties, or do they coordinate more efficiently than humans?</p><p>I don&#8217;t know. I haven&#8217;t tested it yet.</p><h2>The OpenAI Response</h2><p>Twenty-seven minutes after Anthropic announced Opus 4.6, OpenAI released GPT-5.3 Codex, a specialized developer-focused model.</p><p>The timing is either competitive coincidence or coordinated counter-programming. Either way, it signals where both companies see the competitive battlefield: <strong>autonomous, multi-agent coding workflows</strong>.</p><p>This isn&#8217;t about which model writes better individual functions. It&#8217;s about which platform enables teams of AI agents to autonomously execute complex, multi-day software projects.</p><h2>Security and Safety Considerations</h2><p>Anthropic published a system card claiming Opus 4.6 has low rates of harmful behaviors and the lowest over-refusal rates of any recent Claude model.</p><p>The cybersecurity implications cut both ways. Opus 4.6 found 500+ previously unknown high-severity vulnerabilities (<a href="https://www.axios.com/2026/02/05/anthropic-claude-opus-46-software-hunting">Axios reporting</a>) in open-source code, which is excellent for defenders trying to secure their software. It&#8217;s also concerning for what adversaries could do with that capability.</p><p>Anthropic developed six new cybersecurity probes to detect potentially harmful uses and implemented real-time detection tools to block suspected malicious traffic. They acknowledge this &#8220;will create friction for legitimate research and some defensive work.&#8221;</p><h2>What I&#8217;m Testing Next</h2><p>I haven&#8217;t used Agent Teams yet. But here&#8217;s what I want to find out:</p><p><strong>1. 
Coordination efficiency</strong></p><p>Do agent teams actually complete complex tasks faster than sequential agents, or does coordination overhead cancel the parallel gains?</p><p><strong>2. Cost dynamics</strong></p><p>At what task complexity does paying for multiple concurrent sessions become economically viable compared to one longer session?</p><p><strong>3. Integration with Claude-MPM</strong></p><p>I already orchestrate multiple specialized agents through Claude-MPM. How does Agent Teams interact with or replace that orchestration layer?</p><p><strong>4. Task decomposition quality</strong></p><p>How well does the lead agent break down complex work into parallelizable subtasks? Does it create artificial dependencies or find genuine parallelism?</p><p><strong>5. Conflict resolution</strong></p><p>What happens when multiple agents need to modify the same files? How does the system handle merge conflicts and coordination failures?</p><p>I&#8217;ll report back once I&#8217;ve actually run Agent Teams on real projects.</p><h2>The Bottom Line</h2><p>Opus 4.6 represents a major capability expansion in agentic coding. Agent Teams raises the ceiling on what&#8217;s possible: from &#8220;AI helps me code&#8221; to &#8220;AI agent teams autonomously execute multi-week software projects.&#8221;</p><p>Whether it actually delivers on that potential in production environments remains to be seen. The C compiler demonstration is impressive, but building a greenfield compiler is different from maintaining a legacy enterprise codebase with 15 years of technical debt.</p><p>The real test: Can Agent Teams handle the messy, ambiguous, poorly documented, politically fraught reality of actual software development? Or does it only shine on clean-room projects with clear specifications?</p><p>I&#8217;ll find out. But the potential is clear: we&#8217;ve moved from individual AI coding assistants to coordinated AI development teams.</p><p>This release is not an incremental improvement. 
This could be a significant shift in how autonomous development works.</p><div><hr></div><p><em>I&#8217;m Bob Matsuoka, writing about agentic coding and AI-powered development at <a href="https://hyperdev.substack.com/">HyperDev</a>.</em></p>]]></content:encoded></item><item><title><![CDATA[I Built a Coding Tool. Then I Used It to Onboard as CTO]]></title><description><![CDATA[How To Avoid Playing The &#8220;New Guy&#8221; Card]]></description><link>https://hyperdev.matsuoka.com/p/i-built-a-coding-tool-then-i-used</link><guid isPermaLink="false">https://hyperdev.matsuoka.com/p/i-built-a-coding-tool-then-i-used</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Fri, 30 Jan 2026 13:32:01 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!h_VR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f474f-33f6-480d-9a3c-bd343341f1af_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!h_VR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f474f-33f6-480d-9a3c-bd343341f1af_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!h_VR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f474f-33f6-480d-9a3c-bd343341f1af_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!h_VR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f474f-33f6-480d-9a3c-bd343341f1af_1024x1024.png 848w, 
https://substackcdn.com/image/fetch/$s_!h_VR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f474f-33f6-480d-9a3c-bd343341f1af_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!h_VR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f474f-33f6-480d-9a3c-bd343341f1af_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!h_VR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f474f-33f6-480d-9a3c-bd343341f1af_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f52f474f-33f6-480d-9a3c-bd343341f1af_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1525080,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/185540856?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f474f-33f6-480d-9a3c-bd343341f1af_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!h_VR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f474f-33f6-480d-9a3c-bd343341f1af_1024x1024.png 424w, 
https://substackcdn.com/image/fetch/$s_!h_VR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f474f-33f6-480d-9a3c-bd343341f1af_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!h_VR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f474f-33f6-480d-9a3c-bd343341f1af_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!h_VR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff52f474f-33f6-480d-9a3c-bd343341f1af_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I built <a href="https://github.com/bobmatnyc/claude-mpm">claude-mpm</a> to orchestrate multiple Claude Code agents for complex software development. Parallel execution, specialized agent roles, persistent memory across sessions&#8212;the whole multi-agent playbook.</p><p>Then I got hired as CTO of a 150-person R&amp;D organization. And the first thing I did wasn&#8217;t write code.  It was to point my coding orchestration framework at a completely different problem: understanding the organization I was about to lead.</p><h2>The Traditional Onboarding Grind</h2><p>Here&#8217;s what CTO onboarding usually looks like:</p><p><strong>Months 1-2:</strong> You do a &#8220;listening tour.&#8221; Fifty to a hundred one-on-ones with engineers, managers, product leads. Read whatever documentation exists (outdated). Attend sprint reviews and architecture meetings. Try to figure out who does what.</p><p><strong>Months 3-4:</strong> Patterns start emerging. You notice recurring complaints in conversations. You start to see disconnects between stated priorities and actual work. You form hypotheses.</p><p><strong>Months 5-6:</strong> You finally synthesize enough to make preliminary recommendations. Often conservative ones, because you still don&#8217;t have complete context. You&#8217;re still discovering things you didn&#8217;t know you didn&#8217;t know.</p><p>After six months, a typical new CTO has good qualitative intuition but limited quantitative backing. Many insights remain anecdotal. Critical data&#8212;exact staffing costs, work distribution, technical debt concentration&#8212;might still be missing.</p><p>I had two weeks before my start date. 
And I had a tool designed for parallel analysis of complex technical systems.</p><h2>The Experiment</h2><p>The organization had granted me early access to their systems: GitHub, JIRA, Slack, Confluence, budget spreadsheets. Standard pre-start due diligence stuff.</p><p>I started with Claude.AI and Cowork&#8212;Anthropic&#8217;s desktop tools for non-developers. Asked questions, got answers, generated some useful analysis documents. It worked. I could pull insights from individual data sources, get summaries, ask follow-up questions.</p><p>But the volume wasn&#8217;t there. I&#8217;d get a handful of documents, maybe a dozen useful artifacts across a few days of work. Decent for casual exploration. Not enough to actually understand a 150-person organization.</p><p>Then I pointed claude-mpm at the same data sources.</p><p>Part of what made this possible: I&#8217;d recently added MCP Google Workspace integration to the framework. That meant agents could pull directly from Drive, Sheets, Docs&#8212;wherever the organization kept its data. Budget spreadsheets, org charts, planning documents, historical analyses. All queryable through the same orchestration layer that handles code repositories.</p><p>The difference wasn&#8217;t 2x or 5x. It was closer to 10x. Maybe more. The CLI-based orchestration could spawn specialized agents, run them in parallel, maintain persistent memory across sessions, and churn through analysis while I was doing other things. What took hours of back-and-forth in a chat interface happened in minutes with coordinated agents.</p><p>The collaboration model looked like this:</p><pre><code><code>HUMAN ROLE                      AI ROLE                         
Ask strategic questions    &#8594;    Parse 27K commits from 141 repos
Provide business context   &#8594;    Classify work by type           
Validate findings          &#8594;    Identify anomalies              
Make decisions             &#8594;    Generate analysis artifacts     
</code></code></pre><p>Every analysis started with a human question. &#8220;Who are the critical contributors?&#8221; &#8220;What is the team actually working on versus what&#8217;s budgeted?&#8221; &#8220;Where is technical debt concentrated?&#8221; AI didn&#8217;t guess what I needed&#8212;it answered what I asked.</p><p>Then I&#8217;d look at the answers and ask follow-up questions. Iterate. Triangulate across data sources. Challenge assumptions with data.</p><h2>What One Week of Human-AI Collaboration Produced</h2><p>The actual research took about a week. I invested maybe 5 hours of hands-on time. The AI agents collectively ran the equivalent of 500+ hours of analysis.</p><p>Five hours. Let that sink in.</p><p>The output:</p><p><strong>267 markdown documents</strong> spanning people, processes, code, and strategy. Not summaries&#8212;deep analyses with specific evidence.</p><p><strong>A 15MB database</strong> containing 27,343 classified commits from 141 repositories. Every commit tagged by work type: feature development, bug fixes, maintenance, refactoring.</p><p><strong>A 620KB knowledge database</strong> integrating budget data, org charts, and work attribution. Who&#8217;s working on what? How does that compare to budget allocation?</p><p><strong>An interactive web platform</strong> for exploring the data. &#8220;Show me everyone who touched the authentication system in the last year.&#8221; &#8220;Which repositories haven&#8217;t had commits in six months?&#8221;</p><p><strong>Discoveries</strong> that would have taken months to surface organically. The kind of things that don&#8217;t come up in one-on-ones&#8212;gaps between budgeted priorities and actual work allocation, security hygiene issues nobody had noticed, concentration risk in the codebase. None of these came from AI having opinions.
They came from AI parsing data at scale while I asked increasingly pointed questions.</p><h2>The Insight: Orchestration Isn&#8217;t About Code</h2><p>Here&#8217;s what surprised me: claude-mpm worked better for organizational analysis than I expected. Not because it&#8217;s designed for that use case&#8212;it isn&#8217;t. Because the underlying pattern is the same.</p><p>Multi-agent orchestration solves a specific problem: complex work requiring synthesis across multiple information sources, where no single context window can hold everything, and where specialized approaches to different sub-problems produce better results than one generalist.</p><p>That describes software development. It also describes:</p><ul><li><p>Due diligence on acquisitions</p></li><li><p>Market research synthesis</p></li><li><p>Competitive intelligence</p></li><li><p>Academic literature reviews</p></li><li><p>Legal discovery</p></li><li><p>Any knowledge work requiring multi-source analysis at scale</p></li></ul><p>The agents don&#8217;t care what they&#8217;re analyzing. 
Code, commit histories, JIRA tickets, budget spreadsheets, Slack conversations&#8212;it&#8217;s all context to be parsed, patterns to be identified, insights to be surfaced.</p><h2>What AI Actually Did (And Didn&#8217;t Do)</h2><p>Let me be precise about the division of labor.</p><p><strong>AI handled:</strong></p><ul><li><p>Parsing tens of thousands of commits and classifying them by work type (85-90% accuracy)</p></li><li><p>Correlating GitHub usernames to JIRA accounts to Slack handles to budget line items</p></li><li><p>Identifying statistical anomalies (bus factor calculations, spend variances, contribution patterns)</p></li><li><p>Generating first drafts of analysis documents</p></li><li><p>Building queryable databases from unstructured data</p></li></ul><p><strong>I handled:</strong></p><ul><li><p>Asking the right questions (this was most of my 5 hours)</p></li><li><p>Providing business context AI couldn&#8217;t infer (&#8220;this team was recently reorganized&#8221;)</p></li><li><p>Validating findings (&#8220;that anomaly is because of the acquisition, not a problem&#8221;)</p></li><li><p>Making judgment calls (&#8220;that maintenance percentage is a crisis, not just a number&#8221;)</p></li><li><p>Deciding what to do about it</p></li></ul><p>AI didn&#8217;t replace my judgment. It gave me things to have judgment about. Faster, with better evidence, across more data than I could have processed alone.</p><h2>The Economics</h2><p>Traditional estimate for CTO onboarding: 3-6 months to reach 60% organizational understanding. Mostly qualitative.</p><p>AI-augmented approach: 1 week to reach maybe 90% understanding. Quantitatively backed.</p><p>Here&#8217;s a concrete comparison: The CTO who replaced me at my previous company was six months into the role when I last saw her in action. She was still asking basic questions about how teams were organized, who owned which systems, where the budget was actually going. Six months.
I walked into this new company on day zero with answers to questions she hadn&#8217;t thought to ask yet.</p><p>That&#8217;s not a knock on her abilities&#8212;she was doing it the traditional way, which is the only way most people know. Meetings, osmosis, gradual pattern recognition.</p><p>Every new executive knows the &#8220;new guy card&#8221;&#8212;that implicit grace period where you get to say &#8220;I&#8217;m still ramping up&#8221; and nobody expects you to have answers. It&#8217;s comfortable. It buys you six months of not being accountable for what you don&#8217;t know yet.</p><p>I didn&#8217;t want to play that card. I wanted to walk in like a veteran.</p><p>Here&#8217;s the honest limitation: I absolutely cannot do that with the people. Relationships take time. Trust gets built through interactions. No amount of AI research tells you who&#8217;s politically dangerous, who&#8217;s quietly brilliant, who needs to vent before they can hear feedback. That&#8217;s human intelligence that only accumulates through presence.</p><p>But the code? The stack? The projects? The products? The budget allocation patterns? The organizational structure? The technical debt? I can know all of that before my first all-hands. And that means when I&#8217;m in meetings, I can focus on reading the room instead of scrambling to understand what people are talking about.</p><p>I had 267 documents, a queryable database of every commit, and quantified answers waiting for me before my first all-hands.</p><p>Cost comparison:</p><ul><li><p>AI API costs: a few thousand dollars</p></li><li><p>Lost productivity from 6-month ramp-up: $150-200K (conservatively)</p></li></ul><p>The ROI math isn&#8217;t subtle.</p><p>But the bigger value wasn&#8217;t time or money. It was discovering things I wouldn&#8217;t have found through meetings alone.
Patterns buried in data, invisible until someone asks the right question and has the tools to answer it.</p><h2>Practical Implications</h2><p><strong>For executives considering AI-augmented onboarding:</strong></p><p>Start if you have data access (APIs to GitHub, JIRA, whatever your org uses), you&#8217;re comfortable with iteration (first analysis won&#8217;t be perfect), and you can invest a few thousand in tooling plus 5-10 hours of directed time.</p><p>Don&#8217;t start if data is locked in silos with no export path, you expect AI to &#8220;do it all&#8221; without significant human direction, or you&#8217;re not technical enough to validate the output (or can&#8217;t partner with someone who is).</p><p><strong>For organizations preparing for AI-augmented leaders:</strong></p><p>Your data needs to be accessible. APIs, exports, documentation. If a new executive can&#8217;t query your GitHub or JIRA, they can&#8217;t run this playbook.</p><p>Your data quality matters more than you think. JIRA hygiene issues, inconsistent employee naming across systems, incomplete commit messages&#8212;all of these degrade AI analysis quality.</p><p>Budget for it. A few thousand dollars for an onboarding sprint is trivial compared to what you&#8217;re paying that executive.</p><p><strong>For people building AI tools:</strong></p><p>The use cases are broader than you think. I built claude-mpm for coding. It turned out to be a general-purpose knowledge work accelerator.</p><p>Identity resolution across systems is a major gap. Normalizing usernames across GitHub, JIRA, Slack, and email into a single identity is still harder than it should be.</p><p>Confidence scoring would help. When AI isn&#8217;t sure, it should say so explicitly. 
Humans can validate uncertain findings; they can&#8217;t validate confident-sounding hallucinations.</p><h2>What This Means for Knowledge Work</h2><p>The traditional playbook for understanding complex organizations&#8212;meetings, gradual osmosis, pattern recognition over months&#8212;isn&#8217;t wrong. It&#8217;s just slow.</p><p>AI doesn&#8217;t replace the human judgment at the core of that process. It accelerates the data gathering that feeds judgment. Ask better questions faster. Validate assumptions with evidence. Discover patterns that would take months to surface organically.</p><p>I spent a week asking questions about an organization I was about to lead. AI spent 500+ hours finding answers. The combination produced understanding I couldn&#8217;t have reached alone, in a timeframe that would have been impossible.</p><p>The coding tool I built turned out to be an organizational intelligence tool in disguise. That&#8217;s not an accident. It&#8217;s what happens when you build systems for parallel analysis of complex information.</p><p>The era of &#8220;gut feel&#8221; leadership isn&#8217;t ending. But the bar for what counts as informed intuition just got a lot higher.</p><div><hr></div><p><em>I&#8217;m Bob Matsuoka, writing about agentic coding and AI-powered development at <a href="https://hyperdev.substack.com/">HyperDev</a>. 
For more on multi-agent orchestration, read my analysis on the <a href="https://hyperdev.substack.com/p/the-age-of-the-cli-part-2">era of the CLI</a> or my deep dive into <a href="https://hyperdev.substack.com/p/i-hope-never-to-use-claude-code-again">why I hope never to use Claude Code directly again</a>.</em></p>]]></content:encoded></item><item><title><![CDATA[I’ve Joined Duetto as CTO]]></title><description><![CDATA[My Next Chapter]]></description><link>https://hyperdev.matsuoka.com/p/im-joining-duetto-as-cto</link><guid isPermaLink="false">https://hyperdev.matsuoka.com/p/im-joining-duetto-as-cto</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Fri, 23 Jan 2026 13:30:42 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!JXmV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdc6bc66-c3c6-4b72-9462-120942ff448e_1120x912.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JXmV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdc6bc66-c3c6-4b72-9462-120942ff448e_1120x912.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JXmV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdc6bc66-c3c6-4b72-9462-120942ff448e_1120x912.jpeg 424w, https://substackcdn.com/image/fetch/$s_!JXmV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdc6bc66-c3c6-4b72-9462-120942ff448e_1120x912.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!JXmV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdc6bc66-c3c6-4b72-9462-120942ff448e_1120x912.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!JXmV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdc6bc66-c3c6-4b72-9462-120942ff448e_1120x912.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JXmV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdc6bc66-c3c6-4b72-9462-120942ff448e_1120x912.jpeg" width="1120" height="912" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fdc6bc66-c3c6-4b72-9462-120942ff448e_1120x912.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:912,&quot;width&quot;:1120,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:422481,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/185207836?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdc6bc66-c3c6-4b72-9462-120942ff448e_1120x912.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JXmV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdc6bc66-c3c6-4b72-9462-120942ff448e_1120x912.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!JXmV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdc6bc66-c3c6-4b72-9462-120942ff448e_1120x912.jpeg 848w, https://substackcdn.com/image/fetch/$s_!JXmV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdc6bc66-c3c6-4b72-9462-120942ff448e_1120x912.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!JXmV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdc6bc66-c3c6-4b72-9462-120942ff448e_1120x912.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
</line><line x1=&quot;3&quot;">
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I&#8217;m starting with Duetto as Chief Technology Officer on Monday.</p><p>After a year of writing about agentic AI, building open-source tools, and consulting with organizations on AI-powered development, I&#8217;m stepping into a full-time leadership role at a company positioned at the intersection of everything I&#8217;ve been thinking about: ML-driven decision systems, enterprise transformation, and the practical application of AI to real business problems.</p><p>This feels like the right move at the right time. Let me explain why.</p><h2>What I Learned As A Solo Practitioner</h2><p>The past year has been an education. After not having seriously coded for decades as the CTO of both small and large organizations, I&#8217;ve transformed into a practitioner: building orchestration frameworks, stress-testing the current crop of AI coding tools, and documenting what actually works versus what vendors promise. I&#8217;ve consulted with other CTOs and engineering leaders navigating their own AI adoption journeys and helped small teams implement the patterns I&#8217;ve been writing about.</p><p>Here&#8217;s what is clear: the patterns work. The productivity gains are real&#8212;not the 10x hype, but meaningful improvements in how individuals and small teams ship software. My frameworks for multi-agent orchestration, specification-driven development, and human-AI collaboration have proven themselves in production environments.</p><p>And now it&#8217;s time to move to a different playing field&#8212;back to being a CTO, but now armed with tools I couldn&#8217;t have imagined a year ago.</p><p>As a solo practitioner and fractional advisor, I could demonstrate these patterns. I could help teams adopt them.
The next challenge is proving them at enterprise scale&#8212;building the organizational structures, developing the team dynamics, and creating the systematic approaches that make AI-augmented development sustainable across an entire engineering organization.</p><p>The real test isn&#8217;t whether one developer can be more productive with AI. It&#8217;s whether an entire engineering organization can transform how it builds software while maintaining quality, security, and velocity. That requires being inside an organization, not advising from the outside.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NYJm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9ac5971-a298-49d3-85b6-ae36ca1748c4_1457x642.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NYJm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9ac5971-a298-49d3-85b6-ae36ca1748c4_1457x642.png 424w, https://substackcdn.com/image/fetch/$s_!NYJm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9ac5971-a298-49d3-85b6-ae36ca1748c4_1457x642.png 848w, https://substackcdn.com/image/fetch/$s_!NYJm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9ac5971-a298-49d3-85b6-ae36ca1748c4_1457x642.png 1272w, https://substackcdn.com/image/fetch/$s_!NYJm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9ac5971-a298-49d3-85b6-ae36ca1748c4_1457x642.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!NYJm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9ac5971-a298-49d3-85b6-ae36ca1748c4_1457x642.png" width="1456" height="642" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e9ac5971-a298-49d3-85b6-ae36ca1748c4_1457x642.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:642,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:585207,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/185207836?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9ac5971-a298-49d3-85b6-ae36ca1748c4_1457x642.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NYJm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9ac5971-a298-49d3-85b6-ae36ca1748c4_1457x642.png 424w, https://substackcdn.com/image/fetch/$s_!NYJm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9ac5971-a298-49d3-85b6-ae36ca1748c4_1457x642.png 848w, https://substackcdn.com/image/fetch/$s_!NYJm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9ac5971-a298-49d3-85b6-ae36ca1748c4_1457x642.png 1272w, https://substackcdn.com/image/fetch/$s_!NYJm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9ac5971-a298-49d3-85b6-ae36ca1748c4_1457x642.png 1456w" 
sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Why <a href="https://www.duettocloud.com/">Duetto</a>?</h2><p>I looked at a lot of opportunities over the year. My criteria were specific:</p><p><strong>Right size.</strong> Large enough to have real engineering challenges and meaningful scale, small enough that a CTO can actually shape culture and direction. Duetto has 200-350 employees serving 7,200+ properties globally&#8212;substantial but not bureaucratic.</p><p><strong>Right ownership.</strong> GrowthCurve Capital acquired Duetto in June 2024 with an explicit investment thesis around AI and data analytics. 
Their portfolio focus and the capital they&#8217;re willing to deploy signal serious commitment to technical transformation, not just financial engineering.</p><p><strong>Right leadership.</strong> Alex Zoghlin joined as CEO in June 2025 with a track record I respect&#8212;co-founder and CTO of Orbitz, head of strategy and technology at Hyatt, CEO of ATPCO. He understands both travel technology and enterprise transformation. More importantly, my conversations with Alex have impressed me with his technical acumen and genuine openness to new ideas. That combination&#8212;deep experience, technical knowledge, and intellectual curiosity&#8212;is very rare in CEOs.</p><p><strong>Right domain.</strong> Revenue and profit optimization will be transformed by ML and LLMs in ways most people haven&#8217;t fully grasped. Duetto&#8217;s core business&#8212;dynamic pricing, demand forecasting, yield management&#8212;is fundamentally algorithmic. The company already makes over a million pricing decisions daily across its platform. The opportunity to enhance those systems with modern AI/ML approaches is enormous.</p><p><strong>Right team.</strong> I&#8217;ve met with GrowthCurve, the Duetto leadership team, and individual contributors. What I found was enthusiasm, technical competence, and genuine interest in evolving how they work. No organization is perfect, but the cultural foundation is solid.</p><h2>The Transformation Opportunity</h2><p>Here&#8217;s my thesis: we&#8217;re at an inflection point where AI doesn&#8217;t just help individuals code faster&#8212;it enables entirely different organizational structures for building software.</p><p>The traditional model assumes engineering productivity scales linearly with headcount, minus coordination overhead. Add 10 engineers, get roughly 8 engineers&#8217; worth of output after accounting for meetings, alignment, and communication complexity.
This is why Brooks&#8217;s Law has held for decades.</p><p>AI-augmented development changes the equation. Not uniformly&#8212;the 10x productivity claims are mostly marketing. But for specific types of work, with the right tooling and workflows, individual contributors can genuinely achieve 3-5x productivity on substantial portions of their work. More importantly, the nature of what requires human judgment versus what can be delegated to AI systems is shifting rapidly.</p><p>This creates an opportunity to rethink team structures. Not replacing engineers with AI, but redesigning how engineering organizations operate when individuals have dramatically more leverage on certain tasks. What does a product team look like when specification and architecture work becomes the primary human contribution? How do you structure code review when AI generates most of the implementation? What skills matter for senior engineers when junior tasks are increasingly automated?</p><p>I don&#8217;t have complete answers. Nobody does&#8212;we&#8217;re all figuring this out in real time. But Duetto offers something rare: a company with technical leadership willing to experiment, ownership willing to invest, a domain where the results are measurable, and enough scale to validate whether these organizational patterns actually work.</p><h2>Why Hospitality Revenue Management Matters</h2><p>Some readers might wonder why I&#8217;m excited about hotel pricing software. Fair question. I love travel and hospitality.  It&#8217;s been my home for nearly two decades.  I&#8217;m eager to jump back into it.  
It&#8217;s also an <a href="https://open.substack.com/pub/hyperdev/p/why-these-ai-travel-scenarios-miss?utm_campaign=post-expanded-share&amp;utm_medium=web">industry</a> crying out for <a href="https://open.substack.com/pub/hyperdev/p/future-of-tripadvisor-can-it-lead?utm_campaign=post-expanded-share&amp;utm_medium=web">technical innovation</a>, as I&#8217;ve written about <a href="https://open.substack.com/pub/hyperdev/p/ai-trip-planning-helpful-assistant?utm_campaign=post-expanded-share&amp;utm_medium=web">before</a>.</p><p>Revenue management is one of the purest applications of ML in enterprise software. The problems are well-defined: given demand signals, competitive data, historical patterns, and capacity constraints, optimize pricing to maximize revenue (or increasingly, profit). The feedback loops are tight&#8212;you can measure whether your pricing decisions worked within days. The data is rich and structured.</p><p>This is exactly the kind of domain where AI advances will compound. Better demand forecasting. More sophisticated segmentation. Real-time competitive response. Natural language interfaces for revenue managers. Eventually, increasingly autonomous systems that can execute pricing strategies without constant human oversight.</p><p>Duetto has been at the forefront of this space for over a decade&#8212;they pioneered &#8220;Open Pricing&#8221; as an alternative to traditional rate management, and they&#8217;ve consistently ranked #1 in their category. But the technology landscape is shifting faster than most incumbents can adapt. The opportunity is to accelerate Duetto&#8217;s AI capabilities while the company is still nimble enough to move quickly.</p><p>The recent acquisitions&#8212;MiceRate for function space optimization, HotStats for profitability benchmarking&#8212;expand the platform&#8217;s scope beyond room pricing to total revenue and profit management. 
The technical integration work alone is substantial, but the strategic opportunity is larger: building an AI-native platform that helps hospitality operators optimize their entire business, not just room rates.</p><h2>What This Means For HyperDev</h2><p>I&#8217;m not stopping.</p><p>Writing has become essential to how I think. The discipline of articulating ideas clearly, testing them against reader feedback, and building in public has shaped my understanding of this space more than any other practice. Giving that up would be a mistake.</p><p>That said, expect changes.</p><p><strong>Velocity will decrease.</strong> I&#8217;ve been publishing 2-3 substantial pieces weekly. That pace isn&#8217;t sustainable with a full-time role. I&#8217;ll aim for weekly publication, possibly bi-weekly during intense periods.</p><p><strong>Perspective will shift.</strong> I&#8217;ll have less time for comprehensive tool reviews and market surveys. I&#8217;ll have more insight into enterprise-scale AI adoption, team transformation, and the practical challenges of implementing these ideas in production environments.</p><p><strong>Some topics will be off-limits.</strong> I won&#8217;t write about Duetto&#8217;s competitive positioning, proprietary technology, or internal challenges. I will write about generalizable patterns, industry trends, and lessons that benefit the broader community without compromising my employer.</p><p><strong>The community matters more than ever.</strong> As my publishing frequency decreases, I&#8217;ll rely more on reader questions, feedback, and topic suggestions to prioritize what I write about. If you have specific areas you want me to explore, let me know.</p><h2>What I&#8217;m Looking Forward To</h2><p>Honestly? Building something substantial from an organizational perspective again.  
I&#8217;m proud of what I&#8217;ve done as a practitioner and with my startups, but moving organizations is a different type of challenge.</p><p>Consulting and writing are intellectually stimulating but emotionally incomplete. You influence outcomes without owning them. You advise on decisions without living with their consequences. You observe transformation without being transformed yourself.</p><p>I miss the weight of real responsibility&#8212;the 3 AM production incidents, the hard conversations about priorities, the satisfaction of shipping something that matters to customers. I miss building teams, mentoring engineers, and creating environments where people do their best work.</p><p>I also miss being proven wrong in ways that matter. When you&#8217;re an outside advisor, you can always claim your recommendations would have worked if only they&#8217;d been implemented properly. When you&#8217;re the CTO, you own the outcomes. That accountability is uncomfortable and essential.</p><p>Duetto offers the chance to test everything I&#8217;ve been writing about. If AI-augmented development can transform an engineering organization, I should be able to demonstrate it. If the productivity patterns work at scale, they should show up in our velocity and quality metrics. If the organizational structures I&#8217;ve theorized about are viable, I need to build them and see what happens.</p><p>This is the job. I&#8217;m ready.</p><h2>Timeline and Transition</h2><p>I start on Monday.</p><p>Expect the rhythm to change. I&#8217;ll still be here, still thinking through these problems in public, still learning from this community. Just with different constraints and, I hope, deeper insights to share.</p><p>Thank you for reading, for the conversations, and for making this publication worth writing.
The next chapter should be interesting.</p><div><hr></div><p><em>I&#8217;m Bob Matsuoka, writing about agentic coding and AI-powered development at <a href="https://hyperdev.substack.com/">HyperDev</a>. For context on the AI development patterns I&#8217;ll be bringing to Duetto, read my analysis of <a href="https://hyperdev.substack.com/">multi-agent orchestration in practice</a> or my deep dive into <a href="https://hyperdev.substack.com/">the evolution of agentic AI coding tools</a>.</em></p>]]></content:encoded></item><item><title><![CDATA[Trust Is Procedural, Not Cognitive]]></title><description><![CDATA[What Compiler History Teaches Us About AI Code]]></description><link>https://hyperdev.matsuoka.com/p/trust-is-procedural-not-cognitive</link><guid isPermaLink="false">https://hyperdev.matsuoka.com/p/trust-is-procedural-not-cognitive</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Mon, 19 Jan 2026 12:30:36 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!pzTq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4b66b09-955a-425e-a3d2-68e18aea8b46_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pzTq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4b66b09-955a-425e-a3d2-68e18aea8b46_1536x1024.png"><img src="https://substackcdn.com/image/fetch/$s_!pzTq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4b66b09-955a-425e-a3d2-68e18aea8b46_1536x1024.png" width="1456" height="971" alt=""></a></figure></div><p>I&#8217;ve been framing trust the wrong way.</p><p>For months I&#8217;ve been asking the wrong question about AI-generated code. &#8220;Do I understand what this does?&#8221; That&#8217;s rarely how trust operates in complex systems. Maybe it never was.</p><p>The epistemological shift required to trust AI-generated code mirrors transitions we&#8217;ve already been through: compiler adoption, cockpit automation, statistical process control. Every time, humans learned to trust procedural verification over cognitive comprehension. The developers who struggled? They kept asking &#8220;Do I understand this?&#8221; The ones who adapted? They asked &#8220;Have I verified this adequately?&#8221;</p><h2>The Trust Paradox in Real Numbers</h2><p>The <a href="https://survey.stackoverflow.co/2025/">Stack Overflow 2025 Developer Survey</a> dropped some fascinating data. Only <strong>3.1% of developers highly trust AI output accuracy</strong>. Yet <strong>51% use AI tools daily</strong>. That gap doesn&#8217;t close through belief. It closes through process.</p><p>Among all developers surveyed, <strong>46% distrust AI accuracy</strong> compared to just 33% who trust it. Experience correlates with skepticism: developers with 10+ years show the lowest &#8220;highly trust&#8221; rate (2.5%) and highest &#8220;highly distrust&#8221; rate (20.7%). We&#8217;re not talking about Luddites. These are people who&#8217;ve been burned enough to know better.</p><p>But here&#8217;s the thing that matters: <strong>66% cite &#8220;AI solutions that are almost right, but not quite&#8221;</strong> as their biggest frustration. Not wrong. Almost right. 
That near-miss problem makes cognitive verification exhausting. You can&#8217;t skim AI output. You have to actually read it, and reading code is harder than writing it.</p><p>One <a href="https://news.ycombinator.com/item?id=46150868">highly-upvoted comment on Hacker News</a> captured this tension: &#8220;Famously, &#8216;it&#8217;s easier to write code than to read it.&#8217; That goes for humans. So why did we automate the easy part and moved the effort over to the hard part?&#8221;</p><h2>The Vibe Coding Backlash</h2><p>The <a href="http://qodo.ai/reports/state-of-ai-code-quality/">Qodo State of AI Code Quality Report 2025</a> quantified what many of us feel: <strong>76% of developers fall into what they call the &#8220;red zone&#8221;</strong>&#8212;frequent hallucinations, low confidence in generated code. Meanwhile &#8220;vibe coding&#8221; has emerged as a lightning rod. Only <strong>15% of developers actively use vibe coding professionally</strong>. <strong>72% reject it entirely</strong>.</p><p>One comment from r/programming resonated across developer communities: &#8220;I just wish people would stop pinging me on PRs they obviously haven&#8217;t even read themselves, expecting me to review 1000 lines of completely new vibe-coded feature that isn&#8217;t even passing CI.&#8221;</p><p>Another developer compared vibe coding to an electrician who &#8220;just threw a bunch of cables through your walls and hoped it all worked out&#8212;things might function initially, but hidden flaws lurk behind the walls.&#8221;</p><p>Here&#8217;s what I take from this backlash. Developers aren&#8217;t rejecting AI assistance. They&#8217;re demanding verification infrastructure. 
The emerging consensus points toward spec-driven development (write requirements first), test-first verification (AI generates tests alongside code), and incremental acceptance (small, verifiable chunks rather than wholesale generation).</p><h2>The Compiler Parallel Holds Up</h2><p><a href="https://blog.vivekhaldar.com/post/29296581613/when-compilers-were-the-ai-that-scared-programmers">Vivek Haldar&#8217;s February 2025 essay</a> &#8220;When Compilers Were the &#8216;AI&#8217; That Scared Programmers&#8221; provides the strongest historical parallel. In the 1950s, assembly programmers exhibited the same resistance patterns we&#8217;re seeing now.</p><p>&#8220;Many assembly programmers were accustomed to having intimate control over memory and CPU instructions. Surrendering this control to a compiler felt risky. There was a sentiment of, &#8216;if I don&#8217;t code it down to the metal, how can I trust what&#8217;s happening?&#8217;&#8221;</p><p>Three resistance arguments from the compiler era map to AI resistance today:</p><p><strong>The efficiency argument.</strong> Compiled code couldn&#8217;t match hand-tuned assembly. Proven false once optimizing compilers matured.</p><p><strong>The control argument.</strong> Loss of direct understanding meant loss of reliability. Resolved through trust in process.</p><p><strong>The prestige argument.</strong> Easier programming might &#8220;reduce the prestige or necessity of the seasoned programmer.&#8221; The reverse happened. Demand exploded as accessible tools enabled new applications.</p><p>Grace Hopper faced this directly. Management and colleagues initially thought automatic programming was crazy, fearing it would make programmers obsolete. 
The resolution came through performance proof (IBM&#8217;s Fortran team delivered an optimizing compiler that matched assembly speed) and procedural transparency (compilers began providing diagnostic output that actually <em>improved</em> understanding).</p><p>Haldar&#8217;s conclusion: &#8220;The debate playing out today about what it means to be a programmer when LLMs can churn out large amounts of working code is of exactly the same shape. Let&#8217;s learn from it and not make the same mistakes.&#8221;</p><h2>But the Analogy Has Limits</h2><p><a href="https://ppig.org/files/2022-PPIG-33rd-sarkar.pdf">Microsoft Research&#8217;s Advait Sarkar</a> challenges an easy mapping. His PPIG paper identifies crucial differences between compilers and AI code generation:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dUtm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6f909ac-b097-4fc8-9837-ef5b2ecdaf8a_678x252.png"><img src="https://substackcdn.com/image/fetch/$s_!dUtm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6f909ac-b097-4fc8-9837-ef5b2ecdaf8a_678x252.png" width="678" height="252" alt=""></a></figure></div><p>Sarkar notes: &#8220;Unlike with compilers, a programmer using AI assistance must still have a working knowledge of the target language, they must actively check the output for correctness, and they get very little feedback for improving their &#8216;source&#8217; code.&#8221; This is &#8220;fuzzy abstraction matching&#8221;&#8212;the AI approximates intent 
rather than translating deterministically.</p><p>Here&#8217;s why this complication actually strengthens the procedural trust argument. Because AI is less deterministic than compilers, cognitive comprehension becomes <em>even less viable</em> as a primary trust strategy. Understanding still matters for debugging and oversight, but verification infrastructure becomes the load-bearing mechanism. You can&#8217;t understand your way to trust with a non-deterministic system. You have to verify your way there, with comprehension serving as a secondary check rather than the foundation.</p><h2>Aviation Figured This Out Already</h2><p>The transition from &#8220;pilot understands all systems&#8221; to &#8220;pilot trusts instrumentation&#8221; took roughly 18 years (1982-2000). The <a href="https://flightsafety.org/">Flight Safety Foundation</a> documented how automation changed a pilot&#8217;s role to that of a systems manager whose primary task is to monitor displays and detect deviations. A classic vigilance task.</p><p>The automation problems mirror AI coding concerns precisely. <strong>Automation bias</strong>: pilots using automation as a substitute for information gathering. <strong>Overtrust</strong>: &#8220;Pilots start trusting the systems because of the fantastic job it does, and start no longer worry about the integrity of the systems.&#8221; <strong>Skills atrophy</strong>: pilots losing manual flying skills.</p><p>The NTSB found that while pilots preferred the glass cockpit design and believed it improved safety, they found learning to use the displays and maintaining their proficiency to be more difficult.</p><p>The resolution is instructive. Airbus now explicitly tells pilots to <strong>keep the autopilot engaged during turbulence</strong> rather than intervening manually. 
Data showed that in <strong>25% of temporary overspeed events</strong>, pilots who disconnected the autopilot made manual inputs that worsened the situation.</p><p>Procedural trust proved more effective than relying solely on cognitive understanding.</p><h2>Medical Diagnostics Show the Mechanism</h2><p>A <a href="https://www.nature.com/articles/s41591-024-02894-y">2025 study on AI-assisted cardiac diagnostics</a> found something striking. When physicians reviewed AI diagnoses, they rejected the system&#8217;s conclusions <strong>87% of the time</strong>. But when the AI began reporting its own confidence level, rejection fell to <strong>33%</strong>. When the AI was highly confident, doctors accepted findings in almost every case. The override rate dropped to just <strong>1.7%</strong>.</p><p>Doctors didn&#8217;t need to understand the algorithm&#8217;s internals. They needed <strong>confidence calibration</strong>&#8212;the AI communicating its certainty level. Trust became a function of procedural signals (confidence scores, explanations of reasoning) rather than cognitive penetration of the model.</p><p>Algorithm aversion research shows a consistent preference for humans&#8217; opinions over algorithms, even when the algorithms are known to be superior. But this preference diminishes when humans can <strong>choose</strong> their AI tools. Trust increases through procedural autonomy. Nobody likes being forced to use a tool. Give people the choice and watch trust grow.</p><h2>Manufacturing&#8217;s Statistical Process Control Is Pure Procedural Trust</h2><p><a href="https://en.wikipedia.org/wiki/Control_chart">Walter Shewhart&#8217;s 1924 development of control charts</a> at Bell Labs established the paradigm that still governs quality management. Workers didn&#8217;t need to understand <em>why</em> variation occurred, just whether it was within control limits. 
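</p><p><em>The chart&#8217;s decision rule is mechanical enough to sketch in a few lines. This is an illustrative toy, not Shewhart&#8217;s own data: the measurements are invented, and the limits are the conventional mean &#177; 3&#963; computed from an in-control baseline.</em></p>

```python
from statistics import mean, stdev

# Hypothetical measurements from a period when the process was known
# to be in control; the limits come from this baseline, not from the
# points being judged.
baseline = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 9.7, 10.1, 9.9, 10.0]
center, sigma = mean(baseline), stdev(baseline)
lower, upper = center - 3 * sigma, center + 3 * sigma

def in_control(x):
    # The procedural rule: trust the chart, not intuition about causes.
    return lower <= x <= upper

new_points = [10.2, 9.9, 13.9]
flags = [x for x in new_points if not in_control(x)]
print(flags)  # only the point outside the 3-sigma band is flagged
```

<p>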
Shewhart distinguished &#8220;common cause&#8221; variation (inherent to process) from &#8220;special cause&#8221; variation (external factors). The procedural response: trust the chart, not your intuition.</p><p>Statistical process control became the foundation for quality in everything from munitions manufacturing to Six Sigma. The core shift was from <strong>detection</strong> (inspecting finished products) to <strong>prevention</strong> (monitoring processes that catch problems before they manifest).</p><p>This maps directly to AI coding verification. Catch problems through tests and invariants before production, rather than trying to comprehend every line.</p><h2>The Philosophy Actually Helps Here</h2><p><a href="https://philpapers.org/rec/NGUTAA">Philosopher C. Thi Nguyen&#8217;s framework</a> in &#8220;Trust as an Unquestioning Attitude&#8221; provides the most precise conceptual foundation. He defines trust not as mere reliance but as reliance with suspended deliberation: &#8220;What it is to trust, in this sense, is not simply to rely on something, but to rely on it unquestioningly. It is to rely on a resource while suspending deliberation over its reliability.&#8221;</p><p>Nguyen argues we can trust non-agents&#8212;ropes, tools, the ground: &#8220;We can be betrayed by our smartphones in the same way that we can be betrayed by our memory.&#8221; When we trust things, we grant them a degree of cognitive intimacy which approaches that of our own internal cognitive faculties.</p><p>This maps directly to AI tools. Cognitive trust (understanding why it works) isn&#8217;t feasible for complex systems. But Nguyen&#8217;s &#8220;unquestioning attitude&#8221; can be earned through procedural mechanisms: consistent behavior, passed tests, observable failures caught and corrected. 
We develop trust in AI tools the way we develop trust in any tool&#8212;through repeated successful use within verification constraints.</p><p><a href="https://en.wikipedia.org/wiki/Tacit_knowledge">Michael Polanyi&#8217;s tacit knowledge concept</a> adds another layer. &#8220;We can know more than we can tell.&#8221; Much expertise cannot be articulated into explicit rules. Some research estimates <strong>70-80% of software organization knowledge is tacit</strong> (undocumented, experience-based), though precise measurement varies by context. AI tools may embody tacit patterns without being able to explain them. Just as we trust human experts with tacit knowledge, we can trust AI tools verified through outcomes rather than explanations.</p><h2>Zero Trust Architecture Offers the Blueprint</h2><p>The &#8220;never trust, always verify&#8221; principle from <a href="https://www.nist.gov/publications/zero-trust-architecture">zero trust architecture</a> provides a procedural framework. NIST SP 800-207 defines it as a security framework requiring all users to be authenticated, authorized, and continuously validated before being granted access. No implicit trust based on network location. Continuous verification of every request.</p><p>This eliminates trust assumptions and replaces them with verification. A purely procedural approach. Applied to AI coding: every AI-generated line of code passes through verification gates (tests, static analysis, security scanning) regardless of source. Trust resides in the verification infrastructure, not in understanding the generator.</p><h2>Extended Cognition Reframes Everything</h2><p><a href="https://en.wikipedia.org/wiki/Extended_mind_thesis">Andy Clark and David Chalmers&#8217; Extended Mind thesis</a> argues cognitive processes extend beyond the brain. 
Their &#8220;Parity Principle&#8221;: If, as we confront some task, a part of the world functions as a process which, were it done in the head, we would have no hesitation in recognizing as part of the cognitive process, then that part of the world is part of the cognitive process.</p><p>Clark&#8217;s <a href="https://www.nature.com/nature-index/article/10.1038/s41467-024-55225-7">2025 paper in Nature Communications</a> extends this to AI: &#8220;We humans are and always have been &#8216;extended minds&#8217;&#8212;hybrid thinking systems defined (and constantly re-defined) across a rich mosaic of resources only some of which are housed in the biological brain.&#8221;</p><p>AI coding tools can become genuine extensions of cognitive systems when properly integrated&#8212;trusted through the same mechanisms we trust our own memory (which also fails, also requires verification through external notes and checks).</p><h2>The Practical Verification Stack Is Crystallizing</h2><p>Best practices are emerging that reflect procedural trust architecture:</p><p><strong>Guardrails at generation.</strong> <a href="https://www.codacy.com/">Codacy&#8217;s system</a> integrates directly with AI coding assistants to enforce coding standards and prevent non-compliant code from being generated in the first place. <a href="https://snyk.io/">Snyk</a> recommends making access to AI coding assistants contingent on the local security setup. Constraint replaces comprehension.</p><p><strong>Test coverage as trust proxy.</strong> <a href="https://addyosmani.com/">Addy Osmani</a> recommends <strong>&gt;70% test coverage as a minimum gate</strong> for AI-generated code. 
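</p><p><em>Read operationally, a gate like that is procedural by construction: acceptance becomes a pure function of check results, never of whether a reviewer &#8220;understood&#8221; the diff. A minimal sketch of that shape (the check names and the 70% threshold are illustrative, echoing the recommendation above rather than any specific CI product):</em></p>

```python
from dataclasses import dataclass

@dataclass
class CheckResults:
    """Outcome of the verification pipeline for one generated change."""
    tests_passed: bool
    lint_clean: bool
    security_findings: int
    coverage_pct: float

def accept(results: CheckResults, min_coverage: float = 70.0) -> bool:
    # Trust lives in the gates, not in comprehension of the generator:
    # the same checks apply regardless of who (or what) wrote the code.
    return (
        results.tests_passed
        and results.lint_clean
        and results.security_findings == 0
        and results.coverage_pct >= min_coverage
    )

print(accept(CheckResults(True, True, 0, 82.5)))  # clears every gate
print(accept(CheckResults(True, True, 0, 64.0)))  # coverage below threshold
```

<p>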
&#8220;The developers who succeed with AI at high velocity aren&#8217;t the ones who blindly trust it; they&#8217;re the ones who&#8217;ve built verification systems that catch issues before they reach production.&#8221;</p><p><strong>Observability as continuous verification.</strong> <a href="https://opentelemetry.io/">OpenTelemetry&#8217;s GenAI SIG</a> is developing standards for tracking agent actions, reasoning traces, tool calls, and performance metrics. The infrastructure for procedural trust is being built.</p><p><strong>Formal verification for AI output.</strong> The &#8220;Genefication&#8221; approach uses generative AI to draft code or specifications, followed by applying formal verification to rigorously ensure that the design satisfies critical safety and correctness properties. AI generates; formal methods verify. Pure procedural trust.</p><h2>The Identity Crisis Is Real</h2><p><a href="https://refactoring.fm/">Luca Rossi&#8217;s January 2026 essay</a> &#8220;Finding Yourself in the AI Era&#8221; captures the identity dimension. He distinguishes <strong>puzzle-solvers</strong> (who find coding intellectually stimulating; AI removes the satisfying parts) from <strong>problem-solvers</strong> (who care about shipping; complexity gets in the way).</p><p>For puzzle-solvers, AI assistance feels like having someone else solve your crossword puzzles for you.</p><p>Rossi notes common comfort-driven behaviors: &#8220;I only use LLMs as autocomplete so I can check every single line of code&#8221;; &#8220;It takes more time to review LLM code than to write it myself&#8221;; &#8220;If I make AI write it, my skills will atrophy.&#8221;</p><p>These may be rationalizations. 
&#8220;You should keep your antennas up and intercept when your behavior is guided by your own comfort, as opposed to what is best for the team/product/business.&#8221;</p><p>The <a href="https://www.technologyreview.com/">MIT Technology Review</a> captured developer Luciano Nooijen&#8217;s experience: &#8220;I was feeling so stupid because things that used to be instinct became manual, sometimes even cumbersome... Just as athletes still perform basic drills, the only way to maintain an instinct for coding is to regularly practice the grunt work.&#8221;</p><p>Skills atrophy is real. A genuine cost of procedural trust that requires deliberate counter-measures.</p><p>The <a href="https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/">METR study</a> found that experienced developers <strong>believed AI made them 20% faster</strong> while objective tests showed they were actually <strong>19% slower</strong>. The perception gap reveals how much identity and self-image are at stake.</p><h2>The Craft Identity Question</h2><p><a href="https://stackoverflow.blog/">Stack Overflow&#8217;s editorial team</a> posed the anxiety question directly: &#8220;Are you a real coder, or are you using AI?&#8221; The answer they propose&#8212;&#8220;Yes&#8221;&#8212;doesn&#8217;t resolve the tension. AI coding tools create &#8220;developers who don&#8217;t understand the context behind the code they&#8217;ve written or how to debug it.&#8221;</p><p>The difference between &#8220;I trust this&#8221; and &#8220;I understand this&#8221; is the gap where craft identity lives.</p><p>One developer quoted in the <a href="https://enterprisespectator.com/">Enterprise Spectator</a> deactivated Copilot after two years: &#8220;The reason is very simple: it dumbs me down. I&#8217;m forgetting how to write basic things, and using it at scale in a large codebase also introduces so many subtle bugs which even I can&#8217;t usually spot directly. 
Call me old fashioned, but I believe in the power of writing, and thinking through, all of my code, by hand, line by line.&#8221;</p><p>That loss is real. Loss of a relationship to the craft that some developers built their identities around.</p><p><a href="https://www.coltonvoege.com/">Colton Voege</a> addressed the social pressure: &#8220;I wouldn&#8217;t be surprised to learn AI helps many engineers do certain tasks 20&#8211;50% faster, but the nature of software bottlenecks means this doesn&#8217;t translate to a 20% productivity increase&#8212;and certainly not a 10&#215; increase.&#8221; His permission: &#8220;It&#8217;s okay to sacrifice some productivity to make work enjoyable.&#8221;</p><h2>What the Timeline Looks Like</h2><p>Based on historical evidence, trust transitions follow a pattern: initial resistance (2-5 years), performance proof (3-10 years), procedural codification (5-15 years), normalized trust (10-20+ years).</p><p>Compiler adoption took roughly 20 years (1955-1975). Glass cockpit adoption took roughly 18 years (1982-2000). AI coding tools launched 2022&#8212;we&#8217;re in year 3-4 of what may be a 15-20 year transition.</p><p>The trust probably won&#8217;t come from AI tools becoming cognitively transparent, at least not in the short term. 
It will come from verification infrastructure becoming reliable, from procedural safeguards becoming standard, and from a new generation of developers who never knew any other way.</p><h2>The Numbers to Remember</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Xw1M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53ea5193-690b-4735-95f9-282200f4874f_671x432.png"><img src="https://substackcdn.com/image/fetch/$s_!Xw1M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53ea5193-690b-4735-95f9-282200f4874f_671x432.png" width="671" height="432" alt=""></a></figure></div><h2>Frameworks Worth Keeping</h2><p><strong>Trust Debt.</strong> Every time you accept AI output without verification, you borrow against future understanding. Someone eventually must &#8220;pay&#8221; by deeply reviewing that code.</p><p><strong>The Junior Developer Model.</strong> Treat AI as a junior teammate&#8212;fast but unreliable, needs supervision, never the final word. 
(maybe junior &#8594; journeyman at this point)</p><p><strong>Nguyen&#8217;s Unquestioning Attitude.</strong> Trust emerges through consistent, reliable behavior over time&#8212;the climbing rope that has held many times, the test suite that continues to pass.</p><p><strong>Zero Trust Applied.</strong> &#8220;Never trust, always verify&#8221;&#8212;continuous checking through tests, invariants, and observable failures creates trust without cognitive penetration.</p><p><strong>Extended Cognition.</strong> AI tools become extensions of our cognitive systems when properly integrated&#8212;trusted through the same mechanisms we trust our own memory.</p><h2>The Bottom Line</h2><p>There&#8217;s growing evidence that supports shifting from comprehension-based to verification-based trust. You don&#8217;t need to understand every line of AI-generated code. You need verification systems that catch problems before they reach production.</p><p>The psychological transition remains difficult. Developer craft identity is built on understanding code; procedural trust asks developers to accept that understanding everything is no longer possible or necessary. That loss is real. But the same was true for assembly programmers, pilots, and quality inspectors.</p><p>The emerging infrastructure&#8212;guardrails, observability, confidence calibration, test coverage gates, chain-of-thought monitoring&#8212;provides the verification architecture that makes procedural trust rational. The question has shifted.</p><p>Not &#8220;Do I understand this code?&#8221; but &#8220;Have I verified this code adequately?&#8221;</p><p>That&#8217;s a procedural rather than cognitive form of trust. And it&#8217;s how we&#8217;ll learn to work with AI that writes code we can&#8217;t fully comprehend.</p><div><hr></div><p><em>I&#8217;m Bob Matsuoka, writing about agentic coding and AI-powered development at <a href="https://hyperdev.substack.com/">HyperDev</a>. 
For more on how trust actually works with AI tools, read my analysis of <a href="https://hyperdev.substack.com/p/ghost-in-the-machine">non-deterministic debugging challenges</a> or my deep dive into <a href="https://hyperdev.substack.com/p/multi-agent-orchestration">multi-agent orchestration</a>.</em></p>]]></content:encoded></item><item><title><![CDATA[Don’t Be A Canut — Be a Pattern Master]]></title><description><![CDATA[Practical advice from the Jacquard lesson]]></description><link>https://hyperdev.matsuoka.com/p/dont-be-a-canut-be-a-pattern-master</link><guid isPermaLink="false">https://hyperdev.matsuoka.com/p/dont-be-a-canut-be-a-pattern-master</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Mon, 12 Jan 2026 12:31:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!xIja!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27944217-0e71-4d65-bd2a-ab4f4c40299c_1104x832.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;af8584e1-bdf4-4132-a8db-f1f121e1e003&quot;,&quot;duration&quot;:null}"></div><p>The Canuts were Lyon&#8217;s master silk weavers. Legendary craftspeople. Their identity was wrapped up in thread manipulation&#8212;every pass of the shuttle, every tension adjustment, every pattern emerging from their hands.</p><p>Then Jacquard showed up with his programmable loom. Punch cards replaced manual thread selection. The Canuts rioted. Some adapted. Many didn&#8217;t.</p><p>The ones who survived weren&#8217;t the fastest weavers. They were the ones who realized the job had changed. Post-Jacquard operators didn&#8217;t weave. They designed patterns, translated those patterns to punch cards, loaded the cards, supervised execution, and quality-checked output. 
The skill shifted from &#8220;manipulate threads&#8221; to &#8220;program the loom.&#8221;</p><p>Sound familiar?</p><h3>The Numbers Don&#8217;t Lie</h3><p>Here&#8217;s what happened to the Canuts:</p><table><thead><tr><th>Year</th><th>Master Weavers</th><th>Wages</th><th>What Changed</th></tr></thead><tbody><tr><td>1789</td><td>5,575</td><td>Baseline</td><td>Peak of craft era</td></tr><tr><td>1804</td><td>~5,000</td><td>Declining</td><td>Jacquard loom introduced</td></tr><tr><td>1812</td><td>~4,500</td><td>-20%</td><td>11,000 Jacquard looms in France</td></tr><tr><td>1830</td><td>3,000-4,000</td><td>-50%</td><td>Wages half of 1810 levels</td></tr><tr><td>1831</td><td>~3,500</td><td>Crisis</td><td>First Canut revolt, 600+ dead (estimates vary)</td></tr><tr><td>1834</td><td>~3,000</td><td>Crisis</td><td>Second revolt, ~10,000 imprisoned/deported</td></tr></tbody></table><p><em>Sources: <a href="https://www.encyclopedia.com/history/encyclopedias-almanacs-transcripts-and-maps/silk-workers-revolts">Encyclopedia.com</a>, <a href="https://marxist.com/the-lyon-silk-workers-uprisings-of-1831-and-1834.htm">Marxist.com analysis</a>, <a href="https://en.wikipedia.org/wiki/Canut_revolts">Wikipedia</a></em></p><p>Total silk workers stayed around 30,000. The looms didn&#8217;t eliminate jobs&#8212;they <a href="https://www.worldsocialism.org/spgb/socialist-standard/2015/2010s/no-1325-january-2015/page-history-1834-canut-revolt-lyon/">compressed the master craftsman class while creating lower-wage operator roles</a>. By 1831, the 308 silk merchants controlled pricing for 5,575 master weavers who managed 20,000+ workers in cramped workshops, <a href="https://www.encyclopedia.com/history/encyclopedias-almanacs-transcripts-and-maps/silk-workers-revolts">often working 14-18 hour days</a>.</p><p>The Jacquard loom didn&#8217;t kill weaving. It commoditized the skill. Pattern selection&#8212;the highest-value cognitive work&#8212;moved to punch cards. What remained was loading, monitoring, and maintenance. The same number of people worked. Fewer could call themselves craftsmen.</p><p>The cautionary tale isn&#8217;t mass unemployment. 
It&#8217;s wage collapse and status compression for those who kept doing the same job while the job&#8217;s value eroded beneath them.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xIja!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27944217-0e71-4d65-bd2a-ab4f4c40299c_1104x832.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xIja!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27944217-0e71-4d65-bd2a-ab4f4c40299c_1104x832.jpeg 424w, https://substackcdn.com/image/fetch/$s_!xIja!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27944217-0e71-4d65-bd2a-ab4f4c40299c_1104x832.jpeg 848w, https://substackcdn.com/image/fetch/$s_!xIja!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27944217-0e71-4d65-bd2a-ab4f4c40299c_1104x832.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!xIja!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27944217-0e71-4d65-bd2a-ab4f4c40299c_1104x832.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xIja!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27944217-0e71-4d65-bd2a-ab4f4c40299c_1104x832.jpeg" width="1104" height="832" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/27944217-0e71-4d65-bd2a-ab4f4c40299c_1104x832.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:832,&quot;width&quot;:1104,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;active-image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="active-image" title="active-image" srcset="https://substackcdn.com/image/fetch/$s_!xIja!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27944217-0e71-4d65-bd2a-ab4f4c40299c_1104x832.jpeg 424w, https://substackcdn.com/image/fetch/$s_!xIja!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27944217-0e71-4d65-bd2a-ab4f4c40299c_1104x832.jpeg 848w, https://substackcdn.com/image/fetch/$s_!xIja!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27944217-0e71-4d65-bd2a-ab4f4c40299c_1104x832.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!xIja!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27944217-0e71-4d65-bd2a-ab4f4c40299c_1104x832.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">The Canut Mindset</figcaption></figure></div><p>The IDE isn&#8217;t the issue. The <em>mindset</em> is.</p><p>If you see yourself as &#8220;person who writes lines of code,&#8221; you&#8217;re a Canut. Nothing wrong with that&#8212;master craftspeople, all of them. But when the loom becomes programmable, thread manipulation skills matter less than pattern design skills.</p><p><strong>My own numbers back this up.</strong> Since <a href="https://www.anthropic.com/news/claude-3-5-sonnet">Opus 4.5 dropped November 24</a>, I&#8217;ve pushed 77 PRs, 3,167 commits, and 2.9 million lines across <a href="https://github.com/bobmatnyc">27 repositories</a>. All through Claude Code and <a href="https://github.com/bobmatnyc/claude-mpm">claude-mpm</a>. About 54% is Markdown&#8212;specs, design docs, research notes. The punch cards. Roughly 1.3 million lines of actual Python, TypeScript, and Svelte came out the other end. 
These are throughput metrics, not quality proxies&#8212;but throughput matters when your constraint is &#8220;how much can I ship this month.&#8221; Right now I&#8217;m running 5 MPM instances across ports 8765-8769, each orchestrating its own Claude Code session. The <a href="https://github.com/bobmatnyc/mcp-vector-search">claude-mpm ecosystem</a> accounts for 55% of my commit activity. Production infrastructure, not experimental. I haven&#8217;t opened VS Code for anything except quick file diffs in six weeks.</p><p>I&#8217;m not weaving anymore. I&#8217;m programming looms.</p><p>Don&#8217;t believe me? Here&#8217;s what the engineers actually building these tools are saying:</p><div><hr></div><h2>Boris Cherny (Claude Code Creator, Anthropic)</h2><p><a href="https://officechai.com/ai/claude-code-creator-says-he-didnt-open-an-ide-all-of-last-month-used-claude-code-for-all-his-coding/">According to interviews circulated in late 2025</a>, Cherny didn&#8217;t open an IDE for the entire month of December. His reported December output: 259 PRs, 497 commits, 40,000 lines added&#8212;all AI-written through the tool he created.</p><p>But here&#8217;s the part that matters: <a href="https://twitter-thread.com/t/2007179832300581177">he reportedly runs 5-15 Claude instances simultaneously</a>. Not one agent typing code. A swarm. He orchestrates rather than implements.</p><p>His <a href="https://www.developing.dev/p/boris-cherny-creator-of-claude-code">interview with Developing.dev</a> captures the psychological adjustment: &#8220;Software engineering is radically changing, and the hardest part even for early adopters and practitioners like us is to continue to re-adjust our expectations.&#8221;</p><p><a href="https://coder.com/blog/inside-anthropics-ai-first-development">Per Coder&#8217;s analysis</a>, Anthropic&#8217;s internal estimates show engineering output jumped 70% per engineer even as headcount tripled. 
Cherny&#8217;s estimate: a project that would&#8217;ve required 20-30 engineers working 2 years at Meta now takes 5 engineers and 6 months. Trending toward 1.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!77F9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd22ef0e0-8b8e-4eaa-92c5-e8c159ae46ce_1104x832.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!77F9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd22ef0e0-8b8e-4eaa-92c5-e8c159ae46ce_1104x832.jpeg 424w, https://substackcdn.com/image/fetch/$s_!77F9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd22ef0e0-8b8e-4eaa-92c5-e8c159ae46ce_1104x832.jpeg 848w, https://substackcdn.com/image/fetch/$s_!77F9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd22ef0e0-8b8e-4eaa-92c5-e8c159ae46ce_1104x832.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!77F9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd22ef0e0-8b8e-4eaa-92c5-e8c159ae46ce_1104x832.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!77F9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd22ef0e0-8b8e-4eaa-92c5-e8c159ae46ce_1104x832.jpeg" width="1104" height="832" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d22ef0e0-8b8e-4eaa-92c5-e8c159ae46ce_1104x832.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:832,&quot;width&quot;:1104,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:157562,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/183842945?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd22ef0e0-8b8e-4eaa-92c5-e8c159ae46ce_1104x832.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!77F9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd22ef0e0-8b8e-4eaa-92c5-e8c159ae46ce_1104x832.jpeg 424w, https://substackcdn.com/image/fetch/$s_!77F9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd22ef0e0-8b8e-4eaa-92c5-e8c159ae46ce_1104x832.jpeg 848w, https://substackcdn.com/image/fetch/$s_!77F9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd22ef0e0-8b8e-4eaa-92c5-e8c159ae46ce_1104x832.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!77F9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd22ef0e0-8b8e-4eaa-92c5-e8c159ae46ce_1104x832.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Andrej Karpathy (OpenAI Co-Founder)</h2><p>Karpathy <a href="https://x.com/karpathy/status/1886192184808149383">coined &#8220;vibe coding&#8221; in February 2025</a>&#8212;the practice of giving in to AI suggestions without fully understanding the code. <a href="https://en.wikipedia.org/wiki/Vibe_coding">Collins Dictionary named it Word of the Year</a>. That&#8217;s how fast this entered the lexicon.</p><p>But his position evolved. By April 2025, he acknowledged vibe coding becomes a &#8220;painful slog&#8221; for deployed production apps. 
By October, he <a href="https://github.com/karpathy/nanochat">released nanochat</a>&#8212;8,000 lines, entirely hand-coded&#8212;calling AI tools &#8220;net unhelpful&#8221; for novel work that sits outside training data.</p><p><a href="https://dnyuz.com/2025/12/30/the-guy-who-coined-vibe-coding-now-says-hes-never-felt-more-behind-as-a-programmer/">His December 2025 take</a> is the most honest assessment I&#8217;ve seen: &#8220;I&#8217;ve never felt this much behind as a programmer. The profession is being dramatically refactored.&#8221;</p><p>The guy who named the movement now feels behind. Let that sink in.</p><h2>Google Engineers</h2><p><a href="https://twitter.com/rakyll">Jaana Dogan</a>, Principal Engineer on the Gemini API team, <a href="https://the-decoder.com/google-engineer-says-claude-code-built-in-one-hour-what-her-team-spent-a-year-on/">dropped a bomb on Twitter</a>: Claude Code replicated in one hour what her team spent a year building. A toy version, she clarified. But directionally significant.</p><p>Her broader observation: &#8220;In 2025, they can create and restructure entire codebases. Quality and efficiency gains beyond what anyone could have imagined.&#8221;</p><p><a href="https://www.technologyreview.com/2025/12/15/1128352/rise-of-ai-coding-developers-2026/">Sundar Pichai confirmed</a> that over 25% of new Google code is now AI-generated. A quarter of all new code at one of the world&#8217;s largest engineering organizations. That&#8217;s not a pilot program.</p><h2>Coinbase</h2><p><a href="https://www.linkedin.com/in/robwitoff/">Rob Witoff</a>, Head of Platform at Coinbase, reported <a href="https://www.technologyreview.com/2025/12/15/1128352/rise-of-ai-coding-developers-2026/">90% speedups</a> for code restructuring and test writing. 
Specific task categories, but when your fintech platform sees 90% acceleration on restructuring work, you stop asking whether AI coding tools are useful.</p><h2>Steve Yegge (Amazon/Google Veteran, Sourcegraph)</h2><p>Yegge&#8217;s been in this industry longer than most. His take borders on inflammatory: &#8220;If you&#8217;re still using an IDE to develop code by January 1st, 2025, you&#8217;re a bad engineer.&#8221;</p><p><a href="https://ai-native-devcon-2025.heysummit.com/talks/lessons-learned-vibe-coding-with-steve-yegge-12k-locday-and-more/">He claims 12,000 lines of production code per day</a> while spending $300 daily on AI tokens. He runs 3-4 agents simultaneously. He co-authored <em><a href="https://itrevolution.com/product/vibe-coding-book/">Vibe Coding</a></em> with Gene Kim.</p><p>Built an entire issue tracker called &#8220;Beads&#8221; through pure vibe coding&#8212;as a proof of concept that the approach actually ships production software, not just demos.</p><p><a href="https://www.theregister.com/2025/10/21/book_review_vibe_coding/">The Register&#8217;s review</a> captures the thesis: trust the AI. Let go of the illusion that you need to understand every line.</p><h2>Open Source Maintainers</h2><p>The shift isn&#8217;t limited to well-funded companies. Open source maintainers report the same pattern.</p><p><a href="https://simonwillison.net/">Simon Willison</a> (Django creator) <a href="https://simonwillison.net/2025/Oct/7/vibe-engineering/">coined &#8220;vibe engineering&#8221;</a> to distinguish production-quality AI work from casual demos. Vibe coding can be sloppy. Vibe engineering requires you to understand enough to guide the AI toward robust solutions.</p><p><a href="https://blog.marcnuri.com/boosting-developer-productivity-ai-2025">Marc Nuri</a> (Red Hat, Kubernetes MCP Server) went from 10-15 contributions per day to 25+. 
He migrated an entire frontend&#8212;193 files&#8212;in minutes using Claude Code.</p><p><a href="https://twitter.com/indragie">Indragie Karunaratne</a> (Mac developer since 2008) shipped a 20,000-line macOS app and wrote fewer than 1,000 lines by hand. 95%+ generated.</p><h2>Meta</h2><p><a href="https://fortune.com/2025/01/24/mark-zuckerberg-ai-engineer-capex-spend/">Zuckerberg predicted</a> AI would replace &#8220;mid-level engineers&#8221; by 2025. Whether that&#8217;s happened is debatable. That he said it publicly isn&#8217;t. Meta is building internal AI coding agents with a stated goal: automate the implementation work that currently occupies the engineering org.</p><h2>The Emerging Workflow Patterns</h2><p>Across all these examples, certain patterns repeat:</p><p><strong>Multiple parallel agents.</strong> Cherny runs 5-15. Yegge runs 3-4. I run 5. The single-agent model where you have one AI helping you code is already dated. Orchestration is the new skill.</p><p><strong>Context engineering becomes critical.</strong> CLAUDE.md files in repositories. Specification documents before code. The upfront investment in context pays compound returns when agents can reference shared understanding.</p><p><strong>CLI over IDE.</strong> Google previewed <a href="https://blog.google/products/google-cloud/antigravity-ai-native-ide/">Antigravity</a> (AI-first IDE) in November 2025. But the momentum favors terminal-based tools. Claude Code. Cursor. Command-line interfaces that agents understand natively.</p><p><strong>Human-on-the-loop, not human-in-the-loop.</strong> Supervision, not co-creation. 
You review and redirect rather than collaborate keystroke-by-keystroke.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CgLS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7903cd54-323c-485b-9f07-ac644120d23a_1104x832.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CgLS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7903cd54-323c-485b-9f07-ac644120d23a_1104x832.jpeg 424w, https://substackcdn.com/image/fetch/$s_!CgLS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7903cd54-323c-485b-9f07-ac644120d23a_1104x832.jpeg 848w, https://substackcdn.com/image/fetch/$s_!CgLS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7903cd54-323c-485b-9f07-ac644120d23a_1104x832.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!CgLS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7903cd54-323c-485b-9f07-ac644120d23a_1104x832.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CgLS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7903cd54-323c-485b-9f07-ac644120d23a_1104x832.jpeg" width="1104" height="832" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7903cd54-323c-485b-9f07-ac644120d23a_1104x832.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:832,&quot;width&quot;:1104,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:158210,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/183842945?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7903cd54-323c-485b-9f07-ac644120d23a_1104x832.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CgLS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7903cd54-323c-485b-9f07-ac644120d23a_1104x832.jpeg 424w, https://substackcdn.com/image/fetch/$s_!CgLS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7903cd54-323c-485b-9f07-ac644120d23a_1104x832.jpeg 848w, https://substackcdn.com/image/fetch/$s_!CgLS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7903cd54-323c-485b-9f07-ac644120d23a_1104x832.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!CgLS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7903cd54-323c-485b-9f07-ac644120d23a_1104x832.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>The Engineers Who Improve the Looms</h2><p>Here&#8217;s what the Canut analogy misses: Jacquard&#8217;s 1804 loom wasn&#8217;t production-ready. The original had problems. Punch cards wore out. The mechanism jammed. Complex patterns exceeded the machine&#8217;s capacity. The silk coming out wasn&#8217;t consistent enough for commercial production.</p><p>It took decades of engineering to fix this. Card durability improved through material science. Tension systems got recalibrated. Contemporary estimates suggest 11,000 Jacquard looms were operating in France by 1812, up from essentially zero eight years earlier. That scaling didn&#8217;t happen because pattern masters got better at loading cards. 
It happened because engineers made the looms themselves more reliable.</p><p>Charles Babbage visited Lyon in 1840, obsessed not with silk but with the punch card concept. His Analytical Engine borrowed directly from Jacquard&#8217;s mechanism. He wasn&#8217;t weaving patterns. He was abstracting the loom&#8217;s logic into something more general.</p><p>The parallel to agentic coding infrastructure maps cleanly.</p><p>Pattern masters use Claude Code and orchestration frameworks to ship software. That&#8217;s valuable work. But someone has to build the MCP servers that give agents access to external tools. Someone has to write the orchestration layers that coordinate multiple agents without context collision. Someone has to optimize the vector databases that make retrieval-augmented generation actually work at scale.</p><p>That&#8217;s what I spend half my time on now. The <a href="https://github.com/bobmatnyc/claude-mpm">claude-mpm ecosystem</a>&#8212;the orchestration framework, the MCP vector search, the ticketing integration&#8212;that&#8217;s loom improvement, not pattern design. Making the infrastructure more reliable so pattern masters can trust it.</p><p>The Canuts who survived didn&#8217;t all become pattern masters. Some became loom mechanics. Some designed improvements to the punch card system. Some figured out how to chain looms together for industrial-scale production.</p><p>Agentic coding has the same split. You can master the patterns, or you can improve the looms. Both roles survive the transition. 
Thread manipulation shrinks in leverage.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eLhn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf0cdf53-36f2-4408-8ea8-21f576390d5e_1104x832.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eLhn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf0cdf53-36f2-4408-8ea8-21f576390d5e_1104x832.jpeg 424w, https://substackcdn.com/image/fetch/$s_!eLhn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf0cdf53-36f2-4408-8ea8-21f576390d5e_1104x832.jpeg 848w, https://substackcdn.com/image/fetch/$s_!eLhn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf0cdf53-36f2-4408-8ea8-21f576390d5e_1104x832.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!eLhn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf0cdf53-36f2-4408-8ea8-21f576390d5e_1104x832.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eLhn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf0cdf53-36f2-4408-8ea8-21f576390d5e_1104x832.jpeg" width="1104" height="832" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cf0cdf53-36f2-4408-8ea8-21f576390d5e_1104x832.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:832,&quot;width&quot;:1104,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:237508,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/183842945?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf0cdf53-36f2-4408-8ea8-21f576390d5e_1104x832.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eLhn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf0cdf53-36f2-4408-8ea8-21f576390d5e_1104x832.jpeg 424w, https://substackcdn.com/image/fetch/$s_!eLhn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf0cdf53-36f2-4408-8ea8-21f576390d5e_1104x832.jpeg 848w, https://substackcdn.com/image/fetch/$s_!eLhn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf0cdf53-36f2-4408-8ea8-21f576390d5e_1104x832.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!eLhn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf0cdf53-36f2-4408-8ea8-21f576390d5e_1104x832.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>The Caveats Worth Noting</h2><p>Karpathy&#8217;s hand-coded nanochat provides the clearest counterexample. For novel work&#8212;concepts that sit outside training data&#8212;current AI struggles. He called AI tools &#8220;net unhelpful&#8221; for that specific project.</p><p>This tracks with the Jacquard parallel. The loom couldn&#8217;t weave patterns that hadn&#8217;t been programmed. Novel patterns still required human design. Punch cards automated execution, not invention.</p><p>Quality concerns persist for deployed production apps. The demos look impressive. The maintenance burden on generated code remains less studied.</p><h2>What This Means</h2><p>According to YC partners speaking at Demo Day, 25% of Winter 2025 batch companies had 95% AI-generated codebases. Startups are shipping with almost no hand-written code. 
They skipped the Canut phase entirely.</p><p>The productivity metrics cluster around consistent ranges:</p><ul><li><p>70% improvement at Anthropic</p></li><li><p>90% speedups for specific tasks at Coinbase</p></li><li><p>2.5x daily contribution increase for Marc Nuri</p></li><li><p>3,167 commits in 44 days for my own work (72 per day average)</p></li></ul><p>These aren&#8217;t marginal gains. They&#8217;re what happens when you stop manipulating threads and start programming looms.</p><h2>The Mindset Shift</h2><p>The Canuts who survived the Jacquard revolution weren&#8217;t the fastest weavers. They were the ones who recognized that &#8220;weaver&#8221; was becoming &#8220;loom operator&#8221; and then &#8220;pattern designer.&#8221; Some took a different path&#8212;becoming loom mechanics, improving the card systems, figuring out how to scale production.</p><p>The engineers in this article made the same recognition. Cherny running 15 agents isn&#8217;t &#8220;faster typing.&#8221; He&#8217;s programming looms, not working threads. Yegge spending $300/day on tokens isn&#8217;t &#8220;expensive autocomplete.&#8221; He&#8217;s investing in loom capacity.</p><p>The IDE is an artifact of the implementer mindset. If you see yourself fixing lines of code, you&#8217;re a Canut. Masterful. Skilled. And increasingly misaligned with how production software actually gets built.</p><p>Two paths forward survive the transition. Design patterns that looms execute&#8212;writing specs, orchestrating agents, supervising output. Or improve the looms themselves&#8212;build the infrastructure that makes agentic coding reliable at scale.</p><p>Thread manipulation shrinks in economic value.</p><p>The transition is underway. The question is whether you&#8217;ve noticed.</p><div><hr></div><p><em>I&#8217;m Bob Matsuoka, writing about agentic coding and AI-powered development at <a href="https://hyperdev.substack.com/">HyperDev</a>. 
For more on multi-agent orchestration, read my analysis of <a href="https://hyperdev.matsuoka.com/p/i-hope-never-to-use-claude-code-again">claude-mpm</a> or my deep dive into <a href="https://hyperdev.matsuoka.com/p/what-the-other-shoe-dropping-sounds">the economics of AI token consumption</a>.</em></p><p><strong>Further Reading on the Canuts and Jacquard Loom:</strong> If you want to understand this history properly, start with <a href="https://en.visiterlyon.com/out-and-about/culture-and-leisure/culture-and-museums/museums/maison-des-canuts">La Maison des Canuts</a> in Lyon&#8217;s Croix-Rousse district&#8212;the old weaving quarter. It&#8217;s part museum, part working workshop. I visited years ago and they still operate 19th-century Jacquard looms, still produce silk for clients worldwide. The fabric samples alone are worth the trip. The <a href="https://www.encyclopedia.com/history/encyclopedias-almanacs-transcripts-and-maps/silk-workers-revolts">Encyclopedia.com entry on the Silk Workers&#8217; Revolts</a> covers the political and economic context. 
For the technical evolution of the loom itself, James Essinger&#8217;s <em>Jacquard&#8217;s Web</em> traces the line from punch cards to Babbage to modern computing.</p>]]></content:encoded></item><item><title><![CDATA[The Age of the CLI, Part 2]]></title><description><![CDATA[From Nanny Coding to Fire and Check In]]></description><link>https://hyperdev.matsuoka.com/p/the-age-of-the-cli-part-2</link><guid isPermaLink="false">https://hyperdev.matsuoka.com/p/the-age-of-the-cli-part-2</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Thu, 08 Jan 2026 12:30:55 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!57Hd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c82ec2-081c-4e35-bd52-1c977d4c1300_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!57Hd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c82ec2-081c-4e35-bd52-1c977d4c1300_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!57Hd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c82ec2-081c-4e35-bd52-1c977d4c1300_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!57Hd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c82ec2-081c-4e35-bd52-1c977d4c1300_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!57Hd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c82ec2-081c-4e35-bd52-1c977d4c1300_1024x1024.png 1272w, 
https://substackcdn.com/image/fetch/$s_!57Hd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c82ec2-081c-4e35-bd52-1c977d4c1300_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!57Hd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c82ec2-081c-4e35-bd52-1c977d4c1300_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/98c82ec2-081c-4e35-bd52-1c977d4c1300_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1580460,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/183506779?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c82ec2-081c-4e35-bd52-1c977d4c1300_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!57Hd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c82ec2-081c-4e35-bd52-1c977d4c1300_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!57Hd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c82ec2-081c-4e35-bd52-1c977d4c1300_1024x1024.png 848w, 
https://substackcdn.com/image/fetch/$s_!57Hd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c82ec2-081c-4e35-bd52-1c977d4c1300_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!57Hd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98c82ec2-081c-4e35-bd52-1c977d4c1300_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Beyond Nanny Coding...</figcaption></figure></div><h2>TL;DR</h2><ul><li><p>The REPL model (type prompt, watch 
agent work, repeat) is already obsolete for developers who&#8217;ve mastered specification-driven prompting</p></li><li><p>Human-on-the-Loop (HOTL) replaces Human-in-the-Loop (HITL) as the dominant paradigm: strategic direction instead of tactical supervision</p></li><li><p>With TxDD (Ticket-Driven Development), I&#8217;m seeing 90%+ task completion on scoped work in my own testing</p></li><li><p>Tools like Ralph, claude-flow, and Claude Squad enable multi-agent orchestration that runs for extended periods with minimal oversight</p></li><li><p>By mid-2026, &#8220;fire and check in&#8221; will likely become the default for serious AI-assisted development</p></li></ul><div><hr></div><p>I&#8217;ve been running <a href="https://github.com/bobmatnyc/claude-mpm">claude-mpm</a> for months now, enforcing strict delegation to agent teams. And I&#8217;ve noticed something that&#8217;s reshaped how I think about AI development: I don&#8217;t nanny code anymore.</p><p>Here&#8217;s what that shift looked like. Six months ago, I&#8217;d prompt Claude to refactor a module, then watch it work. It would start down a path I didn&#8217;t like. I&#8217;d interrupt. Redirect. It would drift again. More interruption. The task might take two hours, but I&#8217;d be actively engaged for most of that time&#8212;half writing, half supervising.</p><p>Now? Last week I spun up an agent team to restructure the authentication flow in a client project. Wrote a detailed spec with acceptance criteria, test requirements, and constraints. Fired it off. Checked Slack, made coffee, reviewed some PRs on another project. Came back 45 minutes later to a working implementation that passed all the tests I&#8217;d specified. The agents had made different architectural choices than I would have, but the code worked and met the requirements. 
That&#8217;s the shift.</p><p>The cognitive move from &#8220;<a href="https://hyperdev.matsuoka.com/p/nanny-coding-why-we-still-should">supervise every response</a>&#8221; to &#8220;trust the framework, check the output&#8221; took longer than I expected. But once you&#8217;re on the other side of it, the old REPL model feels like driving with your hands on someone else&#8217;s wheel.</p><p>This is Part 2 of my CLI series. <a href="https://hyperdev.matsuoka.com/">Part 1 explored why command-line interfaces beat IDEs for agentic work</a>. Now I want to dig into what comes next: communication models for AI engineering teams that run autonomously for minutes or hours, needing only strategic direction from humans.</p><h2>HITL was never the destination</h2><p>Human-in-the-Loop made sense when models were flaky. You&#8217;d prompt, watch, correct, repeat. Every response needed validation. The REPL pattern emerged from genuine necessity.</p><p>But we&#8217;ve been treating HITL like an end state instead of a stepping stone. The pattern assumes human attention is cheap and AI judgment is expensive. That ratio has flipped. My attention is now the bottleneck. The agent teams aren&#8217;t perfect, but they&#8217;re good enough that constant supervision costs more than occasional correction.</p><p>The alternative is <a href="https://thenewstack.io/human-on-the-loop-the-new-ai-control-model-that-actually-works/">Human-on-the-Loop (HOTL)</a>. You set objectives, define constraints, establish checkpoints. Then step back. The system runs autonomously with structured telemetry. You intervene when escalation triggers fire, not when you feel like checking in.</p><p>HOTL emphasizes what researchers call minimal trust surface: limit what agents can access, track commands and file edits and external calls, pause on unexpected conditions, run agent outputs through validation like CI/CD for human code.
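</p><p>As one concrete illustration of that checklist, here is a minimal validation gate in Python. Everything in it is hypothetical: the <code>AgentResult</code> schema, the allowed paths, and the check rules are stand-ins, not any framework&#8217;s API.</p>

```python
from dataclasses import dataclass, field

@dataclass
class AgentResult:
    """Telemetry one agent run is expected to emit (hypothetical schema)."""
    files_touched: list = field(default_factory=list)
    commands_run: list = field(default_factory=list)
    tests_passed: bool = False

# Minimal trust surface: limit what agents can access.
ALLOWED_PATHS = ("src/", "tests/", "docs/")

def violations(result: AgentResult) -> list:
    """Run agent output through validation, like CI/CD for human code."""
    problems = []
    for path in result.files_touched:          # track file edits
        if not path.startswith(ALLOWED_PATHS):
            problems.append(f"out-of-scope edit: {path}")
    for cmd in result.commands_run:            # track external calls
        if cmd.startswith(("curl", "ssh")):
            problems.append(f"unexpected external call: {cmd}")
    if not result.tests_passed:
        problems.append("tests failing")
    return problems

def review(result: AgentResult) -> str:
    """HOTL: auto-accept clean runs, escalate only when a trigger fires."""
    problems = violations(result)
    return "escalate: " + "; ".join(problems) if problems else "auto-accept"
```

<p>An in-scope result with green tests auto-accepts; an out-of-scope edit or a stray <code>ssh</code> call escalates to a human instead. 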
It&#8217;s not &#8220;set and forget.&#8221; It&#8217;s &#8220;set and verify.&#8221;</p><p>The <a href="https://langchain-ai.lang.chat/langgraph/tutorials/multi_agent/agent_supervisor/">LangGraph implementation</a> makes this concrete. Their <code>interrupt()</code> functions pause graph execution mid-process, wait for human input, then resume cleanly. Strategic checkpoint placement matters: graph-level interrupts at predefined nodes, node-level interrupts for dynamic requests, approval loops before costly operations. The framework even supports time travel, rewinding to earlier states for alternative trajectories.</p><h2>The 90% threshold and TxDD</h2><p>Here&#8217;s my working theory on what makes autonomous operation actually work: Ticket-Driven Development. I call it TxDD. Want to see it in action? Look at some of my <a href="https://github.com/bobmatnyc">GitHub issues</a>.</p><p>The old approach was vibe coding&#8212;throwing loose prompts at agents and iterating through failures. Works fine for small tasks. Falls apart when you want agents running for an hour with minimal oversight.</p><p>TxDD inverts this. Before any code gets written, you build out a structured ticket hierarchy:</p><ul><li><p>Parent ticket with the overall objective and acceptance criteria</p></li><li><p>Sub-tickets breaking work into discrete, testable chunks</p></li><li><p>Each sub-ticket specifies success criteria, constraints, and validation steps</p></li><li><p>Dependencies mapped so agents know what to complete first</p></li></ul><p>With this level of specification, I&#8217;m tracking around 90% task completion on scoped work in my own projects. That&#8217;s based on reviewing agent output against the acceptance criteria in each ticket&#8212;did it complete the sub-tasks, meet the constraints, produce working code? Not everything hits that bar. Complex architectural decisions still need human judgment. Multi-day autonomous operation still drifts.
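</p><p>For a concrete feel of the structure, the ticket hierarchy above can be written down as plain data. The ticket IDs, fields, and tasks below are hypothetical examples, and the ordering agents would follow is just a topological sort over the dependency map:</p>

```python
from graphlib import TopologicalSorter

# A TxDD-style hierarchy: parent objective plus discrete sub-tickets,
# each with acceptance criteria and mapped dependencies (hypothetical).
tickets = {
    "AUTH-1": {"goal": "Restructure authentication flow",
               "acceptance": "all sub-tickets closed, integration tests green"},
    "AUTH-2": {"goal": "Extract session store behind an interface",
               "acceptance": "unit tests for store pass", "needs": []},
    "AUTH-3": {"goal": "Swap JWT issuing onto the new store",
               "acceptance": "token round-trip test passes", "needs": ["AUTH-2"]},
    "AUTH-4": {"goal": "Update docs and examples",
               "acceptance": "docs build clean", "needs": ["AUTH-3"]},
}

# Dependencies tell agents what to complete first.
graph = {tid: t["needs"] for tid, t in tickets.items() if "needs" in t}
order = list(TopologicalSorter(graph).static_order())
print(order)  # AUTH-2 before AUTH-3 before AUTH-4
```

<p>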
But for defined features, bug fixes, refactoring, documentation? The agents deliver more often than not.</p><p><a href="https://dev.to/dzianiskarviha/integrating-claude-code-into-production-workflows-lbn">A detailed case study</a> of Claude Code integration into a 350k+ LOC codebase showed 80%+ of code changes fully written by agents with an estimated 30-40% productivity increase. Their key: explicit plan review gates, feature-based directory structure for context management, custom subagents for code review, fresh context per subtask. The methodology matters more than the model.</p><h2>What people are building right now</h2><p>The ecosystem for autonomous agent operation has exploded. Four categories worth watching:</p><h3>Iteration loop enforcers</h3><p>The <a href="https://github.com/anthropics/claude-code/tree/main/plugins/ralph-wiggum">Ralph plugin</a> (officially ralph-wiggum, named after the Simpsons character who persists despite setbacks) implements a stop-hook pattern. When Claude would normally finish and return control, Ralph re-feeds the same prompt. The agent sees its previous work in modified files and git history. Progress persists through filesystem state, not conversation context.</p><p>Start a loop with <code>/ralph-loop "Build the authentication system" --completion-promise "DONE" --max-iterations 20</code>. The agent keeps working until it explicitly outputs &#8220;DONE&#8221; or hits the iteration limit. A <a href="https://github.com/frankbria/ralph-claude-code">community fork</a> adds intelligent rate limiting, circuit breakers with error detection, and tmux session integration. 
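</p><p>Stripped of those refinements, the stop-hook pattern reduces to a small loop: re-feed the same prompt until the agent emits the completion promise or the iteration cap hits, with progress living in files and git rather than conversation context. A sketch, where <code>run_agent()</code> is a hypothetical stand-in for an actual agent invocation:</p>

```python
def ralph_loop(prompt, run_agent, completion_promise="DONE", max_iterations=20):
    """Re-feed the same prompt; progress persists in filesystem state."""
    for i in range(1, max_iterations + 1):
        output = run_agent(prompt)   # stand-in for one headless agent run
        if completion_promise in output:
            return f"completed after {i} iteration(s)"
    return "hit iteration limit"

# Stand-in agent: pretends to finish on the third pass.
calls = {"n": 0}
def fake_agent(prompt):
    calls["n"] += 1
    return "DONE" if calls["n"] >= 3 else "still working"

print(ralph_loop("Build the authentication system", fake_agent))
# completed after 3 iteration(s)
```

<p>The fork mentioned above goes further. 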
It also catches agents spinning in failure loops with heuristics like <code>MAX_CONSECUTIVE_TEST_LOOPS=3</code>.</p><h3>Multi-agent orchestrators</h3><p><a href="https://github.com/smtg-ai/claude-squad">Claude Squad</a> provides a terminal TUI managing multiple agents (Claude Code, Aider, Codex CLI, Gemini) in separate workspaces with dual isolation through tmux sessions and git worktrees. YOLO mode auto-accepts all prompts for hands-off execution.</p><p>The <a href="https://aws.amazon.com/blogs/opensource/introducing-cli-agent-orchestrator-transforming-developer-cli-tools-into-a-multi-agent-powerhouse/">AWS CLI Agent Orchestrator</a> formalizes orchestration into four modes: Handoff (synchronous task transfer), Assign (asynchronous parallel execution), Send Message (direct communication with running agents), and Flow (scheduled runs via cron expressions).</p><p><a href="https://github.com/ruvnet/claude-flow">claude-flow</a> goes further with a hive-mind architecture featuring 64 specialized agents and a &#8220;queen&#8221; coordinator. Its Dynamic Agent Architecture supports self-organizing agents with fault tolerance. Stream-JSON chaining runs 40-60% faster than file-based agent communication.</p><h3>Commercial autonomous platforms</h3><p><a href="https://devin.ai/">Devin</a> operates through Slack integration. Message it like a colleague. Devin 2.0 added an Agent-Native IDE with embedded code editor, terminal, sandboxed browser, and smart planning tools. Enterprise customers like Nubank reportedly see 12x efficiency improvements on ETL migrations&#8212;though <a href="https://news.ycombinator.com/item?id=42734681">the answer.ai team&#8217;s rigorous trial</a> found only 3 out of 20 tasks succeeded (15%). The gap between vendor case studies and independent testing remains wide.</p><p><a href="https://factory.ai/">Factory AI</a> embeds task-specific &#8220;Droids&#8221; across the development workflow. 
Users assign GitHub issues to Factory; it pulls context and creates PRs automatically. The platform supports running 1000+ agents in parallel for migrations and allows fine-tuning how independently agents operate.</p><h3>Communication protocols</h3><p>The <a href="https://github.com/ag-ui-protocol/ag-ui">AG-UI protocol</a> has emerged as an open standard, emitting ~16 event types during agent executions. Status streaming operates at three layers: token-level (LLM generation as it happens), node-level (state transitions between workflow steps), and custom streaming (progress during long-running operations).</p><h2>The communication layer problem</h2><p>Running agents autonomously creates a new problem: how do you know what they&#8217;re doing?</p><p>I&#8217;ve been experimenting with a pattern where I fire up CLI tools in a tmux session and have an LLM interpret and summarize results. The outer LLM acts as my translator. It reads agent output, compresses hours of activity into digestible summaries, flags decisions that need my input.</p><p>This is emerging elsewhere too. <a href="https://factory.ai/">Factory AI&#8217;s research</a> shows structured summarization retains more useful information than raw conversation logs. <a href="https://blog.google/technology/developers/">Jules from Google</a> generates audio summaries of recent commits. 
<a href="https://github.com/cline/cline">Cline v3.15&#8217;s Task Timeline</a> provides a visual storyboard where each block represents a key step from initial prompt through tool execution to file edits.</p><p>Dashboards are maturing through tools like <a href="https://langfuse.com/">Langfuse</a> (open-source session tracking), <a href="https://docs.sentry.io/product/insights/ai/agents/dashboard/">Sentry AI Agents Insights</a> (traces showing LLM calls and tool errors), and <a href="https://github.com/mem0ai/mem0">Mem0</a> (multi-level memory for user, session, and agent state).</p><p>The pattern I keep seeing: LLMs reading LLM output and telling humans what happened. That translation layer didn&#8217;t exist six months ago, and it&#8217;s becoming essential infrastructure. Without it, autonomous operation creates opacity&#8212;you don&#8217;t know why an agent made a decision until something breaks.</p><h2>What works and what breaks</h2><p>Let me be specific about where autonomous operation delivers and where it fails.</p><p><strong>Works reliably:</strong></p><ul><li><p>Fixing merge conflicts, linter errors, and simple bugs with clear reproduction steps</p></li><li><p>Adding boilerplate code and writing tests for existing code</p></li><li><p>Refactoring within defined boundaries</p></li><li><p>Prototyping MVPs with clear specs</p></li><li><p>Documentation generation and updates</p></li></ul><p><strong>Still needs human oversight:</strong></p><ul><li><p>Complex architectural decisions</p></li><li><p>Enterprise-scale codebases (context limitations kill autonomy)</p></li><li><p>Multi-day autonomous operation (drift is real)</p></li><li><p>Anything requiring cross-system coordination</p></li><li><p>Security-sensitive changes</p></li></ul><p><a href="https://www.cerbos.dev/blog/productivity-paradox-of-ai-coding-assistants">The Cerbos analysis</a> calls this the &#8220;70% problem&#8221;: AI gets you most of the way, but the last 30% sometimes takes longer than doing 
it from scratch. I think that understates current capability for teams with good specification discipline. With TxDD, I&#8217;m seeing closer to 90% on well-scoped tasks&#8212;but &#8220;well-scoped&#8221; does a lot of work in that sentence.</p><h2>Fire and check in becomes the norm</h2><p>Here&#8217;s my prediction for 2026: &#8220;fire and check in&#8221; will likely replace REPL as the default interaction pattern for serious AI-assisted development.</p><p>Not &#8220;set and forget.&#8221; That&#8217;s too optimistic. But the ratio of human attention to agent execution time should collapse. Instead of watching agents work, you&#8217;ll:</p><ol><li><p>Write detailed specs in structured tickets (TxDD discipline)</p></li><li><p>Fire up agent teams with clear objectives</p></li><li><p>Check in periodically for status summaries</p></li><li><p>Review completed work in batches</p></li><li><p>Intervene only when escalation triggers fire</p></li></ol><p><a href="https://cursor.sh/">Cursor 2.0&#8217;s Background Agents</a> are already pushing this direction, running up to 8 agents simultaneously with OS notifications when agents complete or need input. <a href="https://replit.com/agent">Replit Agent 3</a> offers &#8220;Max Autonomy Mode&#8221; running up to 200 minutes continuously with minimal supervision.</p><p>The tooling exists. The communication layers are maturing. The missing piece is developer discipline: the willingness to invest upfront in specification quality so agents can run autonomously.</p><p>That&#8217;s the real skill transfer happening right now. We&#8217;re learning to manage AI engineering teams the way we&#8217;d manage human ones: clear objectives, defined constraints, trust but verify, intervene on exceptions rather than supervising every action.</p><h2>What I&#8217;m building toward</h2><p>My own workflow is converging on something like an engineering team I direct but don&#8217;t supervise. Claude-mpm handles the orchestration layer. 
TxDD specs define what &#8220;done&#8221; looks like. Agent teams run in tmux sessions. A summarization layer tells me what happened.</p><p>The part I&#8217;m still figuring out: communication abstraction. Right now I&#8217;m firing off specs and checking results manually. I want a pattern where I can queue tasks, get progress notifications, review outputs in batches, redirect when needed. Something closer to a project management interface than a chat window.</p><p>That&#8217;s what&#8217;s next after the REPL. Not faster autocomplete. Not smarter suggestions. A different relationship entirely: humans as strategic directors of autonomous AI engineering teams.</p><p>The agents are almost good enough. The frameworks exist. The remaining gap is process and tooling for that translation layer between human intent and agent execution. Whoever solves that owns 2026.</p><div><hr></div><p><em>I&#8217;m Bob Matsuoka, writing about agentic coding and AI-powered development at <a href="https://hyperdev.substack.com/">HyperDev</a>. For more on this topic, see <a href="https://open.substack.com/pub/hyperdev/p/the-age-of-the-cli-part-1?r=nff5&amp;utm_campaign=post&amp;utm_medium=web&amp;showWelcomeOnShare=true">Part 1: Why CLI Beats IDE for Agentic Work</a> or my deep dive into <a href="https://hyperdev.matsuoka.com/">claude-mpm orchestration patterns</a>.</em></p>]]></content:encoded></item></channel></rss>