<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Hyperdev: Releases]]></title><description><![CDATA[Release news for my projects:  https://aipowerranking.com/en; https://www.npmjs.com/package/@bobmatnyc/ai-code-review; https://www.npmjs.com/package/@bobmatnyc/mcp-desktop-gateway]]></description><link>https://hyperdev.matsuoka.com/s/releases</link><image><url>https://substackcdn.com/image/fetch/$s_!j9a7!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab665959-5546-4469-9e93-9e1518976e2b_1024x1024.png</url><title>Hyperdev: Releases</title><link>https://hyperdev.matsuoka.com/s/releases</link></image><generator>Substack</generator><lastBuildDate>Tue, 05 May 2026 17:05:13 GMT</lastBuildDate><atom:link href="https://hyperdev.matsuoka.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Robert Matsuoka]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[hyperdev@matsuoka.com]]></webMaster><itunes:owner><itunes:email><![CDATA[hyperdev@matsuoka.com]]></itunes:email><itunes:name><![CDATA[Robert Matsuoka]]></itunes:name></itunes:owner><itunes:author><![CDATA[Robert Matsuoka]]></itunes:author><googleplay:owner><![CDATA[hyperdev@matsuoka.com]]></googleplay:owner><googleplay:email><![CDATA[hyperdev@matsuoka.com]]></googleplay:email><googleplay:author><![CDATA[Robert Matsuoka]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Claude Code 2.1.0 ships ]]></title><description><![CDATA[skills hot-reload and 45+ bug fixes]]></description><link>https://hyperdev.matsuoka.com/p/claude-code-210-ships</link><guid 
isPermaLink="false">https://hyperdev.matsuoka.com/p/claude-code-210-ships</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Wed, 07 Jan 2026 23:57:08 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!uko0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00cc769c-4e5d-4c7a-9a98-eae76f60f24e_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uko0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00cc769c-4e5d-4c7a-9a98-eae76f60f24e_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uko0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00cc769c-4e5d-4c7a-9a98-eae76f60f24e_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!uko0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00cc769c-4e5d-4c7a-9a98-eae76f60f24e_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!uko0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00cc769c-4e5d-4c7a-9a98-eae76f60f24e_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!uko0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00cc769c-4e5d-4c7a-9a98-eae76f60f24e_1024x1024.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!uko0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00cc769c-4e5d-4c7a-9a98-eae76f60f24e_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/00cc769c-4e5d-4c7a-9a98-eae76f60f24e_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1723040,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/183853952?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00cc769c-4e5d-4c7a-9a98-eae76f60f24e_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uko0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00cc769c-4e5d-4c7a-9a98-eae76f60f24e_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!uko0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00cc769c-4e5d-4c7a-9a98-eae76f60f24e_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!uko0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00cc769c-4e5d-4c7a-9a98-eae76f60f24e_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!uko0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00cc769c-4e5d-4c7a-9a98-eae76f60f24e_1024x1024.png 
1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><strong>Claude Code 2.1.0 launched January 7, 2026</strong>, marking Anthropic&#8217;s first feature release of the year with significant improvements to the skills system, terminal compatibility, and permission handling. The update introduces automatic skill hot-reloading and wildcard patterns for Bash permissions, and it fixes a security vulnerability that could expose sensitive tokens in debug logs. 
While no pricing changes accompany this release, community sentiment around Claude Code remains mixed&#8212;developers praise Opus 4.5&#8217;s capabilities but express frustration with usage limits.</p><h2>Skills system gets major quality-of-life upgrades</h2><p>The headline feature in 2.1.0 is <strong>automatic skill hot-reload</strong>: skills created or modified in <code>~/.claude/skills</code> or <code>.claude/skills</code> now activate immediately without restarting the session. This eliminates a significant friction point for developers iterating on custom workflows.</p><p>Additional skills improvements include:</p><ul><li><p><strong>Forked sub-agent context</strong>: Skills can now run in isolated sub-agent contexts using <code>context: fork</code> in frontmatter</p></li><li><p><strong>Progress indicators</strong>: Skills display tool uses in real-time during execution</p></li><li><p><strong>Improved suggestions</strong>: Recently and frequently used skills get prioritized in recommendations</p></li><li><p><strong>Visibility controls</strong>: Skills from <code>/skills/</code> directories appear in the slash command menu by default (opt-out via <code>user-invocable: false</code>)</p></li></ul><p>The <code>agent</code> field in skill frontmatter now allows specifying which agent type executes the skill, enabling more granular control over autonomous operations.</p><h2>Terminal and keyboard handling sees broad fixes</h2><p>Version 2.1.0 addresses longstanding terminal compatibility issues. <strong>Shift+Enter now works out of the box</strong> in iTerm2, WezTerm, Ghostty, and Kitty without requiring terminal configuration changes. 
Word navigation (Alt+B/Alt+F) has been fixed across these terminals, and Cmd+V now supports image paste in iTerm2.</p><p>The release adds substantial vim motion support:</p><ul><li><p><code>;</code> and <code>,</code> for repeating f/F/t/T motions</p></li><li><p>Full yank/paste with <code>y</code>, <code>yy</code>/<code>Y</code>, <code>p</code>/<code>P</code></p></li><li><p>Text objects: <code>iw</code>, <code>aw</code>, <code>iW</code>, <code>aW</code>, plus quote and bracket variants</p></li><li><p>Indent/dedent with <code>&gt;&gt;</code> and <code>&lt;&lt;</code></p></li><li><p>Line joining with <code>J</code></p></li></ul><h2>Wildcard permissions reduce approval fatigue</h2><p>A practical improvement for power users: <strong>Bash tool permissions now support wildcard pattern matching</strong> using <code>*</code> at any position. Developers can configure rules like <code>Bash(npm *)</code>, <code>Bash(* install)</code>, or <code>Bash(git * main)</code> to pre-approve command families. 
Combined with the removal of permission prompts when entering plan mode, this significantly reduces interruptions during autonomous workflows.</p><p>The unified Ctrl+B backgrounding now handles both bash commands and agents simultaneously, streamlining the background task experience.</p><h2>New configuration options expand customization</h2><p>Four new settings address specific user requests:</p><ul><li><p><code>language</code>: Configure Claude&#8217;s response language (e.g., <code>language: "japanese"</code>)</p></li><li><p><code>respectGitignore</code>: Per-project control over @-mention file picker behavior in settings.json</p></li><li><p><code>CLAUDE_CODE_HIDE_ACCOUNT_INFO</code>: Hide email/organization from UI</p></li><li><p><code>CLAUDE_CODE_FILE_READ_MAX_OUTPUT_TOKENS</code>: Override default file read token limit</p></li></ul><h2>Security fix patches token exposure in debug logs</h2><p>The release includes a <strong>critical security fix</strong> preventing OAuth tokens, API keys, and passwords from appearing in debug logs. Organizations using Claude Code in CI/CD pipelines or shared development environments should upgrade immediately.</p><p>Other notable fixes address:</p><ul><li><p>Session resume failures from orphaned tool results during concurrent execution</p></li><li><p>OAuth token refresh race conditions with stale keychain cache</p></li><li><p>Memory leak in git diff parsing where sliced strings retained large parent strings</p></li><li><p>Files created via Write tool now respect system umask instead of hardcoded 0o600 permissions</p></li></ul><h2>MCP and hooks gain new capabilities</h2><p>Model Context Protocol support expands with <code>list_changed</code> notifications, allowing MCP servers to dynamically update available tools, prompts, and resources without reconnection. 
Skill frontmatter also accepts YAML-style list syntax in the <code>allowed-tools</code> field, which simplifies skill declarations.</p><p>Hooks receive several enhancements:</p><ul><li><p>Support for prompt and agent hook types from plugins (previously limited to command hooks)</p></li><li><p>Agent frontmatter can now define PreToolUse, PostToolUse, and Stop hooks scoped to the agent&#8217;s lifecycle</p></li><li><p>New <code>once: true</code> config option for single-execution hooks</p></li></ul><h2>Breaking change requires zod 4.0+</h2><p>The SDK&#8217;s minimum zod peer dependency has changed to <strong>^4.0.0</strong>, potentially requiring updates for projects using older zod versions. The Atlassian MCP integration also switched to streamable HTTP as the default configuration.</p><h2>Community reception reflects broader tensions</h2><p>While 2.1.0 addresses many requested fixes, developer sentiment around Claude Code remains polarized. On Hacker News, users praise Anthropic&#8217;s shipping velocity (&#8220;It&#8217;s breathtaking how fast the Claude Code team ships&#8221;) and Opus 4.5&#8217;s code quality. Boris Cherny&#8217;s December tweet claiming <strong>259 PRs and 40,000 lines</strong> written entirely by Claude Code garnered 4.4M views.</p><p>However, usage limits dominate complaint threads. Reddit and GitHub issues document developers &#8220;burning through the whole damn quota in one or two days&#8221; even on $200/month Max subscriptions. The expiration of Anthropic&#8217;s holiday bonus (doubled limits December 25-31) triggered accusations of &#8220;bait and switch&#8221; pricing. Some developers report quality inconsistencies during peak hours, though Anthropic officially denies throttling.</p><h2>Competitive positioning remains strong but contested</h2><p>Claude Code maintains approximately <strong>70% market share</strong> among agentic coding tools according to Vibe Kanban data, though this is down from 83% in September 2025. 
The tool excels at complex multi-file operations and autonomous refactoring&#8212;benchmarks show 77.2% accuracy on SWE-bench with the 200K context window.</p><p>Competitors have narrowed the gap: Cursor&#8217;s $20/month unlimited model attracts cost-conscious developers, while OpenAI Codex gains traction for structured, step-controlled workflows. GosuEvals benchmarks now rank Kiro, Windsurf, and Crush ahead of Claude Code, though margins are within 10%.</p><p>For most developers, the pragmatic approach combines Claude Code for complex architectural tasks with lighter tools for daily coding&#8212;a pattern that 2.1.0&#8217;s improved Bash permissions and skill system further enables.</p><h2>Conclusion</h2><p>Claude Code 2.1.0 delivers meaningful quality-of-life improvements rather than headline features. The skills hot-reload, wildcard permissions, and extensive vim motions address genuine workflow friction points. The security fix is essential for enterprise deployments. However, the release doesn&#8217;t address the community&#8217;s primary frustration&#8212;usage limits&#8212;which continues driving some developers toward alternatives. 
For teams already committed to Claude Code, the upgrade is straightforward (watch the zod dependency) and immediately beneficial.</p>]]></content:encoded></item><item><title><![CDATA[MCP Vector Search: Semantic Search for Code]]></title><description><![CDATA[The AI Tool That Doesn&#8217;t Cost Per Query]]></description><link>https://hyperdev.matsuoka.com/p/mcp-vector-search-semantic-search</link><guid isPermaLink="false">https://hyperdev.matsuoka.com/p/mcp-vector-search-semantic-search</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Thu, 11 Dec 2025 14:30:54 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!x7DL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20813bba-d6ce-455c-871b-e6e2679e02c8_1433x1164.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;b1dec481-1984-47af-aece-8eabf6d76d75&quot;,&quot;duration&quot;:null}"></div><p>We spend a lot of time talking about token costs. API budgets. The $200/month subscription that gets exhausted in three days. But here&#8217;s something that flips that entire conversation: vector search costs you nothing after the initial index.</p><p>Zero. Once your codebase is embedded, searches are pure math. No inference. No API calls. No watching your Claude Max credits drain while you hunt for that authentication middleware you know exists somewhere.</p><p>I&#8217;ve been building <a href="https://github.com/bobmatnyc/mcp-vector-search">mcp-vector-search</a> for the past few months, and it&#8217;s become one of the most useful tools in my development workflow&#8212;not because it&#8217;s fancy, but because it&#8217;s <em>always there</em>. No rate limits. No &#8220;please wait&#8221; messages. 
No monthly bill anxiety.</p><h2>Why Semantic Search Beats Grep for LLM Context</h2><p>Here&#8217;s what traditional code search gives you: exact string matches. Grep finds <code>authenticate_user</code>. Ripgrep finds it faster. Neither one finds &#8220;the part of the code that handles login verification&#8221; when somebody named that function <code>verify_credentials</code> instead.</p><p>Semantic search understands meaning. Ask for &#8220;authentication middleware&#8221; and it returns code that <em>does</em> authentication&#8212;regardless of what the developer named things. Ask for &#8220;error handling patterns&#8221; and it finds try/catch blocks, custom error classes, logging calls, the whole ecosystem of how your codebase deals with failures.</p><p>But here&#8217;s the thing that really matters for LLM-assisted development: this makes coding agents dramatically more effective.</p><p>Augment Code figured this out early. Their context builder uses vector search to understand your codebase&#8212;it&#8217;s one of the reasons Auggie can answer questions about your project without chewing through every file. Claude Code doesn&#8217;t have this yet. It relies on scanning, grepping, reading directories. Works, but it&#8217;s slower and burns tokens doing reconnaissance.</p><p>mcp-vector-search gives Claude Code users a similar advantage. (I&#8217;m sure Augment has additional tricks up their sleeve, but the core capability is the same.) The agent asks &#8220;where does this project handle database connections?&#8221; and gets precise, ranked results in milliseconds. No scanning. No guessing. Just direct access to relevant code.</p><h2>The Cheap LLM Layer</h2><p>Vector search alone returns code chunks ranked by semantic similarity. Useful, but sometimes you want more than raw results.</p><p>So I added a thin LLM controller layer. Nothing expensive&#8212;Haiku or GPT-4-mini for quick query generation and result consolidation. 
The model helps translate natural language questions into effective search queries, then synthesizes the results into coherent answers.</p><p>The economics work because the LLM does minimal work. It&#8217;s not reading your entire codebase. It&#8217;s not generating code. It&#8217;s just:</p><ol><li><p>Turning &#8220;how does auth work here?&#8221; into an optimized search query</p></li><li><p>Looking at the top 5-10 results</p></li><li><p>Summarizing what it found</p></li></ol><p>Total cost per question? Fractions of a cent. Maybe a penny for complex queries.</p><p>For actual Q&amp;A (&#8220;explain the main architecture&#8221; or &#8220;walk me through the payment flow&#8221;), I bump up to Sonnet or GPT-4o. These need reasoning, not just consolidation. But even then, the context is pre-filtered by vector search, so the models see exactly what they need instead of wading through irrelevant files.</p><h2>The Killer Combo: Answerable Codebases</h2><p>Give an LLM access to vector search tools and something interesting happens. Your codebase becomes <em>answerable</em>.</p><pre><code><code>mcp-vector-search chat "how does the login flow work?"
</code></code></pre><p>The tool searches semantically, retrieves relevant code, feeds it to the LLM, and returns an explanation grounded in your actual implementation. Not generic advice about authentication patterns. Your code. Your architecture. Your specific implementation details.</p><p>This works through MCP integration too&#8212;Claude Desktop can query your indexed codebase directly during conversations. Ask about your code while you&#8217;re planning changes, and Claude pulls the relevant context without you having to copy-paste files into the chat.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!x7DL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20813bba-d6ce-455c-871b-e6e2679e02c8_1433x1164.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!x7DL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20813bba-d6ce-455c-871b-e6e2679e02c8_1433x1164.png 424w, https://substackcdn.com/image/fetch/$s_!x7DL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20813bba-d6ce-455c-871b-e6e2679e02c8_1433x1164.png 848w, https://substackcdn.com/image/fetch/$s_!x7DL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20813bba-d6ce-455c-871b-e6e2679e02c8_1433x1164.png 1272w, https://substackcdn.com/image/fetch/$s_!x7DL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20813bba-d6ce-455c-871b-e6e2679e02c8_1433x1164.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!x7DL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20813bba-d6ce-455c-871b-e6e2679e02c8_1433x1164.png" width="1433" height="1164" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/20813bba-d6ce-455c-871b-e6e2679e02c8_1433x1164.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1164,&quot;width&quot;:1433,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:217749,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/181237894?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20813bba-d6ce-455c-871b-e6e2679e02c8_1433x1164.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!x7DL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20813bba-d6ce-455c-871b-e6e2679e02c8_1433x1164.png 424w, https://substackcdn.com/image/fetch/$s_!x7DL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20813bba-d6ce-455c-871b-e6e2679e02c8_1433x1164.png 848w, https://substackcdn.com/image/fetch/$s_!x7DL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20813bba-d6ce-455c-871b-e6e2679e02c8_1433x1164.png 1272w, https://substackcdn.com/image/fetch/$s_!x7DL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20813bba-d6ce-455c-871b-e6e2679e02c8_1433x1164.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>What&#8217;s Available Now</h2><p>Everything I just described ships today:</p><pre><code><code># Install
pipx install mcp-vector-search

# Index your codebase
cd your-project
mcp-vector-search init
mcp-vector-search index

# Search semantically
mcp-vector-search search "authentication middleware"

# Chat with your code
mcp-vector-search chat "explain the main architecture"
</code></code></pre><p>The <code>chat</code> command does dual-intent detection&#8212;it figures out if you&#8217;re asking a question (explain something) or searching (find something) and responds appropriately. Add <code>--think</code> for complex reasoning that uses the heavier models.</p><p>MCP server integration means Claude Desktop can use your indexed codebase as a tool:</p><pre><code><code>mcp-vector-search setup --platform claude_desktop
</code></code></pre><p>Then Claude has access to <code>search_code</code>, <code>search_similar</code>, <code>search_context</code>, and other tools for querying your project during any conversation.</p><h2>Coming Soon: Structural Analysis</h2><p>Vector search tells you what code means. But it doesn&#8217;t tell you if that code is a mess.</p><p>I&#8217;m adding an <code>analyze</code> command that does structural analysis&#8212;the kind of metrics you&#8217;d get from SonarQube or similar static analysis tools:</p><ul><li><p><strong>Cognitive complexity</strong>: How hard is this function to understand?</p></li><li><p><strong>Cyclomatic complexity</strong>: How many paths through the code?</p></li><li><p><strong>Nesting depth</strong>: How many levels deep does the indentation go?</p></li><li><p><strong>Coupling metrics</strong>: How tangled are these modules?</p></li></ul><p>The goal: quality-aware search. Filter results by complexity (<code>--max-complexity 15</code>), exclude code smells (<code>--no-smells</code>), weight rankings by code health.</p><p>Just starting development on this now. Phase 1 targets core metrics, Phase 2 adds CI/CD integration with SARIF output, Phase 3 brings cross-file analysis for coupling and circular dependencies.</p><h2>The TkDD Experiment</h2><p>Here&#8217;s where it gets meta. 
I&#8217;m building this entire feature using Ticket-Driven Development with Claude-MPM.</p><p>The workflow: my agents created the tickets (using <a href="https://github.com/bobmatnyc/mcp-ticketer">mcp-ticketer</a>, another tool worth checking out), and they&#8217;ll build the project from those tickets, updating progress as they go.</p><p>If you&#8217;re interested in TkDD&#8212;how AI agents can orchestrate complex projects through structured ticket workflows&#8212;you can follow along in real time:</p><ul><li><p><strong>Project Board</strong>: <a href="https://github.com/users/bobmatnyc/projects/13">https://github.com/users/bobmatnyc/projects/13</a></p></li><li><p><strong>Milestones</strong>: <a href="https://github.com/bobmatnyc/mcp-vector-search/milestones">https://github.com/bobmatnyc/mcp-vector-search/milestones</a></p></li></ul><p>Watch the tickets move from Backlog &#8594; Ready &#8594; In Progress &#8594; Done. See the commits reference issues. Watch PRs close tickets with evidence. 
It&#8217;s TkDD in public, with working code you can actually try at each milestone.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hpe5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb14e60e1-a82e-45bd-a737-f796e5c384f5_1310x920.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hpe5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb14e60e1-a82e-45bd-a737-f796e5c384f5_1310x920.png 424w, https://substackcdn.com/image/fetch/$s_!hpe5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb14e60e1-a82e-45bd-a737-f796e5c384f5_1310x920.png 848w, https://substackcdn.com/image/fetch/$s_!hpe5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb14e60e1-a82e-45bd-a737-f796e5c384f5_1310x920.png 1272w, https://substackcdn.com/image/fetch/$s_!hpe5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb14e60e1-a82e-45bd-a737-f796e5c384f5_1310x920.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hpe5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb14e60e1-a82e-45bd-a737-f796e5c384f5_1310x920.png" width="1310" height="920" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b14e60e1-a82e-45bd-a737-f796e5c384f5_1310x920.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:920,&quot;width&quot;:1310,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:237378,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/181237894?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb14e60e1-a82e-45bd-a737-f796e5c384f5_1310x920.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hpe5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb14e60e1-a82e-45bd-a737-f796e5c384f5_1310x920.png 424w, https://substackcdn.com/image/fetch/$s_!hpe5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb14e60e1-a82e-45bd-a737-f796e5c384f5_1310x920.png 848w, https://substackcdn.com/image/fetch/$s_!hpe5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb14e60e1-a82e-45bd-a737-f796e5c384f5_1310x920.png 1272w, https://substackcdn.com/image/fetch/$s_!hpe5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb14e60e1-a82e-45bd-a737-f796e5c384f5_1310x920.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>The Bottom Line</h2><p>Vector search isn&#8217;t new technology. But packaging it as a CLI tool that any developer can install in 30 seconds and point at their codebase? That changes things.</p><p>Augment Code users have had this kind of semantic understanding built in. Now Claude Code users can get something similar&#8212;without the subscription, without the vendor lock-in, and without per-query costs eating into your budget.</p><p>No inference costs for search. Cheap LLM layer for intelligence. MCP integration for Claude Desktop. Structural analysis coming soon. All built in public with TkDD you can follow.</p><p>The future of code understanding isn&#8217;t better grep. 
It&#8217;s semantic infrastructure that makes your codebase queryable, analyzable, and answerable&#8212;without burning your API budget doing it.</p><div><hr></div><p><em><a href="https://github.com/bobmatnyc/mcp-vector-search">mcp-vector-search</a> is open source, as is <a href="https://github.com/bobmatnyc/mcp-ticketer">mcp-ticketer</a>.</em></p><p><em>I&#8217;m Bob Matsuoka, writing about agentic coding and AI-powered development at <a href="https://hyperdev.substack.com/">HyperDev</a>. For more on multi-agent orchestration, read my piece on <a href="https://claude.ai/claude-mpm">Claude-MPM</a> or my analysis of <a href="https://claude.ai/orchestration-landscape">the orchestration landscape</a>.</em></p>]]></content:encoded></item><item><title><![CDATA[Claude MPM 5]]></title><description><![CDATA[Git-First Agent/Skills Distribution]]></description><link>https://hyperdev.matsuoka.com/p/claude-mpm-5</link><guid isPermaLink="false">https://hyperdev.matsuoka.com/p/claude-mpm-5</guid><dc:creator><![CDATA[Robert Matsuoka]]></dc:creator><pubDate>Sun, 07 Dec 2025 20:42:39 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Cxpf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F090d10bf-cd70-45d4-b807-004c919678f2_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Cxpf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F090d10bf-cd70-45d4-b807-004c919678f2_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!Cxpf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F090d10bf-cd70-45d4-b807-004c919678f2_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Cxpf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F090d10bf-cd70-45d4-b807-004c919678f2_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Cxpf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F090d10bf-cd70-45d4-b807-004c919678f2_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Cxpf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F090d10bf-cd70-45d4-b807-004c919678f2_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Cxpf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F090d10bf-cd70-45d4-b807-004c919678f2_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/090d10bf-cd70-45d4-b807-004c919678f2_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1363776,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://hyperdev.matsuoka.com/i/180983197?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F090d10bf-cd70-45d4-b807-004c919678f2_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Cxpf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F090d10bf-cd70-45d4-b807-004c919678f2_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Cxpf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F090d10bf-cd70-45d4-b807-004c919678f2_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Cxpf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F090d10bf-cd70-45d4-b807-004c919678f2_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Cxpf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F090d10bf-cd70-45d4-b807-004c919678f2_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" 
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The biggest friction point in multi-agent orchestration frameworks isn&#8217;t the orchestration itself&#8212;it&#8217;s keeping agents and skills updated and enabling contributions. Claude MPM 5.0 solves this with a git-first architecture that treats agent and skill repositories as first-class citizens.</p><h2>The Problem We Were Trying to Solve</h2><p>Before 5.0, updating Claude MPM agents meant waiting for a package release. Contributing a new agent meant understanding the entire build system, submitting a PR to the main repository, and hoping someone reviewed it before your use case became irrelevant.</p><p>This created a painful bottleneck. Teams building custom agents had no clean way to share them. The official agent set updated slowly. And the cognitive overhead of contribution meant most improvements never made it upstream.</p><p>The fix: treat agents like any other code artifact. Put them in git repositories. Let organizations maintain their own collections. Let priority rules handle conflicts.</p><h2>How It Works</h2><p>Claude MPM now syncs agents and skills from configurable git repositories at startup. The default configuration pulls from the following repositories:</p><p><strong>Agents</strong>: <code>bobmatnyc/claude-mpm-agents</code> (47+ agents)<br><strong>Skills</strong>: <code>bobmatnyc/claude-mpm-skills</code> (community) + <code>anthropics/skills</code> (official Anthropic skills)</p><p>Adding your own repository takes one command:</p><pre><code><code>claude-mpm agent-source add https://github.com/yourorg/your-agents
</code></code></pre><p>The <code>--test</code> flag validates the repository before saving configuration&#8212;fail-fast behavior that prevents startup issues later:</p><pre><code><code>claude-mpm agent-source add https://github.com/yourorg/your-agents --test
</code></code></pre><p>Priority-based resolution handles conflicts when multiple repositories provide the same agent. Lower numbers win:</p><pre><code><code># ~/.claude-mpm/config/agent_sources.yaml
repositories:
  - url: https://github.com/myteam/agents
    priority: 10    # Your custom agents take precedence
    
  - url: https://github.com/bobmatnyc/claude-mpm-agents
    priority: 100   # System defaults as fallback
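
  # Hypothetical third entry, added for illustration (not from the original config):
  # priority 50 would override the system defaults (100) but still lose to
  # the team repository (10), since lower numbers win.
  - url: https://github.com/yourorg/your-agents
    priority: 50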
</code></code></pre><p>This means organizations can override any system agent with their own version while still receiving updates for agents they haven&#8217;t customized.</p><h2>The Hierarchical BASE-AGENT.md Pattern</h2><p>One feature that emerged during development addresses a common maintenance headache: shared instructions across related agents.</p><p>Consider a team with Python specialists for FastAPI, Django, and Flask. All three need the same Python style guidelines, testing expectations, and code review standards. Before 5.0, you&#8217;d duplicate that content in each agent file and pray you remembered to update all copies.</p><p>The hierarchical BASE-AGENT.md pattern solves this:</p><pre><code><code>your-agents/
  BASE-AGENT.md              # All agents inherit this
  engineering/
    BASE-AGENT.md            # Engineering agents inherit this too
    python/
      fastapi-engineer.md    # Inherits both BASE files
      django-engineer.md     # Inherits both BASE files
    rust/
      systems-engineer.md    # Inherits engineering + root BASE
</code></code></pre><p>Each agent inherits from all BASE-AGENT.md files in parent directories, cascading from root to leaf. Update the Python testing standards once, and every Python agent receives the change. Same inheritance pattern used in configuration management for decades&#8212;just applied to agent instructions.</p><h2>Performance: ETag Caching and Two-Phase Progress</h2><p>Git operations on every startup would be intolerable. ETag-based HTTP caching reduces network traffic by 95%+ after the initial sync. The framework only pulls when content has actually changed.</p><p>Visibility into the sync/deploy process matters when troubleshooting. Two-phase progress bars now show distinct stages:</p><ol><li><p><strong>Sync phase</strong>: Repository cloning and updates</p></li><li><p><strong>Deploy phase</strong>: File discovery and deployment to ~/.claude/agents/</p></li></ol><p>Real counts, real-time updates. When something fails, you know exactly where.</p><h2>The Contribution Workflow Is Now Just Git</h2><p>The agent cache at <code>~/.claude-mpm/cache/remote-agents/</code> is a full git repository. To contribute:</p><ol><li><p>Edit agents in the cache directory</p></li><li><p>Test locally with <code>claude-mpm run</code></p></li><li><p>Commit and push</p></li></ol><p>That&#8217;s it. No build system. No package release cycle. The contribution workflow matches what developers already do for any other code.</p><p>Previously, I&#8217;d encounter a situation where an agent needed tweaking&#8212;the Python engineer didn&#8217;t handle async context managers well, or the QA agent missed a testing pattern. The fix was obvious, maybe 10 lines. But the contribution overhead meant I&#8217;d make a local note and move on. The improvement never happened.</p><p>Now I edit the agent in the cache, test it works, commit with a conventional commit message, and push. The improvement exists in the shared repository within minutes. 
Other users get it on their next startup sync.</p><p>For organizations, this means maintaining internal agent repositories becomes trivial. Fork the community repository, add your customizations, configure priority, done. Your sales engineering team can have agents tuned for demo preparation. Your platform team can have agents that understand your infrastructure conventions. Each group maintains their own repository without coordination overhead.</p><p>The priority system means customizations don&#8217;t require abandoning upstream improvements. Set your custom repository at priority 10, keep the community repository at priority 100. Your overrides win for agents you&#8217;ve customized. Everything else updates automatically.</p><h2>Nested Repository Support</h2><p>Some skill repositories organize content in nested directories&#8212;category folders, framework folders, complexity levels. Claude Code requires a flat structure in <code>~/.claude/skills/</code>.</p><p>Rather than force repository maintainers to flatten their organization, the framework handles this automatically. Nested SKILL.md files are discovered recursively and flattened during deployment. Original directory structure is preserved in metadata for reference. One less barrier to contributing skill collections with sensible organization.</p><h2>What&#8217;s Coming: DeepEval-Based Behavioral Reinforcement</h2><p>The next major feature addresses a harder problem: ensuring agents actually follow their instructions.</p><p>Multi-agent systems have a hidden reliability problem. You write careful instructions telling the PM agent to delegate work. You define circuit breakers for when agents should stop. You require evidence for claims. 
Then you deploy and cross your fingers.</p><p>We&#8217;ve built a behavioral evaluation framework based on DeepEval that treats agent compliance as a testable property.</p><p>The framework tests across 51 scenarios in 6 categories:</p><ul><li><p><strong>Delegation patterns</strong>: Does the PM agent properly delegate instead of doing work directly?</p></li><li><p><strong>Circuit breakers</strong>: Do agents stop when they should?</p></li><li><p><strong>Tool usage</strong>: Are agents using the correct tools for each task?</p></li><li><p><strong>Workflow compliance</strong>: Do agents follow defined workflows?</p></li><li><p><strong>Evidence requirements</strong>: Do agents provide evidence for claims?</p></li><li><p><strong>File tracking</strong>: Do agents properly track files they create?</p></li></ul><p>Each scenario defines an input, expected behavior, and scoring criteria. The scoring system provides measurable feedback: 1.0 for exact match, 0.8 for acceptable fallback, 0.0 for failure. A MockPMAgent with intelligent agent selection logic validates responses against compliance rules.</p><p>Delegation authority testing (DEL-011) presents a task that should be delegated to the ticketing agent. The test verifies the PM produces a delegation response, not a direct action. Eight sub-scenarios cover edge cases: ambiguous requests, multi-step workflows, fallback conditions.</p><p>The universal delegation meta-test (DEL-000) goes further. It synthesizes novel work types not explicitly covered in instructions, testing whether the PM generalizes delegation patterns correctly. This catches instruction gaps before users encounter them.</p><p>Early testing revealed some uncomfortable findings. The PM agent was delegating correctly in most scenarios but occasionally bypassed the ticketing agent for &#8220;simple&#8221; operations&#8212;a 12% violation rate on what should be absolute rules. The security agent provided recommendations without evidence 15% of the time. 
The research agent sometimes made claims without citing sources.</p><p>These patterns were invisible without systematic behavioral testing. Manual spot-checking missed them entirely. Only by running hundreds of scenarios did the failure modes become apparent.</p><p>The goal is tight feedback loops: modify instructions, run behavioral tests, see exactly what changed. The same principle that makes test-driven development work for code should work for agent instructions. CI/CD pipelines can catch behavioral regressions before they reach production.</p><p>We&#8217;re also exploring whether behavioral test results can feed back into agent instructions&#8212;automated identification of instruction gaps based on failure patterns. This is speculative, but the data from systematic testing opens possibilities that weren&#8217;t available when we were flying blind.</p><p>This isn&#8217;t ready for release yet&#8212;we&#8217;re validating the test scenarios across different Claude model versions and building the integration with existing CI pipelines. But it represents where agent orchestration frameworks need to go: from &#8220;hope the agents behave&#8221; to &#8220;verify the agents behave.&#8221;</p><h2>Upgrade Path</h2><p>For existing installations:</p><pre><code><code>pipx upgrade claude-mpm</code></code></pre><p>Existing configurations are preserved. The git-first architecture activates automatically with sane defaults. You&#8217;ll see 47+ agents and hundreds of skills appear without configuration changes.</p><p>If you&#8217;ve customized agents, they&#8217;re still in <code>.claude-mpm/agents/</code> and take precedence over repository sources. Nothing breaks.</p><p>To verify the upgrade worked:</p><pre><code><code>claude-mpm agent-source list
claude-mpm skill-source list
ls ~/.claude/agents/    # Should show 47+ agents
ls ~/.claude/skills/    # Should show dozens of skills</code></code></pre><h2>The Bigger Picture</h2><p>Infrastructure optimization matters more than code generation quality in AI-assisted development. This has been my consistent finding over months of testing various AI coding tools.</p><p>Claude MPM 5.0 reflects this philosophy. The git-first architecture doesn&#8217;t make Claude smarter&#8212;it reduces friction in maintaining and distributing the instructions that make Claude effective. The behavioral testing framework doesn&#8217;t improve Claude&#8217;s capabilities&#8212;it provides visibility into whether Claude is following instructions at all.</p><p>These are the features that determine whether a multi-agent system remains useful at scale or gradually drifts into unreliable behavior.</p><p>The repositories are public. Contributions welcome.</p><p><strong>Links:</strong></p><ul><li><p><a href="https://pypi.org/project/claude-mpm/">Claude MPM on PyPI</a></p></li><li><p><a href="https://github.com/bobmatnyc/claude-mpm-agents">Agent Repository</a></p></li><li><p><a href="https://github.com/bobmatnyc/claude-mpm-skills">Skills Repository</a></p></li><li><p><a href="https://github.com/bobmatnyc/claude-mpm/tree/main/docs">Documentation</a></p></li></ul><div><hr></div><p><em>Claude MPM is an open-source orchestration framework for Claude Code. Version 5 was released December 2025.</em></p>]]></content:encoded></item></channel></rss>