I Built a Coding Tool. Then I Used It to Onboard as CTO
How To Avoid Playing The “New Guy” Card
I built claude-mpm to orchestrate multiple Claude Code agents for complex software development. Parallel execution, specialized agent roles, persistent memory across sessions—the whole multi-agent playbook.
Then I got hired as CTO of a 150-person R&D organization. And the first thing I did wasn’t to write code. It was to point my coding orchestration framework at a completely different problem: understanding the organization I was about to lead.
The Traditional Onboarding Grind
Here’s what CTO onboarding usually looks like:
Months 1-2: You do a “listening tour.” Fifty to a hundred one-on-ones with engineers, managers, product leads. Read whatever documentation exists (outdated). Attend sprint reviews and architecture meetings. Try to figure out who does what.
Months 3-4: Patterns start emerging. You notice recurring complaints in conversations. You start to see disconnects between stated priorities and actual work. You form hypotheses.
Months 5-6: You finally synthesize enough to make preliminary recommendations. Often conservative ones, because you still don’t have complete context. You’re still discovering things you didn’t know you didn’t know.
After six months, a typical new CTO has good qualitative intuition but limited quantitative backing. Many insights remain anecdotal. Critical data—exact staffing costs, work distribution, technical debt concentration—might still be missing.
I had two weeks before my start date. And I had a tool designed for parallel analysis of complex technical systems.
The Experiment
The organization had granted me early access to their systems: GitHub, JIRA, Slack, Confluence, budget spreadsheets. Standard pre-start due diligence stuff.
I started with Claude.AI and Cowork—Anthropic’s desktop tools for non-developers. Asked questions, got answers, generated some useful analysis documents. It worked. I could pull insights from individual data sources, get summaries, ask follow-up questions.
But the volume wasn’t there. I’d get a handful of documents, maybe a dozen useful artifacts across a few days of work. Decent for casual exploration. Not enough to actually understand a 150-person organization.
Then I pointed claude-mpm at the same data sources.
Part of what made this possible: I’d recently added MCP Google Workspace integration to the framework. That meant agents could pull directly from Drive, Sheets, Docs—wherever the organization kept its data. Budget spreadsheets, org charts, planning documents, historical analyses. All queryable through the same orchestration layer that handles code repositories.
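Setting the MCP plumbing aside, the underlying data pull is unglamorous. Here’s a minimal sketch of reading a budget sheet with Google’s Python client; the credentials file, spreadsheet ID, and range are placeholders rather than anything from the real engagement:

```python
# Hypothetical example of pulling a budget spreadsheet via Google's Sheets API.
# The service-account file, spreadsheet ID, and range below are placeholders.
from google.oauth2.service_account import Credentials
from googleapiclient.discovery import build

creds = Credentials.from_service_account_file(
    "service-account.json",
    scopes=["https://www.googleapis.com/auth/spreadsheets.readonly"],
)
sheets = build("sheets", "v4", credentials=creds)

result = sheets.spreadsheets().values().get(
    spreadsheetId="BUDGET_SPREADSHEET_ID",   # placeholder
    range="Budget!A1:F500",                  # placeholder tab and range
).execute()

for row in result.get("values", []):
    print(row)   # each row is a list of cell values
```

Once an agent can make that call, whether directly or through an MCP server, budget rows become just another context source alongside repositories and tickets.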
The difference wasn’t 2x or 5x. It was closer to 10x. Maybe more. The CLI-based orchestration could spawn specialized agents, run them in parallel, maintain persistent memory across sessions, and churn through analysis while I was doing other things. What took hours of back-and-forth in a chat interface happened in minutes with coordinated agents.
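I won’t reproduce claude-mpm’s internals here, but the fan-out pattern at the core of it is simple enough to sketch. In the snippet below, run_agent() is a hypothetical stand-in for whatever invokes a single Claude Code agent:

```python
# Minimal sketch of the fan-out pattern: specialized prompts dispatched in
# parallel, results collected for a later synthesis pass. Not claude-mpm
# internals; run_agent() is a placeholder for the real agent invocation.
import asyncio

AGENT_TASKS = {
    "repo-analyst":   "Classify the last 12 months of commits in each repository by work type.",
    "people-analyst": "Map contributors to teams and flag single-owner components.",
    "budget-analyst": "Compare budgeted headcount per initiative to observed work allocation.",
}

async def run_agent(role: str, prompt: str) -> str:
    """Placeholder for invoking one specialized agent (hypothetical)."""
    await asyncio.sleep(0)  # the real version would call the agent runtime here
    return f"[{role}] analysis of: {prompt}"

async def orchestrate() -> list[str]:
    # Spawn all agents concurrently; each works from its own context.
    jobs = [run_agent(role, prompt) for role, prompt in AGENT_TASKS.items()]
    return await asyncio.gather(*jobs)

if __name__ == "__main__":
    for report in asyncio.run(orchestrate()):
        print(report)
```

Persistent memory and the synthesis pass sit on top of this; the point is that the three analyses run concurrently instead of one chat turn at a time.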
The collaboration model looked like this:
HUMAN ROLE → AI ROLE
Ask strategic questions → Parse 27K commits from 141 repos
Provide business context → Classify work by type
Validate findings → Identify anomalies
Make decisions → Generate analysis artifacts
Every analysis started with a human question. “Who are the critical contributors?” “What is the team actually working on versus what’s budgeted?” “Where is technical debt concentrated?” AI didn’t guess what I needed—it answered what I asked.
Then I’d look at the answers and ask follow-up questions. Iterate. Triangulate across data sources. Challenge assumptions with data.
What One Week of Human-AI Collaboration Produced
The actual research took about a week. I invested maybe 5 hours of hands-on time. The AI agents collectively ran for the equivalent of 500+ hours of analysis.
Five hours. Let that sink in.
The output:
267 markdown documents spanning people, processes, code, and strategy. Not summaries—deep analyses with specific evidence.
A 15MB database containing 27,343 classified commits from 141 repositories. Every commit tagged by work type: feature development, bug fixes, maintenance, refactoring.
A 620KB knowledge database integrating budget data, org charts, and work attribution. Who’s working on what? How does that compare to budget allocation?
An interactive web platform for exploring the data. “Show me everyone who touched the authentication system in the last year.” “Which repositories haven’t had commits in six months?”
Discoveries that would have taken months to surface organically. The kind of things that don’t come up in one-on-ones—gaps between budgeted priorities and actual work allocation, security hygiene issues nobody had noticed, concentration risk in the codebase. None of these came from AI having opinions. They came from AI parsing data at scale while I asked increasingly pointed questions.
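For a sense of what “queryable” means here, this is roughly the shape such a commit database could take, with one of the questions above expressed against it. The schema and names are illustrative, not claude-mpm’s actual output format:

```python
# Illustrative schema for a classified-commit database, plus the
# "which repositories haven't had commits in six months?" question as a query.
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE commits (
    sha        TEXT PRIMARY KEY,
    repo       TEXT NOT NULL,
    author     TEXT NOT NULL,   -- resolved identity, not a raw username
    committed  TEXT NOT NULL,   -- ISO-8601 date or timestamp
    work_type  TEXT NOT NULL    -- feature | bugfix | maintenance | refactor
);
""")

# One stale repo and one active repo, purely for demonstration.
conn.executemany(
    "INSERT INTO commits VALUES (?, ?, ?, ?, ?)",
    [
        ("a1", "legacy-billing", "alice", (date.today() - timedelta(days=400)).isoformat(), "maintenance"),
        ("b2", "checkout-api",   "bob",   (date.today() - timedelta(days=3)).isoformat(),   "feature"),
    ],
)

stale = conn.execute("""
    SELECT repo, MAX(committed) AS last_commit
    FROM commits
    GROUP BY repo
    HAVING MAX(committed) < date('now', '-6 months')
    ORDER BY last_commit
""").fetchall()

for repo, last_commit in stale:
    print(f"{repo}: last commit {last_commit}")   # -> legacy-billing
```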
The Insight: Orchestration Isn’t About Code
Here’s what surprised me: claude-mpm worked better for organizational analysis than I expected. Not because it’s designed for that use case—it isn’t. Because the underlying pattern is the same.
Multi-agent orchestration solves a specific problem: complex work requiring synthesis across multiple information sources, where no single context window can hold everything, and where specialized approaches to different sub-problems produce better results than one generalist.
That describes software development. It also describes:
Due diligence on acquisitions
Market research synthesis
Competitive intelligence
Academic literature reviews
Legal discovery
Any knowledge work requiring multi-source analysis at scale
The agents don’t care what they’re analyzing. Code, commit histories, JIRA tickets, budget spreadsheets, Slack conversations—it’s all context to be parsed, patterns to be identified, insights to be surfaced.
What AI Actually Did (And Didn’t Do)
Let me be precise about the division of labor.
AI handled:
Parsing tens of thousands of commits and classifying them by work type (85-90% accuracy)
Correlating GitHub usernames to JIRA accounts to Slack handles to budget line items
Identifying statistical anomalies (bus factor calculations, spend variances, contribution patterns; the bus-factor check is sketched just after this list)
Generating first drafts of analysis documents
Building queryable databases from unstructured data
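To make the anomaly work concrete: the bus-factor check is trivial once commits are attributed to resolved identities. A minimal sketch, defining bus factor as the smallest number of people who account for half of a repository’s commits:

```python
# Minimal bus-factor sketch, assuming commits are already attributed to
# resolved identities. Bus factor here = the smallest number of people who
# together account for `threshold` of a repository's commits.
from collections import Counter

def bus_factor(commit_authors: list[str], threshold: float = 0.5) -> int:
    """Smallest set of authors covering `threshold` of all commits."""
    counts = Counter(commit_authors)
    total = sum(counts.values())
    covered, people = 0, 0
    for _, n in counts.most_common():
        covered += n
        people += 1
        if covered / total >= threshold:
            break
    return people

# Example: one repo where a single contributor dominates.
authors = ["alice"] * 80 + ["bob"] * 15 + ["carol"] * 5
print(bus_factor(authors))   # 1 -> concentration risk worth flagging
```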
I handled:
Asking the right questions (this was most of my 5 hours)
Providing business context AI couldn’t infer (“this team was recently reorganized”)
Validating findings (“that anomaly is because of the acquisition, not a problem”)
Making judgment calls (“that maintenance percentage is a crisis, not just a number”)
Deciding what to do about it
AI didn’t replace my judgment. It gave me things to have judgment about. Faster, with better evidence, across more data than I could have processed alone.
The Economics
Traditional estimate for CTO onboarding: 3-6 months to reach 60% organizational understanding. Mostly qualitative.
AI-augmented approach: 1 week to reach maybe 90% understanding. Quantitatively backed.
Here’s a concrete comparison: The CTO who replaced me at my previous company was six months into the role when I last saw her in action. She was still asking basic questions about how teams were organized, who owned which systems, where the budget was actually going. Six months. I walked into this new company on day zero with answers to questions she hadn’t thought to ask yet.

That’s not a knock on her abilities. She was doing it the traditional way, which is the only way most people know: meetings, osmosis, gradual pattern recognition.
Every new executive knows the “new guy card”—that implicit grace period where you get to say “I’m still ramping up” and nobody expects you to have answers. It’s comfortable. It buys you six months of not being accountable for what you don’t know yet.
I didn’t want to play that card. I wanted to walk in like a veteran.
Here’s the honest limitation: I absolutely cannot do that with the people. Relationships take time. Trust gets built through interactions. No amount of AI research tells you who’s politically dangerous, who’s quietly brilliant, who needs to vent before they can hear feedback. That’s human intelligence that only accumulates through presence.
But the code? The stack? The projects? The products? The budget allocation patterns? The organizational structure? The technical debt? I can know all of that before my first all-hands. And that means when I’m in meetings, I can focus on reading the room instead of scrambling to understand what people are talking about.
I had 267 documents, a queryable database of every commit, and quantified answers already waiting for me.
Cost comparison:
AI API costs: a few thousand dollars
Lost productivity from 6-month ramp-up: $150-200K (conservatively)
The ROI math isn’t subtle.
But the bigger value wasn’t time or money. It was discovering things I wouldn’t have found through meetings alone. Patterns buried in data, invisible until someone asks the right question and has the tools to answer it.
Practical Implications
For executives considering AI-augmented onboarding:
Start if you have data access (APIs to GitHub, JIRA, whatever your org uses), you’re comfortable with iteration (the first analysis won’t be perfect), and you can invest a few thousand dollars in tooling plus 5-10 hours of directed time.
Don’t start if data is locked in silos with no export path, you expect AI to “do it all” without significant human direction, or you’re not technical enough to validate the output (or can’t partner with someone who is).
For organizations preparing for AI-augmented leaders:
Your data needs to be accessible. APIs, exports, documentation. If a new executive can’t query your GitHub or JIRA, they can’t run this playbook.
Your data quality matters more than you think. JIRA hygiene issues, inconsistent employee naming across systems, incomplete commit messages—all of these degrade AI analysis quality.
Budget for it. A few thousand dollars for an onboarding sprint is trivial compared to what you’re paying that executive.
For people building AI tools:
The use cases are broader than you think. I built claude-mpm for coding. It turned out to be a general-purpose knowledge work accelerator.
Identity resolution across systems is a major gap. Normalizing usernames across GitHub, JIRA, Slack, and email into a single identity is still harder than it should be (a naive baseline is sketched after this list).
Confidence scoring would help. When AI isn’t sure, it should say so explicitly. Humans can validate uncertain findings; they can’t validate confident-sounding hallucinations.
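To make that gap concrete, the naive baseline looks like this: join on normalized email, fall back to display name. It handles the easy cases and quietly fails on nicknames, personal emails, and contractors, which is exactly why it’s still a gap:

```python
# Naive identity resolution across systems: join on normalized email, then
# fall back to normalized display name. Real data needs fuzzy matching,
# nickname tables, and manual overrides; this baseline shows where it breaks.
def norm_email(e: str | None) -> str | None:
    return e.strip().lower() if e else None

def norm_name(n: str | None) -> str | None:
    return " ".join(n.lower().split()) if n else None

def resolve(github: list[dict], jira: list[dict]) -> dict[str, dict]:
    """Map GitHub logins to JIRA accounts by email, then by display name."""
    by_email = {norm_email(u.get("email")): u for u in jira if u.get("email")}
    by_name = {norm_name(u.get("displayName")): u for u in jira if u.get("displayName")}
    matches = {}
    for u in github:
        hit = by_email.get(norm_email(u.get("email"))) or by_name.get(norm_name(u.get("name")))
        if hit:
            matches[u["login"]] = hit
    return matches

github_users = [{"login": "asmith", "name": "Alice Smith", "email": "alice@corp.com"}]
jira_users = [{"displayName": "Alice Smith", "email": "Alice@Corp.com"}]
print(resolve(github_users, jira_users))   # matched on email despite the case difference
```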
What This Means for Knowledge Work
The traditional playbook for understanding complex organizations—meetings, gradual osmosis, pattern recognition over months—isn’t wrong. It’s just slow.
AI doesn’t replace the human judgment at the core of that process. It accelerates the data gathering that feeds judgment. Ask better questions faster. Validate assumptions with evidence. Discover patterns that would take months to surface organically.
I spent a week asking questions about an organization I was about to lead. AI spent 500+ hours finding answers. The combination produced understanding I couldn’t have reached alone, in a timeframe that would have been impossible.
The coding tool I built turned out to be an organizational intelligence tool in disguise. That’s not an accident. It’s what happens when you build systems for parallel analysis of complex information.
The era of “gut feel” leadership isn’t ending. But the bar for what counts as informed intuition just got a lot higher.
I’m Bob Matsuoka, writing about agentic coding and AI-powered development at HyperDev. For more on multi-agent orchestration, read my analysis on the era of the CLI or my deep dive into why I hope never to use Claude Code directly again.