The Acceptable Writing Machine: Building a System That Passes the Sniff Test
But Still Needs Me to Make It Interesting
TL;DR
I built a multi-layer system for AI-assisted writing: death list vocabulary, pattern elimination, chaos injection, and voice foundation
The system produces competent, natural-sounding content, but competent isn’t interesting
Matt Rosenberg’s independent experiment confirms it: the human author still provides the conceptual leaps
The style guide and pattern elimination work. The spark still requires a writer’s brain.
I’ve spent a good chunk of 2025 building a writing tool/process I’m proud of.
It’s a system for AI-assisted writing. Four years of my own work fed to Claude, anti-AI detection research baked in, style guides running to 3,000+ words. The system solves what I call the competence problem: raw AI output that reads like a corporate press release gets filtered into something that sounds like me. A draft that opens with “In today’s rapidly evolving landscape of AI development...” becomes “I spent three hours last week chasing a bug that shouldn’t exist.” Same information, different voice. Content comes out clean. Reads like something I’d write at 2 AM after too much coffee. Professional quality without the obvious tells.
Matt Rosenberg just ran an experiment that confirmed something I’d been circling around.
Matt (a writer with 25+ years of experience, published pieces, screenplays, keynote presentations) decided to test whether Claude could learn to write like him. He fed it samples of his work, built his own prompts, and asked it to write a piece about the experience of Claude trying to write like Matt. Meta as hell. The format alone was interesting: Claude’s drafts with Matt’s bold-text edits visible, showing the collaboration in real time. You can read it in the companion piece, “Remembrance of Prompts Past.”
What Matt uncovered: the system produces acceptable writing, not great writing.
The conceptual leap (writing as Claude about Claude’s limitations, making the edits visible, the fossil record metaphor that actually landed) came from Matt. Claude handled sentence variety and avoided the obvious AI tells. But the idea that made the piece worth reading? The human brain, every time.
Why AI Writing Gets Caught
The detection problem isn’t mysterious. I dug through GPTZero’s methodology, academic papers on statistical text analysis, and Reddit threads where people obsess over catching AI output. The fingerprints are specific and measurable.
Perplexity. How predictable each word choice is. LLMs optimize for the “right” word, which means they consistently pick expected options. Human writing is messier. We reach for the wrong word sometimes. That unpredictability shows up in the math. GPTZero’s analysis of 3.3 million texts identified this as the primary signal: AI text clusters around low perplexity scores. Too predictable, too “correct.”
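If you want to see the signal yourself, it’s easy to approximate. Here’s a minimal sketch, assuming Hugging Face’s transformers library with GPT-2 as the scoring model; it illustrates the idea, not GPTZero’s actual pipeline or thresholds.

```python
# Rough perplexity score for a draft, using GPT-2 as the scoring model.
# Lower scores mean more predictable word choices.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    # Ask the model to predict each token from its left context.
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # Passing labels returns the average cross-entropy loss per token.
        loss = model(ids, labels=ids).loss
    # Perplexity is exp(average negative log-likelihood).
    return torch.exp(loss).item()

print(perplexity("In today's rapidly evolving landscape of AI development..."))
print(perplexity("I spent three hours last week chasing a bug that shouldn't exist."))
```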
Burstiness. Sentence length variation. Humans write in bursts. Three words, then forty, then twelve, then a fragment. AI maintains steady rhythms. Twenty-word sentence. Twenty-two words. Nineteen. The consistency is subtle but detection tools pick up on the pattern.
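Burstiness is even easier to measure. A stdlib-only sketch; the metric here (standard deviation of sentence length over the mean) is my own rough proxy, not any detector’s published formula.

```python
# Crude burstiness check: how much do sentence lengths vary across a draft?
import re
import statistics

def sentence_lengths(text: str) -> list[int]:
    # Split on sentence-ending punctuation; crude, but fine for a sketch.
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]

def burstiness(text: str) -> float:
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0
    # Steady AI rhythm scores low; three-words-then-forty human writing scores high.
    return statistics.stdev(lengths) / statistics.mean(lengths)

print(burstiness("Start short. Then a much longer sentence that wanders before making its point. Then BAM."))
```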
Vocabulary clustering. LLMs trained on similar data reach for similar words with statistically abnormal frequency. “Delve” became the canonical example (so overused in AI output that it’s practically a detection trigger on its own).
The Death List
I maintain a blacklist. Not suggestions. Hard bans.
Examples of forbidden words: delve, leverage, utilize, harness, tapestry, robust, seamless, pivotal, foster, nuanced. Forbidden transitions: Moreover, Furthermore, Additionally, In conclusion. Banned openings: In today’s digital world..., In the ever-evolving landscape of..., At its core...
The full list runs to thirty-plus items. Ctrl+F for each one before anything ships.
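That Ctrl+F pass is easy to script if you’d rather not do it by hand. A sketch with a handful of entries (the full list lives in my style guide; the sample text is staged to trip it):

```python
# Automated version of the pre-ship Ctrl+F pass over the death list.
import re

DEATH_LIST = [
    "delve", "leverage", "utilize", "harness", "tapestry", "robust",
    "seamless", "pivotal", "foster", "nuanced",
    "moreover", "furthermore", "additionally", "in conclusion",
    "in today's digital world", "in the ever-evolving landscape of", "at its core",
]

def death_list_hits(draft: str) -> list[tuple[str, int]]:
    # Normalize curly apostrophes so "today's" matches either way.
    draft = draft.replace("\u2019", "'")
    hits = []
    for phrase in DEATH_LIST:
        # Word-boundary match, case-insensitive, so "Delve" counts too.
        count = len(re.findall(r"\b" + re.escape(phrase) + r"\b", draft, re.IGNORECASE))
        if count:
            hits.append((phrase, count))
    return hits

sample = "We delve into a robust, seamless tapestry of solutions. Moreover, we leverage synergy."
for phrase, count in death_list_hits(sample):
    print(f"{phrase}: {count}")
```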
Why these specific words? GPTZero’s analysis found they appear with statistically abnormal frequency in AI-generated content. Not because they’re bad words, but because LLMs reach for them reflexively, creating fingerprints that detection tools flag. And trained readers notice too. Once you’ve seen these patterns, they jump out at you anytime you read anything written with tools that don’t filter for them. It’s like hearing a wrong note in a song.
Pattern Elimination
Vocabulary isn’t enough. Certain sentence structures scream AI regardless of word choice.
The worst offender: “It’s not X, it’s Y.”
This construction appears constantly in LLM output:
❌ “It’s not just about speed, it’s about efficiency”
❌ “That gap isn’t optimism—it’s pricing in future value”
❌ “This wasn’t luck; it was preparation”
The em-dash version, the semicolon version, the simple comma version. All of them. Kill on sight. (Some writers, like Matt, use em dashes naturally and well. The problem isn’t the punctuation. It’s the “not X, it’s Y” structure that AI produces reflexively.)
The fix: break into two sentences. State what’s happening directly. Let the reader connect the dots.
✅ “Speed matters, but efficiency drives real value.”
✅ “Investors aren’t paying 167x for chatbot subscriptions. They’re pricing in future value.”
Another pattern: “X is clear.” Passive clarity claims tell instead of showing. If something is actually clear, present it. Readers will see it.
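The structural patterns can be flagged mechanically too. These regexes are my own rough approximations and will miss plenty of variants; they flag candidates, and a human decides what to rewrite.

```python
# Flag the "not X, it's Y" construction and passive clarity claims for review.
import re

PATTERNS = {
    "not X, it's Y": re.compile(
        r"\b(?:isn't|is not|wasn't|was not|it's not|this isn't)\b[^.!?]{0,60}"
        r"(?:—|;|,)\s*(?:it's|it is|it was|they're)\b",
        re.IGNORECASE,
    ),
    "passive clarity claim": re.compile(r"\b\w+ is clear\b", re.IGNORECASE),
}

def flag_patterns(draft: str) -> list[tuple[str, str]]:
    # Normalize curly apostrophes so "isn't" matches either way.
    draft = draft.replace("\u2019", "'")
    flags = []
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(draft):
            flags.append((name, match.group(0)))
    return flags

for name, snippet in flag_patterns("That gap isn't optimism—it's pricing in future value."):
    print(f"[{name}] {snippet}")
```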
Chaos Injection
Human writing has wild variance that AI struggles to replicate. The system deliberately breaks rhythmic consistency:
Start short.
Then suddenly you’re in a much longer sentence that meanders through multiple ideas before eventually (maybe) getting to the point you were trying to make in the first place.
Back to short.
Then BAM. Three words.
Never more than two similar-length sentences in a row. Contractions everywhere. Casual language mixed with technical precision (“the API barfed” rather than “the API returned an error”). Self-corrections. Incomplete thoughts that trail off...
The obvious goal: increase perplexity and burstiness beyond the statistical norms of AI output. The less obvious goal: give me word clay that takes less time to shape into something that says what I mean, the way I’d normally say it.
I deliberately include self-corrections (“Actually, scratch that—I’m wrong about...”), uncertainty when genuine, emotional reactions, time references (“Last Tuesday,” “around 3 AM”), physical sensation (“My eyes hurt from staring at logs”). Natural inconsistencies help too: shifting between formal and casual within the same piece.
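One of those rules is mechanical enough to check automatically: never more than two similar-length sentences in a row. A sketch, with “similar” defined by a tolerance I picked arbitrarily:

```python
# Warn when three consecutive sentences land within a few words of each other.
import re

def rhythm_warnings(text: str, tolerance: int = 3) -> list[str]:
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    warnings = []
    for i in range(2, len(lengths)):
        a, b, c = lengths[i - 2], lengths[i - 1], lengths[i]
        # Three sentences in a row inside the tolerance band reads as AI rhythm.
        if max(a, b, c) - min(a, b, c) <= tolerance:
            warnings.append(f"Flat rhythm around sentence {i + 1}: lengths {a}, {b}, {c}")
    return warnings

for warning in rhythm_warnings("Twenty words would go here. Another sentence about that long. And one more like it."):
    print(warning)
```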
The Voice Foundation
Those techniques are safeguards. They keep the output from getting flagged. But the real work is voice.
I fed Claude four years of my own writing. Not the volume Matt has (he’s more prolific and, honestly, better) but enough to establish baseline patterns. Word preferences. Sentence rhythms. The specific way I structure technical explanations. The style guide documents these patterns.
The real insight from Matt’s experiment: the guide documents voice. It doesn’t create voice.
Matt has decades of writing behind him. When he built his own prompts and fed Claude his work, he was documenting patterns from that history. The guide captures a voice that already exists.
For someone without that history (someone who hasn’t struggled through sentence-level work, hasn’t discovered their own rhythms) the guide wouldn’t work. There’s nothing to document.
“Write like me” requires a “me” to write like.
What the System Cannot Do
Matt’s experiment made this clear: Claude followed every rule it was given. Sentence variety. Forbidden words eliminated. Voice patterns matched.
But Claude missed the interesting idea.
The first draft was conventional. Claude wrote as if Matt were the author. Safe. Pattern-following. Matt had to read it and say: “Wait, shouldn’t you acknowledge that you’re Claude writing about whether Claude can write like me? Isn’t that the more interesting piece?”
Yes. Obviously. And Claude missed it.
The system covers the mechanical work: vocabulary control, sentence variety, detection avoidance, voice consistency.
What it misses: surprising framing decisions, conceptual leaps, knowing which rules to break, finding the angle that transforms competent into interesting.
The fossil metaphor emerged from the collaboration and actually worked. “Matt’s style guide is a fossil record. I’m an AI trained on fossils, trying to make the dinosaur walk.” Claude generated that phrasing in response to Matt’s prompting. Good line. But Claude couldn't recognize that the format itself was the story. Couldn't spot the pacing problems that made the middle drag. Couldn't suggest that showing the revision history would make the piece more honest.
The system produces acceptable. Matt made it interesting. I try to do the same.
The Practical Workflow
What’s this actually useful for? Not replacing me. Matt’s experiment killed that fantasy (not that replacement was ever the goal). But for expediting my writing (especially when I have clear conceptual direction) the system works.
I provide direction. Not just topic. The angle. The frame. What makes this piece worth reading.
System generates draft. Following voice patterns, avoiding detection triggers, maintaining chaos.
I provide conceptual intervention. The “wait, shouldn’t you...” moments. The decisions the system can’t make.
System revises. Implementing my direction while maintaining voice consistency.
Second LLM proofreads. A GPT instance trained on my style handles fact-checking, catches errors and inconsistencies, and makes sure I’m linking to the right article.
I finalize. Adding the touches no amount of prompt engineering produces.
Maybe 30% human intervention for competent output to become interesting output. Sometimes more. The system accelerates the mechanical work. But the interesting parts (the parts that make someone want to read past the first paragraph) still require my brain.
The Defense of AI-Assisted Writing
I’m not claiming my writing is great. I enjoy doing it, I enjoy what I produce, and sometimes I create pieces I’m very proud of, like my article comparing software engineers to the Canuts of the Jacquard loom era. Sometimes I rush a bit, if I’m being honest.
But I believe the content is interesting and valuable for my audience. And I would never have the time or energy to produce it without these tools. Given how much has changed in AI development this year (the tool releases, the market shifts, the methodology evolution) I feel like I’m providing a valuable service by documenting it in real time. That documentation wouldn’t exist if I had to write every sentence by hand.
When someone dismisses work as “written with AI,” they’re missing the point. The question isn’t whether AI touched the text. The question is: whose ideas are these?
My articles contain my research into tools I actually tested, my analysis of market dynamics I’ve observed, my experiences from decades of building software, my opinions formed through real-world implementation, my frameworks developed over years of practice.
The system helps me construct sentences faster. It handles the mechanical parts (word choice, rhythm, structure) while I focus on what actually matters: the ideas, the angles, the insights that make something worth reading.
A carpenter doesn’t get less credit for using a power saw instead of a hand saw (well, actually they do, but each gets credit for the work they do). The craft is in knowing what to build and how to build it well. My tools just make the cutting faster.
I have more ideas than hours. This system lets me ship more of them. The alternative isn’t “Bob writes everything by hand.” The alternative is “most of Bob’s ideas never get written at all.”
What I Actually Built
I built a machine that produces acceptable writing. Clean sentences. Natural rhythm. Passes the sniff test.
And I’m proud of that.
But acceptable isn’t the goal. Acceptable is table stakes.
The system is a power tool. Fast. Increases output. Reduces mechanical friction. But a power tool doesn’t make you a craftsman. It just makes the work go faster.
Matt Rosenberg demonstrated this by making something interesting from Claude’s output. The format choice. The visible collaboration. The meta-commentary that gives the piece its hook. Those weren’t in any style guide. They couldn’t be. They came from decades of writing practice and the specific creative decision to frame the experiment as something worth reading.
My system works. I use it weekly.
But I’ve never believed it solves the interesting problem.
Acceptable writing is the baseline. I provide everything else.
The companion piece, “Remembrance of Prompts Past” by Matt Rosenberg, demonstrates these limitations in action, with Claude’s drafts and Matt’s visible edits showing exactly how human intervention transforms acceptable output into something worth reading.
I’m Bob Matsuoka, writing about agentic coding and AI-powered development at HyperDev. For more on AI-assisted workflows, see my analysis of multi-agent orchestration systems or my deep dive into what’s actually in my toolkit.