You weren't imagining things... Claude Code was dumber this month
Unintended consequences of optimization
So if you’ve been using Claude Code and noticed it felt... off... you weren’t imagining it.
Anthropic published a full breakdown yesterday, and it turns out three separate bugs compounded into what looked like one big degradation. The developer community was right to be concerned, and the evidence it collected was instrumental in getting the problems fixed.
Here’s what actually happened:
1. They silently downgraded reasoning effort (March 4)
They switched Claude Code’s default from high to medium reasoning to reduce latency. Users noticed immediately. They reverted it on April 7.
Classic “we know better than users” move that backfired. From their postmortem:
“This was the wrong tradeoff. We reverted this change on April 7 after users told us they’d prefer to default to higher intelligence and opt into lower effort for simple tasks.”
The UI appeared frozen in high reasoning mode, so they made an executive decision to sacrifice quality for speed. Developers immediately felt the difference and pushed back hard.
2. A caching bug made Claude forget its own reasoning (March 26)
This one was particularly insidious. They tried to optimize memory for idle sessions: clear old thinking after an hour of inactivity to speed up resumption. Sounds reasonable, right?
A bug caused it to wipe Claude's reasoning history on EVERY turn for the rest of the session, not just once. So Claude kept executing tasks while literally forgetting why it had made the decisions it did.
The cascading effects were brutal:
Every request became a cache miss
Usage limits drained faster than expected
Claude appeared “forgetful and repetitive”
Sessions felt like they were constantly resetting
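Anthropic hasn't published the offending code, but the failure pattern is a classic one: a one-shot cleanup gated on a flag that never gets reset. Here's a minimal sketch of that pattern; every name and the structure are my assumptions, not Anthropic's actual implementation.

```python
class Session:
    """Hypothetical sketch of the March 26 failure mode: an idle-session
    cleanup that was meant to fire once but fires on every turn."""

    IDLE_LIMIT_SECS = 3600  # clear old thinking after an hour idle

    def __init__(self):
        self.reasoning = []   # accumulated thinking blocks
        self._stale = False   # set when the session has gone idle

    def mark_idle(self, idle_secs):
        if idle_secs >= self.IDLE_LIMIT_SECS:
            self._stale = True

    def turn_buggy(self, thought):
        # BUG: _stale is never reset, so every subsequent turn wipes history
        if self._stale:
            self.reasoning.clear()
        self.reasoning.append(thought)

    def turn_fixed(self, thought):
        # FIX: clear once on resumption, then reset the flag
        if self._stale:
            self.reasoning.clear()
            self._stale = False
        self.reasoning.append(thought)


s = Session()
s.mark_idle(4000)  # resume after more than an hour idle
for thought in ["plan", "edit", "verify"]:
    s.turn_buggy(thought)
print(s.reasoning)  # only the latest thought survives: ['verify']
```

Every wiped turn also invalidates the prompt cache, which is why usage limits drained faster: each request re-sent context that should have been a cache hit.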
3. A system prompt change capped responses at 25 words between tool calls (April 16)
They added this seemingly innocent instruction: “keep text between tool calls to 25 words. Keep final responses to 100 words.”
It caused a measurable 3% drop in coding quality across both Opus 4.6 and 4.7. They caught this through ablation testing—removing the instruction and measuring the performance difference.
Reverted April 20.
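Ablation testing here just means a paired evaluation: run the same benchmark with and without the suspect instruction and compare pass rates. A toy sketch of the method, where the harness and pass rates are simulated, not Anthropic's actual eval:

```python
import random

def run_eval(system_prompt, n_tasks=2000, seed=0):
    """Toy stand-in for a coding benchmark: returns a pass rate.
    We simulate the article's finding that the brevity cap costs
    roughly 3 points; a real harness would grade model outputs."""
    rng = random.Random(seed)
    pass_prob = 0.75
    if "25 words" in system_prompt:  # the suspect instruction
        pass_prob -= 0.03
    return sum(rng.random() < pass_prob for _ in range(n_tasks)) / n_tasks

BASELINE = "You are a coding assistant."
CAPPED = BASELINE + " Keep text between tool calls to 25 words."

# Same seed on both runs so the only difference is the instruction
with_cap = run_eval(CAPPED)
without_cap = run_eval(BASELINE)
print(f"ablation delta: {(without_cap - with_cap) * 100:+.1f} points")
```

Holding the task set and seed fixed across both runs is the key design choice: any measured delta can then be attributed to the instruction rather than to run-to-run noise.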
The community evidence was damning
While Anthropic was investigating internally, the developer community was building its own case. Stella Laurenzo of AMD's AI group published the most comprehensive analysis: 6,852 Claude Code sessions and over 234,000 tool calls.
Her findings:
Median visible thinking length collapsed 73% (2,200 → 600 characters)
API calls per task spiked up to 80x from February to March
Claude was choosing “simplest fix” over correct solutions
BridgeMind’s testing showed Opus 4.6 accuracy dropping from 83.3% to 68.3%.
The data was undeniable.
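The headline percentages check out against the raw numbers quoted above:

```python
# Figures quoted in the article
median_before, median_after = 2200, 600  # visible thinking, characters
drop = 1 - median_after / median_before
print(f"median thinking length fell {drop:.0%}")  # prints "73%"

acc_before, acc_after = 83.3, 68.3  # BridgeMind's Opus 4.6 accuracy
print(f"benchmark accuracy fell {acc_before - acc_after:.1f} points")  # 15.0
```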
The perfect storm effect
Here’s what made this particularly hard to pin down: all three bugs affected different traffic slices on different schedules. The combined effect looked like random, inconsistent degradation.
Hard to reproduce internally. Hard for users to isolate the exact cause. It just felt... wrong.
Some sessions hit the reasoning downgrade. Others hit the caching bug. The unlucky ones hit multiple issues simultaneously. No wonder it seemed like Claude was having random bad days.
What this reveals about AI product development
This postmortem is actually refreshing in its transparency. Most AI companies would have quietly fixed the issues and moved on. Anthropic owned the mistakes publicly.
But it also highlights a fundamental tension in AI product development: users often prefer maximum capability over convenience optimizations. The reasoning-effort downgrade was made in the name of user experience (reducing perceived latency), but developers would rather wait for better output.
The lesson: don’t optimize away what users value most without asking them first.
All fixed now (v2.1.116)
As of April 20, all three issues are resolved:
Default reasoning is now “xhigh” for Opus 4.7, “high” for others
Caching bug squashed
Verbosity limits removed
Usage limits reset for all subscribers
Anthropic is also committing to more transparency going forward with a dedicated @ClaudeDevs account for deeper technical communication with developers.
The community was right to raise hell about this. And Anthropic’s response—full transparency with concrete fixes—sets a good precedent for how AI companies should handle quality regressions.
Your coding assistant is back to full strength.
Independent Validation
The technical analysis backing this story comes from multiple independent sources. Stella Laurenzo’s comprehensive audit of 6,852 sessions provided the quantitative foundation. BridgeMind’s testing offered controlled benchmark data. These weren’t isolated complaints—they were systematic investigations with reproducible findings.
When a company publishes a detailed postmortem acknowledging specific engineering decisions that degraded their product, and that postmortem aligns with community-gathered evidence, we’re seeing transparency in action. The developer community did the work to document the problems. Anthropic owned the solutions.
Bob Matsuoka is CTO of Duetto and writes about AI-powered engineering at HyperDev.
Related reading:
AI Power Ranking — Tool comparisons and benchmarks for AI practitioners
LinkedIn Newsletter — Strategic AI insights for CTOs and engineering leaders