Claude 4 Is The Real Deal

A New Era of AI Programming?

May 23, 2025

Earlier this week, I wrote about Anthropic's latest model drops—Claude Opus 4 and Claude Sonnet 4. What I didn't dive into were the practical improvements you actually feel when building something complex. After two intensive days working with Claude Code on a passion project, I'm convinced we've crossed a meaningful threshold.

Building Something Real

I'm deep into what started as a "toy" project but has evolved into something more substantial: the Atlas Message Agent. It's a semantic intelligence layer that sits between users and their communication platforms—ingesting messages from any source, analyzing them for importance, extracting entities and relationships, then exposing a unified API for AI agents to interact with messages.

The technical stack reflects real complexity: TypeScript and Next.js 15 for the frontend, MongoDB for document storage, OpenAI APIs for semantic processing, and Zustand for client-side state management. The architecture handles OAuth flows, background processing pipelines, vector search, and entity extraction—all the messy details that separate proof-of-concepts from production systems.

This isn't a weekend hack. It's the kind of project where architectural decisions compound, where state management becomes critical, and where you discover edge cases that force fundamental rethinks.

The Numbers Tell a Story

Here's what happened over the past 48 hours working with Claude Code:

Overall Development Velocity:

Total lines added: 24,730
Net code increase: 12,153 lines of TypeScript/JavaScript
New files created: 76
Major commits: 28 (workflow 100% handled by Claude Code)

Features Shipped:

Entity Curation System with GPT-4 powered disambiguation (~2,000 lines)
MongoDB Job Queue System replacing in-memory state management (~1,500 lines)
Comprehensive Entity Browsing System with export functionality (~2,500 lines)
Multi-Account Infrastructure with data isolation (~1,000 lines)
Ontology Discovery Engine with vector-based semantic analysis (~3,000 lines)
UI Enhancements with Shadcn integration (~2,000 lines)

Those aren't trivial features. The Entity Curation System alone involved complex prompt engineering for GPT-4, vector store integration, and relationship extraction. The MongoDB Job Queue required rethinking the entire processing architecture. The Ontology Discovery Engine implements domain detection and classification with custom ontology support.
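To make the job queue idea concrete, here is a minimal sketch of how a worker might claim a job atomically from a MongoDB collection instead of tracking work in memory. The post does not show the real implementation, so the field names (`status`, `lockedBy`, `lockedUntil`, `attempts`) and the helper `buildClaim` are hypothetical. The function only builds the filter and update documents; with the MongoDB Node.js driver they would be passed to `collection.findOneAndUpdate(filter, update)`.

```typescript
// Hypothetical sketch: field names and the lease approach are assumptions,
// not the Atlas Message Agent's actual schema.
function buildClaim(workerId: string, now: Date, leaseMs: number) {
  // Match a job that is waiting, or one whose lease has expired
  // (for example because the worker that held it crashed).
  const filter = {
    $or: [
      { status: "pending" },
      { status: "processing", lockedUntil: { $lt: now } },
    ],
  };

  // Mark the job as taken by this worker and record when the lease ends.
  const update = {
    $set: {
      status: "processing",
      lockedBy: workerId,
      lockedUntil: new Date(now.getTime() + leaseMs),
    },
    $inc: { attempts: 1 },
  };

  return { filter, update };
}

// Example: worker "worker-1" asks for a 30-second lease.
const claim = buildClaim("worker-1", new Date(), 30_000);
console.log(JSON.stringify(claim.update.$set.lockedBy));
```

Because the match and the update happen in one database operation, two workers cannot claim the same job, and the queue survives a process restart, which in-memory state cannot.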

What's Actually Different

The headline improvements aren't about raw speed—the focus is clearly on capability depth rather than latency. What matters is the qualitative shift in how it handles complex, multi-step engineering tasks.

Contextual Persistence: Claude Code now maintains context across long development sessions in ways that feel genuinely useful. When I'm debugging a MongoDB aggregation pipeline at 2 AM (yes, I really did this), it remembers the entity relationship model we designed hours earlier and suggests fixes that account for the broader system architecture.

Error Recovery: The difference in error handling is striking. Instead of generic suggestions, Claude 4 identifies specific TypeScript type mismatches, suggests concrete fixes, and often catches edge cases I missed. When the vector search integration failed, it didn't just debug the immediate error—it identified that my embedding dimension assumptions were wrong and suggested the architectural changes needed to fix it properly.
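An embedding dimension mismatch is the kind of assumption that is cheap to check explicitly. The sketch below is illustrative only and is not taken from the project: `assertEmbeddingDimension` and `cosineSimilarity` are hypothetical helper names, and the expected dimension would come from whichever embedding model and index are actually in use.

```typescript
// Hypothetical guard: fail fast when a vector's length does not match
// what the vector index (or the other vector) expects.
function assertEmbeddingDimension(embedding: number[], expected: number): void {
  if (embedding.length !== expected) {
    throw new Error(
      `Embedding has ${embedding.length} dimensions, expected ${expected}`
    );
  }
}

// Cosine similarity between two embeddings of the same dimension.
function cosineSimilarity(a: number[], b: number[]): number {
  assertEmbeddingDimension(b, a.length);

  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }

  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  // A zero vector has no direction; treat its similarity as 0.
  return denominator === 0 ? 0 : dot / denominator;
}
```

Running the check at the boundary where embeddings enter the system turns a confusing downstream search failure into an immediate, descriptive error.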

Architectural Thinking: This is the big one. Claude 4 doesn't just implement features—it thinks about how they fit together. When I asked for multi-account support, it immediately flagged that the existing MongoDB queries would need account filtering, that the entity relationships would require scoping, and that the vector search indices would need restructuring. That's systems thinking, not just code generation.
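The account filtering point can be sketched as a small helper that every query passes through. This is an assumption-laden illustration, not the project's code: it assumes each document carries an `accountId` field, and `scopeToAccount` is a hypothetical name.

```typescript
// A loose stand-in for a MongoDB query filter document.
type Filter = Record<string, unknown>;

// Hypothetical helper: returns a copy of the filter restricted to one account.
function scopeToAccount(accountId: string, filter: Filter = {}): Filter {
  if (accountId.length === 0) {
    throw new Error("accountId is required");
  }
  // accountId is written last so a caller-supplied filter cannot override it.
  return { ...filter, accountId };
}

// Example: an entity query limited to a single account.
const scoped = scopeToAccount("acct-123", { type: "person" });
console.log(scoped);
```

Routing all reads through one function like this makes data isolation a property of the data access layer instead of something each query has to remember.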

The IDE Integration Factor

Claude Code's Visual Studio Code integration deserves specific mention. Having Claude operate directly within your development environment—reading your actual codebase, understanding your file structure, making targeted edits—transforms the experience from "AI assistant" to "AI pair programmer."

When refactoring the message processing pipeline, Claude didn't just suggest changes. It identified every file that would be affected, proposed a migration strategy, and then methodically updated each component while maintaining type safety and test coverage. That level of coordination across a complex codebase would have taken me hours to orchestrate manually.

Still Early Days, But Encouraging

Look, we're still in the early stages of this technology's evolution. Claude 4 isn't perfect: it occasionally suggests overly complex solutions when simple ones would work, and it can get stuck in optimization rabbit holes. But what I'm experiencing feels like a genuine step change in AI-assisted development capability.

The Atlas Message Agent project represents the kind of complex, multi-layered system that typically takes weeks to scaffold properly. Getting 12,000+ lines of working TypeScript code in two days, with proper architecture, type safety, and feature completeness, suggests we're entering a new phase of development productivity.

More importantly, the code quality remains high. This isn't just rapid prototyping—it's sustainable architecture that I'm comfortable building on long-term.

That's the real test of these AI coding tools: not whether they can generate code quickly, but whether they can generate code you'd actually want to maintain six months from now. Based on these two days with Claude 4, I'm optimistic we're finally getting there.

The Atlas Message Agent project continues to evolve as a testbed for advanced AI-assisted development. Follow along at hyperdev.matsuoka.com for more insights from the trenches.

aleksander dietrichson

May 25

My impression, having worked with Claude (4) Code for a whole two days now, is that it works more autonomously and for longer stretches than before (i.e. the week before). I finally reached my goal of limiting out, but it did take running two sessions in parallel with substantial refactoring tasks in both worktrees. As you mentioned, it does seem to have improved memory and context depth, and I am beginning to wonder if I should dump Augment Code now 🤔. Claude and Augment were complementary, but now they seem to be overlapping.
