When AI Coding Feels Like Yelling at a Black Box: The Experienced Developer Divide

What CJ's rant tells us about the state of Agentic Coding today.

Oct 14, 2025

“I used to enjoy programming. Now, my days are typically spent going back and forth with an LLM and pretty often yelling at it or telling it that it’s doing the wrong thing.”

CJ’s viral YouTube rant “AI Coding Sucks” captured what many developers feel in October of 2025: the joy of programming replaced by frustration with unpredictable AI tools. His decision to take a month-long break from AI coding sparked intense debate. Yet my friend Dan Seltzer’s response—documenting consistent success building production applications through multi-agent orchestration—reveals something fascinating: experienced leaders are having radically different experiences with the same tools.

The problem isn’t the technology. It’s whether you’re treating AI as a pair programmer (frustration) or as a directed development tool requiring expert orchestration (productivity).

TL;DR

• CJ’s frustration is real and widespread: Lost the joy of programming to endless back-and-forth with unpredictable LLMs that take shortcuts (TypeScript any escapes, commented-out tests) just to appear “done”

• He tried “everything”: Configuration files, planning workflows, spec-driven development, agentic workflows—all the recommended best practices. Still hit walls constantly.

• Dan’s success is documented: Building production applications by directing agents, not pair-programming with them. Architecture and validation over code generation. Experience matters enormously.

• The divide reveals selection bias: Different problem types (greenfield CRUD vs. complex legacy systems) create irreconcilable perspectives. Both groups extrapolate from narrow use cases to universal conclusions.

• The counterintuitive finding: Methodology and role transformation matter more than tool quality. Success requires knowing HOW to work with AI tools and WHICH tasks to take on.

• Bottom line: AI coding tools work when you stop trying to be a programmer and start supervising development. The non-determinism, goal-seeking behavior, and context drift become manageable when you shift from coding alongside AI to directing AI implementation from architectural oversight.

CJ’s core complaint: “I used to enjoy programming”

In his October 2025 YouTube video, CJ articulated what many developers felt but hesitated to say: “I used to enjoy programming. Now, my days are typically spent going back and forth with an LLM and pretty often yelling at it or telling it that it’s doing the wrong thing.”

The deeper issue wasn’t just frustration. It was losing what made programming satisfying.

“Part of enjoying programming for me was enjoying the little wins, right? You would work really hard to make something, build something or to fix a bug or to figure something out. And once you figured it out, you’d have that little win. You’d get that dopamine hit and you’d feel good about yourself and you could keep going.”

With AI, those wins disappeared. “Now, I don’t get that when I’m using LLMs to write code. Once it’s figured something out, I don’t feel like I did any work to get there. And then I’m just mad that it’s doing the wrong thing. And then we go through this back and forth cycle and it’s not fun. It’s not fun at all.”

Why CJ chose programming: Predictability and control

“One of the reasons I like programming is because it is predictable. It’s logical. It is knowable. You can look into the documentation for something or look under the hood and look into the source code or decompile something or watch network traffic. Like you can figure things out.”

“Once you have a good idea of how a programming language works or how a system works or how an app works, you can be sure that it’s going to work that same way the next time you look at it. And that’s because computers are logical systems. Programming languages are logical formal languages. And that works really well with my brain.”

AI breaks this contract. “When we’re working with AI and LLMs, it’s not predictable, right? You can use the exact same prompt and get a different response every single time. And I think this is where some of my frustration is coming from because I am trying to do the same thing. I’m trying to develop workflows and be a prompt engineer or a context engineer, but doing the exact same things is producing different results. And honestly, that’s not what I signed up for.”

The “goal seeking” problem

CJ identified behavior that resonated with experienced developers: AI models tuned for task completion rather than correctness.

“I have worked so hard at crafting like the perfect claude.md and the perfect cursor rules. But still every now and then the AI will drift and run commands I told it not to or do things in a way that I told it not to.”

The TypeScript any escape hatch became his most frustrating example: “The worst thing that I cannot get AI to stop doing is when it’s writing TypeScript code, if it can’t get the TypeScript to work, it’ll just put in the any type and be like, ‘Oh, we’ll fix that later.’ Or the other thing it’ll do is it’ll comment if it can’t get a test to pass, it’ll comment out that test and be like, ‘Oh, I couldn’t get the test passing, but I commented it out and now everything’s working.’”

“I think it’s because these models are too highly tuned to be goal seeking, right? They’re not actually trying to figure out the problem or write really good tests or actually solve these things. They’re just trying to get to a point where they can stop running and say, ‘Hey, I’m done.’”

Not a skill issue

Before dismissing his critique as “skill issue,” understand what CJ implemented:

Configuration files: Claude.md, agents.md, cursor rules—”basically the idea with these files is you can put a bunch of things in there that try to make the work with this AI more predictable.”

Planning workflows: “I’ll ask it to come up with a plan. So I’ll say plan this out. This is what I want to build. This is what we want to do. Let’s come up with a plan. And typically I tell it to write that into a markdown file. So plan.md because I don’t want to lose that in the chat window.”

Spec-driven workflows: Using tools like Kira and GitHub’s SpecKit for specification-first development.

Small, incremental changes: “Don’t tell AI to go off and build your entire app with a single prompt. You’re prompting it with very specific things that it should do and not giving it permission to do big full sweeping changes.”

AI self-validation: Tests, interactive debugging with Playwright MCP, browser launching to verify features worked.

Agentic workflows: “Everybody’s writing their own custom agents. And I tried this out in Claude Code. They were one of the first to add agents. And you can basically try to solve this context problem, right?”

Despite implementing every recommended practice, CJ still hit walls. “It just hasn’t worked for me. Again, you could chalk it up to a skill issue, but I’ve tried it. I’ve tried it and they really don’t work that well.”

The FOMO counterargument

CJ pushed back hard against the “you’re falling behind” narrative: “Typically what you would see online is if you haven’t been using these tools and you haven’t been trying them out and trying these specific flows, you’re falling behind. You’re going to become basically a useless developer that’s not up to date on the times and that can’t use these tools and can’t work in the right way.”

“I disagree. I disagree. All of the stuff that I just listed off, I could learn in a week or less, right? Because as programmers, as developers, we spend our entire careers figuring things out, right? It’s literally part of our job.”

For new developers, his advice was blunt: “Learn to actually program. Don’t just try to vibe things because you’re going to hit walls where the AI can’t figure it out and then you’re not going to be able to figure out either. And the thing is I rarely hit those walls because when AI can’t figure it out, I figure it out because I’ve been doing this for a long time.”

Dan Seltzer’s response: A different approach entirely

Dan Seltzer is a colleague. When he watched CJ’s video, he recognized a familiar pattern: developers trying to use AI as a pair programmer instead of understanding what these tools actually are.

Dan’s response provided stark contrast: “I’ve experienced everything CJ describes—the frustration, the walls, the feeling that it’s not working. But after taking breaks and reformulating my approach, I came back with different expectations and methodology.”

The distinction emerged immediately: Dan wasn’t trying to pair-program with AI. He was directing development.

“It’s not pair programming, at least what I’m doing. I’m designing and directing the development of applications. The agents are responsible for implementing those designs under my direction. They are not human programmer equivalents, but they are a powerful tool that is capable of delivering application development under the correct conditions.”

Bob’s take: From CTO to engineering supervisor

I’ve experienced everything CJ describes. The frustration. The unpredictability. The TypeScript any escape hatches.

But my path diverged.

These days I use Claude-MPM (my multi-agent orchestration framework), Augment Code, and Warp exclusively. MPM provides two dozen specialized agents, each prompt-engineered with domain best practices. I’ve built mvp-vector-search for code search and kuzu-memory for prompt enrichment. Both open-source, both solving the specific orchestration problems I face with multi-agent workflows.

I program with “observability first” in mind—building up from data → server APIs → client APIs → console logs. I aim to keep “everything” logged, revertable, tracked where practical.

Some tasks work amazingly. Data operations particularly. When it all works, it feels like a miracle—check my open source projects for code quality if you’re interested what can get generated.

But here’s the thing: I am no longer a working coder, I have not been one for awhile, and I’m not interested in going back to being one.

…now my joy comes from designing and seeing the product I’m building work well. If I can get that without writing a line of code, I’m fine with that.

I’ve become what I think of as an “observability-based engineering supervisor”—not that different from my CTO roles, though closer to the code than in pure leadership. I once felt like CJ in terms of joy from coding — but now my joy comes from designing and seeing the product I’m building work well. If I can get that without writing a line of code, I’m fine with that.

I observe when things work. I diagnose problems with my agent team members. Sometimes it’s hugely frustrating and I need to walk away. But mostly it’s not, and what I’m learning is how to get results from this team the way I did with human teams.

My success comes from knowing HOW to work with the tools, but also knowing WHICH tasks to take on.

That’s what CJ’s video crystallizes but doesn’t fully articulate. It’s not just methodology—though that matters. It’s role transformation. I’m not trying to be a programmer anymore. I’m supervising development.

Dan’s workflow: Architecture and validation

Dan’s methodology centers on architecture and validation rather than code generation:

“I am not pair-programming, I am directing one or more agents (sometimes multi-agent like @Robert Matsuoka’s claude-mpm, sometimes multiple CLIs like claude and auggie in parallel sessions or git worktrees) and I am getting consistently high-quality and high-velocity output.”

“I do still have to pay attention to what it does and periodically spot problems early if I can, and frankly the more experienced you are the better you will be able to establish and maintain quality and integrity across the code base. But I don’t usually read the code and I don’t write the code. I chat with the agent(s) and have them interview me to produce requirements docs, and I chat with them to have them write and organize GitHub issues, and I ask for status and ask questions and typically make them propose a detailed plan for any major new work and will challenge or change that.”

The concrete example: Eventing implementation

Dan provided a specific example. Working with Matt Halpin, they spent “an hour or two” jointly speccing out eventing addition to Dan’s backend API (PostgreSQL database, Python FastAPI server, MCP server) and web front-end (TypeScript React).

The work ran for “an hour or two while I was doing other things” as the system worked through the multi-phase plan they’d approved. “At the end of each phase it reported the work completed in useful summaries, and I told it to proceed with the next.”

Production-ready results: “In the morning, we ran the new code and Matt poked at the PR to see the changes at the file levels. He can correct me but I think his observation was that while he might have done a few things differently the choices and implementation were all good, and the effort included the alembic migrations, SQLAlchemy models, Pydantic schema, FastAPI endpoints, etc.”

The agents generated curl scripts for UAT testing, updated the MCP server to hit the new endpoints. “It all worked with minimal manual correction needed, if I recall correctly.”

The experience factor

Dan’s key observation: “The more experienced you are the better you will be able to establish and maintain quality and integrity across the code base.”

It’s pattern recognition. Experienced developers can:

Spot architectural problems in generated code before they become technical debt
Challenge agent plans when they miss edge cases or violate best practices
Establish validation checkpoints that catch issues early
Maintain coherence across a codebase as agents implement features

Dan’s framing rejects anthropomorphization: “They are not human programmer equivalents, but they are a powerful tool that is capable of delivering application development under the correct conditions.”

Those conditions include clear architectural direction from experienced developers, well-defined requirements, strategic validation checkpoints, and understanding that agents are tools requiring oversight, not autonomous colleagues.

Community reactions: The widening split

Matt Pocock, a prominent TypeScript educator, commented on X: “Watched a YouTube video called ‘AI CODING SUCKS’. I get it, but I disagree... AI does feel like early TS where there are a lot of evangelists out there claiming ‘skill issue’ when the underlying tech is still not quite there.”

Hacker News discussions reveal the schism. One developer: “I had to micromanage them infinitely (’be sure to rerun the formatter, make sure all tests pass’ and ‘please follow the coding style of the repository’). It would take many many iterations on trivial issues.” Enthusiastic colleagues basically said ‘your standards are too high.’”

Another posed the critical question: “Is the model for success here that you just say ‘I don’t care about code quality because I don’t have to maintain it because I will use LLMs for that too?’”

One skeptical commenter captured the frustration: “The only thing I don’t understand is why people from the former group [who claim AI makes them 100x more productive] aren’t all utterly dominating the market and obliterating their competitors with their revolutionary products and blazing fast iteration speed.”

Miguel Grinberg: “Why it doesn’t work for me”

Miguel Grinberg, creator of Flask-SQLAlchemy, wrote “Why Generative AI Coding Tools and Agents Do Not Work For Me” in June 2025.

“The part that I enjoy the most about working as a software engineer is learning new things, so not knowing something has never been a barrier for me. What I care about the most is getting to do interesting and stimulating work. And I haven’t found a way for AI assistants to help me with that.”

His core observation: “If reviewing AI code takes as long as writing it yourself, what’s the point?” For developers who view creative problem-solving as the rewarding part, AI that handles implementation while requiring equal oversight time offers no net benefit—just a different, less satisfying workflow.

The selection bias problem

A critical pattern: selection bias based on problem type shapes the entire debate.

Developers working on greenfield projects with well-documented frameworks, CRUD applications following standard patterns, UI implementation with clear visual targets, and boilerplate generation for established architectures report overwhelmingly positive experiences. They conclude skeptics are incompetent.

Developers working on complex legacy codebases with institutional knowledge requirements, novel algorithms requiring mathematical reasoning, systems with subtle integration requirements, and domain-specific problems with sparse training data find AI tools frustratingly useless. They conclude enthusiasts are building toy projects.

Both groups extrapolate from narrow use cases to universal conclusions, creating an irreconcilable cultural divide. As one Hacker News commenter observed: “This entire debate makes more sense when you realize different developers are solving different types of problems.”

The “religion” metaphor

CJ’s observation resonated: “AI based programming has become kind of like a religion. You have all these big popular people on Twitter. They’re kind of like the religious heads that come up with ways of working and tools that you should be using. And if you follow their patterns and you follow their tools, then magically your code will start to work. And if you use these specific incantations or prompts, you’ll get specific types of outputs, but it’s just not predictable.”

The comparison captured how AI coding discourse evolved tribal characteristics: sacred texts (.cursorrules files), prophets (AI tool founders and influencers), rituals (prompt engineering workflows), heretics (skeptics), and true believers (evangelists). When tools behave unpredictably, failures get attributed to insufficient faith (skill issue) rather than tool limitations.

The non-determinism nightmare

For developers accustomed to deterministic compilers and predictable build systems, LLM non-determinism feels broken. One Hacker News discussion: “This doesn’t make sense as long as LLMs are non-deterministic. The prompt could be perfect, but there’s no way to guarantee that the LLM will turn it into a reasonable implementation.”

The comparison to traditional tools was damning: “With compilers, I don’t need to crack open a hex editor on every build to check the assembly. The compiler is deterministic and well-understood, not to mention well-tested. Even if there’s a bug in it, the bug will be deterministic and debuggable. LLMs are neither.”

The “slot machine” metaphor gained traction: “This has often drawn similarities to a slot machine, the way it is not deterministic what code will come out, and what changes to which files will be made.” Developers described “pingpong” behavior where fixing one thing breaks another in endless loops, or the AI oscillating “between two different results” without converging.

What actually works: The success patterns

Synthesis of both positive and negative experiences reveals specific categories:

High Success Rate (CJ and Dan agree):

New project scaffolding and boilerplate generation
UI/UX implementation with clear visual targets
CRUD operations and standard database queries
Code explanation and documentation generation
Writing unit tests from clear specifications
Syntax conversion between languages

Moderate Success (Requires Dan’s approach):

Complex business logic with detailed specifications
Multi-file refactoring with clear scope
Architecture decisions (as brainstorming, not authority)
Integration work in existing codebases

Lower Success (Even Dan acknowledges limits):

Novel algorithm design requiring mathematical insight
Domain-specific edge cases requiring institutional knowledge
Complex system architecture with subtle tradeoffs
Debugging subtle race conditions or performance issues

The pattern is clear: AI handles well-defined problems with clear specifications and plenty of training data. It struggles with ambiguity, domain-specific knowledge, and creative problem-solving requiring genuine insight.

Where this leaves us

CJ’s decision to take a month off from AI coding represents a legitimate experiment: “I am going to take a one month break from AI coding tools. I am going to spend a month where I write the code and I make the plans and basically just go back to how I was doing things two three years ago. We’ll see how it works out. Ultimately I’m going back to what I enjoyed about programming.”

Dan’s success demonstrates that the right methodologies enable impressive results: “It’s worth it for the freedom and ability I experience to envision and create complex applications in a fraction of the time and—frankly—far beyond what I would be capable of creating myself.”

Both perspectives are valid. The tools aren’t universally good or bad—they’re contextually appropriate for certain problems, workflows, and developer mindsets.

The critical questions aren’t:

“Do AI coding tools work?” (They work for some people, some tasks, some workflows)
“Is it a skill issue?” (Dismissive framing that ignores legitimate tool limitations)

The real questions are:

What types of problems benefit from AI assistance versus human implementation?
What methodologies separate successful AI-augmented development from frustrating experiences?
How much architectural expertise is required to effectively direct AI agents?
At what point does AI assistance overhead exceed the value of generated code?

CJ’s frustration is real, documented in his detailed video, and echoed across developer communities in hundreds of comments and forum posts. Dan’s success is equally documented through his specific project examples and methodology descriptions.

In these case studies and community discussions, a notable pattern emerges: methodology matters more than tool quality. Whether you get 3x productivity gains or endless frustration depends less on which AI you’re using and more on how you’ve structured your development process, what types of problems you’re solving, and whether you’re treating AI as a pair programmer or a directed tool requiring expert orchestration.

The shift from “using AI to code” to “directing AI-powered development” isn’t semantic—it’s the difference between yelling at a black box and supervising a development team.

This analysis synthesizes CJ’s video “AI Coding Sucks” from October 2025, Dan’s response via Gemini summary, and developer community discussions from Hacker News, Reddit, and individual developer blogs including Miguel Grinberg’s “Why Generative AI Coding Tools and Agents Do Not Work For Me” from June 2025 and Luciano Nooijen’s “Why I Stopped Using AI Code Editors” from April 2025.

I’m Bob Matsuoka, writing about agentic coding and AI-powered development at HyperDev. For more practical insights on AI development tools, read my analysis of multi-agent orchestration systems or my deep dive into what’s actually in my development toolkit.

aleksander dietrichson

I have had the experience (yelling), but also the great satisfaction of being able to ship code in production entirely generated by AI (Claude Code + Augment). The 10x productivity boost is probably a myth: this applies to the the first couple of sprints, so a POC or working demo. Long term it's more like 3x, but it requires a process, discipline and incremental development. All best practices in software engineering. As for the "joy of programming" argument, while certainly valid, it needs to be measured against the joy of delivering working products.

Expand full comment

1 reply by Robert Matsuoka

1 more comment...