Fear and Loathing in AWS

How Claude Helped Me Discover The Joys of Complex Infrastructure

May 07, 2026

TL;DR: I spent years avoiding AWS due to its overwhelming complexity, preferring the simplicity of Vercel and composable stacks. Then Claude MPM changed how I work with AWS. Now I’m running sophisticated multi-service AWS deployments with GPU instances, comprehensive monitoring, and infrastructure-as-code—all managed through AI-assisted tooling. The lesson: agentic approaches transform ops workflows just as profoundly as they do coding.

Remember when AWS felt like digital quicksand? Every innocent “let me just deploy this simple app” spiraled into an afternoon lost in IAM policies, VPC configurations, and security group rules that made no sense. I’d start with what should be a five-minute deployment and emerge three hours later, bleary-eyed, with a working application and absolutely no confidence I could recreate the process.

That’s why I became a Vercel evangelist. One git push, automatic deployments, zero configuration. The developer experience was everything AWS wasn’t: predictable, fast, and actually enjoyable. For most projects, this composable stack approach—Vercel for frontend, managed databases, serverless functions where needed—delivered exactly the right balance of power and simplicity.

But somewhere along the way, that changed.

The Claude Code/MPM Turning Point

Today I’m running the kind of infrastructure I would have delegated to an ops team a year ago. GPU instances for ML workloads. Multi-AZ deployments across six subnets. Sophisticated monitoring pipelines that integrate CloudWatch metrics, Cost Explorer analysis, and automated GitHub issue creation. Terraform managing multi-account infrastructure with cross-service dependencies.

The difference? I don’t even need Amazon Q, AWS’s own assistant. I have something better: purpose-built AWS skills in Claude MPM that handle service deployment, infrastructure analysis, and operational workflows.

This transformation illustrates something crucial about the agentic revolution: it’s not just changing how we write code. It’s fundamentally altering how we approach operational complexity.

The Infrastructure Reality Check

Let me show you what I mean with real numbers. Here’s what I’m actually running across three active projects to support internal tools:

CloudWatch Reporting Service (Serverless):

12 Lambda functions handling health checks, metrics aggregation, and MCP server functionality
API Gateway HTTP API with sophisticated CORS and authentication
DynamoDB tables for state management and external directory lookups
SNS/SQS for alerting and dead letter queue handling
Direct Bedrock integration with Claude Haiku 4.5 for automated analysis
CloudWatch Events scheduling 5-minute monitoring cycles
Secrets Manager for GitHub app credentials and API keys

Code Intelligence Platform (Compute):

Two EC2 instances: t3.xlarge for web serving, g4dn.xlarge for GPU-accelerated indexing
EBS volumes with gp3 storage and custom IOPS configuration
VPC with six subnets across availability zones
Application Load Balancer with Route53 DNS and ACM certificates
EFS for shared file storage across instances
CloudWatch Synthetics for endpoint monitoring
Lambda-based Slack notifications triggered by SNS topics
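
To make the "six subnets across availability zones" item concrete, here is a minimal sketch of how such a layout could be planned with Python's standard `ipaddress` module. The VPC CIDR, the three AZ names, the /20 subnet size, and the public/private split are all assumptions for illustration, not the actual network described above.

```python
import ipaddress


def plan_subnets(vpc_cidr, azs, new_prefix=20):
    """Carve one public and one private subnet per AZ out of a VPC CIDR."""
    network = ipaddress.ip_network(vpc_cidr)
    # subnets() yields equal-sized, non-overlapping blocks in address order.
    blocks = network.subnets(new_prefix=new_prefix)
    plan = []
    for tier in ("public", "private"):
        for az in azs:
            plan.append({"tier": tier, "az": az, "cidr": str(next(blocks))})
    return plan


if __name__ == "__main__":
    # Hypothetical VPC range and AZ names.
    for subnet in plan_subnets("10.0.0.0/16", ["us-east-1a", "us-east-1b", "us-east-1c"]):
        print(subnet)
```

With three AZs this yields six subnets. The remaining /20 blocks in the /16 stay unallocated, which leaves room to add tiers later without renumbering.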

Enterprise Infrastructure (Multi-Account):

Terragrunt-managed infrastructure across production and staging accounts
S3 backend for Terraform state with DynamoDB locking
Cross-account IAM policies and service integration
Integration with external providers (Sentry, GitHub, Kubernetes clusters)

A year ago, this list would have been my personal infrastructure horror story. Today, it’s Tuesday.

What Changed

The transformation wasn’t gradual. It was a step function that happened when I realized AI assistants could handle the cognitive overhead that makes AWS painful.

Before: AWS documentation as archaeological expedition. Digging through service guides, trying to understand the relationship between VPC route tables and security groups, wondering if I need an Internet Gateway or a NAT Gateway or both. Every deployment felt like solving a puzzle where half the pieces were hidden.

After: Natural language infrastructure requests. “Set up monitoring for this Lambda function with alerting to Slack” becomes a series of guided steps where the AI handles the AWS-specific implementation details while I focus on the business requirements.
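
As a rough illustration of what sits at the end of a request like that, here is a sketch of a Lambda handler that turns an SNS-delivered CloudWatch alarm into a Slack message. It assumes the alarm publishes to an SNS topic that triggers the function and that a Slack incoming-webhook URL is stored in an environment variable. The variable name `SLACK_WEBHOOK_URL` and the message format are hypothetical, and this is not the code the tooling actually generated.

```python
import json
import os
import urllib.request


def build_slack_payload(sns_event):
    """Turn an SNS-delivered CloudWatch alarm notification into a Slack message body."""
    record = sns_event["Records"][0]["Sns"]
    # CloudWatch puts the alarm details in the SNS message as a JSON string.
    alarm = json.loads(record["Message"])
    text = "[{state}] {name}: {reason}".format(
        state=alarm["NewStateValue"],
        name=alarm["AlarmName"],
        reason=alarm["NewStateReason"],
    )
    return {"text": text}


def handler(event, context):
    """Lambda entry point: post the formatted alarm to a Slack incoming webhook."""
    payload = build_slack_payload(event)
    request = urllib.request.Request(
        os.environ["SLACK_WEBHOOK_URL"],  # hypothetical environment variable name
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    # Supplying data makes this a POST request.
    with urllib.request.urlopen(request, timeout=5) as response:
        return {"status": response.status}
```

Keeping the formatting in `build_slack_payload` separate from the network call means the message logic can be tested locally with a sample event, without touching Slack or AWS.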

The key insight is that AWS’s complexity isn’t inherently bad—it’s just cognitively expensive. When you remove that cognitive load through AI assistance, you can appreciate what all those services actually enable.

Take my monitoring setup. Previously, I would have settled for basic uptime checks because configuring comprehensive CloudWatch metrics, Cost Explorer integration, and automated issue creation felt like a weekend project. With Claude MPM’s AWS skills, it became an hour of guided configuration that resulted in production-grade observability.

Or consider the GPU instance management. The g4dn.xlarge for ML indexing runs sophisticated start/stop automation, monitors for runaway processes, and automatically scales EBS volumes based on data requirements. Setting this up manually would have required deep expertise in EC2 lifecycle management, CloudWatch alarms, and Lambda automation. With AI assistance, I focused on defining the business logic while the tooling handled the AWS implementation.
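
The "business logic" part of that automation can be quite small. Below is a minimal sketch of the two decisions involved: whether an instance has been idle long enough to stop, and whether it looks like a runaway process is pinning it. The thresholds and sample counts are hypothetical, and the wiring that would fetch CPU utilization samples from CloudWatch and call the EC2 stop API is deliberately left out.

```python
def should_stop(cpu_samples, idle_threshold=5.0, min_samples=6):
    """Return True when every one of the most recent samples is below the idle threshold."""
    if len(cpu_samples) < min_samples:
        return False  # not enough history to decide safely
    recent = cpu_samples[-min_samples:]
    return max(recent) < idle_threshold


def looks_runaway(cpu_samples, busy_threshold=95.0, min_samples=12):
    """Return True when every one of the most recent samples is above the busy threshold."""
    if len(cpu_samples) < min_samples:
        return False
    recent = cpu_samples[-min_samples:]
    return min(recent) > busy_threshold


if __name__ == "__main__":
    # Hypothetical CPU utilization percentages, oldest first.
    print(should_stop([40.0, 2.0, 1.5, 3.0, 2.2, 1.0, 0.8]))
    print(looks_runaway([99.0] * 12))
```

Requiring every recent sample to cross the threshold, rather than the average, is a conservative choice: a single busy interval keeps the instance running, which is usually the cheaper mistake compared with killing an indexing job midway.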

The DX Philosophy Still Matters

None of this means AWS wins every comparison. Vercel’s developer experience remains superior for the use cases it targets. When I need to ship a marketing site or a straightforward web application, git push deployment still beats any infrastructure-as-code workflow.

The difference is recognizing when complexity serves a purpose versus when it’s just complexity. Vercel abstracts away infrastructure concerns because most web applications don’t need granular control over compute, storage, and networking. But when you’re building systems that do need that control—ML pipelines, high-throughput data processing, complex service topologies—AWS’s granularity becomes valuable rather than burdensome.

AI assistance changes the cost-benefit calculation. When configuring VPC networking takes 20 minutes of guided conversation instead of three hours of documentation archaeology, you can choose AWS for projects where you previously would have compromised on requirements to avoid operational overhead.

But there’s an honest accounting problem buried in that logic. Claude Code isn’t free. API costs, subscription fees—if you’re running significant conversation volume to figure out your infrastructure, you’re spending real money. At some point, you’re spending more on AI assistance than a Vercel seat would cost. The “AWS saves money at scale” argument gets complicated fast when you factor in the cognitive tooling required to get there.
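
That accounting can be written down as simple arithmetic. The sketch below compares the monthly cost of the AWS route (infrastructure plus AI tooling) against a managed platform, and computes the largest AI spend at which AWS still comes out ahead. Every figure in the example is a made-up placeholder, not real pricing for any of these products.

```python
def ai_spend_ceiling(managed_platform_cost, aws_infra_cost):
    """Largest monthly AI-tooling spend at which the AWS route is no more expensive."""
    return max(0.0, managed_platform_cost - aws_infra_cost)


def aws_route_is_cheaper(managed_platform_cost, aws_infra_cost, ai_tooling_cost):
    """Compare total monthly cost of AWS plus AI tooling against the managed platform."""
    return aws_infra_cost + ai_tooling_cost < managed_platform_cost


if __name__ == "__main__":
    # Hypothetical monthly figures in dollars.
    managed, infra = 300.0, 120.0
    print(ai_spend_ceiling(managed, infra))
    print(aws_route_is_cheaper(managed, infra, ai_tooling_cost=200.0))
```

A ceiling of zero means the managed platform is already cheaper before any AI cost is counted, which is the simple-web-app case.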

So let me be direct about where each wins. Pure self-service developer experience—one engineer, a web app, ship it fast? Vercel, and it’s not particularly close. The moment you need an AI co-pilot to configure your infrastructure, you’ve added a cost layer that Vercel eliminates by design. But complex multi-service deployments—ML pipelines, GPU compute alongside serverless, multi-account Terraform, monitoring infrastructure that spans six services—those don’t live in Vercel’s world. That’s where the math inverts and AWS earns its complexity premium.

The Broader Implications

This transformation reveals something important about how agentic approaches will reshape technology adoption. We’re not just making individual tasks more efficient—we’re changing which categories of tools become accessible to developers.

I see this pattern across the infrastructure stack. Database migrations and performance tuning become approachable when AI translates business requirements into specific configuration changes. Kubernetes stops being “too complex for small teams” when you can describe desired behavior in natural language and get helm charts and operators generated automatically. IAM policies, security groups, and compliance frameworks become manageable when AI can analyze your application requirements and generate least-privilege configurations.
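
For a sense of what a "least-privilege configuration" looks like in practice, here is a small sketch that builds an IAM policy document granting read-only access to a single DynamoDB table and its indexes. The region, account ID, table name, and the choice of `GetItem` and `Query` as the only required actions are assumptions for illustration. A real policy should be derived from what the application actually calls.

```python
import json


def read_only_table_policy(region, account_id, table_name):
    """Build an IAM policy document allowing reads on exactly one DynamoDB table."""
    table_arn = f"arn:aws:dynamodb:{region}:{account_id}:table/{table_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadSingleTable",
                "Effect": "Allow",
                # Only the read actions the (hypothetical) function needs.
                "Action": ["dynamodb:GetItem", "dynamodb:Query"],
                # The table itself plus its secondary indexes, nothing broader.
                "Resource": [table_arn, f"{table_arn}/index/*"],
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder region, account ID, and table name.
    policy = read_only_table_policy("us-east-1", "123456789012", "directory")
    print(json.dumps(policy, indent=2))
```

The point of the example is the shape: explicit actions and explicit resource ARNs, with no `*` wildcards in the action list.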

These tools were always powerful. They were just too expensive to learn and maintain for many use cases. AI assistance changes those economics.

Where This Goes Next

We’re still early in AI-assisted infrastructure management. Today’s tooling handles deployment and configuration. Cost optimization, security posture, performance tuning—those are coming. Full system architecture from high-level requirements is probably further out than the hype suggests, but it’s not science fiction.

But the fundamental lesson remains: complexity isn’t always the enemy. Sometimes it’s just temporarily inaccessible. When AI removes the accessibility barriers, you can choose tools based on their actual capabilities rather than their learning curves.

For now, I’m running infrastructure that would have seemed impossible to manage solo a year ago. And it’s kind of fun.

AWS might still feel like quicksand sometimes. But now I have a helicopter.

Bob Matsuoka is CTO of Duetto and writes about AI-powered engineering at HyperDev.

Related reading:

AI Power Ranking — Tool comparisons and benchmarks for AI practitioners
LinkedIn Newsletter — Strategic AI insights for CTOs and engineering leaders
