The Checklist Before Shipping with AI: 3 pillars vibe coding ignores

You've been coding with AI for months. You ship fast. But do you test? Monitor? Secure? 3 pillars to go from vibe coding to software engineering.


1,217 days and a painful realization

My first AI prompts were late 2022. GPT-3.5. I don’t remember the exact prompt, but I remember the feeling: this could save me massive amounts of time. Level up my dev skills. Fix bugs faster. Build things I wouldn’t have dared tackle alone.

Except the reality of 2022 was something else entirely. The generated code wasn’t great. Bug fixes didn’t always work, and often created new regressions. You’d go back, fix things by hand, test, start over. Context windows were tiny, models were limited. You spent as much time correcting the AI as you would have spent coding yourself.

I subscribed anyway. I started integrating AI for my clients through the OpenAI API (GPT-3.5 Turbo). Then GPT-4 arrived, and that changed the game. A real qualitative leap. Meanwhile, I tested Claude, did a lot of work with Cursor, and gradually migrated everything to Anthropic. Today, Claude Code is my main tool.

1,217 days later, here’s the realization I wish I’d heard sooner: vibe coding produces code. It doesn’t produce software.

The problem nobody wants to hear

Vibe coding is intoxicating. I wrote about this in the Claude Coder paradox: the better the AI gets, the less you think about what it’s doing. You stop reading the output. You stop questioning the approach. You just ship.

Harrison Chase said it: one person can now do the work of five. He’s right. But those five people had five sets of guardrails. Five sets of habits. Five people who would catch each other’s mistakes in code review.

You have none of that.

And here’s the part nobody wants to hear: if you’re not a senior developer, you’re not skipping steps on purpose. You’re skipping steps you never learned existed.

You don’t write tests because you’ve never experienced the 3 AM production bug that tests would have caught. You don’t set up monitoring because you’ve never shipped to ten thousand users who found the edge case you missed. You don’t think about security because you’ve never had your database dumped on a public paste site.

Robert C. Martin wrote it in Clean Code over fifteen years ago:

“It is not enough for code to work.”

That sentence hits different when the code was written by something that doesn’t understand what it does.

The 3 pillars vibe coding ignores

After 1,217 days of building with AI, I’ve identified three pillars that separate “code that runs on my machine” from “software I can ship to real users.”

Every vibe coder skips at least one. Most skip all three.

Pillar 1: Test before you ship

AI writes code that works in the prompt. Not in production.

Think about what happens when you ask Claude to build a feature. It generates code that satisfies the request. It handles the happy path. It returns something that looks right.

But it doesn’t think about what happens when the input is null. It doesn’t consider the race condition when two users hit the same endpoint. It doesn’t test what happens when the database is slow, the network drops, or the user pastes an emoji where you expected an email address.

What gets skipped every time:

  • Unit tests for individual functions
  • Integration tests for connected systems
  • Edge case coverage
  • Regression tests after changes
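
Here's a minimal sketch of what those skipped tests look like. The `parse_email` function is hypothetical, the kind of happy-path validator AI tends to generate; the point is the test below it, which feeds in every ugly input from the paragraph above:

```python
# Hypothetical validator: the kind of function AI writes for the happy path.
def parse_email(value):
    """Return a normalized email address, or None if the input is invalid."""
    if not isinstance(value, str):
        return None  # null, numbers, lists: anything that isn't a string
    value = value.strip().lower()
    # Minimal structural check: one "@", a non-empty local part, a dot in the domain
    local, sep, domain = value.partition("@")
    if not sep or not local or "." not in domain:
        return None
    return value

# The tests vibe coding skips: the happy path plus every way input can be wrong.
def test_parse_email():
    assert parse_email("User@Example.com") == "user@example.com"  # happy path
    assert parse_email(None) is None          # null input
    assert parse_email("") is None            # empty string
    assert parse_email("😀") is None          # emoji where an email was expected
    assert parse_email("no-at-sign") is None  # missing @
    assert parse_email("a@b") is None         # domain without a dot
```

Six assertions, thirty seconds to write. Each one is a production bug you won't be debugging at 2 AM.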

The cost is brutal. You ship a bug to production. Users report it. You open the code. And you realize you can’t debug it, because you didn’t write it. You don’t understand the logic. You don’t know why it made the choices it made. You’re reverse-engineering AI output at 2 AM while your users are leaving.

Robert C. Martin dedicated an entire section of The Clean Coder to TDD. Not because he loves writing tests. Because he knows that code without tests is code you can’t trust. And code you can’t trust is code you can’t maintain.

When AI writes the code, tests aren’t optional. They’re your only proof that the thing actually works.

I go deeper on this in Testing AI code: TDD and unit tests for builders who ship with Claude.

Pillar 2: Automate deployment, monitor production

If you deploy by copying files, you don’t have a product. You have a demo.

I see this constantly. Someone builds an app with Claude Code, pushes it to a server manually, and calls it shipped. No CI/CD pipeline. No error tracking. No performance monitoring. No alerts.

Then something breaks. And they don’t know. Their users know. Their users leave. And they find out three days later when someone sends them a screenshot of a 500 error.

What gets skipped:

  • CI/CD pipelines that run tests before deploy
  • Error tracking (Sentry, LogRocket, anything)
  • Uptime monitoring
  • Performance metrics
  • Automated rollback on failure
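
A basic version of the first item really is small. Here's a sketch of a GitHub Actions workflow that refuses to deploy unless tests pass; the Node setup and the deploy step are placeholders to adapt to your stack:

```yaml
# .github/workflows/ci.yml — a minimal sketch, not a full production pipeline.
name: ci
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npm test            # nothing deploys if this fails
  deploy:
    needs: test                  # deploy only runs after tests pass
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./deploy.sh         # placeholder: your actual deploy command
```

The `needs: test` line is the whole point: the machine enforces the discipline you'd otherwise have to remember at midnight.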

The cost is invisible until it isn’t. You’re shipping broken code and you don’t know it. Your app is down for six hours and nobody tells you. A deploy introduces a regression and you can’t roll back because you deployed by hand.

Robert C. Martin’s Clean Architecture is fundamentally about boundaries: knowing where your system ends and the outside world begins. Deployment and monitoring are those boundaries. Without them, you’re building a house with no doors and no windows. You can’t see outside, and nothing protects what’s inside.

A solo builder with a CI/CD pipeline and basic monitoring is more reliable than a team of five deploying manually. That's not bravado. An automated pipeline runs the same checks the same way every time; five humans deploying by hand don't.

I break down the full setup in CI/CD and monitoring for solo builders.

Pillar 3: Secure and host like a professional

AI can scan your code. It can’t think like an attacker.

This is the pillar that scares me the most. Because the consequences aren’t a bug or some downtime. The consequences are a breach. Leaked data. Lost trust. Legal liability.

I’ve seen AI-generated code that stores passwords in plain text. That exposes API keys in client-side bundles. That trusts user input without validation. That has SQL injection vulnerabilities that would make a first-year CS student wince.

And the developer shipping it has no idea. Because the code works. It does what they asked. It just also does what an attacker would ask.

What gets skipped:

  • Input validation and sanitization
  • Authentication and authorization best practices
  • HTTPS everywhere, secure headers, CORS
  • Dependency auditing
  • Load testing before launch
  • Proper hosting configuration
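
To make the first item concrete, here's a self-contained sketch of the SQL injection pattern mentioned above, using an in-memory SQLite database. The table and functions are invented for illustration; the contrast between string interpolation and a parameterized query is the real thing:

```python
import sqlite3

# In-memory database with one row, just to demo the difference.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT)")
db.execute("INSERT INTO users VALUES ('alice')")

def find_user_unsafe(name):
    # What AI-generated code often does: build SQL by string interpolation.
    # Malicious input changes the meaning of the query itself.
    return db.execute(f"SELECT name FROM users WHERE name = '{name}'").fetchall()

def find_user_safe(name):
    # Parameterized query: the driver treats the input as data, never as SQL.
    return db.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()

payload = "' OR 1=1 --"
assert find_user_unsafe(payload) == [("alice",)]  # injected: returns every row
assert find_user_safe(payload) == []              # treated as data: no match
```

Both functions "work" when you test them with a normal name. Only one of them works when an attacker shows up.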

The cost is existential. A breach doesn’t just break your app. It breaks your reputation. It breaks your users’ trust. And depending on your jurisdiction, it breaks your bank account.

Robert C. Martin’s concept of professionalism in The Clean Coder applies directly here. A professional doesn’t ship what they don’t understand. A professional doesn’t hope their code is secure. A professional verifies.

When AI writes the code, you need to be twice as paranoid about security. Not half.

Full breakdown here: Security and hosting for AI builders.

The screenshot-able checklist

Save this. Print it. Tape it next to your monitor.

Pillar | Question to ask yourself | If the answer is no
Testing | Do I have unit tests for core logic? | Write them before you ship
Testing | Do I test edge cases and error states? | List 5 ways your input can be wrong
Testing | Do I run tests automatically before deploy? | Set up a pre-commit hook today
Testing | Can I debug AI-generated code when it breaks? | Add comments and tests until you can
Deployment | Do I have a CI/CD pipeline? | Set one up. GitHub Actions takes 20 minutes
Deployment | Do I have error tracking in production? | Install Sentry. It's free for solo devs
Deployment | Do I get alerts when my app goes down? | Set up uptime monitoring now
Deployment | Can I roll back a bad deploy in under 5 minutes? | You don't have a deploy process
Security | Do I validate and sanitize all user input? | You have vulnerabilities right now
Security | Are my API keys and secrets out of the client bundle? | Rotate them immediately
Security | Have I run a dependency audit this month? | npm audit takes ten seconds
Security | Have I load tested before launch? | Your first real traffic spike will be your test

If you checked “no” on more than three of these, you’re vibe coding. You’re not engineering.

That’s not an insult. That’s a diagnosis. And now you have the prescription.

What Robert C. Martin would tell vibe coders

I keep coming back to Uncle Bob’s trilogy (Clean Code, The Clean Coder, Clean Architecture) because these books were written before AI coding existed, and they’ve never been more relevant.

Clean Code doesn’t say “write everything yourself.” It says “understand everything you ship.” When AI writes your code, understanding means testing it, reading it, questioning it. Not just running it and moving on.

The Clean Coder doesn’t say “be slow.” It says “be responsible.” Shipping fast is fine. Shipping fast without tests, without monitoring, without security review is not speed. It’s negligence.

Clean Architecture doesn’t say “over-engineer everything.” It says “draw boundaries.” Know where your system ends. Know what depends on what. Know what breaks when something changes.

These principles don’t conflict with AI-assisted development. They complete it.

AI gives you speed. The three pillars give you trust.

Speed without trust is a liability.

Summary

Here’s the framework. Three pillars. Three articles. One checklist.

Pillar | What it covers | Deep dive
Testing | Unit tests, integration, edge cases, TDD | Testing AI code
Deployment | CI/CD, error tracking, monitoring, rollback | CI/CD and monitoring
Security | Input validation, secrets, hosting, load testing | Security and hosting

I’ve been building with AI for 1,217 days. I’ve shipped things I’m proud of and things I’m embarrassed by. The difference was never the AI. It was always the method.

AI is the best tool I’ve ever used. But a tool without method is a toy.

Stop vibe coding. Start engineering.


Pierre Rondeau

Developer and indie builder. I build products and automations with AI. Creator of Claude Hub.