How to Let Your Team Vibe Code Without Breaking Things
Published on 27.04.2026
The Two Outcomes
Two numbers landed on my desk this month and sat next to each other like an indictment. Anthropic merged a 22,000-line pull request, written mostly by Claude, into their offline reinforcement-learning codebase. Shipped clean. Zero post-merge incidents. The same month, Escape Analysis published their audit of 5,600 vibe-coded applications and found 2,000 vulnerabilities, 400 exposed secrets, and 175 production apps serving unauthenticated data to anyone who asked. Lovable left a user data leak open for 48 days before patching it. Bolt was not far behind. Same models. Same tools. The gap between those two outcomes is the whole subject of this edition.
The headline numbers from Veracode's 2025 GenAI Code Security Report are ugly. Forty-five percent of AI-generated code contains detectable vulnerabilities at the moment of generation. The Cloud Security Alliance puts the figure higher at 62 percent. Georgia Tech's CVE tracker logged 35 new CVEs in March 2026 traced directly to AI-generated code. More than double February's count. Six times January's. The trend line is not flattening.
Meanwhile the other side of the ledger keeps ticking up. Claude Opus 4.7 hit 87.6 percent on SWE-bench Verified on 16 April. Cursor 3 shipped on 2 April with agent workflows that keep a whole codebase loaded at once. GPT-5.5 landed on 23 April. The models are pulling away from the median human developer on contained tasks. They are also generating more broken code at scale than any team of humans ever has.
Both things are true at once, which is why executives are confused and security teams are furious. The chief information security officer sees 62 percent of output needing a scrub. The head of engineering sees a 10x productivity bump on greenfield features. Both are looking at the same model. They are measuring different slices of the same output. What separates the companies shipping cleanly from the companies writing incident reports is not the model. It is everything around the model.
Four Principles for Responsible Vibe Coding
Erik Schluntz gave a talk at Code with Claude eleven months ago titled "Vibe Coding in Prod (Responsibly)." Four principles. No jargon.
Principle one: be Claude's PM. Do not type the code. Plan the work, decompose the task, write the ticket you would hand to a junior. Schluntz argued that humans spend ninety percent of their time typing and ten percent planning. Vibe coding inverts that ratio. The typing is cheap now. The planning is where the value moves.
Principle two: target leaf nodes. Pick parts of the codebase that nothing else depends on. Utilities. Scripts. Isolated services. When Claude breaks a leaf node, the blast radius is contained. When Claude changes a core abstraction, the blast radius is everything.
Principle three: verify behaviour, not implementation. Do not read every line of generated code hunting for bugs. Write tests and harnesses that stress the output. Run the thing. Check the outputs. Trust the behaviour envelope, not your ability to spot subtle flaws in code you did not write.
Principle four: embrace exponentials. The models are getting better fast. Build habits that scale with the tooling. Do not optimise your process around a six-month-old model capability.
The Five Patterns
Pattern one: a spotter is non-negotiable for anyone non-technical. Half an hour of developer time across a whole project. One review pass before deployment. That single step catches the Veracode 45 percent almost in full. Skip it and the 48-day window becomes your most likely outcome.
Pattern two: scope selection is the skill, not prompting. Every case that shipped safely started with a narrow problem where the worst-case failure was a wasted afternoon. The teams that collapsed were building user-facing production apps on day one. Pick small. Stay small. Graduate slowly.
Pattern three: verifiability beats review. Reading generated code line by line is how engineers who grew up writing code themselves try to adapt. It does not scale and it misses things. Tests, observable inputs and outputs, stress harnesses. Check the behaviour envelope.
Pattern four: maintenance ownership is assigned before the first line is written. Day 90 is where the silent failures live. Someone leaves. The tool breaks. Nobody remembers who built it or how. Assign the owner on day zero, or you are building something that will fail silently later while everyone celebrates the initial win.
Pattern five: the licence-to-habit bridge requires explicit permission. Companies buy Cursor seats. Seats sit unused. The missing ingredient is not training. It is a named executive saying out loud "you are allowed to use this on real work."
Key Takeaways
- AI models can generate 10x productivity on greenfield features, but 45-62% contains vulnerabilities
- Anthropic's 22K-line merge shows what proper process around AI tools can achieve
- Five patterns separate successful vibe coding from incident reports: spotter, scope selection, verifiability, ownership, and permission
- The model is not the problem; everything around the model is the problem
Why Do I Care
I've been thinking about how to introduce AI coding tools to teams, and this piece crystallized something I've been feeling. The conversation often focuses on which model or tool to use, but the real challenge is the process around it.
The spotter concept is particularly valuable. Half an hour of developer review before deployment seems like a small investment that could catch most issues. It's not about slowing down; it's about having a safety net.
Scope selection is where I see teams get into trouble most often. The excitement about what AI can do leads to building user-facing production apps immediately. Starting small with bounded blast radius seems obvious when stated explicitly, but it's tempting to skip when you're excited about the possibilities.
The permission point hits home. I've seen teams where the tools were available but people weren't sure they were allowed to use them on real work. Sometimes it takes an explicit statement from leadership to create that psychological safety.