AI Guardrails, Non-Human Identities, The Pope's Take on AI, and Why Your WIP Is Killing Your Team

Published on 04.06.2026

TOOLS

How I Built Guardrails That Stopped My AI Agent From Going Rogue

TLDR: A developer shipped an AI agent without guardrails, and it leaked PII, hallucinated answers, and burned $400 in API costs overnight. This is the story of what went wrong and how they fixed it with Python.

Summary: There's a moment every developer building with AI agents eventually hits, and it usually happens at an inconvenient time, like 3am when your cloud bill notification pops up. The author of this piece shipped an autonomous agent without any meaningful constraints in place. The result was a system that kept looping, called APIs far beyond what was needed, leaked personally identifiable information into logs, and generated confident but completely wrong answers to user queries. It's an expensive and embarrassing lesson, but it's one a lot of teams are quietly learning right now.

The fix came in multiple layers. First, rate limiting and cost caps, because an agent that can make unlimited API calls will, eventually, make unlimited API calls. Second, output validation before anything gets written to a log or returned to a user. Third, clear boundaries on what the agent is allowed to do at each step, using something resembling the principle of least privilege applied to autonomous systems. These aren't exotic techniques; they're the same defensive patterns we've been applying to web services for twenty years, just now applied to AI behavior.

What I find interesting here is the framing. The author doesn't treat this as an AI problem. They treat it as an engineering problem, and they're right. An LLM is just a component. You build the guardrails the same way you'd build retry logic or input sanitization. The failure wasn't the model doing something unexpected; it was the absence of the engineering discipline we already know we need. The model behaved exactly as designed. The design just didn't account for the real world.

The piece leans heavily on Python examples (which I'm abstracting here since this is audio), but the mental model transfers to any language and any LLM provider. The core insight is that an agent without constraints is not a feature, it's a liability, and the cost of finding that out in production is much higher than the cost of building the guardrails before you ship.

Key takeaways:

  • AI agents without guardrails will eventually hit edge cases that cause runaway costs, privacy violations, or hallucinated output
  • Defense in depth applies to AI: rate limits, output validation, and least-privilege action scoping all matter
  • The failure mode here is not the model, it's the engineering process that skipped steps we already know are necessary

Why do I care: If you're building anything with autonomous agents right now, read this before your next deploy. The industry is in a phase where people are shipping fast and learning the hard way. A $400 overnight bill is a cheap lesson compared to a GDPR fine or a customer data incident. I want everyone building on top of LLMs to internalize that prompt engineering is not a replacement for system design.

How I Built Guardrails That Stopped My AI Agent From Going Rogue


Why Non-Human Identities Have Become a Critical Security Challenge

TLDR: AI agents, service accounts, and API keys are multiplying faster than security teams can track them, and the non-human attack surface is now one of the most underappreciated risks in modern infrastructure. Credential lifetime is the biggest lever.

Summary: Here's a framing that clicked for me reading this: we've spent years locking down human identity. MFA, SSO, zero-trust, privileged access workstations, the whole stack. And then we turned around and handed out API keys with no expiration dates to CI/CD pipelines, AI agents, microservices, and third-party integrations. The non-human attack surface has grown quietly and fast, and most organizations have no idea how many non-human identities are active in their environment at any given time.

The author, writing from a background in identity and AI, makes a point that deserves more attention: credential lifetime is the single biggest risk factor. A short-lived token that rotates every fifteen minutes is dramatically safer than a static API key that never expires. The math is straightforward. If a long-lived credential leaks, the damage window is the full lifetime of that credential. Short-lived credentials reduce that window to something manageable. It sounds obvious when you say it out loud, but the default configuration for most API integrations is still "give it a key, paste it in a .env file, and move on."

The piece also gets into constrained delegation across agent hops, which is where things get genuinely tricky. When an AI agent calls another service on behalf of a user, the identity chain needs to be auditable and bounded. The agent shouldn't accumulate permissions across hops. It should only carry the permissions needed for the specific task it's performing at that moment. This is the same principle behind Kerberos constrained delegation, but now applied to AI workflows, and most teams building AI pipelines are not thinking about it at all.

There's also a governance angle here. Non-human identity programs require the same rigor as human identity programs: inventory, lifecycle management, access reviews, and revocation procedures. The problem is that tooling for this is still maturing, and the organizational ownership is unclear. Is it the security team's problem? The platform team's? The team that built the agent?

Key takeaways:

  • Non-human identities (AI agents, service accounts, API keys) represent a fast-growing attack surface that most security programs aren't tracking properly
  • Credential lifetime is the most direct lever for reducing risk: short-lived, auto-rotating credentials dramatically limit blast radius if leaked
  • Constrained delegation across AI agent hops needs to be designed explicitly, not assumed

Why do I care: This is exactly the kind of problem that falls between teams. Security says it's a developer concern. Developers say it's an ops concern. Meanwhile, there are thousands of static API keys sitting in configuration files and CI secrets vaults waiting to be discovered. If you're building AI workflows with multiple agent hops, draw out the identity chain before you write the first line of code. You'll thank yourself later.

Why Non-Human Identities Have Become a Critical Security Challenge


The Pope and AI

TLDR: Pope Leo XIV issued an AI encyclical that made global headlines. The author read it carefully, then put its core message directly to ChatGPT, Grok, and DeepSeek to see how AI systems respond to being told they should serve human dignity.

Summary: I'll admit the headline "The Pope and AI" made me expect something hand-wavy, but this piece is more substantive than I anticipated. Pope Leo XIV apparently chose his name in part as a reference to Leo XIII, who wrote Rerum Novarum in 1891, the encyclical that grappled with the human cost of industrialization and labor exploitation. The parallel is intentional. Leo XIV's encyclical on AI draws a direct line from the industrial revolution's disruption of human work and dignity to what AI is doing right now.

The author opens with a Samuel Butler quote from 1863: "Day by day the machines are gaining ground upon us; day by day we are becoming more subservient to them." That quote is 163 years old. The concerns are not. The Pope's encyclical, from what the author describes, is not a condemnation of AI but a call for AI development to be oriented around human flourishing, not just efficiency or profit. It's a distinction that sounds simple but gets harder to operationalize the deeper you go into product decisions.

The most interesting part of the piece is the experiment: the author put the Pope's message directly to three AI systems and asked them to respond. The results are not reproduced in detail in the newsletter summary, but the framing is that each model's response reveals something about how it has been trained to engage with questions of ethics, authority, and self-reflection. Whether any of them actually "agree" with the Pope depends entirely on what you mean by agreement for a system that doesn't hold beliefs.

I have some skepticism about the methodology here. Asking a language model whether it agrees with an ethical position is a bit like asking a very well-read parrot. The model will produce a coherent and likely affirming response because that's what the training distribution rewards. That doesn't mean the values are instantiated. But as a provocation for thinking about AI alignment and whose values get embedded in these systems, it's worthwhile.

Key takeaways:

  • Pope Leo XIV's AI encyclical draws a deliberate historical parallel to Leo XIII's response to industrialization, arguing AI development must center human dignity
  • The question of whose values get encoded in AI systems is real and consequential, not just philosophical
  • AI models can produce responses that sound ethically considered without those responses reflecting anything like genuine values or commitments

Why do I care: I think the tech industry badly underestimates how much the framing of AI as a purely technical problem leaves out. Questions about human dignity, economic disruption, and whose interests are served by AI systems are not soft questions. They're the questions that will determine whether we build things worth building. The Pope entering the conversation is, frankly, a signal that we've waited too long to have these discussions in public.

The Pope and AI


Your Team Isn't the Problem, But Your WIP Is

TLDR: A team with 20 open pull requests is measurably slower than a team with 3, and the evidence for this is not anecdotal but mathematical. Work-in-progress limits are not a process nicety, they're an engineering discipline.

Summary: This one is going to be uncomfortable for a lot of engineering managers, because it names the thing that everyone sees but no one wants to say directly. The problem isn't your engineers. It's not the architecture. It's not even the tech debt, at least not primarily. The problem is that you have too many things in flight at once, and each additional open item adds coordination overhead, context switching cost, and review latency that compounds across the whole team.

The author makes the point with some directness that I appreciate: a team with 20 open PRs is not "feels slower" or "probably slower" than a team with 3. It is slower. This is a consequence of queuing theory. The more work you have in a partially-complete state, the more cognitive load each team member carries, the longer reviews take (because there's always something more urgent to do), and the higher the probability of merge conflicts and integration problems. None of this is controversial in software engineering theory. We've had the data from lean manufacturing for decades. And yet most engineering teams I've seen operate with WIP limits that are effectively infinite.

The fix sounds simple because it is: limit the number of things a team is working on simultaneously. Finish what you start before starting something new. This means saying no to new features while existing work is in review. It means senior engineers spending time reviewing and unblocking rather than starting new work. It means product managers and engineering managers accepting that "starting more" does not equal "delivering more." That last part is where it breaks down in practice, because the incentives often reward starting over finishing.

What makes this piece worth reading is that it doesn't just describe the problem; it gives you the language to have the conversation with stakeholders. The mathematical framing is useful precisely because it takes the argument out of the realm of preference and puts it in the realm of fact.

Key takeaways:

  • High work-in-progress levels are a direct, measurable cause of team slowness, not just a contributing factor
  • Queuing theory explains why: each item in flight adds coordination overhead that compounds across the team
  • Fixing this requires organizational commitment to saying no to new work until existing work is done, which is harder than it sounds

Why do I care: I've watched teams add engineers and get slower because the WIP limit never moved. More people, more branches, more review queues, more integration conflicts. If you're managing an engineering team and you're frustrated by velocity, before you reorganize or hire or change your tech stack, count your open PRs. The answer might be right there.

Your Team Isn't the Problem, But Your WIP Is


While We Were Arguing About Architecture, AI Created New Billionaires

TLDR: While the tech industry debated microservices versus monoliths and tabs versus spaces, a concentrated group of companies and individuals captured most of the economic value from the AI boom. The distribution is more concentrated than most people realize.

Summary: This piece takes a step back from the technical discussion and asks a question that's worth sitting with: who actually made money from AI? The answer is familiar in its structure, if not in its specifics. Nvidia, obviously, because you can't train large models without the hardware. OpenAI and its investors, including Microsoft. A handful of chip startups that positioned themselves in the right place at the right time. Data-labeling companies, which is an interesting and underreported part of the story. And coding tool vendors, where the competition is fierce but the market has grown fast enough to support multiple winners.

The author, writing under the handle @notllmhallucination (which I appreciate), is making a point about how technological transitions concentrate wealth. The internet boom created new billionaires, but it also created conditions where existing billionaires got much richer and the distribution of gains was not what the "democratizing technology" narrative promised. The AI boom is following a recognizable pattern.

What's worth noting is the data-labeling angle. The invisible labor that makes AI systems work, the humans who labeled training data, often in developing countries, at low wages, under difficult conditions, is not a footnote. It's the foundation. The people who did that work are not in the billionaire column. That gap between who built the value and who captured it is not unique to AI, but the scale of it here is worth being honest about.

I don't think this piece is arguing against AI development. It's arguing for clear eyes about the economics. If you're building on AI infrastructure, understanding who controls that infrastructure and what the dependency looks like is just good architectural thinking.

Key takeaways:

  • The economic gains from the AI boom have concentrated in a small number of companies and individuals, following a pattern common to previous technology transitions
  • Data labeling, the often-invisible human labor underpinning AI systems, represents a significant gap between value creation and value capture
  • Understanding the economic structure of AI infrastructure matters for anyone building products that depend on it

Why do I care: The "AI will democratize everything" narrative is doing a lot of work right now, and I think it deserves more scrutiny than it's getting. You can be genuinely enthusiastic about what AI makes possible while also being honest that the current economics look a lot like every other technology wave: the picks-and-shovels companies win, the infrastructure owners win, and the people who actually use the tools are left to compete in a market with higher margins at the top.

While We Were Arguing About Architecture, AI Created New Billionaires Made Old Ones Even Richer