Published on 25.02.2026
TLDR: AI agents can process everything and still make decisions a junior employee would avoid. The author proposes "judgment architecture" as a new discipline: extracting your team's unwritten business rules and encoding them into machine-actionable parameters so AI systems learn not just what to do, but how to weigh trade-offs.
The article opens with a viscerally relatable example. The author is building an open-source AI executive assistant called Claudia. She was handling follow-up emails for interview scheduling — not cold outreach, but internal employees who had already agreed to participate. First follow-up, perfect. Second, solid. Third? She wrote a full paragraph re-explaining everything from scratch. Every detail technically correct. Exactly the kind of email that gets archived without reading. The AI had the context to know this was a third touch. It could not make the judgment call that any experienced professional makes instinctively: the more you follow up, the shorter the message. This is not a prompt engineering problem. It is not a context window problem. It is a judgment problem, and we do not have a discipline for it yet.
The author then reaches for a more consequential example. Air Canada's chatbot promised a passenger a bereavement discount that did not exist. A Canadian tribunal held the airline liable, rejecting the defense that the bot was a "separate legal entity." Nobody had encoded a simple escalation rule: high-risk promises about money or legal policy must go to a human. A hallucination became legal exposure. Customer service bots across industries optimize for deflection rate — keeping users away from human agents — instead of resolution quality. They treat escalation as failure when it should be a success condition the moment confidence drops. These systems are not stupid. They are making thousands of judgment calls per day with zero judgment architecture telling them which trade-offs actually matter.
Here is what I find genuinely interesting — and what the author dances around without fully confronting. The concept of "judgment architecture" is essentially knowledge engineering rebranded for the LLM era. Expert systems in the 1980s and 1990s tried to capture exactly this: the tacit knowledge of domain experts encoded into rules. They failed not because the idea was bad, but because the extraction process was brutally expensive and the rules were brittle. The author's approach of "meditation sessions" where the AI reviews patterns in how the human overrides its suggestions is actually a more elegant extraction mechanism than the old-school knowledge engineer with a clipboard. But the hard question remains unanswered: does this scale beyond a single user's preferences? When you have fifty people on a team, whose judgment do you encode? The senior person who is right 90% of the time, or the consensus that is right 70% of the time? The author is building for a solo practitioner use case and generalizing to enterprise without addressing the politics of whose unwritten rules become the encoded ones.
For architects and teams evaluating AI agent deployments, the practical takeaway here is real even if the framing is somewhat self-promotional. The gap between "what the AI optimizes for" and "what your top performer optimizes for" is where your risk lives. If you are deploying autonomous agents that run for days without human review, you need to ask yourself: what are the escalation conditions? What trade-offs does the system make by default, and are those the trade-offs you would make? The three pillars the author promises to detail later — objective translation, decision limits, and alignment feedback loops — map roughly to what responsible AI frameworks already call goal specification, guardrails, and RLHF. The new packaging might be useful for getting buy-in from business stakeholders who glaze over at technical terminology, but do not mistake new naming for new thinking.
The honest gap in this piece is that it frames judgment architecture as something you build once and iterate on, when in reality organizational judgment is contextual, contradictory, and political. Your best salesperson's judgment about when to discount conflicts with your CFO's judgment about margin protection. Teaching AI to judge means choosing whose judgment wins, and that is an organizational design problem, not an engineering one. The author is right that this matters enormously. The question is whether the solution is as clean as a /meditate command.