Human-in-the-Loop: Why Full Automation Is the Wrong Goal
The best AI systems aren't autonomous. They're collaborative. Full automation is a seductive goal that breaks on contact with reality — the winning pattern is bounded autonomy.
In February 2024, a British Columbia tribunal ruled that Air Canada was liable for advice its customer service chatbot had invented. Jake Moffatt, grieving a family member’s death, asked the chatbot about bereavement fares. The chatbot confidently described a policy that didn’t exist — book now, apply for a discount within ninety days. The actual policy required applications before booking. Moffatt followed the chatbot’s advice, booked the flights, applied for the refund, and was denied. The tribunal’s ruling was blunt: the chatbot is part of Air Canada’s website, and Air Canada is responsible for the information on its website. The chatbot didn’t know it was wrong. It never does.
The instinct, reading a story like that, is to add more guardrails to the model. Better prompts. Retrieval-augmented generation. Fine-tuning. And those help — but they miss the deeper problem. The chatbot was autonomous. It could answer questions, make promises, and shape customer expectations without any human ever seeing what it said. The failure wasn’t in the model’s accuracy. It was in the architecture’s assumption that accuracy would be enough.
The autonomy paradox
Anthropic published research in February 2026 — “Measuring AI agent autonomy in practice” — analysing nearly a million tool calls across Claude Code and their API.
New users auto-approve about 20% of agent actions. By 750 sessions, that figure exceeds 40%. Trust grows with experience. But here’s the part that matters: experienced users don’t just approve more. They interrupt more. New users review every step before it happens. Experienced users let the agent run and intervene when something feels wrong. They’ve shifted from gatekeeping to monitoring — from checking every line to watching the trajectory.
That’s not less oversight. It’s better oversight. And it’s the pattern that most teams building with AI agents get backwards.
The default assumption is a binary: either a human reviews everything (slow, doesn’t scale) or the agent runs autonomously (fast, occasionally catastrophic). The teams that actually ship reliable agent systems reject that binary entirely. They build for graduated autonomy — tight oversight at the start, loosening as trust is earned, but never reaching zero.
Five levels, one principle
The spectrum I use — developed for a workshop I'll run in June with Stefan Angelov, an engineering leader I've worked with across multiple projects — looks like this:
Audit only. The agent acts. You log everything. Review happens after the fact, if at all. This is for actions that are fully reversible and low-risk — formatting, linting, boilerplate generation.
Notify after. The agent acts and tells you what it did. You review asynchronously. Non-critical updates, draft creation, routine data enrichment.
Confirm before. The agent proposes. You approve or reject before it executes. Configuration changes, data modifications — anything partially reversible where the cost of getting it wrong exceeds the cost of waiting.
Approve required. The agent cannot proceed without explicit human approval. Deployments, financial operations, anything touching production state. The approval request includes context: what, why, estimated impact, alternatives.
Human executes. The agent advises, the human acts. Production database migrations, security-critical operations, regulatory submissions. The agent provides the plan; the human holds the keyboard.
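One way to make the five levels operational is a policy table that maps action categories to oversight levels, defaulting unknown actions to the strictest tier. A minimal sketch — the category names and their assignments below are illustrative, not taken from any particular system:

```python
from enum import Enum, auto

class Oversight(Enum):
    AUDIT_ONLY = auto()        # act, log, review after the fact (if at all)
    NOTIFY_AFTER = auto()      # act, then report for asynchronous review
    CONFIRM_BEFORE = auto()    # propose, wait for approve/reject
    APPROVE_REQUIRED = auto()  # block until explicit human approval
    HUMAN_EXECUTES = auto()    # agent advises, human holds the keyboard

# Illustrative policy table: action category -> required oversight.
POLICY = {
    "format_code": Oversight.AUDIT_ONLY,
    "create_draft": Oversight.NOTIFY_AFTER,
    "modify_config": Oversight.CONFIRM_BEFORE,
    "deploy_service": Oversight.APPROVE_REQUIRED,
    "migrate_prod_db": Oversight.HUMAN_EXECUTES,
}

def required_oversight(action: str) -> Oversight:
    # Unknown actions fall back to the strictest level, never the loosest.
    return POLICY.get(action, Oversight.HUMAN_EXECUTES)
```

The important design choice is the fallback: an action the policy has never seen gets maximum oversight by default, so loosening is always a deliberate decision rather than an accident of omission.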
The principle underneath is simple: the severity of oversight should match the criticality of the action. Nuclear power plants work the same way — routine monitoring is automated, but a reactor shutdown requires confirmation from two human operators. Nobody argues that's inefficient. The stakes demand it. A CSS change gets audit-only. A production database migration gets explicit approval. An emergency trading halt gets dual approval from live operators. This is the same principle as tiered change management — the response is proportional to what's at stake.
Where it goes wrong
Three failure modes show up consistently.
The rubber-stamp trap. You build approval flows. Nobody reads them. In the first iteration of a multi-agent system I built, approval messages said things like “Agent wants to modify config.” No context. No explanation of what the config change would do. Approvers clicked “approve” because the request gave them nothing to evaluate. The fix isn’t removing approvals — it’s making them useful. Every approval request needs four things: what the agent wants to do, why, what the impact will be, and what the alternatives are.
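Those four things can be made a structural requirement rather than a convention: if the request type cannot be constructed without them, "Agent wants to modify config" is impossible to send. A sketch with hypothetical field names:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ApprovalRequest:
    """An approval request the approver can actually evaluate."""
    what: str                # the concrete action, stated precisely
    why: str                 # the agent's reasoning for proposing it
    impact: str              # estimated blast radius, and reversibility
    alternatives: list[str]  # options the agent considered and rejected

    def render(self) -> str:
        alts = "\n".join(f"  - {a}" for a in self.alternatives) or "  (none)"
        return (f"WHAT: {self.what}\nWHY: {self.why}\n"
                f"IMPACT: {self.impact}\nALTERNATIVES:\n{alts}")

# "Agent wants to modify config" becomes something a human can judge:
req = ApprovalRequest(
    what="Raise db pool max_connections from 100 to 500",
    why="Connection exhaustion during peak traffic in the last three deploys",
    impact="Higher DB memory use; reversible by restoring the previous value",
    alternatives=["Add a read replica", "Queue requests at the app layer"],
)
```

The example values are invented; the point is the shape — an approver reading `render()` output has enough to say no, which is the only kind of approval worth having.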
The bottleneck trap. Kief Morris, writing on Fowler’s site in “Humans and Agents in Software Engineering Loops,” describes nested loops in software engineering — what I’d summarise as the “why loop” and the “how loop.” The why loop — what to build, whether it’s working, what to change — is fundamentally human. The how loop — writing code, running tests, deploying — is where agents excel. When humans insert themselves as gatekeepers at the innermost code-generation loop, reviewing every line the agent produces, they become a bottleneck. The agent generates code faster than the human can inspect it. Productivity gains evaporate. The better model: humans at the outer loops — setting direction, reviewing phase transitions, evaluating outcomes. Agents at the inner loops — implementation, testing, iteration.
The timeout trap. You set five-minute timeouts on approvals because you don’t want the system to block forever. At 3 a.m., during a production incident, the timeout expires. The agent takes its default action — which, in your case, makes things worse. The lesson is expensive and simple: for critical actions, no timeout. Block until a human responds. Escalate aggressively — if the primary approver doesn’t respond in two minutes, page the secondary. If the secondary doesn’t respond, page the on-call manager. But never let the system decide on its own that silence means consent.
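For critical actions, the waiting loop itself can encode the rule: silence triggers escalation, never a default action. A sketch, assuming hypothetical `page` and `check_response` callables supplied by the surrounding system:

```python
import time

# Hypothetical escalation chain: (who to page, seconds to wait before
# escalating to the next tier). The final tier waits indefinitely.
ESCALATION = [
    ("primary-approver", 120),
    ("secondary-approver", 120),
    ("on-call-manager", None),  # None = no deadline, block until answered
]

def wait_for_approval(request_id: str, page, check_response) -> bool:
    """Block until a human approves or rejects. Silence never means consent."""
    for approver, window in ESCALATION:
        page(approver, request_id)
        deadline = None if window is None else time.monotonic() + window
        while deadline is None or time.monotonic() < deadline:
            decision = check_response(request_id)  # None until someone answers
            if decision is not None:
                return decision  # True = approved, False = rejected
            time.sleep(5)
        # Window expired with nobody answering: escalate to the next tier.
        # Crucially, no branch here ever takes a default action.
    raise AssertionError("unreachable: the final tier has no deadline")
```

The structural property worth noticing: there is no code path from "nobody responded" to "the agent acts". The only exits are an explicit human decision or a wider page.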
Bounded autonomy in practice
The workflow that survived in the multi-agent system I described in an earlier post wasn’t “human reviews every agent action.” It was humans approving phase transitions. A Business Analyst agent challenges assumptions. A PM structures the findings. A human reviews the PRD before anyone designs anything. An Architect proposes the system. A Product Owner pushes back. A human approves the design before anyone writes code. Engineers implement in parallel. A human reviews the output and outcomes before anything ships.
The agents have full autonomy within each phase. The humans own the gates between phases. That’s not micromanagement — it’s architecture. And it maps directly to how some of the best engineering teams already work, with or without AI.
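The gate-between-phases structure can be sketched in a few lines — agents run freely inside a phase, and every transition passes through a human. The phase names and callables below are illustrative, not the actual system:

```python
# Illustrative phase-gate pipeline: full agent autonomy within a phase,
# human approval required at every transition between phases.
PHASES = ["discovery", "prd", "design", "implementation", "ship"]

def run_pipeline(run_phase, human_approves):
    """run_phase(name) -> artifact; human_approves(name, artifact) -> bool."""
    artifact = None
    for phase in PHASES:
        artifact = run_phase(phase)              # agents work unsupervised here
        if not human_approves(phase, artifact):  # humans own this gate
            return ("halted", phase)             # a rejected gate stops the run
    return ("shipped", artifact)
```

The inversion relative to line-by-line review is visible in the shape of the loop: the human appears once per phase, at the boundary, not once per agent action.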
In a trading system I’ve worked on, the approval tiers are explicit: emergency commands like a trading halt require two live approvers. Most configuration changes require one. Routine queries require none. The levels aren’t arbitrary — they’re calibrated to the blast radius of each action. The same principle applies whether the actor is an AI agent or a junior engineer. Risk determines oversight. Trust determines how much friction the oversight introduces.
The 2025 DORA report frames AI’s role precisely: it’s an amplifier. Strong teams with mature practices use AI to become faster and more reliable. Teams without those foundations use AI to generate more technical debt, more quickly. McKinsey’s research on agent governance tells a similar story — 80% of organisations deploying AI agents have already encountered risky agent behaviour. Their advice amounts to designing for trust first and speed second. Bounded autonomy with clear escalation paths isn’t a constraint on productivity. It’s what makes productivity possible at all.
The goal (at least our goal) was never full automation. Full automation is what you aim for when you haven’t thought carefully about where human judgment actually adds value. The goal is the right level of autonomy — agents handling what they’re good at, humans deciding what matters — with the boundary between them defined by risk, not by convenience.
Draw the line by risk. Earn trust gradually. And, I repeat, never let the system decide that silence means consent.