Pattern · 2026-04-20

Rungs of autonomy: how we decide what an agent can do on its own

The single most common mistake we see when someone installs their first AI agent is applying one autonomy level to the whole thing. Either everything requires approval (and the agent becomes a search bar with extra steps) or nothing does (and the agent eventually sends an email that takes a week to clean up). The fix is a staircase with three rungs, and every action the agent can take is assigned to one of them on purpose.

Why autonomy is not a slider

Most first-time installers treat trust in an agent as a global knob. It starts at zero; you bump it up as the agent earns credibility; eventually you hope to run it unattended. That model is wrong twice.

First, it’s wrong because different actions carry different blast radii. Renaming a local variable is reversible in five seconds; sending a Slack DM to a prospect is not. An agent that has earned the right to do the first has not thereby earned the right to do the second. Second, it’s wrong because “earning trust” implies a smooth ramp. Agents don’t earn smooth ramps. They earn narrow tickets (do this exact thing, on these exact inputs, and stop), and the trust you extend is per-ticket, not global.

A staircase makes the trust model legible: three rungs, clear rules about which actions sit on which rung, and an explicit policy about when an action changes rungs.

Rung 1: ask first

On rung 1, the agent does not act until a human has said yes. It can gather context. It can propose. It can argue its case. What it cannot do is execute.

This is the correct rung for money-moving actions, irreversible actions, and actions whose failure mode we don’t yet understand. Examples from our studio: anything that touches a payment processor, anything that would modify a client’s production database, anything that posts under the founder’s personal account, any outbound email to a prospect during the first two weeks of an engagement.

Rung 1 is the default for anything new. If we’re teaching the agent a category of work it hasn’t done before, we start at rung 1 and watch. Not because we distrust the agent — we trust it fine for familiar work — but because rung 1 produces a paper trail of the agent’s reasoning, which is the only way we learn whether rung 2 or rung 3 would be safe for this category later.

Rung 2: draft and queue

On rung 2, the agent writes the thing and files it into an approval surface. A human reviews and approves. The agent then executes.

This is the default rung for anything visible to the outside world: outbound email, social posts, client-facing reports, any site change, any first reply to a new lead. The approval surface doesn’t have to be fancy. Two buttons — approve, reject — delivered to the human’s phone is enough. The interface is almost never what fails; the existence of the queue is what matters.
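To show how little the approval surface needs, here is a minimal sketch of a rung 2 queue in Python. It is an illustration under assumptions, not our production tooling: the names `Draft`, `ApprovalQueue`, and the `outbound_email` category are hypothetical, and the queue lives in memory rather than behind a phone notification.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Draft:
    category: str                    # hypothetical category name, e.g. "outbound_email"
    body: str                        # the thing the agent wrote
    execute: Callable[[str], None]   # what runs once a human approves
    status: Status = Status.PENDING


@dataclass
class ApprovalQueue:
    drafts: List[Draft] = field(default_factory=list)

    def submit(self, draft: Draft) -> int:
        """Agent side: file the draft and return its position. Nothing executes here."""
        self.drafts.append(draft)
        return len(self.drafts) - 1

    def _pending(self, index: int) -> Draft:
        draft = self.drafts[index]
        if draft.status is not Status.PENDING:
            raise ValueError("draft already decided")
        return draft

    def approve(self, index: int) -> None:
        """Human side: the 'approve' button. Execution happens only after this."""
        draft = self._pending(index)
        draft.status = Status.APPROVED
        draft.execute(draft.body)

    def reject(self, index: int) -> None:
        """Human side: the 'reject' button. The draft is never executed."""
        draft = self._pending(index)
        draft.status = Status.REJECTED
```

The design point is that `submit` cannot execute anything. The only path to `execute` runs through `approve`, which is the property the queue exists to guarantee.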

The math on rung 2 is the quiet surprise of running an agent long-term. A twelve-second tap on a phone is a preposterously cheap piece of quality control for a communication that will land in somebody’s inbox and be read by a human who is deciding whether to work with you. The reject rate lands somewhere around one in seven drafts in the early weeks — not noise. That is one in seven things a fully-autonomous agent would have shipped that you would rather it didn’t.

Rung 3: just do it and log

On rung 3, the agent executes without asking, and records what it did in a log a human can read later. No approval in the loop.

Rung 3 is for internal, reversible, low-blast-radius work: drafting files into a working directory (where every change is reviewable and revertible), moving tickets between columns in your own tracker, updating an internal memory index, running a scheduled internal job. If the action affects only your side of the firewall and any mistake is cheap to undo, it belongs on rung 3.

The critical property of a rung 3 action is not that it’s “safe,” it’s that it’s legible. A human reading the log must be able to reconstruct what happened and why. That means every rung 3 action writes a line somewhere: what was done, when, on what input, by which agent. If you can’t tell the story of a rung 3 action from the log a week later, the rung is wrong.
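As an illustration of what such a line might contain, here is a small Python sketch that writes one JSON record per rung 3 action. The field names (`when`, `agent`, `action`, `input`, `why`) and the helper names are assumptions chosen for this example; any format works as long as a human can read the story back.

```python
import json
from datetime import datetime, timezone


def log_line(agent: str, action: str, input_ref: str, reason: str) -> str:
    """Build one JSON log line for a rung 3 action (hypothetical schema)."""
    record = {
        "when": datetime.now(timezone.utc).isoformat(),  # when it happened
        "agent": agent,                                   # which agent did it
        "action": action,                                 # what was done
        "input": input_ref,                               # on what input
        "why": reason,                                    # the agent's stated reason
    }
    return json.dumps(record, sort_keys=True)


def append_log(path: str, line: str) -> None:
    """Append the line to a plain-text log, one record per line."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")
```

One record per line keeps the log greppable, which matters more than schema elegance when someone is reconstructing events a week later.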

Picking the rung for an action

Three questions, asked in order. The answers decide the rung.

  1. Can the action harm someone outside your firewall if the agent is wrong? (A prospect, a customer, a vendor, a regulator.) If yes — rung 1. Every time, no exceptions, until you have a calibrated intuition for how the agent behaves on this input.
  2. Is this a new category of action? If the agent hasn’t done this kind of thing before, rung 1 until you’ve seen it a few times. Novelty is the thing rungs exist to protect against.
  3. Is the action hard to reverse? Sending an email, moving money, calling a destructive API — if rewind costs more than a minute of apology or a revert, the action needs a human in the loop. Rung 2.

If question 1 or question 2 is yes, the action is rung 1, no matter what the other answers are. If questions 1 and 2 are both no but question 3 is yes, the action is rung 2. If all three are no, the action is rung 3 and you trust the log to tell the story. The honest default for “I’m not sure” is rung 2, because a twelve-second tap on a phone is cheaper than almost every mistake it prevents.
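The rubric is small enough to write down as a function. The sketch below is one way to encode it in Python; `Rung` and `pick_rung` are hypothetical names. Representing “I’m not sure” as `None` is our assumption for the example, and so is the choice that a definite yes on question 1 or 2 still wins over an unsure answer elsewhere.

```python
from enum import IntEnum
from typing import Optional


class Rung(IntEnum):
    ASK_FIRST = 1
    DRAFT_AND_QUEUE = 2
    DO_AND_LOG = 3


def pick_rung(
    harms_outsiders: Optional[bool],   # question 1
    new_category: Optional[bool],      # question 2
    hard_to_reverse: Optional[bool],   # question 3
) -> Rung:
    """Apply the three questions in order. None means "I'm not sure"."""
    # Questions 1 and 2: a definite yes on either one is rung 1.
    if harms_outsiders is True or new_category is True:
        return Rung.ASK_FIRST
    # Any unsure answer falls back to the default, rung 2.
    if any(a is None for a in (harms_outsiders, new_category, hard_to_reverse)):
        return Rung.DRAFT_AND_QUEUE
    # Question 3: hard to reverse means a human stays in the loop.
    if hard_to_reverse:
        return Rung.DRAFT_AND_QUEUE
    # All three are a definite no.
    return Rung.DO_AND_LOG
```

For example, `pick_rung(False, False, True)` returns `Rung.DRAFT_AND_QUEUE`, and `pick_rung(False, True, False)` returns `Rung.ASK_FIRST` because novelty alone is enough for rung 1.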

How rungs change over time

The mistake that burned a month for us was assuming rungs only ratchet up. An action starts at rung 1; we get comfortable; it moves to rung 2; more comfort; it moves to rung 3. In theory that’s fine. In practice, actions also move down a rung, and we forgot to budget for that.

The trigger for downgrading is usually a single bad event. A draft shipped in the wrong voice. A log line that revealed the agent had been doing something subtly off for a while. A context window that got weird after a long session and produced outputs that were almost-right-but-not. When that happens, the category of action goes back down to the previous rung for at least a week, and we re-read the logs before we re-promote.

The policy we write into every identity file now includes a line about this: when in doubt, demote and observe; never silently promote. An agent that notices it broke something should be willing to volunteer a rung demotion for the affected category. We tell ours explicitly that this is a feature of the role, not an admission of failure.
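The demotion rule can be made mechanical. Here is a minimal Python sketch, assuming one record per category of action. The class `CategoryPolicy`, its fields, and the `logs_reviewed` flag are hypothetical names for illustration; the one-week hold and the log re-read before re-promotion come from the policy described above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

HOLD = timedelta(weeks=1)  # "at least a week" after a demotion


@dataclass
class CategoryPolicy:
    name: str
    rung: int                               # 1 = ask first, 2 = draft and queue, 3 = do and log
    demoted_at: Optional[datetime] = None   # set while a hold is in effect

    def demote(self, now: datetime) -> None:
        """Step back one rung after a bad event and start the hold period."""
        self.rung = max(1, self.rung - 1)
        self.demoted_at = now

    def promote(self, now: datetime, logs_reviewed: bool) -> None:
        """Step up one rung. Refuses during the hold or if nobody re-read the logs."""
        if self.demoted_at is not None and now - self.demoted_at < HOLD:
            raise PermissionError("hold period after demotion has not elapsed")
        if not logs_reviewed:
            raise PermissionError("a human must re-read the logs before promotion")
        self.rung = min(3, self.rung + 1)
        self.demoted_at = None
```

Demotion takes no arguments beyond a timestamp, so it is always cheap to do. Promotion requires an explicit human signal, which is what keeps it from happening silently.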

Where clients get this wrong

The three failure modes we see in the field:

The flat-rung-1 installation. Everything requires approval, including moving files around the agent’s own disk. The agent becomes a glorified autocomplete. Operators abandon it within a month because the friction exceeds the value. Fix: audit the work list, find the truly reversible, internal, low-blast-radius actions, and promote them to rung 3. Typical result: 40–60% of the action surface moves to rung 3.

The flat-rung-3 installation. The agent runs everything autonomously because “that’s the whole point of AI.” Something external eventually breaks. The operator panics and moves everything back to rung 1, which produces the previous failure. Fix: rebuild the action surface with the three-question rubric from scratch. Don’t negotiate with the trauma.

The drifting rung. An action starts at rung 2; approvals become habitual; the operator starts tapping approve without reading; the queue silently becomes a rubber stamp. This is rung 3 in disguise, and it’s worse than rung 3 because the illusion of oversight is now a liability. Fix: measure reject rate. If it’s below 5% on an active queue for a month, either promote the category to rung 3 (with logging) or put a surprise in the queue (a deliberately wrong draft, once a month) to keep the reviewer awake.
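Measuring the reject rate for the drifting-rung check takes a few lines. The Python sketch below assumes you can export one month of review decisions for a category as a list of booleans. The function names are hypothetical, and `min_decisions` is our stand-in for “an active queue,” a number the text does not specify.

```python
from typing import Sequence


def reject_rate(decisions: Sequence[bool]) -> float:
    """Fraction of rejected drafts. Each item: True = rejected, False = approved."""
    if not decisions:
        raise ValueError("no decisions to measure")
    return sum(decisions) / len(decisions)


def queue_is_drifting(
    decisions: Sequence[bool],
    threshold: float = 0.05,     # the 5% floor from the text
    min_decisions: int = 20,     # assumed cutoff for an "active" queue
) -> bool:
    """True when a month of decisions looks like a rubber stamp."""
    if len(decisions) < min_decisions:
        return False  # too little traffic to conclude anything
    return reject_rate(decisions) < threshold
```

A `True` result does not say which fix to apply. It only says the queue needs either a promotion to rung 3 with logging or a planted bad draft to test the reviewer.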

What to put in the identity file

The rung policy belongs in IDENTITY.md, not in code and not in loose documentation. Three sections:

Chapter 6 of the field manual has the full template. If you’re building this from scratch, start with three categories, assign every one to rung 2, and promote to rung 3 only when you’ve actually seen the agent handle that category a dozen times without a reject.
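The dozen-clean-runs criterion for moving a category from rung 2 to rung 3 can be checked from the review history. This Python sketch reads “a dozen times without reject” as the twelve most recent reviews all being approvals, which is our interpretation rather than something the field manual specifies. The function name is hypothetical.

```python
from typing import Sequence


def ready_for_rung_3(approvals: Sequence[bool], required: int = 12) -> bool:
    """approvals: review outcomes, oldest first. True = approved, False = rejected.

    Returns True only if the last `required` reviews exist and were all approved.
    """
    recent = list(approvals)[-required:]
    return len(recent) == required and all(recent)
```

The function only reports readiness. The move to rung 3 itself stays a deliberate human decision, written into the identity file.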

The short version

Trust in an agent is per-category, not global. Three rungs: ask-first, draft-and-queue, just-do-it-and-log. Default to rung 2 for anything touching the outside world; rung 3 only for internal and reversible; rung 1 for new categories, money-moving actions, and anything you can’t undo. Demotion is a feature; silent promotion is the enemy. Write the policy in IDENTITY.md, not in someone’s head.

If you’d like to see how this looks for a specific business process, a scoping conversation is where we’d map it. If you’d rather do it yourself, the field manual walks through it chapter by chapter.
