Loops
Let me start with the word itself, because everything on this page hangs off it. A loop is the round trip between you and a machine working for you: you say what you want, the agent produces something, you check it, and what you learned feeds the next ask. If you have ever rephrased a prompt because the first answer missed the point, you have closed a loop. The size of a loop is how much work happens between two of your looks: a paragraph, a feature, a week of output. Everything below, the three zones, the sizing criteria, the closure question, is about choosing that size deliberately instead of inheriting it from habit.
None of this is new, and I would rather say so myself than have someone say it for me. The feedback loop is the beating heart of every iterative method we have practiced since the Agile Manifesto put it in writing in 2001: build something small, look at it, adjust, go again. A sprint is a loop. Continuous delivery is the same loop run tighter. Build, measure, learn is the same shape with a different label. The loop is old. What changed is who works inside it and how fast one person can open one. When the worker is a machine you summon on demand, you choose loop size per task, where your team’s calendar used to decide it for you. The far end of that range, days of unattended agent work, is the part we genuinely have not done before.
The question worth taking seriously here is the one this whole framework circles: if the tight loop you probably ran this morning (ask, read, wince, rephrase) works so well, how far can it stretch? Tight is the one mode everyone knows from their own hands. The other two zones are the same loop with more line paid out, and every extra meter changes what has to be true around the work.
Three zones, none of them a ladder
Tight runs in minutes. You are embedded in the cycle, reviewing every turn, checking hypotheses as they form, working in something close to pair-coding mode. The cognitive load is high and the control is maximal, and that trade is exactly right for a whole class of work: high ambiguity, high risk, strategizing, legacy brownfield where the documentation lies, problems nobody has seen before, and situations where you are the one who needs to learn something from the contact with the material.
Elastic runs in hours. You hand over a bounded chunk of work, structured delegation with guardrails and checkpoints, and you come back at agreed points instead of every turn. The work that belongs here is understood enough to hand over, but not safe enough to ignore. (That sentence carries more weight than it looks; most delegation failures I see come from misjudging one of its two halves.)
Loose runs in days. Agents or agent teams work asynchronously under policy constraints, and the human reviews outcomes rather than turns.
I want to be blunt about a misreading the three zones invite: tight is not a beginner mode you graduate out of, and loose is not a badge of being advanced. A senior engineer who keeps a highly regulated change in a tight loop is sizing correctly! A team that pushes everything loose because it feels like progress is about to discover why the criteria below exist. The only question the zones answer is how much loop this particular task can carry.
Are your loop sizes choices or accidents?
People keep collapsing two questions into one, so let me keep them apart.
1. What state are the loops you already run in?
Four strategic questions get at it:
- How volatile is the domain?
- How fresh is the context the agents work from?
- Where does trust actually live in the organization?
- And which metrics would catch drift fast enough to matter?
Ask these about a team, and you learn whether its current loop sizes are choices or accidents.
2. What loop size fits the task in front of you?
For that I use seven sizing criteria.
- Ambiguity: unclear intent pulls the loop tighter.
- Risk and blast radius: the bigger the damage a wrong move can do, the tighter the loop and the stronger the gates.
- Context freshness: undocumented or volatile territory pulls tighter.
- Verification quality: strong tests, evals, and scenarios are what permit looser loops in the first place.
- Reversibility: cheap rollback buys you elasticity.
- Learning goal: if a human needs to understand this deeply afterward, do not over-delegate it away from them (comprehension debt, or cognitive debt, is a real thing).
- Agent capability: long task horizons only pay off when the closure infrastructure can keep up with them.
Diagnosis tells you where you stand, sizing tells you what to do with the next task, and mixing them is how teams end up arguing about maturity when they should be arguing about a specific blast radius.
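To make the sizing step concrete, here is a minimal sketch of the seven criteria as a small scoring function. Everything in it is an assumption made for illustration: the `TaskProfile` name, the 0.0 to 1.0 ratings, the thresholds, and the weakest-link rule (the strongest pull tighter wins, the weakest permission to loosen wins). The criteria above do not prescribe numbers, so treat this as a way to force the conversation, not as a formula.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    """Hypothetical 0.0-1.0 ratings for one task; the scale is an assumption."""
    ambiguity: float               # unclear intent
    blast_radius: float            # damage a wrong move can do
    context_staleness: float       # undocumented or volatile territory
    learning_goal: float           # how much a human must understand afterward
    verification_quality: float    # tests, evals, scenarios
    reversibility: float           # how cheap rollback is
    closure_infrastructure: float  # can the harness keep up with long horizons


def suggest_zone(task: TaskProfile) -> str:
    """Suggest 'tight', 'elastic', or 'loose' using illustrative thresholds."""
    # Any single strong reason to stay close is enough to pull the loop tighter.
    tighten = max(task.ambiguity, task.blast_radius,
                  task.context_staleness, task.learning_goal)
    # Looser loops need every enabler in place, so the weakest one decides.
    loosen = min(task.verification_quality, task.reversibility,
                 task.closure_infrastructure)

    if tighten >= 0.7 or loosen <= 0.3:
        return "tight"
    if tighten <= 0.3 and loosen >= 0.7:
        return "loose"
    return "elastic"
```

The design choice worth noticing is the asymmetry: one high-risk criterion can veto a loose loop on its own, while no single strength can earn one.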
Every zone has to close the loop
Whatever size you pick, the loop has to close: the outcome gets checked, and what was learned survives. What changes per zone is the machinery doing the closing.
In the tight loop, you are the iteration mechanism. Hypothesis, attempt, check, correction, over and over, backed by LSP, compiler, linter, formatter, and diff feedback, with direct material contact through an interactive agent harness like Claude Code, Codex, or Pi, and through tests, UI, and logs.
In the elastic loop, intent-carrying artifacts like specs and user stories become the steering artifacts, carrying boundaries, constraints, and risks. Around them: reproducible setup, CI, acceptance criteria, PR review, reviewer subagents, and checkpoint inspections at the agreed points.
In the loose loop, the machinery has to stand in for your absence: sandboxes and worktrees, outcome grading (rubrics, scenarios, golden examples, counterexamples), adversarial agent reviews with approval gates in their own loops, drift monitoring, automated regression gates, rollback paths, and legibility artifacts like screenshots, screen recordings, and trace summaries that make the final human check cheaper. That last item matters more than it sounds: how many loose loops one person can run in parallel is capped less by containment and more by the cost of verifying each one at the gate (the Harness page takes this apart). (Outcome grading means judging whether a result is good, precisely enough that the judgment can be applied again and again, by a person or a machine. You already do the simplest version every time you write a test or a definition of done.)
And here is the sharpening that took me a while to see. In the tight loop you supply the micro-iterations implicitly, just by sitting there. In the loose loop nobody supplies them. So the harness, the machinery around the agent described above, has to encode those cycles in advance (decompose, parallelize, verify, iterate) as explicit architecture rather than an emergent property of human presence. Cursor’s First Proof and OpenAI’s harness engineering work both arrived at exactly this shape independently.
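A minimal sketch of what "encoding the cycles in advance" can look like, assuming the agent and the grader are plain callables. The names `run_loose_loop`, `decompose`, `attempt`, and `verify` are hypothetical placeholders for illustration, not the API of any harness mentioned here, and a real one would also feed the verifier's findings back into the next attempt.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


def run_loose_loop(
    goal: str,
    decompose: Callable[[str], List[str]],  # goal -> subtasks
    attempt: Callable[[str, int], str],     # (subtask, round number) -> result
    verify: Callable[[str, str], bool],     # (subtask, result) -> accepted?
    max_rounds: int = 3,
) -> Tuple[Dict[str, str], List[str]]:
    """Decompose, run subtasks in parallel, verify, and retry failures.

    Returns the accepted results plus the subtasks still unresolved after
    max_rounds, which is what a human would look at when reviewing the outcome.
    """
    pending = decompose(goal)
    accepted: Dict[str, str] = {}

    for round_no in range(max_rounds):
        if not pending:
            break
        # Parallelize: every pending subtask gets one attempt this round.
        with ThreadPoolExecutor() as pool:
            results = list(pool.map(lambda s: attempt(s, round_no), pending))
        # Verify: keep what passes the gate, queue the rest for another round.
        retry: List[str] = []
        for subtask, result in zip(pending, results):
            if verify(subtask, result):
                accepted[subtask] = result
            else:
                retry.append(subtask)
        pending = retry

    return accepted, pending
```

The point of the sketch is the explicit `verify` call inside the loop: it does the check you would have done by sitting there, and the leftover `pending` list is what arrives at the final human review.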
Could an agent even work here yet?
One more thing, and it sits underneath everything above. Context behaves differently from the choices we have been making so far. Nobody chooses bad context, because it is a maturity state rather than a decision, whether you like it or not. A tight loop survives thin context, because you backfill the missing knowledge turn by turn from your own head; a loose loop has nobody doing that. So bad context gives you no alternative valid cell in the grid; it collapses the entire loose column out of reach, no matter how good your backpressure is. (Backpressure is the resistance an agent works against while it builds, well before any review at the end: a failing test or type error on the technical side, an acceptance scenario or rubric on the product side. Some of it reaches the agent automatically as a signal in the loop; some it imposes on itself by following a discipline set at the start, like writing the failing test first and working until it goes green. The more of it you can encode, the longer you can let the loop run. A red build is backpressure; the agent reads it and fixes the code. So are acceptance criteria: write them well and the agent works against your definition of good as it goes, instead of a person catching the miss at the end.)
That is why I treat context as a gate with two layers.
- Layer 1 is agent readiness: can the agent even work here, with the context quality, freshness, availability, and relevance it would need?
- Layer 2 is loop health: does the team trust itself enough to let go? You can fail the second while passing the first, and plenty of teams do. But failing the first while pretending otherwise is how loose loops produce confident nonsense for days.
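The two layers can be written down as a check that runs before any task is sized loose. This is only a sketch: the `ContextGate` name and its boolean fields are assumptions made for illustration, and in practice each of them is a judgment call, not a flag someone flips.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContextGate:
    """Hypothetical yes/no answers to the two-layer readiness question."""
    # Layer 1: agent readiness
    context_quality: bool
    context_fresh: bool
    context_available: bool
    context_relevant: bool
    # Layer 2: loop health
    team_trusts_letting_go: bool

    def agent_ready(self) -> bool:
        # Layer 1 passes only if every aspect of the context holds up.
        return all((self.context_quality, self.context_fresh,
                    self.context_available, self.context_relevant))

    def blocking_layer(self) -> Optional[str]:
        """Return the first layer that blocks a loose loop, or None if clear."""
        # Layer 1 is checked first: without it, trust in the loop is irrelevant.
        if not self.agent_ready():
            return "agent readiness"
        if not self.team_trusts_letting_go:
            return "loop health"
        return None
```

Checking layer 1 first mirrors the argument above: a team that trusts its loops but feeds them stale context is still blocked, and the gate should say so before anyone debates trust.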
So before you size your next task loose, ask the readiness question honestly: could an agent work here at all yet? Could a new flesh-and-blood colleague work here at all, without endless getting-knowledge-out-of-people’s-heads sessions? And if it could, here is the harder question: would you let it?