Kira Learn

Supervisor Loop vs Ralph Loop

Two useful control patterns for agent systems: one makes work finish; the other makes noisy information usable.

First pass: explain it like you’re 10

Imagine you are building a Lego spaceship while your room is full of toys, books, and random notes.

A Supervisor loop is like a careful coach standing next to you. The coach says: “Build the wing. Check if it fits. If it doesn’t, try again. When it’s good, move on.” Its job is to make sure a task gets completed properly.

A Ralph loop is like a smart friend cleaning the table before you start. Ralph looks at the messy room, ignores the stuff that does not matter, groups the useful pieces, and says: “These are the parts you should pay attention to.” Its job is to turn noise into signal.

Supervisor loop

Do → evaluate → retry → done.
Best when the question is: “Did the agent finish the job correctly?”

Ralph loop

Scan → filter → compress → decide.
Best when the question is: “What deserves attention in this mess?”

So the difference is simple: Supervisor controls completion. Ralph controls attention.

Supervisor: Do → Evaluate → Retry → Done

Ralph: Scan → Filter → Compress → Decide

Stacked together: Ralph chooses the right thing to work on; Supervisor makes sure that work actually gets finished.

Second pass: deeper technical version

Agent systems fail in two different ways. Sometimes they know the task but do not complete it reliably. Sometimes they are surrounded by too much input and choose badly. These are different failure modes, so they need different loops.

1. The Supervisor loop is completion control

The Supervisor loop wraps an action with judgment. An agent attempts the work, an evaluator checks the result against criteria, and the system either accepts it or sends it back for another pass. The loop ends when the output satisfies the standard or hits a stop condition.

Typical stop conditions include: success, retry limit, time limit, budget limit, or “needs human decision.” This makes the Supervisor loop useful for coding tasks, research briefs, data extraction, document drafting, QA, and any workflow where “mostly done” is not good enough.

Its central question is: Is this output acceptable yet?
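The attempt, evaluate, retry cycle described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names `supervisor_loop`, `Result`, `attempt`, and `evaluate` are hypothetical, and only three of the stop conditions (success, retry limit, time limit) are modelled. Budget limits and "needs human decision" would be further exit branches of the same shape.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Result:
    status: str            # "success", "retry_limit", or "time_limit"
    output: Optional[str]  # last output produced, if any
    attempts: int          # number of attempts actually made


def supervisor_loop(
    attempt: Callable[[Optional[str]], str],
    evaluate: Callable[[str], Tuple[bool, str]],
    max_retries: int = 3,
    time_limit_s: float = 60.0,
) -> Result:
    """Run attempt -> evaluate until the output passes or a stop condition hits.

    `attempt` receives the evaluator's feedback from the previous pass
    (None on the first pass). `evaluate` returns (accepted, feedback).
    Both are hypothetical callables supplied by the caller.
    """
    start = time.monotonic()
    feedback: Optional[str] = None
    output: Optional[str] = None

    for n in range(1, max_retries + 1):
        # Stop condition: time limit, checked before each new attempt.
        if time.monotonic() - start > time_limit_s:
            return Result("time_limit", output, n - 1)

        output = attempt(feedback)            # Do
        accepted, feedback = evaluate(output)  # Evaluate
        if accepted:
            return Result("success", output, n)  # Done
        # Otherwise fall through to the next pass (Retry).

    # Stop condition: retry limit reached without an acceptable output.
    return Result("retry_limit", output, max_retries)


# Toy usage: each draft adds a piece; the evaluator insists on a bolt.
drafts = iter(["wing", "wing with strut", "wing with strut and bolt"])


def toy_attempt(feedback: Optional[str]) -> str:
    return next(drafts)


def toy_evaluate(output: str) -> Tuple[bool, str]:
    if "bolt" in output:
        return True, ""
    return False, "missing bolt"


result = supervisor_loop(toy_attempt, toy_evaluate)
```

The design point is that the loop always returns a status, so the caller can tell "finished correctly" apart from "gave up", which is exactly the distinction the central question asks for.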

2. The Ralph loop is signal control

The Ralph loop sits earlier in the pipeline. It assumes the world is noisy: feeds, chats, meetings, logs, emails, screenshots, bookmarks, alerts. Ralph scans broadly, filters out irrelevant material, compresses what remains into a usable representation, and decides what should become an action, an artifact, or a brief, and what should be ignored.

This is not just summarization. Summarization makes text shorter. Ralph makes attention sharper. It decides which signals deserve space in the operating system.

Its central question is: What is worth acting on?
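As a rough sketch of scan, filter, compress, decide, the Python below treats relevance as simple keyword matching. That is an assumption made for illustration only; a real system would likely use a model or a richer scoring rule. The names `ralph_loop`, `Signal`, `keywords`, `min_count`, and `top_k` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Signal:
    theme: str    # keyword the items were grouped under
    count: int    # how many items mentioned it
    example: str  # first item seen for this theme


def first_match(text: str, keywords: List[str]) -> Optional[str]:
    """Return the first keyword found in the text, or None if irrelevant."""
    lowered = text.lower()
    for kw in keywords:
        if kw in lowered:
            return kw
    return None


def ralph_loop(
    items: List[str],
    keywords: List[str],
    min_count: int = 1,
    top_k: int = 3,
) -> List[Signal]:
    # Scan: take in everything, dropping only empty entries.
    scanned = [item.strip() for item in items if item.strip()]

    # Filter + compress: discard items with no keyword, and fold the
    # rest into one Signal per theme instead of keeping every item.
    groups: Dict[str, Signal] = {}
    for text in scanned:
        theme = first_match(text, keywords)
        if theme is None:
            continue  # noise: not worth attention
        if theme in groups:
            groups[theme].count += 1
        else:
            groups[theme] = Signal(theme, 1, text)

    # Decide: keep only themes that recur enough, strongest first.
    strong = [s for s in groups.values() if s.count >= min_count]
    strong.sort(key=lambda s: s.count, reverse=True)
    return strong[:top_k]


# Toy usage with a made-up feed.
feed = [
    "Agents need evals",
    "lol cat",
    "New evals benchmark",
    "agent retries fail",
    "   ",
]
signals = ralph_loop(feed, keywords=["evals", "retries"], top_k=1)
```

Note that the output is shorter than a summary of the feed would be: most of the input is dropped rather than paraphrased, and what remains is ranked so that something downstream can act on it.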

3. Eli’s X Hotlist is a Ralph-loop artifact

Eli’s X Hotlist is a concrete Ralph output. X is high-noise by design: many posts, mixed quality, duplicated ideas, outrage, jokes, half-formed opportunities, and occasional gems. A Ralph loop scans that stream, filters for relevance, compresses repeated themes, and turns the result into a hotlist: a small set of people, posts, ideas, or angles worth attention.

The hotlist is not the final work. It is an attention artifact. It says: “From the entire messy feed, these are the signals we should consider using.”

4. How the loops stack

The clean architecture is not Supervisor versus Ralph. It is Ralph before Supervisor, and sometimes Supervisor around Ralph or Ralph inside Supervisor.

Ralph before Supervisor: scan the environment, pick the best opportunity, then hand it to a Supervisor loop to execute.
Supervisor around Ralph: evaluate whether the hotlist itself is good enough. If it misses obvious signals or includes junk, rerun Ralph with better filters.
Ralph inside Supervisor: during retries, scan failures and compress feedback so the next attempt improves intelligently.

Practical rule: use Ralph when the input is messy and the problem is attention. Use Supervisor when the goal is known and the problem is reliable completion.
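Two of the stackings above, Ralph before Supervisor and Ralph inside Supervisor, can be combined in one small sketch. Everything here is an illustrative assumption: `pick_target`, `compress_feedback`, and `run` are hypothetical names, keyword counting stands in for real relevance scoring, and `attempt` and `check` are caller-supplied stand-ins for an agent and an evaluator.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional


def pick_target(candidates: List[str], keywords: List[str]) -> Optional[str]:
    """Ralph before Supervisor: choose the candidate with the most keyword hits."""
    scored = [(sum(kw in c.lower() for kw in keywords), c) for c in candidates]
    scored = [(score, c) for score, c in scored if score > 0]
    if not scored:
        return None  # nothing in the input deserves attention
    return max(scored, key=lambda pair: pair[0])[1]


def compress_feedback(failures: List[str], keep: int = 2) -> str:
    """Ralph inside Supervisor: reduce a pile of failure notes to the most frequent."""
    common = Counter(failures).most_common(keep)
    return "; ".join(note for note, _ in common)


def run(
    candidates: List[str],
    keywords: List[str],
    attempt: Callable[[str, str], str],
    check: Callable[[str], List[str]],
    max_retries: int = 3,
) -> Dict[str, object]:
    target = pick_target(candidates, keywords)
    if target is None:
        return {"status": "nothing_to_do", "target": None, "output": None, "attempts": 0}

    failures: List[str] = []
    output: Optional[str] = None
    for n in range(1, max_retries + 1):
        # Each retry sees compressed feedback, not the raw failure history.
        output = attempt(target, compress_feedback(failures))
        problems = check(output)  # empty list means the output is acceptable
        if not problems:
            return {"status": "success", "target": target, "output": output, "attempts": n}
        failures.extend(problems)

    return {"status": "retry_limit", "target": target, "output": output, "attempts": max_retries}


# Toy usage: the "agent" only gets it right once it has feedback to act on.
outcome = run(
    candidates=["cat meme", "agent eval idea", "weather"],
    keywords=["agent", "eval"],
    attempt=lambda target, feedback: target.upper() if feedback else target,
    check=lambda out: [] if out.isupper() else ["not uppercase"],
)
```

The third stacking, Supervisor around Ralph, would follow the same pattern with the roles swapped: the thing being attempted is the hotlist itself, and the check looks for missed signals or junk entries.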

The shortest useful distinction

Ralph finds the right target. Supervisor makes the shot land. A mature agent system usually needs both: signal control to avoid working on the wrong thing, and completion control to avoid leaving the right thing half-done.