ACP, ACPX, and Direct CLI Routing — what's the actual difference?
Three ways to talk to coding agents. They look similar. They aren't.
The Simple Version
Imagine you have three coding assistants in your house: Codex, Claude Code, and Gemini. They each live in their own room and they each speak their own dialect. You want to send them work.
You could walk to each room and talk to them directly. That works. But you have to remember which dialect goes with which assistant, you have to bring your own pen and paper for each one, and if you want them to remember earlier conversations, that's on you to track.
Or you could install a universal intercom. The intercom speaks one shared language. You press the button for "Claude" or "Codex" or "Gemini" and the same intercom translates for you. Same buttons. Same screen. Same memory.
Or you could hire a chief of staff. The chief of staff has the universal intercom, but they also keep a notebook of every conversation, label each one as a thread, and bring you summaries. You don't even press the button — you tell the chief of staff what you want, and they handle the rest.
That's the whole picture. Direct CLI is walking to each room. ACPX is the universal intercom. OpenClaw's ACP runtime is the chief of staff. They're all useful — for different jobs.
How It Actually Works
Now the technical version. The trick to keeping this clear is to remember it's three layers, not three competing tools. Each layer wraps the one below it.
Layer 1 — Direct CLI
This is what it sounds like: you call the binary. codex, claude, gemini. Each agent has its own flags, its own session model, its own quirks. Codex won't run outside a git repo. Claude Code wants --print --permission-mode bypassPermissions. Pi needs a PTY. You — or a skill — own all those rules.
# Three different incantations for three different tools
codex exec --full-auto "refactor the auth module"
claude --permission-mode bypassPermissions --print "build a snake game"
gemini --acp
Strengths: minimum surface area. You see exactly what the agent does. Easy to debug. Great for one-shot scratch work.
Weaknesses: no shared interface. If you want to drive five different harnesses, you're writing five different parsers. There's no built-in concept of "this is one ongoing thread of work." You scrape stdout and hope.
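In practice, "you own all those rules" tends to mean a dispatcher that you maintain yourself. Below is a minimal sketch of one, using only the flags quoted above. The function name `direct_agent_cmd` is hypothetical, and it prints the command it would run rather than executing anything.

```shell
#!/bin/sh
# Hypothetical helper: print the direct-CLI command for one agent.
# Only flags quoted in this article are used; nothing is executed.
direct_agent_cmd() {
    agent=$1
    prompt=$2
    case "$agent" in
        codex)
            # Codex won't run outside a git repo, so check before building the command.
            if ! git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
                echo "codex needs a git repo (run 'git init' first)" >&2
                return 1
            fi
            printf 'codex exec --full-auto "%s"\n' "$prompt"
            ;;
        claude)
            printf 'claude --permission-mode bypassPermissions --print "%s"\n' "$prompt"
            ;;
        *)
            # Every additional harness needs its own branch here.
            echo "no direct-CLI rule recorded for: $agent" >&2
            return 2
            ;;
    esac
}
```

Each new harness adds another branch with its own quirks, which is the maintenance cost the next layer is meant to remove. The printed string is for display only; a prompt containing double quotes would need extra escaping before being executed.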
Layer 2 — ACPX (and the ACP protocol underneath it)
This layer exists because of a real problem: every coding harness has its own way of streaming partial output, asking for permissions, and announcing it's done. Tooling around them was a mess of glue code.
So Zed Industries (the editor company) defined a small JSON-RPC contract called the Agent Client Protocol, or ACP for short. Any agent that speaks ACP exposes the same standardized hooks: start a session, send a turn, stream a response, request a tool permission, end the session. Codex ships an ACP adapter. Claude ships one. Gemini ships --acp. So do Copilot, Cursor, OpenCode, Pi, Qwen, Kimi, and others.
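To make "a small JSON-RPC contract" concrete, here is a rough sketch of the request frames a client might write to an agent's stdin, one JSON object per line. The method names `initialize`, `session/new`, and `session/prompt` follow my reading of the published ACP schema; treat the parameter fields and values as assumptions and check the spec before relying on them. The helper name `jsonrpc_request` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print one JSON-RPC 2.0 request as a single line.
# $1 = numeric id, $2 = method name, $3 = params (must already be valid JSON;
# no escaping is done here).
jsonrpc_request() {
    printf '{"jsonrpc":"2.0","id":%s,"method":"%s","params":%s}\n' "$1" "$2" "$3"
}

# Rough order of a session: handshake, open a session, send one turn.
# Field names are assumptions based on the ACP schema, not taken from this article.
jsonrpc_request 1 initialize '{"protocolVersion":1}'
jsonrpc_request 2 session/new '{"cwd":"/tmp/project","mcpServers":[]}'
jsonrpc_request 3 session/prompt '{"sessionId":"SESSION_ID_FROM_STEP_2","prompt":[{"type":"text","text":"fix the auth bug"}]}'
```

Streaming output and permission requests travel in the other direction, from agent to client, which is why a real client needs a read loop in addition to these writes.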
acpx is a small CLI on top of ACP. It gives every harness a uniform shell interface. One command, one set of flags, swap the agent name as a parameter:
# Same shape, different agents
acpx codex sessions new --name oc-codex-1234
acpx codex -s oc-codex-1234 --cwd ~/project --format quiet "fix the auth bug"
acpx claude sessions new --name oc-claude-1234
acpx claude -s oc-claude-1234 --cwd ~/project --format quiet "explain the schema"
Notice what's now uniform: named sessions (so a follow-up prompt continues the same conversation), working directory (--cwd), output format (--format quiet just gives clean assistant text). You stop caring whether you're talking to Codex or Claude — the wire is identical.
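Because the shape is identical across harnesses, one pair of functions can drive any of them. The sketch below assumes only the subcommands and flags shown above. The names `ACPX_BIN`, `acpx_new_session`, and `acpx_ask` are hypothetical; setting `ACPX_BIN=echo` turns every call into a dry run that prints the arguments instead of invoking acpx.

```shell
#!/bin/sh
# Hypothetical wrappers around the two acpx calls shown above.
# ACPX_BIN is an illustrative override; set it to "echo" for a dry run.
ACPX_BIN=${ACPX_BIN:-acpx}

# Create a named session for one agent.
acpx_new_session() {
    agent=$1; session=$2
    "$ACPX_BIN" "$agent" sessions new --name "$session"
}

# Send one prompt to an existing named session and print clean assistant text.
acpx_ask() {
    agent=$1; session=$2; dir=$3; prompt=$4
    "$ACPX_BIN" "$agent" -s "$session" --cwd "$dir" --format quiet "$prompt"
}

# Example (not run here): the same two calls for two different agents.
#   for agent in codex claude; do
#       acpx_new_session "$agent" "oc-$agent-1234"
#       acpx_ask "$agent" "oc-$agent-1234" "$HOME/project" "explain the schema"
#   done
```

The agent name is just another argument, so adding a harness to the loop does not change either function.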
ACP isn't a new agent. It's a shape — a common protocol — that makes every agent look the same to whatever's calling it. ACPX is the most ergonomic way to use that shape from a shell.
Strengths: uniform interface. Persistent named sessions. Clean output. Plug in a new harness with one config line.
Weaknesses: you still have to manage sessions yourself. If a session dies, you re-create it. There's no notion of "this work belongs to a thread in a Discord conversation." That's the next layer's job.
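"If a session dies, you re-create it" usually turns into a small retry wrapper in your own script. Here is a minimal sketch, assuming that acpx exits non-zero when the named session is unusable; this article does not confirm that behavior, so verify it before depending on it. `ACPX_BIN` and `acpx_ask_with_retry` are hypothetical names.

```shell
#!/bin/sh
# Hypothetical retry wrapper. Assumption: acpx exits non-zero when the
# named session is gone; that behavior is not confirmed by this article.
ACPX_BIN=${ACPX_BIN:-acpx}

acpx_ask_with_retry() {
    agent=$1; session=$2; dir=$3; prompt=$4
    # First attempt against the existing session.
    if "$ACPX_BIN" "$agent" -s "$session" --cwd "$dir" --format quiet "$prompt"; then
        return 0
    fi
    echo "session $session failed; re-creating and retrying once" >&2
    # Re-create the session under the same name, then retry exactly once.
    "$ACPX_BIN" "$agent" sessions new --name "$session" || return 1
    "$ACPX_BIN" "$agent" -s "$session" --cwd "$dir" --format quiet "$prompt"
}
```

A re-created session starts with no history, so any context the earlier conversation held has to be supplied again in the prompt.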
Layer 3 — OpenClaw ACP runtime
OpenClaw's runtime is the chief of staff. It uses acpx under the hood, but adds the operational layer on top: threads, lifecycle, recovery, policy, relay.
Instead of shelling out to acpx yourself, you call OpenClaw's sessions_spawn tool with a structured payload. The runtime maps your conversation to a thread, picks the right agent, opens or resumes the right session, ships the prompt, watches for completion, and routes the assistant's reply back into your chat.
{
  "task": "Refactor the WhatsApp adapter to use the new auth helper.",
  "runtime": "acp",
  "agentId": "codex",
  "thread": true,
  "mode": "session"
}
What that single call gives you, that the lower layers don't:
- Threading. The session is bound to your conversation. Follow-ups in the same thread go to the same agent automatically.
- Recovery. If acpx isn't installed or the adapter is broken, the runtime tries to repair (re-install pinned acpx, restart the gateway, retry once) before giving up.
- Policy. Not every agent is allowed in every context. The runtime enforces allow-lists.
- Observability. Sessions show up in sessions_list. You can fetch history, send messages, kill, or hand off.
- Channel relay. Output flows back to the surface that started the work (Discord, Telegram, web) without you wiring up a bridge.
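The payload shape and the policy point can be illustrated together. This sketch builds a sessions_spawn payload with the same five fields as the example earlier in this section, and refuses agents that are not on an allow-list. The allow-list contents, the variable `ALLOWED_AGENTS`, and the function `spawn_payload` are all hypothetical; the runtime's real policy format is not described here.

```shell
#!/bin/sh
# Hypothetical allow-list; the runtime's real policy format is not shown
# in this article.
ALLOWED_AGENTS="codex claude"

# Print a sessions_spawn payload using the fields from the example above,
# or fail if the agent is not on the allow-list.
spawn_payload() {
    agent=$1
    task=$2
    case " $ALLOWED_AGENTS " in
        *" $agent "*) ;;
        *)
            echo "agent not allowed here: $agent" >&2
            return 1
            ;;
    esac
    # Escape backslashes and double quotes so the task is a valid JSON string
    # (assumes the task contains no newlines or control characters).
    escaped=$(printf '%s' "$task" | sed 's/\\/\\\\/g; s/"/\\"/g')
    printf '{"task":"%s","runtime":"acp","agentId":"%s","thread":true,"mode":"session"}\n' \
        "$escaped" "$agent"
}
```

Calling `spawn_payload codex "Refactor the WhatsApp adapter"` prints a one-line payload, while `spawn_payload gemini "..."` fails because gemini is not in this illustrative list.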
When to use which
Direct CLI
Scratch work. One-shot tasks in a temp dir. Debugging an agent's actual output. Anything where you want zero abstraction.
ACPX (telephone game)
Driving harnesses from scripts or skills. Swapping agents without rewriting glue. Quick programmatic relay when the ACP runtime isn't needed.
ACP runtime
The default for any thread-bound coding work in chat. Long sessions, observable lifecycle, automatic recovery, channel-aware replies.
| Concern | Direct CLI | ACPX | ACP runtime |
|---|---|---|---|
| Uniform interface across harnesses | No | Yes | Yes |
| Persistent named sessions | You build it | Yes | Yes |
| Thread-bound to a conversation | No | No | Yes |
| Auto recovery on failure | No | No | Yes |
| Policy / allow-list enforcement | No | No | Yes |
| Output streamed back to chat surface | Manual | Manual | Automatic |
| Best suited for | Tiny scratch | Scripts & skills | Production work |
The mental shortcut
If someone asks you which lane to pick, the question collapses to: how much state do you need OpenClaw to manage for you?
- None → Direct CLI.
- A session, but I'll drive it → ACPX.
- A whole thread, with lifecycle and replies → ACP runtime via sessions_spawn.
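The shortcut above is small enough to write down as a lookup. A minimal sketch follows; the function name `pick_lane` and the three state labels are hypothetical and only encode the list above.

```shell
#!/bin/sh
# Hypothetical helper: map "how much state should OpenClaw manage?"
# to one of the three lanes described above.
pick_lane() {
    case "$1" in
        none)    echo "direct-cli" ;;
        session) echo "acpx" ;;
        thread)  echo "acp-runtime" ;;
        *)
            echo "unknown state level: $1 (use none, session, or thread)" >&2
            return 1
            ;;
    esac
}
```

For example, `pick_lane thread` prints `acp-runtime`, the lane that goes through sessions_spawn.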
Key Takeaways
- ACP is a protocol, not a tool. It's the shared shape every modern coding harness now speaks.
- ACPX is the CLI form of ACP. One command, swap the agent name, get persistent sessions for free.
- OpenClaw's ACP runtime is the orchestration layer. It uses ACPX under the hood and adds threads, recovery, policy, and channel relay.
- Direct CLI never goes away. It's the lowest-blast-radius option for scratch work and debugging.
- Default rule: if it's chat-bound coding work, use sessions_spawn with runtime: "acp". Step down to ACPX or direct CLI only when you have a reason.