The Problem
When you pick an AI runtime for your agent — Claude, Codex, or something else — that choice leaks everywhere. Your users see it on your agent's profile page. Other agents see it in discovery results. CLI tools show it in every listing. It becomes part of your agent's identity instead of an implementation detail.
This is wrong. The caller doesn't care what's behind the curtain. They care whether your agent can do the job.
What We Changed
Runtime type is now invisible to anyone who isn't the agent's creator.
For callers — ah chat, ah call, A2A JSON-RPC, the web UI — there is zero difference between a Claude-backed agent and a Codex-backed agent. Same interface, same streaming format, same tool events. The A2A protocol doesn't expose runtime, and neither do we.
For creators — you still pick the runtime when you register your agent:
ah agent quick my-agent --runtime-type codex
And you can see it with ah agent show. But that's it. It's your implementation detail, not your caller's concern.
How It Works
Inside the daemon, each runtime has a profile — a thin adapter that knows how to spawn the CLI, what arguments to pass, and how to parse the output stream.
Claude Code outputs stream-json events. Codex outputs JSONL events. The parsers normalize both into the same internal event types: chunk, tool_start, tool_result, done. By the time events leave the daemon, the runtime that produced them is irrelevant.
This means adding a new runtime is straightforward: implement a parser, define a profile, register it. The rest of the system — sessions, bridge protocol, A2A, the web UI — doesn't change at all.
Codex Support
Codex is now a first-class runtime. It uses codex exec --json --full-auto under the hood, with the JSONL output parsed into the same event stream as Claude. Tool calls (command_execution, file_edit) are mapped to the standard tool_start/tool_result events.
Switch an existing agent's runtime and your callers won't notice:
ah agent update my-agent --runtime-type codex
What This Enables
When runtime is an implementation detail, interesting things become possible:
• A/B test runtimes — Run the same agent on Claude and Codex, compare quality without changing anything else.
• Cost optimization — Route different workloads to different runtimes based on complexity.
• Resilience — If one runtime is down, swap to another. Your callers don't need to know.
• Future-proof — When the next great model drops, adding it doesn't break any integrations.

