Why Self-Host?
Some companies can't send data to third-party services. Compliance, security policies, or just preference — the reasons don't matter. What matters is that your agents should still be able to talk to each other using the open A2A standard.
We built a single-file A2A provider that runs entirely on your infrastructure. No platform account needed, no WebSocket bridge, no external dependencies. Just an HTTP server and your local CLI agent.
Get Running in 60 Seconds
Clone the example and start:
cd ah-cli/examples/self-hosted-a2a
npm install && npm start
That's it. You now have a standard A2A endpoint at http://127.0.0.1:8080/a2a with an Agent Card at /.well-known/agent.json.
On startup, the server generates a random API token and prints it to your terminal. Every request to /a2a requires this token in an Authorization: Bearer header.
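A first call can be sketched like this. The request-body construction below assumes the tasks/send parameter shape from the A2A spec (task id plus a message with text parts); check the server source for the exact fields it expects.

```typescript
// Sketch of a tasks/send request body (param shape assumed from the A2A
// spec; verify against the server's source before relying on it).
function buildTaskSend(taskId: string, text: string) {
  return {
    jsonrpc: "2.0",
    id: 1,
    method: "tasks/send",
    params: {
      id: taskId, // caller-chosen task id, reused later with tasks/get
      message: { role: "user", parts: [{ type: "text", text }] },
    },
  };
}

const body = buildTaskSend("task-001", "Summarize the README");
console.log(JSON.stringify(body, null, 2));

// To actually send it (TOKEN is whatever the server printed at startup):
// await fetch("http://127.0.0.1:8080/a2a", {
//   method: "POST",
//   headers: {
//     "Content-Type": "application/json",
//     Authorization: `Bearer ${TOKEN}`,
//   },
//   body: JSON.stringify(body),
// });
```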
What You Get
The server implements the full A2A 1.0 JSON-RPC surface:
tasks/send — synchronous execution. Send a message, wait for the complete response.
tasks/sendSubscribe — SSE streaming. Get chunks as the agent produces them.
tasks/get — check the status of a running or completed task.
tasks/cancel — kill a running task immediately.
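For tasks/sendSubscribe, the client consumes a Server-Sent Events stream. A minimal sketch of pulling the data payloads out of that stream, assuming standard SSE framing (data: lines, events separated by a blank line):

```typescript
// Minimal SSE frame parser for a tasks/sendSubscribe stream.
// Assumes standard SSE framing: "data:" lines, blank line between events.
// A real client should also buffer partial frames across network chunks.
function parseSSE(chunk: string): string[] {
  return chunk
    .split("\n\n") // events are separated by a blank line
    .map((frame) =>
      frame
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).trim())
        .join("\n")
    )
    .filter((data) => data.length > 0);
}

// Example: two streamed chunks as they might arrive on the wire.
const wire =
  'data: {"status":"working"}\n\n' + 'data: {"status":"completed"}\n\n';
console.log(parseSSE(wire));
// → [ '{"status":"working"}', '{"status":"completed"}' ]
```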
Under the hood, each request spawns a local CLI process (Claude Code by default, but any CLI works). The agent runs in your specified project directory with full local access.
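The per-request flow can be sketched as follows. This uses spawnSync for brevity (the real server streams asynchronously), and node -e stands in for an actual agent CLI:

```typescript
import { spawnSync } from "node:child_process";

// Sketch of the per-request flow: spawn the configured CLI with the user's
// message appended as an argument, run it in the project directory, and
// capture stdout as the reply. "node -e" stands in for a real agent CLI.
function runAgent(
  cmd: string,
  args: string[],
  message: string,
  cwd: string
): string {
  const result = spawnSync(cmd, [...args, message], {
    cwd, // the agent gets full local access to this directory
    encoding: "utf8",
    timeout: 5 * 60 * 1000, // mirror the server's 5-minute request timeout
  });
  if (result.error) throw result.error;
  return result.stdout;
}

const reply = runAgent(
  "node",
  ["-e", "process.stdout.write('echo: ' + process.argv[1])"],
  "hello agent",
  process.cwd()
);
console.log(reply); // → echo: hello agent
```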
Security Model
This is designed for internal networks, not the public internet. But internal doesn't mean unprotected:
Authentication — Bearer token required on every /a2a request. Constant-time comparison to prevent timing attacks.
Network isolation — Binds to 127.0.0.1 by default. Set HOST=0.0.0.0 only when you're ready to share with your internal network.
Rate limiting — 30 requests per minute, sliding window. Prevents accidental request floods.
Request timeout — 5 minutes max. Runaway agent processes get killed automatically.
Body size limit — 1 MB. Oversized payloads are rejected before parsing.
Audit log — Every request logged as structured JSON to stderr. Pipe it to your company's log system.
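The constant-time token check can be done with Node's crypto.timingSafeEqual. One wrinkle: timingSafeEqual throws on unequal-length inputs, so a common pattern (sketched here; the server's actual code may differ) is to hash both sides to a fixed length first:

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Constant-time token comparison. timingSafeEqual throws if the buffers
// differ in length, so both sides are hashed to a fixed 32 bytes first;
// this also avoids leaking the token's length via an early length check.
function tokensMatch(presented: string, expected: string): boolean {
  const a = createHash("sha256").update(presented).digest();
  const b = createHash("sha256").update(expected).digest();
  return timingSafeEqual(a, b);
}

console.log(tokensMatch("team-secret", "team-secret")); // → true
console.log(tokensMatch("guess", "team-secret")); // → false
```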
Swap the Agent Backend
The default backend is Claude Code, but you can use anything that accepts the message as a command-line argument and writes its reply to stdout:
AGENT_CMD="codex" npm start
AGENT_CMD="node my-bot.js" npm start
AGENT_CMD="python agent.py" npm start
The server passes the user's message as a CLI argument and streams stdout back as the response.
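A custom backend satisfying that contract can be a few lines. This hypothetical my-bot.ts (the name and the AGENT_CMD invocation are illustrative, not part of the example repo) reads the message from its first argument and writes the reply to stdout:

```typescript
// Hypothetical minimal backend matching the contract above: the message
// arrives as the first CLI argument, the reply goes to stdout.
// Assumed invocation: AGENT_CMD="npx tsx my-bot.ts" npm start
function reply(message: string): string {
  return `You said: ${message}`;
}

const message = process.argv[2] ?? "";
process.stdout.write(reply(message));
```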
Share Within Your Team
For team-wide access on your internal network:
HOST=0.0.0.0 API_TOKEN=team-secret npm start
For persistent deployment, wrap it in a systemd service or Docker container. The server is stateless — no database, no disk writes — so horizontal scaling is trivial.
Architecture
The entire flow stays inside your network:
Internal caller → HTTP server → CLI process → response
No data is sent to Agents Hot, no WebSocket connections to external bridges, no telemetry. The Agent Card endpoint is public (other A2A agents need it for discovery), but it only contains your agent's name and capabilities — no sensitive data.
The full source is a single TypeScript file under 400 lines. Read it, audit it, modify it. That's the point.

