Why self-host?
Not everyone can ship their data through someone else's servers. Maybe it's compliance. Maybe it's security review. Maybe it's just your general counsel saying no. Whatever the reason, your agents still need to talk to each other using the open A2A standard.
So we built a single-file A2A provider that runs entirely on your own boxes. No platform account. No WebSocket bridge. Nothing external in the loop. Just an HTTP server and a local CLI agent.
Running in 60 seconds
From a checkout of the ah-cli repository, install and start the example:
cd ah-cli/examples/self-hosted-a2a
npm install && npm start
Done. You have a standard A2A endpoint at http://127.0.0.1:8080/a2a and an Agent Card at /.well-known/agent.json.
On first run, the server generates a random API token and prints it. Every request to /a2a needs it as a Bearer header.
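To make the token requirement concrete, here is a minimal client sketch. The helper names (buildSendRequest, buildHeaders, sendTask) are hypothetical, and the params layout (a task id plus a user message made of text parts) is an assumption based on early A2A drafts, so check it against the server source before relying on it.

```typescript
import { randomUUID } from "node:crypto";

// JSON-RPC 2.0 envelope for tasks/send. The params shape is an assumption,
// not something this post specifies.
interface TasksSendRequest {
  jsonrpc: "2.0";
  id: string;
  method: "tasks/send";
  params: {
    id: string;
    message: { role: "user"; parts: { type: "text"; text: string }[] };
  };
}

// Build the request body for a single user message.
function buildSendRequest(text: string, taskId: string = randomUUID()): TasksSendRequest {
  return {
    jsonrpc: "2.0",
    id: randomUUID(),
    method: "tasks/send",
    params: { id: taskId, message: { role: "user", parts: [{ type: "text", text }] } },
  };
}

// Every /a2a request carries the token printed on first run as a Bearer header.
function buildHeaders(token: string): Record<string, string> {
  return { "Content-Type": "application/json", Authorization: `Bearer ${token}` };
}

// Hypothetical helper: POST one message and return the parsed JSON-RPC reply.
async function sendTask(endpoint: string, token: string, text: string): Promise<unknown> {
  const res = await fetch(endpoint, {
    method: "POST",
    headers: buildHeaders(token),
    body: JSON.stringify(buildSendRequest(text)),
  });
  if (!res.ok) throw new Error(`A2A request failed: HTTP ${res.status}`);
  return res.json();
}
```

Calling sendTask("http://127.0.0.1:8080/a2a", token, "hello") with the printed token should return the completed task. Without the header, expect the server to reject the request.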
What's in the box
The server implements the full A2A 1.0 JSON-RPC surface:
tasks/send: send a message, wait for the complete response.
tasks/sendSubscribe: stream chunks back over SSE as they're produced.
tasks/get: check the status of a running or completed task.
tasks/cancel: kill a running task.
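Since tasks/sendSubscribe answers over SSE, a caller has to split the byte stream into events itself. The sketch below shows one way to do that. The name parseSse is hypothetical, and the contents of each event's data field are not specified in this post, so the parser hands them back as raw strings.

```typescript
interface SseParseResult {
  events: string[]; // the data payload of each complete event
  rest: string;     // trailing partial event, to be prepended to the next chunk
}

// Split a buffer of SSE text into complete events. Events are separated by a
// blank line; only "data:" fields are kept, joined with newlines when an
// event spans several data lines.
function parseSse(buffer: string): SseParseResult {
  const events: string[] = [];
  const blocks = buffer.replace(/\r\n/g, "\n").split("\n\n");
  const rest = blocks.pop() ?? "";
  for (const block of blocks) {
    const data = block
      .split("\n")
      .filter((line) => line.startsWith("data:"))
      .map((line) => line.slice(5).replace(/^ /, ""))
      .join("\n");
    if (data.length > 0) events.push(data);
  }
  return { events, rest };
}
```

Feed it each chunk as it arrives, prefixed with the rest value from the previous call, and pass every returned event to JSON.parse if the server sends JSON payloads.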
Each request spawns a local CLI process. Claude Code by default, anything else if you swap it out. The agent runs in whatever project directory you point it at, with full local access.
Security model
Built for internal networks, not the public internet. "Internal" still doesn't mean unprotected.
Auth is a Bearer token on every /a2a request, compared in constant time. The server binds to 127.0.0.1 unless you flip HOST=0.0.0.0 yourself. Rate limiting caps you at 30 requests per minute in a sliding window. Requests time out after five minutes, and runaway agent processes get killed automatically. Bodies over 1 MB get rejected before parsing. Every request goes to stderr as structured JSON, ready for whatever log pipeline you already run.
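Two of those checks are easy to get subtly wrong, so here is a minimal sketch of each: a constant-time token comparison and a sliding-window limiter. This is not the server's actual code. The names are hypothetical, hashing both tokens before comparing is one common way to satisfy the equal-length requirement of timingSafeEqual, and keying the limiter by an arbitrary string (token, IP, whatever you choose) is an assumption.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Pull the token out of an Authorization header, or return null.
function extractBearer(header: string | undefined): string | null {
  if (!header) return null;
  const match = /^Bearer (.+)$/.exec(header);
  return match?.[1] ?? null;
}

// Compare tokens in constant time. Hashing first gives both buffers the same
// length, which timingSafeEqual requires.
function tokensMatch(presented: string, expected: string): boolean {
  const a = createHash("sha256").update(presented).digest();
  const b = createHash("sha256").update(expected).digest();
  return timingSafeEqual(a, b);
}

// Sliding window: keep the timestamps of accepted requests per key and allow
// a new one only if fewer than `limit` fall inside the last `windowMs`.
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number = 30, windowMs: number = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  allow(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(now);
    this.hits.set(key, recent);
    return allowed;
  }
}
```

The defaults mirror the numbers above: 30 requests per 60 seconds. Rejected requests are not recorded, so a caller that keeps hammering the endpoint does not extend its own lockout.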
Swap the backend
Default is Claude Code. Anything that reads from argv and writes to stdout works:
AGENT_CMD="codex" npm start
AGENT_CMD="node my-bot.js" npm start
AGENT_CMD="python agent.py" npm start
The user's message goes in as a CLI argument. Stdout streams back as the response.
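In code, that contract looks roughly like the sketch below. It assumes AGENT_CMD is split on whitespace (so shell quoting is not honored), that the message is appended as the final argument, and that a timed-out process is killed with SIGKILL. splitCommand and runAgent are hypothetical names, not the server's own functions.

```typescript
import { spawn } from "node:child_process";

// Split a command string such as "node my-bot.js" into executable and args.
// Naive whitespace split: quoted arguments are not supported in this sketch.
function splitCommand(cmd: string): { file: string; args: string[] } {
  const [file, ...args] = cmd.trim().split(/\s+/);
  if (!file) throw new Error("AGENT_CMD is empty");
  return { file, args };
}

// Run the agent once: message in via argv, response out via stdout.
function runAgent(
  cmd: string,
  message: string,
  cwd: string,
  timeoutMs: number = 300_000, // five minutes, matching the request timeout
): Promise<string> {
  const { file, args } = splitCommand(cmd);
  return new Promise((resolve, reject) => {
    const child = spawn(file, [...args, message], {
      cwd,
      stdio: ["ignore", "pipe", "inherit"],
    });
    let output = "";
    // Kill runaway processes once the deadline passes.
    const timer = setTimeout(() => child.kill("SIGKILL"), timeoutMs);
    child.stdout.setEncoding("utf8");
    child.stdout.on("data", (chunk: string) => {
      output += chunk;
    });
    child.on("error", (err) => {
      clearTimeout(timer);
      reject(err);
    });
    child.on("close", (code, signal) => {
      clearTimeout(timer);
      if (code === 0) resolve(output);
      else reject(new Error(`agent exited with code ${code ?? "null"} (signal ${signal ?? "none"})`));
    });
  });
}
```

Passing the message as an argv entry rather than through a shell string means it is never interpreted by a shell, which matters when the text comes from another agent. A streaming variant would forward each stdout chunk to the caller instead of accumulating it.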
Team access
To expose it to the rest of your network:
HOST=0.0.0.0 API_TOKEN=team-secret npm start
For anything long-lived, wrap it in systemd or Docker. The server is stateless, with no database and no disk writes, so horizontal scaling is a non-issue.
The shape of it
Every hop stays inside your network:
caller → HTTP server → CLI process → response
Nothing gets sent to Agents Hot. No outbound WebSockets. No telemetry. The Agent Card endpoint needs no token (other A2A agents need it for discovery), but all it advertises is your agent's name and capabilities. Nothing sensitive.
Total source: one TypeScript file, under 400 lines. Read it. Audit it. Change it. That's the whole point.
