Protocol overview
The Sophon Agent Protocol (SAP) is the wire format the iOS app and every bridge speak. It's HTTPS + Server-Sent Events on the iOS side, HTTPS + WebSocket on the bridge side, and JSON all the way down. No proprietary SDK is required.
Three layers
┌──────────────────────┐
│ iOS ↔ Cloud │ /v1/me/* SSE on /v1/me/stream
├──────────────────────┤
│ Cloud ↔ Bridge │ /v1/bridge/* WS on /v1/bridge/ws
├──────────────────────┤
│ Bridge ↔ Agent │ anything you like — stdio, HTTP, WS
└──────────────────────┘
Each layer is independent:
- iOS ↔ Cloud is what your phone uses to send messages, fetch history, archive sessions, and manage installations. The iOS app is the only consumer.
- Cloud ↔ Bridge is what this protocol specifies for connector authors. Your bridge connects here, receives session.message events, and POSTs back replies, tool events, approval requests.
- Bridge ↔ Agent is whatever your agent already speaks. OpenClaw uses a local WebSocket. Your custom bridge can shell out to a CLI, call an LLM, drive an MCP server; it's your call.
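To make the Bridge ↔ Agent layer concrete, here is a minimal sketch of a bridge that shells out to a CLI agent over stdio and wraps the answer in a reply body. Everything named here is an assumption for illustration: the helper names, the stand-in echo agent, and the text field are hypothetical; only session_id and interaction_id are keys this page mentions.

```python
import json
import subprocess
import sys


def ask_agent(command: list[str], user_text: str, timeout_s: float = 30.0) -> str:
    """Send the user's text to a CLI agent on stdin and return its stdout."""
    result = subprocess.run(
        command,
        input=user_text,
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()


def build_reply(event: dict, reply_text: str) -> dict:
    """Build a snake_case reply body for the cloud (field names are assumed)."""
    return {
        "session_id": event["session_id"],
        "interaction_id": event["interaction_id"],
        "text": reply_text,  # hypothetical field name
    }


# Stand-in "agent": a Python one-liner that upper-cases whatever it reads.
ECHO_AGENT = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]

if __name__ == "__main__":
    incoming = {"session_id": "ses_1", "interaction_id": "int_1", "text": "hello"}
    reply = build_reply(incoming, ask_agent(ECHO_AGENT, incoming["text"]))
    print(json.dumps(reply))
```

Swapping ECHO_AGENT for your own command line is the only change needed to try a different agent; the cloud-facing side stays the same.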
What SAP isn't
It's worth being explicit:
- Not a marketplace. Sophon does not host a directory of agents for users to install. The iOS app is for your agents, the ones you ran the bridge for. (A shared-agent path exists in the protocol RFC under /v1/bot/* with agt_* tokens, but it's deferred indefinitely; nothing in shipping iOS surfaces it.)
- Not a webhook directory. The bridge holds a WebSocket open; the cloud doesn't POST to a public URL on your machine.
- Not a hosting platform. Your agent runs wherever you run it. We don't ship code, store secrets, or proxy LLM credentials.
The supported token type
inst_<id>:s_<env>_<secret>
A bridge token. Issued at pairing, scoped to one installation,
stored in the bridge's keychain. Every /v1/bridge/* call
authenticates with Authorization: Bearer inst_…:s_live_….
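As an illustration of the token shape, the sketch below splits a bridge token into its parts and builds the Authorization header. It assumes that <id> contains no colon and that <env> contains no underscore; BridgeToken, parse_bridge_token, and auth_headers are hypothetical helper names, not part of any Sophon library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BridgeToken:
    installation_id: str
    env: str
    secret: str

    def __str__(self) -> str:
        return f"inst_{self.installation_id}:s_{self.env}_{self.secret}"


def parse_bridge_token(raw: str) -> BridgeToken:
    """Split inst_<id>:s_<env>_<secret> into its three parts."""
    inst_part, sep, secret_part = raw.partition(":")
    if not sep or not inst_part.startswith("inst_") or not secret_part.startswith("s_"):
        raise ValueError("expected inst_<id>:s_<env>_<secret>")
    installation_id = inst_part[len("inst_"):]
    # Assumption: <env> has no underscore, so the first "_" ends it.
    env, sep, secret = secret_part[len("s_"):].partition("_")
    if not installation_id or not env or not sep or not secret:
        raise ValueError("expected inst_<id>:s_<env>_<secret>")
    return BridgeToken(installation_id, env, secret)


def auth_headers(token: BridgeToken) -> dict[str, str]:
    """Headers to attach to every /v1/bridge/* request."""
    return {"Authorization": f"Bearer {token}"}
```

Parsing up front lets a bridge reject a malformed or wrong-type token at startup instead of discovering it on the first request.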
Transports at a glance
| Channel | Direction | Endpoint |
|---|---|---|
| REST | bridge → server | POST /v1/bridge/sendMessage, etc. |
| WebSocket | server → bridge | wss://api.sophon.at/v1/bridge/ws |
| REST | iOS → server | POST /v1/me/sessions/:id/send, etc. |
| SSE | server → iOS | GET /v1/me/stream |
REST POSTs are idempotent on idempotency_key; the WS uses
ping/pong heartbeats every 30 s; the SSE stream resumes via
Last-Event-ID against a 5-minute / 256-event ring buffer.
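The resume window can be pictured as a buffer bounded by both count and age. The sketch below is one way such a buffer could behave, not the server's actual implementation: it assumes event IDs are increasing integers (the real ID format may differ), and ReplayBuffer is a hypothetical name.

```python
import time
from collections import deque
from typing import Optional


class ReplayBuffer:
    """Keeps the most recent events, bounded by count and by age."""

    def __init__(self, max_events: int = 256, max_age_s: float = 300.0):
        self._events: deque = deque(maxlen=max_events)  # oldest entries fall off
        self._max_age_s = max_age_s
        self._next_id = 1

    def append(self, payload: dict, now: Optional[float] = None) -> int:
        now = time.monotonic() if now is None else now
        event_id = self._next_id
        self._next_id += 1
        self._events.append((event_id, now, payload))
        return event_id

    def replay_after(self, last_event_id: int, now: Optional[float] = None) -> Optional[list]:
        """Return events newer than last_event_id, or None if some were lost."""
        now = time.monotonic() if now is None else now
        # Drop entries older than the age window.
        while self._events and now - self._events[0][1] > self._max_age_s:
            self._events.popleft()
        oldest_retained = self._events[0][0] if self._events else self._next_id
        if last_event_id < oldest_retained - 1:
            return None  # events after last_event_id were evicted; refetch instead
        return [(eid, payload) for eid, _, payload in self._events if eid > last_event_id]
```

A None result models the case where the client was away longer than the window covers, so replay alone cannot close the gap and the client has to refetch state over REST.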
Event taxonomy
Every event ultimately reaches iOS via SSE. The names are stable across the surface:
hello — connection ack
heartbeat — keep-alive every 25 s
message_added — a new message landed (user or agent)
message_delta — streaming chunk for an existing message
message_finalized — final canonical text + usage metadata
session_created — chat session opened
session_renamed — title changed
session_updated — state changed (active ↔ archived)
session_deleted — soft-deleted (24 h grace)
installation_created — new bridge paired
installation_revoked — installation gone
installation_updated — display name / emoji changed
agent_health_changed — degraded / healthy
task_created — tool call started
task_progress — partial result
task_completed | task_failed | task_cancelled
approval_requested — HITL: agent wants user permission
approval_resolved — user decided
device_capability_request — agent asks the phone for camera / GPS
Same wire shapes, regardless of which connector emitted them.
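For a feel of how a consumer might handle these events, here is a small sketch that parses standard SSE framing and folds the message_* events into a transcript. It assumes the event name travels in the SSE event: field and that payloads carry message_id, text, and delta keys; those field names are hypothetical, so check the wire reference for the real ones.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per SSE event from an iterable of text lines."""
    event_name, event_id, data_lines = "message", None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line dispatches the pending event
            if data_lines:
                yield {"event": event_name, "id": event_id,
                       "data": json.loads("\n".join(data_lines))}
            event_name, data_lines = "message", []
            continue
        if line.startswith(":"):  # comment line, often used as a keep-alive
            continue
        field, _, value = line.partition(":")
        value = value[1:] if value.startswith(" ") else value
        if field == "event":
            event_name = value
        elif field == "data":
            data_lines.append(value)
        elif field == "id":
            event_id = value  # the last seen id persists across events


def apply_event(transcript: dict, event: dict) -> None:
    """Fold message events into {message_id: text}; ignore everything else."""
    data = event["data"]
    if event["event"] == "message_added":
        transcript[data["message_id"]] = data.get("text", "")
    elif event["event"] == "message_delta":
        previous = transcript.get(data["message_id"], "")
        transcript[data["message_id"]] = previous + data["delta"]
    elif event["event"] == "message_finalized":
        transcript[data["message_id"]] = data["text"]  # canonical text wins
```

Treating message_finalized as an overwrite rather than an append keeps the transcript correct even if a delta was missed along the way.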
Snake_case wire
Every JSON body — REST, SSE data, WS frame — uses snake_case
keys (session_id, interaction_id, created_at). The server
projects every database row through lib/wire.ts so the iOS-side
typed decoders see what they expect.
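The projection idea is easy to sketch. The snippet below is not the contents of lib/wire.ts; it is a hypothetical Python equivalent that assumes rows arrive with camelCase keys and rewrites them recursively to snake_case.

```python
import re

# Zero-width boundary between a lowercase letter or digit and an uppercase letter.
_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def to_snake(key: str) -> str:
    """Convert a camelCase key to snake_case; snake_case keys pass through."""
    return _BOUNDARY.sub("_", key).lower()


def to_wire(value):
    """Recursively rewrite dict keys to snake_case, leaving values untouched."""
    if isinstance(value, dict):
        return {to_snake(k): to_wire(v) for k, v in value.items()}
    if isinstance(value, list):
        return [to_wire(v) for v in value]
    return value
```

Running every outbound body through one function like this is what keeps REST responses, SSE data, and WS frames from drifting apart in key style.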
Read on
- Connection lifecycle: pair, dial, receive, reconnect.
- Streaming model: how text deltas, tool events, and approvals interleave inside one interaction.
- Tool calls & approvals: the task_* and approval_* event sequences end-to-end.
- Wire reference: every payload field, copy-paste ready.
- Errors & rate limits: envelope shape, status codes, rate-limit headers.
- Idempotency & resume: why and how to retry safely.
- Observability: request IDs, W3C trace context, OpenTelemetry pass-through.
The exhaustive normative spec lives in the SAP RFC. The pages above are the human-readable derivative; the RFC is the source of truth.