Sophon

Protocol overview

The Sophon Agent Protocol (SAP) is the wire format the iOS app and every bridge speak. It's HTTPS + Server-Sent Events on the iOS side, HTTPS + WebSocket on the bridge side, and JSON all the way down. No proprietary SDK is required.

Three layers

┌──────────────────────┐
│  iOS  ↔  Cloud       │   /v1/me/*       SSE on /v1/me/stream
├──────────────────────┤
│  Cloud ↔  Bridge     │   /v1/bridge/*   WS on /v1/bridge/ws
├──────────────────────┤
│  Bridge ↔  Agent     │   anything you like — stdio, HTTP, WS
└──────────────────────┘

Each layer is independent:

  • iOS ↔ Cloud is what your phone uses to send messages, fetch history, archive sessions, and manage installations. The iOS app is the only consumer.
  • Cloud ↔ Bridge is what this protocol specifies for connector authors. Your bridge connects here, receives session.message events, and POSTs back replies, tool events, and approval requests.
  • Bridge ↔ Agent is whatever your agent already speaks. OpenClaw uses a local WebSocket. Your custom bridge can shell out to a CLI, call an LLM, drive an MCP server — your call.
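
As a rough illustration of the Cloud ↔ Bridge layer, the sketch below turns one inbound session.message frame into the description of a reply POST. Only the event name, the /v1/bridge/sendMessage path, and the session_id, interaction_id and idempotency_key keys come from this page. The frame envelope (a top-level type field), the text field, and both function names are assumptions made for the example; the RFC defines the real shapes.

```typescript
// Illustrative shapes only: "type" and "text" are assumed field names, not from the RFC.
interface SessionMessage {
  session_id: string;
  interaction_id: string;
  text: string;
}

interface ReplyRequest {
  method: "POST";
  path: string;
  body: { session_id: string; text: string; idempotency_key: string };
}

// Returns the parsed message, or null for any frame this sketch does not handle.
function parseSessionMessage(raw: string): SessionMessage | null {
  let frame: unknown;
  try {
    frame = JSON.parse(raw);
  } catch {
    return null; // not JSON
  }
  if (typeof frame !== "object" || frame === null) return null;
  const f = frame as Record<string, unknown>;
  if (f.type !== "session.message") return null;
  if (typeof f.session_id !== "string" || typeof f.interaction_id !== "string") return null;
  return {
    session_id: f.session_id,
    interaction_id: f.interaction_id,
    text: typeof f.text === "string" ? f.text : "",
  };
}

// Describes the REST call a bridge might make once its agent has produced a reply.
function buildReply(msg: SessionMessage, replyText: string): ReplyRequest {
  return {
    method: "POST",
    path: "/v1/bridge/sendMessage",
    body: {
      session_id: msg.session_id,
      text: replyText,
      // Deriving the key from the interaction means a retried POST reuses the same key.
      idempotency_key: `reply-${msg.interaction_id}`,
    },
  };
}
```

The agent call itself is left out on purpose: whatever sits on the Bridge ↔ Agent layer produces replyText, and the bridge sends the resulting request with its bridge token.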

What SAP isn't

It's worth being explicit:

  • Not a marketplace. Sophon does not host a directory of agents for users to install. The iOS app is for your agents — the ones you ran the bridge for. (A shared-agent path exists in the protocol RFC under /v1/bot/* with agt_* tokens, but it's deferred indefinitely; nothing in shipping iOS surfaces it.)
  • Not a webhook directory. The bridge holds a WebSocket open; the cloud doesn't POST to a public URL on your machine.
  • Not a hosting platform. Your agent runs wherever you run it. We don't ship code, store secrets, or proxy LLM credentials.

The supported token type

inst_<id>:s_<env>_<secret>

A bridge token. Issued at pairing, scoped to one installation, stored in the bridge's keychain. Every /v1/bridge/* call authenticates with Authorization: Bearer inst_…:s_live_….
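
A bridge might sanity-check its token before using it. The following is a minimal sketch, assuming the id contains no colon and the env segment contains no underscore; the page does not state the allowed character sets, so treat the regular expression as a guess. parseBridgeToken and authHeader are hypothetical helper names.

```typescript
interface BridgeToken {
  installationId: string; // the "inst_<id>" part
  env: string;            // e.g. "live"
  secret: string;
}

// Splits inst_<id>:s_<env>_<secret>. The character classes are assumptions.
function parseBridgeToken(token: string): BridgeToken | null {
  const match = /^(inst_[^:]+):s_([^_:]+)_(.+)$/.exec(token);
  if (!match) return null;
  return { installationId: match[1], env: match[2], secret: match[3] };
}

// Builds the header used on every /v1/bridge/* call.
function authHeader(token: string): Record<string, string> {
  if (parseBridgeToken(token) === null) {
    throw new Error("not a bridge token");
  }
  return { Authorization: `Bearer ${token}` };
}
```

Rejecting a malformed token locally gives a clearer error than waiting for the server to refuse the request.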

Transports at a glance

Channel     Direction         Endpoint
REST        bridge → server   POST /v1/bridge/sendMessage, etc.
WebSocket   server → bridge   wss://api.sophon.at/v1/bridge/ws
REST        iOS → server      POST /v1/me/sessions/:id/send, etc.
SSE         server → iOS      GET /v1/me/stream

REST POSTs are idempotent on idempotency_key; the WS uses ping/pong heartbeats every 30 s; the SSE stream resumes via Last-Event-ID against a 5-minute / 256-event ring buffer.
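
To make the resume behaviour concrete, here is a sketch of how a replay buffer bounded by both count and age could work. It is not the server's implementation. It assumes event ids are sequential integers, which this page does not specify, and ReplayBuffer is a hypothetical name. The 256-event and 5-minute limits are the ones quoted above.

```typescript
interface StoredEvent {
  id: number;   // assumed: monotonically increasing integer ids
  at: number;   // timestamp in milliseconds
  data: string;
}

class ReplayBuffer {
  private events: StoredEvent[] = [];
  private nextId = 1;
  private maxEvents: number;
  private maxAgeMs: number;

  constructor(maxEvents: number = 256, maxAgeMs: number = 5 * 60 * 1000) {
    this.maxEvents = maxEvents;
    this.maxAgeMs = maxAgeMs;
  }

  // Stores an event and returns the id a client would later send as Last-Event-ID.
  push(data: string, now: number = Date.now()): number {
    const id = this.nextId++;
    this.events.push({ id, at: now, data });
    this.evict(now);
    return id;
  }

  // Returns events newer than lastEventId, or null when part of the gap
  // has already been evicted and the client has to refetch state instead.
  replayAfter(lastEventId: number, now: number = Date.now()): StoredEvent[] | null {
    this.evict(now);
    const oldest = this.events.length > 0 ? this.events[0].id : this.nextId;
    if (lastEventId < oldest - 1) return null;
    return this.events.filter((e) => e.id > lastEventId);
  }

  // Drops events beyond the count limit or older than the age limit.
  private evict(now: number): void {
    while (
      this.events.length > this.maxEvents ||
      (this.events.length > 0 && now - this.events[0].at > this.maxAgeMs)
    ) {
      this.events.shift();
    }
  }
}
```

The null return is the interesting case: a client that was offline longer than the window cannot be caught up from the buffer alone, so it would fall back to fetching history over REST.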

Event taxonomy

Every event ultimately reaches iOS via SSE. The names are stable across the surface:

hello                       — connection ack
heartbeat                   — keep-alive every 25 s
message_added               — a new message landed (user or agent)
message_delta               — streaming chunk for an existing message
message_finalized           — final canonical text + usage metadata
session_created             — chat session opened
session_renamed             — title changed
session_updated             — state changed (active ↔ archived)
session_deleted             — soft-deleted (24 h grace)
installation_created        — new bridge paired
installation_revoked        — installation gone
installation_updated        — display name / emoji changed
agent_health_changed        — degraded / healthy
task_created                — tool call started
task_progress               — partial result
task_completed | task_failed | task_cancelled
approval_requested          — HITL: agent wants user permission
approval_resolved           — user decided
device_capability_request   — agent asks the phone for camera / GPS

Same wire shapes, regardless of which connector emitted them.
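
For readers writing their own consumer or test harness, the sketch below parses one SSE frame and checks its name against the list above. It assumes the event name travels in the standard SSE event: field; this page does not say whether SAP does that or puts the name inside the data payload. parseSseFrame and isSapEvent are hypothetical helpers.

```typescript
// Event names copied from the taxonomy on this page.
const SAP_EVENTS = [
  "hello", "heartbeat",
  "message_added", "message_delta", "message_finalized",
  "session_created", "session_renamed", "session_updated", "session_deleted",
  "installation_created", "installation_revoked", "installation_updated",
  "agent_health_changed",
  "task_created", "task_progress", "task_completed", "task_failed", "task_cancelled",
  "approval_requested", "approval_resolved",
  "device_capability_request",
] as const;

type SapEventName = (typeof SAP_EVENTS)[number];

interface SseFrame {
  id?: string;
  event: string;
  data: string;
}

// Parses a single frame (the text between two blank lines) per the SSE field rules.
function parseSseFrame(block: string): SseFrame {
  let id: string | undefined;
  let event = "message"; // SSE default when no event: field is present
  const data: string[] = [];
  for (const line of block.split(/\r?\n/)) {
    if (line === "" || line.startsWith(":")) continue; // blank line or comment
    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
    if (field === "id") id = value;
    else if (field === "event") event = value;
    else if (field === "data") data.push(value);
  }
  return { id, event, data: data.join("\n") };
}

// Type guard for names in the taxonomy.
function isSapEvent(name: string): name is SapEventName {
  return (SAP_EVENTS as readonly string[]).includes(name);
}
```

Keeping the last id seen from each frame is what makes a later Last-Event-ID resume possible.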

Snake_case wire

Every JSON body — REST, SSE data, WS frame — uses snake_case keys (session_id, interaction_id, created_at). The server projects every database row through lib/wire.ts so the iOS-side typed decoders see what they expect.
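
The contents of lib/wire.ts are not shown here, so the following is only a guess at the kind of projection involved: a recursive key rename from camelCase to snake_case. It assumes the source rows use camelCase keys, which this page does not confirm, and toSnakeKey and toWire are hypothetical names.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// "createdAt" -> "created_at"; keys that are already snake_case pass through unchanged.
function toSnakeKey(key: string): string {
  return key.replace(/[A-Z]/g, (c) => `_${c.toLowerCase()}`);
}

// Renames keys at every depth; values are left untouched.
function toWire(value: Json): Json {
  if (Array.isArray(value)) return value.map(toWire);
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const [k, v] of Object.entries(value)) {
      out[toSnakeKey(k)] = toWire(v);
    }
    return out;
  }
  return value;
}
```

Doing the rename in one place means REST bodies, SSE data and WS frames all present the same keys to the typed decoders on iOS.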

Read on

The exhaustive normative spec lives in the SAP RFC. The pages above are the human-readable derivative; the RFC is the source of truth.