Streaming model
A single agent turn is one interaction, identified by an
interaction_id. Every event SAP fires during that turn carries
the same id, so the iOS client can route deltas to the right
bubble.
Event order in a normal text-only turn
1. POST /v1/me/sessions/:id/send → returns { interaction_id, message_id }
2. message_added (role=user, interaction_id, …) ← server echo
3. message_added (role=agent, text=" ", interaction_id) ← placeholder bubble
4. message_delta × N (interaction_id, delta="…")
5. message_finalized (interaction_id, text=full, usage)
Step 3 is the placeholder bubble: your bridge POSTs it via
/v1/bridge/sendMessage so iOS can show "Thinking…" with the
right bubble shape immediately. Without it the user stares at a
blank chat for the whole LLM round-trip.
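The delta routing described above can be sketched in TypeScript. The event and bubble shapes here are illustrative assumptions, not the exact SAP wire format:

```typescript
// Minimal sketch of routing the step 2–5 events to chat bubbles,
// keyed by interaction_id. Shapes are assumptions for illustration.
type StreamEvent =
  | { type: "message_added"; interaction_id: string; role: "user" | "agent"; text: string }
  | { type: "message_delta"; interaction_id: string; delta: string }
  | { type: "message_finalized"; interaction_id: string; text: string };

class BubbleRouter {
  // interaction_id → accumulated agent text for that bubble
  private bubbles = new Map<string, string>();

  handle(ev: StreamEvent): void {
    switch (ev.type) {
      case "message_added":
        // Step 3: the agent placeholder bubble arrives before any deltas.
        if (ev.role === "agent") this.bubbles.set(ev.interaction_id, ev.text);
        break;
      case "message_delta":
        // Step 4: append each delta to the bubble with the same id.
        this.bubbles.set(
          ev.interaction_id,
          (this.bubbles.get(ev.interaction_id) ?? "") + ev.delta,
        );
        break;
      case "message_finalized":
        // Step 5: the finalized full text replaces the accumulated deltas.
        this.bubbles.set(ev.interaction_id, ev.text);
        break;
    }
  }

  text(interactionId: string): string | undefined {
    return this.bubbles.get(interactionId);
  }
}
```

Because every event carries the interaction_id, interleaved turns in the same session cannot cross-contaminate each other's bubbles.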
Tool calls inside a turn
Tool invocations slot between steps 4 and 5 (or alongside deltas, if the model streams while the tool runs):
…
4a. task_created (task_id, kind=exec, status_label="ls -la")
4b. task_progress (task_id, partial_result?)
4c. task_completed | task_failed | task_cancelled
5. message_finalized
iOS coalesces consecutive task_* events for the same interaction
into a single ToolGroupView ("Ran 3 commands") collapsible row.
Tap to drill into per-tool args and result. The same data also
surfaces inline as AssistantSegment.toolCall /
AssistantSegment.toolResult on the agent bubble, so a user
opening the chat next month can still see what happened.
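The coalescing rule ("consecutive task_* events for the same interaction collapse into one row") can be sketched as a small grouping pass; the names here (TaskEvent, coalesce) are hypothetical, not SAP API:

```typescript
// Sketch of the ToolGroupView coalescing rule: consecutive task_*
// events for one interaction fold into a single "Ran N commands" row.
interface TaskEvent {
  interaction_id: string;
  task_id: string;
  status_label?: string;
}

// Walk the flat event stream, starting a new group whenever the
// interaction changes; count distinct task_ids per group.
function coalesce(events: TaskEvent[]): { interaction_id: string; count: number }[] {
  const groups: { interaction_id: string; ids: Set<string> }[] = [];
  for (const ev of events) {
    const last = groups[groups.length - 1];
    if (last && last.interaction_id === ev.interaction_id) {
      last.ids.add(ev.task_id);
    } else {
      groups.push({ interaction_id: ev.interaction_id, ids: new Set([ev.task_id]) });
    }
  }
  return groups.map(g => ({ interaction_id: g.interaction_id, count: g.ids.size }));
}
```

Counting distinct task_ids (rather than raw events) means the task_created / task_progress / task_completed triple for one command still counts as one command.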
Approvals (HITL)
If the agent needs user permission mid-turn it pauses and emits
an approval_requested:
4a. task_created (task_id, kind=exec)
4b. approval_requested (approval_id, command, severity, …)
↓ user taps Allow / Allow always / Deny in iOS
4c. POST /v1/me/approvals/:id { decision }
4d. approval_resolved (approval_id, decision)
4e. task_completed | task_failed
The bridge listens for approval.resolved on the bus and unblocks
the agent. See Tool calls & approvals
for the full HITL contract.
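Step 4c above is a plain POST. A hedged sketch, where the endpoint path comes from the event list but the auth header, the exact decision strings, and the injectable fetch (added for testability) are assumptions:

```typescript
// Sketch of resolving an approval (step 4c). Decision values mirror
// the iOS buttons; the real wire values may differ.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number }>;

async function resolveApproval(
  baseUrl: string,
  approvalId: string,
  decision: "allow" | "allow_always" | "deny",
  token: string,
  fetchImpl: FetchLike = fetch as unknown as FetchLike,
): Promise<void> {
  const res = await fetchImpl(`${baseUrl}/v1/me/approvals/${approvalId}`, {
    method: "POST",
    headers: { "Content-Type": "application/json", Authorization: `Bearer ${token}` },
    body: JSON.stringify({ decision }),
  });
  if (!res.ok) throw new Error(`approval resolve failed: ${res.status}`);
}
```

The approval_resolved event (step 4d) then comes back over SSE, so the client never has to guess whether the decision landed.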
Resume on reconnect
The server keeps a 5-minute / 256-event ring buffer per user.
When the iOS app reconnects, it sends Last-Event-ID automatically;
the server replays everything past that id, then continues live.
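The server side of this can be pictured as a bounded ring of id-stamped events. A minimal sketch, keeping only the 256-event count bound (the 5-minute time bound is omitted for brevity); class and method names are assumptions:

```typescript
// Sketch of the per-user replay ring: events get a monotonically
// increasing id; reconnects replay everything past Last-Event-ID.
class ReplayBuffer {
  private events: { id: number; data: string }[] = [];
  private nextId = 1;

  constructor(private capacity = 256) {}

  push(data: string): number {
    const id = this.nextId++;
    this.events.push({ id, data });
    // Drop the oldest event once the ring is full.
    if (this.events.length > this.capacity) this.events.shift();
    return id;
  }

  // Everything after lastEventId; empty when the gap outlived the
  // buffer (the client then falls back to durable history + snapshot).
  replayAfter(lastEventId: number): { id: number; data: string }[] {
    return this.events.filter(e => e.id > lastEventId);
  }
}
```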
For gaps older than 5 minutes — the user backgrounded the app for
half an hour, the LTE connection died on a train — sessions and
messages are already durable (/v1/me + /v1/me/sessions/:id/messages
on cold launch). What the ring buffer drops is live ephemeral
state: the "may I run this command?" approval that fired while
the app was killed. The cold-launch snapshot endpoint plugs
that gap:
GET /v1/me/snapshot
→ { ts, pending_approvals: [...] }
iOS calls this in refreshSnapshot() between refreshMe() and the
SSE attach. Each entry folds through the same handler the live
approval_requested event uses, so re-emits dedupe by
approval_id. See the Idempotency & resume
page for the exact response shape.
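The "folds through the same handler, dedupes by approval_id" rule can be sketched as a single idempotent entry point; class and field names here are illustrative:

```typescript
// Sketch of the dedupe rule: snapshot entries and live
// approval_requested events share one handler, keyed by approval_id,
// so a snapshot re-emit of an already-seen approval is a no-op.
class ApprovalInbox {
  private seen = new Set<string>();
  pending: { approval_id: string; command: string }[] = [];

  // Returns true if this approval was new, false on a dedup hit.
  handleApprovalRequested(a: { approval_id: string; command: string }): boolean {
    if (this.seen.has(a.approval_id)) return false;
    this.seen.add(a.approval_id);
    this.pending.push(a);
    return true;
  }
}
```

Because both paths converge here, the ordering of refreshSnapshot() versus the SSE attach stops mattering: whichever source delivers the approval first wins, and the other is silently dropped.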
Pending-run resume on the iOS side
When you kill the app mid-stream, iOS persists (session_id, bubble_id, run_id) to disk. On cold launch tryResumePendingRuns() walks each record:
- Run id present — adapter.resumeRun(sessionId, runId) opens a fresh continuation. A watchdog polls GET /v1/me/sessions/:id/messages every 3 s for up to 24 s; if the agent reply with that interaction_id is already in the DB we resolve as if SSE delivered.
- Run id absent — user killed the app inside the /sessions/:id/send round-trip. We just reload history; the server-stored turn surfaces on the next loadHistory().
Either way the user reopens the chat and sees what the agent was doing, even if the SSE stream missed every event in the gap.
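The watchdog loop in the first bullet amounts to "poll history every 3 s, up to 8 attempts, until the reply with our interaction_id appears". A sketch, where the function name, injected fetcher, and injected sleep are assumptions (the latter two just make it testable):

```typescript
// Sketch of the resume watchdog: 8 attempts × 3 s ≈ the 24 s window.
async function watchForReply(
  interactionId: string,
  fetchMessages: () => Promise<{ interaction_id: string }[]>,
  sleep: (ms: number) => Promise<void> = ms => new Promise(r => setTimeout(r, ms)),
  attempts = 8,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    const msgs = await fetchMessages();
    // Resolve as soon as the server-stored reply shows up in history.
    if (msgs.some(m => m.interaction_id === interactionId)) return true;
    if (i < attempts - 1) await sleep(3000);
  }
  return false; // give up; the turn surfaces on the next loadHistory()
}
```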
Backpressure
Bridges should serialise their writes back to Sophon, so a
fast burst of deltas doesn't race the placeholder-bubble
create. A simple in-flight queue per (session_id, interaction_id)
is enough — the OpenClaw bridge in
connectors/openclaw-bridge/src/sophon.ts does exactly this.
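A minimal sketch of such an in-flight queue, keyed by (session_id, interaction_id) so each turn's writes run strictly in order while different turns stay concurrent (the class name is illustrative, not the bridge's actual code):

```typescript
// Per-key serial queue: each (session_id, interaction_id) gets its own
// promise chain, so a burst of deltas cannot overtake the
// placeholder-bubble create that was enqueued first.
class SerialQueue {
  private tails = new Map<string, Promise<unknown>>();

  enqueue<T>(sessionId: string, interactionId: string, job: () => Promise<T>): Promise<T> {
    const key = `${sessionId}:${interactionId}`;
    const tail = this.tails.get(key) ?? Promise.resolve();
    // Run the job after the previous one settles, success or failure.
    const next = tail.then(job, job);
    // Keep the chain alive even if this job rejects.
    this.tails.set(key, next.catch(() => {}));
    return next;
  }
}
```

The caller still sees each job's own result or rejection; only the ordering is serialized.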
If you push deltas faster than the rate-limit bucket allows
(delta bucket: 200 capacity, 100/s refill), you'll start
getting 429 rate_limited with Retry-After. See
Errors & rate limits for the bucket table.
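Honoring Retry-After on a 429 can be folded into the same send path. A sketch under stated assumptions: the sender and sleep are injected for testability, and the response shape (status plus a parsed retry delay) is hypothetical, not SAP's exact error envelope:

```typescript
// Sketch of backing off on 429 rate_limited: retry after the delay the
// server asked for, up to a small retry budget.
async function sendWithRetry(
  send: () => Promise<{ status: number; retryAfterMs?: number }>,
  sleep: (ms: number) => Promise<void>,
  maxRetries = 3,
): Promise<number> {
  for (let i = 0; ; i++) {
    const res = await send();
    if (res.status !== 429 || i >= maxRetries) return res.status;
    // Wait as long as Retry-After says; fall back to 1 s if absent.
    await sleep(res.retryAfterMs ?? 1000);
  }
}
```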