Build your first agent
By the end of this page you'll have a Node.js bridge running on your laptop that talks to a real LLM (we use Claude as the example) and streams replies into the Sophon iOS app on your phone. ~80 lines of code, 15 minutes.
This is a tutorial, not a reference. We'll write the actual files step by step. If you want the full SAP wire surface in one place, read Write your own connector — this page's job is to get you to a working agent first.
What you need
- The Sophon iOS app on your phone. Download link coming soon: Download Sophon for iOS.
- Node 22+ on your laptop.
- An Anthropic API key (sk-ant-…). The agent loop is plain HTTP, so swapping in OpenAI / Gemini / a local model is one function — we'll mark the spot.
You don't need to clone the s-chat repo. Everything fits in one new directory.
1. Pair a bridge from your phone
Open Sophon on iPhone, tap Settings → Connect a bridge → Custom Bridge. iOS shows you a 7-letter code like 9FP9SVT and waits.
Important: that code pairs any bridge that POSTs the right shape against /v1/pairing/start — it isn't locked to a particular npm package. We're going to write our own bridge and use that code in step 3.
Leave the iOS screen open. Copy the code; we'll paste it into the terminal in a moment.
2. Scaffold the project
mkdir my-first-agent && cd my-first-agent
npm init -y && npm pkg set type=module
npm install ws @anthropic-ai/sdk
echo "ANTHROPIC_API_KEY=sk-ant-..." > .env
echo ".bridge-token" >> .gitignore
echo ".env" >> .gitignore

Two dependencies: ws for the Sophon WebSocket, and Anthropic's SDK for the LLM. That's it.
3. Claim the pairing code → bridge token
The pairing flow is two HTTP calls: start mints a code, poll
long-polls until the user types it on iOS. Since iOS already
generated the code in step 1, don't call start — just poll
with the code your phone showed you.
Save this as pair.mjs:
// pair.mjs — run once, persist the bot token, never run again.
import { writeFileSync } from 'node:fs'
const SOPHON_BASE = 'https://api.sophon.at'
const code = process.argv[2]?.trim().toUpperCase()
if (!code) {
console.error('usage: node pair.mjs <7-letter-code-from-ios>')
process.exit(2)
}
// /v1/pairing/poll long-polls up to 25 s. Loop until the iOS user
// taps Pair (status: 'claimed') or the code expires (~120 s TTL).
const startedAt = Date.now()
while (Date.now() - startedAt < 120_000) {
const url = new URL(`${SOPHON_BASE}/v1/pairing/poll`)
url.searchParams.set('code', code)
url.searchParams.set('timeout', '25')
const r = await fetch(url)
const body = await r.json()
if (!body.ok) throw new Error(`poll failed: ${JSON.stringify(body)}`)
const { status, bot_token, sophon_ws_url, installation_id } = body.result
if (status === 'pending') continue // 25 s elapsed, loop
if (status === 'expired') throw new Error('code expired — get a fresh one from iOS')
if (status === 'claimed') {
writeFileSync('.bridge-token', bot_token, { mode: 0o600 })
writeFileSync('.bridge-meta.json', JSON.stringify({
installation_id, sophon_ws_url,
}, null, 2))
console.log(`✓ paired — installation ${installation_id}`)
console.log(` token saved to .bridge-token (0600)`)
process.exit(0)
}
throw new Error(`unexpected status: ${status}`)
}
throw new Error('pairing window timed out')

Run it:
node pair.mjs 9FP9SVT   # ← your code from iOS

The script blocks on the first poll. Switch back to your phone, tap Pair. The terminal prints:
✓ paired — installation inst_q9w8e7r6t5y4u3i2
token saved to .bridge-token (0600)
The .bridge-token file holds a string like
inst_q9w8e7r6t5y4u3i2:s_live_AbC123…. Treat it like an SSH key.
You only run pair.mjs once — every subsequent process reads the
file.
4. Open the WebSocket
Save this as bridge.mjs and don't add the LLM yet — first we
just want to see frames arrive.
// bridge.mjs
import { readFileSync } from 'node:fs'
import { WebSocket } from 'ws'
const TOKEN = readFileSync('.bridge-token', 'utf8').trim()
const SOPHON_BASE = 'https://api.sophon.at'
const WS_URL = SOPHON_BASE.replace(/^http/, 'ws') + '/v1/bridge/ws'
const ws = new WebSocket(WS_URL, {
headers: { Authorization: `Bearer ${TOKEN}` },
})
ws.on('open', () => console.log('[ws] open'))
ws.on('close', () => console.log('[ws] close'))
ws.on('error', (e) => console.error('[ws] error', e.message))
ws.on('message', (raw) => {
const frame = JSON.parse(raw.toString())
console.log('[ws] frame:', frame.type)
// Heartbeat: respond to ping within 10 s or the server kicks us.
if (frame.type === 'ping') ws.send(JSON.stringify({ type: 'pong' }))
})

Run it:
node bridge.mjs

You should see:
[ws] open
[ws] frame: ready
ready is the server's handshake — it confirms your token is
valid and binds the connection to your installation_id. Every
~30 s you'll also see a ping (your client replies with
pong automatically). Leave the bridge running.
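As written, bridge.mjs simply exits if the socket drops; the production connector referenced at the end of this page reconnects automatically. Here is a minimal sketch of that pattern. The names connectForever and backoffMs are illustrative (not part of SAP), and WebSocketImpl is injected (pass WebSocket from the ws package) so the helper itself has no dependencies:

```javascript
// Illustrative reconnect wrapper, not part of the tutorial's minimal bridge.
// WebSocketImpl is injected (e.g. `WebSocket` from the ws package).

// Exponential backoff: 1 s, 2 s, 4 s … capped at 30 s.
function backoffMs(attempt) {
  return Math.min(30_000, 1000 * 2 ** attempt)
}

function connectForever({ WebSocketImpl, url, token, onFrame }) {
  let attempt = 0
  const dial = () => {
    const ws = new WebSocketImpl(url, {
      headers: { Authorization: `Bearer ${token}` },
    })
    ws.on('open', () => { attempt = 0 }) // healthy again: reset backoff
    ws.on('message', (raw) => onFrame(JSON.parse(raw.toString()), ws))
    ws.on('error', (e) => console.error('[ws] error', e.message))
    ws.on('close', () => setTimeout(dial, backoffMs(attempt++)))
  }
  dial()
}
```

To adopt it, replace the top-level new WebSocket(WS_URL, …) with connectForever({ WebSocketImpl: WebSocket, url: WS_URL, token: TOKEN, onFrame }) and move the message handler body into onFrame. Because the server replays unack'd updates, a drop mid-run loses nothing you hadn't already ack'd.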
5. Listen for session.message
The interesting frame is update with update.type === 'session.message' — that's the user typing on their phone.
Replace the ws.on('message', …) handler with this:
ws.on('message', async (raw) => {
const frame = JSON.parse(raw.toString())
if (frame.type === 'ready') return
if (frame.type === 'ping') return ws.send(JSON.stringify({ type: 'pong' }))
if (frame.type !== 'update') return
const update = frame.update
// ACK on receipt — the server replays unack'd updates every ~5 s,
// and your LLM call may take 30 s+. ACKing now means a slow LLM
// never causes duplicate runs. (Trade: if we crash mid-stream the
// partial reply is what the user sees.)
ws.send(JSON.stringify({ type: 'ack', up_to_update_id: update.update_id }))
if (update.type !== 'session.message') return
const { session, message, interaction_id } = update.payload
console.log(`[msg] session=${session.id} text=${JSON.stringify(message.text)}`)
// (next step: wire to the LLM)
})

Test it: in iOS, tap the compose pencil → pick Custom Bridge from the chip strip → type "hello". You'll see in your terminal:
[ws] frame: update
[msg] session=sess_… text="hello"
The bubble in iOS will sit there forever — we haven't replied yet. That's step 6.
6. Stream a reply from Claude
This is the agent loop: open a placeholder bubble in iOS, stream
tokens from Anthropic into it, finalise. Three POSTs:
sendMessage (open the bubble), sendMessageDelta (per token
batch), sendMessageEnd (lock it).
Add to the top of bridge.mjs:
import Anthropic from '@anthropic-ai/sdk'
import { randomUUID } from 'node:crypto'
const anthropic = new Anthropic() // reads ANTHROPIC_API_KEY from env
// POST /v1/bridge/* helper. Every call carries the bridge token.
async function sap(path, body) {
const r = await fetch(`${SOPHON_BASE}${path}`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${TOKEN}`,
'Content-Type': 'application/json',
},
body: JSON.stringify(body),
})
const data = await r.json()
if (!r.ok || !data.ok) throw new Error(`${path} → ${r.status} ${JSON.stringify(data)}`)
return data.result
}

Then add handleUserMessage:
async function handleUserMessage({ sessionId, interactionId, text }) {
// 1. Open a placeholder agent bubble. iOS shows "Thinking…" the
// instant this returns, so the user isn't staring at silence
// while Anthropic warms up.
const { message_id } = await sap('/v1/bridge/sendMessage', {
session_id: sessionId,
interaction_id: interactionId,
text: ' ', // server requires non-empty
idempotency_key: randomUUID(),
})
// 2. Stream from Anthropic. We batch tokens into ~80 ms windows
// so we send fewer, fatter delta POSTs (each one round-trips
// through Sophon Cloud → iOS — too small a batch and the
// network overhead dominates).
let accumulated = ''
let pending = ''
let lastFlush = Date.now()
const flush = async () => {
if (!pending) return
const delta = pending
pending = ''
lastFlush = Date.now()
await sap('/v1/bridge/sendMessageDelta', {
message_id,
delta,
idempotency_key: randomUUID(),
})
}
const stream = await anthropic.messages.stream({
model: 'claude-sonnet-4-5',
max_tokens: 1024,
messages: [{ role: 'user', content: text }],
})
for await (const event of stream) {
if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
pending += event.delta.text
accumulated += event.delta.text
if (pending.length >= 32 || Date.now() - lastFlush >= 80) await flush()
}
}
await flush() // tail tokens
// 3. Finalise. After this iOS locks the bubble and the chat
// list shows the snippet. `text` is the canonical full body
// (deltas are advisory; end is authoritative).
const finalMsg = await stream.finalMessage()
await sap('/v1/bridge/sendMessageEnd', {
message_id,
text: accumulated,
usage: {
input_tokens: finalMsg.usage.input_tokens,
output_tokens: finalMsg.usage.output_tokens,
model: 'claude-sonnet-4-5',
provider: 'anthropic',
},
finish_reason: 'stop',
idempotency_key: randomUUID(),
})
}

Wire it into the WS handler — replace the console.log('[msg]…') line with:
handleUserMessage({
sessionId: session.id,
interactionId: interaction_id,
text: message.text,
}).catch((err) => console.error('[handler]', err))

Restart the bridge. Note that new Anthropic() reads ANTHROPIC_API_KEY from the environment, and plain node bridge.mjs won't load your .env file; run node --env-file=.env bridge.mjs instead (the flag is built into Node 22, so there's no dotenv package to install).
Swapping the LLM. The only Anthropic-specific code is the anthropic.messages.stream(…) call and the usage row. To use OpenAI you swap the import, call their chat.completions with stream: true, and emit event.choices[0].delta.content as your delta. Everything else — the three SAP POSTs, the batching — is identical.
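To make the swap concrete, here is a hedged sketch. extractDelta and pumpDeltas are hypothetical helpers, not part of the tutorial: they factor the batching loop out so it can drive any async-iterable of chunks, and extractDelta follows the openai-node streamed chunk shape. The SAP POSTs stay exactly as written above.

```javascript
// Hypothetical helpers for the OpenAI swap (names are ours, not SAP's).

// Pull the text delta (if any) out of one OpenAI streamed chunk.
function extractDelta(chunk) {
  return chunk.choices?.[0]?.delta?.content ?? ''
}

// Drive an async-iterable of chunks into a flush callback, batching
// like the Anthropic loop (32 chars or 80 ms). Returns the full text.
async function pumpDeltas(stream, flush) {
  let accumulated = ''
  let pending = ''
  let lastFlush = Date.now()
  for await (const chunk of stream) {
    const delta = extractDelta(chunk)
    if (!delta) continue
    pending += delta
    accumulated += delta
    if (pending.length >= 32 || Date.now() - lastFlush >= 80) {
      await flush(pending)
      pending = ''
      lastFlush = Date.now()
    }
  }
  if (pending) await flush(pending) // tail tokens
  return accumulated
}

// Usage sketch (assumes `import OpenAI from 'openai'` and OPENAI_API_KEY):
//   const stream = await openai.chat.completions.create({
//     model: 'gpt-4o', stream: true,
//     messages: [{ role: 'user', content: text }],
//   })
//   const accumulated = await pumpDeltas(stream, (delta) =>
//     sap('/v1/bridge/sendMessageDelta', {
//       message_id, delta, idempotency_key: randomUUID(),
//     }))
```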
7. Send the user a message
Switch to iOS. The chat with Custom Bridge is still open from step 5. Type "summarise the plot of Macbeth in three lines".
What you'll see, in order:
- Your bubble appears immediately.
- A "Thinking…" placeholder for the agent (that's the empty bubble from step 6.1).
- Tokens stream in, batch by batch — each delta POST in your terminal is one append to the bubble.
- The bubble locks. The chat list shows the snippet.
Background the iOS app, force-quit it, reopen — the chat is still there, fully reconstructed. That's the SSE resume layer doing its job; you got it for free.
8. (bonus) Add a tool
Tool calls render as cards in the chat — taps drill into args +
result. Let's give the agent one tool: current_time.
The flow on the wire is createTask → finishTask (use
updateTask for live progress). The Anthropic SDK's tool_use
stop reason tells you when the model wants to call your tool.
Add a tool definition + replace the streaming loop:
const tools = [{
name: 'current_time',
description: 'Returns the current ISO-8601 timestamp in UTC.',
input_schema: { type: 'object', properties: {}, required: [] },
}]
async function runWithTools({ sessionId, interactionId, message_id, userText }) {
const messages = [{ role: 'user', content: userText }]
let accumulated = ''
// Loop: call Claude → if it wants a tool, run it, append the
// result, call again. Most agent runs need 1-2 iterations.
while (true) {
const stream = await anthropic.messages.stream({
model: 'claude-sonnet-4-5',
max_tokens: 1024,
tools,
messages,
})
// Stream text deltas as before.
for await (const event of stream) {
if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
accumulated += event.delta.text
await sap('/v1/bridge/sendMessageDelta', {
message_id, delta: event.delta.text, idempotency_key: randomUUID(),
})
}
}
const final = await stream.finalMessage()
messages.push({ role: 'assistant', content: final.content })
if (final.stop_reason !== 'tool_use') return { accumulated, final }
// Claude wants to call tool(s). Emit task_* events for each
// and accumulate results to feed back in the next turn.
const toolResults = []
for (const block of final.content) {
if (block.type !== 'tool_use') continue
// task_id IS the idempotency key — same shape twice → same card.
await sap('/v1/bridge/createTask', {
session_id: sessionId,
interaction_id: interactionId,
task_id: block.id,
kind: block.name,
status_label: block.name,
args: block.input,
})
let result, ok = true
try {
if (block.name === 'current_time') result = { iso: new Date().toISOString() }
else throw new Error(`unknown tool: ${block.name}`)
} catch (err) { ok = false; result = String(err) }
await sap('/v1/bridge/finishTask', {
session_id: sessionId,
interaction_id: interactionId,
task_id: block.id,
status: ok ? 'completed' : 'failed',
...(ok ? { result } : { error: result }),
})
toolResults.push({
type: 'tool_result',
tool_use_id: block.id,
content: JSON.stringify(result),
})
}
messages.push({ role: 'user', content: toolResults })
}
}

Then, in handleUserMessage, replace the streaming block (steps 2 and 3) with:
const { accumulated, final } = await runWithTools({
sessionId, interactionId, message_id, userText: text,
})
await sap('/v1/bridge/sendMessageEnd', {
message_id,
text: accumulated,
usage: {
input_tokens: final.usage.input_tokens,
output_tokens: final.usage.output_tokens,
model: 'claude-sonnet-4-5', provider: 'anthropic',
},
finish_reason: 'stop',
idempotency_key: randomUUID(),
})

Restart the bridge. In iOS, send "what time is it?" — the agent decides to call current_time, a tool card slots into the bubble mid-stream showing current_time → { iso: "2026-…" }, and the reply continues underneath using the result.
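One wire call the example skips is updateTask, which this page names for live progress but never shows. The endpoint is real; the body fields below are guesses patterned on createTask and finishTask, so verify them against the connector reference before relying on them. sap is passed in so the helper can be exercised without a network:

```javascript
// Hedged sketch: report live progress on a long-running tool between
// createTask and finishTask. Field names other than task_id are
// assumptions modelled on createTask — check the connector reference.
async function reportProgress(sap, { sessionId, interactionId, taskId, label }) {
  return sap('/v1/bridge/updateTask', {
    session_id: sessionId,         // assumption: same envelope as createTask
    interaction_id: interactionId,
    task_id: taskId,               // task_id doubles as the idempotency key
    status_label: label,           // e.g. 'fetching…', '3/10 files'
  })
}
```

A slow tool would call it periodically, e.g. await reportProgress(sap, { sessionId, interactionId, taskId: block.id, label: 'searching…' }).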
What just happened
iPhone Sophon Cloud your laptop
│ │ │
│ user types ──────────▶│ session.message ───▶│ ws frame
│ │ │ sendMessage →
│ │ │ (bubble id)
│ │ ◀───────────────────│ sendMessageDelta × N
│ tokens render ◀───────│ │ (Anthropic stream)
│ │ ◀───────────────────│ createTask
│ tool card ◀───────────│ │ (Claude tool_use)
│ │ ◀───────────────────│ finishTask
│ │ ◀───────────────────│ sendMessageEnd
│ bubble locks ◀────────│ │
You've written a full agent in under 100 lines of JavaScript: WS in, REST out, idempotency keys on every write, ack-on-receipt semantics, streaming tokens, a tool card. The same code shape scales — replace the LLM with Claude Code, GPT, a local Ollama, or your own pipeline, and add tools as you need them.
What's next
- Add an approval before destructive tools — your agent pauses, iOS surfaces an action sheet, you resume on the user's decision. Wire shape: Tool calls & approvals.
- Read Idempotency & resume — the retry semantics behind the idempotency_key field. Worth knowing before you put a bridge on a flaky network.
- Skim Write your own connector — the same surface as a reference, with every field documented.
- Read the production reference at s-chat-cloud/connectors/openclaw-bridge/ (~600 LOC). Same wire shape, plus reconnect / backpressure / attachments / approvals — the bits we glossed over above.