Build your first agent
By the end of this page you'll have a Node.js bridge running on your laptop that talks to a real LLM (we use Claude as the example) and streams replies into the Sophon iOS app on your phone. ~80 lines of code, 15 minutes.
This is a tutorial, not a reference. We'll write the actual files step by step. If you want the full SAP wire surface in one place, read Write your own connector — this page's job is to get you to a working agent first.
What you need
- The Sophon iOS app on your phone. Download link coming soon: Download Sophon for iOS.
- Node 22+ on your laptop.
- An Anthropic API key (sk-ant-…). The agent loop is plain HTTP, so swapping in OpenAI / Gemini / a local model is one function — we'll mark the spot.
You don't need to clone the s-chat repo. Everything fits in one new directory.
1. Pair a bridge from your phone
Open Sophon on iPhone, tap Settings → Connect a bridge → Custom Bridge. iOS shows you a 7-letter code like 9FP9SVT and waits.
Important: that code pairs any bridge that POSTs the right shape against /v1/pairing/start — it isn't locked to a particular npm package. We're going to write our own bridge and use that code in step 3.
Leave the iOS screen open. Copy the code; we'll paste it into the terminal in a moment.
2. Scaffold the project
mkdir my-first-agent && cd my-first-agent
npm init -y && npm pkg set type=module
npm install ws @anthropic-ai/sdk
echo "ANTHROPIC_API_KEY=sk-ant-..." > .env
echo ".bridge-token" >> .gitignore
echo ".env" >> .gitignore

Two dependencies: ws for the Sophon WebSocket, and Anthropic's SDK for the LLM. That's it.
3. Claim the pairing code → bridge token
The pairing flow is two HTTP calls: start mints a code, poll
long-polls until the user types it on iOS. Since iOS already
generated the code in step 1, don't call start — just poll
with the code your phone showed you.
Save this as pair.mjs:
// pair.mjs — run once, persist the bot token, never run again.
import { writeFileSync } from 'node:fs'
const SOPHON_BASE = 'https://api.sophon.at'
const code = process.argv[2]?.trim().toUpperCase()
if (!code) {
console.error('usage: node pair.mjs <7-letter-code-from-ios>')
process.exit(2)
}
// /v1/pairing/poll long-polls up to 25 s. Loop until the iOS user
// taps Pair (status: 'claimed') or the code expires (~120 s TTL).
const startedAt = Date.now()
while (Date.now() - startedAt < 120_000) {
const url = new URL(`${SOPHON_BASE}/v1/pairing/poll`)
url.searchParams.set('code', code)
url.searchParams.set('timeout', '25')
const r = await fetch(url)
const body = await r.json()
if (!body.ok) throw new Error(`poll failed: ${JSON.stringify(body)}`)
const { status, bot_token, sophon_ws_url, installation_id } = body.result
if (status === 'pending') continue // 25 s elapsed, loop
if (status === 'expired') throw new Error('code expired — get a fresh one from iOS')
if (status === 'claimed') {
writeFileSync('.bridge-token', bot_token, { mode: 0o600 })
writeFileSync('.bridge-meta.json', JSON.stringify({
installation_id, sophon_ws_url,
}, null, 2))
console.log(`✓ paired — installation ${installation_id}`)
console.log(` token saved to .bridge-token (0600)`)
process.exit(0)
}
throw new Error(`unexpected status: ${status}`)
}
throw new Error('pairing window timed out')

Run it:
node pair.mjs 9FP9SVT   # ← your code from iOS

The script blocks on the first poll. Switch back to your phone, tap Pair. The terminal prints:
✓ paired — installation inst_q9w8e7r6t5y4u3i2
token saved to .bridge-token (0600)
The .bridge-token file holds a string like
inst_q9w8e7r6t5y4u3i2:s_live_AbC123…. Treat it like an SSH key.
You only run pair.mjs once — every subsequent process reads the
file.
4. Open the WebSocket
Save this as bridge.mjs and don't add the LLM yet — first we
just want to see frames arrive.
// bridge.mjs
import { readFileSync } from 'node:fs'
import { WebSocket } from 'ws'
const TOKEN = readFileSync('.bridge-token', 'utf8').trim()
const SOPHON_BASE = 'https://api.sophon.at'
const WS_URL = SOPHON_BASE.replace(/^http/, 'ws') + '/v1/bridge/ws'
const ws = new WebSocket(WS_URL, {
headers: { Authorization: `Bearer ${TOKEN}` },
})
ws.on('open', () => console.log('[ws] open'))
ws.on('close', () => console.log('[ws] close'))
ws.on('error', (e) => console.error('[ws] error', e.message))
ws.on('message', (raw) => {
const frame = JSON.parse(raw.toString())
console.log('[ws] frame:', frame.type)
// Heartbeat: respond to ping within 10 s or the server kicks us.
if (frame.type === 'ping') ws.send(JSON.stringify({ type: 'pong' }))
})

Run it:
node bridge.mjs

You should see:
[ws] open
[ws] frame: ready
ready is the server's handshake — it confirms your token is
valid and binds the connection to your installation_id. Every
~30 s you'll also see a ping (your client replies with
pong automatically). Leave the bridge running.
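As written, bridge.mjs simply exits if the socket drops; the production connector referenced at the end of this page reconnects automatically. Here is a minimal sketch of that pattern. The names connectForever and backoffMs are illustrative (not part of SAP), and WebSocketImpl is injected (pass WebSocket from the ws package) so the helper itself has no dependencies:

```javascript
// Illustrative reconnect wrapper, not part of the tutorial's minimal bridge.
// WebSocketImpl is injected (e.g. `WebSocket` from the ws package).

// Exponential backoff: 1 s, 2 s, 4 s … capped at 30 s.
function backoffMs(attempt) {
  return Math.min(30_000, 1000 * 2 ** attempt)
}

function connectForever({ WebSocketImpl, url, token, onFrame }) {
  let attempt = 0
  const dial = () => {
    const ws = new WebSocketImpl(url, {
      headers: { Authorization: `Bearer ${token}` },
    })
    ws.on('open', () => { attempt = 0 }) // healthy again: reset backoff
    ws.on('message', (raw) => onFrame(JSON.parse(raw.toString()), ws))
    ws.on('error', (e) => console.error('[ws] error', e.message))
    ws.on('close', () => setTimeout(dial, backoffMs(attempt++)))
  }
  dial()
}
```

To adopt it, replace the top-level new WebSocket(WS_URL, …) with connectForever({ WebSocketImpl: WebSocket, url: WS_URL, token: TOKEN, onFrame }) and move the message handler body into onFrame. Because the server replays unack'd updates, a drop mid-run loses nothing you hadn't already ack'd.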
5. Listen for session.message
The interesting frame is update with update.type === 'session.message' — that's the user typing on their phone.
Replace the ws.on('message', …) handler with this:
ws.on('message', async (raw) => {
const frame = JSON.parse(raw.toString())
if (frame.type === 'ready') return
if (frame.type === 'ping') return ws.send(JSON.stringify({ type: 'pong' }))
if (frame.type !== 'update') return
const update = frame.update
// ACK on receipt — the server replays unack'd updates every ~5 s,
// and your LLM call may take 30 s+. ACKing now means a slow LLM
// never causes duplicate runs. (Trade: if we crash mid-stream the
// partial reply is what the user sees.)
ws.send(JSON.stringify({ type: 'ack', up_to_update_id: update.update_id }))
if (update.type !== 'session.message') return
const { session, message, interaction_id } = update.payload
console.log(`[msg] session=${session.id} text=${JSON.stringify(message.text)}`)
// (next step: wire to the LLM)
})

Test it: in iOS, tap the compose pencil → pick Custom Bridge from the chip strip → type "hello". You'll see in your terminal:
[ws] frame: update
[msg] session=sess_… text="hello"
The bubble in iOS will sit there forever — we haven't replied yet. That's step 6.
6. Stream a reply from Claude
This is the agent loop: open a placeholder bubble in iOS, stream
tokens from Anthropic into it, finalise. Three POSTs:
sendMessage (open the bubble), sendMessageDelta (per token
batch), sendMessageEnd (lock it).
Add to the top of bridge.mjs:
import Anthropic from '@anthropic-ai/sdk'
import { randomUUID } from 'node:crypto'
const anthropic = new Anthropic() // reads ANTHROPIC_API_KEY from env
// POST /v1/bridge/* helper. Every call carries the bridge token.
async function sap(path, body) {
const r = await fetch(`${SOPHON_BASE}${path}`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${TOKEN}`,
'Content-Type': 'application/json',
},
body: JSON.stringify(body),
})
const data = await r.json()
if (!r.ok || !data.ok) throw new Error(`${path} → ${r.status} ${JSON.stringify(data)}`)
return data.result
}

Then add handleUserMessage:
async function handleUserMessage({ sessionId, interactionId, text }) {
// 1. Open a placeholder agent bubble. iOS shows "Thinking…" the
// instant this returns, so the user isn't staring at silence
// while Anthropic warms up.
const { message_id } = await sap('/v1/bridge/sendMessage', {
session_id: sessionId,
interaction_id: interactionId,
text: ' ', // server requires non-empty
idempotency_key: randomUUID(),
})
// 2. Stream from Anthropic. We batch tokens into ~80 ms windows
// so we send fewer, fatter delta POSTs (each one round-trips
// through Sophon Cloud → iOS — too small a batch and the
// network overhead dominates).
let accumulated = ''
let pending = ''
let lastFlush = Date.now()
const flush = async () => {
if (!pending) return
const delta = pending
pending = ''
lastFlush = Date.now()
await sap('/v1/bridge/sendMessageDelta', {
message_id,
delta,
idempotency_key: randomUUID(),
})
}
const stream = await anthropic.messages.stream({
model: 'claude-sonnet-4-5',
max_tokens: 1024,
messages: [{ role: 'user', content: text }],
})
for await (const event of stream) {
if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
pending += event.delta.text
accumulated += event.delta.text
if (pending.length >= 32 || Date.now() - lastFlush >= 80) await flush()
}
}
await flush() // tail tokens
// 3. Finalise. After this iOS locks the bubble and the chat
// list shows the snippet. `text` is the canonical full body
// (deltas are advisory; end is authoritative).
const finalMsg = await stream.finalMessage()
await sap('/v1/bridge/sendMessageEnd', {
message_id,
text: accumulated,
usage: {
input_tokens: finalMsg.usage.input_tokens,
output_tokens: finalMsg.usage.output_tokens,
model: 'claude-sonnet-4-5',
provider: 'anthropic',
},
finish_reason: 'stop',
idempotency_key: randomUUID(),
})
}

Wire it into the WS handler — replace the console.log('[msg]…') line with:
handleUserMessage({
sessionId: session.id,
interactionId: interaction_id,
text: message.text,
}).catch((err) => console.error('[handler]', err))

Restart the bridge. Note that new Anthropic() reads ANTHROPIC_API_KEY from the environment, and plain node bridge.mjs won't load your .env file; run node --env-file=.env bridge.mjs instead (the flag is built into Node 22, so there's no dotenv package to install).
Swapping the LLM. The only Anthropic-specific code is the anthropic.messages.stream(…) call and the usage row. To use OpenAI you swap the import, call their chat.completions with stream: true, and emit event.choices[0].delta.content as your delta. Everything else — the three SAP POSTs, the batching — is identical.
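To make the swap concrete, here is a hedged sketch. extractDelta and pumpDeltas are hypothetical helpers, not part of the tutorial: they factor the batching loop out so it can drive any async-iterable of chunks, and extractDelta follows the openai-node streamed chunk shape. The SAP POSTs stay exactly as written above.

```javascript
// Hypothetical helpers for the OpenAI swap (names are ours, not SAP's).

// Pull the text delta (if any) out of one OpenAI streamed chunk.
function extractDelta(chunk) {
  return chunk.choices?.[0]?.delta?.content ?? ''
}

// Drive an async-iterable of chunks into a flush callback, batching
// like the Anthropic loop (32 chars or 80 ms). Returns the full text.
async function pumpDeltas(stream, flush) {
  let accumulated = ''
  let pending = ''
  let lastFlush = Date.now()
  for await (const chunk of stream) {
    const delta = extractDelta(chunk)
    if (!delta) continue
    pending += delta
    accumulated += delta
    if (pending.length >= 32 || Date.now() - lastFlush >= 80) {
      await flush(pending)
      pending = ''
      lastFlush = Date.now()
    }
  }
  if (pending) await flush(pending) // tail tokens
  return accumulated
}

// Usage sketch (assumes `import OpenAI from 'openai'` and OPENAI_API_KEY):
//   const stream = await openai.chat.completions.create({
//     model: 'gpt-4o', stream: true,
//     messages: [{ role: 'user', content: text }],
//   })
//   const accumulated = await pumpDeltas(stream, (delta) =>
//     sap('/v1/bridge/sendMessageDelta', {
//       message_id, delta, idempotency_key: randomUUID(),
//     }))
```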
7. Send the user a message
Switch to iOS. The chat with Custom Bridge is still open from step 5. Type "summarise the plot of Macbeth in three lines".
What you'll see, in order:
- Your bubble appears immediately.
- A "Thinking…" placeholder for the agent (that's the empty bubble from step 6.1).
- Tokens stream in, batch by batch — each delta POST in your terminal is one append to the bubble.
- The bubble locks. The chat list shows the snippet.
Background the iOS app, force-quit it, reopen — the chat is still there, fully reconstructed. That's the SSE resume layer doing its job; you got it for free.
8. (bonus) Add a tool
Tool calls render as cards in the chat — taps drill into args +
result. Let's give the agent one tool: current_time.
The flow on the wire is createTask → finishTask (use
updateTask for live progress). The Anthropic SDK's tool_use
stop reason tells you when the model wants to call your tool.
Add a tool definition + replace the streaming loop:
const tools = [{
name: 'current_time',
description: 'Returns the current ISO-8601 timestamp in UTC.',
input_schema: { type: 'object', properties: {}, required: [] },
}]
async function runWithTools({ sessionId, interactionId, message_id, userText }) {
const messages = [{ role: 'user', content: userText }]
let accumulated = ''
// Loop: call Claude → if it wants a tool, run it, append the
// result, call again. Most agent runs need 1-2 iterations.
while (true) {
const stream = await anthropic.messages.stream({
model: 'claude-sonnet-4-5',
max_tokens: 1024,
tools,
messages,
})
// Stream text deltas as before.
for await (const event of stream) {
if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
accumulated += event.delta.text
await sap('/v1/bridge/sendMessageDelta', {
message_id, delta: event.delta.text, idempotency_key: randomUUID(),
})
}
}
const final = await stream.finalMessage()
messages.push({ role: 'assistant', content: final.content })
if (final.stop_reason !== 'tool_use') return { accumulated, final }
// Claude wants to call tool(s). Emit task_* events for each
// and accumulate results to feed back in the next turn.
const toolResults = []
for (const block of final.content) {
if (block.type !== 'tool_use') continue
// task_id IS the idempotency key — same shape twice → same card.
await sap('/v1/bridge/createTask', {
session_id: sessionId,
interaction_id: interactionId,
task_id: block.id,
kind: block.name,
status_label: block.name,
args: block.input,
})
let result, ok = true
try {
if (block.name === 'current_time') result = { iso: new Date().toISOString() }
else throw new Error(`unknown tool: ${block.name}`)
} catch (err) { ok = false; result = String(err) }
await sap('/v1/bridge/finishTask', {
session_id: sessionId,
interaction_id: interactionId,
task_id: block.id,
status: ok ? 'completed' : 'failed',
...(ok ? { result } : { error: result }),
})
toolResults.push({
type: 'tool_result',
tool_use_id: block.id,
content: JSON.stringify(result),
})
}
messages.push({ role: 'user', content: toolResults })
}
}

Then, in handleUserMessage, replace the streaming block (steps 2 and 3) with:
const { accumulated, final } = await runWithTools({
sessionId, interactionId, message_id, userText: text,
})
await sap('/v1/bridge/sendMessageEnd', {
message_id,
text: accumulated,
usage: {
input_tokens: final.usage.input_tokens,
output_tokens: final.usage.output_tokens,
model: 'claude-sonnet-4-5', provider: 'anthropic',
},
finish_reason: 'stop',
idempotency_key: randomUUID(),
})

Restart the bridge. In iOS, send "what time is it?" — the agent decides to call current_time, a tool card slots into the bubble mid-stream showing current_time → { iso: "2026-…" }, and the reply continues underneath using the result.
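One wire call the example skips is updateTask, which this page names for live progress but never shows. The endpoint is real; the body fields below are guesses patterned on createTask and finishTask, so verify them against the connector reference before relying on them. sap is passed in so the helper can be exercised without a network:

```javascript
// Hedged sketch: report live progress on a long-running tool between
// createTask and finishTask. Field names other than task_id are
// assumptions modelled on createTask — check the connector reference.
async function reportProgress(sap, { sessionId, interactionId, taskId, label }) {
  return sap('/v1/bridge/updateTask', {
    session_id: sessionId,         // assumption: same envelope as createTask
    interaction_id: interactionId,
    task_id: taskId,               // task_id doubles as the idempotency key
    status_label: label,           // e.g. 'fetching…', '3/10 files'
  })
}
```

A slow tool would call it periodically, e.g. await reportProgress(sap, { sessionId, interactionId, taskId: block.id, label: 'searching…' }).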
What just happened
iPhone Sophon Cloud your laptop
│ │ │
│ user types ──────────▶│ session.message ───▶│ ws frame
│ │ │ sendMessage →
│ │ │ (bubble id)
│ │ ◀───────────────────│ sendMessageDelta × N
│ tokens render ◀───────│ │ (Anthropic stream)
│ │ ◀───────────────────│ createTask
│ tool card ◀───────────│ │ (Claude tool_use)
│ │ ◀───────────────────│ finishTask
│ │ ◀───────────────────│ sendMessageEnd
│ bubble locks ◀────────│ │
You've written a full agent in under 100 lines of JavaScript: WS in, REST out, idempotency keys on every write, ack-on-receipt semantics, streaming tokens, a tool card. The same code shape scales — replace the LLM with Claude Code, GPT, a local Ollama, or your own pipeline, and add tools as you need them.
What's next
- Add an approval before destructive tools — your agent pauses, iOS surfaces an action sheet, you resume on the user's decision. Wire shape: Tool calls & approvals.
- Read Idempotency & resume — the retry semantics behind the idempotency_key field. Worth knowing before you put a bridge on a flaky network.
- Skim Write your own connector — the same surface as a reference, with every field documented.
- Read the production reference at s-chat-cloud/connectors/openclaw-bridge/ (~600 LOC). Same wire shape, plus reconnect / backpressure / attachments / approvals — the bits we glossed over above.