Authoring collaborative widgets with host.collab
A community-author's guide to building multiplayer widgets that live on a Robutler canvas. Companion docs: host-collab.md (SDK reference) and multiplayer-tictactoe-walkthrough.md (tutorial). All three sit on top of the v3 Plan 1 platform.
This guide walks through the eight things you need to know to ship
a real multiplayer widget against Robutler's host.collab primitive.
Two reference widgets ship in the box, multiplayer-tictactoe and
multi-party-rtc, and the walkthroughs in this guide refer back to
those files line by line. If you're building anything that two or
more people poke at simultaneously, start by copy-pasting one of those.
1. Anatomy of a collab widget
A Robutler widget is a single HTML file that lives at
public/widgets/<your-widget-id>/index.html. The portal serves it
inside a sandboxed iframe (CSP and Permissions-Policy applied per
the registry entry). There is no bundler, no framework, and no
node_modules: every dependency you need is either inlined or pulled at
runtime from a CDN listed in your widget's CSP carve-out.
A collab widget has three structural pieces:
<!doctype html>
<html>
<head>
<meta name="robutler:widget" content='{"name":"my-widget", ...}' />
<script src="/widgets/sdk.v2.js"></script>
<style>/* inline CSS — external sheets blocked by base CSP */</style>
</head>
<body>
<div id="root"><!-- your UI --></div>
<script type="module">
// 1) wait for the bridge
await window.host.ready();
// 2) mint a collab JWT for the workspace room
const tok = await window.host.collab.getToken(
'workspace',
window.host.workspace.workspaceId,
);
// 3) load Yjs + Hocuspocus from the CDN whitelisted in CSP
const Y = await import('https://cdn.jsdelivr.net/npm/yjs@13.6.30/+esm');
const { HocuspocusProvider } = await import(
'https://cdn.jsdelivr.net/npm/@hocuspocus/provider@3.1.4/+esm'
);
// 4) join the room
const ydoc = new Y.Doc();
const provider = new HocuspocusProvider({
url: tok.wsUrl,
name: tok.roomId,
token: tok.token,
document: ydoc,
});
// 5) … do real work …
</script>
</body>
</html>

That's the entire skeleton. The same five steps appear in every collab widget; the rest of this guide is about the data-modeling and correctness decisions you make on top of that skeleton.
Why dynamic ESM import? Widgets have no build step, so there is no
bundle to put dependencies in. Yjs and Hocuspocus are loaded directly
from cdn.jsdelivr.net (whitelisted in the per-widget CSP carve-out),
so you don't have to ship them yourself.
Why pure HTML? The widget bundle is a static asset served by the portal's Next.js host. No build step, no transpile, no chance of "my deploy broke production". A community-published widget is literally one HTML file you upload.
Failure modes you should handle
- host.collab.getToken rejects with 503 if the collab pod is not enabled in the current environment (kill-switch, or Plan 1 hasn't shipped yet). Show a friendly notice; don't crash.
- The dynamic import can fail (CDN outage, offline). Same advice.
- provider.on('synced', …) may fire seconds late on cold rooms. Render a "Connecting…" state until then.
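
The three failure modes above can be folded into two small helpers. This is a minimal sketch, not the reference widget's code: the helper names noticeFor and nextUiState are hypothetical, and it assumes the getToken rejection exposes a numeric status field (the guide only says it "rejects with 503", so check host-collab.md for the real error shape).

```javascript
// Hypothetical helper: map a join failure to a user-facing notice.
// ASSUMPTION: a getToken rejection carries `status === 503`.
function noticeFor(stage, err) {
  if (stage === 'token' && err && err.status === 503) {
    return 'Collaboration is not available in this environment.';
  }
  if (stage === 'import') {
    return 'Could not load collaboration libraries. Check your connection and retry.';
  }
  return 'Could not connect. Please retry.';
}

// Hypothetical helper: which UI state to render. The widget starts in
// 'connecting' and only leaves it on `synced` or on a failure.
function nextUiState(current, event) {
  if (event === 'synced') return 'ready';
  if (event === 'failed') return 'error';
  return current; // any other event leaves the state unchanged
}
```

In a widget you would wrap steps 2 and 3 of the skeleton in try/catch, pass the caught error to noticeFor, and render the "Connecting…" state until nextUiState returns 'ready'.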
2. Picking your Yjs data structure
host.collab rooms expose a Y.Doc. You allocate top-level shared
types on that doc — getMap('foo'), getArray('bar'), getText('baz'),
or nest subdocs. Picking the right one is the single most consequential
decision you make.
| Shape | When to use | CRDT semantics |
|---|---|---|
| Y.Map<K, V> | Bounded set of keys with last-write-wins per key: game state, settings, slot assignments, tile occupants. | Concurrent writes to different keys merge; concurrent writes to the same key converge on a single winner that Yjs picks deterministically on every peer (don't rely on which peer wins). |
| Y.Array<T> | Append-mostly logs: chat history, drawing strokes, event timeline. | Inserts at the same index by different peers preserve all inserts, in the same deterministic order on every peer. Deletes are tombstoned. |
| Y.Text | Rich-text or code editor buffers. | Character-level sequence CRDT (not operational transform); ideal for prose. |
| Subdoc (new Y.Doc() placed in a parent Map) | Sharded large collections: one subdoc per item with lazy load. | Each subdoc is independently persisted; the parent map just references them. |
Eventual consistency: what it means in practice
All Yjs operations are eventually consistent. That has three practical implications you should bake into your widget's UX:
- No global ordering. "Player X moved before Player Y" is true only on the peer that observed both writes. Don't build logic that requires a total order — use additive game state where "move sequence" is irrelevant (each cell click is independent).
- Last write per key. If two peers write game.set('nextPlayer', 'X') concurrently, one wins. The loser's local UI flickers for one tick. This is fine for game state because both writes were trying to express the same thing.
- No transactions across peers. Yjs transactions (ydoc.transact(...)) bundle writes for observation atomicity within one peer: observers see one consistent snapshot. They do NOT prevent another peer from writing in between. For genuine cross-peer serialization you need a different primitive (an agent skill call, for example).
3. Awareness vs persisted state
Yjs has two completely different stores per room:
| | ydoc (persisted) | awareness (ephemeral) |
|---|---|---|
| Survives all peers leaving? | Yes | No |
| Survives one peer reloading? | Yes | No (their state vanishes; comes back blank) |
| Latency | Sub-second (CRDT update batch) | Sub-second (broadcast) |
| Quota | Plan 1 hard cap on doc bytes | Plan 1 alert at 16KB p99 per write |
| Right answer for | Game state, document content, settings | Cursors, "typing now", selection highlight, WebRTC signaling |
Heuristic: if losing the data on a reload would matter, put it
in ydoc. If losing it on reload is expected (cursor goes away),
put it in awareness.
Reserved awareness namespaces
Plan 1 server-side enforcement reserves these awareness keys for specific producers. Writing them from outside the reserved producer is rejected by Hocuspocus:
- user.*: populated by Plan 1 from the JWT; you read this.
- presence.*: your widget can write this (cursor, hover, selection).
- comment.*: reserved for the canvas comment widget.
- webrtc.*: reserved for the multi-party RTC pattern (§5).
Use presence.* for everything that doesn't have a more specific
reserved namespace. The reference tic-tac-toe widget puts cursor +
hover + slot-claim intent under presence.* — all three are
ephemeral and tied to a specific peer's intent.
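
Because several pieces of state share the presence object, each write should merge into the previous value rather than replace it. A minimal sketch, assuming a hypothetical helper name mergePresence and hypothetical field names (cursor, hover); awareness.getLocalState() can return null before anything has been set, which the helper tolerates.

```javascript
// Hypothetical helper: merge a patch into this peer's `presence` object
// without dropping fields written earlier. `localState` is whatever
// provider.awareness.getLocalState() returned, which may be null.
function mergePresence(localState, patch) {
  const prev = (localState && localState.presence) || {};
  return { ...prev, ...patch };
}

// Usage inside a widget (not executed here):
// provider.awareness.setLocalStateField(
//   'presence',
//   mergePresence(provider.awareness.getLocalState(), { hover: 4 }),
// );
```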
4. Conflict-free patterns
The reference widgets demonstrate three patterns worth memorizing.
Additive ops over destructive ops
Cell clicks in tic-tac-toe are additive — peer A writes
board[3] = 'X', peer B writes board[6] = 'O'. Both succeed. The
final board has both moves. Compare to a hypothetical "rotate the
board 90°" operation: that's a global mutation; two concurrent
rotations would race and produce undefined intermediate states.
Avoid global mutations unless they are documented as destructive (see
the drawing widget's "Clear canvas" button: explicitly destructive,
explicitly documented, and triggered only by a deliberate button
press).
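
One way to keep cell clicks additive is to give every cell its own map key, so concurrent moves on different cells touch different keys and merge. This is a sketch under assumptions: the key names cell:0 through cell:8 and the helper names are hypothetical, and the reference widget may lay out its board differently. The helpers only need has/get/set, so they accept a Y.Map or a plain Map.

```javascript
// Hypothetical key scheme: one map key per board cell.
const cellKey = (i) => `cell:${i}`;

// Write a move only if the cell is still empty on this peer.
// Returns true when the mark was written.
function placeMark(game, i, mark) {
  if (game.has(cellKey(i))) return false;
  game.set(cellKey(i), mark);
  return true;
}

// Rebuild the 9-cell board for rendering; empty cells are null.
function readBoard(game) {
  return Array.from({ length: 9 }, (_, i) => game.get(cellKey(i)) ?? null);
}
```

Two peers clicking the same empty cell at the same moment still fall back to last-write-per-key; the per-cell layout only guarantees that moves on different cells never overwrite each other.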
Idempotent transactions
Inside ydoc.transact(...), write what should be true rather than
what should change:
ydoc.transact(() => {
if (!game.has('board')) game.set('board', new Array(9).fill(null));
if (!game.has('nextPlayer')) game.set('nextPlayer', 'X');
});

Multiple peers running this on first-join converge to the same state. No "first peer initializes, others read" race.
Intent → claim → confirm (slot races)
When two peers race to fill a single slot, write the intent to awareness first, then write the claim to ydoc:
// 1) Stake intent, visible to all peers immediately.
// Read the current presence first so the spread below has something to copy.
const prev = provider.awareness.getLocalState()?.presence || {};
provider.awareness.setLocalStateField('presence', {
  ...prev,
  claimingSlot: 'X',
});
// 2) Wait a frame so a tying peer can see our intent.
setTimeout(() => {
  // 3) Commit; last writer wins via CRDT.
  const cur = game.get('playerSlots') || {};
  if (cur.X) return; // someone already there; fall back to spectator
  ydoc.transact(() => game.set('playerSlots', { ...cur, X: me }));
}, 16);

The intent stage reduces, but does not eliminate, collisions. CRDT last-write-wins is the fallback; the loser sees the slot go to the other peer and re-renders as spectator.
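
The "loser re-renders as spectator" step is easiest to get right if the role is always derived from the converged slot map, never from local intent. A minimal sketch with a hypothetical helper name resolveRole, assuming playerSlots has the { X, O } shape used in the snippet above:

```javascript
// Derive this peer's role from the converged `playerSlots` value.
// `me` is whatever identifier the widget wrote into the slot.
function resolveRole(playerSlots, me) {
  const slots = playerSlots || {};
  for (const mark of ['X', 'O']) {
    if (slots[mark] === me) return mark;
  }
  return 'spectator';
}
```

Calling this from a game.observe(...) callback means the UI corrects itself as soon as the winning write arrives, whichever peer it came from.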
5. Multi-party WebRTC pattern
This is the canonical recipe for any N↔N peer-to-peer media use
case. The reference is
multi-party-rtc/index.html;
copy-paste it whenever you need shared audio, video, or screen
streams.
Mesh topology
Each peer in the room maintains one RTCPeerConnection per other
peer. Each peer's upstream bandwidth and CPU grow as O(N); aggregate
traffic across the room grows as O(N²). The
ceiling is 8 peers (ADR-v3-19); above that, the widget shows
a "Maximum 8 participants" banner and the 9th joiner stays as a
spectator.
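
The arithmetic behind the cap is short enough to write down. A sketch with hypothetical helper names; only the ceiling of 8 comes from this guide.

```javascript
const MAX_PEERS = 8; // ceiling stated above (ADR-v3-19)

// In a full mesh each peer connects to every other peer once.
function meshStats(n) {
  return {
    connectionsPerPeer: Math.max(n - 1, 0),
    totalConnections: (n * (n - 1)) / 2,
  };
}

// A joiner becomes a participant only while the room is below the cap.
function roleForJoiner(currentParticipants) {
  return currentParticipants < MAX_PEERS ? 'participant' : 'spectator';
}

// meshStats(8) → { connectionsPerPeer: 7, totalConnections: 28 }
```

At the cap, every peer encodes and uploads its media seven times, which is why the mesh stops scaling well before the room-wide connection count looks large.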
Signaling via the webrtc.* awareness namespace
The widget never opens its own signaling websocket — it rides the collab room's awareness layer:
// Send an offer to peer B.
// getLocalState() can be null before the first write, hence the ?.
const prev = provider.awareness.getLocalState()?.webrtc || {};
mySeq += 1;
provider.awareness.setLocalStateField('webrtc', {
  ...prev,
  [remotePeerId]: { type: 'offer', sdp, seq: mySeq, from: myClientId },
});

Peer B observes the awareness change, reads peer A's
state.webrtc[String(myClientId)] (keyed by B's own client ID), dedupes
against its own lastSeen[A], and proceeds with the standard
offer/answer dance.
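
The receiving side of that exchange can be sketched as a pure function over the awareness states. The helper name collectNewSignals is hypothetical, and it assumes seq starts at 1 and only increases, as the sender snippet above implies. The states argument has the shape of provider.awareness.getStates(): a Map from client ID to that client's state.

```javascript
// Collect signals addressed to `myClientId` that are newer than the
// last seq already handled from each sender. Updates `lastSeen` in place.
function collectNewSignals(states, myClientId, lastSeen) {
  const fresh = [];
  states.forEach((state, senderId) => {
    if (senderId === myClientId) return; // skip our own state
    const msg = state && state.webrtc && state.webrtc[String(myClientId)];
    if (!msg) return; // nothing addressed to us
    if (msg.seq <= (lastSeen.get(senderId) ?? 0)) return; // already handled
    lastSeen.set(senderId, msg.seq);
    fresh.push({ senderId, ...msg });
  });
  return fresh;
}
```

Awareness change events fire for any field, so the same offer is typically observed many times; the seq check is what keeps it from being applied twice.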
Polite-peer rule
The peer with the larger clientID is "polite" and backs off on glare; the peer with the smaller clientID is impolite and proceeds. This is the W3C perfect-negotiation pattern — no bespoke handshake required.
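
The rule reduces to two small decisions. A sketch with hypothetical helper names; the collision check mirrors the published perfect-negotiation pattern, where makingOffer is a flag the widget sets while it is creating its own offer and signalingState is read from the RTCPeerConnection.

```javascript
// Larger clientID is polite, as described above.
const isPolite = (myClientId, remoteClientId) => myClientId > remoteClientId;

// An incoming offer collides if we are mid-offer ourselves or the
// connection is not in the `stable` state. Only the impolite peer
// ignores a colliding offer; the polite peer rolls back and accepts it.
function shouldIgnoreOffer({ polite, makingOffer, signalingState, type }) {
  const collision =
    type === 'offer' && (makingOffer || signalingState !== 'stable');
  return !polite && collision;
}
```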
TURN config from the JWT
The collab JWT embeds tok.turn = { url, username, credential, expiresAt }.
Build your ICE config in one shot at startup; no separate REST
round-trip needed:
const iceServers = [{ urls: 'stun:stun.l.google.com:19302' }];
if (tok.turn?.url) {
iceServers.push({
urls: tok.turn.url,
username: tok.turn.username,
credential: tok.turn.credential,
});
}
new RTCPeerConnection({ iceServers, bundlePolicy: 'max-bundle' });

Cleanup on peer leave
When a peer's awareness state disappears (they navigated away, lost
network, closed the tab), tear down their RTCPeerConnection:
provider.awareness.on('change', () => {
const seen = new Set();
provider.awareness.getStates().forEach((s, cid) => seen.add(cid));
for (const cid of pcs.keys()) {
if (!seen.has(cid)) tearDownPeer(cid); // close pc, remove tile
}
});

6. Permissions + meta flags
The portal applies a strict Permissions-Policy and CSP to every
widget iframe by default. To unlock browser capabilities your widget
needs, declare them in the <meta name="robutler:widget" ...> block:
| Flag | Unlocks | Use when |
|---|---|---|
| allowMic | Permissions-Policy: microphone on the iframe | You call getUserMedia({ audio: true }) |
| allowCamera | Permissions-Policy: camera | You call getUserMedia({ video: true }) |
| allowScreen | Permissions-Policy: display-capture | You call getDisplayMedia() |
| allowGpu | Permissions-Policy: webgpu + connect-src carve-out for first-party model CDNs | On-device foundation models (seeded widgets only; see anti-patterns) |
Example:
<meta name="robutler:widget" content='{ ..., "permissions": ["allowMic","allowCamera","allowScreen"] }' />

The browser may still prompt the user for explicit consent on first use; that's intentional, not a bug.
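
A quick self-check for the meta block can catch a misspelled flag before you publish. This is an author-side sketch, not the portal's implementation: the helper name requestedFeatures is hypothetical, and the flag-to-feature mapping is copied from the table above.

```javascript
// Flag-to-feature mapping taken from the table above.
const FEATURE_FOR_FLAG = new Map([
  ['allowMic', 'microphone'],
  ['allowCamera', 'camera'],
  ['allowScreen', 'display-capture'],
  ['allowGpu', 'webgpu'],
]);

// Parse the meta `content` JSON and list the browser features it
// requests. Unknown flags are reported rather than silently dropped.
function requestedFeatures(metaContent) {
  const meta = JSON.parse(metaContent);
  const features = [];
  const unknown = [];
  for (const flag of meta.permissions || []) {
    if (FEATURE_FOR_FLAG.has(flag)) features.push(FEATURE_FOR_FLAG.get(flag));
    else unknown.push(flag);
  }
  return { features, unknown };
}
```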
CSP carve-outs
If your widget loads dependencies from a CDN (Yjs, foundation-model
weights, etc.) or talks to a non-portal websocket, declare those in
your registry entry's csp field:
'my-widget': {
kind: 'iframe',
entry: '/widgets/my-widget/index.html',
csp: {
connectSrc: ['https://cdn.jsdelivr.net', 'wss://collab.robutler.local'],
},
},

The composer in lib/workspaces/widget-csp.ts merges your carve-out
with the strict baseline.
7. Testing
Fake Yjs doc for unit tests
Yjs runs identically in node and browser. For per-widget unit tests:
import * as Y from 'yjs';
import { test, expect } from 'vitest';
test('tic-tac-toe slot claim race', () => {
const docA = new Y.Doc();
const docB = new Y.Doc();
const gameA = docA.getMap('game');
const gameB = docB.getMap('game');
// Race: both peers claim X simultaneously.
docA.transact(() => gameA.set('playerSlots', { X: 'A' }));
docB.transact(() => gameB.set('playerSlots', { X: 'B' }));
// Sync.
Y.applyUpdate(docB, Y.encodeStateAsUpdate(docA));
Y.applyUpdate(docA, Y.encodeStateAsUpdate(docB));
// Both peers converge: either A or B wins, and both docs agree on which.
expect(gameA.get('playerSlots')).toEqual(gameB.get('playerSlots'));
});

Multi-peer e2e patterns
Playwright with multiple browser contexts is the right tool for multi-peer e2e:
const a = await browser.newContext();
const b = await browser.newContext();
// newPage() returns a Promise, so resolve both pages before navigating.
const [pageA, pageB] = await Promise.all([a.newPage(), b.newPage()]);
await Promise.all([
  pageA.goto(`/workspace/${ws}#tictactoe`),
  pageB.goto(`/workspace/${ws}#tictactoe`),
]);
// Drive peer A's clicks, assert peer B's board updates.

The reference tests live in tests/e2e/widgets/.
8. Anti-patterns
Specific things not to do:
- Don't store secrets in awareness. Awareness broadcasts to every workspace member. API keys, OAuth tokens, anything you'd put in an env var never goes here. Use host.kv (per-widget-instance, workspace-member-only) or skip storage entirely.
- Don't use awareness for durable state. Awareness vanishes when the last peer leaves. If you want it to survive a reload, it goes in ydoc. Period.
- Don't bypass reserved namespaces. Writing user.*, comment.*, or webrtc.* (outside the multi-party-rtc pattern) is rejected server-side. Use presence.* for ephemeral peer state.
- Don't ship community widgets with allowGpu. The CSP relaxation for GPU/on-device-model CDNs requires the widget's path to be in the WIDGET_ALLOWGPU_SEEDED_PATHS env var, per ADR-v3-07. That's a deliberate first-party-only carve-out. Community widgets get the Permissions-Policy bit (so WebGPU works in principle) but no CDN connect-src, so you can't fetch model weights from outside the whitelisted origins.
- Don't poll provider.awareness.getStates() on a timer. Subscribe to the change event instead. Polling burns CPU and introduces UI jitter.
- Don't run an AnalyserNode per peer at the highest fftSize. 256 is plenty for active-speaker detection; higher values just cost battery on mobile.
- Don't open a separate websocket for signaling. The collab room is already a signaling channel. Reuse it via the reserved webrtc.* namespace (or presence.* for non-RTC signals).
- Don't assume a global clock. Yjs has no global time, only Lamport clocks per peer. If you need "newer wins by wall clock", store Date.now() in the value and compare on the consumer side.
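
The wall-clock suggestion in the last item can be sketched in a few lines. The helper names stamp and newer and the { value, at } shape are hypothetical; any layout that keeps the timestamp next to the value works.

```javascript
// Writer side: wrap a value with the writer's wall-clock time (ms).
const stamp = (value, now = Date.now()) => ({ value, at: now });

// Consumer side: keep whichever stamped entry is newer. Ties keep `a`.
function newer(a, b) {
  if (!a) return b;
  if (!b) return a;
  return b.at > a.at ? b : a;
}
```

Device clocks drift and can be set wrong, so treat this as a display-level tiebreak, not as a correctness guarantee.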
Further reading
- host-collab.md: full SDK reference for the host.collab namespace.
- multiplayer-tictactoe-walkthrough.md: step-by-step build of the tic-tac-toe reference widget.
- /docs/internal/adr/v3-19-multi-party-mesh-rtc.md: design rationale for mesh-vs-SFU-vs-proxy.
- /docs/internal/runbooks/multi-party-rtc-debugging.md: operator playbook for production debugging.