Agent UI widgets
An agent on the canvas no longer has to render the built-in chat pane. Its
owner can pick any widget — a custom chat, a voice UI, a dashboard — to be
the agent's UI. That widget connects to the agent over the host.agents
SDK surface: a permission-gated handle that speaks multimodal UAMP, reads chat
history + non-secret config, and can drive (and be driven by) the agent.
This guide covers the host.get / host.agents factory, the AgentHandle
surface, the connection grant + capability query, and the trust model.
Phase 1 is owner-only: a custom UI resolves only when the viewer is the agent's owner (a non-owner / public visitor always gets the built-in chat). Public exposure is a separately-reviewed later phase.
Resolving an agent handle
host.get(kind, id) is a generic, permission-gated resource-handle factory.
host.agents.get(id) is typed sugar for kind: 'agent':
await host.ready();
// Which agents is this widget connected to? `list()` reflects ONLY the
// granted set (never an enumeration of every agent).
const [agentRef] = await host.agents.list(); // [{ kind:'agent', id }]
const agent = await host.agents.get(agentRef.id);
When your widget is installed as an agent's UI, the host pre-grants its
agent, so host.agents.list() returns it. You can also read the id from your
widget config (initialKv) if the installer seeded it.
Both host.get(kind, id) and every handle op reject, with a permission error,
any (kind, id) the widget was not granted. The underlying server routes also
authorize the viewer independently, so the grant is a convenience gate, not the
security boundary.
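The gating behaviour described above can be modelled in a few lines. This is a minimal in-memory sketch, not the host SDK: makeGatedFactory and ResourceRef are hypothetical names, the real calls are asynchronous, and the real check is backed by server-side authorization rather than a local array.

```typescript
// Hypothetical model of a grant-gated handle factory (not the real host SDK).
type ResourceRef = { kind: string; id: string };

function makeGatedFactory(granted: ResourceRef[]) {
  const isGranted = (kind: string, id: string): boolean =>
    granted.some((r) => r.kind === kind && r.id === id);
  return {
    // Reflects only the granted set, never every resource that exists.
    list(kind: string): ResourceRef[] {
      return granted.filter((r) => r.kind === kind);
    },
    // Rejects any (kind, id) pair that was not granted.
    get(kind: string, id: string): ResourceRef {
      if (!isGranted(kind, id)) {
        throw new Error('permission denied: ' + kind + ':' + id);
      }
      return { kind, id };
    },
  };
}
```

A widget written against this model should treat a permission error from get() as a normal outcome, for example by falling back to an empty state, rather than as a crash.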
The AgentHandle surface
interface AgentHandle {
  id: string;
  kind: 'agent';
  // Realtime, multimodal UAMP bus (cookie-authed canonical endpoint).
  uamp: {
    send(event): Promise<{ ok: true }>; // raw UAMP event(s)
    on(cb): () => void; // streamed response.delta events
    turn(content, { chatId?, onDelta?, onDone? }): Promise<{ ok: true }>;
  };
  // Chat history (reads); a turn is sent via `uamp`/`chats.send`.
  chats: {
    list(): Promise<…>; // the agent's chats
    get(chatId, { limit?, before? }): Promise<…>; // message history
    send(chatId, content): Promise<{ ok: true }>;
    subscribe(chatId, cb): () => void; // ambient peer msgs (Phase 1b)
  };
  // Non-secret config: read, + a NARROW owner-only write.
  config: { get(): Promise<…>; update(patch): Promise<…> };
  // Modality + execution: drives UI auto-config (e.g. STT/TTS choice).
  capabilities(): Promise<{ mode, input, output, execution }>;
  // Convenience subscriptions.
  onDelta(cb): () => void; // streamed response.delta
  onMessage(cb): () => void; // message.created / message.updated
  onPresent(cb): () => void; // agent `present` payloads (from the delta stream)
}
Converse over UAMP
// One text turn, streaming the reply.
await agent.uamp.turn('Summarize my last chat', {
  onDelta: (delta) => appendToken(delta?.text ?? ''),
  onDone: () => markComplete(),
});
turn(content) builds a UAMP session.create → input.* → response.create and
streams the server's response.delta events back. Pass an object with a UAMP
type (e.g. { type: 'input.audio', audio, format }) for non-text modalities.
The transport is the canonical, cookie-authed POST /api/agents/:id/uamp — the
same protocol A2A and the NLI skill speak. The bridge holds the connection on
your behalf; the sandbox never sees credentials.
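To make the event sequence concrete, here is a rough sketch of how a turn might be expanded into UAMP events. Only the type names session.create, input.*, response.create, and input.audio come from this guide. The helper buildTurnEvents, the input.text type for plain strings, and every field other than type are assumptions for illustration, not the SDK's actual wire format.

```typescript
// Assumed event shape: a `type` string plus arbitrary payload fields.
type UampEvent = { type: string; [field: string]: unknown };

// Hypothetical helper: expand one turn into the three-step sequence.
function buildTurnEvents(content: string | UampEvent, chatId?: string): UampEvent[] {
  // A plain string becomes a text input (assumed type name 'input.text');
  // an object that already carries a UAMP `type` passes through unchanged.
  const input: UampEvent =
    typeof content === 'string' ? { type: 'input.text', text: content } : content;
  return [
    { type: 'session.create', chatId: chatId },
    input,
    { type: 'response.create' },
  ];
}
```

Under these assumptions, buildTurnEvents({ type: 'input.audio', audio, format }) would place the audio event between session.create and response.create, mirroring the non-text case described above.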
Render a custom chat
const { chats } = await agent.chats.list();
const { messages } = await agent.chats.get(chats[0].id, { limit: 50 });
renderHistory(messages);
agent.onDelta((ev) => streamIntoBubble(ev.delta));
agent.onPresent((p) => renderPresented(p)); // agent showed a widget/card
await agent.chats.send(chats[0].id, 'hello');
Auto-configure from capabilities
capabilities() is the UAMP capability handshake: host.agents sends a
capabilities.query over the agent's UAMP channel, and the agent replies with a
unified, non-secret capabilities object (its model's real modalities, its
identity, and its voice). It is neither a bespoke endpoint nor a guess:
const caps = await agent.capabilities();
// caps.input / caps.output : ['text','audio',…] ← what the agent accepts / emits
// caps.execution : 'cloud' | 'realtime' | 'local'
// caps.avatarUrl / caps.displayName : agent identity (render the avatar / name)
// caps.voiceId? : a default voice
if (caps.input.includes('audio') && caps.execution === 'realtime') {
  streamAudioToAgent(); // provider does STT+TTS end-to-end
} else if (caps.input.includes('audio')) {
  sendAudioPerTurn(); // multimodal agent transcribes server-side
} else {
  sttThenSendText(); // text agent → on-device STT
}
if (!caps.output.includes('audio')) connectTts(); // text out → speak with a TTS model
The widget picks its pipeline from what the agent actually does. (Server: the
agent UAMP route answers capabilities.query via buildUampAgentCapabilities()
— resolveAgent(id).getCapabilities() + identity + configured voice; the legacy
voice-config is only a fallback.)
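The same decision can be factored into a pure function, which keeps it easy to unit-test without a live agent. This is one possible sketch: pickPipeline, the Pipeline type, and its string labels are hypothetical names, and Caps covers only the fields the branching needs.

```typescript
// Subset of the capabilities reply used for pipeline selection.
type Caps = {
  input: string[];
  output: string[];
  execution: 'cloud' | 'realtime' | 'local';
};

// Hypothetical result type: which input path to use, and whether TTS is needed.
type Pipeline = {
  input: 'stream-audio' | 'audio-per-turn' | 'stt-then-text';
  needsTts: boolean;
};

function pickPipeline(caps: Caps): Pipeline {
  const acceptsAudio = caps.input.indexOf('audio') !== -1;
  const emitsAudio = caps.output.indexOf('audio') !== -1;
  let input: Pipeline['input'];
  if (acceptsAudio && caps.execution === 'realtime') {
    input = 'stream-audio'; // provider handles speech end-to-end
  } else if (acceptsAudio) {
    input = 'audio-per-turn'; // agent transcribes server-side
  } else {
    input = 'stt-then-text'; // text-only agent, transcribe on device
  }
  // Text-only output means the widget must speak the reply itself.
  return { input, needsTts: !emitsAudio };
}
```

Keeping the result as data rather than calling the pipeline functions directly lets the widget log or display the chosen mode before wiring anything up.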
Edit non-secret config (owner-only)
// Allowed: ui (→ the agent's custom-UI choice), greetingMessage, suggestedActions.
await agent.config.update({ greetingMessage: 'Hey there 👋' });
config.update is a narrow allowlist, enforced server-side. It deliberately
cannot change the agent's model, instructions, enabledTools, talkTo, or
pricing — "non-secret" is not the same as "safe to let an in-page widget
repoint the agent's brain". Those stay owner-edited through the normal settings.
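For intuition, an allowlist check of this kind might look like the sketch below. The three allowed keys come from this guide; validateConfigPatch is a hypothetical name, and rejecting the whole patch (rather than silently dropping disallowed keys) is an assumption about the server's behaviour.

```typescript
// Keys this guide lists as writable through config.update.
const ALLOWED_CONFIG_KEYS = ['ui', 'greetingMessage', 'suggestedActions'];

// Hypothetical validator: accept the patch only if every key is allowlisted.
function validateConfigPatch(patch: Record<string, unknown>): Record<string, unknown> {
  const rejected = Object.keys(patch).filter(
    (key) => ALLOWED_CONFIG_KEYS.indexOf(key) === -1,
  );
  if (rejected.length > 0) {
    throw new Error('config.update: keys not allowed: ' + rejected.join(', '));
  }
  return patch;
}
```

Running the same check in the widget before calling config.update can give faster feedback, but only the server-side enforcement counts.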
The agent can drive your widget too
The channel is symmetric. Beyond the UAMP response stream (which already lets
the agent decide what you render), the agent can call named commands your
widget declares via host.commands.handle(...) — reusing the existing
WidgetCommandBus. Declare a command interface in your widget registry entry
and the connected agent can invoke it (e.g. setTab, highlight) and call it
as a tool. (Client-declared, LLM-native tools — WebMCP-shaped — are a Phase 1b
addition.)
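The command pattern can be pictured as a small name-to-handler registry. The sketch below is a stand-in, not the WidgetCommandBus: CommandRegistry, its handle/invoke signatures, and the argument shape are all assumptions, since this guide does not spell out the arguments of host.commands.handle(...).

```typescript
type CommandHandler = (args: Record<string, unknown>) => unknown;

// Hypothetical stand-in for a widget-side command bus.
class CommandRegistry {
  // Null-prototype map so names like 'toString' are not found by accident.
  private handlers: { [name: string]: CommandHandler } = Object.create(null);

  // Widget side: declare a named command.
  handle(name: string, fn: CommandHandler): void {
    this.handlers[name] = fn;
  }

  // Agent side: invoke a declared command by name.
  invoke(name: string, args: Record<string, unknown>): unknown {
    const fn = this.handlers[name];
    if (!fn) throw new Error('unknown command: ' + name);
    return fn(args);
  }
}

// Example: a widget exposing a setTab command.
const commands = new CommandRegistry();
commands.handle('setTab', (args) => ({ ok: true, tab: String(args['tab']) }));
```

In this model an agent calling an undeclared command gets an error instead of reaching arbitrary widget code, which is the point of declaring the command interface up front.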
Trust model (how access is gated)
- Owner-only (Phase 1). A custom UI resolves only for the agent's owner.
- Server-side authorization is the boundary. Every host.agents op runs as the authenticated viewer over a same-origin route that authorizes ownership / chat-participation. Host-side grants (state.connectedResources) are a UX / defence-in-depth filter, never the sole gate.
- Grants are written server-side at install. A sandboxed widget cannot author its own grant.
- Secrets never cross. config.get returns non-secret fields only; the agent's provider keys + server tools stay server-side (ADR-v3-12).
- The sandbox is the credential boundary. The iframe runs allow-scripts on an opaque origin; the bridge holds all cookies/tokens.
Installing a widget as an agent's UI
The choice lives on the agent config at metadata.ui = { widgetType?, url?, initialKv? }.
Set it any of three ways:
- the "Agent UI" picker in the agent's settings (select a registry widget or a custom URL);
- programmatically (owner) via agent.config.update({ ui: { widgetType: 'voice' } });
- a config-editing agent through the factory update_agent tool's ui field.
Clear it (ui: null, or pick "Default" in the picker) to restore the built-in
chat. The custom UI renders for the owner whenever metadata.ui is set;
featureFlags.customUI === false is an explicit per-agent kill-switch (default on).
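Putting the rules together, the choice between the custom UI and the built-in chat might be sketched as follows. resolveAgentUi and the AgentConfig type are hypothetical, and placing featureFlags next to metadata on the same config object is an assumption; the three conditions themselves (owner viewer, metadata.ui set, kill-switch not false) come from this guide.

```typescript
type UiChoice = { widgetType?: string; url?: string; initialKv?: Record<string, unknown> };

// Assumed config shape: only the fields relevant to UI resolution.
type AgentConfig = {
  metadata?: { ui?: UiChoice | null };
  featureFlags?: { customUI?: boolean };
};

// Hypothetical resolver: returns the custom UI choice or the built-in fallback.
function resolveAgentUi(config: AgentConfig, viewerIsOwner: boolean): UiChoice | 'built-in-chat' {
  const ui = config.metadata ? config.metadata.ui : undefined;
  // The kill-switch only fires on an explicit false; unset means enabled.
  const killed = config.featureFlags !== undefined && config.featureFlags.customUI === false;
  if (!viewerIsOwner || killed || !ui) return 'built-in-chat';
  return ui;
}
```

Every failing condition falls through to the built-in chat, so a misconfigured or cleared ui field degrades to the default rather than to a blank pane.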