WebSocket Protocol

Full protocol reference for the streaming agent interface, implemented in navi/api/websocket.py.

Connection

ws://host/ws/sessions/{session_id}

The session must exist before connecting (create via POST /sessions). If the session is not found, the WebSocket closes with code 4004.
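As a minimal client-side sketch of the rule above, the helpers below build the connection URL and interpret the close code. The function names are hypothetical, and the `wss` option is an assumption (the reference only shows `ws://`).

```python
from urllib.parse import quote

SESSION_NOT_FOUND = 4004  # close code named in this section


def session_ws_url(host: str, session_id: str, secure: bool = False) -> str:
    """Build the WebSocket URL for a session (hypothetical helper)."""
    scheme = "wss" if secure else "ws"  # wss is an assumption, not from the reference
    # Percent-encode the ID so it cannot break out of its path segment.
    return f"{scheme}://{host}/ws/sessions/{quote(session_id, safe='')}"


def describe_close(code: int) -> str:
    """Turn a WebSocket close code into a message for the user."""
    if code == SESSION_NOT_FOUND:
        return "session not found; create it with POST /sessions first"
    return f"connection closed with code {code}"
```

A client would call `describe_close()` with whatever close code its WebSocket library reports.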

Messages: client → server

{
    "type": "message",
    "content": "user text",
    "images": ["base64string", ...],
    "files": [{"name": "file.pdf", "path": "/abs/path"}]
}

type must be "message"; any other value produces an error frame.
content is required and must be non-empty.
images: optional list of base64-encoded images (data URIs accepted; the data:...;base64, prefix is stripped server-side).
files: optional list of uploaded file references (appended to content as [Uploaded files on disk: ...]).
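The field rules above can be sketched as a small message builder. `build_message` and `strip_data_uri` are hypothetical names. The second function only imitates the prefix stripping that the reference says happens server-side; it is not the server's actual code.

```python
import base64
import json


def build_message(content, images=None, files=None):
    """Serialize a client frame following the field rules above (hypothetical helper)."""
    if not content:
        raise ValueError("content is required and must be non-empty")
    frame = {"type": "message", "content": content}
    if images:
        # Raw bytes are base64-encoded; strings (base64 or data URIs) pass through.
        frame["images"] = [
            base64.b64encode(img).decode("ascii") if isinstance(img, bytes) else img
            for img in images
        ]
    if files:
        # Each file is given as a (name, path) pair.
        frame["files"] = [{"name": name, "path": path} for name, path in files]
    return json.dumps(frame)


def strip_data_uri(image: str) -> str:
    """Drop a leading 'data:...;base64,' prefix if one is present."""
    if image.startswith("data:") and ";base64," in image:
        return image.split(";base64,", 1)[1]
    return image
```

For example, `build_message("summarize this", files=[("file.pdf", "/abs/path")])` yields a frame with `type`, `content`, and `files` set and no `images` key.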

Messages: server → client

All frames are JSON objects with a type field.

Stream lifecycle

Frame	When
`{"type": "stream_start"}`	Before any agent output begins
`{"type": "stream_end", "content": "...", "context_tokens": N, "max_context_tokens": N}`	After final text, before workers
`{"type": "stream_stopped"}`	If the user stopped generation
`{"type": "error", "message": "..."}`	On any unhandled error

Thinking (reasoning)

Frame	When
`{"type": "thinking_delta", "delta": "..."}`	Reasoning chunk during streaming
`{"type": "thinking_end"}`	Reasoning phase complete
`{"type": "turn_thinking", "thinking": "...", "is_subagent": bool}`	Full reasoning block from a tool-calling turn, which uses the non-streaming complete() call

Thinking blocks are collapsible in the UI: open during reasoning, auto-collapsed on thinking_end.
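One way a client might model the open/collapsed behaviour is a small state holder fed with decoded frames. `ThinkingBlock` is a hypothetical name, not the actual UI code, and `turn_thinking` frames are left out of this sketch.

```python
from dataclasses import dataclass


@dataclass
class ThinkingBlock:
    """Client-side state for one reasoning block (hypothetical sketch)."""
    text: str = ""
    collapsed: bool = False

    def handle(self, frame: dict) -> None:
        kind = frame.get("type")
        if kind == "thinking_delta":
            self.text += frame["delta"]
            self.collapsed = False  # stays open while reasoning streams in
        elif kind == "thinking_end":
            self.collapsed = True   # auto-collapse once reasoning is complete
```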

Planning

Frame	When
`{"type": "plan_ready", "plan": "..."}`	Before tool-calling loop if `planning_enabled` and a plan was generated

Rendered as a collapsible plan card in the UI.

Tool calls

Frame	When
`{"type": "tool_started", "tool": "name", "args": {...}, "is_subagent": bool}`	Immediately when a tool call begins (before execution)
`{"type": "tool_call", "tool": "name", "args": {...}, "result": "...", "success": bool, "is_subagent": bool}`	When the tool finishes

is_subagent: true indicates the tool call was made by a nested subagent, not the top-level agent.
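The frames carry no call ID, so a client that wants to show "running" tools has to pair `tool_started` with `tool_call` itself. The sketch below matches on tool name plus `is_subagent` in arrival order. That matching rule is an assumption, and `ToolTracker` is a hypothetical name.

```python
from typing import List


class ToolTracker:
    """Track in-flight and finished tool calls from server frames (sketch)."""

    def __init__(self) -> None:
        self.pending: List[dict] = []
        self.finished: List[dict] = []

    def handle(self, frame: dict) -> None:
        kind = frame.get("type")
        if kind == "tool_started":
            self.pending.append(frame)
        elif kind == "tool_call":
            key = (frame["tool"], frame.get("is_subagent", False))
            # Assumption: the oldest pending call with the same key is the match.
            for i, started in enumerate(self.pending):
                if (started["tool"], started.get("is_subagent", False)) == key:
                    del self.pending[i]
                    break
            self.finished.append(frame)
```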

Text output

Frame	When
`{"type": "stream_delta", "delta": "..."}`	Text chunk of the final response
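A minimal sketch of assembling the final response from raw JSON frames follows. It treats `stream_end`, `stream_stopped`, and `error` as the end of the text for one turn, which is an assumption drawn from the lifecycle table; `collect_response` is a hypothetical name.

```python
import json
from typing import Iterable, List, Optional, Tuple

# Assumption: these frame types mean no further text chunks will arrive.
TERMINAL = {"stream_end", "stream_stopped", "error"}


def collect_response(raw_frames: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Join stream_delta chunks until a terminal frame arrives."""
    chunks: List[str] = []
    for raw in raw_frames:
        frame = json.loads(raw)
        kind = frame.get("type")
        if kind == "stream_start":
            chunks.clear()                 # a new response begins
        elif kind == "stream_delta":
            chunks.append(frame["delta"])
        elif kind in TERMINAL:
            return "".join(chunks), frame  # text so far plus the closing frame
        # Thinking, planning, and tool frames are ignored in this sketch.
    return "".join(chunks), None
```

Returning the closing frame alongside the text lets the caller read `context_tokens` from `stream_end` or the `message` from an `error` frame.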

Other events

Frame	When
`{"type": "context_compressed", "messages_before": N, "messages_after": N}`	After context compression runs
`{"type": "profile_switched", "profile_id": "...", "profile_name": "..."}`	When `switch_profile` tool succeeds

Stopping generation

POST /sessions/{session_id}/stop

This endpoint sets _AgentRun.stop_event. The agent checks the event at three points:

Before each LLM call
During streaming (breaks out, calls aclose() on the generator)
After tool execution
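The checkpoints above can be illustrated with a toy loop. `fake_llm_stream` and `run_turn` are hypothetical stand-ins, not the code in navi/api/websocket.py, and the `stop_after` parameter exists only to simulate a stop request arriving mid-stream.

```python
import asyncio


async def fake_llm_stream():
    """Stand-in for a streaming LLM call (hypothetical)."""
    for chunk in ["a", "b", "c", "d"]:
        await asyncio.sleep(0)
        yield chunk


async def run_turn(stop_event: asyncio.Event, stop_after: int) -> list:
    """Stream one turn, honouring the cooperative stop signal."""
    out = []
    if stop_event.is_set():          # checkpoint: before the LLM call
        return out
    stream = fake_llm_stream()
    try:
        async for chunk in stream:
            out.append(chunk)
            if len(out) == stop_after:
                stop_event.set()     # simulate the stop request arriving
            if stop_event.is_set():  # checkpoint: during streaming
                break
    finally:
        await stream.aclose()        # close the generator explicitly
    # A full loop would check stop_event again after each tool execution.
    return out


async def demo() -> list:
    return await run_turn(asyncio.Event(), stop_after=2)
```

Running `asyncio.run(demo())` stops after the second chunk, because the event is checked between chunks rather than interrupting one.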

The client sends the stop request via fetch() rather than over the WebSocket, so that it does not corrupt the WebSocket receive state.

Response: {"ok": true} if a run was active, {"ok": false, "reason": "no active run"} otherwise.
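For a non-browser client, the same request could be issued with the standard library. This is a sketch: the base URL is an assumption, and `build_stop_request` and `was_stopped` are hypothetical helper names.

```python
import json
import urllib.request


def build_stop_request(base_url: str, session_id: str) -> urllib.request.Request:
    """Prepare the POST /sessions/{session_id}/stop request."""
    url = f"{base_url.rstrip('/')}/sessions/{session_id}/stop"
    return urllib.request.Request(url, method="POST")


def was_stopped(body: bytes) -> bool:
    """Return True if the response reports that an active run was stopped."""
    return bool(json.loads(body).get("ok"))
```

The request would be sent with `urllib.request.urlopen(build_stop_request(base, sid))`, and the response body passed to `was_stopped()`.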

Reconnection

If the client reconnects to an in-progress run (e.g. page reload mid-stream), websocket_session() detects an existing _AgentRun in _runs and subscribes a new queue to it. The client resumes receiving events from that point forward.

Run state management

_runs: dict[str, _AgentRun] — global dict of active runs, keyed by session ID.

_AgentRun holds:

task: asyncio.Task — the running agent task
stop_event: asyncio.Event — cooperative stop signal
subscribers: list[Queue] — one queue per connected WebSocket client

Events are broadcast to all subscribers. When the run finishes, _runs.pop(session_id) is called from the finally block.
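The run registry and broadcast behaviour described here, including a client that subscribes mid-run, can be sketched as follows. `AgentRun`, `runs`, and `agent` are illustrative stand-ins for the real `_AgentRun` and `_runs`, and `asyncio.Queue` is assumed as the queue type.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AgentRun:
    task: Optional[asyncio.Task] = None
    stop_event: asyncio.Event = field(default_factory=asyncio.Event)
    subscribers: List[asyncio.Queue] = field(default_factory=list)

    def subscribe(self) -> asyncio.Queue:
        """Attach a new client queue; it only sees events sent from now on."""
        queue: asyncio.Queue = asyncio.Queue()
        self.subscribers.append(queue)
        return queue

    def broadcast(self, event: dict) -> None:
        for queue in self.subscribers:
            queue.put_nowait(event)


runs: Dict[str, AgentRun] = {}  # active runs keyed by session ID


async def agent(session_id: str, run: AgentRun, events: list) -> None:
    try:
        for event in events:
            run.broadcast(event)
            await asyncio.sleep(0)     # yield so other tasks can run
    finally:
        runs.pop(session_id, None)     # always unregister the finished run


async def demo() -> tuple:
    run = AgentRun()
    runs["s1"] = run
    first = run.subscribe()
    events = [{"type": "stream_start"}, {"type": "stream_delta", "delta": "hi"}]
    run.task = asyncio.create_task(agent("s1", run, events))
    await asyncio.sleep(0)             # the first event goes out here
    late = runs["s1"].subscribe()      # a reconnecting client joins mid-run
    await run.task
    return first.qsize(), late.qsize(), "s1" in runs
```

In this sketch the late subscriber receives only the second event, and the registry entry is gone once the task completes.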