Base URL: http://localhost:8000
GET /health
{ "status": "ok" }
GET /agents/profiles
List of available agent profiles.
[
{
"id": "secretary",
"name": "Personal Secretary",
"description": "General-purpose assistant for research, writing, and everyday tasks.",
"enabled_tools": ["todo", "web_search", "filesystem", "..."],
"llm_backend": "ollama",
"model": "gemma4:26b-a4b-it-q4_K_M"
}
]
GET /agents/tools
All registered tools (built-in + user-defined).
[
{ "name": "web_search", "description": "Search the web using DuckDuckGo." },
{ "name": "filesystem", "description": "Read, write and list files." }
]
A session is a persistent conversation container tied to a profile. It stores the full message history and survives server restarts.
POST /sessions
Create a new session.
Request
{ "profile_id": "secretary" }
Response 201
{
"session_id": "550e8400-e29b-41d4-a716-446655440000",
"profile_id": "secretary",
"created_at": "2026-04-10T18:00:00+00:00"
}
Errors: 404 — profile not found
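A minimal sketch of creating a session from Python's standard library, assuming the server accepts a JSON body with a `Content-Type: application/json` header (the header is an assumption; this document does not state it). The helper names `build_create_session_request` and `create_session` are illustrative and not part of the API.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # base URL given at the top of this document


def build_create_session_request(profile_id: str) -> urllib.request.Request:
    """Build (but do not send) the POST /sessions request."""
    body = json.dumps({"profile_id": profile_id}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/sessions",
        data=body,
        # Assumption: the server expects a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_session(profile_id: str) -> dict:
    """Send the request and return the parsed response body.

    urllib raises urllib.error.HTTPError on a 404 (profile not found).
    """
    with urllib.request.urlopen(build_create_session_request(profile_id)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `create_session("secretary")` against a running server should return a dictionary containing `session_id`, `profile_id`, and `created_at`, as in the 201 response above.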
GET /sessions
All sessions, sorted by activity (pinned first).
Response 200
[
{
"session_id": "550e8400-...",
"profile_id": "secretary",
"message_count": 12,
"preview": "Last 60 chars of the last message",
"pinned": false,
"created_at": "2026-04-10T15:00:00+00:00",
"last_active": "2026-04-10T18:00:00+00:00"
}
]
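The server already returns sessions in order, but a client that merges or updates the list locally may need to reproduce that ordering. The sketch below assumes "sorted by activity" means most recent `last_active` first, with pinned sessions ahead of unpinned ones; the function name `sort_sessions` is hypothetical.

```python
from datetime import datetime


def sort_sessions(sessions: list) -> list:
    """Pinned sessions first, then most recently active first (assumed ordering)."""
    return sorted(
        sessions,
        key=lambda s: (
            s.get("pinned", False),
            datetime.fromisoformat(s["last_active"]),  # ISO 8601 with offset
        ),
        reverse=True,  # True sorts before False, later timestamps before earlier
    )
```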
GET /sessions/{session_id}
Full session info including display message history.
Response 200
{
"session_id": "550e8400-...",
"profile_id": "secretary",
"created_at": "...",
"last_active": "...",
"messages": [
{
"role": "user",
"content": "Hello",
"created_at": "2026-04-10T18:00:00+00:00"
},
{
"role": "assistant",
"content": "Hi. What can I do for you?",
"created_at": "2026-04-10T18:00:05+00:00"
},
{
"role": "assistant",
"tool_calls": [
{ "id": "abc123", "name": "web_search", "arguments": { "query": "..." } }
]
},
{
"role": "tool",
"content": "tool result text",
"tool_call_id": "abc123",
"name": "web_search"
}
]
}
Message fields (all optional except role):
| Field | Type | Description |
|---|---|---|
| role | `user \| assistant \| tool \| system` | Author |
| content | `string \| null` | Message text |
| images | string[] | Base64 images (user/assistant) |
| tool_calls | ToolCall[] | Tool invocations (assistant turn) |
| tool_call_id | string | Reference to the tool call this answers (tool turn) |
| name | string | Tool name (tool turn) |
| created_at | string (ISO 8601) | Timestamp |
| is_summary | bool | Compressed history block (injected by compressor) |
Errors: 404 — session not found
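When rendering history, a client typically needs to match each assistant tool call with the tool message that answers it through `tool_call_id`. The following is a sketch of that pairing, assuming the `ToolCall` shape shown in the example above (`id`, `name`, `arguments`); `pair_tool_calls` is a hypothetical helper.

```python
from typing import Any


def pair_tool_calls(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Match each assistant tool call with the tool message that answers it."""
    # Index tool results by the call they reference.
    results = {
        m["tool_call_id"]: m.get("content")
        for m in messages
        if m.get("role") == "tool" and "tool_call_id" in m
    }
    pairs = []
    for m in messages:
        if m.get("role") != "assistant":
            continue
        for call in m.get("tool_calls") or []:
            pairs.append({
                "id": call["id"],
                "name": call["name"],
                "arguments": call.get("arguments", {}),
                "result": results.get(call["id"]),  # None if no answer is present
            })
    return pairs
```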
DELETE /sessions/{session_id}
Delete session and its files.
Response 204 — no body
PATCH /sessions/{session_id}/pin
Pin or unpin a session.
Request
{ "pinned": true }
Response 200
{ "session_id": "...", "pinned": true }
GET /sessions/{session_id}/context
The LLM context (what the model actually sees). It may differ from messages, because the compressor replaces older messages with a summary. Debug endpoint.
Response 200
{
"session_id": "...",
"profile_id": "secretary",
"message_count": 8,
"total_chars": 4200,
"context": [ ...same format as messages... ]
}
POST /sessions/{session_id}/files
Upload a file for a session. Call this before sending a message that references the file.
Request: multipart/form-data, field file.
Limits
Blocked extensions: .exe, .dll, .so, .sh, .bat, .cmd, .ps1, .vbs, .bin, .elf
Maximum size: 200 MB
file_1.txt, file_2.txt, ...
Response 201
{
"name": "report.pdf",
"size": 102400,
"path": "session_files/550e8400-.../report.pdf",
"content_type": "application/pdf"
}
Errors: 400 blocked extension · 404 session not found · 413 too large
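A client can reject obviously invalid uploads before sending them. The sketch below uses the blocked extensions listed above and the 200 MB limit from the error table. It assumes that the extension check is case-insensitive, that 200 MB means 200 × 1024 × 1024 bytes, and that the extension is checked before the size; none of these is confirmed by this document. `precheck_upload` is a hypothetical helper, and the server remains the authority.

```python
from pathlib import PurePosixPath
from typing import Optional

BLOCKED_EXTENSIONS = frozenset({
    ".exe", ".dll", ".so", ".sh", ".bat", ".cmd", ".ps1", ".vbs", ".bin", ".elf",
})
MAX_UPLOAD_BYTES = 200 * 1024 * 1024  # assumption: "200 MB" is measured in MiB


def precheck_upload(filename: str, size: int) -> Optional[int]:
    """Return the HTTP status the server would likely answer with, or None if OK."""
    suffix = PurePosixPath(filename).suffix.lower()  # assumption: case-insensitive
    if suffix in BLOCKED_EXTENSIONS:
        return 400  # blocked extension
    if size > MAX_UPLOAD_BYTES:
        return 413  # too large
    return None
```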
POST /sessions/{session_id}/stop
Stop the currently running generation. The agent checks the stop signal cooperatively and breaks out of streaming and tool loops cleanly.
Response 200
{ "ok": true }
// or
{ "ok": false, "reason": "no active run" }
Send this via fetch(), not over WebSocket, to avoid corrupting the WebSocket receive state.
POST /sessions/{session_id}/messages
Send a message and wait for the full response synchronously. Blocks until the entire agent loop finishes.
Request
{ "content": "How many stars are in a galaxy?" }
Response 200
{ "role": "assistant", "content": "Estimates range from 100 to 400 billion." }
Prefer WebSocket for real usage — it gives streaming, tool progress, and thinking visibility.
WS /ws/sessions/{session_id}
The primary real-time channel. Supports streaming text, reasoning (thinking), tool events, and file and image attachments.
Connection: if session not found, server closes with code 4004.
Reconnect: if the client reconnects during an active run (e.g. page reload mid-stream), the server automatically re-subscribes and forwards missed events. No extra handshake needed.
All messages are JSON. A message sent by the client has this shape:
{
"type": "message",
"content": "Message text",
"images": ["base64string..."],
"files": [
{ "name": "report.pdf", "size": 102400, "path": "session_files/.../report.pdf" }
]
}
| Field | Required | Description |
|---|---|---|
| type | yes | Always "message" |
| content | yes | Non-empty text |
| images | no | Base64 strings. Both raw base64 and data:image/...;base64,... accepted; server strips the prefix |
| files | no | Files uploaded via POST /sessions/{id}/files. Server appends their paths to the message content so the agent knows about them |
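The field rules above can be sketched as a small payload builder. It validates `content` locally and normalises images to raw base64, mirroring the prefix stripping the server is described as doing (so this step is optional on the client). Treating whitespace-only text as empty is an assumption, and `build_message` and `strip_data_uri` are hypothetical names.

```python
import json
from typing import Optional


def strip_data_uri(image: str) -> str:
    """Return raw base64, dropping a leading data:image/...;base64, prefix if present."""
    if image.startswith("data:") and ";base64," in image:
        return image.split(";base64,", 1)[1]
    return image


def build_message(content: str,
                  images: Optional[list] = None,
                  files: Optional[list] = None) -> str:
    """Serialise a client message for the WebSocket channel."""
    if not content.strip():  # assumption: whitespace-only counts as empty
        raise ValueError("content must be non-empty")
    payload = {"type": "message", "content": content}
    if images:
        payload["images"] = [strip_data_uri(i) for i in images]
    if files:
        # Entries as returned by POST /sessions/{id}/files (name, size, path).
        payload["files"] = list(files)
    return json.dumps(payload)
```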
Events arrive in order. All are JSON with a type field.
stream_start
{ "type": "stream_start" }
Generation has begun. Disable the input field.
thinking_delta
{ "type": "thinking_delta", "delta": "reasoning fragment..." }
Streaming chunk of the model's internal reasoning. Only emitted if the model has thinking enabled. Accumulate delta values until thinking_end.
thinking_end
{ "type": "thinking_end" }
Reasoning phase complete. Next event will be either text (stream_delta) or tool calls.
turn_thinking
{
"type": "turn_thinking",
"thinking": "full reasoning text...",
"is_subagent": false
}
Complete reasoning block from a tool-selection turn (non-streaming, arrives whole). is_subagent: true means it came from a nested sub-agent inside spawn_agent.
plan_ready
{ "type": "plan_ready", "plan": "## Plan\n\n**Task:** ...\n\n**Steps:**\n1. ..." }
A step-by-step execution plan generated before the tool loop starts. Only sent when the profile has planning_enabled: true and the task is non-trivial. Render as a collapsible card before the tool calls.
tool_started
{
"type": "tool_started",
"tool": "web_search",
"args": { "query": "weather in moscow" },
"is_subagent": false
}
A tool call has begun (before execution). Use it to show a spinner or pending card. is_subagent: true means the call was made from within a sub-agent.
tool_call
{
"type": "tool_call",
"tool": "web_search",
"args": { "query": "weather in moscow" },
"result": "Today in Moscow: +12°C, cloudy.",
"success": true,
"is_subagent": false
}
Tool finished. Replaces the pending card from tool_started. success: false means the tool returned an error.
stream_delta
{ "type": "stream_delta", "delta": "response fragment..." }
Streaming chunk of the final text response. Accumulate into a string.
stream_end
{
"type": "stream_end",
"content": "full response text",
"context_tokens": 4913,
"max_context_tokens": 65536
}
Agent done. content is the complete accumulated response (the concatenation of all stream_delta values). Re-enable the input field. context_tokens and max_context_tokens can be used to show a context usage indicator.
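A context usage indicator can be derived from the two token fields. The sketch below also flags when usage reaches the 80% fill level at which this document says compression triggers; whether the server measures fill using exactly these two fields is an assumption. Both function names are hypothetical.

```python
def context_usage(event: dict) -> float:
    """Fraction of the context window in use, taken from a stream_end event."""
    max_tokens = event["max_context_tokens"]
    if max_tokens <= 0:
        return 0.0  # avoid division by zero on malformed events
    return event["context_tokens"] / max_tokens


def near_compression(event: dict, threshold: float = 0.8) -> bool:
    """True once usage reaches the documented 80% compression trigger."""
    return context_usage(event) >= threshold
```

For the example event above (4913 of 65536 tokens), usage is roughly 7.5%, well below the threshold.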
stream_stopped
{ "type": "stream_stopped" }
Generation was stopped by the user (POST /sessions/{id}/stop). Re-enable input.
profile_switched
{
"type": "profile_switched",
"profile_id": "server_admin",
"profile_name": "Server Administrator"
}
The agent switched profiles (via the switch_profile tool). The new profile takes effect from the next user message. Update the UI profile indicator as soon as this event arrives: it is sent before the tool_call event for switch_profile.
context_compressed
{
"type": "context_compressed",
"messages_before": 42,
"messages_after": 12
}
The LLM context was automatically compressed after this response (compression triggers at ≥80% context window fill). Informational only: the display history (GET /sessions/{id}) is unaffected.
error
{ "type": "error", "message": "Session not found" }
An error occurred. Some errors are recoverable (stream continues), others terminate the run.
Simple question (no tools):
    stream_start
    thinking_delta × N (if model has thinking enabled)
    thinking_end
    stream_delta × N
    stream_end
With tool calls:
    stream_start
    plan_ready (if planning enabled and task is non-trivial)
    turn_thinking (reasoning before tool selection, if any)
    tool_started
    tool_call
    turn_thinking (reasoning before next tool, if any)
    tool_started
    tool_call
    thinking_delta × N (reasoning during final response)
    thinking_end
    stream_delta × N
    stream_end
    context_compressed (optional, after response)
With sub-agent (spawn_agent):
    stream_start
    tool_started (spawn_agent, is_subagent=false)
    turn_thinking (is_subagent=true)
    tool_started (sub-agent's tool, is_subagent=true)
    tool_call (is_subagent=true)
    ...
    tool_call (spawn_agent done, is_subagent=false)
    stream_delta × N
    stream_end
Profile switch:
    stream_start
    tool_started (switch_profile)
    profile_switched ← update UI here
    tool_call (switch_profile done)
    stream_delta × N (Navi announces the switch)
    stream_end
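The sequences above can be handled with a small state reducer that folds each incoming event into a client-side view. This is a sketch, not the reference client: `RunState` and `apply_event` are hypothetical names, and because tool events carry no identifier, a finished tool_call is matched to the most recent pending card with the same tool name and is_subagent flag, which is an assumption. The events thinking_end, turn_thinking, and context_compressed are ignored here for brevity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RunState:
    """Hypothetical client-side view of one agent run."""
    input_enabled: bool = True
    thinking: str = ""
    text: str = ""
    plan: Optional[str] = None
    profile_id: Optional[str] = None
    tools: list = field(default_factory=list)
    errors: list = field(default_factory=list)


def _find_pending(state: RunState, event: dict) -> Optional[dict]:
    """Most recent pending card with the same tool name and sub-agent flag."""
    for card in reversed(state.tools):
        if (card["status"] == "pending"
                and card["tool"] == event["tool"]
                and card["is_subagent"] == event.get("is_subagent", False)):
            return card
    return None


def apply_event(state: RunState, event: dict) -> RunState:
    """Fold one server event into the state; unknown types are ignored."""
    kind = event.get("type")
    if kind == "stream_start":
        state.input_enabled = False
        state.thinking, state.text, state.plan = "", "", None
        state.tools = []
    elif kind == "thinking_delta":
        state.thinking += event["delta"]
    elif kind == "plan_ready":
        state.plan = event["plan"]
    elif kind == "tool_started":
        state.tools.append({
            "tool": event["tool"],
            "args": event.get("args", {}),
            "is_subagent": event.get("is_subagent", False),
            "status": "pending",
        })
    elif kind == "tool_call":
        finished = {
            "tool": event["tool"],
            "args": event.get("args", {}),
            "is_subagent": event.get("is_subagent", False),
            "result": event.get("result"),
            "status": "done" if event.get("success") else "failed",
        }
        card = _find_pending(state, event)
        if card is None:
            state.tools.append(finished)  # no pending card seen (e.g. after reconnect)
        else:
            card.update(finished)  # replace the pending card in place
    elif kind == "stream_delta":
        state.text += event["delta"]
    elif kind == "stream_end":
        state.text = event["content"]  # authoritative full response
        state.input_enabled = True
    elif kind == "stream_stopped":
        state.input_enabled = True
    elif kind == "profile_switched":
        state.profile_id = event["profile_id"]
    elif kind == "error":
        # Input is not re-enabled here, since some errors are recoverable.
        state.errors.append(event["message"])
    return state
```

Keeping the reducer free of I/O means it can be driven by any WebSocket library: parse each incoming frame as JSON and pass the resulting dictionary to `apply_event`.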
Client static: GET /static/** — served from client/ directory. Cache-Control: no-store.
Session files: stored in session_files/{session_id}/. Accessed by the agent via filesystem tool. Auto-deleted after 24h of session inactivity or on session delete.
| Code | Cause |
|---|---|
| HTTP 400 | Blocked file type |
| HTTP 404 | Session or profile not found |
| HTTP 413 | File exceeds 200 MB |
| HTTP 500 | Internal agent error |
| WS 4004 | Session not found on WebSocket connect |