diff --git a/docs/agent.md b/docs/agent.md index cbceaf6..dbe3944 100644 --- a/docs/agent.md +++ b/docs/agent.md @@ -10,13 +10,17 @@ ### `run(session_id, user_message)` → `str` Non-streaming. Full tool-calling loop, returns final text. No planning phase. -### `run_ephemeral(user_message, profile_id)` → `str` -Non-persistent subagent. No DB reads/writes. Temporary in-memory context. Called by `SpawnAgentTool`. +### `run_ephemeral(user_message, profile_id)` → `tuple[str, bool]` +Non-persistent subagent. Temporary in-memory context. Called by `SpawnAgentTool`. + +Returns `(result_text, completed_normally)`. `completed_normally` is `False` if the subagent hit the iteration limit or timed out. `spawn_agent.profile_id` is optional. If omitted, `SpawnAgentTool` resolves the parent session's current profile. If provided, the subagent uses the selected profile's model, `subagent_system_prompt`, planning flags, and tool set. Its tools come from that profile's `subagent_tools`, falling back to `enabled_tools` when `subagent_tools` is empty. When spawned from a persistent parent session, session-aware tools run under the parent session id so file tools resolve the user's session directory rather than a `subagent_*` directory. +`run_ephemeral` reads the parent session from the DB when `parent_session_id` is provided, so session-aware tools (filesystem, todo, scratchpad) operate on the parent's data. + --- ## Planning phase (`_run_planning`) diff --git a/docs/api.md b/docs/api.md index fb64dc1..f30fed2 100644 --- a/docs/api.md +++ b/docs/api.md @@ -14,7 +14,27 @@ **Response `200`** ```json -{ "status": "ok" } +{ + "status": "ok", + "embed": { + "status": "ok", + "model": "nomic-embed-text", + "dimensions": 768 + } +} +``` + +#### `GET /health/embed` + +Embedding model health check. Returns the first vector of a test string to verify the embed backend is responsive. 
+ +**Response `200`** +```json +{ + "status": "ok", + "model": "nomic-embed-text", + "dimensions": 768 +} ``` --- @@ -27,6 +47,10 @@ Redirect to gnexus-auth OAuth authorization endpoint. Sets PKCE + state internally. +**Query params** +- `return_to` — URL to redirect back to after login (default: `/`) +- `platform` — `browser` (default) or `android` (affects redirect after callback) + **Response `302`** → Location: gnexus-auth `/oauth/authorize` --- @@ -45,6 +69,7 @@ **Errors** - `400` — invalid state, PKCE failure, or token exchange failed +- `503` — OAuth is not configured (missing `gnexus_auth_client_id` or `gnexus_auth_client_secret`) --- @@ -80,6 +105,16 @@ "id": "user-uuid", "email": "user@example.com", "display_name": "User Name", + "username": "username", + "first_name": "First", + "last_name": "Last", + "phone": "+1234567890", + "birth_date": "1990-01-01", + "country": "US", + "city": "New York", + "locale": "en-US", + "avatar_url": "https://...", + "profile_url": "https://...", "role": "admin", "permissions": ["navi.sessions.read_all", "navi.memory.read_all"] } @@ -90,11 +125,22 @@ --- +#### `GET /auth/status` + +Check if the user is currently authenticated without returning full profile. + +**Response `200`** +```json +{ "authenticated": true } +``` + +--- + ### Profiles & Tools #### `GET /agents/profiles` -List available agent profiles. +List available agent profiles. Non-admin users do not see `is_admin_only` profiles. 
**Response `200`** ```json @@ -105,7 +151,15 @@ "description": "General-purpose assistant", "enabled_tools": ["todo", "web_search", "filesystem", "..."], "llm_backend": "ollama", - "model": "gemma4:31b-cloud" + "model": ["gemma4:31b-cloud", "gemma4:26b-a4b-it-q4_K_M"], + "temperature": 0.65, + "top_k": null, + "top_p": null, + "max_iterations": 10, + "iteration_budget_enabled": true, + "think_enabled": true, + "subagent_think_enabled": null, + "mcp_servers": {"gnexus-book": ["read", "write"]} } ] ``` @@ -117,19 +171,62 @@ **Response `200`** ```json [ - { "name": "web_search", "description": "Search the web using DuckDuckGo." }, - { "name": "filesystem", "description": "Read, write and list files." } + { + "name": "web_search", + "description": "Search the web using DuckDuckGo.", + "parameters": {"type": "object", "properties": {...}, "required": [...]} + }, + { + "name": "filesystem", + "description": "Read, write and list files.", + "parameters": {"type": "object", "properties": {...}, "required": [...]} + } ] ``` --- +#### `GET /agents/prompts` + +Return the fully resolved system prompt for each profile (persona + profile system_prompt + context provider injections + MCP server instructions). + +**Response `200`** +```json +{ + "secretary": "system prompt text...", + "server_admin": "system prompt text..." +} +``` + +--- + +#### `GET /agents/mcp_servers` + +Return all configured MCP servers with their resolved tools per profile. + +**Response `200`** +```json +{ + "gnexus-book": { + "connected": true, + "tools": [ + {"name": "gnexus-book_list_inventory", "description": "..."} + ], + "instructions": "MANDATORY: Before answering ANY question..." + } +} +``` + +--- + ### Sessions #### `POST /sessions` Create a new session. +**Auth**: requires authenticated user. 
+ **Request body** ```json { "profile_id": "secretary" } @@ -145,6 +242,7 @@ ``` **Errors** +- `401` — not authenticated - `404` — profile not found --- @@ -153,7 +251,38 @@ List all sessions sorted by activity (pinned first). -**Response `200`** +**Auth**: requires authenticated user. + +**Query params** +| Param | Default | Description | +|---|---|---| +| `limit` | `50` | Page size | +| `offset` | `0` | Items to skip | +| `profile_id` | — | Filter by profile | + +**Response `200`** (when pagination params provided) +```json +{ + "items": [ + { + "session_id": "550e8400-...", + "profile_id": "secretary", + "name": "Research task", + "message_count": 12, + "preview": "Last 60 chars of the most recent message", + "pinned": false, + "created_at": "2026-04-10T15:00:00+00:00", + "last_active": "2026-04-10T18:00:00+00:00" + } + ], + "limit": 50, + "offset": 0, + "has_more": true, + "next_offset": 50 +} +``` + +**Response `200`** (plain list when no pagination params) ```json [ { @@ -177,6 +306,8 @@ Full session with message history (display history — never compressed). +**Auth**: requires authenticated user (or ownership of the session). + **Response `200`** ```json { @@ -290,6 +421,8 @@ LLM context (what the model actually sees). May differ from `messages` — compressed history replaces old turns with a summary. Debug endpoint. +**Auth**: requires `admin` role. + **Response `200`** ```json { @@ -301,17 +434,50 @@ } ``` +**Errors** +- `403` — not admin + --- #### `GET /sessions/{session_id}/planning` All planning phase debug logs for the session. Each entry is one planning run. +**Auth**: requires `admin` role. + **Response `200`** ```json { "session_id": "...", "logs": [ { "phase": "...", "output": "..." }, ... ] } ``` +**Errors** +- `403` — not admin + +--- + +#### `GET /sessions/{session_id}/content` + +List published session content (artifacts registered via `content_publish` tool). 
+ +**Response `200`** +```json +{ + "content": [ + { + "id": "...", + "filename": "report.html", + "path": "/abs/path/to/content/...", + "size": 102400, + "content_type": "text/html", + "created_at": "2026-04-10T18:00:00+00:00" + } + ] +} +``` + +**Errors** +- `404` — session not found + --- #### `POST /sessions/{session_id}/files` @@ -330,7 +496,7 @@ { "name": "report.pdf", "size": 102400, - "path": "session_files/550e8400-.../report.pdf", + "path": "/abs/path/to/session_files/550e8400-.../report.pdf", "content_type": "application/pdf" } ``` @@ -344,7 +510,10 @@ #### `GET /sessions/{session_id}/files/{filename}` -Download or view an uploaded file. Images, PDFs and plain text are served inline; everything else as an attachment. +Download or view an uploaded file. Images, PDFs, plain text and HTML are served inline; everything else as an attachment. + +**Query params** +- `download` — force attachment download regardless of content type **Response `200`** — file bytes @@ -795,6 +964,165 @@ --- +#### `GET /admin/sessions/{session_id}` + +Full session details including messages. Bypasses ownership check. + +**Response `200`** +```json +{ + "session_id": "...", + "profile_id": "secretary", + "user_id": "user-uuid", + "name": "Research task", + "messages": [...], + "context_token_count": 4913, + "max_context_tokens": 65536, + "pinned": false, + "created_at": "...", + "last_active": "..." +} +``` + +**Errors** +- `404` — session not found + +--- + +#### `DELETE /admin/sessions/{session_id}` + +Delete any session (bypasses ownership). Also deletes session files. + +**Response `204`** — no body + +**Errors** +- `404` — session not found + +--- + +#### `GET /admin/users/{user_id}` + +Single user details. + +**Response `200`** +```json +{ + "id": "user-uuid", + "email": "user@example.com", + "display_name": "User Name", + "role": "admin", + "permissions": ["navi.sessions.read_all"], + "created_at": "...", + "updated_at": "..." 
+} +``` + +**Errors** +- `404` — user not found + +--- + +#### `GET /admin/users/{user_id}/sessions` + +Sessions owned by a specific user. + +**Response `200`** +```json +[ + { + "session_id": "...", + "profile_id": "secretary", + "name": "Research task", + "message_count": 12, + "pinned": false, + "created_at": "...", + "last_active": "..." + } +] +``` + +--- + +#### `POST /admin/ollama/clear-blacklists` + +Manually clear dead-server and dead-model blacklists for the Ollama fallback backend. Useful when a transient failure caused a 5-minute blacklist and you want immediate recovery. + +**Response `204`** — no body + +--- + +#### `GET /admin/profiles/{profile_id}` + +Full profile configuration including system prompt. + +**Response `200`** +```json +{ + "id": "secretary", + "name": "Personal Secretary", + "description": "General-purpose assistant", + "short_description": "...", + "full_description": {"specialization": "...", "when_to_use": "...", "key_tools": [...]}, + "system_prompt": "...", + "subagent_system_prompt": "...", + "llm_backend": "ollama", + "model": ["gemma4:31b-cloud", "gemma4:26b-a4b-it-q4_K_M"], + "temperature": 0.65, + "top_k": null, + "top_p": null, + "num_thread": null, + "max_iterations": 10, + "planning_enabled": false, + "planning_mandatory": false, + "planning_phase1_enabled": true, + "planning_phase2_enabled": false, + "planning_phase3_enabled": true, + "think_enabled": true, + "iteration_budget_enabled": true, + "goal_anchoring_enabled": true, + "goal_anchoring_interval": 5, + "anti_stall_enabled": true, + "anti_stall_threshold": 8, + "step_validation_enabled": false, + "adaptive_replan_enabled": false, + "subagent_tools": [...], + "subagent_planning_enabled": false, + "subagent_think_enabled": null, + "enabled_tools": [...], + "context_providers": [], + "is_admin_only": false +} +``` + +**Errors** +- `404` — profile not found + +--- + +#### `PUT /admin/profiles/{profile_id}` + +Update profile configuration on disk and in-memory. 
Accepts partial updates — only provided fields are modified. + +**Request body** (partial) +```json +{ + "temperature": 0.5, + "max_iterations": 20, + "planning_enabled": true +} +``` + +**Response `200`** +```json +{ "ok": true } +``` + +**Errors** +- `400` — invalid profile data +- `404` — profile not found + +--- + #### `PATCH /admin/users/{user_id}/role` Update cached role. Requires admin. @@ -822,7 +1150,7 @@ **Response `200`** ```json -{ "ok": true, "note": "Profile availability is managed via profile config files" } +{ "ok": true } ``` --- @@ -840,6 +1168,7 @@ **Errors** - `400` — invalid payload +- `503` — OAuth not configured --- diff --git a/docs/config.md b/docs/config.md index 01a2c9c..4631859 100644 --- a/docs/config.md +++ b/docs/config.md @@ -13,13 +13,24 @@ | `OLLAMA_THINK` | bool | `true` | Enable extended reasoning (thinking) | | `OLLAMA_BACKENDS_FILE` | str | `""` | Path to JSON file with multi-server config (see below). When set, overrides `OLLAMA_HOST`/`OLLAMA_API_KEY`. | | `OLLAMA_REQUEST_TIMEOUT` | int | `30` | Seconds before Ollama request times out (affects fallback speed) | +| `EMBEDDING_OLLAMA_HOST` | str | `""` | Ollama server for embedding model (falls back to `OLLAMA_HOST` if empty) | +| `EMBEDDING_OLLAMA_API_KEY` | str | `""` | API key for embedding Ollama server | +| `EMBEDDING_MODEL` | str | `nomic-embed-text:latest` | Embedding model for memory vector search | +| `EMBEDDING_DIMENSIONS` | int | `768` | Vector dimensionality for embeddings | | `OPENAI_API_KEY` | str | `""` | OpenAI API key (if using OpenAI backend) | | `OPENAI_MODEL` | str | `"gpt-4"` | Default model for OpenAI backend | -| `OPENAI_BASE_URL` | str | `None` | Custom base URL for OpenAI-compatible endpoints (e.g. vLLM, LM Studio) | -| `ANTHROPIC_API_KEY` | str | `""` | Anthropic API key (if using Anthropic backend) | +| `OPENAI_BASE_URL` | str \| None | `None` | Custom base URL for OpenAI-compatible endpoints (e.g. 
vLLM, LM Studio) | +| `ANTHROPIC_API_KEY` | str | `""` | Reserved — no Anthropic backend implemented yet | For direct Ollama Cloud access without fallback: set `OLLAMA_HOST=https://ollama.com` and `OLLAMA_API_KEY=`. +## Web Search + +| Variable | Type | Default | Description | +|---|---|---|---| +| `BRAVE_SEARCH_API_KEY` | str | `""` | Brave Search API key (free tier: 2000 req/month). Used when DuckDuckGo returns no results. | +| `SEARXNG_URL` | str | `""` | Self-hosted SearXNG meta-search URL, e.g. `http://localhost:8888` | + ### Multi-server fallback (`OLLAMA_BACKENDS_FILE`) When `OLLAMA_BACKENDS_FILE` points to a JSON file, Navi uses `FallbackOllamaBackend` instead of the single-server backend. The file contains an ordered list of servers: @@ -66,6 +77,12 @@ `settings.fs_allowed_paths_list` and `settings.terminal_allowed_commands_list` are computed properties that parse the comma-separated strings into lists. +| Variable | Type | Default | Description | +|---|---|---|---| +| `TERMINAL_USER_ALLOWED_COMMANDS` | str | long allowlist | Comma-separated allowed executables for non-admin users. Admin bypasses this restriction. | + +`settings.terminal_user_allowed_commands_list` — computed property parsed from the comma-separated string. 
+ ## Database | Variable | Type | Default | Description | @@ -83,6 +100,7 @@ | Variable | Type | Default | Description | |---|---|---|---| | `TOOLS_DIR` | str | `tools` | Directory for user-defined tools (auto-discovered at startup) | +| `CONTEXT_PROVIDERS_DIR` | str | `context_providers` | Directory for user-defined context providers (auto-discovered at startup) | ## Session files @@ -107,6 +125,7 @@ | `CONTEXT_KEEP_RECENT` | int | `8` | Number of recent conversation turns to keep verbatim | | `CONTEXT_SUMMARY_TEMPERATURE` | float | `0.3` | Temperature for the summarization LLM call | | `CONTEXT_SUMMARY_MAX_TOKENS` | int | `3000` | Max output tokens for the summary LLM call | +| `OUTPUT_RESERVE_TOKENS` | int | `2048` | Headroom reserved for model response in context size checks | ## Gmail @@ -124,6 +143,8 @@ | `GNAUTH_CLIENT_SECRET` | str | `""` | OAuth client secret | | `GNAUTH_REDIRECT_URI` | str | `http://localhost:8000/auth/callback` | Must match redirect URI registered in gnexus-auth | | `GNAUTH_ADMIN_ROLE_SLUG` | str | `navi_admin` | Role slug that maps to Navi `admin` role | +| `GNAUTH_USER_ROLE_SLUG` | str | `navi_user` | Role slug that maps to Navi `user` role | +| `GNAUTH_PROFILE_PATH` | str | `/account/profile` | Path appended to `gnauth_base_url` for profile links | | `NAVI_AUTH_ENCRYPTION_KEY` | str | `""` | **Fernet key** (base64, 32 bytes). Generate once with `python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"`. Never change after first launch. | | `NAVI_AUTH_COOKIE_NAME` | str | `navi_auth_session` | Session cookie name | | `NAVI_AUTH_COOKIE_SECURE` | bool | `False` | Set `True` behind HTTPS | diff --git a/docs/memory.md b/docs/memory.md index 265e581..b400040 100644 --- a/docs/memory.md +++ b/docs/memory.md @@ -83,16 +83,16 @@ `user_id` references `navi_users(id)` with `ON DELETE CASCADE`. Facts and summaries are scoped per user. Admin with `navi.memory.read_all` can pass `user_id=None` for global search. 
-`MemoryStore` is initialized synchronously (creates tables), all operations are async via asyncpg (PostgreSQL). +`MemoryStore` lazily creates tables on first async operation via `_get_pool()`. All operations are async via asyncpg (PostgreSQL). ### Key operations | Method | Description | |---|---| | `upsert_fact(..., user_id=None)` | Insert or update a fact scoped to user | -| `search_facts(query, limit=15, user_id=None)` | **Vector search first** (cosine distance, cutoff 0.3), then ILIKE fallback. `user_id=None` requires admin permission for global search. | +| `search_facts(query, user_id=None, limit=15)` | **Vector search first** (cosine distance, cutoff 0.3), then ILIKE fallback. `user_id=None` requires admin permission for global search. | | `delete_fact(key, category=None, user_id=None)` | Delete by key, optionally filtered by category and user | -| `get_all_facts(limit=None, offset=0, search=None, sort_by="category", sort_order="desc", user_id=None, all_users=False)` | All facts ordered by `sort_by`. Pass `all_users=True` for admin global view. | +| `get_all_facts(user_id=None, all_users=False, limit=None, offset=0, search=None, sort_by="category", sort_order="desc")` | All facts ordered by `sort_by`. Pass `all_users=True` for admin global view. | | `get_summary(user_id=None)` | Current narrative summary text for user | | `set_summary(content, user_id=None)` | Replace the summary for user | | `mark_session_extracted(session_id)` | Record extraction timestamp | diff --git a/docs/profiles.md b/docs/profiles.md index ca55ffe..427eed2 100644 --- a/docs/profiles.md +++ b/docs/profiles.md @@ -30,14 +30,17 @@ | `llm_backend` | str | `"ollama"` | Backend key: `"ollama"`, `"openai"` | | `model` | str or list[str] | `["gemma4:31b-cloud"]` | Model priority list — first available wins. String is accepted and auto-wrapped. 
| | `temperature` | float | `0.7` | Sampling temperature for main loop calls | -| `max_iterations` | int | `20` | Hard cap on tool-calling iterations per turn | +| `max_iterations` | int | `10` | Hard cap on tool-calling iterations per turn | +| `top_k` | int \| None | `None` | Ollama sampling top_k | +| `top_p` | float \| None | `None` | Ollama sampling top_p | +| `num_thread` | int \| None | `None` | CPU threads for local inference. `None` = Ollama default | ### Tools | Key | Type | Default | Description | |---|---|---|---| | `enabled_tools` | list[str] | **required** | Tool names available in the main loop | -| `subagent_tools` | list[str] | `[]` | Tools available to sub-agents spawned from this profile. Falls back to `enabled_tools` if empty. | +| `subagent_tools` | list[str] | `[]` | Tools available to sub-agents spawned from this profile. Falls back to `enabled_tools` (full list) if empty. | `spawn_agent` may receive an optional `profile_id`. If omitted, the subagent uses the parent session's current profile. If provided, the subagent uses the selected profile's model, prompt, planning flags, and `subagent_tools`/`enabled_tools` fallback. @@ -51,7 +54,7 @@ | `goal_anchoring_interval` | int | `5` | N for goal anchoring. | | `anti_stall_enabled` | bool | `true` | Detect looping without todo progress and inject a hard warning. | | `anti_stall_threshold` | int | `8` | Consecutive iterations without progress before stall warning fires. | -| `step_validation_enabled` | bool | `false` | After each todo step is marked done, run a lightweight LLM check: "did the result satisfy the goal?" Adds ~1 LLM call per step. | +| `step_validation_enabled` | bool | `false` | Reserved flag — todo validation is unconditional in the current implementation. | | `adaptive_replan_enabled` | bool | `false` | When a todo step is marked failed, trigger a re-planning pass. Depends on `step_validation_enabled`. 
| ### Planning @@ -74,7 +77,11 @@ | Key | Type | Default | Description | |---|---|---|---| +| `subagent_think_enabled` | bool \| None | `None` | Extended reasoning for sub-agents. `None` = inherit `think_enabled` from parent profile. | | `subagent_planning_enabled` | bool | `false` | Sub-agents spawned from this profile also run the planning pipeline before their tool loop. | +| `context_providers` | list[str] | `[]` | Extra context providers to inject for this profile (by name). Global providers are always injected. | +| `mcp_servers` | dict | `{}` | MCP servers referenced by this profile. Format: `{"server_name": ["group1", "group2"]}` or `{"server_name": ["*"]}` for all tools. | +| `is_admin_only` | bool | `false` | If `true`, profile is hidden from non-admin users in the profile list. | --- diff --git a/docs/sessions.md b/docs/sessions.md index 7ddfbc4..5c5a18c 100644 --- a/docs/sessions.md +++ b/docs/sessions.md @@ -29,6 +29,7 @@ | `is_compression: bool` | Marker message injected when context compression ran | | `is_summary: bool` | A summary message replacing compressed history in `session.context` | | `thinking: str \| None` | LLM reasoning captured during a tool-calling turn | +| `metadata: dict` | Tool result metadata (e.g. 
`is_image`, `base64`) | ## Dual-buffer design @@ -58,6 +59,7 @@ - `count_all(user_id=None, is_admin=False, search=None)` → total matching sessions - `search_list(limit, offset, user_id=None, is_admin=False, search=None, sort_by="last_active", sort_order="desc")` → paginated, filtered, sorted sessions - `delete(session_id)` → `bool` +- `list_page(user_id=None, is_admin=False, limit=50, offset=0)` → paginated list with `has_more` flag - `set_pinned(session_id, pinned)` → `bool` - `set_name(session_id, name)` → `bool` @@ -80,7 +82,7 @@ Config values (`settings`): - `context_compression_enabled: bool = True` -- `context_compression_threshold: float = 0.80` — trigger at 80% of `ollama_num_ctx` +- `context_compression_threshold: float = 0.70` — trigger at 70% of `ollama_num_ctx` - `context_keep_recent: int = 10` — keep last N conversational turns verbatim - `context_summary_temperature: float = 0.3` diff --git a/docs/tools.md b/docs/tools.md index 691327b..1da31f8 100644 --- a/docs/tools.md +++ b/docs/tools.md @@ -24,9 +24,7 @@ | `WriteToolTool` | `write_tool` | Write a new user tool file and reload immediately | | `ListToolsTool` | `list_tools` | Return the live tool list from registry | | `ToolManualTool` | `tool_manual` | Return manuals/{name}.md or auto-generate from schema | -| `MemorySaveTool` | `memory_save` | Save a fact to long-term memory | -| `MemorySearchTool` | `memory_search` | Search long-term memory facts | -| `MemoryForgetTool` | `memory_forget` | Delete a fact from long-term memory | +| `MemoryTool` | `memory` | Unified memory tool: save, search, and forget facts | | `SpawnAgentTool` | `spawn_agent` | Spawn an isolated subagent (blocking). 
Optional `profile_id` selects another profile; omitted means parent profile | | `SwitchProfileTool` | `switch_profile` | Switch the active profile for a session | | `ListProfilesTool` | `list_profiles` | List all available profiles | @@ -37,6 +35,7 @@ | `Render3DTool` | `render_3d` | Render preview PNG images from an STL file (up to 3 views) | | `DeleteToolTool` | `delete_tool` | Delete a user tool file | | `TestToolTool` | `test_tool` | Run a user tool and verify its output | +| `McpStatusTool` | `mcp_status` | Check connectivity and list tools for configured MCP servers | | `ReflectTool` | `reflect` | Self-reflection and analysis | ### User tools (`tools/*.py`) diff --git a/docs/websocket.md b/docs/websocket.md index 76a671e..723628e 100644 --- a/docs/websocket.md +++ b/docs/websocket.md @@ -39,7 +39,7 @@ | Frame | When | |---|---| | `{"type": "stream_start"}` | Before any agent output begins | -| `{"type": "stream_end", "content": "...", "context_tokens": N, "max_context_tokens": N, "elapsed_seconds": N, "tool_call_count": N, "token_count": N}` | After final text, before workers | +| `{"type": "stream_end", "content": "...", "context_tokens": N, "max_context_tokens": N, "elapsed_seconds": N, "tool_call_count": N, "token_count": N, "message_index": N}` | After final text, before workers | | `{"type": "stream_stopped"}` | If the user stopped generation | | `{"type": "error", "message": "..."}` | On any unhandled error | @@ -57,7 +57,7 @@ | Frame | When | |---|---| -| `{"type": "planning_status", "phase": "analysis\|reflect\|plan", "label": "...", "is_subagent": bool}` | During planning phase — progress label for UI | +| `{"type": "planning_status", "phase": 1\|2\|3, "label": "...", "is_subagent": bool}` | During planning phase — progress label for UI.
`phase`: 1=analysis, 2=reflect, 3=plan | | `{"type": "plan_ready", "plan": "...", "is_subagent": bool}` | Before tool-calling loop if `planning_enabled` and a plan was generated | `planning_status` frames arrive during each planning phase (analysis → optional reflect → plan). `is_subagent: true` means the planning is running inside a subagent — route it into the spawn_agent card, never into the top-level UI. @@ -69,7 +69,7 @@ | Frame | When | |---|---| | `{"type": "tool_started", "tool": "name", "args": {...}, "is_subagent": bool}` | Immediately when a tool call begins (before execution) | -| `{"type": "tool_call", "tool": "name", "args": {...}, "result": "...", "success": bool, "is_subagent": bool}` | When the tool finishes | +| `{"type": "tool_call", "tool": "name", "args": {...}, "result": "...", "success": bool, "is_subagent": bool, "metadata": {...}}` | When the tool finishes | `is_subagent: true` indicates the tool call was made by a nested subagent, not the top-level agent. @@ -83,7 +83,7 @@ | Frame | When | |---|---| -| `{"type": "context_compressed", "messages_before": N, "messages_after": N, "summary": "..."}` | After context compression runs | +| `{"type": "context_compressed", "messages_before": N, "messages_after": N, "summary": "...", "context_tokens": N, "max_context_tokens": N}` | After context compression runs | | `{"type": "profile_switched", "profile_id": "...", "profile_name": "..."}` | When `switch_profile` tool succeeds | | `{"type": "heartbeat"}` | Periodic keepalive during long silent operations (every 20 s) | | `{"type": "session_sync"}` | Client should reload session history from REST (`GET /sessions/{id}`) |
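
The frame tables above suggest a client that dispatches on `type` and keeps subagent output out of the top-level UI. A minimal sketch of that routing follows, assuming frames arrive as JSON text. The function name `route_frame` and the destination labels (`subagent_card`, `top_level`, `rest_reload`, `ignore`) are hypothetical and not part of the documented protocol.

```python
import json

# Integer phases as documented for `planning_status`: 1=analysis, 2=reflect, 3=plan.
PHASE_NAMES = {1: "analysis", 2: "reflect", 3: "plan"}


def route_frame(raw: str) -> tuple[str, str]:
    """Return a (destination, summary) pair for one server frame.

    Destination labels are illustrative placeholders for UI targets.
    """
    frame = json.loads(raw)
    kind = frame.get("type", "")
    # Frames flagged is_subagent belong in the spawn_agent card, not the main view.
    dest = "subagent_card" if frame.get("is_subagent") else "top_level"

    if kind == "planning_status":
        phase = PHASE_NAMES.get(frame.get("phase"), "unknown")
        return dest, f"planning {phase}: {frame.get('label', '')}"
    if kind in ("tool_started", "tool_call"):
        return dest, f"{kind} {frame.get('tool', '')}"
    if kind == "session_sync":
        # The client is expected to reload history over REST.
        return "rest_reload", "reload session history"
    if kind == "heartbeat":
        return "ignore", "keepalive"
    # Any other frame type keeps the destination implied by is_subagent.
    return dest, kind
```

Returning a plain tuple keeps the sketch independent of any particular WebSocket library; a real client would call something like this from its receive loop and act on the destination.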