diff --git a/CLAUDE.md b/CLAUDE.md index 5bb4846..5d259bf 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -121,6 +121,24 @@ - Navi should call `list_tools` when asked about her capabilities (not generate from memory) - `no-store` cache middleware on `/static/` — safe to hard-refresh during development +## Documentation + +Detailed reference is in `docs/`. Read specific files when you need depth on a subsystem: + +| File | Covers | +|---|---| +| `docs/agent.md` | Agent loop, 3-phase planning, all thinking mechanics flags | +| `docs/profiles.md` | Profile fields, config flags, how to add a profile | +| `docs/tools.md` | Built-in tools, user tool format, hot-reload | +| `docs/sessions.md` | Session model, dual-buffer, context compression, debug endpoints | +| `docs/websocket.md` | Full WS protocol — all events, reconnect/replay, stop mechanism | +| `docs/memory.md` | Long-term memory — facts, extraction, search | +| `docs/api.md` | All REST + WS endpoints with request/response schemas | +| `docs/config.md` | All `.env` variables with types and defaults | +| `docs/architecture.md` | Component diagram, data flow, registry wiring | + +`NAVI.md` (project root) is a lightweight hub with the server command, key paths table, and a `filesystem(action="query")` pattern for querying docs at runtime. 
+ ## What works well - Hot-reload without server restart - Thinking display in client diff --git a/NAVI.md b/NAVI.md index c3a9d19..ae8db5c 100644 --- a/NAVI.md +++ b/NAVI.md @@ -40,10 +40,11 @@ | `docs/profiles.md` | Profile fields, all config flags, how to add a profile | | `docs/tools.md` | Built-in tools, user tool format, hot-reload | | `docs/sessions.md` | Session model, dual-buffer, context compression | -| `docs/websocket.md` | WebSocket protocol, all event types | +| `docs/websocket.md` | WebSocket protocol, all event types, reconnect replay | | `docs/memory.md` | Long-term memory system | -| `docs/api.md` | REST API endpoints | +| `docs/api.md` | Full REST + WebSocket API reference with schemas | | `docs/config.md` | All `.env` variables | +| `docs/architecture.md` | Component diagram, data flow, registry wiring | ## Tool manuals diff --git a/docs/api.md b/docs/api.md index d33f5c2..57c045d 100644 --- a/docs/api.md +++ b/docs/api.md @@ -10,7 +10,7 @@ #### `GET /health` -Проверка доступности сервера. +Server availability check. **Response `200`** ```json @@ -23,25 +23,25 @@ #### `GET /agents/profiles` -Список доступных профилей агента. +List available agent profiles. **Response `200`** ```json [ { "id": "secretary", - "name": "Secretary", + "name": "Personal Secretary", "description": "General-purpose assistant", "enabled_tools": ["todo", "web_search", "filesystem", "..."], "llm_backend": "ollama", - "model": "gemma4:e2b-it-q8_0" + "model": "gemma4:26b-a4b-it-q4_K_M" } ] ``` #### `GET /agents/tools` -Список всех зарегистрированных инструментов (built-in + user tools). +List all registered tools (built-in + user tools). **Response `200`** ```json @@ -55,11 +55,9 @@ ### Sessions -Сессия — это контейнер для диалога с агентом. Каждая сессия привязана к профилю и хранит историю сообщений. - #### `POST /sessions` -Создать новую сессию. +Create a new session. 
**Request body** ```json @@ -76,13 +74,13 @@ ``` **Errors** -- `404` — профиль не найден +- `404` — profile not found --- #### `GET /sessions` -Список всех сессий, отсортированных по активности (закреплённые первыми). +List all sessions sorted by activity (pinned first). **Response `200`** ```json @@ -90,8 +88,9 @@ { "session_id": "550e8400-...", "profile_id": "secretary", + "name": "Research task", "message_count": 12, - "preview": "Последние 60 символов последнего сообщения", + "preview": "Last 60 chars of the most recent message", "pinned": false, "created_at": "2026-04-10T15:00:00+00:00", "last_active": "2026-04-10T18:00:00+00:00" @@ -99,28 +98,33 @@ ] ``` +`name` is `null` until `POST /sessions/{id}/generate-name` is called. + --- #### `GET /sessions/{session_id}` -Полная информация о сессии с историей сообщений (display history — никогда не сжимается). +Full session with message history (display history — never compressed). **Response `200`** ```json { "session_id": "550e8400-...", "profile_id": "secretary", + "name": "Research task", + "context_token_count": 4913, + "max_context_tokens": 65536, "created_at": "...", "last_active": "...", "messages": [ { "role": "user", - "content": "Привет", + "content": "Hello", "created_at": "2026-04-10T18:00:00+00:00" }, { "role": "assistant", - "content": "Привет. Чем помочь?", + "content": "Hi. 
How can I help?", "created_at": "2026-04-10T18:00:05+00:00" }, { @@ -135,7 +139,7 @@ }, { "role": "tool", - "content": "результат инструмента", + "content": "tool result", "tool_call_id": "abc123", "name": "web_search" } @@ -143,38 +147,44 @@ } ``` -Поля сообщения (`role` всегда присутствует, остальные — по наличию): +Message fields (`role` is always present, others by availability): -| Поле | Тип | Описание | -|----------------|----------------------|----------| -| `role` | `user\|assistant\|tool\|system` | Автор сообщения | -| `content` | `string\|null` | Текст сообщения | -| `images` | `string[]` | Base64-строки изображений (user/assistant) | -| `tool_calls` | `ToolCall[]` | Вызовы инструментов (assistant) | -| `tool_call_id` | `string` | ID вызова, к которому относится ответ (tool) | -| `name` | `string` | Имя инструмента (tool) | -| `created_at` | `string` (ISO 8601) | Время создания | -| `is_summary` | `bool` | Сжатый блок истории (injected by compressor) | +| Field | Type | Description | +|----------------|----------------------|-------------| +| `role` | `user\|assistant\|tool\|system` | Message author | +| `content` | `string\|null` | Text content | +| `images` | `string[]` | Base64 images (user/assistant) | +| `tool_calls` | `ToolCall[]` | Tool invocations (assistant) | +| `tool_call_id` | `string` | ID of the call this result belongs to (tool) | +| `name` | `string` | Tool name (tool messages) | +| `thinking` | `string\|null` | LLM reasoning captured during a tool-calling turn | +| `is_plan` | `bool` | Planning phase output — rendered as a plan card, not text | +| `is_compression` | `bool` | Marker injected when context compression ran | +| `is_summary` | `bool` | Summary message replacing compressed history | +| `created_at` | `string` (ISO 8601) | Creation time | +| `elapsed_seconds` | `number\|null` | Time to complete the turn (final assistant message) | +| `tool_call_count` | `number\|null` | Number of tool calls in the turn | +| `token_count` | 
`number\|null` | Tokens used in the turn | **Errors** -- `404` — сессия не найдена +- `404` — session not found --- #### `DELETE /sessions/{session_id}` -Удалить сессию и её файлы. +Delete a session and its files. -**Response `204`** — нет тела +**Response `204`** — no body **Errors** -- `404` — сессия не найдена +- `404` — session not found --- #### `PATCH /sessions/{session_id}/pin` -Закрепить или открепить сессию. +Pin or unpin a session. **Request body** ```json @@ -188,9 +198,26 @@ --- +#### `POST /sessions/{session_id}/generate-name` + +Generate a short display name for a session from its message history. +Called automatically by the client after the first exchange. No-op if the session already has a name. + +**Response `200`** +```json +{ "name": "Web search for recipes" } +``` + +Returns `{"name": null}` if there are no user messages yet. + +**Errors** +- `404` — session not found + +--- + #### `GET /sessions/{session_id}/context` -LLM-контекст сессии (то, что модель реально видит). Может отличаться от `messages` — сжатые истории заменяют часть сообщений. Endpoint для отладки. +LLM context (what the model actually sees). May differ from `messages` — compressed history replaces old turns with a summary. Debug endpoint. **Response `200`** ```json @@ -199,23 +226,33 @@ "profile_id": "secretary", "message_count": 8, "total_chars": 4200, - "context": [ ...сообщения в том же формате, что и messages... ] + "context": [ ...same format as messages... ] } ``` --- +#### `GET /sessions/{session_id}/planning` + +All planning phase debug logs for the session. Each entry is one planning run. + +**Response `200`** +```json +{ "session_id": "...", "logs": [ { "phase": "...", "output": "..." }, ... ] } +``` + +--- + #### `POST /sessions/{session_id}/files` -Загрузить файл для сессии. Используется перед отправкой сообщения, если нужно приложить файл. +Upload a file for a session. Call before sending a message to attach the file. 
-**Request**: `multipart/form-data`, поле `file`. +**Request**: `multipart/form-data`, field `file`. -**Ограничения** -- Максимальный размер: 200 MB -- Запрещённые расширения: `.exe`, `.dll`, `.so`, `.sh`, `.bat`, `.cmd`, `.ps1`, `.vbs`, `.bin`, `.elf` и другие исполняемые форматы -- При совпадении имён файл переименовывается (`file_1.txt`, `file_2.txt`, ...) -- Файлы удаляются автоматически через 24 часа неактивности сессии +**Limits** +- Max size: 200 MB +- Forbidden extensions: `.exe`, `.dll`, `.so`, `.sh`, `.bat`, `.cmd`, `.ps1`, `.vbs`, `.bin`, `.elf`, and other executable formats +- Duplicate filenames get a numeric suffix **Response `201`** ```json @@ -228,9 +265,21 @@ ``` **Errors** -- `400` — запрещённое расширение -- `404` — сессия не найдена -- `413` — файл превышает лимит +- `400` — forbidden extension +- `404` — session not found +- `413` — file exceeds limit + +--- + +#### `GET /sessions/{session_id}/files/{filename}` + +Download or view an uploaded file. Images, PDFs and plain text are served inline; everything else as an attachment. + +**Response `200`** — file bytes + +**Errors** +- `403` — path traversal attempt +- `404` — session or file not found --- @@ -238,23 +287,23 @@ #### `POST /sessions/{session_id}/messages` -Отправить сообщение и получить ответ синхронно (без стриминга). Блокирует до завершения всего цикла агента. +Send a message and receive a response synchronously (no streaming). Blocks until the full agent loop completes. **Request body** ```json -{ "content": "Сколько звёзд в галактике?" } +{ "content": "How many stars are in the galaxy?" } ``` **Response `200`** ```json -{ "role": "assistant", "content": "По оценкам, от 100 до 400 миллиардов." } +{ "role": "assistant", "content": "Estimates range from 100 to 400 billion." 
} ``` **Errors** -- `404` — сессия не найдена -- `500` — ошибка агента или превышен лимит итераций +- `404` — session not found +- `500` — agent error or iteration limit exceeded -> Для production-клиентов предпочтительнее WebSocket — он даёт стриминг, прогресс выполнения инструментов и мышление модели. +> For production clients prefer WebSocket — it provides streaming, tool progress, and model reasoning. --- @@ -262,24 +311,24 @@ ### `WS /ws/sessions/{session_id}` -Основной канал для общения с агентом в реальном времени. Поддерживает стриминг текста, стриминг мышления, события инструментов, прикрепление файлов и изображений. +Main channel for real-time agent interaction. Supports text streaming, thinking streaming, tool events, file and image attachment. -**Подключение**: если сессия не найдена, сервер закрывает соединение с кодом `4004`. +**Connect**: if the session is not found, the server closes with code `4004`. -**Reconnect**: если клиент переподключается во время активного стрима, сервер автоматически дошлёт пропущенные события — никаких дополнительных действий не требуется. +**On connect**: the server immediately sends `session_sync` (no active run) or starts the reconnect replay flow (run in progress). --- ### Client → Server -Все сообщения от клиента — JSON-объекты. +All client messages are JSON objects. -#### Отправка сообщения +#### Send a message ```json { "type": "message", - "content": "Текст сообщения", + "content": "Message text", "images": ["base64string...", "..."], "files": [ { "name": "report.pdf", "size": 102400, "path": "session_files/.../report.pdf" } @@ -287,32 +336,32 @@ } ``` -| Поле | Обязательно | Описание | -|-----------|-------------|----------| -| `type` | да | Всегда `"message"` | -| `content` | да | Текст сообщения (непустой) | -| `images` | нет | Список base64-строк изображений. 
Допускается как чистый base64, так и `data:image/...;base64,...` — сервер обрежет префикс автоматически | -| `files` | нет | Файлы, загруженные через `POST /sessions/{id}/files`. Сервер дописывает их пути в текст сообщения, чтобы агент знал об их существовании | +| Field | Required | Description | +|-----------|----------|-------------| +| `type` | yes | Always `"message"` | +| `content` | yes | Message text (non-empty) | +| `images` | no | Base64 image list. Both raw base64 and `data:image/...;base64,...` are accepted — server strips the prefix | +| `files` | no | Files uploaded via `POST /sessions/{id}/files`. Server appends their paths to the message content | --- ### Server → Client -Сервер присылает события последовательно в порядке их возникновения. +Events arrive in the order they are emitted. #### `stream_start` ```json { "type": "stream_start" } ``` -Начало обработки сообщения. Клиент должен заблокировать ввод. +Processing started. Client should block input. --- #### `thinking_delta` ```json -{ "type": "thinking_delta", "delta": "фрагмент мышления..." } +{ "type": "thinking_delta", "delta": "reasoning fragment..." } ``` -Фрагмент внутреннего рассуждения модели (streaming). Приходит только если у модели включён режим thinking. Накапливайте `delta` до `thinking_end`. +Streaming chunk of model reasoning. Accumulate until `thinking_end`. --- @@ -320,7 +369,7 @@ ```json { "type": "thinking_end" } ``` -Мышление завершено. После этого начнётся либо стриминг текста (`stream_delta`), либо вызов инструментов. +Reasoning phase complete. Next will be `stream_delta` or tool calls. --- @@ -328,11 +377,39 @@ ```json { "type": "turn_thinking", - "thinking": "полный текст рассуждения...", + "thinking": "full reasoning text...", "is_subagent": false } ``` -Блок мышления модели во время выбора инструмента. Приходит целиком (не по кускам). `is_subagent: true` означает, что мышление от субагента внутри `spawn_agent`. +Complete reasoning block from a tool-calling turn. 
Not streamed — arrives whole. +`is_subagent: true` means this reasoning came from a subagent inside `spawn_agent`. + +--- + +#### `planning_status` +```json +{ + "type": "planning_status", + "phase": "analysis", + "label": "Analysing request...", + "is_subagent": false +} +``` +Progress update during the planning phase. `phase` is one of `analysis`, `reflect`, `plan`. +`is_subagent: true` — route into the spawn_agent card, not the top-level UI. + +--- + +#### `plan_ready` +```json +{ + "type": "plan_ready", + "plan": "1. Step one\n2. Step two\n...", + "is_subagent": false +} +``` +Planning complete — full step list. Rendered as a collapsible plan card. +`is_subagent: true` — route into the spawn_agent card. --- @@ -341,11 +418,12 @@ { "type": "tool_started", "tool": "web_search", - "args": { "query": "погода в москве" }, + "args": { "query": "weather in moscow" }, "is_subagent": false } ``` -Агент начал выполнение инструмента. Приходит немедленно, до завершения вызова — для показа спиннера. `is_subagent: true` — вызов из субагента. +Agent started executing a tool. Arrives before execution completes — show a spinner. +`is_subagent: true` — call from a subagent. --- @@ -354,21 +432,22 @@ { "type": "tool_call", "tool": "web_search", - "args": { "query": "погода в москве" }, - "result": "Сегодня в Москве +12°C, облачно.", + "args": { "query": "weather in moscow" }, + "result": "Today +12°C, cloudy.", "success": true, "is_subagent": false } ``` -Инструмент завершил работу. Приходит после `tool_started` с тем же `tool` и `args`. `success: false` — инструмент вернул ошибку. +Tool finished. Arrives after `tool_started` with the same `tool` and `args`. +`success: false` — tool returned an error. --- #### `stream_delta` ```json -{ "type": "stream_delta", "delta": "фрагмент ответа..." } +{ "type": "stream_delta", "delta": "response fragment..." } ``` -Фрагмент финального текстового ответа агента (streaming). Накапливайте в строку. +Streaming chunk of the final text response. 
Accumulate into a string. --- @@ -376,12 +455,23 @@ ```json { "type": "stream_end", - "content": "полный текст ответа", + "content": "full response text", "context_tokens": 4913, - "max_context_tokens": 65536 + "max_context_tokens": 65536, + "elapsed_seconds": 12.4, + "tool_call_count": 3, + "token_count": 1842 } ``` -Агент завершил ответ. `content` — полный накопленный текст (дублирует сумму `stream_delta`). `context_tokens` — сколько токенов использовано в контексте. Клиент должен разблокировать ввод. +Agent finished. `content` is the full accumulated text (duplicates the sum of `stream_delta`). Client should unblock input. + +--- + +#### `stream_stopped` +```json +{ "type": "stream_stopped" } +``` +Generation was stopped by `POST /sessions/{id}/stop`. --- @@ -393,7 +483,7 @@ "profile_name": "Server Administrator" } ``` -Агент переключил профиль сессии через инструмент `switch_profile`. Новый профиль (набор инструментов и system prompt) вступит в силу с **следующего** пользовательского сообщения. Клиент должен обновить заголовок и прочие UI-элементы, отображающие текущий профиль. Событие приходит во время стрима — до завершения `tool_call` для `switch_profile`. +Agent switched profile via `switch_profile` tool. New profile takes effect on the **next** user message. Client should update the profile indicator. Arrives during the stream — before `tool_call` for `switch_profile`. --- @@ -402,10 +492,46 @@ { "type": "context_compressed", "messages_before": 42, - "messages_after": 12 + "messages_after": 12, + "summary": "User asked about..." } ``` -Контекст был автоматически сжат после ответа (компрессия срабатывает при заполнении ≥80% контекстного окна). Информационное событие. +Context was automatically compressed (triggers at ≥80% of context window). Informational. + +--- + +#### `heartbeat` +```json +{ "type": "heartbeat" } +``` +Keepalive ping sent every 20 s during long silent operations. Client can ignore. 
+ +--- + +#### `session_sync` +```json +{ "type": "session_sync" } +``` +Client must reload session history from `GET /sessions/{id}`. Sent: +1. On connect when no run is active (agent may have finished while disconnected). +2. After a reconnect-replay flow completes (ensures client sees the fully saved response). + +--- + +#### `replay_start` +```json +{ "type": "replay_start", "count": 14 } +``` +About to replay `count` buffered events from a mid-stream reconnect. +Client should suppress cursor animations and in-progress effects during replay. + +--- + +#### `replay_end` +```json +{ "type": "replay_end" } +``` +Replay complete. Live events will follow. --- @@ -413,75 +539,102 @@ ```json { "type": "error", "message": "Session not found" } ``` -Ошибка обработки. После некоторых ошибок стрим продолжается, после других — завершается. +Processing error. Stream may or may not continue after this. --- -### Типичная последовательность событий +### Typical event sequences -**Простой вопрос без инструментов:** +**Simple question, no tools:** ``` stream_start -thinking_delta × N (если модель думает) +thinking_delta × N (if model has thinking enabled) thinking_end stream_delta × N stream_end ``` -**Запрос с вызовом инструментов:** +**Request with tool calls:** ``` stream_start -turn_thinking (мышление при выборе инструмента, если есть) +turn_thinking (reasoning before tool selection, if any) tool_started tool_call -turn_thinking (мышление перед следующим инструментом, если есть) +turn_thinking (before next tool, if any) tool_started tool_call -thinking_delta × N (финальный ответ) +thinking_delta × N (final response reasoning) thinking_end stream_delta × N stream_end -context_compressed (опционально, если контекст переполнен) +context_compressed (optional, if context was near full) ``` -**Запрос с субагентом (`spawn_agent`):** +**Request with planning enabled:** +``` +stream_start +planning_status (phase: analysis) +planning_status (phase: plan) +plan_ready +turn_thinking 
+tool_started +tool_call +... +stream_end +``` + +**Request with subagent (`spawn_agent`):** ``` stream_start tool_started (spawn_agent, is_subagent=false) turn_thinking (is_subagent=true) - tool_started (инструмент субагента, is_subagent=true) + planning_status (is_subagent=true, if subagent has planning) + plan_ready (is_subagent=true, if subagent has planning) + tool_started (subagent tool, is_subagent=true) tool_call (is_subagent=true) -tool_call (spawn_agent завершён, is_subagent=false) +tool_call (spawn_agent done, is_subagent=false) stream_delta × N stream_end ``` -**Переключение профиля (`switch_profile`):** +**Reconnect mid-stream:** +``` +stream_start +replay_start {"count": N} +ev_0 ... ev_N-1 (buffered events replayed verbatim) +replay_end +(live events continue) +... +stream_end +session_sync +``` + +**Profile switch (`switch_profile`):** ``` stream_start tool_started (switch_profile) -profile_switched (до tool_call — клиент обновляет UI здесь) -tool_call (switch_profile завершён) -stream_delta × N (Нavi сообщает о смене профиля) +profile_switched (client updates UI here — before tool_call) +tool_call (switch_profile done) +stream_delta × N stream_end ``` --- -## Файлы +## Files -**Статика клиента**: `GET /static/**` — раздаётся из директории `client/`. Заголовок `Cache-Control: no-store`. +**Client static**: `GET /static/**` — served from `client/` directory. Header `Cache-Control: no-store`. -**Загруженные файлы сессии**: хранятся в `session_files/{session_id}/`. Агент обращается к ним напрямую через инструмент `filesystem`. Удаляются через 24 часа неактивности сессии или при удалении сессии. +**Session uploaded files**: stored in `session_files/{session_id}/`. Agent accesses them via the `filesystem` tool. Deleted after 24 h of session inactivity or when the session is deleted. 
--- -## Коды ошибок +## Error codes -| HTTP | Причина | -|------|---------| -| `400` | Запрещённый тип файла | -| `404` | Сессия или профиль не найдены | -| `413` | Файл превышает 200 MB | -| `500` | Внутренняя ошибка агента | -| WS `4004` | Сессия не найдена при подключении | +| HTTP | Reason | +|------|--------| +| `400` | Forbidden file type | +| `404` | Session or profile not found | +| `413` | File exceeds 200 MB | +| `500` | Internal agent error | +| WS `4004` | Session not found on connect | diff --git a/docs/index.md b/docs/index.md index 94ad8bd..768ef83 100644 --- a/docs/index.md +++ b/docs/index.md @@ -41,7 +41,7 @@ | `navi/core/registry.py` | `build_default_registries()` — wires everything together | | `navi/api/websocket.py` | WebSocket handler + `POST /sessions/{id}/stop` | | `navi/config.py` | `Settings` — all config loaded from `.env` | -| `navi/profiles/` | Profile definitions (`secretary`, `server_admin`, `smart_home`) | +| `navi/profiles/` | Profile definitions (`secretary`, `server_admin`, `developer`) | | `tools/` | User-defined tools (auto-discovered at startup) | ## Stack diff --git a/docs/sessions.md b/docs/sessions.md index 1662528..e0411db 100644 --- a/docs/sessions.md +++ b/docs/sessions.md @@ -12,10 +12,23 @@ context: list[Message] # LLM context — may be replaced with summary context_token_count: int # accumulated tokens; reset to 0 after compression pinned: bool # pinned sessions appear first in sidebar + name: str | None # auto-generated display name (set after first exchange) created_at: datetime last_active: datetime + planning_logs: list[dict] # raw planning phase outputs per turn (debug) ``` +## Message flags + +Messages in `session.messages` carry optional flags beyond role/content: + +| Flag | Purpose | +|---|---| +| `is_plan: bool` | Message is a planning phase output (shown as plan card in UI, not text) | +| `is_compression: bool` | Marker message injected when context compression ran | +| `is_summary: bool` | A summary message 
replacing compressed history in `session.context` | +| `thinking: str \| None` | LLM reasoning captured during a tool-calling turn | + ## Dual-buffer design Two separate message lists serve different purposes: @@ -43,6 +56,7 @@ - `list_all()` → sorted by `(pinned DESC, last_active DESC)` - `delete(session_id)` → `bool` - `set_pinned(session_id, pinned)` → `bool` +- `set_name(session_id, name)` → `bool` DB path: `settings.db_path` (default: `navi.db`). @@ -106,3 +120,10 @@ ``` This lets the agent use `filesystem` or `code_exec` to access the files. + +--- + +## Debug endpoints + +- `GET /sessions/{id}/context` — returns what the LLM actually sees (may differ from `messages` after compression). +- `GET /sessions/{id}/planning` — returns `session.planning_logs`: raw planning phase outputs per turn. diff --git a/docs/tools.md b/docs/tools.md index 8cd05f4..71f476e 100644 --- a/docs/tools.md +++ b/docs/tools.md @@ -37,7 +37,7 @@ - `tools/enabled.json` — list of user tool names to include in all profiles automatically. - `tools/_template.py` — canonical format reference (not loaded). -Currently present: `get_current_datetime.py`, `user_notes.py`. +Currently present: `get_current_datetime.py`, `user_notes.py`, `text_formatter.py`, `internal_monitor.py`, `weather.py`, `gmail.py`, `instagram_engine.py`, `instagram_viewer.py`. --- diff --git a/docs/websocket.md b/docs/websocket.md index 80a7a3b..76a671e 100644 --- a/docs/websocket.md +++ b/docs/websocket.md @@ -8,6 +8,8 @@ The session must exist before connecting (create via `POST /sessions`). If the session is not found, the WebSocket closes with code `4004`. +On connect the server immediately sends either `session_sync` (no active run) or begins the reconnect flow (active run detected). 
+ --- ## Messages: client → server @@ -37,7 +39,7 @@ | Frame | When | |---|---| | `{"type": "stream_start"}` | Before any agent output begins | -| `{"type": "stream_end", "content": "...", "context_tokens": N, "max_context_tokens": N}` | After final text, before workers | +| `{"type": "stream_end", "content": "...", "context_tokens": N, "max_context_tokens": N, "elapsed_seconds": N, "tool_call_count": N, "token_count": N}` | After final text, before workers | | `{"type": "stream_stopped"}` | If the user stopped generation | | `{"type": "error", "message": "..."}` | On any unhandled error | @@ -55,9 +57,12 @@ | Frame | When | |---|---| -| `{"type": "plan_ready", "plan": "..."}` | Before tool-calling loop if `planning_enabled` and a plan was generated | +| `{"type": "planning_status", "phase": "analysis\|reflect\|plan", "label": "...", "is_subagent": bool}` | During planning phase — progress label for UI | +| `{"type": "plan_ready", "plan": "...", "is_subagent": bool}` | Before tool-calling loop if `planning_enabled` and a plan was generated | -Rendered as a collapsible plan card in the UI. +`planning_status` frames arrive during each planning phase (analysis → optional reflect → plan). `is_subagent: true` means the planning is running inside a subagent — route it into the spawn_agent card, never into the top-level UI. + +`plan_ready` carries the formatted step list. Rendered as a collapsible plan card in the UI. 
### Tool calls @@ -78,8 +83,14 @@ | Frame | When | |---|---| -| `{"type": "context_compressed", "messages_before": N, "messages_after": N}` | After context compression runs | +| `{"type": "context_compressed", "messages_before": N, "messages_after": N, "summary": "..."}` | After context compression runs | | `{"type": "profile_switched", "profile_id": "...", "profile_name": "..."}` | When `switch_profile` tool succeeds | +| `{"type": "heartbeat"}` | Periodic keepalive during long silent operations (every 20 s) | +| `{"type": "session_sync"}` | Client should reload session history from REST (`GET /sessions/{id}`) | + +`session_sync` is sent in two situations: +1. On fresh connect when no run is active — in case the agent finished while the client was disconnected. +2. After a reconnect-and-replay completes — to ensure the client sees the fully saved response. --- @@ -100,7 +111,21 @@ ## Reconnection -If the client reconnects to an in-progress run (e.g. page reload mid-stream), `websocket_session()` detects an existing `_AgentRun` in `_runs` and subscribes a new queue to it. The client resumes receiving events from that point forward. +If the client reconnects to an in-progress run (e.g. page reload mid-stream), `websocket_session()` detects the existing `_AgentRun` in `_runs` and replays the full event buffer before routing live events: + +``` +← stream_start +← replay_start {"type": "replay_start", "count": N} +← ev_0 ... ev_N-1 (all buffered events replayed verbatim) +← replay_end {"type": "replay_end"} +← (live events continue from here) +... +← session_sync (after stream finishes — sync final saved state) +``` + +The client should suppress cursor animations and other in-progress effects while `replay_start`..`replay_end` is in flight. + +If the client reconnects after the run has already finished, there is no active `_AgentRun`, so it receives only `session_sync` and must fetch history via REST. 
--- @@ -112,5 +137,6 @@ - `task: asyncio.Task` — the running agent task - `stop_event: asyncio.Event` — cooperative stop signal - `subscribers: list[Queue]` — one queue per connected WebSocket client +- `events: list[dict]` — replay buffer; every serialised event dict emitted this turn -Events are broadcast to all subscribers. When the run finishes, `_runs.pop(session_id)` is called from the `finally` block. +Events are broadcast to all subscribers **and** appended to `events`. When the run finishes, `_runs.pop(session_id)` is called from the `finally` block. On reconnect the handler subscribes its queue first and only then notes the current event count for the replay; because asyncio runs on a single thread, no event can be emitted between those two steps, so none is missed.