diff --git a/docs/index.md b/docs/index.md
index 5a927ba..07ca1e2 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -22,6 +22,7 @@
 
 | File | What it covers |
 |---|---|
+| [`mechanics.md`](mechanics.md) | **Master catalog of all mechanisms** — use before designing new features |
 | [`architecture.md`](architecture.md) | Component diagram, data flow, dependency graph |
 | [`agent.md`](agent.md) | Agent loop, planning phase, tool execution, subagents, workers |
 | [`tools.md`](tools.md) | Built-in tools, user tool format, hot-reload, self-extension |
diff --git a/docs/mechanics.md b/docs/mechanics.md
new file mode 100644
index 0000000..86c06a9
--- /dev/null
+++ b/docs/mechanics.md
@@ -0,0 +1,462 @@
+# Mechanics Catalog
+
+Master catalog of every mechanism, feature, behavior, and configurable flag in the Navi project.
+
+Use this document before designing a new feature to check whether an existing mechanism can be reused, extended, or should be replaced.
+
+---
+
+## How to read this catalog
+
+| Column | Meaning |
+|---|---|
+| **Mechanic** | Short name. Click the source file link to see the implementation. |
+| **Description** | What it does and when it triggers. |
+| **Config / Flags** | `.env` variables or `config.json` profile fields that control it. |
+| **Files** | Primary implementation file(s). |
+| **Docs** | `✅` = documented in `docs/`. `❌` = not documented anywhere yet. `⚠️` = partially documented. |
+
+---
+
+## Agent Loop (`navi/core/agent.py`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **Streaming entry point** | `run_stream()` — yields `AgentEvent` objects in real time. Loads session, runs planning (if enabled), tool loop, workers. | `profile.max_iterations`, `profile.llm_backend`, `profile.model`, `profile.temperature` | `agent.py` | ✅ |
+| **Non-streaming entry point** | `run()` — same loop but returns plain string. No planning phase, no events. | Same as above | `agent.py` | ✅ |
+| **Streaming guard wrapper** | Wraps `llm.stream_complete()` with two safety layers: (1) polls `stop_event` every second during prefill so the Stop button works even when the model emits no chunks, and (2) hard `first_chunk_timeout`/`chunk_timeout` deadlines that close the HTTP connection to Ollama so GPU load drops. | `LLM_STREAM_FIRST_CHUNK_TIMEOUT`, `LLM_STREAM_CHUNK_TIMEOUT` | `agent.py` | ❌ |
+| **Subagent thinking stall detector** | Monitors subagent streaming; if only `thinking` output is emitted for 60 s or 12 000 chars without text/tool calls, aborts the subagent to prevent endless internal-token loops on local models. | Hard-coded `_SUBAGENT_THINKING_STALL_SECONDS=60.0`, `_SUBAGENT_THINKING_STALL_CHARS=12000` | `agent.py` | ❌ |
+| **Cooperative stop** | Checks `current_stop_event` (asyncio.Event) before each LLM call, during streaming, and after tool execution. Uses clean generator close — never `task.cancel()`. | None | `agent.py` | ✅ |
+| **First-message forced planning** | Planning phase always runs on the first user message in a session regardless of `profile.planning_enabled`. | `profile.planning_enabled` (only affects subsequent turns) | `agent.py` | ✅ |
+| **Profile reload mid-session** | After each tool execution batch, checks DB for profile ID change (e.g. from `switch_profile`). If changed, reloads profile, tools, schemas, and backend for next iteration. | None | `agent.py` | ✅ |
+| **Pre-turn context compression** | Before assistant reply, checks `session.context_token_count` against threshold and compresses if exceeded. | `CONTEXT_COMPRESSION_ENABLED`, `OLLAMA_NUM_CTX`, `CONTEXT_COMPRESSION_THRESHOLD` | `agent.py` | ✅ |
+| **Mid-turn context compression** | On iterations > 0, estimates tokens and triggers compression with `keep_recent_messages=max(12, CONTEXT_KEEP_RECENT*2)`. For long autonomous loops where the entire conversation is one turn. | Same as above + `CONTEXT_KEEP_RECENT` | `agent.py` | ❌ |
+| **Context size check with output reserve** | Raises `ContextTooLargeError` if estimated input tokens exceed `OLLAMA_NUM_CTX - OUTPUT_RESERVE_TOKENS`. Images counted at 500 tokens each. | `OLLAMA_NUM_CTX`, `OUTPUT_RESERVE_TOKENS` | `agent.py` | ⚠️ |
+| **Local token estimation** | Conservative estimate: `chars // 4 + imgs * 500`. Used for preflight context size checks. | None | `agent.py` | ❌ |
+| **Anti-stall detection** | Tracks two signals: (1) consecutive iterations with no todo status change, (2) identical tool call signatures. When either hits threshold, injects a hard warning system message. | `profile.anti_stall_enabled`, `profile.anti_stall_threshold` | `agent.py` | ✅ |
+| **Adaptive replan on failure** | Detects newly-failed todo steps after each tool batch and queues a re-planning system message for the next iteration. | `profile.adaptive_replan_enabled` | `agent.py` | ✅ |
+| **Goal anchoring** | Injects `[Goal anchor]` system message with original request + todo state every N iterations. | `profile.goal_anchoring_enabled`, `profile.goal_anchoring_interval` | `agent.py` | ✅ |
+| **Todo status snapshot** | Captures frozenset of `(task_text, status)` before each iteration so anti-stall can detect progress. | None | `agent.py` | ❌ |
+| **Todo failed-steps tracking** | Captures frozenset of `(index, text)` for failed steps, used by adaptive replan. | None | `agent.py` | ❌ |
+| **Todo progress message injection** | Injects compact system reminder with current todo state and discipline notes at start of every iteration. | None | `agent.py` | ❌ |
+| **Memory facts deduplication** | Tracks `_injected_fact_ids` across a single `run_stream` call so the same memory fact is not injected twice in one turn. | None | `agent.py` | ❌ |
+| **Context injection collection (parallel)** | Fires `_collect_context_injections` and `_memory_facts_msg` concurrently before each turn. | `profile.context_providers` | `agent.py` | ❌ |
+| **MCP server group expansion** | Resolves `profile.mcp_servers`: `*` expands to all tools for that server; named groups resolve via `mcp_manager.resolve_group`. | `profile.mcp_servers` | `agent.py` | ⚠️ |
+| **User-enabled tools merge** | Loads extra tool names from `tools/enabled.json` and appends to profile's `enabled_tools`. | `settings.tools_dir` | `agent.py` | ⚠️ |
+| **Recall message wrapping** | When `is_recall=True`, prefixes user message with `[Scheduled recall — execute this task]\n\n`. | None | `agent.py` | ❌ |
+| **Per-tool-call event sink** | Creates `asyncio.Queue` for each tool call so subagents can emit events back to parent in real time. | None | `agent.py` | ✅ |
+| **Display vs context message splitting** | Accepts separate `display_message` (shown in UI) and `user_message` (sent to LLM, may contain injected hints). | None | `agent.py` | ✅ |
+| **Post-turn workers** | Runs registered workers sequentially after `StreamEnd`. | None | `agent.py` | ✅ |
+
+## Subagent (`navi/core/agent.py` `run_ephemeral`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **Ephemeral execution** | Runs tool loop without persistent session, temporary in-memory context. Returns `(result_text, completed_normally)`. | `max_iterations` param, `timeout_seconds` param | `agent.py` | ✅ |
+| **Inherit system prompt** | When `inherit_system_prompt=True`, prepends parent's `profile.system_prompt` as base layer, then subagent specialization on top. | `inherit_system_prompt` param | `agent.py` | ✅ |
+| **Context transfer priming** | If `context_transfer` provided, injects it as synthetic user/assistant exchange before task message. | `context_transfer` param | `agent.py` | ❌ |
+| **Wall-clock timeout** | Monitors elapsed time; aborts and returns `[Sub-agent timed out]` if exceeded. | `timeout_seconds` param (default 300.0) | `agent.py` | ✅ |
+| **Subagent planning phase** | Optionally runs full 3-phase planning before tool loop for subagents. | `profile.subagent_planning_enabled` | `agent.py` | ✅ |
+| **Parent session ID passthrough** | Sets session ContextVar to parent's ID so session-aware tools resolve paths correctly. | `parent_session_id` param | `agent.py` | ✅ |
+| **Dedicated subagent tool list** | Uses `profile.subagent_tools` if non-empty; falls back to `profile.enabled_tools`. | `profile.subagent_tools` | `agent.py` | ✅ |
+| **ContextVar restoration** | Saves/restores `current_session_id`, `current_model`, `current_user_id`, `current_user_role`, `current_user_info` in `finally` block. | None | `agent.py` | ❌ |
+
+## Planning Pipeline (`navi/core/planning.py`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **3-phase planning engine** | Orchestrates Phase 1 (analysis), Phase 2 (review), Phase 3 (execution plan) as async generator. | `profile.planning_phase1_enabled`, `profile.planning_phase2_enabled`, `profile.planning_phase3_enabled`, `profile.planning_mandatory`, `profile.planning_enabled` | `planning.py` | ✅ |
+| **Phase 1 — Task analysis** | LLM call reformulates task, identifies subtasks, unknowns, resources. Can output `DIRECT` to skip planning. | `profile.think_enabled`, `profile.planning_phase1_enabled` | `planning.py` | ✅ |
+| **Phase 2 — Structured review** | One critique pass when `planning_phase2_enabled=True` and Phase 1 outputs `REFLECT: yes`. Returns Critic/Pragmatist/Detailer/Plan Adjustments. | `profile.planning_phase2_enabled` | `planning.py` | ✅ |
+| **Phase 3 — Execution plan** | Produces milestones + numbered steps with executor assignments (`TOOL:`, `AGENT:`, `SELF`). Enforces comma-test splitting. | `profile.planning_phase3_enabled` | `planning.py` | ✅ |
+| **Auto-populate todo from plan** | Parses Phase 3 steps and calls `todo.set_tasks()` to initialize session todo list. | None | `planning.py` | ✅ |
+| **Plan step parser** | Regex extracts numbered step lines from `**Steps:**` section. | None | `planning.py` | ❌ |
+| **Planning debug data logging** | Accumulates per-phase outputs, tokens, timestamps into `_dbg` dict and yields `PlanningDebugData`. | None | `planning.py` | ❌ |
+| **Knowledge store rules** | Hard-coded prompt rules distinguishing `memory` vs MCP knowledge servers vs docs. | None | `planning.py` | ❌ |
+
+## Context Builder (`navi/core/context_builder.py`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **System prompt caching** | Caches built system prompt per profile ID to avoid rebuilding on every turn. Provides `invalidate_system_prompt_cache()`. | None | `context_builder.py` | ❌ |
+| **Persona + profile construction** | Prepends global `NAVI_PERSONA` to profile's `system_prompt`, separated by `---`. | `NAVI_PERSONA_FILE` | `context_builder.py` | ✅ |
+| **Cross-profile awareness** | Appends `## Available profiles` block listing all other profiles with descriptions. | None | `context_builder.py` | ⚠️ |
+| **Memory summary message** | Injects `## What I remember about the user` if memory store has a summary. | None | `context_builder.py` | ✅ |
+| **Memory facts message** | Searches memory facts from user message. Skips messages ≤20 chars or <2 words. Limits: 1 fact for <50 chars, 2 for ≤150, 3 otherwise. Deduplicates. | None | `context_builder.py` | ❌ |
+| **Context provider injection** | Injects global providers unconditionally, profile-named providers only if listed in `profile.context_providers`. | `profile.context_providers` | `context_builder.py` | ⚠️ |
+| **Goal anchor builder** | Constructs `[Goal anchor]` system message with original request + todo lines. | None | `context_builder.py` | ❌ |
+| **Security policy message** | Injects `[Security policy]` based on `current_user_role`: admin = full access; user = sandbox + terminal allowlist. | `TERMINAL_ALLOWED_COMMANDS` | `context_builder.py` | ❌ |
+| **User context message** | Builds `[User context]` from `current_user_info` (display_name, email, locale, etc.). | None | `context_builder.py` | ❌ |
+| **MCP context message** | Combines MCP server instructions from handshake with overlay instructions from `mcp_servers.d/*.json`. | `profile.mcp_servers` | `context_builder.py` | ❌ |
+| **Iteration budget message** | Appends `[Iteration N/M — K after this one]` with escalating urgency when ≤2 or ≤5 remaining. | `profile.iteration_budget_enabled` | `context_builder.py` | ✅ |
+| **Session context injection** | Appends session ID and exact `session_files_dir/{session_id}/` path. | `SESSION_FILES_DIR` | `context_builder.py` | ❌ |
+
+## Context Compression (`navi/core/compressor.py`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **Threshold-based trigger** | Returns `True` when `context_tokens >= max_context_tokens * threshold`. | `CONTEXT_COMPRESSION_THRESHOLD` | `compressor.py` | ✅ |
+| **Turn-based partitioning** | Groups messages into turns. Keeps last `keep_recent` turns verbatim; older go to summarization. Tool call groups never split. | `CONTEXT_KEEP_RECENT` | `compressor.py` | ❌ |
+| **Mid-turn fallback partitioning** | For long autonomous loops where entire conversation is one turn: keeps current request + newest N messages verbatim, summarizes older messages from same turn. | `keep_recent_messages` param | `compressor.py` | ❌ |
+| **Summary input formatter** | Renders messages as plain text for summarizer: preserves summaries, notes image counts, renders tool calls compactly, collects base64 images for vision models. | None | `compressor.py` | ❌ |
+| **Summary input truncate** | Hard cap of 24 000 chars on formatted input sent to summarizer LLM. | Hard-coded `_MAX_SUMMARY_INPUT_CHARS=24000` | `compressor.py` | ❌ |
+| **LLM-based summarization** | Calls LLM with structured summarization prompt and replaces old messages with `is_summary=True` user message. | `CONTEXT_SUMMARY_TEMPERATURE`, `CONTEXT_SUMMARY_MAX_TOKENS` | `compressor.py` | ✅ |
+
+## Tool Execution (`navi/core/tool_executor.py`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **Tool name resolution** | Resolves exact names, then falls back through 3 MCP alias heuristics: (1) bare suffix match, (2) dash→underscore, (3) old underscore format. | None | `tool_executor.py` | ❌ |
+| **Sequential execution** | Gathers all tool calls with `asyncio.gather` and returns `(tool_msgs, image_msgs)`. | None | `tool_executor.py` | ❌ |
+| **Streaming execution** | Same as sequential but yields `ToolEvent` objects alongside messages for UI rendering. | None | `tool_executor.py` | ❌ |
+| **Middleware hooks** | Calls `before_execute` and `after_execute` on all registered middlewares around each tool call. | None | `tool_executor.py` | ❌ |
+| **Image message generation** | If tool result has `metadata.is_image` + `metadata.base64`, synthesizes vision user message for next LLM call. | None | `tool_executor.py` | ❌ |
+
+## AI Helper (`navi/core/ai_helper.py`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **Single-turn wrapper** | Thin reusable wrapper over `LLMBackend` for tools needing quick LLM call. Uses `current_model` ContextVar with fallback. | `default_model` param, `temperature` param (default 0.1) | `ai_helper.py` | ❌ |
+| **`ask()` with timeout** | Non-streaming call with 120-second `asyncio.wait_for` timeout. | Hard-coded `120` | `ai_helper.py` | ❌ |
+| **`ask_json()`** | Calls `ask()` then parses JSON, handling markdown code fences automatically. | None | `ai_helper.py` | ❌ |
+| **Token usage emission** | Emits `AIHelperTokensUsed` into `current_event_sink` for parent session metrics. | None | `ai_helper.py` | ❌ |
+| **JSON extractor** | Strips code fences, tries direct parse, then bracket-matching to find outermost `[]` or `{}`. | None | `ai_helper.py` | ❌ |
+
+## Orchestrator (`navi/core/orchestrator.py`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **Orchestrator stub** | Placeholder class for future multi-agent orchestration. Raises `NotImplementedError` if instantiated. | None | `orchestrator.py` | ❌ |
+
+## Event Bus (`navi/core/event_bus.py`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **Async pub/sub** | Type-specific subscription plus catch-all. Publishes by creating `asyncio.Task` per subscriber and gathering with `return_exceptions=True`. | None | `event_bus.py` | ❌ |
+| **Global singleton** | Lazy-initialized default bus via `get_event_bus()`; replaceable via `set_event_bus()`. | None | `event_bus.py` | ❌ |
+
+## Registries (`navi/core/registry.py`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **`ToolRegistry`** | Holds built-in, user-defined, and external/MCP tools. Supports hot-reload without touching builtins. | `settings.tools_dir` | `registry.py` | ✅ |
+| **`ProfileRegistry`** | Holds agent profiles. Supports lookup and in-memory replacement. | None | `registry.py` | ✅ |
+| **`BackendRegistry`** | Holds LLM backend instances (Ollama, OpenAI, fallback). | `OLLAMA_BACKENDS_FILE`, `OLLAMA_HOST`, `OPENAI_API_KEY`, etc. | `registry.py` | ✅ |
+| **`build_default_registries()`** | Composition root. Discovers backends, creates `AIHelper`, registers all tools, loads user tools, wires cross-references. | `settings.tools_dir`, `settings.context_providers_dir` | `registry.py` | ✅ |
+
+---
+
+## Tools (`navi/tools/`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **FilesystemTool** | Read/write/append/edit/list/find/move/copy/delete/exists/mkdir + AI query/smart_edit + grep/diff. Path restrictions via allowlist. | `FS_ALLOWED_PATHS`, `FS_ALLOWED_PATHS_LIST` | `filesystem.py` | ✅ |
+| **TerminalTool** | Run shell commands. Unrestricted for admins; sandbox + allowlist for users. | `TERMINAL_ALLOWED_COMMANDS`, `TERMINAL_USER_ALLOWED_COMMANDS` | `terminal.py` | ✅ |
+| **SshExecTool** | SSH exec and SCP file transfer. Connection pool per-session with 20-min TTL. | `SSH_HOSTS_FILE` | `ssh_exec.py` | ✅ |
+| **CodeExecTool** | Run Python in subprocess sandbox. Non-admin sandboxed to `user_data//`. | None | `code_exec.py` | ✅ |
+| **ImageViewTool** | Load image from path/URL → resize to 1024px, JPEG, return base64 for LLM. | None | `image_view.py` | ✅ |
+| **MemoryTool** | Save/search/forget/list user facts. Dual search: semantic (cosine) + ILIKE fallback. | None | `memory.py` | ✅ |
+| **TodoTool** | Session-scoped task tracker. Set/view/update/clear. Auto-populated from planning. | None | `todo.py` | ✅ |
+| **ScratchpadTool** | Session-scoped working notes. Write/append/read/clear per section. | None | `scratchpad.py` | ✅ |
+| **SpawnAgentTool** | Spawn isolated subagent with own tool loop. Supports `inherit_system_prompt`, `briefing`, `profile_id`. | None | `spawn_agent.py` | ✅ |
+| **SwitchProfileTool** | Switch active profile for session. Blocks `is_subagent_only` profiles. | None | `switch_profile.py` | ✅ |
+| **ListProfilesTool** | List all profiles with descriptions. Shows `[subagent only]` tag. | None | `list_profiles.py` | ✅ |
+| **ReflectTool** | 3 parallel AI calls (Critic/Pragmatist/Detailer) to challenge assumptions. | None | `reflect.py` | ✅ |
+| **ShareFileTool** | Copy file into session directory, return public download link. | `PUBLIC_URL`, `SESSION_FILES_DIR`, `SHARE_FILE_MAX_SIZE_MB` | `share_file.py` | ✅ |
+| **ContentPublishTool** | Register session file for inline viewing in chat. | `SESSION_FILES_DIR` | `content_publish.py` | ✅ |
+| **ToolManualTool** | Return `manuals/{tool}.md` or auto-generate from schema. | `MANUALS_DIR` | `tool_manual.py` | ✅ |
+| **ListToolsTool** | Return tools enabled for a profile, including MCP group expansion. | `tools/enabled.json` | `list_tools.py` | ✅ |
+| **ReloadToolsTool** | Hot-reload user tools, context providers, and MCP servers without restart. | `settings.tools_dir`, `settings.context_providers_dir` | `reload_tools.py` | ✅ |
+| **ScheduleRecallTool** | Schedule headless callback (once/recurring/immediate). | None | `schedule_recall.py` | ✅ |
+| **ManageRecallTool** | Cancel/skip/list scheduled recalls. | None | `manage_recall.py` | ✅ |
+| **McpStatusTool** | List MCP servers, connection status, exposed tools. | None | `mcp_status.py` | ✅ |
+| **CreateMcpServerTool** | Scaffold new MCP server directory with boilerplate. | None | `create_mcp_server.py` | ✅ |
+| **TestMcpToolTool** | Execute single MCP tool call in isolation for diagnostics. | None | `test_mcp_tool.py` | ✅ |
+
+## Tool Internals (`navi/tools/_internal/`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **`Tool` ABC** | Abstract base. Self-describes via `name`, `description`, `parameters`. | None | `base.py` | ✅ |
+| **`ToolResult`** | Standard return container: `success`, `output`, `error`, `metadata`. | None | `base.py` | ✅ |
+| **ContextVars** | Async-safe vars set before each tool call: `current_session_id`, `current_event_sink`, `current_stop_event`, `current_model`, `current_user_id`, `current_user_role`, `current_user_info`. | None | `base.py` | ✅ |
+| **Tool loader** | Discovers module-level (`name`, `description`, `parameters`, `execute`) and class-based tools. Errors isolated per file. | None | `loader.py` | ✅ |
+| **`ToolMiddleware`** | Pre/post execute hooks for logging/metrics/rate limiting. | None | `middleware.py` | ❌ |
+| **`LoggingMiddleware`** | Logs every tool execution with duration and result summary. | None | `logging_middleware.py` | ❌ |
+| **Time parser** | Parses natural-language times: ISO, relative (`2d 6h`), `in 3 hours`, `tomorrow at 09:00`. | None | `time_parser.py` | ✅ |
+
+## MCP Subsystem (`navi/mcp/`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **`McpClient`** | Official MCP SDK wrapper. stdio or SSE transport. | None | `client.py` | ⚠️ |
+| **`McpServerConfig`** | Pydantic model for server config. Auto-migrates legacy `mcp_servers.json` → `mcp_servers.d/`. | `mcp_servers.d/` directory | `config.py` | ⚠️ |
+| **`McpManager`** | Pool of `McpClient` instances. Lifecycle management, group resolution, tool listing. | `config_path` | `manager.py` | ⚠️ |
+| **`McpTool` proxy** | `Tool` subclass forwarding to MCP server. Namespaced as `mcp::`. | None | `tools.py` | ✅ |
+
+## Sessions (`navi/core/session.py`, `navi/core/pg_session_store.py`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **Session model** | Dual-buffer design: `messages` (display, never compressed) and `context` (LLM input, compressible). | None | `session.py` | ✅ |
+| **`InMemorySessionStore`** | Dict-backed ephemeral store for testing. | None | `session.py` | ✅ |
+| **`PgSessionStore`** | PostgreSQL-backed store. Auto-DDL, JSON serialization, full-text search. | `DATABASE_URL` | `pg_session_store.py` | ✅ |
+| **Session file storage** | Per-session directory on disk. Safe filename sanitization, forbidden extension blocking, orphan cleanup. | `SESSION_FILES_DIR`, `SESSION_FILES_MAX_SIZE_MB`, `SHARE_FILE_MAX_SIZE_MB` | `session_files.py` | ✅ |
+| **Content store** | Registers session files for inline viewing via DB metadata. | `PUBLIC_URL` | `content_store.py` | ⚠️ |
+
+## Memory System (`navi/memory/`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **`MemoryStore`** | Composite: `EmbeddingMixin` + `FactMixin` + `SummaryMixin` + `SessionStateMixin`. Lazy-initializes pool, auto-creates tables. | `DATABASE_URL`, `EMBEDDING_MODEL`, `EMBEDDING_DIMENSIONS` | `store.py` | ✅ |
+| **`EmbeddingMixin`** | Generates embeddings via LLM backend. Single or batch. Backfills missing. | `EMBEDDING_MODEL`, `EMBEDDING_OLLAMA_HOST`, `EMBEDDING_OLLAMA_API_KEY` | `_embeddings.py` | ✅ |
+| **`FactMixin`** | CRUD + dual search: vector (cosine, cutoff 0.3) then ILIKE fallback. Upsert, delete, list, count, categories. | None | `_facts.py` | ✅ |
+| **`SummaryMixin`** | Per-user narrative summaries. Deterministic PK via `zlib.crc32`. | None | `_summary.py` | ✅ |
+| **`SessionStateMixin`** | Tracks which sessions already processed for fact extraction. Prevents duplicates. | None | `_session_state.py` | ✅ |
+| **Fact extraction** | Post-session background extraction from transcripts. LLM parses JSON, upserts facts with confidence, regenerates summary. | Temperature 0.1, summary temperature 0.3, `_MAX_TRANSCRIPT_CHARS=12000` | `extractor.py` | ✅ |
+| **DDL builder** | Conditionally generates DDL based on pgvector availability. Creates tables, indexes, constraints. | `EMBEDDING_DIMENSIONS` | `_ddl.py` | ⚠️ |
+| **Backfill embeddings script** | Batch-generates embeddings for existing facts without them. 8 per batch, 2s sleep. | `DATABASE_URL`, `EMBEDDING_MODEL` | `backfill_embeddings.py` | ✅ |
+| **Migrate pgvector script** | Adds missing pgvector columns and indexes to existing tables. | `DATABASE_URL` | `migrate_pgvector.py` | ✅ |
+
+## Context Providers (`navi/context_providers/`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **`ContextProviderRegistry`** | Loads built-in + user providers. Validates exports. Supports hot-reload. | `CONTEXT_PROVIDERS_DIR` | `_loader.py` | ✅ |
+| **`public_url` provider** | Injects server's public URL into LLM context as system message. | `PUBLIC_URL` | `public_url.py` | ✅ |
+
+## Workers (`navi/workers/`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **Worker base class** | Abstract base for post-response background tasks. Receives `WorkerContext`, may mutate session, return events. | None | `base.py` | ⚠️ |
+| **`CompressionWorker`** | Post-turn compression. Replaces old context with summary, resets token count, appends marker. | `CONTEXT_COMPRESSION_ENABLED`, `CONTEXT_COMPRESSION_THRESHOLD`, `CONTEXT_KEEP_RECENT`, `CONTEXT_SUMMARY_TEMPERATURE`, `CONTEXT_SUMMARY_MAX_TOKENS` | `compressor.py` | ✅ |
+| **Worker auto-discovery** | Scans `navi/workers/*.py` and auto-instantiates non-abstract `Worker` subclasses. | None | `__init__.py` | ❌ |
+
+## KV Store (`navi/store/`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **`KvStore`** | PostgreSQL-backed key-value persistence scoped by `(user_id, session_id, scope, key)`. Auto-creates table/index. | `DATABASE_URL` | `__init__.py` | ✅ |
+
+## WebSocket (`navi/api/websocket.py`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **Streaming protocol** | Full-duplex: client sends `message`, server emits `stream_start`, `thinking_delta`, `stream_delta`, `tool_call`, `stream_end`, etc. | `_HEARTBEAT_INTERVAL=20.0`, `_MAX_REPLAY_EVENTS=500`, `_MAX_IMAGES=10`, `_MAX_IMAGE_BYTES=5MB` | `websocket.py` | ✅ |
+| **Stop session** | `POST /sessions/{id}/stop` sets stop_event cooperatively. | None | `websocket.py` | ✅ |
+| **Reconnect/replay** | On connect, if run active, replays buffered events before live stream. | `_MAX_REPLAY_EVENTS=500` | `websocket.py` | ✅ |
+| **Image upload validation** | Max 10 images, 5MB each. Strips `data:...;base64,` prefix. | `_MAX_IMAGES=10`, `_MAX_IMAGE_BYTES=5242880` | `websocket.py` | ❌ |
+| **Image context annotation** | Appends note telling model that N images are already in multimodal context. | None | `websocket.py` | ❌ |
+| **File context annotation** | Appends `[Uploaded files on disk: ...]` to user content. | None | `websocket.py` | ⚠️ |
+| **Concurrent run guard** | Rejects new messages if `_runs` or `_busy_sessions` already contains session ID. | None | `websocket.py` | ❌ |
+| **Heartbeat keepalive** | Sends `heartbeat` every 20 seconds during idle. | `_HEARTBEAT_INTERVAL=20.0` | `websocket.py` | ✅ |
+| **User ContextVar propagation** | Sets `current_user_id`, `current_user_role`, `current_user_info` from resolved `User` before agent run. | None | `websocket.py` | ❌ |
+| **Recall update forwarding** | Subscribes to `RecallUpdate` events and forwards to open WebSockets for affected session. | None | `websocket.py` | ✅ |
+
+## REST API (`navi/api/routes/`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **List profiles** | `GET /agents/profiles` — filters `is_admin_only` for non-admins. | None | `agents.py` | ✅ |
+| **List system prompts** | `GET /agents/prompts` — returns fully built prompt per profile. | None | `agents.py` | ✅ |
+| **List tools** | `GET /agents/tools` — all registered tools with schemas. | None | `agents.py` | ✅ |
+| **List MCP servers** | `GET /agents/mcp_servers` — servers, groups, instructions, tools. | None | `agents.py` | ✅ |
+| **OAuth login redirect** | PKCE + state, stores platform metadata for Android. | `GNAUTH_CLIENT_ID`, `GNAUTH_REDIRECT_URI` | `auth.py` | ✅ |
+| **OAuth callback** | Exchanges code, fetches user, upserts `navi_users`, encrypts tokens, sets cookie or redirects to mobile bridge. | `GNAUTH_CLIENT_ID`, `GNAUTH_CLIENT_SECRET`, `NAVI_AUTH_ENCRYPTION_KEY` | `auth.py` | ✅ |
+| **Mobile auth bridge** | HTML bridge page with Chrome Intent URL auto-redirect. | None | `auth.py` | ✅ |
+| **Session CRUD** | Create, list (paginated/search/sort), get, pin, delete. | `DATABASE_URL` | `sessions.py` | ✅ |
+| **Session context/planning (debug)** | `GET /sessions/{id}/context` and `/planning` — admin only. | None | `sessions.py` | ✅ |
+| **Session file upload** | Multipart upload with size/type validation. | `SESSION_FILES_MAX_SIZE_MB` | `sessions.py` | ✅ |
+| **Session file download** | Inline or attachment. Path-traversal guarded. | None | `sessions.py` | ✅ |
+| **Generate session name** | Auto-generates display name from user messages via LLM. | None | `sessions.py` | ✅ |
+| **Recall endpoints** | Get, cancel, skip recalls for session. | None | `sessions.py` | ✅ |
+| **Admin session management** | List all sessions, full details, bypass-ownership delete. | None | `admin.py` | ✅ |
+| **Admin user management** | List, detail, role update. | None | `admin.py` | ✅ |
+| **Admin memory view** | All memory facts with pagination/search. Requires `navi.memory.read_all`. | None | `admin.py` | ✅ |
+| **Admin profile toggle** | Toggles `is_admin_only` and persists to `profile_overrides` table. | None | `admin.py` | ✅ |
+| **Admin profile detail/update** | GET/PUT profile config, writes back to disk, updates in-memory registry. | None | `admin.py` | ✅ |
+| **Admin Ollama blacklist clear** | Clears dead-server and dead-model blacklists. | None | `admin.py` | ✅ |
+| **Admin MCP config** | Bulk get/put, per-server CRUD. | None | `admin.py` | ✅ |
+| **Admin MCP reconnect** | Drops old client, unregisters tools, connects fresh, re-registers. | None | `admin.py` | ✅ |
+| **Admin MCP status/test** | Status listing and isolated tool execution. | None | `admin.py` | ✅ |
+| **Admin profile MCP mapping** | GET/PUT `mcp_servers` dict per profile. | None | `admin.py` | ✅ |
+| **Admin recall listing** | All scheduled recalls with pagination/filtering. | None | `admin.py` | ✅ |
+| **Gnexus-auth webhook** | Receives `user.blocked/archived/deleted`, `auth.global_logout`, `session.revoked`, `client.roles_changed`. | None | `webhooks.py` | ✅ |
+
+## Dependency Injection (`navi/api/deps.py`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **Lazy singleton initialization** | `get_session_store()`, `get_memory_store()`, `get_kv_store()`, `get_scheduler()`, `get_registries()`, etc. Module-level caching. | `DATABASE_URL`, `EMBEDDING_OLLAMA_HOST`, `EMBEDDING_OLLAMA_API_KEY` | `deps.py` | ⚠️ |
+| **MCP manager & tool registration** | `get_mcp_manager()` lazily initializes and loads all servers. Clears and re-registers `mcp:*` tools. | None | `deps.py` | ❌ |
+| **Embedding backend wiring** | Dedicated `OllamaBackend` for embeddings (if `EMBEDDING_OLLAMA_HOST` set) or falls back to main chat backend. Injected into memory store. | `EMBEDDING_OLLAMA_HOST`, `EMBEDDING_OLLAMA_API_KEY`, `EMBEDDING_MODEL` | `deps.py` | ⚠️ |
+
+## Scheduler / Recall (`navi/core/scheduler.py`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **Recall scheduling** | Inserts pending recall into `session_recalls`. One pending per session via partial unique index. | None | `scheduler.py` | ✅ |
+| **Recall cancellation** | Marks pending recalls as `cancelled`. | None | `scheduler.py` | ✅ |
+| **Recall skip** | Advances recurring recall by `interval_seconds` using `GREATEST(trigger_at, now)`. | None | `scheduler.py` | ✅ |
+| **Recall listing** | Filter by session_id, user_id, with admin override. | None | `scheduler.py` | ✅ |
+| **Pending recall queries** | `get_pending_recalls(before)`, `get_next_trigger_at()`, `get_pending_session_ids()`. | None | `scheduler.py` | ✅ |
+| **Background loop** | `recall_scheduler_loop()` polls for due recalls, fires up to 3 concurrently (semaphore), sleeps until next trigger. | None | `scheduler.py` | ✅ |
+| **Headless fire** | `_fire_recall()` defers if WebSocket run active, loads session, sets ContextVars, registers headless `_AgentRun`, streams events, handles success/failure/MaxIterationsReached, reschedules recurring, sends `session_sync`. | None | `scheduler.py` | ✅ |
+
+## Main Application (`navi/main.py`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **FastAPI app setup** | Creates app, includes routers, mounts static dirs. | None | `main.py` | ❌ |
+| **CORS middleware** | Allows all origins, credentials, methods, headers. | None | `main.py` | ❌ |
+| **Static file mounting** | `/assets`, `/images`, `/content-viewers`, `/content`, `/debug`, `/debug/eval`, `/admin`. | None | `main.py` | ❌ |
+| **Startup lifecycle** | Ensures auth tables, content store tables, initializes registries, connects MCP, applies profile overrides, checks embedding health, starts file cleanup + recall scheduler. Retries table creation 5× with 2s sleep for Docker races. | None | `main.py` | ❌ |
+| **Shutdown lifecycle** | Closes SSH connections, disconnects MCP servers, cancels scheduler task. | None | `main.py` | ❌ |
+
+## Configuration (`navi/config.py`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **`Settings` model** | Pydantic-settings from `.env`. Covers LLM, embedding, web search, sandboxing, SSH, DB, logging, tools, session files, public URL, Gmail, OAuth, cookies, timeouts, compression, persona. | ~60 env vars | `config.py` | ✅ |
+| **Persona file loader** | `@model_validator` reads `navi_persona_file` into `navi_persona` if inline value empty. | `NAVI_PERSONA_FILE` | `config.py` | ✅ |
+| **Computed path/command lists** | Parses comma-separated strings into lists. | `FS_ALLOWED_PATHS`, `TERMINAL_ALLOWED_COMMANDS`, etc. | `config.py` | ✅ |
+
+## Auth System (`navi/auth/`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **User model** | Pydantic `User` with id, email, profile, role, permissions, `has_permission()` helper. | None | `auth/__init__.py` | ✅ |
+| **Auth DDL & boot-time migrations** | Creates `navi_users`, `user_auth_sessions`, migrates missing columns. | None | `auth/_ddl.py` | ✅ |
+| **`GAuthClient` singleton** | Shared state/PKCE stores, builds per-redirect_uri `GAuthClient`. | `GNAUTH_CLIENT_ID`, `GNAUTH_CLIENT_SECRET` | `auth/client.py` | ✅ |
+| **Token encryption** | Fernet encrypts/decrypts OAuth tokens before DB storage. | `NAVI_AUTH_ENCRYPTION_KEY` | `auth/encrypt.py` | ✅ |
+| **Current user resolution** | Reads cookie, decrypts token, refreshes if expired, fetches from gnexus-auth, upserts `navi_users`, caches in `conn.state.user`. | None | `auth/deps.py` | ✅ |
+| **Permission guards** | `require_user` (401), `require_admin` (403), `require_permission(permission)` (403). | None | `auth/deps.py` | ✅ |
+| **Session access control** | `check_session_access()`: legacy sessions (user_id=None) admin-only; owned sessions accessible to owner/admin/explicit permission. | None | `auth/deps.py` | ✅ |
+
+## Profiles (`navi/profiles/`)
+
+| Mechanic | Description | Config / Flags | Files | Docs |
+|---|---|---|---|---|
+| **`AgentProfile` model** | Full agent config: identity, LLM settings, tools, thinking mechanics, planning flags, subagent config, visibility flags. | All fields in `config.json` | `profiles/base.py` | ✅ |
+| **Profile loader** | Auto-discovers subdirs with `config.json` + `system_prompt.txt`. Validates keys, loads optional `subagent_system_prompt.txt`, migrates `planning_reflect_enabled` → `planning_phase2_enabled`.
| None | `profiles/loader.py` | ✅ | +| **Profile saver** | Writes `config.json`, `system_prompt.txt`, optional `subagent_system_prompt.txt`. | None | `profiles/loader.py` | ✅ | +| **Runtime profile overrides** | `profile_overrides` table persists `is_admin_only` changes across restarts. Loaded at startup. | None | `profiles/_overrides.py` | ❌ | + +--- + +## Undocumented Mechanics Summary + +These mechanisms have **no documentation** in `docs/`: + +1. **Streaming guard wrapper** (`agent.py`) — prefill polling + hard timeouts +2. **Subagent thinking stall detector** (`agent.py`) — aborts subagent after 60s/12k chars of pure thinking +3. **Memory facts deduplication** (`agent.py`) — `_injected_fact_ids` per turn +4. **Mid-turn context compression** (`agent.py`) — compression during autonomous loops with doubled keep-recent +5. **Context injection collection (parallel)** (`agent.py`) — concurrent context providers + memory +6. **Security policy message** (`context_builder.py`) — dynamic sandbox/allowlist message +7. **User context message** (`context_builder.py`) — injects user profile data +8. **MCP context message** (`context_builder.py`) — combines handshake + overlay instructions +9. **Memory facts message (length limits)** (`context_builder.py`) — skip short messages, limit results +10. **System prompt caching** (`context_builder.py`) — per-profile cache +11. **Goal anchor builder** (`context_builder.py`) — constructs goal anchor message +12. **Session context injection** (`context_builder.py`) — injects session files path +13. **Turn-based partitioning** (`compressor.py`) — turn grouping for compression +14. **Mid-turn fallback partitioning** (`compressor.py`) — same-turn compression for long loops +15. **Summary input formatter** (`compressor.py`) — message rendering for summarizer +16. **Summary input truncate** (`compressor.py`) — 24k char cap +17. **Tool name resolution (MCP aliases)** (`tool_executor.py`) — 3 heuristic fallbacks +18. 
**Tool middleware hooks** (`tool_executor.py`) — before/after execute +19. **Image message generation** (`tool_executor.py`) — synthesizes vision message from tool result +20. **Streaming tool execution** (`tool_executor.py`) — yields ToolEvent alongside messages +21. **AIHelper single-turn wrapper** (`ai_helper.py`) +22. **AIHelper `ask()` timeout** (`ai_helper.py`) — 120s hard limit +23. **AIHelper `ask_json()`** (`ai_helper.py`) +24. **AIHelper token usage emission** (`ai_helper.py`) +25. **AIHelper JSON extractor** (`ai_helper.py`) +26. **Orchestrator stub** (`orchestrator.py`) +27. **EventBus pub/sub** (`event_bus.py`) +28. **Image upload validation** (`websocket.py`) — 10 images, 5MB each +29. **Image context annotation** (`websocket.py`) — note about inline images +30. **File context annotation** (`websocket.py`) — uploaded files list +31. **Concurrent run guard** (`websocket.py`) +32. **User ContextVar propagation** (`websocket.py`) +33. **MCP manager & tool registration** (`deps.py`) — lazy init + clear/re-register +34. **Worker auto-discovery** (`workers/__init__.py`) +35. **`ToolMiddleware` ABC** (`tools/_internal/middleware.py`) +36. **`LoggingMiddleware`** (`tools/_internal/logging_middleware.py`) +37. **Profile overrides persistence** (`profiles/_overrides.py`) — DB table for admin toggles +38. **Plan step parser** (`planning.py`) +39. **Planning debug data logging** (`planning.py`) +40. **Knowledge store rules** (`planning.py`) +41. **Context transfer priming** (`agent.py` subagent) +42. **ContextVar restoration** (`agent.py` subagent) +43. **FastAPI CORS** (`main.py`) +44. **Static file mounting** (`main.py`) +45. **Startup lifecycle** (`main.py`) — table creation retries, MCP connect, override apply +46. **Shutdown lifecycle** (`main.py`) +47. **Local token estimation** (`agent.py`) +48. **Recall message wrapping** (`agent.py`) +49. **Cross-profile awareness** (`context_builder.py`) +50. 
**MCP context message** (`context_builder.py`) + +--- + +## Cross-Reference Index + +> "I want to build X — which existing mechanics might help?" + +| Desired Feature | Reusable Mechanic(s) | +|---|---| +| **Rate-limit tool calls** | `ToolMiddleware` hooks (`before_execute`/`after_execute`) | +| **Log tool metrics** | `LoggingMiddleware` (already exists) | +| **Inject dynamic context per session** | Context providers (`ContextProviderRegistry`) | +| **Inject dynamic context per turn** | `ContextBuilder._collect_context_injections` + context providers | +| **Sandbox file access for users** | `FS_ALLOWED_PATHS` + `_check_path()` in `FilesystemTool` | +| **Sandbox shell commands for users** | `TERMINAL_ALLOWED_COMMANDS` + dangerous pattern blocklist | +| **Track task progress across turns** | `TodoTool` + planning auto-populate | +| **Detect model looping** | Anti-stall detector + todo snapshot comparison | +| **Auto-replan on failures** | Adaptive replan + todo failed-steps tracking | +| **Compress long conversations** | `CompressionWorker` + `compress_context()` | +| **Summarize old turns** | `compress_context()` with LLM-based summarization | +| **Schedule future work** | `RecallScheduler` + `schedule_recall` / `manage_recall` tools | +| **Run headless agent** | `Agent.run_stream()` with `is_recall=True` | +| **Stream events to client** | `_AgentRun` queue + WebSocket protocol | +| **Delegate to specialist agent** | `SpawnAgentTool` + `run_ephemeral()` | +| **Switch agent personality mid-chat** | `SwitchProfileTool` + profile reload mid-session | +| **Store per-session state across restarts** | `KvStore` (`session_store` table) | +| **Persist user facts across sessions** | `MemoryStore` + `FactMixin` + extractor | +| **Search memory semantically** | `EmbeddingMixin` + pgvector cosine distance | +| **Hot-reload without restart** | `ReloadToolsTool` + `ToolRegistry.reload_user_tools()` | +| **Multi-server LLM fallback** | `FallbackOllamaBackend` + blacklist clearing 
| +| **Auth with external OAuth** | `GAuthClient` + `_resolve_user()` + token refresh | +| **Permission-based access control** | `require_permission()` + `check_session_access()` | +| **Auto-generate tool docs** | `ToolManualTool` (manuals/ → auto-generate fallback) | +| **Add new capability without code** | `CreateMcpServerTool` + MCP proxy tools | +| **Monitor external services** | `McpStatusTool` + `McpManager.get_all_tools()` | +| **Run isolated Python code** | `CodeExecTool` with sandbox | +| **Transfer files to/from remote** | `SshExecTool` with connection pooling | +| **Process images for LLM** | `ImageViewTool` preprocessing pipeline | +| **Validate planning assumptions** | `ReflectTool` (Critic/Pragmatist/Detailer) | +| **Control planning depth** | Planning flags: `planning_phase1/2/3_enabled`, `planning_mandatory` | +| **Prevent model drift** | Goal anchoring + iteration budget injection | +| **Stop long-running generation** | Cooperative stop via `current_stop_event` | +| **Handle model prefill hangs** | Streaming guard wrapper | +| **Handle model streaming stalls** | Chunk timeout in streaming guard | +| **Abort stuck subagents** | Subagent thinking stall detector | +| **Cache expensive prompts** | System prompt caching in `ContextBuilder` | +| **Parallelize independent LLM calls** | `asyncio.gather` pattern (used in context injection, reflect tool) | +| **Handle JSON from unreliable LLM** | `AIHelper.ask_json()` + bracket-matching extractor | +| **Track token usage** | `AIHelperTokensUsed` event emission | +| **Replay events on reconnect** | WebSocket replay mechanism | +| **Keep connection alive** | WebSocket heartbeat | +| **Prevent concurrent runs** | Concurrent run guard | +| **Handle mobile auth** | Mobile auth bridge with Chrome Intent | +| **Persist admin toggles** | `profile_overrides` table | +| **Auto-apply schema migrations** | DDL builder + boot-time retries | +| **Backfill missing data** | `backfill_embeddings.py` pattern | +| **Clean up 
old session files** | Startup file cleanup loop | + +--- + +*Generated 2026-05-16. To update after adding a new mechanic: add a row to the appropriate section and update the cross-reference index if relevant.*
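
The cross-reference index pairs `AIHelper.ask_json()` with a "bracket-matching extractor" for handling JSON from an unreliable LLM, but the catalog does not show how that extractor works. As a rough illustration of the general technique only, the sketch below scans a reply for the first balanced `{...}` or `[...]` span that parses as JSON, ignoring brackets inside string literals. The function name `extract_first_json` and its behavior are assumptions made for this example; they are not taken from `ai_helper.py`, whose real implementation may differ.

```python
import json
from typing import Any, Optional


def extract_first_json(text: str) -> Optional[Any]:
    """Return the first balanced JSON object/array embedded in text, or None.

    Hypothetical helper for illustration; not Navi's actual extractor.
    """
    closers = {"{": "}", "[": "]"}
    for start, ch in enumerate(text):
        if ch not in closers:
            continue  # only an opening bracket can begin a candidate
        stack = []          # closing brackets we still expect, innermost last
        in_string = False   # True while inside a JSON string literal
        escaped = False     # True right after a backslash inside a string
        for pos in range(start, len(text)):
            c = text[pos]
            if in_string:
                if escaped:
                    escaped = False
                elif c == "\\":
                    escaped = True
                elif c == '"':
                    in_string = False
                continue  # brackets inside strings do not count
            if c == '"':
                in_string = True
            elif c in closers:
                stack.append(closers[c])
            elif c in "}]":
                if not stack or stack.pop() != c:
                    break  # mismatched closer: abandon this start position
                if not stack:
                    try:
                        return json.loads(text[start:pos + 1])
                    except json.JSONDecodeError:
                        break  # balanced but invalid JSON: try a later start
    return None


reply = 'Sure! Here is the plan: {"steps": ["a", "b}"], "ok": true} Hope it helps.'
plan = extract_first_json(reply)
```

Here `plan` holds the parsed dictionary even though the reply wraps it in chatter and one string value contains a stray `}`. The scan restarts from every opening bracket, so it is quadratic in the worst case, which is usually acceptable for single LLM replies.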