diff --git a/docs/archive/visual.html b/docs/archive/visual.html
new file mode 100644
index 0000000..4797cee
--- /dev/null
+++ b/docs/archive/visual.html
@@ -0,0 +1,1362 @@
+
+
+
+
+
+
+
+ 🧭 Project Overview
+ Navi is a personal modular AI agent system. FastAPI backend + vanilla JS client. The agent is named Navi — female personal assistant. Runs locally via Ollama.
+
+
+
+
Entry point
+
navi/main.py
+
FastAPI app
+
+
+
Run command
+
uvicorn navi.main:app
+
--reload --port 8000
+
+
+
Default model
+
gemma4:31b-cloud
+
Ollama, 2B active params
+
+
+
Context window
+
65 536 tokens
+
OLLAMA_NUM_CTX
+
+
+
Database
+
SQLite
+
navi.db via aiosqlite
+
+
+
Thinking
+
Enabled
+
OLLAMA_THINK=true
+
+
+
+
+
+
+ 📦 Stack
+
+
+ | Layer | Technology | Notes |
+ | Web framework | FastAPI + uvicorn | ASGI, async throughout |
+ | LLM backend (primary) | Ollama | Local, OllamaBackend in navi/llm/ollama.py |
+ | LLM backend (alt) | OpenAI-compatible | navi/llm/openai_backend.py |
+ | Database | aiosqlite | Sessions + memory facts in navi.db |
+ | Config | pydantic-settings | Reads .env, typed Settings object |
+ | Logging | structlog | Structured JSON-friendly logs |
+ | Client | Vanilla JS ES modules | marked.js + highlight.js via esm.sh CDN |
+ | Markdown rendering | marked.js | In browser, assistant messages |
+
+
+
+
+
+
+ 🗂️ Component Map
+
+
+
+
+
Client (browser)
+
+ WebSocket /ws/sessions/{id}
+ REST /sessions/*
+ REST /agents/*
+
+
+
+
↓
+
+
+
FastAPI — navi/main.py
+
+ api/websocket.py · _AgentRun · stop endpoint
+ routes/sessions.py
+ routes/agents.py
+ routes/messages.py
+
+
+
+
↓
+
+
+
Agent — navi/core/agent.py
+
+ run_stream() → AsyncGenerator[AgentEvent]
+ run() → str
+ run_ephemeral() → str (subagent)
+ _run_planning()
+ _run_workers()
+
+
+
+
↓
+
+
+
Registries — navi/core/registry.py · build_default_registries()
+
+ ToolRegistry
+ ProfileRegistry
+ BackendRegistry
+
+
+
+
↓
+
+
+
+
LLM Backend
+
+ OllamaBackend
+ complete()
+ stream_complete()
+
+
+
+
SessionStore (SQLite)
+
+ messages[]
+ context[]
+
+
+
+
MemoryStore (SQLite)
+
+ memory_facts
+ summary
+
+
+
+
+
+
+
+
+
+ 🔄 Request Lifecycle
+ Streaming flow from WebSocket message to final response.
+
+
+
1
+
+ Client sends message
+ {type:"message", content:"...", images:[...]} over WebSocket
+
+
+
+
2
+
+ websocket_session() creates _AgentRun
+ Subscribes a queue, launches _run_agent() as asyncio task, sends stream_start
+
+
+
+
3
+
+ Pre-turn compression check
+ If context_token_count ≥ num_ctx × threshold → compress context before LLM call
+
+
+
+
4
+
+ Planning phase
+ If profile.planning_enabled: fast non-streaming LLM call → yields plan_ready event if plan generated
+
+
+
+
5
+
+ Tool-calling loop (max_iterations)
+ Calls llm.stream_complete() → yields thinking/text/tool events. Loops until finish_reason=stop
+
+
+
+
6
+
+ StreamEnd + workers
+ Saves session to DB. Runs post-turn workers (compression). Yields context_compressed if triggered
+
+
+
+
✓
+
+ Done
+ Events broadcast from _AgentRun to all subscriber queues → sent as JSON to WebSocket
+
+
+
+
+
+
+
+ 🔗 Context Vars
+ Thread-safe async-safe state shared between Agent and tools. Defined in navi/tools/base.py.
+
+
+ | ContextVar | Type | Set by | Used by |
+
+ | current_session_id |
+ str | None |
+ Agent before each run |
+ SSH pool, scratchpad, todo — per-session state |
+
+
+ | current_event_sink |
+ Queue | None |
+ run_stream() per tool task |
+ run_ephemeral() forwards sub-agent events to parent stream |
+
+
+ | current_stop_event |
+ Event | None |
+ _run_agent() before run_stream() |
+ Agent loop checks before each LLM call and mid-stream |
+
+
+
+
+ Never use task.cancel() for stopping generation. It corrupts Starlette's WebSocket receive state. Use current_stop_event.set() via POST /sessions/{id}/stop.
+
+
+
+
+
+ ⚙️ Agent Loop
+ Three entry points in navi/core/agent.py:
+
+
+ | Method | Returns | Persistence | Planning |
+
+ run(session_id, msg) |
+ str |
+ SQLite session |
+ No |
+
+
+ run_stream(session_id, msg) |
+ AsyncGenerator[AgentEvent] |
+ SQLite session |
+ Yes (if profile.planning_enabled) |
+
+
+ run_ephemeral(msg, profile_id) |
+ str |
+ In-memory only |
+ No |
+
+
+
+
+ System prompt construction
+ Built fresh on every LLM call — never stored in session.context.
+ NAVI_PERSONA (global personality)
+───────────────────────────────────────
+profile.system_prompt (domain rules)
+───────────────────────────────────────
+[memory injection: "## What I remember about the user"]
+───────────────────────────────────────
+session.context messages (history, no system msgs)
+
+ Sub-agent isolation
+ run_ephemeral() sets current_session_id = "subagent_<uuid12>" so each subagent has its own isolated scratchpad and SSH connection pool entry.
+
+
+
+
+ 🗺️ Planning Phase
+ Runs before the tool-calling loop when profile.planning_enabled = true.
+
+
+
+
1
+
+ LLM call: decide or plan
+ Fast non-streaming call: think=False, temperature=0.3, no tools
+
+
+
+
2
+
+ Response classification
+ Starts with DIRECT → skip planning. No numbered steps found → skip. Otherwise → real plan.
+
+
+
+
3
+
+ Plan injection
+ Appended to session.context as assistant message — model continues from it naturally
+
+
+
+
4
+
+ PlanReady event emitted
+ Rendered as collapsible 🗺️ card in UI before execution begins
+
+
+
+
+
+
+
+ 💾 Sessions
+
+ Session model (navi/core/session.py)
+
+
+ | Field | Type | Description |
+ id | UUID str | Unique session identifier |
+ profile_id | str | Active profile |
+ messages | list[Message] | Full history Never compressed. Used for UI display. |
+ context | list[Message] | LLM context May be replaced by compression summary. |
+ context_token_count | int | Accumulated tokens; reset to 0 after compression |
+ pinned | bool | Pinned sessions appear first in sidebar |
+
+
+
+ Dual-buffer design
+
+ Key invariant: session.messages is the full, unmodified conversation history — always available for display. session.context is what the LLM actually sees — may contain a compression summary instead of old messages.
+
+
+ Message format
+
+
+ | Field | Present on | Type |
+ role | all | user | assistant | tool | system |
+ content | most | str | None |
+ images | user, assistant | list[str] — base64 |
+ tool_calls | assistant (when calling tools) | list[ToolCallRequest] |
+ tool_call_id | tool results | str |
+ name | tool results | tool name |
+ is_summary | compressed blocks | bool |
+ created_at | user/assistant | ISO 8601 datetime |
+
+
+
+
+
+
+ 🗜️ Context Compression
+ Keeps the LLM context within the token budget. Only session.context is modified — session.messages is never touched.
+
+ Trigger points
+
+
+
Pre-turn
+
Before LLM call in run_stream()
+
Checks context_token_count against threshold
+
+
+
Post-turn (worker)
+
After StreamEnd via CompressionWorker
+
Re-checks and compresses if still needed
+
+
+
+ Algorithm
+
+
+
1
+
+ Partition into turns
+ Keep last context_keep_recent turns verbatim. Tool call groups never split.
+
+
+
+
2
+
+ Format old turns as text
+ Tool args truncated to 120 chars, results to 300 chars. Total input capped at 12 000 chars.
+
+
+
+
3
+
+ Summarize with LLM
+ think=False, bullet-point output. Same model — no model swap or extra loading.
+
+
+
+
4
+
+ Replace with summary message
+ role=user, is_summary=True. Result: system_msgs + [summary] + recent_turns
+
+
+
+
+ Config
+
+
+ | Setting | Default | Description |
+ CONTEXT_COMPRESSION_ENABLED | true | Enable/disable |
+ CONTEXT_COMPRESSION_THRESHOLD | 0.80 | Trigger at 80% of context window |
+ CONTEXT_KEEP_RECENT | 10 | Turns kept verbatim |
+ CONTEXT_SUMMARY_TEMPERATURE | 0.3 | Summarization temperature |
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ 📡 WebSocket Protocol
+
+ Endpoint: ws://host/ws/sessions/{session_id}
+ Closes with code 4004 if session not found.
+
+ Client → Server
+ {
+ "type": "message", // required, always "message"
+ "content": "user text", // required, non-empty
+ "images": ["base64..."], // optional; data: URI prefix stripped server-side
+ "files": [ // optional; from POST /sessions/{id}/files
+ {"name": "file.pdf", "path": "/abs/path/..."}
+ ]
+}
+
+
+
+
+ 📬 Events Reference
+
+
+ | Type | Direction | Fields | Description |
+
+ | stream_start |
+ S→C | — |
+ Agent processing began. Block user input. |
+
+
+ | thinking_delta |
+ S→C | delta |
+ Reasoning chunk (streaming). Accumulate until thinking_end. |
+
+
+ | thinking_end |
+ S→C | — |
+ Reasoning phase complete. Auto-collapsed in UI. |
+
+
+ | turn_thinking |
+ S→C | thinking, is_subagent |
+ Full reasoning block from tool-calling turn (non-streaming). |
+
+
+ | plan_ready |
+ S→C | plan |
+ Step-by-step plan before execution. Rendered as 🗺️ card. |
+
+
+ | tool_started |
+ S→C | tool, args, is_subagent |
+ Tool call began. Shows pending spinner in UI immediately. |
+
+
+ | tool_call |
+ S→C | tool, args, result, success, is_subagent |
+ Tool finished. Pairs with preceding tool_started. |
+
+
+ | stream_delta |
+ S→C | delta |
+ Final response text chunk. Accumulate to build full content. |
+
+
+ | stream_end |
+ S→C | content, context_tokens, max_context_tokens |
+ Final response complete. Unlock user input. |
+
+
+ | stream_stopped |
+ S→C | — |
+ User stopped generation via POST /sessions/{id}/stop. |
+
+
+ | context_compressed |
+ S→C | messages_before, messages_after |
+ Context compression ran after this turn. |
+
+
+ | profile_switched |
+ S→C | profile_id, profile_name |
+ Active profile changed mid-stream by switch_profile tool. |
+
+
+ | error |
+ S→C | message |
+ Unhandled error. Some are recoverable, some terminate the stream. |
+
+
+
+
+
+
+
+ 🎬 Typical Event Sequences
+
+ Simple question (no tools)
+
+
stream_start
+
thinking_delta × N // if model reasons
+
thinking_end
+
stream_delta × N
+
stream_end
+
+
+ With planning + tools
+
+
stream_start
+
plan_ready // if planning_enabled
+
turn_thinking // reasoning before tool selection
+
tool_started
+
tool_call
+
tool_started
+
tool_call
+
thinking_delta × N
+
thinking_end
+
stream_delta × N
+
stream_end
+
context_compressed // optional, if threshold hit
+
+
+ Subagent (spawn_agent)
+
+
stream_start
+
tool_started spawn_agent is_subagent=false
+
turn_thinking is_subagent=true
+
tool_started mcp__navi_web__web_search is_subagent=true
+
tool_call mcp__navi_web__web_search is_subagent=true
+
tool_started filesystem is_subagent=true
+
tool_call filesystem is_subagent=true
+
tool_call spawn_agent is_subagent=false
+
stream_delta × N
+
stream_end
+
+
+ Profile switch
+
+
stream_start
+
tool_started switch_profile
+
profile_switched // update UI here
+
tool_call switch_profile
+
stream_delta × N
+
stream_end
+
+
+
+
+
+ 🌐 REST API
+
+
+ | Method | Path | Description |
+
+ | GET |
+ /health |
+ Health check → {"status":"ok"} |
+
+
+ | GET |
+ /agents/profiles |
+ List all available profiles |
+
+
+ | GET |
+ /agents/tools |
+ List all registered tools (builtin + user) |
+
+
+ | POST |
+ /sessions |
+ Create session → {session_id, profile_id, created_at} |
+
+
+ | GET |
+ /sessions |
+ List all sessions (sorted by pinned+last_active) |
+
+
+ | GET |
+ /sessions/{id} |
+ Full session with message history (display buffer) |
+
+
+ | GET |
+ /sessions/{id}/context |
+ LLM context (may differ from messages — for debugging) |
+
+
+ | PATCH |
+ /sessions/{id}/pin |
+ Pin or unpin a session |
+
+
+ | DEL |
+ /sessions/{id} |
+ Delete session and its uploaded files |
+
+
+ | POST |
+ /sessions/{id}/files |
+ Upload file (multipart/form-data). Max 200 MB. TTL 24h. |
+
+
+ | POST |
+ /sessions/{id}/messages |
+ Send message, wait for full response (non-streaming) |
+
+
+ | POST |
+ /sessions/{id}/stop |
+ Signal cooperative stop for running agent |
+
+
+ | WS |
+ /ws/sessions/{id} |
+ Streaming agent interface |
+
+
+
+
+
+
+
+ 👤 Profiles
+ Profiles define tools, system prompt, model, and behaviour per domain. Defined in navi/profiles/.
+
+
+
+ | Profile ID | Name | Model | Temp | Planning |
+
+ secretary | Personal Secretary |
+ gemma4:31b-cloud |
+ 0.7 |
+ Yes |
+
+
+ server_admin | Server Administrator |
+ gemma4:31b-cloud |
+ 0.2 |
+ Yes |
+
+
+ smart_home | Smart Home Assistant |
+ gemma4:31b-cloud |
+ 0.3 |
+ Yes |
+
+
+
+
+ Per-profile scratchpad sections
+
+
+ | Profile | Sections | Domain focus |
+ secretary | findings, sources, drafts | Research, writing, analysis |
+ server_admin | status, logs, errors, plan | Remote ops, monitoring |
+ smart_home | state, config, errors | Home Assistant, IoT, automations |
+
+
+
+ AgentProfile fields
+
+
+ | Field | Type | Description |
+ id | str | Unique identifier used in API and sessions |
+ name | str | Human-readable name for UI |
+ system_prompt | str | Domain-specific instructions (appended after persona) |
+ enabled_tools | list[str] | Tool names available to this profile |
+ model | str | Ollama model override (falls back to settings default) |
+ temperature | float | LLM temperature |
+ max_iterations | int | Tool-calling loop limit (default 50) |
+ planning_enabled | bool | Run planning phase before tool loop |
+ llm_backend | str | Backend key in BackendRegistry (default "ollama") |
+
+
+
+
+
+
+ 🧠 Memory System
+ Long-term user memory: facts extracted from conversations, stored in SQLite, injected into every session.
+
+ Database schema
+
+
+ | Table | Key columns | Purpose |
+
+ memory_facts |
+ (category, key) unique |
+ Individual facts about the user — preferences, projects, environment |
+
+
+ memory_summary |
+ Single row (id=1) |
+ Narrative summary generated from all facts; injected into every session |
+
+
+ session_memory_state |
+ session_id, extracted_at |
+ Tracks which sessions have been processed for extraction |
+
+
+
+
+ Automatic extraction trigger
+ POST /sessions (create new session) fires _process_stale_sessions() as a background task. Processes sessions idle > 30 minutes that haven't been extracted yet.
+
+ Memory injection
+ On every run_stream() / run() call, _memory_msg() fetches the summary and returns a system message: "## What I remember about the user\n\n{summary}". Injected after main system prompt, before conversation history.
+
+ Memory tools usage rules
+
+ Call memory_search when the user mentions something personal or before making assumptions about their environment. Do not call at session start reflexively — only when context warrants it. Call memory_forget only when explicitly asked.
+
+
+
+
+
+ ⚙️ Configuration
+ All settings read from .env via pydantic-settings. Imported as from navi.config import settings.
+
+ LLM
+
+
+ | Variable | Default | Description |
+ OLLAMA_HOST | http://localhost:11434 | Ollama server URL |
+ OLLAMA_DEFAULT_MODEL | gemma4:31b-cloud | Default model (overridable per profile) |
+ OLLAMA_NUM_CTX | 65536 | Context window size in tokens |
+ OLLAMA_THINK | true | Enable extended reasoning |
+
+
+
+ Security / Sandboxing
+
+
+ | Variable | Default | Description |
+ FS_ALLOWED_PATHS | * | Comma-separated paths filesystem tool can access. * = no limit |
+ TERMINAL_ALLOWED_COMMANDS | * | Comma-separated allowed executables. * = allow all |
+ SSH_HOSTS_FILE | ssh_hosts.json | Named SSH connections config |
+
+
+
+ Persona
+
+
+ | Variable | Description |
+ NAVI_PERSONA | Inline global personality prompt |
+ NAVI_PERSONA_FILE | Path to .txt file with persona (recommended — inline doesn't parse multiline well) |
+
+
+
+ Other
+
+
+ | Variable | Default | Description |
+ DB_PATH | navi.db | SQLite file path |
+ LOG_LEVEL | INFO | DEBUG / INFO / WARNING / ERROR |
+ TOOLS_DIR | tools | User tools directory |
+ SESSION_FILES_DIR | session_files | Uploaded files directory |
+ SESSION_FILES_MAX_SIZE_MB | 200 | Max upload size per file |
+ SESSION_FILES_TTL_HOURS | 24 | File retention hours |
+
+
+
+
+
+
+
+
+
diff --git a/docs/archive/visual.html.readme.md b/docs/archive/visual.html.readme.md
new file mode 100644
index 0000000..74d0d11
--- /dev/null
+++ b/docs/archive/visual.html.readme.md
@@ -0,0 +1,15 @@
+# docs/visual.html — Archived
+
+This file was an early interactive architecture reference page. It is now stale:
+
+- References old tool names (`mcp__navi_web__*`, `memory_search`, `memory_forget`, `write_tool` as a built-in).
+- Describes SQLite as primary database; Navi uses PostgreSQL.
+- omits newer tools (`create_mcp_server`, `test_mcp_tool`, `mcp_status`, `schedule_recall`, `manage_recall`, `content_publish`, `share_file`).
+- Profile/tool configuration changed to `tools.agent` / `tools.subagent`.
+
+Use the markdown docs in `docs/` instead:
+- `docs/index.md` — project overview and stack
+- `docs/architecture.md` — component diagram and data flow
+- `docs/tools.md` — built-in and MCP tools
+- `docs/profiles.md` — profile configuration
+- `docs/api.md` — REST + WebSocket reference
diff --git a/docs/mechanics.md b/docs/mechanics.md
index a38f75b..273aa33 100644
--- a/docs/mechanics.md
+++ b/docs/mechanics.md
@@ -41,8 +41,8 @@
| **Todo progress message injection** | Injects compact system reminder with current todo state and discipline notes at start of every iteration. | None | `agent.py` | ❌ |
| **Memory facts deduplication** | Tracks `_injected_fact_ids` across a single `run_stream` call so the same memory fact is not injected twice in one turn. | None | `agent.py` | ❌ |
| **Context injection collection (parallel)** | Fires `_collect_context_injections` and `_memory_facts_msg` concurrently before each turn. | `profile.context_providers` | `agent.py` | ❌ |
-| **MCP server group expansion** | Resolves `profile.mcp_servers`: `*` expands to all tools for that server; named groups resolve via `mcp_manager.resolve_group`. | `profile.mcp_servers` | `agent.py` | ⚠️ |
-| **User-enabled tools merge** | Loads extra tool names from `tools/enabled.json` and appends to profile's `enabled_tools`. | `settings.tools_dir` | `agent.py` | ⚠️ |
+| **MCP server group expansion** | Resolves `profile.tools.agent.mcp`: `*` expands to all tools for that server; named groups resolve via `mcp_manager.resolve_group`. | `profile.tools.agent.mcp` | `agent.py` | ⚠️ |
+| **User-enabled tools merge** | Loads extra tool names from `tools/enabled.json` and appends to profile's `tools.agent.native`. | `settings.tools_dir` | `agent.py` | ⚠️ |
| **Recall message wrapping** | When `is_recall=True`, prefixes user message with `[Scheduled recall — execute this task]\n\n`. | None | `agent.py` | ❌ |
| **Per-tool-call event sink** | Creates `asyncio.Queue` for each tool call so subagents can emit events back to parent in real time. | None | `agent.py` | ✅ |
| **Display vs context message splitting** | Accepts separate `display_message` (shown in UI) and `user_message` (sent to LLM, may contain injected hints). | None | `agent.py` | ✅ |
@@ -58,7 +58,7 @@
| **Wall-clock timeout** | Monitors elapsed time; aborts and returns `[Sub-agent timed out]` if exceeded. | `timeout_seconds` param (default 300.0) | `agent.py` | ✅ |
| **Subagent planning phase** | Optionally runs full 3-phase planning before tool loop for subagents. | `profile.subagent_planning_enabled` | `agent.py` | ✅ |
| **Parent session ID passthrough** | Sets session ContextVar to parent's ID so session-aware tools resolve paths correctly. | `parent_session_id` param | `agent.py` | ✅ |
-| **Dedicated subagent tool list** | Uses `profile.subagent_tools` if non-empty; falls back to `profile.enabled_tools`. | `profile.subagent_tools` | `agent.py` | ✅ |
+| **Dedicated subagent tool list** | Uses `profile.tools.subagent` if non-empty; falls back to `profile.tools.agent`. | `profile.tools.subagent` | `agent.py` | ✅ |
| **ContextVar restoration** | Saves/restores `current_session_id`, `current_model`, `current_user_id`, `current_user_role`, `current_user_info` in `finally` block. | None | `agent.py` | ✅ |
## Planning Pipeline (`navi/core/planning.py`)
@@ -87,7 +87,7 @@
| **Goal anchor builder** | Constructs `[Goal anchor]` system message with original request + todo lines. | None | `context_builder.py` | ❌ |
| **Security policy message** | Injects `[Security policy]` based on `current_user_role`: admin = full access; user = sandbox + terminal allowlist. | `TERMINAL_ALLOWED_COMMANDS` | `context_builder.py` | ❌ |
| **User context message** | Builds `[User context]` from `current_user_info` (display_name, email, locale, etc.). | None | `context_builder.py` | ❌ |
-| **MCP context message** | Combines MCP server instructions from handshake with overlay instructions from `mcp_servers.d/*.json`. | `profile.mcp_servers` | `context_builder.py` | ❌ |
+| **MCP context message** | Combines MCP server instructions from handshake with overlay instructions from `mcp_servers.d/*.json`. | `profile.tools.agent.mcp` | `context_builder.py` | ❌ |
| **Iteration budget message** | Appends `[Iteration N/M — K after this one]` with escalating urgency when ≤2 or ≤5 remaining. | `profile.iteration_budget_enabled` | `context_builder.py` | ✅ |
| **Session context injection** | Appends session ID and exact `session_files_dir/{session_id}/` path. | `SESSION_FILES_DIR` | `context_builder.py` | ❌ |
@@ -339,7 +339,7 @@
| Mechanic | Description | Config / Flags | Files | Docs |
|---|---|---|---|---|
| **`AgentProfile` model** | Full agent config: identity, LLM settings, tools, thinking mechanics, planning flags, subagent config, visibility flags. | All fields in `config.json` | `profiles/base.py` | ✅ |
-| **Profile loader** | Auto-discovers subdirs with `config.json` + `system_prompt.txt`. Validates keys, loads optional `subagent_system_prompt.txt`, migrates `planning_reflect_enabled` → `planning_phase2_enabled`. | None | `profiles/loader.py` | ✅ |
+| **Profile loader** | Auto-discovers subdirs with `config.json` + `system_prompt.txt`. Validates keys, loads optional `subagent_system_prompt.txt`, migrates legacy `enabled_tools`/`subagent_tools`/`mcp_servers` into `tools.agent`/`tools.subagent`. | None | `profiles/loader.py` | ✅ |
| **Profile saver** | Writes `config.json`, `system_prompt.txt`, optional `subagent_system_prompt.txt`. | None | `profiles/loader.py` | ✅ |
| **Runtime profile overrides** | `profile_overrides` table persists `is_admin_only` changes across restarts. Loaded at startup. | None | `profiles/_overrides.py` | ❌ |
diff --git a/docs/visual.html b/docs/visual.html
deleted file mode 100644
index 4797cee..0000000
--- a/docs/visual.html
+++ /dev/null
@@ -1,1362 +0,0 @@
-
-
-
-
-
-
-
- 🧭 Project Overview
- Navi is a personal modular AI agent system. FastAPI backend + vanilla JS client. The agent is named Navi — female personal assistant. Runs locally via Ollama.
-
-
-
-
Entry point
-
navi/main.py
-
FastAPI app
-
-
-
Run command
-
uvicorn navi.main:app
-
--reload --port 8000
-
-
-
Default model
-
gemma4:31b-cloud
-
Ollama, 2B active params
-
-
-
Context window
-
65 536 tokens
-
OLLAMA_NUM_CTX
-
-
-
Database
-
SQLite
-
navi.db via aiosqlite
-
-
-
Thinking
-
Enabled
-
OLLAMA_THINK=true
-
-
-
-
-
-
- 📦 Stack
-
-
- | Layer | Technology | Notes |
- | Web framework | FastAPI + uvicorn | ASGI, async throughout |
- | LLM backend (primary) | Ollama | Local, OllamaBackend in navi/llm/ollama.py |
- | LLM backend (alt) | OpenAI-compatible | navi/llm/openai_backend.py |
- | Database | aiosqlite | Sessions + memory facts in navi.db |
- | Config | pydantic-settings | Reads .env, typed Settings object |
- | Logging | structlog | Structured JSON-friendly logs |
- | Client | Vanilla JS ES modules | marked.js + highlight.js via esm.sh CDN |
- | Markdown rendering | marked.js | In browser, assistant messages |
-
-
-
-
-
-
- 🗂️ Component Map
-
-
-
-
-
Client (browser)
-
- WebSocket /ws/sessions/{id}
- REST /sessions/*
- REST /agents/*
-
-
-
-
↓
-
-
-
FastAPI — navi/main.py
-
- api/websocket.py · _AgentRun · stop endpoint
- routes/sessions.py
- routes/agents.py
- routes/messages.py
-
-
-
-
↓
-
-
-
Agent — navi/core/agent.py
-
- run_stream() → AsyncGenerator[AgentEvent]
- run() → str
- run_ephemeral() → str (subagent)
- _run_planning()
- _run_workers()
-
-
-
-
↓
-
-
-
Registries — navi/core/registry.py · build_default_registries()
-
- ToolRegistry
- ProfileRegistry
- BackendRegistry
-
-
-
-
↓
-
-
-
-
LLM Backend
-
- OllamaBackend
- complete()
- stream_complete()
-
-
-
-
SessionStore (SQLite)
-
- messages[]
- context[]
-
-
-
-
MemoryStore (SQLite)
-
- memory_facts
- summary
-
-
-
-
-
-
-
-
-
- 🔄 Request Lifecycle
- Streaming flow from WebSocket message to final response.
-
-
-
1
-
- Client sends message
- {type:"message", content:"...", images:[...]} over WebSocket
-
-
-
-
2
-
- websocket_session() creates _AgentRun
- Subscribes a queue, launches _run_agent() as asyncio task, sends stream_start
-
-
-
-
3
-
- Pre-turn compression check
- If context_token_count ≥ num_ctx × threshold → compress context before LLM call
-
-
-
-
4
-
- Planning phase
- If profile.planning_enabled: fast non-streaming LLM call → yields plan_ready event if plan generated
-
-
-
-
5
-
- Tool-calling loop (max_iterations)
- Calls llm.stream_complete() → yields thinking/text/tool events. Loops until finish_reason=stop
-
-
-
-
6
-
- StreamEnd + workers
- Saves session to DB. Runs post-turn workers (compression). Yields context_compressed if triggered
-
-
-
-
✓
-
- Done
- Events broadcast from _AgentRun to all subscriber queues → sent as JSON to WebSocket
-
-
-
-
-
-
-
- 🔗 Context Vars
- Thread-safe async-safe state shared between Agent and tools. Defined in navi/tools/base.py.
-
-
- | ContextVar | Type | Set by | Used by |
-
- | current_session_id |
- str | None |
- Agent before each run |
- SSH pool, scratchpad, todo — per-session state |
-
-
- | current_event_sink |
- Queue | None |
- run_stream() per tool task |
- run_ephemeral() forwards sub-agent events to parent stream |
-
-
- | current_stop_event |
- Event | None |
- _run_agent() before run_stream() |
- Agent loop checks before each LLM call and mid-stream |
-
-
-
-
- Never use task.cancel() for stopping generation. It corrupts Starlette's WebSocket receive state. Use current_stop_event.set() via POST /sessions/{id}/stop.
-
-
-
-
-
- ⚙️ Agent Loop
- Three entry points in navi/core/agent.py:
-
-
- | Method | Returns | Persistence | Planning |
-
- run(session_id, msg) |
- str |
- SQLite session |
- No |
-
-
- run_stream(session_id, msg) |
- AsyncGenerator[AgentEvent] |
- SQLite session |
- Yes (if profile.planning_enabled) |
-
-
- run_ephemeral(msg, profile_id) |
- str |
- In-memory only |
- No |
-
-
-
-
- System prompt construction
- Built fresh on every LLM call — never stored in session.context.
- NAVI_PERSONA (global personality)
-───────────────────────────────────────
-profile.system_prompt (domain rules)
-───────────────────────────────────────
-[memory injection: "## What I remember about the user"]
-───────────────────────────────────────
-session.context messages (history, no system msgs)
-
- Sub-agent isolation
- run_ephemeral() sets current_session_id = "subagent_<uuid12>" so each subagent has its own isolated scratchpad and SSH connection pool entry.
-
-
-
-
- 🗺️ Planning Phase
- Runs before the tool-calling loop when profile.planning_enabled = true.
-
-
-
-
1
-
- LLM call: decide or plan
- Fast non-streaming call: think=False, temperature=0.3, no tools
-
-
-
-
2
-
- Response classification
- Starts with DIRECT → skip planning. No numbered steps found → skip. Otherwise → real plan.
-
-
-
-
3
-
- Plan injection
- Appended to session.context as assistant message — model continues from it naturally
-
-
-
-
4
-
- PlanReady event emitted
- Rendered as collapsible 🗺️ card in UI before execution begins
-
-
-
-
-
-
-
- 💾 Sessions
-
- Session model (navi/core/session.py)
-
-
- | Field | Type | Description |
- id | UUID str | Unique session identifier |
- profile_id | str | Active profile |
- messages | list[Message] | Full history Never compressed. Used for UI display. |
- context | list[Message] | LLM context May be replaced by compression summary. |
- context_token_count | int | Accumulated tokens; reset to 0 after compression |
- pinned | bool | Pinned sessions appear first in sidebar |
-
-
-
- Dual-buffer design
-
- Key invariant: session.messages is the full, unmodified conversation history — always available for display. session.context is what the LLM actually sees — may contain a compression summary instead of old messages.
-
-
- Message format
-
-
- | Field | Present on | Type |
- role | all | user | assistant | tool | system |
- content | most | str | None |
- images | user, assistant | list[str] — base64 |
- tool_calls | assistant (when calling tools) | list[ToolCallRequest] |
- tool_call_id | tool results | str |
- name | tool results | tool name |
- is_summary | compressed blocks | bool |
- created_at | user/assistant | ISO 8601 datetime |
-
-
-
-
-
-
- 🗜️ Context Compression
- Keeps the LLM context within the token budget. Only session.context is modified — session.messages is never touched.
-
- Trigger points
-
-
-
Pre-turn
-
Before LLM call in run_stream()
-
Checks context_token_count against threshold
-
-
-
Post-turn (worker)
-
After StreamEnd via CompressionWorker
-
Re-checks and compresses if still needed
-
-
-
- Algorithm
-
-
-
1
-
- Partition into turns
- Keep last context_keep_recent turns verbatim. Tool call groups never split.
-
-
-
-
2
-
- Format old turns as text
- Tool args truncated to 120 chars, results to 300 chars. Total input capped at 12 000 chars.
-
-
-
-
3
-
- Summarize with LLM
- think=False, bullet-point output. Same model — no model swap or extra loading.
-
-
-
-
4
-
- Replace with summary message
- role=user, is_summary=True. Result: system_msgs + [summary] + recent_turns
-
-
-
-
- Config
-
-
- | Setting | Default | Description |
- CONTEXT_COMPRESSION_ENABLED | true | Enable/disable |
- CONTEXT_COMPRESSION_THRESHOLD | 0.80 | Trigger at 80% of context window |
- CONTEXT_KEEP_RECENT | 10 | Turns kept verbatim |
- CONTEXT_SUMMARY_TEMPERATURE | 0.3 | Summarization temperature |
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- 📡 WebSocket Protocol
-
- Endpoint: ws://host/ws/sessions/{session_id}
- Closes with code 4004 if session not found.
-
- Client → Server
- {
- "type": "message", // required, always "message"
- "content": "user text", // required, non-empty
- "images": ["base64..."], // optional; data: URI prefix stripped server-side
- "files": [ // optional; from POST /sessions/{id}/files
- {"name": "file.pdf", "path": "/abs/path/..."}
- ]
-}
-
-
-
-
- 📬 Events Reference
-
-
- | Type | Direction | Fields | Description |
-
- | stream_start |
- S→C | — |
- Agent processing began. Block user input. |
-
-
- | thinking_delta |
- S→C | delta |
- Reasoning chunk (streaming). Accumulate until thinking_end. |
-
-
- | thinking_end |
- S→C | — |
- Reasoning phase complete. Auto-collapsed in UI. |
-
-
- | turn_thinking |
- S→C | thinking, is_subagent |
- Full reasoning block from tool-calling turn (non-streaming). |
-
-
- | plan_ready |
- S→C | plan |
- Step-by-step plan before execution. Rendered as 🗺️ card. |
-
-
- | tool_started |
- S→C | tool, args, is_subagent |
- Tool call began. Shows pending spinner in UI immediately. |
-
-
- | tool_call |
- S→C | tool, args, result, success, is_subagent |
- Tool finished. Pairs with preceding tool_started. |
-
-
- | stream_delta |
- S→C | delta |
- Final response text chunk. Accumulate to build full content. |
-
-
- | stream_end |
- S→C | content, context_tokens, max_context_tokens |
- Final response complete. Unlock user input. |
-
-
- | stream_stopped |
- S→C | — |
- User stopped generation via POST /sessions/{id}/stop. |
-
-
- | context_compressed |
- S→C | messages_before, messages_after |
- Context compression ran after this turn. |
-
-
- | profile_switched |
- S→C | profile_id, profile_name |
- Active profile changed mid-stream by switch_profile tool. |
-
-
- | error |
- S→C | message |
- Unhandled error. Some are recoverable, some terminate the stream. |
-
-
-
-
-
-
-
- 🎬 Typical Event Sequences
-
- Simple question (no tools)
-
-
stream_start
-
thinking_delta × N // if model reasons
-
thinking_end
-
stream_delta × N
-
stream_end
-
-
- With planning + tools
-
-
stream_start
-
plan_ready // if planning_enabled
-
turn_thinking // reasoning before tool selection
-
tool_started
-
tool_call
-
tool_started
-
tool_call
-
thinking_delta × N
-
thinking_end
-
stream_delta × N
-
stream_end
-
context_compressed // optional, if threshold hit
-
-
- Subagent (spawn_agent)
-
-
stream_start
-
tool_started spawn_agent is_subagent=false
-
turn_thinking is_subagent=true
-
tool_started mcp__navi_web__web_search is_subagent=true
-
tool_call mcp__navi_web__web_search is_subagent=true
-
tool_started filesystem is_subagent=true
-
tool_call filesystem is_subagent=true
-
tool_call spawn_agent is_subagent=false
-
stream_delta × N
-
stream_end
-
-
- Profile switch
-
-
stream_start
-
tool_started switch_profile
-
profile_switched // update UI here
-
tool_call switch_profile
-
stream_delta × N
-
stream_end
-
-
-
-
-
- 🌐 REST API
-
-
- | Method | Path | Description |
-
- | GET |
- /health |
- Health check → {"status":"ok"} |
-
-
- | GET |
- /agents/profiles |
- List all available profiles |
-
-
- | GET |
- /agents/tools |
- List all registered tools (builtin + user) |
-
-
- | POST |
- /sessions |
- Create session → {session_id, profile_id, created_at} |
-
-
- | GET |
- /sessions |
- List all sessions (sorted by pinned+last_active) |
-
-
- | GET |
- /sessions/{id} |
- Full session with message history (display buffer) |
-
-
- | GET |
- /sessions/{id}/context |
- LLM context (may differ from messages — for debugging) |
-
-
- | PATCH |
- /sessions/{id}/pin |
- Pin or unpin a session |
-
-
- | DEL |
- /sessions/{id} |
- Delete session and its uploaded files |
-
-
- | POST |
- /sessions/{id}/files |
- Upload file (multipart/form-data). Max 200 MB. TTL 24h. |
-
-
- | POST |
- /sessions/{id}/messages |
- Send message, wait for full response (non-streaming) |
-
-
- | POST |
- /sessions/{id}/stop |
- Signal cooperative stop for running agent |
-
-
- | WS |
- /ws/sessions/{id} |
- Streaming agent interface |
-
-
-
-
-
-
-
- 👤 Profiles
- Profiles define tools, system prompt, model, and behaviour per domain. Defined in navi/profiles/.
-
-
-
- | Profile ID | Name | Model | Temp | Planning |
-
- secretary | Personal Secretary |
- gemma4:31b-cloud |
- 0.7 |
- Yes |
-
-
- server_admin | Server Administrator |
- gemma4:31b-cloud |
- 0.2 |
- Yes |
-
-
- smart_home | Smart Home Assistant |
- gemma4:31b-cloud |
- 0.3 |
- Yes |
-
-
-
-
- Per-profile scratchpad sections
-
-
- | Profile | Sections | Domain focus |
- secretary | findings, sources, drafts | Research, writing, analysis |
- server_admin | status, logs, errors, plan | Remote ops, monitoring |
- smart_home | state, config, errors | Home Assistant, IoT, automations |
-
-
-
- AgentProfile fields
-
-
- | Field | Type | Description |
- id | str | Unique identifier used in API and sessions |
- name | str | Human-readable name for UI |
- system_prompt | str | Domain-specific instructions (appended after persona) |
- enabled_tools | list[str] | Tool names available to this profile |
- model | str | Ollama model override (falls back to settings default) |
- temperature | float | LLM temperature |
- max_iterations | int | Tool-calling loop limit (default 50) |
- planning_enabled | bool | Run planning phase before tool loop |
- llm_backend | str | Backend key in BackendRegistry (default "ollama") |
-
-
-
-
-
-
- 🧠 Memory System
- Long-term user memory: facts extracted from conversations, stored in SQLite, injected into every session.
-
- Database schema
-
-
- | Table | Key columns | Purpose |
-
- memory_facts |
- (category, key) unique |
- Individual facts about the user — preferences, projects, environment |
-
-
- memory_summary |
- Single row (id=1) |
- Narrative summary generated from all facts; injected into every session |
-
-
- session_memory_state |
- session_id, extracted_at |
- Tracks which sessions have been processed for extraction |
-
-
-
-
- Automatic extraction trigger
- POST /sessions (create new session) fires _process_stale_sessions() as a background task. Processes sessions idle > 30 minutes that haven't been extracted yet.
-
- Memory injection
- On every run_stream() / run() call, _memory_msg() fetches the summary and returns a system message: "## What I remember about the user\n\n{summary}". Injected after main system prompt, before conversation history.
-
- Memory tools usage rules
-
- Call memory_search when the user mentions something personal or before making assumptions about their environment. Do not call at session start reflexively — only when context warrants it. Call memory_forget only when explicitly asked.
-
-
-
-
-
- ⚙️ Configuration
- All settings read from .env via pydantic-settings. Imported as from navi.config import settings.
-
- LLM
-
-
- | Variable | Default | Description |
- OLLAMA_HOST | http://localhost:11434 | Ollama server URL |
- OLLAMA_DEFAULT_MODEL | gemma4:31b-cloud | Default model (overridable per profile) |
- OLLAMA_NUM_CTX | 65536 | Context window size in tokens |
- OLLAMA_THINK | true | Enable extended reasoning |
-
-
-
- Security / Sandboxing
-
-
- | Variable | Default | Description |
- FS_ALLOWED_PATHS | * | Comma-separated paths filesystem tool can access. * = no limit |
- TERMINAL_ALLOWED_COMMANDS | * | Comma-separated allowed executables. * = allow all |
- SSH_HOSTS_FILE | ssh_hosts.json | Named SSH connections config |
-
-
-
- Persona
-
-
- | Variable | Description |
- NAVI_PERSONA | Inline global personality prompt |
- NAVI_PERSONA_FILE | Path to .txt file with persona (recommended — inline doesn't parse multiline well) |
-
-
-
- Other
-
-
- | Variable | Default | Description |
- DB_PATH | navi.db | SQLite file path |
- LOG_LEVEL | INFO | DEBUG / INFO / WARNING / ERROR |
- TOOLS_DIR | tools | User tools directory |
- SESSION_FILES_DIR | session_files | Uploaded files directory |
- SESSION_FILES_MAX_SIZE_MB | 200 | Max upload size per file |
- SESSION_FILES_TTL_HOURS | 24 | File retention hours |
-
-
-
-
-
-
-
-
-
diff --git a/navi/api/routes/admin.py b/navi/api/routes/admin.py
index a770b20..fcd034a 100644
--- a/navi/api/routes/admin.py
+++ b/navi/api/routes/admin.py
@@ -51,9 +51,7 @@
sort_by=sort_by,
sort_order=sort_order,
)
- total = await store.count_all(
- user_id=user.id, is_admin=True, search=search or None
- )
+ total = await store.count_all(user_id=user.id, is_admin=True, search=search or None)
return {
"total": total,
"limit": limit,
@@ -348,10 +346,9 @@
"anti_stall_threshold": profile.anti_stall_threshold,
"step_validation_enabled": profile.step_validation_enabled,
"adaptive_replan_enabled": profile.adaptive_replan_enabled,
- "subagent_tools": profile.subagent_tools,
"subagent_planning_enabled": profile.subagent_planning_enabled,
"subagent_think_enabled": profile.subagent_think_enabled,
- "enabled_tools": profile.enabled_tools,
+ "tools": profile.tools.model_dump(),
"context_providers": profile.context_providers,
"is_admin_only": getattr(profile, "is_admin_only", False),
}
@@ -387,7 +384,6 @@
updated_data["id"] = profile_id
updated_data.setdefault("name", old_profile.name)
updated_data.setdefault("description", old_profile.description)
- updated_data.setdefault("enabled_tools", old_profile.enabled_tools)
updated_data.setdefault("system_prompt", old_profile.system_prompt)
try:
@@ -553,9 +549,7 @@
mcp_manager.clients[server_name] = client
except Exception as exc:
log.warning("admin.mcp_reconnect_failed", server=server_name, error=str(exc))
- raise HTTPException(
- status_code=502, detail=f"Reconnect failed: {exc}"
- ) from exc
+ raise HTTPException(status_code=502, detail=f"Reconnect failed: {exc}") from exc
# Re-register tools
try:
@@ -627,22 +621,16 @@
arguments = body.get("arguments", {})
if not server_name or not tool_name:
- raise HTTPException(
- status_code=400, detail="server_name and tool_name are required"
- )
+ raise HTTPException(status_code=400, detail="server_name and tool_name are required")
mcp_manager = get_mcp_manager()
if mcp_manager is None:
raise HTTPException(status_code=503, detail="MCP manager not initialized")
try:
- output, is_error = await mcp_manager.call_tool(
- server_name, tool_name, arguments
- )
+ output, is_error = await mcp_manager.call_tool(server_name, tool_name, arguments)
except Exception as exc:
- raise HTTPException(
- status_code=502, detail=f"Tool call failed: {exc}"
- ) from exc
+ raise HTTPException(status_code=502, detail=f"Tool call failed: {exc}") from exc
return {
"server_name": server_name,
diff --git a/navi/api/routes/agents.py b/navi/api/routes/agents.py
index 509e007..28ddd52 100644
--- a/navi/api/routes/agents.py
+++ b/navi/api/routes/agents.py
@@ -2,7 +2,7 @@
from typing import Annotated
-from fastapi import APIRouter, Depends, HTTPException
+from fastapi import APIRouter, Depends
from navi.api.deps import get_current_user, get_mcp_manager, get_profile_registry, get_tool_registry
from navi.auth import User
@@ -23,22 +23,25 @@
for p in profiles.all():
if getattr(p, "is_admin_only", False) and not is_admin:
continue
- result.append({
- "id": p.id,
- "name": p.name,
- "description": p.description,
- "enabled_tools": p.enabled_tools,
- "mcp_servers": p.mcp_servers,
- "llm_backend": p.llm_backend,
- "model": p.model,
- "temperature": p.temperature,
- "top_k": p.top_k,
- "top_p": p.top_p,
- "max_iterations": p.max_iterations,
- "iteration_budget_enabled": p.iteration_budget_enabled,
- "think_enabled": p.think_enabled,
- "subagent_think_enabled": p.subagent_think_enabled,
- })
+ result.append(
+ {
+ "id": p.id,
+ "name": p.name,
+ "description": p.description,
+ "tools": p.tools.model_dump()
+ if p.tools
+ else {"agent": {"native": [], "mcp": {}}, "subagent": {"native": [], "mcp": {}}},
+ "llm_backend": p.llm_backend,
+ "model": p.model,
+ "temperature": p.temperature,
+ "top_k": p.top_k,
+ "top_p": p.top_p,
+ "max_iterations": p.max_iterations,
+ "iteration_budget_enabled": p.iteration_budget_enabled,
+ "think_enabled": p.think_enabled,
+ "subagent_think_enabled": p.subagent_think_enabled,
+ }
+ )
return result
@@ -109,17 +112,19 @@
sections.append({"label": "profiles block", "content": "\n".join(lines)})
full = "\n\n---\n\n".join(s["content"] for s in sections)
- result.append({
- "profile_id": profile.id,
- "profile_name": profile.name,
- "model": profile.model,
- "enabled_tools": profile.enabled_tools,
- "mcp_servers": profile.mcp_servers,
- "resolved_mcp_tools": _resolve_mcp_tools(profile, mcp_manager, tool_registry),
- "sections": sections,
- "full": full,
- "total_chars": len(full),
- })
+ result.append(
+ {
+ "profile_id": profile.id,
+ "profile_name": profile.name,
+ "model": profile.model,
+ "enabled_tools": profile.enabled_tools,
+ "mcp_servers": profile.mcp_servers,
+ "resolved_mcp_tools": _resolve_mcp_tools(profile, mcp_manager, tool_registry),
+ "sections": sections,
+ "full": full,
+ "total_chars": len(full),
+ }
+ )
return result
@@ -155,10 +160,7 @@
if client and client.connected:
try:
tools = await client.list_tools()
- server_tools = [
- {"name": t.name, "description": t.description or ""}
- for t in tools
- ]
+ server_tools = [{"name": t.name, "description": t.description or ""} for t in tools]
except Exception:
pass
@@ -166,11 +168,13 @@
profile_refs = []
for p in all_profiles:
if name in (p.mcp_servers or {}):
- profile_refs.append({
- "profile_id": p.id,
- "profile_name": p.name,
- "groups": p.mcp_servers[name],
- })
+ profile_refs.append(
+ {
+ "profile_id": p.id,
+ "profile_name": p.name,
+ "groups": p.mcp_servers[name],
+ }
+ )
# Merge instructions: server-provided + config overlay
parts: list[str] = []
@@ -181,16 +185,18 @@
parts.append("")
parts.append(cfg.instructions)
- result.append({
- "name": name,
- "connected": client is not None and client.connected,
- "transport": cfg.transport,
- "url": cfg.url,
- "command": cfg.command,
- "groups": cfg.groups,
- "instructions": "\n".join(parts) if parts else None,
- "tools": server_tools,
- "profiles": profile_refs,
- })
+ result.append(
+ {
+ "name": name,
+ "connected": client is not None and client.connected,
+ "transport": cfg.transport,
+ "url": cfg.url,
+ "command": cfg.command,
+ "groups": cfg.groups,
+ "instructions": "\n".join(parts) if parts else None,
+ "tools": server_tools,
+ "profiles": profile_refs,
+ }
+ )
return result
diff --git a/navi/core/tool_executor.py b/navi/core/tool_executor.py
index 7c049b5..a71be58 100644
--- a/navi/core/tool_executor.py
+++ b/navi/core/tool_executor.py
@@ -1,7 +1,6 @@
"""Tool execution helpers — extracted from agent.py."""
import asyncio
-from datetime import datetime, timezone
from typing import TYPE_CHECKING
import structlog
@@ -24,7 +23,7 @@
return name, tool
# Support bare tool name when the full MCP name ends with it
- # e.g. "web_search" -> "mcp__navi_web__web_search"
+ # e.g. "web_search" -> "mcp__navi-web__web_search"
bare_matches = [
(candidate_name, candidate)
for candidate_name, candidate in tool_map.items()
@@ -96,9 +95,13 @@
metadata: dict = {}
if tool is None:
content = f"Error: tool '{tc.name}' not found."
- event = ToolEvent(tool_name=tc.name, arguments=tc.arguments,
- result=content, success=False,
- tool_call_id=tc.id)
+ event = ToolEvent(
+ tool_name=tc.name,
+ arguments=tc.arguments,
+ result=content,
+ success=False,
+ tool_call_id=tc.id,
+ )
else:
log.info("tool.execute", tool=resolved_name, requested_tool=tc.name, args=tc.arguments)
middlewares = getattr(self._tools, "_middlewares", [])
@@ -109,10 +112,14 @@
await mw.after_execute(resolved_name, tc.arguments, result)
content = result.to_message_content()
metadata = result.metadata or {}
- event = ToolEvent(tool_name=resolved_name, arguments=tc.arguments,
- result=content, success=result.success,
- metadata=metadata,
- tool_call_id=tc.id)
+ event = ToolEvent(
+ tool_name=resolved_name,
+ arguments=tc.arguments,
+ result=content,
+ success=result.success,
+ metadata=metadata,
+ tool_call_id=tc.id,
+ )
if result.success and result.metadata and result.metadata.get("is_image"):
b64 = result.metadata.get("base64")
if b64:
@@ -121,8 +128,13 @@
content=f"[Image loaded via {resolved_name} — analyse it]",
images=[b64],
)
- msg = Message(role="tool", content=content, tool_call_id=tc.id,
- name=resolved_name if tool is not None else tc.name, metadata=metadata)
+ msg = Message(
+ role="tool",
+ content=content,
+ tool_call_id=tc.id,
+ name=resolved_name if tool is not None else tc.name,
+ metadata=metadata,
+ )
return event, msg, image_msg
async def _run_single_tool(
@@ -142,7 +154,9 @@
self, tool_calls: list[ToolCallRequest], tools: list[Tool], ctx=None
) -> tuple[list[Message], list[Message]]:
tool_map = {t.name: t for t in tools}
- pairs = await asyncio.gather(*[self._execute_one(tc, tool_map, ctx=ctx) for tc in tool_calls])
+ pairs = await asyncio.gather(
+ *[self._execute_one(tc, tool_map, ctx=ctx) for tc in tool_calls]
+ )
tool_msgs = [p[1] for p in pairs]
image_msgs = [p[2] for p in pairs if p[2] is not None]
return tool_msgs, image_msgs
@@ -151,5 +165,7 @@
self, tool_calls: list[ToolCallRequest], tools: list[Tool], ctx=None
) -> tuple[list[tuple["ToolEvent", Message]], list[Message]]:
tool_map = {t.name: t for t in tools}
- triples = await asyncio.gather(*[self._execute_one(tc, tool_map, ctx=ctx) for tc in tool_calls])
+ triples = await asyncio.gather(
+ *[self._execute_one(tc, tool_map, ctx=ctx) for tc in tool_calls]
+ )
return [(t[0], t[1]) for t in triples], [t[2] for t in triples if t[2] is not None]
diff --git a/navi/profiles/developer/config.json b/navi/profiles/developer/config.json
index 636421a..3a62928 100644
--- a/navi/profiles/developer/config.json
+++ b/navi/profiles/developer/config.json
@@ -6,7 +6,7 @@
"full_description": {
"specialization": "Full-stack software development: writing code in any language, debugging, running tests, working with files and project structure, git, APIs, scripting. Works on the user's own projects, not Navi's internals.",
"when_to_use": "When the user wants to build something — a game, a script, an app, a web service, anything. For writing Navi tools specifically, use tool_developer instead.",
- "key_tools": "filesystem, code_exec, terminal, ssh_exec, mcp__navi_web__web_search, mcp__navi_web__web_view, spawn_agent"
+ "key_tools": "filesystem, code_exec, terminal, ssh_exec, mcp__navi-web__web_search, mcp__navi-web__web_view, spawn_agent"
},
"llm_backend": "ollama",
"model": [
diff --git a/navi/profiles/discuss/system_prompt.txt b/navi/profiles/discuss/system_prompt.txt
index f210c80..c6f89a5 100644
--- a/navi/profiles/discuss/system_prompt.txt
+++ b/navi/profiles/discuss/system_prompt.txt
@@ -18,7 +18,7 @@
## Tools
-Use `mcp__navi_web__web_search` + `mcp__navi_web__web_view` when a factual grounding would strengthen the discussion — not for every question, only when currency or precision matters.
+Use `mcp__navi-web__web_search` + `mcp__navi-web__web_view` when a factual grounding would strengthen the discussion — not for every question, only when currency or precision matters.
Use project `docs/` when discussing an active project. Prefer `docs/index.md` as the map, then query specific docs rather than rereading broad source trees.
diff --git a/navi/profiles/modeler_3d/config.json b/navi/profiles/modeler_3d/config.json
index 2577873..f8610fd 100644
--- a/navi/profiles/modeler_3d/config.json
+++ b/navi/profiles/modeler_3d/config.json
@@ -6,7 +6,7 @@
"full_description": {
"specialization": "Physically coherent 3D geometry and STL generation. Generates STL files from OpenSCAD through dedicated 3D tools, and validates with OpenSCAD compilation plus preview render inspection.",
"when_to_use": "When the user needs a physical object modeled as 3D geometry: replacement parts, mechanical assemblies, decorative items, functional prototypes, jigs, fixtures, or custom enclosures.",
- "key_tools": "spawn_agent, filesystem, mcp__navi_3d__lint_scad, mcp__navi_3d__compile_scad, mcp__navi_3d__render_stl, image_view, content_publish"
+ "key_tools": "spawn_agent, filesystem, mcp__navi-3d__lint_scad, mcp__navi-3d__compile_scad, mcp__navi-3d__render_stl, image_view, content_publish"
},
"llm_backend": "ollama",
"model": [
diff --git a/navi/profiles/modeler_3d/subagent_system_prompt.txt b/navi/profiles/modeler_3d/subagent_system_prompt.txt
index cd7d776..081ff0e 100644
--- a/navi/profiles/modeler_3d/subagent_system_prompt.txt
+++ b/navi/profiles/modeler_3d/subagent_system_prompt.txt
@@ -6,7 +6,7 @@
1. Read the briefing and parent session context.
2. Identify the exact missing facts requested by the parent agent.
-3. Use `mcp__navi_web__web_search`, `mcp__navi_web__web_view`, `filesystem`, and `image_view` as needed to gather evidence.
+3. Use `mcp__navi-web__web_search`, `mcp__navi-web__web_view`, `filesystem`, and `image_view` as needed to gather evidence.
4. Prefer primary sources, product pages, datasheets, manuals, dimensions in local files, or images provided by the user.
5. Return only the facts found, source paths/URLs, confidence, and unresolved gaps.
diff --git a/navi/profiles/secretary/config.json b/navi/profiles/secretary/config.json
index cc53485..439709e 100644
--- a/navi/profiles/secretary/config.json
+++ b/navi/profiles/secretary/config.json
@@ -6,7 +6,7 @@
"full_description": {
"specialization": "General-purpose personal assistant. Web research, document writing, data analysis, email correspondence, planning, calculations, and any everyday task that doesn't require direct server access or tool development.",
"when_to_use": "Default profile for most requests. If you're unsure which profile to use, this one is correct. Switch away only when the task clearly requires server/infrastructure access (server_admin) or modifying Navi's own tools (developer).",
- "key_tools": "mcp__navi_web__web_search, mcp__navi_web__web_view, filesystem, code_exec, gmail, todo, scratchpad, spawn_agent, memory"
+ "key_tools": "mcp__navi-web__web_search, mcp__navi-web__web_view, filesystem, code_exec, gmail, todo, scratchpad, spawn_agent, memory"
},
"llm_backend": "ollama",
"model": [
diff --git a/navi/profiles/secretary/system_prompt.txt b/navi/profiles/secretary/system_prompt.txt
index 843bd11..991c5bd 100644
--- a/navi/profiles/secretary/system_prompt.txt
+++ b/navi/profiles/secretary/system_prompt.txt
@@ -69,11 +69,11 @@
---
## Tool priorities
-1. mcp__navi_web__web_search — first choice for current info, facts, documentation.
+1. mcp__navi-web__web_search — first choice for current info, facts, documentation.
2. code_exec — calculations, data processing, text parsing, format conversion.
-3. mcp__navi_web__web_view — view a specific page in full.
+3. mcp__navi-web__web_view — view a specific page in full.
4. filesystem — read/write local documents, notes, data files.
-5. mcp__navi_web__http_request — external APIs, webhooks, content not suited for search.
+5. mcp__navi-web__http_request — external APIs, webhooks, content not suited for search.
6. image_view — whenever an image path or URL is mentioned.
## Output style
diff --git a/navi/profiles/server_admin/config.json b/navi/profiles/server_admin/config.json
index 262ce26..a3dbbbd 100644
--- a/navi/profiles/server_admin/config.json
+++ b/navi/profiles/server_admin/config.json
@@ -6,7 +6,7 @@
"full_description": {
"specialization": "Remote server operations via SSH, system diagnostics, service management, log analysis, network troubleshooting, process monitoring, and infrastructure automation.",
"when_to_use": "When the task involves SSH access to servers, running system commands, managing Linux services, analyzing logs, monitoring resources, or any hands-on infrastructure work.",
- "key_tools": "ssh_exec, terminal, filesystem, code_exec, mcp__navi_web__web_search, spawn_agent, memory"
+ "key_tools": "ssh_exec, terminal, filesystem, code_exec, mcp__navi-web__web_search, spawn_agent, memory"
},
"llm_backend": "ollama",
"model": [
diff --git a/navi/profiles/server_admin/system_prompt.txt b/navi/profiles/server_admin/system_prompt.txt
index 93625e4..ed06010 100644
--- a/navi/profiles/server_admin/system_prompt.txt
+++ b/navi/profiles/server_admin/system_prompt.txt
@@ -46,7 +46,7 @@
6. **Synthesise** — after all agents report back, write your conclusions and next steps.
### Plan → execution binding
-- **TOOL** — direct local call (terminal, filesystem, mcp__navi_web__http_request for health checks).
+- **TOOL** — direct local call (terminal, filesystem, mcp__navi-web__http_request for health checks).
- **AGENT** — call `spawn_agent` for THIS STEP ONLY. One AGENT step = one spawn_agent call.
If your plan has steps 1, 2, 3 all marked AGENT — you make three separate spawn_agent calls.
Never bundle multiple steps into one call. Never pass your full plan to a single subagent.
@@ -80,8 +80,8 @@
1. ssh_exec — direct single-command checks on known hosts when spawning is overkill.
2. terminal — local machine operations.
3. filesystem — local config files, scripts.
-4. mcp__navi_web__http_request — health check endpoints, REST APIs.
-5. mcp__navi_web__web_search — error lookups, documentation.
+4. mcp__navi-web__http_request — health check endpoints, REST APIs.
+5. mcp__navi-web__web_search — error lookups, documentation.
## Execution environment
`terminal`, `filesystem`, and `code_exec` run on the LOCAL machine (where Navi's server is running) — NOT on any remote host.