# Tool System

Tools are the agent's actions. All tools implement the `Tool` ABC from `navi/tools/base.py`.

## Two tiers

### Built-in tools (`navi/tools/`)

Registered in `build_default_registries()` as builtins. Never removed on hot-reload.

| Tool | Name | Description |
|---|---|---|
| `WebSearchTool` | `mcp__navi_web__web_search` | DuckDuckGo search |
| `WebViewTool` | `mcp__navi_web__web_view` | Fetch and render a URL |
| `FilesystemTool` | `filesystem` | Read/write/list/copy/grep/diff local files (path restrictions via config) |
| `HttpRequestTool` | `mcp__navi_web__http_request` | Generic HTTP client (GET/POST/etc.) |
| `CodeExecTool` | `code_exec` | Execute Python in a subprocess sandbox |
| `TerminalTool` | `terminal` | Run shell commands (command allowlist via config) |
| `SshExecTool` | `ssh_exec` | SSH exec and SCP file transfer; connection pool keyed by session ID |
| `ImageViewTool` | `image_view` | Load image from path/URL → resize to 1024px, convert to JPEG, return base64 for multimodal LLM |
| `TodoTool` | `todo` | Per-session task checklist (set/update/read) |
| `ScratchpadTool` | `scratchpad` | Per-session named working notes (write/append/read/clear) |
| `ReloadToolsTool` | `reload_tools` | Hot-reload user tools without server restart |
| `ListToolsTool` | `list_tools` | Return the live tool list from registry |
| `ToolManualTool` | `tool_manual` | Return manuals/{name}.md or auto-generate from schema |
| `MemoryTool` | `memory` | Unified memory tool: save, search, and forget facts |
| `SpawnAgentTool` | `spawn_agent` | Spawn an isolated subagent (blocking). Optional `profile_id` selects another profile; omitted means parent profile. `inherit_system_prompt=true` prepends the parent profile's full system prompt as a base layer for the subagent |
| `SwitchProfileTool` | `switch_profile` | Switch the active profile for a session |
| `ListProfilesTool` | `list_profiles` | List all available profiles |
| `ShareFileTool` | `share_file` | Copy an existing local file into session files and return a download link |
| `ContentPublishTool` | `content_publish` | Register an existing session file for inline viewing in chat |
| `McpTool` (gnexus-creds) | `mcp__gnexus_creds__search_secrets` | Search personal secrets (UUID id, masked values) |
| `McpTool` (gnexus-creds) | `mcp__gnexus_creds__get_secret` | Get secret metadata and masked fields |
| `McpTool` (gnexus-creds) | `mcp__gnexus_creds__reveal_secret` | Decrypt and return plaintext value (audited) |
| `McpTool` (gnexus-creds) | `mcp__gnexus_creds__create_secret` | Create a new secret with encrypted fields |
| `McpTool` (gnexus-creds) | `mcp__gnexus_creds__update_secret` | Update fields/metadata of an existing secret |
| `McpTool` (gnexus-creds) | `mcp__gnexus_creds__set_secret_status` | Change secret status (actual / outdated / archived) |
| `McpTool` (gnexus-creds) | `mcp__gnexus_creds__archive_secret` | Permanently hide secret from MCP queries |
| `McpTool` (navi-3d) | `mcp__navi_3d__compile_scad` | Compile an OpenSCAD script into a binary STL file |
| `McpTool` (navi-3d) | `mcp__navi_3d__lint_scad` | Lightweight OpenSCAD source linting before STL compilation |
| `McpTool` (navi-3d) | `mcp__navi_3d__render_stl` | Render preview PNG images from an STL file (up to 3 views) |
| `McpTool` (navi-web) | `mcp__navi_web__web_search` | Web search (SearXNG primary, DDG fallback, Brave tertiary) |
| `McpTool` (navi-web) | `mcp__navi_web__web_view` | Open a URL in a headless browser and return clean readable text |
| `McpTool` (navi-web) | `mcp__navi_web__http_request` | Raw HTTP request (GET/POST/PUT/PATCH/DELETE) |
| `McpStatusTool` | `mcp_status` | Check connectivity and list tools for configured MCP servers |
| `ReflectTool` | `reflect` | Self-reflection and analysis |
| `ScheduleRecallTool` | `schedule_recall` | Schedule a headless callback for the current session (once/recurring/immediate) |
| `ManageRecallTool` | `manage_recall` | Cancel, skip, or list scheduled recalls for the current session |

### User tools (`tools/*.py`)

Written manually or via `create_mcp_server`. Auto-discovered at startup.

- Files starting with `_` are ignored.
- `tools/enabled.json` — list of user tool names to include in all profiles automatically.
- `tools/_template.py` — canonical format reference (not loaded).

Currently present: `get_current_datetime.py`, `gmail.py`, `weather.py`.

---

## Tool formats

### Module-level format (preferred for user tools)

```python
name = "my_tool"
description = "What it does and when to use it — be specific."
parameters = {
    "type": "object",
    "properties": {
        "param": {"type": "string", "description": "..."}
    },
    "required": ["param"]
}

async def execute(params: dict) -> str:
    # Return a plain string on success.
    # Raise an exception to signal failure.
    return "result"
```

No classes, no module-level `print()`. The loader wraps `execute` in a `Tool` subclass automatically.

### Class-based format (built-in tools)

```python
from navi.tools.base import Tool, ToolResult

class MyTool(Tool):
    name = "my_tool"
    description = "..."
    parameters = {"type": "object", "properties": {...}, "required": [...]}

    async def execute(self, params: dict) -> ToolResult:
        return ToolResult(success=True, output="result")
```

`ToolResult` fields:
- `success: bool`
- `output: str` — always a string; LLM sees this
- `error: str | None` — included in output on failure via `to_message_content()`
- `metadata: dict` — internal hints (e.g. `is_image: True` → triggers image injection into context)

---

## Tool loading (`navi/tools/loader.py`)

`load_tools_from_dir(tools_dir)` returns `LoadResult(loaded, errors)`.

Load order:
1. Try module-level format (checks for `name`, `description`, `parameters`, `execute`).
2. Fall back to class-based (scans for `Tool` subclasses).

Errors are **isolated per file** — one broken file does not prevent others from loading. Errors are logged and returned in `LoadResult.errors`.

---

## Middleware hooks

Tools support `before_execute` / `after_execute` middleware hooks registered via `ToolRegistry.add_middleware()`. The built-in `LoggingMiddleware` logs every tool call with duration and result summary.

Use middleware for cross-cutting concerns: metrics, rate limiting, authorization, audit logging.

## Hot-reload

`reload_tools` tool calls `ToolRegistry.reload_user_tools(tools_dir)`:
1. Drops all tools that are NOT in `_builtin_names`.
2. Re-runs `load_tools_from_dir`.
3. New tools registered without server restart.

New tools become available from the **next** user message (tool schemas are built at `run_stream()` entry, not during execution).

---

## Self-extension via MCP servers

New capabilities are added as MCP servers using `create_mcp_server`. The server scaffolding includes:
1. A directory under `mcp-servers/{name}/`.
2. A `server.py` entrypoint with stdio transport.
3. A config file at `mcp_servers.d/{name}.json`.
4. Registration via `reload_tools` or server restart.

The agent should call `tool_manual("create_mcp_server")` before using it for the first time.

---

## Scratchpad and Todo

Both are per-session, backed by the PostgreSQL KV-store (`session_store` table) and survive server restarts.

**Scratchpad** — named sections for working notes within a task. Operations: `write`, `append`, `read`, `clear`. Subagents get isolated scratchpads (unique UUID-based session ID in `run_ephemeral()`).

**Todo** — checklist for tracking multi-step plans. Operations: `set` (replace all tasks), `update` (set status of one task), `read`. Statuses: `pending`, `in_progress`, `done`, `failed`, `skipped`.

---

## Image tool flow

When `image_view` succeeds, it returns `ToolResult` with `metadata={"is_image": True, "base64": "..."}`.

The agent detects this and appends a synthetic user message with the image to `session.context` (but not `session.messages`). This makes the image visible to the next LLM call without polluting the display history.

See [`sessions.md`](sessions.md) for the dual-buffer design.

## UI component tool flow (`navi_ui`)

The internal `navi_ui` MCP server renders structured UI components inside chat messages. It exposes one tool: `render_component(component_name, payload, session_id)`.

The server validates `payload` against the registered component schema and returns a JSON envelope:

```json
{
  "output": "Component 'card_grid' rendered for session s1",
  "metadata": {
    "ui_component": {
      "component": "card_grid",
      "payload": { ... validated payload ... }
    }
  }
}
```

`navi/mcp/tools.py` parses this envelope for the `navi_ui` server:

- `metadata.ui_component` is stored on the `role="tool"` message for the call.
- `output` is returned as the tool text result.
- If `output` starts with `"Error:"`, the call is treated as a failure.

The webclient reads `metadata.ui_component` from the tool message, resolves the Vue renderer by `component` name, and renders it inline within the assistant message. Because the metadata lives on the persistent tool message, the UI component survives page reload and session restore.

### Available components

| Component | Use for |
|---|---|
| `card_grid` | Compact grid of cards with details modal. |
| `form` | Schema-driven user input form with client-side validation. |

### `form` component

`render_component("form", payload)` renders a form in the webclient. The client validates input in real time, sends the result back as a JSON WebSocket message, and replaces the form with a read-only summary. The submitted JSON is stored as a hidden `role="user"` message (`is_display=False, is_context=True`), so it reaches the LLM but never appears in the chat UI.

Payload schema (`FormPayload`):

- `form_id` (required, string) — stable identifier for this form instance.
- `title` (optional, string) — heading above the form.
- `description` (optional, string) — instructions or helper text.
- `fields` (required, array, max 20) — list of `FormField` objects:
  - `name` (required) — machine identifier, `[a-zA-Z0-9_-]+`.
  - `label` (required) — shown above the input.
  - `type` — one of `text`, `textarea`, `number`, `email`, `url`, `select`, `multiselect`, `checkbox`, `date`.
  - `required` (bool).
  - `placeholder` (string).
  - `default` — pre-filled value.
  - `options` (array of `{label, value}`) — required for `select` / `multiselect`.
  - `min`, `max` — for `number`.
  - `minLength`, `maxLength` — for `text` / `textarea`.
  - `pattern` — ECMAScript regex for `text` / `textarea`.
  - `description` — per-field hint.
- `submit_label` (optional, default `"Submit"`) — button label.

On submit the client sends:

```json
{"type": "form_submit", "form_id": "...", "values": {"field_name": "value", ...}}
```

Server validation is intentionally minimal; the LLM receives the JSON values and requests clarification if anything is missing or wrong.
