# Navi — Project Context for Claude

## Working rules

- **No unapproved steps.** Before implementing a change, propose the approach and wait for user confirmation. Do not make architectural decisions, add dependencies, or modify external interfaces without explicit approval.
- **Check external libraries first.** When a task involves a third-party library (e.g. `gnexus-auth-client-py`), read the library source/docs before assuming behavior or inventing functionality.

## What this is
Personal modular AI agent system. FastAPI backend + vanilla JS client. The agent is named Navi — female, loyal, uncensored personal assistant. Runs locally via Ollama.

## Stack
- **Backend**: FastAPI, uvicorn, asyncpg, structlog, pydantic-settings
- **LLM**: Ollama (primary, cloud + local fallback), OpenAI-compatible backend also wired
- **Current models**: `gemma4:31b-cloud` (primary, Ollama Cloud), `gemma4:26b-a4b-it-q4_K_M` (local fallback)
- **Multi-server fallback**: `OLLAMA_BACKENDS_FILE=ollama_backends.json` — ordered list of servers; profiles use a model priority list; dead servers/models are blacklisted in-memory until restart
- **Thinking**: `ollama_think: bool = True` — model reasoning is enabled and streamed to client
- **Client**: Vanilla JS ES modules, marked.js + highlight.js via esm.sh CDN
- **DB**: PostgreSQL (asyncpg) for persistent sessions, memory, and scheduler
- **Run**: `.venv/bin/uvicorn navi.main:app --reload --reload-dir navi --port 8000`

## Key architecture

### Agent loop (`navi/core/agent.py`)
Tool-calling loop with `llm.complete()` for tool turns, `llm.stream()` for final response.
Events yielded: `ToolEvent`, `ThinkingDelta`, `ThinkingEnd`, `TextDelta`, `StreamEnd`.
Tool schemas built fresh on every `run_stream()` call from registry + `tools/enabled.json`.

### Tool system
Two tiers:

**Built-in tools** (`navi/tools/`):
- `web_search`, `filesystem`, `http_request`, `code_exec`, `terminal`, `ssh_exec`, `image_view`
- `write_tool` — Navi's primary self-extension mechanism (writes + reloads in one call)
- `reload_tools` — hot-reload all user tools without server restart
- `list_tools` — returns actual live tool list from registry (Navi calls this when asked what she can do)
- `tool_manual` — returns `manuals/<tool_name>.md` if exists, else auto-generates from schema

**User tools** (`tools/*.py`):
- Written by Navi via `write_tool`, or manually
- Module-level format: `name`, `description`, `parameters`, `async def execute(params) -> str`
- No classes, no module-level print(), execute must return plain string or raise
- Auto-discovered at startup and on reload
- `tools/enabled.json` — list of user tool names to auto-include in all profiles
- `tools/_template.py` — canonical format reference (starts with `_`, not auto-loaded)

Currently created by Navi: `get_current_datetime.py`, `user_notes.py` (working, correct format).

### Profiles (`navi/profiles/`)
`secretary`, `server_admin`, `developer`. Each has `enabled_tools`, `system_prompt`, `model`, `temperature`, `max_iterations`.
All profiles have the same built-in tool set: `[..., reload_tools, write_tool, list_tools, tool_manual]`.
User tools from `enabled.json` are merged in by `Agent._tool_list()`.

#### Thinking mechanics (per-profile flags in `config.json`)
Every autonomous-reasoning feature is gated by a flag on `AgentProfile`. New mechanics always add a flag first.
Full reference: `docs/profiles.md`.

| Flag | Default | Purpose |
|------|---------|---------|
| `think_enabled` | `true` | Extended LLM reasoning on every call |
| `planning_enabled` | `false` | Run planning pipeline on every message (first-message always runs it) |
| `planning_mandatory` | `false` | `true` = DIRECT shortcut disabled, all phases always run |
| `planning_phase1_enabled` | `true` | Phase 1: task analysis |
| `planning_phase2_enabled` | `false` | Phase 2: 3-advisor review (adds ~3 LLM calls) |
| `planning_phase3_enabled` | `true` | Phase 3: structured execution plan |
| `iteration_budget_enabled` | `true` | Injects remaining iterations into context |
| `goal_anchoring_enabled` | `true` | Goal reminder every N iterations |
| `goal_anchoring_interval` | `5` | N for goal anchoring |
| `anti_stall_enabled` | `true` | Stall detector (N iterations without todo progress) |
| `anti_stall_threshold` | `8` | Stall threshold in iterations |
| `step_validation_enabled` | `false` | LLM check after each todo step (adds latency) |
| `adaptive_replan_enabled` | `false` | Re-planning on step failure |

Low-latency profiles (e.g. future `smart_home`) can disable expensive flags; deep-reasoning profiles keep everything on.

### Global persona (`navi/config.py` → `.env`)
`NAVI_PERSONA` env var — prepended to every profile's system prompt separated by `---`.
Contains: personality, self-extension instructions, `write_tool` usage rules, `tool_manual` usage.

### Registry (`navi/core/registry.py`)
`ToolRegistry` tracks `_builtin_names` to distinguish builtins from user tools on reload.
`reload_user_tools()` drops all non-builtins and reloads from disk.
Built-in tools with registry injection: `ReloadToolsTool`, `WriteToolTool`, `ListToolsTool`, `ToolManualTool`.

### Sessions (`navi/core/pg_session_store.py`)
Persistent PostgreSQL sessions. `model_dump(mode='json')` required for datetime serialization.
Session ID in URL hash for bookmarking.

### WebSocket protocol (`navi/api/websocket.py`)
```
client → server: {type: "message", content: "...", images: [...]}
server → client: stream_start
                 thinking_delta {delta}   ← reasoning chunks (collapsible in UI)
                 thinking_end
                 tool_call {tool, args, result, success}
                 stream_delta {delta}
                 stream_end {content}
                 error {message}
```

### Client (`client/`)
ES modules: `app.js` (state/routing), `chat.js` (DOM helpers), `ws.js` (WebSocket), `api.js` (REST), `sidebar.js`.
Thinking blocks: open during reasoning, auto-collapse on `thinking_end`, re-openable (like tool cards).
Tool cards: accordion, collapsed by default, click to expand.
Images: paste/attach, base64 via FileReader, rendered in bubbles.
No localStorage — session from URL hash or most recent server session.

### Dynamic tool loading (`navi/tools/loader.py`)
Tries module-level format first (preferred for user tools), falls back to class-based.
Errors isolated per file — one broken file doesn't affect others.
Detailed error messages: lists exactly which required definitions are missing.

### Context providers (`navi/context_providers/`, `context_providers/`)
Inject dynamic runtime data as `role="system"` messages on every LLM call (before conversation history).
Module format: `name`, `description`, `global_provider: bool`, `async def get_context() -> str | None`.
`global_provider=True` → injected in all profiles. `False` → opt-in via `context_providers: [...]` in profile config.
Built-in: `public_url` (always injects `PUBLIC_URL` so Navi knows her own address).
Hot-reloaded by `reload_tools`. Navi uses `tool_manual("write_context_provider")` before writing one.
Full reference: `docs/context_providers.md`.

## Config (`.env`)
```
NAVI_PERSONA="..."          # global personality + tool writing rules
OLLAMA_HOST=...
OLLAMA_DEFAULT_MODEL=gemma4:e2b-it-q8_0
OLLAMA_NUM_CTX=8192
OLLAMA_THINK=true
```

## Manuals (`manuals/`)
Markdown files, one per tool. `tool_manual` serves them on demand.
Currently: `manuals/write_tool.md` (full format reference + working example).
Auto-generation fallback from tool schema if no .md exists.

## Important patterns
- `write_tool` validates code before writing (checks for 4 required definitions)
- `write_tool` adds tool to `tools/enabled.json` on success → available in all profiles
- New tool available from the **next** user message (tool schemas built at `run_stream()` entry)
- Navi should call `tool_manual("write_tool")` before writing a tool
- Navi should call `list_tools` when asked about her capabilities (not generate from memory)
- `no-store` cache middleware on `/static/` — safe to hard-refresh during development

## Documentation

Detailed reference is in `docs/`. Read specific files when you need depth on a subsystem:

| File | Covers |
|---|---|
| `docs/agent.md` | Agent loop, 3-phase planning, all thinking mechanics flags |
| `docs/profiles.md` | Profile fields, config flags, how to add a profile |
| `docs/tools.md` | Built-in tools, user tool format, hot-reload |
| `docs/sessions.md` | Session model, dual-buffer, context compression, debug endpoints |
| `docs/websocket.md` | Full WS protocol — all events, reconnect/replay, stop mechanism |
| `docs/memory.md` | Long-term memory — facts, extraction, search |
| `docs/api.md` | All REST + WS endpoints with request/response schemas |
| `docs/config.md` | All `.env` variables with types and defaults |
| `docs/context_providers.md` | Context providers — dynamic system message injection |
| `docs/android-client.md` | Android app — architecture, build/deploy, WebView config, platform detection |
| `docs/architecture.md` | Component diagram, data flow, registry wiring |

## What works well
- Hot-reload without server restart
- Thinking display in client
- Self-extension via `write_tool` (improving — model still sometimes struggles with format)
- Session persistence, URL-based navigation

## Known friction
- Small model (e2b) sometimes writes tools in wrong format despite detailed instructions
- `tool_manual` + explicit format feedback in `write_tool` errors is the current mitigation
- Navi tends to hallucinate tool lists — `list_tools` fixes this if she uses it