diff --git a/NAVI.md b/NAVI.md
new file mode 100644
index 0000000..c3a9d19
--- /dev/null
+++ b/NAVI.md
@@ -0,0 +1,60 @@
+# NAVI — Project Context
+
+Personal modular AI agent system. FastAPI backend + Ollama LLM + Vue webclient.
+
+## Server
+
+```bash
+.venv/bin/uvicorn navi.main:app --reload --reload-dir navi --port 8000
+```
+
+- UI: `http://localhost:8000`
+- Debug panel: `http://localhost:8000/debug`
+- Default model: `gemma4:26b-a4b-it-q4_K_M` (26B, Q4)
+
+## Key paths
+
+| Path | What |
+|---|---|
+| `navi/core/agent.py` | Agent loop, planning, tool execution |
+| `navi/profiles/` | Profile definitions (`secretary`, `server_admin`, `developer`) |
+| `navi/api/websocket.py` | WebSocket handler + event replay |
+| `tools/` | User tools (auto-loaded at startup) |
+| `tools/enabled.json` | Tools enabled across all profiles |
+| `persona.txt` | Global persona injected into every profile |
+| `navi.db` | SQLite session store |
+| `workspace/` | Persistent working files |
+| `manuals/` | Tool manuals (served by `tool_manual`) |
+
+## Documentation
+
+Detailed reference is in `docs/`. Query a specific file when you need depth:
+
+```
+filesystem(action="query", path="docs/<file>.md", question="your question")
+```
+
+| File | Covers |
+|---|---|
+| `docs/agent.md` | Agent loop, 3-phase planning, thinking mechanics flags |
+| `docs/profiles.md` | Profile fields, all config flags, how to add a profile |
+| `docs/tools.md` | Built-in tools, user tool format, hot-reload |
+| `docs/sessions.md` | Session model, dual-buffer, context compression |
+| `docs/websocket.md` | WebSocket protocol, all event types |
+| `docs/memory.md` | Long-term memory system |
+| `docs/api.md` | REST API endpoints |
+| `docs/config.md` | All `.env` variables |
+
+## Tool manuals
+
+For detailed usage of any tool:
+
+```
+tool_manual("tool_name")
+```
+
+Manuals exist for: `write_tool`, `spawn_agent`, `reflect`, `gmail`, `share_file`.
+
+## Extending Navi
+
+To add a new tool: `tool_manual("write_tool")` — full format reference + working example.
diff --git a/docs/agent.md b/docs/agent.md
index a1db0b8..2860730 100644
--- a/docs/agent.md
+++ b/docs/agent.md
@@ -1,36 +1,66 @@
 # Agent Loop
 
-The agent loop is the core execution engine. File: `navi/core/agent.py`.
+Core execution engine. File: `navi/core/agent.py`.
 
-## Three entry points
-
-### `run(session_id, user_message)` → `str`
-Non-streaming. Runs the full tool-calling loop and returns the final text. Used for REST endpoints or background tasks where streaming is not needed. No planning phase.
+## Entry points
 
 ### `run_stream(session_id, user_message)` → `AsyncGenerator[AgentEvent]`
-Streaming. Yields `AgentEvent` objects in real time. Used by the WebSocket handler. Includes planning phase.
+Streaming. Yields `AgentEvent` objects in real time. Used by the WebSocket handler. Runs the planning phase if `profile.planning_enabled = True`.
+
+### `run(session_id, user_message)` → `str`
+Non-streaming. Full tool-calling loop, returns final text. No planning phase.
 
 ### `run_ephemeral(user_message, profile_id)` → `str`
-Non-persistent subagent. No DB reads/writes. Uses a temporary in-memory context. Called by `SpawnAgentTool`. Assigns a unique session ID (`subagent_<uuid12>`) to isolate its scratchpad from the parent and from other subagents.
+Non-persistent subagent. No DB reads/writes. Temporary in-memory context. Called by `SpawnAgentTool`. Uses session ID `subagent_<uuid12>` to isolate scratchpad.
 
 ---
 
 ## Planning phase (`_run_planning`)
 
-Runs only when `profile.planning_enabled = True`, before the tool-calling loop.
+Runs before the tool loop when `profile.planning_enabled = True`.
 
-**What it does:**
-1. Sends the user request to the LLM with a special system prompt: "decide if this needs a plan".
-2. LLM either responds `DIRECT` (skip planning) or produces a numbered step list.
-3. If a real plan is returned, it's injected into `session.context` as an assistant message — the model then sees it as its own prior statement and naturally continues from it.
-4. Yields `PlanReady(plan)` event → rendered as a collapsible card in the UI.
+### Phase 1 — Analysis
+LLM receives the user request with a classification prompt. Outputs:
+- `DIRECT` → skip planning entirely (simple request).
+- A structured analysis + `REFLECT: yes/no` → continue to Phase 2 or 3.
 
-**Detection logic:**
-- Response starts with `DIRECT` → skip (no plan needed).
-- No numbered steps found (regex `^\s*\d+[\.\)]`) → skip (malformed response).
-- Otherwise → inject plan, emit `PlanReady`.
+### Phase 2 — Multi-perspective review (conditional)
+Runs only when `profile.planning_reflect_enabled = True` AND Phase 1 outputs `REFLECT: yes`.
+Three advisors run in parallel (via `asyncio.gather`), each receiving the full chat context and the Phase 1 analysis:
+- **Critic** — what could go wrong
+- **Pragmatist** — simpler/more direct path
+- **Detailer** — missing requirements
 
-**Parameters:** `think=False`, `temperature=0.3`, no tools → fast and structured.
+Advisor feedback is embedded into the Phase 3 prompt.
+
+### Phase 3 — Execution plan
+LLM produces a numbered step list. Each step is assigned an executor:
+- `TOOL: tool_name` — single tool call
+- `AGENT: profile_id` — delegated to a subagent via `spawn_agent`
+- `SELF` — handled inline (synthesis, context-dependent action)
+
+**Comma test (enforced in prompt):** if a step description lists multiple things with "and" or commas, each item must be a separate step.
+
+The plan is injected into `session.context` as an assistant message and saved to `session.messages` with `is_plan=True` for UI rendering. The todo list is auto-populated from the plan steps.
+
+---
+
+## Thinking mechanics
+
+All flags live on `AgentProfile` and can be set per-profile in `config.json`.
+
+| Flag | Default | What it does |
+|---|---|---|
+| `think_enabled` | `true` | Passes `think=True` to LLM on every main-loop call (extended reasoning) |
+| `iteration_budget_enabled` | `true` | Injects remaining iteration count into context so model wraps up in time |
+| `planning_reflect_enabled` | `false` | Enables Phase 2 advisor review (3 parallel LLM calls, adds latency) |
+| `goal_anchoring_enabled` | `true` | Injects goal-reminder system message every N iterations |
+| `goal_anchoring_interval` | `5` | N for goal anchoring |
+| `anti_stall_enabled` | `true` | Detects looping without todo progress and injects a warning |
+| `anti_stall_threshold` | `8` | Consecutive iterations without progress before warning fires |
+| `step_validation_enabled` | `false` | Blocks marking a todo step `done` without a `validation` field |
+| `adaptive_replan_enabled` | `false` | When a step is marked `failed`, queues a re-plan prompt for the next iteration |
+| `subagent_planning_enabled` | `false` | Subagents run their own planning phase |
 
 ---
 
@@ -39,72 +69,41 @@
 Runs up to `profile.max_iterations` times.
 
 ```
-iteration:
-  1. Check stop_event → yield StreamStopped and return if set
-  2. Call llm.stream_complete(context, tool_schemas)
-     - Yields ThinkingDelta events during reasoning
-     - Yields TextDelta events during text generation
-     - Final chunk carries tool_calls or finish_reason="stop"
-  3a. finish_reason == "stop" (no tool calls):
-       → Save session, yield StreamEnd
-       → Run post-turn workers (e.g. context compression)
-       → Return
-  3b. tool_calls present:
-       For each tool call:
-         - yield ToolStarted (pending card in UI)
-         - Create asyncio.Task for tool execution
-         - Set current_event_sink to a fresh Queue
-         - Drain the queue (receives subagent events in real time)
-         - yield ToolEvent (completed card in UI)
-         - Append tool result to session.context
-       Check if profile switched → reload profile + tools
-       Continue to next iteration
+Each iteration:
+  1. Check stop_event → yield StreamStopped if set
+  2. Build context: _build_context() injects iteration budget and goal anchor (if due)
+  3. Check anti-stall: if stalled, append warning message to context
+  4. Inject queued adaptive re-plan message (if a step failed last iteration)
+  5. llm.stream_complete(context, tool_schemas)
+     → ThinkingDelta/ThinkingEnd events during reasoning
+     → TextDelta events during text generation
+  6a. No tool calls → save session, yield StreamEnd, run workers, return
+  6b. Tool calls → execute each, yield ToolEvent, append results to context
+  7. Update anti-stall counters, detect newly-failed todo steps
+  8. Check if profile switched → reload profile + tools
 ```
 
 ### Sub-agent event forwarding
-
-When a tool (e.g. `spawn_agent`) runs a subagent internally, subagent events arrive through `current_event_sink`. The parent agent drains that queue while the tool task runs, yielding subagent `ToolStarted`/`ToolEvent` events marked with `is_subagent=True`.
+When `spawn_agent` runs a subagent, its events arrive through `current_event_sink`. The parent drains the queue in real time, yielding subagent events marked with `is_subagent=True`.
 
 ### Cooperative stop
-
-Stop is signalled via `current_stop_event` (an `asyncio.Event`). The agent checks it:
-- Before each LLM call
-- During streaming (breaks out of the stream loop → calls `aclose()` on generator → Ollama closes gracefully, model stays in VRAM)
-- After tool execution
-
-**Never use `task.cancel()`** for stopping — it corrupts Starlette's WebSocket state.
+Stop is signalled via `current_stop_event` (an `asyncio.Event`). Checked before each LLM call, during streaming, and after tool execution. Never use `task.cancel()` — it corrupts WebSocket state.
 
 ---
 
-## Workers (`_run_workers`)
+## Workers
 
-Workers run sequentially after `StreamEnd`. Each receives a `WorkerContext` with session state, token counts, and LLM access.
+Run sequentially after `StreamEnd`. Currently: `CompressionWorker`.
 
-Currently registered worker: `CompressionWorker` (`navi/workers/compressor.py`).
-
-Worker result: `WorkerResult.events` — list of `AgentEvent` objects that are yielded after `StreamEnd`.
-
-Pre-turn compression also exists: before calling the LLM, `run_stream()` checks if `session.context_token_count` is over the threshold and compresses proactively.
-
-See [`sessions.md`](sessions.md) for compression details.
+Pre-turn compression also runs at the start of `run_stream()` if `session.context_token_count` exceeds the threshold. See [`sessions.md`](sessions.md).
 
 ---
 
-## System prompt construction
+## System prompt construction (`_build_context`)
 
-Each LLM call uses `_build_context()`, which injects:
-1. System message: `persona + "---" + profile.system_prompt` (built fresh every call, never stored in session.context).
-2. Optional memory message: `"## What I remember about the user\n\n{summary}"`.
-3. Conversation messages from `session.context` (system messages stripped to avoid duplication).
+Every LLM call receives:
+1. System message: `persona + "---" + profile.system_prompt` (injected fresh, never stored).
+2. Optional memory message: `"## What I remember about the user\n..."`.
+3. `session.context` messages (system messages stripped to avoid duplication).
 
-This means profile switches and persona changes take effect immediately without modifying stored history.
-
----
-
-## Context vars set by Agent
-
-Before each `run_stream()` call: `current_session_id.set(session_id)`.  
-Before each tool task: `current_event_sink.set(sink_queue)`.  
-`run_ephemeral()` sets `current_session_id` to a unique subagent ID.
-
-See [`architecture.md`](architecture.md) for the full ContextVar table.
+Profile switches and persona changes take effect immediately.
diff --git a/docs/profiles.md b/docs/profiles.md
index 027eb74..a73ca4c 100644
--- a/docs/profiles.md
+++ b/docs/profiles.md
@@ -4,114 +4,112 @@
 
 ## Profile definition (`navi/profiles/base.py`)
 
+Each profile is loaded from a directory under `navi/profiles/<id>/`:
+- `config.json` — all fields below
+- `system_prompt.txt` — domain-specific instructions
+- `subagent_system_prompt.txt` — injected into subagents spawned from this profile (optional)
+
 ```python
 @dataclass
 class AgentProfile:
-    id: str                          # unique identifier (used in API, sessions, switch_profile)
-    name: str                        # human-readable name
-    description: str                 # shown in profile selector
-    system_prompt: str               # domain-specific instructions
-    enabled_tools: list[str]         # tool names available to this profile
-    model: str = "..."               # Ollama model to use
+    id: str                        # unique identifier
+    name: str
+    description: str
+    system_prompt: str             # loaded from system_prompt.txt
+    enabled_tools: list[str]       # tools available in the main loop
+    llm_backend: str = "ollama"
+    model: str = "gemma4:26b-a4b-it-q4_K_M"
+    max_iterations: int = 10
     temperature: float = 0.7
-    max_iterations: int = 50
-    planning_enabled: bool = False   # whether to run the planning phase before the loop
-    llm_backend: str = "ollama"      # backend key in BackendRegistry
+    planning_enabled: bool = False
+    short_description: str = ""    # 1-line summary shown to all profiles
+    full_description: dict = {}    # keys: specialization, when_to_use, key_tools
+
+    # Thinking mechanics (see docs/agent.md for details)
+    think_enabled: bool = True
+    iteration_budget_enabled: bool = True
+    planning_reflect_enabled: bool = False
+    goal_anchoring_enabled: bool = True
+    goal_anchoring_interval: int = 5
+    anti_stall_enabled: bool = True
+    anti_stall_threshold: int = 8
+    step_validation_enabled: bool = False
+    adaptive_replan_enabled: bool = False
+
+    # Sub-agent configuration
+    subagent_tools: list[str] = []
+    subagent_planning_enabled: bool = False
+    subagent_system_prompt: str = ""  # loaded from subagent_system_prompt.txt
 ```
 
-## Built-in profiles
-
-| Profile ID | Name | Model | Temperature | Planning |
-|---|---|---|---|---|
-| `secretary` | Personal Secretary | gemma4:26b-a4b-it-q4_K_M | 0.7 | Yes |
-| `server_admin` | Server Administrator | gemma4:26b-a4b-it-q4_K_M | 0.2 | Yes |
-| `smart_home` | Smart Home Assistant | gemma4:26b-a4b-it-q4_K_M | 0.3 | Yes |
-
-All profiles have the same base tool set:
-```
-todo, scratchpad, switch_profile,
-web_search, web_view, http_request,
-filesystem, code_exec, terminal, ssh_exec, image_view,
-memory_search, memory_forget,
-reload_tools, write_tool, list_tools, tool_manual,
-spawn_agent
-```
-
-User tools from `tools/enabled.json` are merged in on top of the profile's `enabled_tools` list.
-
-## System prompt construction
-
-The final system prompt the LLM sees is:
-
-```
-{NAVI_PERSONA}
-
 ---
 
-{profile.system_prompt}
+## Active profiles
+
+| ID | Name | Model | Temp | Planning |
+|---|---|---|---|---|
+| `secretary` | Personal Secretary | gemma4:26b-a4b-it-q4_K_M | 0.7 | Yes |
+| `server_admin` | Server Administrator | gemma4:26b-a4b-it-q4_K_M | 0.2 | Yes |
+| `developer` | Tool Developer | gemma4:26b-a4b-it-q4_K_M | 0.2 | Yes |
+
+All profiles share a base tool set. User tools from `tools/enabled.json` are merged in at runtime.
+
+---
+
+## System prompt construction
+
+The LLM sees (injected fresh on every call, never stored in session):
+
+```
+{persona.txt content}
+
+---
+
+{profile system_prompt.txt content}
 ```
 
-`NAVI_PERSONA` is the global personality layer — loaded from `settings.navi_persona` or `settings.navi_persona_file`. It contains: personality, self-extension rules, planning/scratchpad instructions, delegation guidance, memory rules.
+`persona.txt` — global layer: personality, CONTEXT FIRST principle, self-extension rules, scratchpad/todo/memory instructions, delegation rules.
 
-`profile.system_prompt` is the domain-specific layer: tool priorities, workflow rules, scratchpad section names, safety rules for that domain.
+`system_prompt.txt` — domain layer: tool priorities, workflow, safety rules for this profile.
 
-The system message is **never stored** in `session.context`. It is injected fresh on every LLM call. Profile switches take effect immediately.
+---
 
-## Adding a new profile
+## Adding a profile
 
-1. Create `navi/profiles/my_profile.py`:
-```python
-from .base import AgentProfile
-
-my_profile = AgentProfile(
-    id="my_profile",
-    name="My Profile",
-    description="What this profile is for.",
-    system_prompt="""Mode: ...
-
-## Tool priorities
-1. web_search — ...
-2. filesystem — ...
-
-## Scratchpad sections
-- findings, errors, plan
-
-## Safety rules
-...""",
-    enabled_tools=[
-        "todo", "scratchpad", "switch_profile",
-        "web_search", "filesystem", "code_exec",
-        "memory_search", "memory_forget",
-        "reload_tools", "write_tool", "list_tools", "tool_manual",
-        "spawn_agent",
-    ],
-    model="gemma4:26b-a4b-it-q4_K_M",
-    temperature=0.5,
-    planning_enabled=True,
-)
+1. Create directory `navi/profiles/my_profile/`
+2. Add `config.json`:
+```json
+{
+  "id": "my_profile",
+  "name": "My Profile",
+  "description": "...",
+  "short_description": "...",
+  "model": "gemma4:26b-a4b-it-q4_K_M",
+  "temperature": 0.5,
+  "max_iterations": 30,
+  "planning_enabled": true,
+  "think_enabled": true,
+  "iteration_budget_enabled": true,
+  "planning_reflect_enabled": false,
+  "goal_anchoring_enabled": true,
+  "goal_anchoring_interval": 5,
+  "anti_stall_enabled": true,
+  "anti_stall_threshold": 8,
+  "step_validation_enabled": false,
+  "adaptive_replan_enabled": false,
+  "subagent_planning_enabled": false,
+  "subagent_tools": ["todo", "filesystem", "terminal"],
+  "enabled_tools": ["todo", "scratchpad", "web_search", "filesystem"]
+}
 ```
+3. Add `system_prompt.txt` with domain-specific instructions.
+4. Optionally add `subagent_system_prompt.txt`.
+5. The profile is auto-discovered at startup — no registration needed.
 
-2. Register it in `navi/profiles/__init__.py`:
-```python
-from .my_profile import my_profile
-ALL_PROFILES = [..., my_profile]
-```
+---
 
 ## Profile switching
 
-`switch_profile` tool updates `session.profile_id` in the DB. In `run_stream()`, after each tool execution batch, the agent checks if `session.profile_id` changed and reloads profile + tools. The switch takes full effect on the next LLM call within the same run.
+`switch_profile` tool updates `session.profile_id` in the DB. After each tool execution batch, `run_stream()` checks for a profile change and reloads profile + tools. Takes effect on the next LLM call.
 
-Rules from the persona:
-- Don't switch for a single off-topic question.
-- Switch when the session domain clearly changes (e.g. coding → server admin).
-- Never switch back and forth repeatedly in one conversation.
-
-## Per-profile scratchpad sections
-
-Each profile defines named sections appropriate for its domain:
-
-| Profile | Scratchpad sections |
-|---|---|
-| secretary | `findings`, `sources`, `drafts` |
-| server_admin | `status`, `logs`, `errors`, `plan` |
-| smart_home | `state`, `config`, `errors` |
+Rules (in persona): don't switch for a single off-topic question; switch when the domain clearly changes; never switch back and forth repeatedly.