diff --git a/docs/memory.md b/docs/memory.md
index 26fccd0..be818d8 100644
--- a/docs/memory.md
+++ b/docs/memory.md
@@ -1,6 +1,6 @@
 # Memory System
 
-Long-term user memory: facts extracted from conversations, stored in SQLite, injected into every session.
+Long-term user memory: facts extracted from conversations and tool executions, stored in PostgreSQL (with SQLite as a fallback), and injected into every session.
 
 ## PostgreSQL + pgvector (semantic search)
 
@@ -11,7 +11,19 @@
 | Storage | File-based | Server |
 | Semantic search | No | Yes (cosine distance on `vector(768)`) |
 | Embeddings | None | Generated via Ollama (`nomic-embed-text:latest`) |
-| Metadata | `category, key, value` | `+ source, confidence, expires_at, source_context` |
+| Metadata | `category, key, value` | `+ embedding, source, confidence, expires_at, source_context` |
+
+### Dedicated embedding backend
+
+The chat LLM backend (Ollama Cloud, OpenAI, etc.) and the embedding backend are **separate**.
+
+```
+# .env
+EMBEDDING_OLLAMA_HOST=http://192.168.1.168:11434  # local CPU server
+EMBEDDING_OLLAMA_API_KEY=
+```
+
+If `EMBEDDING_OLLAMA_HOST` is empty, memory falls back to the main `OLLAMA_HOST`.
 
 ---
 
@@ -32,6 +44,20 @@
 
 ---
 
+## Backfill embeddings
+
+After enabling pgvector, existing facts still have no embedding (`embedding IS NULL`). Run the backfill script to generate embeddings for them:
+
+```bash
+.venv/bin/python navi/memory/backfill_embeddings.py
+```
+
+- Processes facts in batches of 8
+- Sleeps 2 seconds between batches (rate-limit safety)
+- Safe to run multiple times: it only touches rows without embeddings
+
+---
+
 ## Storage (`navi/memory/store.py`)
 
 Three tables in the database:
@@ -48,14 +74,15 @@
 
 | Method | Description |
 |---|---|
-| `upsert_fact(category, key, value)` | Insert or update a fact |
-| `search_facts(query, limit=15)` | Full-text search across category/key/value (OR across terms) |
+| `upsert_fact(...)` | Insert or update a fact (generates an embedding when pgvector and the embedding backend are available) |
+| `search_facts(query, limit=15)` | **Vector search first** (cosine distance, cutoff 0.3), then ILIKE fallback |
 | `delete_fact(key, category=None)` | Delete by key, optionally filtered by category |
 | `get_all_facts(limit=None)` | All facts ordered by `(category, updated_at DESC)` |
 | `get_summary()` | Current narrative summary text |
 | `set_summary(content)` | Replace the summary |
 | `mark_session_extracted(session_id)` | Record extraction timestamp |
 | `get_extracted_at(session_id)` | Check if/when a session was processed |
+| `backfill_embeddings(batch_size=8)` | Generate embeddings for facts with `embedding IS NULL` |
 
 ---
 
@@ -68,12 +95,18 @@
 **Stale criterion:** `session.last_active < now - 30 minutes` AND not yet extracted (or extracted before last activity).
 
 **Extraction process:**
-1. Render conversation as plain text.
-2. Call LLM with an extraction prompt: "extract facts the user shared about themselves, their preferences, projects, and environment."
-3. Parse the response as `category: key = value` lines.
-4. Upsert each fact into `memory_facts`.
-5. Regenerate `memory_summary` from all current facts.
-6. Mark session as extracted.
+1. Render conversation as plain text, including:
+   - User messages
+   - Assistant messages
+   - `[Tool call] tool_name(args)` lines
+   - `[Tool result] tool_name: output` lines (truncated to 500 chars)
+2. Truncate the overall transcript to 12,000 characters (keep head + tail, drop middle).
+3. Call the LLM with the extraction prompt: "extract stable facts about the user."
+4. Parse the response as a JSON array of objects with fields `category`, `key`, `value`, `source`, `source_context`.
+5. Map `source` to confidence: `tool_call`/`auto_discovery` → 95, `user_explicit` → 90, default → 70.
+6. Upsert each fact into `memory_facts`.
+7. Regenerate `memory_summary` from all current facts.
+8. Mark session as extracted.
 
 ---
 
@@ -93,11 +126,20 @@
 
 ---
 
-## Memory tools
+## Memory tool (`navi/tools/memory.py`)
 
-**`memory_search`** — searches facts by keyword query. Returns matching facts with category/key/value. Agent should call this when the user mentions something personal that may already be known.
+A single unified `memory` tool, dispatched on its `action` parameter:
 
-**`memory_forget`** — deletes facts matching a key (optionally filtered by category). Agent calls this when the user explicitly asks to forget something or when a fact is clearly outdated.
+| Action | Description |
+|---|---|
+| `save` | Upsert a fact with `category`, `key`, `value`, `source`, `confidence`, `expires_days`, `source_context` |
+| `search` | Find facts by keyword query (vector search → ILIKE fallback) |
+| `forget` | Delete a fact by `key` (optionally filtered by `category`) |
+| `list` | Return all stored facts |
+
+**Sources:** `conversation`, `tool_call`, `auto_discovery`, `user_explicit`
+
+**Confidence:** 0-100. Tool output = 95, user statement = 80, web = 50, guess = 30.
 
 ---
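
Reviewer note, not part of the patch: two steps of the extraction process documented above (the head + tail transcript truncation and the source-to-confidence mapping) might look roughly like the sketch below. The function names, the even head/tail split, and the elision marker are assumptions made for illustration; the diff only states the 12,000-character limit and the confidence values, and the actual code in `navi/memory` may differ.

```python
from typing import Optional

MAX_TRANSCRIPT_CHARS = 12_000  # limit stated in the extraction steps
ELISION_MARKER = "\n[... truncated ...]\n"  # hypothetical marker, not from the source

# Source-to-confidence mapping as listed in the extraction steps.
SOURCE_CONFIDENCE = {"tool_call": 95, "auto_discovery": 95, "user_explicit": 90}
DEFAULT_CONFIDENCE = 70


def truncate_transcript(text: str, limit: int = MAX_TRANSCRIPT_CHARS) -> str:
    """Keep the head and tail of an over-long transcript and drop the middle."""
    if len(text) <= limit:
        return text
    budget = limit - len(ELISION_MARKER)
    if budget <= 0:
        # Limit too small to fit the marker: fall back to a plain head cut.
        return text[:limit]
    # Assumption: split the remaining budget evenly between head and tail.
    head_len = (budget + 1) // 2
    tail_len = budget // 2
    tail = text[-tail_len:] if tail_len else ""
    return text[:head_len] + ELISION_MARKER + tail


def confidence_for(source: Optional[str]) -> int:
    """Return the confidence for a fact's source, or the default if unknown."""
    return SOURCE_CONFIDENCE.get(source, DEFAULT_CONFIDENCE)
```

With this split, the marker counts toward the limit, so the result never exceeds `limit` characters. Whether the real implementation weights the head and tail equally is not stated in the diff.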