diff --git a/docs/memory.md b/docs/memory.md
index 26fccd0..be818d8 100644
--- a/docs/memory.md
+++ b/docs/memory.md
@@ -1,6 +1,6 @@
 # Memory System
 
-Long-term user memory: facts extracted from conversations, stored in SQLite, injected into every session.
+Long-term user memory: facts extracted from conversations and tool executions, stored in PostgreSQL (or SQLite fallback), injected into every session.
 
 ## PostgreSQL + pgvector (semantic search)
 
@@ -11,7 +11,19 @@
 | Storage | File-based | Server |
 | Semantic search | No | Yes (cosine distance on `vector(768)`) |
 | Embeddings | None | Generated via Ollama (`nomic-embed-text:latest`) |
-| Metadata | `category, key, value` | `+ source, confidence, expires_at, source_context` |
+| Metadata | `category, key, value` | `+ embedding, source, confidence, expires_at, source_context` |
+
+### Dedicated embedding backend
+
+The chat LLM backend (Ollama Cloud, OpenAI, etc.) and the embedding backend are **separate**.
+
+```
+.env
+EMBEDDING_OLLAMA_HOST=http://192.168.1.168:11434 # local CPU server
+EMBEDDING_OLLAMA_API_KEY=
+```
+
+If `EMBEDDING_OLLAMA_HOST` is empty, memory falls back to the main `OLLAMA_HOST`.
 
 ---
 
@@ -32,6 +44,20 @@
 
 ---
 
+## Backfill embeddings
+
+After enabling pgvector, existing facts have `embedding IS NULL`. Generate embeddings for them:
+
+```bash
+.venv/bin/python navi/memory/backfill_embeddings.py
+```
+
+- Batches of 8 facts at a time
+- 2-second sleep between batches (rate-limit safety)
+- Safe to run multiple times — only touches rows without embeddings
+
+---
+
 ## Storage (`navi/memory/store.py`)
 
 Three tables in the database:
@@ -48,14 +74,15 @@
 
 | Method | Description |
 |---|---|
-| `upsert_fact(category, key, value)` | Insert or update a fact |
-| `search_facts(query, limit=15)` | Full-text search across category/key/value (OR across terms) |
+| `upsert_fact(...)` | Insert or update a fact (generates embedding if pgvector + backend available) |
+| `search_facts(query, limit=15)` | **Vector search first** (cosine distance, cutoff 0.3), then ILIKE fallback |
 | `delete_fact(key, category=None)` | Delete by key, optionally filtered by category |
 | `get_all_facts(limit=None)` | All facts ordered by `(category, updated_at DESC)` |
 | `get_summary()` | Current narrative summary text |
 | `set_summary(content)` | Replace the summary |
 | `mark_session_extracted(session_id)` | Record extraction timestamp |
 | `get_extracted_at(session_id)` | Check if/when a session was processed |
+| `backfill_embeddings(batch_size=8)` | Generate embeddings for facts with `embedding IS NULL` |
 
 ---
 
@@ -68,12 +95,18 @@
 **Stale criterion:** `session.last_active < now - 30 minutes` AND not yet extracted (or extracted before last activity).
 
 **Extraction process:**
-1. Render conversation as plain text.
-2. Call LLM with an extraction prompt: "extract facts the user shared about themselves, their preferences, projects, and environment."
-3. Parse the response as `category: key = value` lines.
-4. Upsert each fact into `memory_facts`.
-5. Regenerate `memory_summary` from all current facts.
-6. Mark session as extracted.
+1. Render conversation as plain text, including:
+   - User messages
+   - Assistant messages
+   - `[Tool call] tool_name(args)` lines
+   - `[Tool result] tool_name: output` lines (truncated to 500 chars)
+2. Truncate overall transcript to 12 000 chars (keep head + tail, drop middle).
+3. Call LLM with extraction prompt: "extract stable facts about the user."
+4. Parse JSON array with fields: `category`, `key`, `value`, `source`, `source_context`.
+5. Map confidence: `tool_call`/`auto_discovery` → 95, `user_explicit` → 90, default → 70.
+6. Upsert each fact into `memory_facts`.
+7. Regenerate `memory_summary` from all current facts.
+8. Mark session as extracted.
 
 ---
 
@@ -93,11 +126,20 @@
 
 ---
 
-## Memory tools
+## Memory tool (`navi/tools/memory.py`)
 
-**`memory_search`** — searches facts by keyword query. Returns matching facts with category/key/value. Agent should call this when the user mentions something personal that may already be known.
+Single unified `memory` tool with `action` parameter:
 
-**`memory_forget`** — deletes facts matching a key (optionally filtered by category). Agent calls this when the user explicitly asks to forget something or when a fact is clearly outdated.
+| Action | Description |
+|---|---|
+| `save` | Upsert a fact with `category`, `key`, `value`, `source`, `confidence`, `expires_days`, `source_context` |
+| `search` | Find facts by keyword query (vector search → ILIKE fallback) |
+| `forget` | Delete a fact by `key` (optionally filtered by `category`) |
+| `list` | Return all stored facts |
+
+**Sources:** `conversation`, `tool_call`, `auto_discovery`, `user_explicit`
+
+**Confidence:** 0-100. Tool output = 95, user statement = 80, web = 50, guess = 30.
 
 ---
 
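
Review note: the two truncation limits and the confidence mapping described in the new "Extraction process" steps (1, 2 and 5) could look roughly like the sketch below. This is not the code in `navi/memory/`; every function and constant name here is hypothetical, and the even head/tail split and the marker text are assumptions, since the doc only says "keep head + tail, drop middle".

```python
from typing import Optional

MAX_TRANSCRIPT_CHARS = 12_000  # overall transcript cap (step 2)
MAX_TOOL_RESULT_CHARS = 500    # per tool-result cap (step 1)

# Source -> confidence mapping (step 5); unknown sources use the default.
SOURCE_CONFIDENCE = {
    "tool_call": 95,
    "auto_discovery": 95,
    "user_explicit": 90,
}
DEFAULT_CONFIDENCE = 70


def truncate_tool_result(output: str, limit: int = MAX_TOOL_RESULT_CHARS) -> str:
    """Cut a single tool result to at most `limit` characters."""
    return output[:limit]


def truncate_transcript(
    text: str,
    limit: int = MAX_TRANSCRIPT_CHARS,
    marker: str = "\n[... truncated ...]\n",  # assumed marker text
) -> str:
    """Keep the head and tail of `text`, dropping the middle if it exceeds `limit`."""
    if len(text) <= limit:
        return text
    budget = limit - len(marker)
    if budget <= 0:
        # Limit too small to fit the marker: fall back to a plain head cut.
        return text[:limit]
    head_len = (budget + 1) // 2  # assumed even split; head gets the odd char
    tail_len = budget // 2
    tail = text[-tail_len:] if tail_len else ""
    return text[:head_len] + marker + tail


def confidence_for(source: Optional[str]) -> int:
    """Map a fact's `source` to a confidence score."""
    return SOURCE_CONFIDENCE.get(source, DEFAULT_CONFIDENCE)
```

The sketch follows the extraction list, where `user_explicit` maps to 90. The "Memory tool" section gives "user statement = 80" for facts saved through the tool, so it may be worth confirming whether the two values are meant to differ.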