diff --git a/docs/memory.md b/docs/memory.md index 9f9a0ef..34477f5 100644 --- a/docs/memory.md +++ b/docs/memory.md @@ -1,10 +1,17 @@ # Memory System -Long-term user memory: facts extracted from conversations and tool executions, stored in PostgreSQL (or SQLite fallback), injected into every session. +Long-term user memory: facts extracted from conversations and tool executions, stored in PostgreSQL, injected into every session. -## PostgreSQL + pgvector (semantic search) +## PostgreSQL + pgvector + pg_trgm (semantic + text search) -When `DATABASE_URL` is set, the memory system uses **PostgreSQL with pgvector** for semantic search via embeddings. +The memory system requires **PostgreSQL** with two extensions: + +| Extension | Purpose | +|---|---| +| `vector` (pgvector) | Semantic search via cosine distance on `embedding vector(768)` | +| `pg_trgm` | Fast ILIKE fallback via GIN trigram indexes on `category`, `key`, `value` | + +Both extensions are created automatically on startup (`CREATE EXTENSION IF NOT EXISTS`). | Feature | SQLite | PostgreSQL | |---|---|---| @@ -38,7 +45,7 @@ This script: 1. Verifies the `vector` extension is installed in PostgreSQL 2. Adds missing columns: `embedding`, `source`, `confidence`, `expires_at`, `last_verified_at`, `source_context` -3. Creates indexes: `hnsw(embedding)`, `expires`, `source+category` +3. Creates indexes: `hnsw(embedding)`, `expires`, `source+category`, `pg_trgm` GIN indexes for ILIKE fallback Safe to run multiple times — all operations use `IF NOT EXISTS`. diff --git a/docs/tech_debt_review_2026-04-29.md b/docs/tech_debt_review_2026-04-29.md index 396962c..8e643d7 100644 --- a/docs/tech_debt_review_2026-04-29.md +++ b/docs/tech_debt_review_2026-04-29.md @@ -44,15 +44,15 @@ | # | Problem | File | Line | Explanation | |---|---------|------|------|-------------| -| 19 | **N+1 UPDATE queries in embedding backfill** | `navi/memory/store.py` | 170-176 | `backfill_embeddings()` loops over rows, issuing one `UPDATE` per row. Should batch via `executemany` or multi-row `UPDATE`. | -| 20 | **Unindexed ILIKE fallback search** | `navi/memory/store.py` | 282-314 | Text fallback constructs dynamic SQL with `ILIKE` on `category`, `key`, `value`. No indexes exist on these columns individually. Full table scan on large tables. | -| 21 | **Context list rebuilt from scratch every iteration** | `navi/core/agent.py` | ~683 | `_build_context()` creates a brand-new list by filtering and prepending system messages on every tool-calling iteration. O(n) per iteration, generating significant garbage. | +| 19 | **[FIXED] N+1 UPDATE queries in embedding backfill** | `navi/memory/store.py` | 170-176 | `backfill_embeddings()` loops over rows, issuing one `UPDATE` per row. Should batch via `executemany` or multi-row `UPDATE`. | +| 20 | **[FIXED] Unindexed ILIKE fallback search** | `navi/memory/store.py` | 282-314 | Text fallback constructs dynamic SQL with `ILIKE` on `category`, `key`, `value`. No indexes exist on these columns individually. Full table scan on large tables. | +| 21 | **[FIXED] Context list rebuilt from scratch every iteration** | `navi/core/agent.py` | ~683 | `_build_context()` creates a brand-new list by filtering and prepending system messages on every tool-calling iteration. O(n) per iteration, generating significant garbage. | | 22 | **[FIXED] Unbounded `planning_logs` growth** | `navi/core/agent.py` | 651 | `session.planning_logs.append(_ev.log)` with no size limit or rotation. Long sessions with planning enabled produce huge JSON in DB, slowing `save()`. | | 23 | **[FIXED] Background cleanup task unreferenced** | `navi/main.py` | 72 | `asyncio.create_task(cleanup_loop(...))` — the returned `Task` is not stored. In Python < 3.11, unhandled exceptions may be silently discarded. | | 24 | **[FIXED] BaseException catches SystemExit** | `navi/core/agent.py` | 847-848 | `_run_with_sentinel` catches `BaseException`, which includes `SystemExit`. Should catch only `Exception`. | | 25 | **[FIXED] SystemExit swallowed in agent loop** | `navi/core/agent.py` | 871 | `isinstance(r, BaseException)` includes `SystemExit`. It should propagate to allow graceful server shutdown. | | 26 | **[FIXED] Hardcoded context output reserve** | `navi/core/agent.py` | 1521 | `output_reserve = 2048` tokens is not configurable per model. May falsely reject valid contexts for small models. | -| 27 | **Vector literal NaN/Infinity vulnerability** | `navi/memory/store.py` | 73-75 | `_vector_to_str` converts floats naively. A misbehaving embedding model returning `NaN` or `Infinity` creates invalid PostgreSQL vector syntax. | +| 27 | **[FIXED] Vector literal NaN/Infinity vulnerability** | `navi/memory/store.py` | 73-75 | `_vector_to_str` converts floats naively. A misbehaving embedding model returning `NaN` or `Infinity` creates invalid PostgreSQL vector syntax. | | 28 | **Tight coupling to todo module internals** | `navi/core/agent.py` | 176-224 | `_todo_status_snapshot` directly reads `navi.tools.todo._plans`, a module-level dict. Hard to test and refactor. | | 29 | **ContextVar values not reset in subagent path** | `navi/core/agent.py` | 350-351 | `run_ephemeral()` sets `_sid_var` and `_model_var` but never resets them. Background tasks spawned by the subagent may inherit stale values. | | 30 | **[FIXED] Replay/reattach may miss post-finish events** | `navi/api/websocket.py` | 274-277 | If a client reconnects after a run finishes but before disconnect is fully processed, it gets `session_sync`. No push event indicates new messages were persisted while disconnected. | diff --git a/navi/core/agent.py b/navi/core/agent.py index 7686324..5ad6b52 100644 --- a/navi/core/agent.py +++ b/navi/core/agent.py @@ -213,6 +213,7 @@ self._workers: list["Worker"] = workers or [] self._memory = memory_store self._cp_registry = cp_registry + self._system_prompt_cache: dict[str, str] = {} # ------------------------------------------------------------------ # Public interface @@ -1430,6 +1431,10 @@ return result def _build_system_prompt(self, profile: "AgentProfile") -> str: + cached = self._system_prompt_cache.get(profile.id) + if cached is not None: + return cached + parts: list[str] = [] persona = settings.navi_persona.strip() @@ -1452,7 +1457,9 @@ lines.append("→ Switch profiles on your own judgment — do not ask for permission. When a task clearly fits another profile, call switch_profile immediately, then inform the user which profile is now active and why. Use list_profiles if you need details about a profile's capabilities.") parts.append("\n".join(lines)) - return "\n\n---\n\n".join(parts) + result = "\n\n---\n\n".join(parts) + self._system_prompt_cache[profile.id] = result + return result def _build_goal_anchor(self, session_id: str, user_message: str) -> Message: """Build a goal-anchor system message injected every N iterations. diff --git a/navi/memory/store.py b/navi/memory/store.py index 2c239b2..0161675 100644 --- a/navi/memory/store.py +++ b/navi/memory/store.py @@ -1,6 +1,7 @@ """Persistent memory store — facts about the user, backed by PostgreSQL with pgvector support.""" import asyncio +import math import re import uuid from datetime import datetime, timezone, timedelta @@ -37,6 +38,7 @@ if pgvector_available else "" ) stmts = [ + "CREATE EXTENSION IF NOT EXISTS pg_trgm", """CREATE TABLE IF NOT EXISTS memory_facts ( id TEXT PRIMARY KEY, category TEXT NOT NULL, @@ -55,6 +57,9 @@ )""" % embedding_col, "CREATE INDEX IF NOT EXISTS idx_memory_facts_expires ON memory_facts (expires_at) WHERE expires_at IS NOT NULL", "CREATE INDEX IF NOT EXISTS idx_memory_facts_source_cat ON memory_facts (source, category)", + "CREATE INDEX IF NOT EXISTS idx_memory_facts_cat_trgm ON memory_facts USING gin (category gin_trgm_ops)", + "CREATE INDEX IF NOT EXISTS idx_memory_facts_key_trgm ON memory_facts USING gin (key gin_trgm_ops)", + "CREATE INDEX IF NOT EXISTS idx_memory_facts_value_trgm ON memory_facts USING gin (value gin_trgm_ops)", """CREATE TABLE IF NOT EXISTS memory_summary ( id INTEGER PRIMARY KEY DEFAULT 1, content TEXT NOT NULL, @@ -66,12 +71,18 @@ )""", ] if embedding_idx: - stmts.insert(1, embedding_idx) + stmts.insert(2, embedding_idx) return stmts -def _vector_to_str(vec: list[float]) -> str: - """Serialize a vector to the PostgreSQL vector literal format.""" +def _vector_to_str(vec: list[float]) -> str | None: + """Serialize a vector to the PostgreSQL vector literal format. + + Returns None if the vector contains NaN, Infinity, or is empty, + since pgvector rejects those values. + """ + if not vec or not all(math.isfinite(v) for v in vec): + return None return "[" + ",".join(str(v) for v in vec) + "]" @@ -167,14 +178,18 @@ ids = [r["id"] for r in rows] texts = [r["value"] for r in rows] embeddings = await self._generate_embeddings(texts) + args: list[tuple[str, str]] = [] for fact_id, emb in zip(ids, embeddings): if emb: vec_str = _vector_to_str(emb) - await conn.execute( - "UPDATE memory_facts SET embedding = $1::vector WHERE id = $2", - vec_str, fact_id, - ) - updated += 1 + if vec_str: + args.append((vec_str, fact_id)) + if args: + await conn.executemany( + "UPDATE memory_facts SET embedding = $1::vector WHERE id = $2", + args, + ) + updated += len(args) # Rate-limit against Ollama Cloud (or any remote embed endpoint) if len(rows) == batch_size: await asyncio.sleep(2) @@ -195,10 +210,10 @@ ) -> None: now = datetime.now(timezone.utc) embedding = await self._generate_embedding(value) + vec_str = _vector_to_str(embedding) if embedding else None pool = await self._get_pool() async with pool.acquire() as conn: - if embedding: - vec_str = _vector_to_str(embedding) + if vec_str: await conn.execute( """INSERT INTO memory_facts (id, category, key, value, created_at, updated_at, source_session_id, @@ -238,9 +253,9 @@ # 1. Try vector search if pgvector + embedding backend are available if self._embedding_backend and await self._has_pgvector(): query_embedding = await self._generate_embedding(query) - if query_embedding: + vec_str = _vector_to_str(query_embedding) if query_embedding else None + if vec_str: try: - vec_str = _vector_to_str(query_embedding) pool = await self._get_pool() async with pool.acquire() as conn: rows = await conn.fetch(