# Memory System

Long-term user memory: facts extracted from conversations and tool executions, stored in PostgreSQL, injected into every session.

## PostgreSQL + pgvector + pg_trgm (semantic + text search)

The memory system requires **PostgreSQL** with two extensions:

| Extension | Purpose | Auto-created by app? |
|---|---|---|
| `vector` (pgvector) | Semantic search via cosine distance on `embedding vector(768)` | Yes (`CREATE EXTENSION IF NOT EXISTS`) |
| `pg_trgm` | Fast ILIKE fallback via GIN trigram indexes on `category`, `key`, `value` | **No — must be installed by DBA** |

**pgvector** is created automatically because the app typically runs with sufficient privileges on its own database. **pg_trgm** ships with PostgreSQL as a contrib module, but creating it may require superuser privileges depending on the PostgreSQL version and configuration, so the app does not try to install it. If the extension is already present, the GIN trigram indexes are created automatically; if not, the app falls back to plain ILIKE without indexes (functional, but slower on large tables).

To install pg_trgm manually:

```sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;
```

| Feature | SQLite | PostgreSQL |
|---|---|---|
| Storage | File-based | Server |
| Semantic search | No | Yes (cosine distance on `vector(768)`) |
| Embeddings | None | Generated via Ollama (`nomic-embed-text:latest`) |
| Metadata | `category, key, value` | `+ embedding, source, confidence, expires_at, source_context` |

### Dedicated embedding backend

The chat LLM backend (Ollama Cloud, OpenAI, etc.) and the embedding backend are **separate**.

```
# .env
EMBEDDING_OLLAMA_HOST=http://192.168.1.168:11434  # local CPU server
EMBEDDING_OLLAMA_API_KEY=
```

If `EMBEDDING_OLLAMA_HOST` is empty, memory falls back to the main `OLLAMA_HOST`.
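
The fallback rule can be sketched as a small resolver. This is a minimal illustration, not the project's actual code: `resolve_embedding_host` is a hypothetical helper, and treating a whitespace-only value as empty is an assumption.

```python
import os


def resolve_embedding_host(env=None):
    """Pick the Ollama host used for embeddings (hypothetical helper).

    Prefers EMBEDDING_OLLAMA_HOST and falls back to OLLAMA_HOST when the
    dedicated variable is unset or empty. Returns None if neither is set.
    """
    env = os.environ if env is None else env
    # Assumption: a whitespace-only value counts as "empty".
    dedicated = (env.get("EMBEDDING_OLLAMA_HOST") or "").strip()
    if dedicated:
        return dedicated
    return (env.get("OLLAMA_HOST") or "").strip() or None
```

Passing a plain dict instead of `os.environ` makes the rule easy to check without touching the real environment.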

---

## Schema migration

When upgrading the memory system to a new schema (e.g. adding pgvector columns), run:

```bash
.venv/bin/python navi/memory/migrate_pgvector.py
```

This script:
1. Verifies the `vector` extension is installed in PostgreSQL
2. Adds missing columns: `embedding`, `source`, `confidence`, `expires_at`, `last_verified_at`, `source_context`
3. Creates indexes: `hnsw(embedding)`, `expires`, `source+category`, `pg_trgm` GIN indexes for ILIKE fallback

Safe to run multiple times — all operations use `IF NOT EXISTS`.

---

## Backfill embeddings

After enabling pgvector, existing facts have `embedding IS NULL`. Generate embeddings for them:

```bash
.venv/bin/python navi/memory/backfill_embeddings.py
```

- Batches of 8 facts at a time
- 2-second sleep between batches (rate-limit safety)
- Safe to run multiple times — only touches rows without embeddings
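
The batching behaviour listed above could look roughly like the following in-memory sketch. It is not the contents of `backfill_embeddings.py`: the `backfill` and `batched` names, the dict-based fact shape, and the injectable `embed` and `sleep` callables are all assumptions made so the example runs without a database or an Ollama server.

```python
import time


def batched(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def backfill(facts, embed, batch_size=8, pause_seconds=2.0, sleep=time.sleep):
    """Fill in missing embeddings; return how many facts were updated.

    `facts` is a list of dicts with "value" and "embedding" keys (assumed
    shape). Only facts whose embedding is None are touched, which is what
    makes a second run a no-op.
    """
    pending = [f for f in facts if f.get("embedding") is None]
    batches = list(batched(pending, batch_size))
    updated = 0
    for index, batch in enumerate(batches):
        for fact in batch:
            fact["embedding"] = embed(fact["value"])
            updated += 1
        # Pause between batches, but not after the last one.
        if index < len(batches) - 1:
            sleep(pause_seconds)
    return updated
```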

---

## Storage (`navi/memory/store.py`)

Three tables in the database:

| Table | Purpose |
|---|---|
| `memory_facts` | Individual facts: `(user_id, category, key, value)` — unique on `(user_id, category, key)` |
| `memory_summary` | Per-user narrative summary (`user_id` scoped) |
| `session_memory_state` | Tracks which sessions have been processed (by `extracted_at`) |

`user_id` references `navi_users(id)` with `ON DELETE CASCADE`. Facts and summaries are scoped per user. An admin with the `navi.memory.read_all` permission can pass `user_id=None` to search across all users.

`MemoryStore` lazily creates tables on first async operation via `_get_pool()`. All operations are async via asyncpg (PostgreSQL).
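
Lazy initialization of this kind is commonly written as a guarded "create once" accessor. The sketch below shows the general pattern only; `LazyResource` and its `factory` argument are hypothetical stand-ins, and the real `_get_pool()` may be structured differently.

```python
import asyncio


class LazyResource:
    """Create an expensive async resource on first use, exactly once."""

    def __init__(self, factory):
        self._factory = factory      # async callable, e.g. pool + table setup
        self._resource = None
        self._lock = asyncio.Lock()  # serializes concurrent first calls

    async def get(self):
        if self._resource is None:
            async with self._lock:
                # Re-check inside the lock: another task may have finished
                # initialization while this one was waiting.
                if self._resource is None:
                    self._resource = await self._factory()
        return self._resource
```

The second check inside the lock is what keeps two concurrent first calls from running the setup twice.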

### Key operations

| Method | Description |
|---|---|
| `upsert_fact(..., user_id=None)` | Insert or update a fact scoped to user |
| `search_facts(query, user_id=None, limit=15)` | **Vector search first** (cosine distance, cutoff 0.3), then ILIKE fallback. `user_id=None` requires admin permission for global search. |
| `delete_fact(key, category=None, user_id=None)` | Delete by key, optionally filtered by category and user |
| `get_all_facts(user_id=None, all_users=False, limit=None, offset=0, search=None, sort_by="category", sort_order="desc")` | All facts ordered by `sort_by`. Pass `all_users=True` for admin global view. |
| `get_summary(user_id=None)` | Current narrative summary text for user |
| `set_summary(content, user_id=None)` | Replace the summary for user |
| `mark_session_extracted(session_id)` | Record extraction timestamp |
| `get_extracted_at(session_id)` | Check if/when a session was processed |
| `backfill_embeddings(batch_size=8)` | Generate embeddings for facts with `embedding IS NULL` |
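
To make the "vector search first, then ILIKE fallback" order in `search_facts` concrete, here is an in-memory sketch. It is not the store's SQL: the function below is a hypothetical stand-in that works on a list of dicts, and two details are assumptions, namely that the cutoff is inclusive and that the text fallback runs only when the vector pass returns nothing.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def search_facts(facts, query, query_embedding, cutoff=0.3, limit=15):
    """Vector pass first; case-insensitive substring pass as fallback."""
    scored = []
    if query_embedding is not None:
        for fact in facts:
            if fact.get("embedding") is not None:
                distance = cosine_distance(query_embedding, fact["embedding"])
                if distance <= cutoff:  # assumption: cutoff is inclusive
                    scored.append((distance, fact))
    if scored:
        scored.sort(key=lambda pair: pair[0])  # closest first
        return [fact for _, fact in scored[:limit]]
    # Fallback: emulate ILIKE '%query%' over category, key, and value.
    needle = query.lower()
    hits = [
        fact for fact in facts
        if any(needle in str(fact[col]).lower() for col in ("category", "key", "value"))
    ]
    return hits[:limit]
```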

---

## Automatic extraction (`navi/memory/extractor.py`)

Facts are extracted from stale sessions automatically.

**Trigger:** `POST /sessions` (create new session) fires `_process_stale_sessions()` as a background task.

**Stale criterion:** `session.last_active < now - 30 minutes` AND not yet extracted (or extracted before last activity).
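
The stale criterion translates almost directly into a predicate. A minimal sketch, assuming timezone-aware datetimes and a hypothetical `is_stale` helper (the real check may live in SQL or elsewhere):

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(minutes=30)


def is_stale(last_active, extracted_at, now=None):
    """True if the session is idle for 30+ minutes and still needs extraction.

    `extracted_at` is None when the session was never processed.
    """
    now = now or datetime.now(timezone.utc)
    idle_long_enough = last_active < now - STALE_AFTER
    # Never extracted, or extracted before the latest activity.
    needs_extraction = extracted_at is None or extracted_at < last_active
    return idle_long_enough and needs_extraction
```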

**Extraction process:**
1. Render conversation as plain text, including:
   - User messages
   - Assistant messages
   - `[Tool call] tool_name(args)` lines
   - `[Tool result] tool_name: output` lines (truncated to 500 chars)
2. Truncate overall transcript to 12 000 chars (keep head + tail, drop middle).
3. Call LLM with extraction prompt: "extract stable facts about the user."
4. Parse JSON array with fields: `category`, `key`, `value`, `source`, `source_context`.
5. Map confidence: `tool_call`/`auto_discovery` → 95, `user_explicit` → 90, default → 70.
6. Upsert each fact into `memory_facts`.
7. Regenerate `memory_summary` from all current facts.
8. Mark session as extracted.
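
Steps 2 and 5 above can be sketched as two small helpers. The names `truncate_middle` and `confidence_for` are hypothetical, and the even head/tail split and the `[...]` marker (counted toward the limit) are assumptions; the source only states that the head and tail are kept and the middle is dropped.

```python
MAX_TRANSCRIPT_CHARS = 12_000

# Source-to-confidence mapping from step 5; anything else defaults to 70.
CONFIDENCE_BY_SOURCE = {"tool_call": 95, "auto_discovery": 95, "user_explicit": 90}


def truncate_middle(text, limit=MAX_TRANSCRIPT_CHARS, marker="\n[...]\n"):
    """Keep the head and tail of `text`, dropping the middle if too long."""
    if len(text) <= limit:
        return text
    keep = max(limit - len(marker), 0)
    head = keep - keep // 2          # head gets the extra char if keep is odd
    tail = keep // 2
    # Slice from an explicit index so tail == 0 yields an empty string.
    return text[:head] + marker + text[len(text) - tail:]


def confidence_for(source):
    """Map a fact's source to its confidence score."""
    return CONFIDENCE_BY_SOURCE.get(source, 70)
```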

---

## Memory injection into agent context

At the start of each `run_stream()` / `run()` / `run_ephemeral()` call, `_memory_msg()` is called:

```python
async def _memory_msg(self) -> Message | None:
    summary = await self._memory.get_summary()
    if not summary:
        return None
    return Message(role="system", content=f"## What I remember about the user\n\n{summary}")
```

This message is inserted after the main system message but before conversation history. The agent reads it on every turn.
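
The resulting message order can be illustrated as follows. `build_context` is a hypothetical helper, and the `Message` dataclass here is only a stand-in for the project's own type so that the sketch is self-contained.

```python
from dataclasses import dataclass


@dataclass
class Message:  # stand-in for the project's Message type
    role: str
    content: str


def build_context(system, memory, history):
    """Order: main system message, optional memory message, then history."""
    messages = [system]
    if memory is not None:  # _memory_msg() returns None when there is no summary
        messages.append(memory)
    messages.extend(history)
    return messages
```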

---

## Memory tool (`navi/tools/memory.py`)

A single unified `memory` tool with an `action` parameter:

| Action | Description |
|---|---|
| `save` | Upsert a fact with `category`, `key`, `value`, `source`, `confidence`, `expires_days`, `source_context` |
| `search` | Find facts by keyword query (vector search → ILIKE fallback) |
| `forget` | Delete a fact by `key` (optionally filtered by `category`) |
| `list` | Return all stored facts |

**Sources:** `conversation`, `tool_call`, `auto_discovery`, `user_explicit`

**Confidence:** 0-100. Tool output = 95, user statement = 80, web = 50, guess = 30.
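
As an illustration of how `save` arguments might be checked before an upsert, the sketch below validates the source and confidence range and turns `expires_days` into an absolute timestamp. Everything here is hypothetical: the `normalize_save_args` name, the defaults (`conversation`, 70), rejecting rather than clamping bad values, and the `expires_days` to `expires_at` conversion are assumptions, not the tool's documented behaviour.

```python
from datetime import datetime, timedelta, timezone

ALLOWED_SOURCES = {"conversation", "tool_call", "auto_discovery", "user_explicit"}


def normalize_save_args(args, now=None):
    """Validate `save` arguments and compute an expiry timestamp."""
    now = now or datetime.now(timezone.utc)
    source = args.get("source", "conversation")   # assumed default
    if source not in ALLOWED_SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    confidence = int(args.get("confidence", 70))  # assumed default
    if not 0 <= confidence <= 100:
        raise ValueError("confidence must be between 0 and 100")
    expires_days = args.get("expires_days")
    # Assumption: expires_days is relative to the time of saving.
    expires_at = now + timedelta(days=expires_days) if expires_days is not None else None
    return {
        "category": args["category"],
        "key": args["key"],
        "value": args["value"],
        "source": source,
        "confidence": confidence,
        "expires_at": expires_at,
        "source_context": args.get("source_context"),
    }
```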

---

## Memory usage guidelines (from persona)

Call `memory_search` when:
- The user mentions something personal (location, project, preference, recurring task).
- You are about to make an assumption about the user's environment or preferences; verify first.
- The user asks about something you helped with before.

Do NOT call `memory_search` reflexively at the start of every session — only when context warrants it.

Call `memory_forget` only when the user explicitly asks, or when a stored fact is clearly wrong or outdated.