Long-term user memory: facts extracted from conversations and tool executions, stored in PostgreSQL, injected into every session.
The memory system requires PostgreSQL with two extensions:

| Extension | Purpose | Auto-created by app? |
|---|---|---|
| `vector` (pgvector) | Semantic search via cosine distance on `embedding vector(768)` | Yes (`CREATE EXTENSION IF NOT EXISTS`) |
| `pg_trgm` | Fast `ILIKE` fallback via GIN trigram indexes on `category`, `key`, `value` | No, must be installed by a DBA |

pgvector is created automatically because the app typically runs with sufficient privileges on its own database. `pg_trgm` is a contrib module that ships with PostgreSQL, but creating it may require superuser privileges. If it is already installed, the app creates the GIN trigram indexes automatically; if not, it falls back to plain `ILIKE` without indexes (functional but slower on large tables).
To install pg_trgm manually:

    CREATE EXTENSION IF NOT EXISTS pg_trgm;

| Feature | SQLite | PostgreSQL |
|---|---|---|
| Storage | File-based | Server |
| Semantic search | No | Yes (cosine distance on `vector(768)`) |
| Embeddings | None | Generated via Ollama (`nomic-embed-text:latest`) |
| Metadata | `category`, `key`, `value` | + `embedding`, `source`, `confidence`, `expires_at`, `source_context` |

The chat LLM backend (Ollama Cloud, OpenAI, etc.) and the embedding backend are separate.

    # .env
    EMBEDDING_OLLAMA_HOST=http://192.168.1.168:11434  # local CPU server
    EMBEDDING_OLLAMA_API_KEY=

If EMBEDDING_OLLAMA_HOST is empty, memory falls back to the main OLLAMA_HOST.
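
The fallback rule can be sketched in a few lines. This is only an illustration: `resolve_embedding_host` is a hypothetical helper name, and treating a whitespace-only value as empty is an assumption rather than documented behaviour.

```python
import os


def resolve_embedding_host(env=None):
    """Return the Ollama host to use for embeddings (hypothetical helper)."""
    env = os.environ if env is None else env
    # An unset, empty, or whitespace-only EMBEDDING_OLLAMA_HOST means
    # "reuse the main chat host" (whitespace handling is an assumption).
    host = (env.get("EMBEDDING_OLLAMA_HOST") or "").strip()
    return host or env.get("OLLAMA_HOST", "")
```

Passing a plain dict instead of `os.environ` keeps the rule easy to check in isolation.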
When upgrading the memory system to a new schema (e.g. adding pgvector columns), run:

    .venv/bin/python navi/memory/migrate_pgvector.py

This script:

- Ensures the `vector` extension is installed in PostgreSQL
- Adds columns: `embedding`, `source`, `confidence`, `expires_at`, `last_verified_at`, `source_context`
- Creates indexes: `hnsw(embedding)`, `expires`, `source+category`, and `pg_trgm` GIN indexes for the `ILIKE` fallback

Safe to run multiple times; all operations use `IF NOT EXISTS`.

After enabling pgvector, existing facts have embedding IS NULL. Generate embeddings for them:

    .venv/bin/python navi/memory/backfill_embeddings.py

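
Conceptually, a backfill selects the facts whose embedding is missing and embeds them in small batches. The sketch below works on in-memory dicts rather than the real table; the `embed` callable, the helper names, and the way a fact is turned into text are all assumptions made for illustration.

```python
def batched(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def backfill(facts, embed, batch_size=8):
    """Fill in missing embeddings in place; return how many facts were updated.

    `embed` is a hypothetical callable: list of texts -> list of vectors.
    """
    pending = [f for f in facts if f.get("embedding") is None]
    for batch in batched(pending, batch_size):
        # Assumption: a fact is embedded as "category key value".
        texts = [f"{f['category']} {f['key']} {f['value']}" for f in batch]
        for fact, vector in zip(batch, embed(texts)):
            fact["embedding"] = vector
    return len(pending)
```

Small batches keep each embedding request short, which matters when the embedding server runs on CPU.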
`navi/memory/store.py`

Three tables in the database:

| Table | Purpose |
|---|---|
| `memory_facts` | Individual facts `(user_id, category, key, value)`, unique on `(user_id, category, key)` |
| `memory_summary` | Per-user narrative summary (scoped by `user_id`) |
| `session_memory_state` | Tracks which sessions have been processed (by `extracted_at`) |

`user_id` references `navi_users(id)` with `ON DELETE CASCADE`. Facts and summaries are scoped per user. An admin with the `navi.memory.read_all` permission can pass `user_id=None` for a global search.
MemoryStore lazily creates tables on first async operation via _get_pool(). All operations are async via asyncpg (PostgreSQL).

| Method | Description |
|---|---|
| `upsert_fact(..., user_id=None)` | Insert or update a fact scoped to the user |
| `search_facts(query, user_id=None, limit=15)` | Vector search first (cosine distance, cutoff 0.3), then `ILIKE` fallback. `user_id=None` requires admin permission for global search. |
| `delete_fact(key, category=None, user_id=None)` | Delete by key, optionally filtered by category and user |
| `get_all_facts(user_id=None, all_users=False, limit=None, offset=0, search=None, sort_by="category", sort_order="desc")` | All facts ordered by `sort_by`. Pass `all_users=True` for the admin global view. |
| `get_summary(user_id=None)` | Current narrative summary text for the user |
| `set_summary(content, user_id=None)` | Replace the summary for the user |
| `mark_session_extracted(session_id)` | Record the extraction timestamp |
| `get_extracted_at(session_id)` | Check if and when a session was processed |
| `backfill_embeddings(batch_size=8)` | Generate embeddings for facts with `embedding IS NULL` |

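
The two-stage lookup in `search_facts` can be illustrated without a database. This is a rough in-memory sketch, not the actual SQL: it assumes the cutoff means "keep matches with cosine distance at most 0.3" and that the `ILIKE` fallback runs only when the vector stage returns nothing. The function below only shares its name with the real method.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def search_facts(facts, query, query_vec, limit=15, cutoff=0.3):
    # Stage 1: rank facts that have an embedding by distance to the query.
    scored = [
        (cosine_distance(f["embedding"], query_vec), f)
        for f in facts
        if f.get("embedding") is not None
    ]
    hits = [f for dist, f in sorted(scored, key=lambda p: p[0]) if dist <= cutoff]
    if hits:
        return hits[:limit]
    # Stage 2 (assumed trigger: no vector hits): case-insensitive substring
    # match on category, key and value, mimicking ILIKE '%query%'.
    needle = query.lower()
    return [
        f for f in facts
        if any(needle in f[col].lower() for col in ("category", "key", "value"))
    ][:limit]
```

The fallback also covers facts that have no embedding yet, which is why a keyword path is still useful after pgvector is enabled.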
`navi/memory/extractor.py`

Facts are extracted from stale sessions automatically.
Trigger: POST /sessions (create new session) fires _process_stale_sessions() as a background task.
Stale criterion: session.last_active < now - 30 minutes AND not yet extracted (or extracted before last activity).
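
Written as a predicate, the stale criterion looks roughly like this. `is_stale` is a hypothetical helper, not the project's code, and it assumes timezone-aware timestamps with `extracted_at` set to `None` for sessions that were never processed.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(minutes=30)


def is_stale(last_active, extracted_at, now=None):
    """True if the session is idle for 30+ minutes and still needs extraction."""
    now = now or datetime.now(timezone.utc)
    idle = last_active < now - STALE_AFTER
    # Never extracted, or extracted before the most recent activity.
    needs_extraction = extracted_at is None or extracted_at < last_active
    return idle and needs_extraction
```

The second condition is what lets a session be re-processed if the user returns to it after an earlier extraction.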
Extraction process:

- The session transcript includes `[Tool call] tool_name(args)` lines and `[Tool result] tool_name: output` lines (output truncated to 500 chars).
- Each extracted fact has `category`, `key`, `value`, `source`, `source_context`.
- Confidence by source: `tool_call`/`auto_discovery` → 95, `user_explicit` → 90, default → 70.
- Facts are written to `memory_facts`.
- `memory_summary` is rebuilt from all current facts.

At the start of each `run_stream()` / `run()` / `run_ephemeral()` call, `_memory_msg()` is called:

    async def _memory_msg(self) -> Message | None:
        # Wrap the stored narrative summary, if any, in an extra system message.
        summary = await self._memory.get_summary()
        if not summary:
            return None
        return Message(role="system", content=f"## What I remember about the user\n\n{summary}")

This message is inserted after the main system message but before conversation history. The agent reads it on every turn.
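
The resulting message order can be sketched as follows. `Message` here is a minimal stand-in for the project's own class, and `build_context` is a hypothetical helper used only to show the ordering.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Minimal stand-in for the project's Message type."""
    role: str
    content: str


def build_context(system, memory, history):
    """Order: main system message, optional memory message, then history."""
    messages = [system]
    if memory is not None:  # _memory_msg() returns None when there is no summary
        messages.append(memory)
    messages.extend(history)
    return messages
```

Keeping the memory message separate from the main system prompt means the prompt itself stays unchanged for users without a summary.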
`navi/tools/memory.py`

Single unified `memory` tool with an `action` parameter:

| Action | Description |
|---|---|
| `save` | Upsert a fact with `category`, `key`, `value`, `source`, `confidence`, `expires_days`, `source_context` |
| `search` | Find facts by keyword query (vector search → `ILIKE` fallback) |
| `forget` | Delete a fact by key (optionally filtered by category) |
| `list` | Return all stored facts |

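
To make the four actions concrete, here is a toy dispatcher backed by a plain dict. It is a sketch of the semantics only: the real tool is async and talks to `MemoryStore`, the `query` parameter name and the default `source` and `confidence` values are assumptions, and `search` is reduced to a keyword match.

```python
def memory_tool(store, action, **params):
    """Toy dispatcher for the unified memory tool; `store` is a plain dict."""
    if action == "save":
        # Defaults are assumptions for illustration, not documented values.
        fact = {"source": "conversation", "confidence": 70, **params}
        store[(fact["category"], fact["key"])] = fact  # upsert on (category, key)
        return fact
    if action == "search":
        needle = params["query"].lower()
        return [
            f for f in store.values()
            if needle in f"{f['category']} {f['key']} {f['value']}".lower()
        ]
    if action == "forget":
        # Match on key; category narrows the match only when it is given.
        doomed = [
            k for k in store
            if k[1] == params["key"] and params.get("category") in (None, k[0])
        ]
        for k in doomed:
            del store[k]
        return len(doomed)
    if action == "list":
        return list(store.values())
    raise ValueError(f"unknown action: {action}")
```

Saving the same `(category, key)` twice overwrites the earlier value, which mirrors the upsert behaviour described for `save`.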
Sources: conversation, tool_call, auto_discovery, user_explicit
Confidence: 0-100. Tool output = 95, user statement = 80, web = 50, guess = 30.
Call memory_search when:
Do NOT call memory_search reflexively at the start of every session — only when context warrants it.
Call memory_forget only when the user explicitly asks, or when a stored fact is clearly wrong or outdated.