Session management, dual-buffer design, and context compression.

Defined in `navi/core/session.py`:

    class Session(BaseModel):
        id: str  # UUID
        profile_id: str  # active profile
        messages: list[Message]  # full display history, never compressed
        context: list[Message]  # LLM context, may be replaced with a summary
        context_token_count: int  # accumulated tokens; reset to 0 after compression
        pinned: bool  # pinned sessions appear first in sidebar
        name: str | None  # auto-generated display name (set after first exchange)
        created_at: datetime
        last_active: datetime
        planning_logs: list[dict]  # raw planning phase outputs per turn (debug)

Messages in `session.messages` carry optional flags beyond role/content:

| Flag | Purpose |
|---|---|
| `is_plan: bool` | Message is a planning phase output (shown as a plan card in the UI, not as text) |
| `is_compression: bool` | Marker message injected when context compression ran |
| `is_summary: bool` | A summary message replacing compressed history in `session.context` |
| `thinking: str \| None` | LLM reasoning captured during a tool-calling turn |

Two separate message lists serve different purposes:

| Buffer | Purpose | Modified by compression? |
|---|---|---|
| `session.messages` | Full display history shown in the UI | Never |
| `session.context` | What the LLM sees on each call | Yes, old turns are replaced with a summary |

Tool results, image injections, and assistant messages are appended to both buffers. When compression runs, only `session.context` is modified.

Note: System messages are not stored in either buffer. They are injected fresh from the current profile on every LLM call via `_build_context()`, so profile switches take effect immediately.
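
The dual-buffer behavior described above can be illustrated with a minimal sketch. It uses plain dataclasses instead of the real Pydantic models, and `append_to_both` and `build_context` are hypothetical helpers written for this example (the latter only mimics what `_build_context()` is described as doing); none of this is the project's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    role: str
    content: str
    is_summary: bool = False


@dataclass
class Session:
    messages: list[Message] = field(default_factory=list)  # display history
    context: list[Message] = field(default_factory=list)   # LLM context


def append_to_both(session: Session, msg: Message) -> None:
    """Append one message to the display history and to the LLM context."""
    session.messages.append(msg)
    session.context.append(msg)


def build_context(session: Session, system_prompt: str) -> list[Message]:
    """Prepend a fresh system message; it is never stored on the session."""
    return [Message("system", system_prompt)] + session.context


session = Session()
append_to_both(session, Message("assistant", "Let me check that file."))
append_to_both(session, Message("tool", "file contents ..."))

# Switching profiles only changes the prompt passed in at call time.
ctx_a = build_context(session, "You are profile A.")
ctx_b = build_context(session, "You are profile B.")
```

Because the system message is built per call, neither buffer needs to be rewritten when the active profile changes.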
`InMemorySessionStore`: simple dict-backed store for testing.

`PgSessionStore` (`navi/core/pg_session_store.py`): production store backed by PostgreSQL via asyncpg.

- `create(profile_id)` → new `Session`
- `get(session_id)` → `Session | None`
- `save(session)`: serializes with `model_dump(mode='json')` (required for datetime serialization)
- `list_all()` → sorted by `(pinned DESC, last_active DESC)`
- `delete(session_id)` → `bool`
- `set_pinned(session_id, pinned)` → `bool`
- `set_name(session_id, name)` → `bool`

Requires the `DATABASE_URL` environment variable (e.g. `postgresql://user:pass@localhost/navi`).
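
To make the `list_all()` ordering concrete, here is a small in-memory sketch of sorting by `(pinned DESC, last_active DESC)`. `SessionRow` and `sort_for_sidebar` are hypothetical names used only for illustration; the production store presumably does this ordering in SQL.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class SessionRow:
    id: str
    pinned: bool
    last_active: datetime


def sort_for_sidebar(rows: list[SessionRow]) -> list[SessionRow]:
    """Order rows by (pinned DESC, last_active DESC), in memory."""
    # True sorts after False, so reverse=True puts pinned rows first
    # and, within each group, the most recently active row first.
    return sorted(rows, key=lambda r: (r.pinned, r.last_active), reverse=True)


now = datetime(2024, 1, 1, 12, 0, 0)
rows = [
    SessionRow("old-unpinned", False, now - timedelta(days=2)),
    SessionRow("new-unpinned", False, now),
    SessionRow("old-pinned", True, now - timedelta(days=5)),
]
ordered_ids = [r.id for r in sort_for_sidebar(rows)]
```

A pinned session stays at the top even when it is older than every unpinned one.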
`navi/core/compressor.py`: keeps the LLM context within the token budget by summarizing old conversation turns.

Two trigger points:

- In `run_stream()`: before calling the LLM, checks `session.context_token_count` against the threshold. Compresses if `tokens >= num_ctx * threshold`.
- In `CompressionWorker`: after `StreamEnd`, the worker re-checks and compresses if needed.

Config values (settings):

- `context_compression_enabled: bool = True`
- `context_compression_threshold: float = 0.80`: trigger at 80% of `ollama_num_ctx`
- `context_keep_recent: int = 10`: keep the last N conversational turns verbatim
- `context_summary_temperature: float = 0.3`

`compress_context(context, llm, model, temperature, keep_recent)`:

1. Splits the context into `to_summarize` (old turns) and `to_keep` (recent `keep_recent` turns).
2. Renders `to_summarize` as plain text (tool calls shown as compact previews, max 120 chars for args, max 300 chars for results).
3. Caps the summary input at `_MAX_SUMMARY_INPUT_CHARS = 12_000` chars.
4. Calls `llm.complete()` with `think=False` to produce a bullet-point summary.
5. Replaces `to_summarize` with a single summary message (`role=user`, `is_summary=True`).
6. Returns `system_msgs + [summary_msg] + to_keep`.

If compression fails, the exception propagates to `CompressionWorker`, which logs a warning and continues; compression failure is non-fatal.
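
The trigger check and the compression steps can be sketched roughly as follows. This is a simplified illustration and not the code in `navi/core/compressor.py`: the `summarize` callable is a hypothetical stand-in for `llm.complete()`, each message is treated as one turn, tool-call previews are omitted, and cutting the input at the head is an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Message:
    role: str
    content: str
    is_summary: bool = False


MAX_SUMMARY_INPUT_CHARS = 12_000  # mirrors _MAX_SUMMARY_INPUT_CHARS


def should_compress(token_count: int, num_ctx: int, threshold: float = 0.80) -> bool:
    """Trigger once the accumulated tokens reach threshold * num_ctx."""
    return token_count >= num_ctx * threshold


def compress_context(
    context: list[Message],
    summarize: Callable[[str], str],  # hypothetical stand-in for llm.complete()
    keep_recent: int = 10,
) -> list[Message]:
    """Replace old turns with one summary message; keep recent ones verbatim."""
    system_msgs = [m for m in context if m.role == "system"]
    turns = [m for m in context if m.role != "system"]
    split = len(turns) - keep_recent
    if split <= 0:
        return list(context)  # nothing old enough to summarize
    to_summarize, to_keep = turns[:split], turns[split:]
    text = "\n".join(f"{m.role}: {m.content}" for m in to_summarize)
    text = text[:MAX_SUMMARY_INPUT_CHARS]  # assumption: keep the head of the text
    summary_msg = Message("user", summarize(text), is_summary=True)
    return system_msgs + [summary_msg] + to_keep


history = [
    Message("user" if i % 2 == 0 else "assistant", f"msg {i}") for i in range(6)
]
compressed = compress_context(
    history, lambda text: "- summary of earlier turns", keep_recent=2
)
```

With `num_ctx = 8192` and the default threshold, `should_compress` first returns `True` at 6554 tokens, since 80% of 8192 is 6553.6.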
- Compression never touches `session.messages`; the full history is always intact.
- The most recent `context_keep_recent` conversational turns are kept verbatim.

Files uploaded via `POST /sessions/{id}/files` are stored in `session_files/{session_id}/`.

- `session_files_max_size_mb` (default: 200 MB)
- `session_files_ttl_hours` (default: 24 hours)
- `cleanup_loop` (started on FastAPI startup) deletes stale session directories.
- Some file types (`.sh`, `.py`, `.exe`, etc.) are rejected.

When files are uploaded via the UI, their paths are appended to the user message content:

    [Uploaded files on disk:
    - filename.pdf → session_files/{id}/filename.pdf]

This lets the agent use `filesystem` or `code_exec` to access the files.
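
As an illustration of the annotation format above, the sketch below appends the uploaded-files note to a user message. `annotate_with_uploads` is a hypothetical helper, and the blank line separating the original content from the note is an assumption; only the bracketed block itself follows the format shown in this section.

```python
def annotate_with_uploads(content: str, session_id: str, filenames: list[str]) -> str:
    """Append the uploaded-files note to a user message, if any files exist."""
    if not filenames:
        return content  # nothing uploaded, leave the message untouched
    lines = [f"- {name} → session_files/{session_id}/{name}" for name in filenames]
    note = "[Uploaded files on disk:\n" + "\n".join(lines) + "]"
    # Assumption: a blank line separates the user's text from the note.
    return f"{content}\n\n{note}"


message = annotate_with_uploads("Summarize this report", "abc123", ["report.pdf"])
```

The agent can then read the path from the message text and pass it to whichever tool opens the file.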

- `GET /sessions/{id}/context`: returns what the LLM actually sees (may differ from `messages` after compression).
- `GET /sessions/{id}/planning`: returns `session.planning_logs`, the raw planning phase outputs per turn.