# Sessions

Session management, the dual-buffer design, context compression, and session file uploads.

## Session model (`navi/core/session.py`)

```python
class Session(BaseModel):
    id: str                          # UUID
    profile_id: str                  # active profile
    messages: list[Message]          # full display history — never compressed
    context: list[Message]           # LLM context — may be replaced with summary
    context_token_count: int         # accumulated tokens; reset to 0 after compression
    pinned: bool                     # pinned sessions appear first in sidebar
    created_at: datetime
    last_active: datetime
```

## Dual-buffer design

Two separate message lists serve different purposes:

| Buffer | Purpose | Modified by compression? |
|---|---|---|
| `session.messages` | Full display history shown in the UI | Never |
| `session.context` | What the LLM sees on each call | Yes — old turns replaced with a summary |

Tool results, image injections, and assistant messages are appended to **both** buffers. When compression runs, only `session.context` is modified.

**Note:** System messages are **not stored** in either buffer. They are injected fresh from the current profile on every LLM call via `_build_context()`, so a profile switch takes effect on the next call.
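
The split between the two buffers can be illustrated with a minimal sketch. The `Message` and `Session` dataclasses below are simplified stand-ins for the real models, and `append_message` and `build_context` are hypothetical helpers (the latter only mimics what `_build_context()` is described as doing); none of this is the project's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    role: str
    content: str


@dataclass
class Session:
    messages: list[Message] = field(default_factory=list)  # display history
    context: list[Message] = field(default_factory=list)   # LLM context


def append_message(session: Session, msg: Message) -> None:
    """Record a message in both buffers (hypothetical helper)."""
    session.messages.append(msg)
    session.context.append(msg)


def build_context(session: Session, system_prompt: str) -> list[Message]:
    """Prepend a fresh system message; nothing is written back to the session."""
    return [Message("system", system_prompt)] + session.context


session = Session()
append_message(session, Message("user", "hi"))
append_message(session, Message("assistant", "hello"))

# Simulate compression: only the context buffer is replaced.
session.context = [Message("user", "[summary] greeting exchanged")]

llm_input = build_context(session, "You are Navi.")
```

After the simulated compression, `session.messages` still holds both original messages, while `llm_input` contains the system message followed by the summary.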

## Session store

### `InMemorySessionStore`
Simple dict-backed store for testing.

### `SQLiteSessionStore` (`navi/core/sqlite_session_store.py`)
Production store backed by SQLite via aiosqlite.

- `create(profile_id)` → new `Session`
- `get(session_id)` → `Session | None`
- `save(session)` — serializes with `model_dump(mode='json')` (required for datetime serialization)
- `list_all()` → sorted by `(pinned DESC, last_active DESC)`
- `delete(session_id)` → `bool`
- `set_pinned(session_id, pinned)` → `bool`

DB path: `settings.db_path` (default: `navi.db`).
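
The ordering used by `list_all()` can be tried out with the standard-library `sqlite3` module. This is only a sketch: the table layout below is a made-up example, and the real store uses aiosqlite with a schema that is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema for illustration only.
conn.execute(
    "CREATE TABLE sessions (id TEXT PRIMARY KEY, pinned INTEGER, last_active TEXT)"
)
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?, ?)",
    [
        ("a", 0, "2024-01-03T10:00:00"),
        ("b", 1, "2024-01-01T10:00:00"),
        ("c", 0, "2024-01-02T10:00:00"),
        ("d", 1, "2024-01-02T10:00:00"),
    ],
)

# Pinned sessions first, then most recently active within each group.
rows = conn.execute(
    "SELECT id FROM sessions ORDER BY pinned DESC, last_active DESC"
).fetchall()
ordered_ids = [row[0] for row in rows]
conn.close()
```

ISO 8601 timestamps stored as text sort correctly as strings, which is why `last_active DESC` works here without any date parsing.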

---

## Context compression (`navi/core/compressor.py`)

Keeps the LLM context within the token budget by summarizing old conversation turns.

### When it triggers

Two trigger points:

1. **Pre-turn** (in `run_stream()`): before calling the LLM, compares `session.context_token_count` with the budget and compresses if `context_token_count >= ollama_num_ctx * context_compression_threshold`.
2. **Post-turn** (via `CompressionWorker`): after `StreamEnd`, the worker re-checks and compresses if needed.

Config values (`settings`):
- `context_compression_enabled: bool = True`
- `context_compression_threshold: float = 0.80` — trigger at 80% of `ollama_num_ctx`
- `context_keep_recent: int = 10` — keep last N conversational turns verbatim
- `context_summary_temperature: float = 0.3`
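
The trigger condition amounts to a single comparison. The function below is a hypothetical helper written for illustration; its name and signature are assumptions, and only the comparison itself comes from the description above.

```python
def should_compress(
    context_token_count: int,
    num_ctx: int,
    threshold: float = 0.80,
    enabled: bool = True,
) -> bool:
    """Return True once the context has reached the configured share of num_ctx."""
    if not enabled:
        return False
    return context_token_count >= num_ctx * threshold
```

With `num_ctx = 8192` (a value assumed here) and the default threshold, the limit is 6553.6, so compression would start at 6554 tokens.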

### Compression algorithm

`compress_context(context, llm, model, temperature, keep_recent)`:

1. Partition messages into `to_summarize` (old turns) and `to_keep` (recent `keep_recent` turns).
   - A "turn" = one user message + all following assistant/tool messages up to the next user message.
   - Tool call groups (assistant + results) are never split across the partition.
   - A summary message left by an earlier compression is summarized again together with the old turns, so it is folded into the new summary.
2. Format `to_summarize` as plain text (tool calls shown as compact previews, max 120 chars for args, max 300 chars for results).
3. Truncate formatted input to `_MAX_SUMMARY_INPUT_CHARS = 12_000` chars.
4. Call `llm.complete()` with `think=False` to produce a bullet-point summary.
5. Replace `to_summarize` with a single summary message (`role=user`, `is_summary=True`).
6. Return `system_msgs + [summary_msg] + to_keep`. Because system messages are not stored in `session.context`, `system_msgs` is normally empty.
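
Step 1 can be sketched as follows. This is an illustrative approximation, not the code in `compressor.py`: `Message` is a simplified stand-in and `partition_by_turns` is a hypothetical name. Cutting only at user-message boundaries is what keeps an assistant tool call and its results on the same side of the partition.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str                  # "user", "assistant", or "tool"
    content: str
    is_summary: bool = False


def partition_by_turns(context, keep_recent):
    """Split context into (to_summarize, to_keep) on user-message boundaries."""
    # Indices of messages that open a turn. A summary has role=user but does
    # not open a turn, so it always stays with the older part.
    turn_starts = [
        i for i, m in enumerate(context)
        if m.role == "user" and not m.is_summary
    ]
    if keep_recent <= 0:
        return list(context), []
    if len(turn_starts) <= keep_recent:
        return [], list(context)  # nothing old enough to summarize
    cut = turn_starts[-keep_recent]
    return context[:cut], context[cut:]


context = [
    Message("user", "earlier summary", is_summary=True),
    Message("user", "q1"),
    Message("assistant", "calling a tool"),
    Message("tool", "tool result"),
    Message("assistant", "a1"),
    Message("user", "q2"),
    Message("assistant", "a2"),
    Message("user", "q3"),
    Message("assistant", "a3"),
]
to_summarize, to_keep = partition_by_turns(context, keep_recent=2)
```

Here the earlier summary and the whole first turn, including its tool call and result, end up in `to_summarize`, and the last two turns are kept verbatim.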

If compression fails, the exception propagates to `CompressionWorker`, which logs a warning and continues — compression failure is non-fatal.
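
A non-fatal failure path of this kind might look like the sketch below. The function names and the logger name are hypothetical, and keeping the original context unchanged after a failure is an assumption; the source only states that a warning is logged and processing continues.

```python
import asyncio
import logging

logger = logging.getLogger("compression_worker_sketch")


async def failing_compress(context):
    """Stand-in for a compression call whose summary request fails."""
    raise RuntimeError("summary model unavailable")


async def compress_or_keep(context, compress):
    """Return the compressed context, or the original one if compression fails."""
    try:
        return await compress(context)
    except Exception as exc:
        # Log and carry on; the caller keeps working with the old context.
        logger.warning("context compression failed: %s", exc)
        return context


original = ["turn 1", "turn 2"]
result = asyncio.run(compress_or_keep(original, failing_compress))
```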

### What is never compressed

- `session.messages` — full history is always intact.
- The last `context_keep_recent` conversational turns.
- System messages (never stored in context anyway).

---

## Session file uploads

Files uploaded via `POST /sessions/{id}/files` are stored in `session_files/{session_id}/`.

- Max size: `session_files_max_size_mb` (default: 200 MB)
- TTL: `session_files_ttl_hours` (default: 24 hours)
- A background `cleanup_loop` (started on FastAPI startup) deletes stale session directories.
- Executable files (`.sh`, `.py`, `.exe`, etc.) are rejected.
- Duplicate filenames get a numeric suffix.
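
The extension check and the duplicate-name handling could be sketched as below. Both helpers are hypothetical: the blocklist contains only the three extensions named above (the full list is not given), and the `_1`, `_2` suffix format is an assumption about what "numeric suffix" looks like.

```python
from pathlib import PurePosixPath

# Assumed blocklist; the source names only .sh, .py, .exe "etc."
BLOCKED_SUFFIXES = {".sh", ".py", ".exe"}


def is_allowed(filename: str) -> bool:
    """Reject files whose extension is on the blocklist (case-insensitive)."""
    return PurePosixPath(filename).suffix.lower() not in BLOCKED_SUFFIXES


def dedupe_name(filename: str, existing: set[str]) -> str:
    """Return filename, or a variant with a numeric suffix if it is taken."""
    if filename not in existing:
        return filename
    path = PurePosixPath(filename)
    n = 1
    while True:
        candidate = f"{path.stem}_{n}{path.suffix}"
        if candidate not in existing:
            return candidate
        n += 1
```

For example, uploading `a.pdf` into a directory that already holds `a.pdf` and `a_1.pdf` would yield `a_2.pdf` under this naming scheme.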

When files are uploaded via the UI, their paths are appended to the user message content:
```
[Uploaded files on disk:
- filename.pdf → session_files/{id}/filename.pdf]
```

This lets the agent read the uploaded files with the `filesystem` or `code_exec` tools.
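
Building that suffix is simple string formatting. The helper below is a hypothetical sketch: its name, its signature, and the blank line separating the note from the user's text are assumptions, while the note format follows the example above.

```python
def append_upload_note(content: str, session_id: str, filenames: list[str]) -> str:
    """Append an 'Uploaded files on disk' note listing each file's path."""
    if not filenames:
        return content
    lines = [f"- {name} → session_files/{session_id}/{name}" for name in filenames]
    return f"{content}\n\n[Uploaded files on disk:\n" + "\n".join(lines) + "]"


message = append_upload_note("Summarize this", "abc", ["filename.pdf"])
```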
