# Sessions

Session management, dual-buffer design, and context compression.

## Session model (`navi/core/session.py`)

```python
class Session(BaseModel):
    id: str                          # UUID
    profile_id: str                  # active profile
    user_id: str | None              # owner (null for legacy sessions)
    messages: list[Message]          # full display history — never compressed
    context: list[Message]           # LLM context — may be replaced with summary
    context_token_count: int         # accumulated tokens; reset to 0 after compression
    pinned: bool                     # pinned sessions appear first in sidebar
    name: str | None                 # auto-generated display name (set after first exchange)
    created_at: datetime
    last_active: datetime
    planning_logs: list[dict]        # raw planning phase outputs per turn (debug)
```

## Message flags

Messages in either buffer can carry optional flags beyond role/content:

| Flag | Purpose |
|---|---|
| `is_plan: bool` | Message is a planning phase output (shown as plan card in UI, not text) |
| `is_compression: bool` | Marker message injected when context compression ran |
| `is_summary: bool` | A summary message replacing compressed history in `session.context` |
| `is_recall: bool` | Message was generated by a scheduled recall (styled differently in UI) |
| `thinking: str \| None` | LLM reasoning captured during a tool-calling turn |
| `metadata: dict` | Tool result metadata (e.g. `is_image`, `base64`) |

## Dual-buffer design

Two separate message lists serve different purposes:

| Buffer | Purpose | Modified by compression? |
|---|---|---|
| `session.messages` | Full display history shown in the UI | Never |
| `session.context` | What the LLM sees on each call | Yes — old turns replaced with a summary |

Tool results, image injections, and assistant messages are appended to **both** buffers. When compression runs, only `session.context` is modified.

**Note:** System messages are **not stored** in either buffer. They are injected fresh from the current profile on every LLM call via `_build_context()`. This makes profile switches take effect immediately.
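The dual-buffer behaviour can be illustrated with a minimal sketch. `Msg`, `MiniSession`, `append_both`, and `build_context` are hypothetical stand-ins written for this example, not Navi's actual classes or its `_build_context()` implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Msg:
    """Hypothetical stand-in for the Message model."""
    role: str
    content: str
    is_summary: bool = False


@dataclass
class MiniSession:
    """Hypothetical stand-in for Session, reduced to the two buffers."""
    messages: list[Msg] = field(default_factory=list)  # display history
    context: list[Msg] = field(default_factory=list)   # LLM context


def append_both(session: MiniSession, msg: Msg) -> None:
    # New messages land in both buffers.
    session.messages.append(msg)
    session.context.append(msg)


def build_context(session: MiniSession, system_prompt: str) -> list[Msg]:
    # The system message is created per call and never stored.
    return [Msg("system", system_prompt)] + session.context


session = MiniSession()
append_both(session, Msg("user", "hello"))
append_both(session, Msg("assistant", "hi"))

# Simulated compression: only the context buffer is replaced.
session.context = [Msg("user", "Summary: greeting exchanged", is_summary=True)]

llm_input = build_context(session, "You are Navi.")
```

After the simulated compression, `session.messages` still holds both original messages, while `llm_input` contains a fresh system message followed by the summary.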

## Session store

### `InMemorySessionStore`
Simple dict-backed store for testing.

### `PgSessionStore` (`navi/core/pg_session_store.py`)
Production store backed by PostgreSQL via asyncpg.

- `create(profile_id, user_id=None)` → new `Session`
- `get(session_id)` → `Session | None`
- `save(session)` — serializes with `model_dump(mode='json')` (required for datetime serialization)
- `list_all(user_id=None, is_admin=False)` → all sessions when `is_admin` is true; otherwise the sessions owned by `user_id` plus legacy sessions (`user_id IS NULL`)
- `count_all(user_id=None, is_admin=False, search=None)` → total matching sessions
- `search_list(limit, offset, user_id=None, is_admin=False, search=None, sort_by="last_active", sort_order="desc")` → paginated, filtered, sorted sessions
- `delete(session_id)` → `bool`
- `list_page(user_id=None, is_admin=False, limit=50, offset=0)` → paginated list with `has_more` flag
- `set_pinned(session_id, pinned)` → `bool`
- `set_name(session_id, name)` → `bool`

Requires `DATABASE_URL` env variable (e.g. `postgresql://user:pass@localhost/navi`).
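One common way to compute a `has_more` flag like the one `list_page` returns is to fetch one row more than the page size. The sketch below shows that technique on an in-memory list; whether `PgSessionStore` does it this way is an assumption, and `page_with_has_more` is a hypothetical helper name.

```python
def page_with_has_more(rows: list, limit: int = 50, offset: int = 0) -> tuple[list, bool]:
    """Return one page of rows plus a flag telling whether more rows follow."""
    # Fetch limit + 1 items: the extra one only signals that another page exists.
    window = rows[offset : offset + limit + 1]
    return window[:limit], len(window) > limit


session_ids = ["s1", "s2", "s3", "s4", "s5"]
first_page, first_has_more = page_with_has_more(session_ids, limit=2, offset=0)
last_page, last_has_more = page_with_has_more(session_ids, limit=2, offset=4)
```

In SQL the same idea corresponds to querying with `LIMIT limit + 1` and dropping the extra row before returning the page.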

**Ownership:** Legacy sessions (`user_id IS NULL`) remain visible to every user, matching the `list_all()` filter described above. New sessions created by authenticated users carry `user_id`. With `is_admin=True`, `list_all()` returns all sessions; otherwise it returns the caller's own sessions plus the legacy ones.

When `NAVI_AUTH_ENABLED=false`, every session is created with `user_id = 'anonymous'` and all access checks are bypassed. Querying `sessions.user_id = 'anonymous'` is a reliable way to identify sessions created in no-auth mode.

---

## Context compression (`navi/core/compressor.py`)

Keeps the LLM context within the token budget by summarizing old conversation turns.

### When it triggers

Two trigger points:

1. **Pre-turn** (in `run_stream()`): before calling the LLM, compares `session.context_token_count` with the threshold and compresses if `context_token_count >= ollama_num_ctx * context_compression_threshold`.
2. **Post-turn** (via `CompressionWorker`): after `StreamEnd`, the worker re-checks and compresses if needed.

Config values (`settings`):
- `context_compression_enabled: bool = True`
- `context_compression_threshold: float = 0.70` — trigger at 70% of `ollama_num_ctx`
- `context_keep_recent: int = 10` — keep last N conversational turns verbatim
- `context_summary_temperature: float = 0.3`
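The trigger condition can be written as a small predicate. This is a sketch of the check described above; `should_compress` is a hypothetical name, and its defaults mirror the config values listed here.

```python
def should_compress(
    token_count: int,
    num_ctx: int,
    threshold: float = 0.70,
    enabled: bool = True,
) -> bool:
    """Return True when the context has reached the compression threshold."""
    if not enabled:
        return False
    return token_count >= num_ctx * threshold


# With an 8192-token window the trigger point is 8192 * 0.70 = 5734.4 tokens.
below = should_compress(5734, 8192)
above = should_compress(5735, 8192)
disabled = should_compress(8000, 8192, enabled=False)
```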

### Compression algorithm

`compress_context(context, llm, model, temperature, keep_recent)`:

1. Partition messages into `to_summarize` (old turns) and `to_keep` (recent `keep_recent` turns).
   - A "turn" = one user message + all following assistant/tool messages up to the next user message.
   - Tool call groups (assistant + results) are never split across the partition.
   - A summary message left by an earlier compression is included in `to_summarize`, so it is folded into the new summary.
2. Format `to_summarize` as plain text (tool calls shown as compact previews, max 120 chars for args, max 300 chars for results).
3. Truncate formatted input to `_MAX_SUMMARY_INPUT_CHARS = 12_000` chars.
4. Call `llm.complete()` with `think=False` to produce a bullet-point summary.
5. Replace `to_summarize` with a single summary message (`role=user`, `is_summary=True`).
6. Return `system_msgs + [summary_msg] + to_keep` (`system_msgs` is normally empty, because system messages are not stored in `session.context`).
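The partition and reassembly steps can be sketched as follows. This is an illustration under stated assumptions, not the code in `compressor.py`: `Msg`, `partition`, and `compress` are hypothetical names, the formatting and truncation steps are omitted, and the `summarize` callable stands in for the `llm.complete()` call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Msg:
    """Hypothetical stand-in for the Message model."""
    role: str
    content: str
    is_summary: bool = False


def partition(context: list[Msg], keep_recent: int):
    """Split context into (system_msgs, to_summarize, to_keep) on turn boundaries."""
    system_msgs = [m for m in context if m.role == "system"]
    rest = [m for m in context if m.role != "system"]
    # A turn starts at a real user message; an earlier summary does not count.
    starts = [i for i, m in enumerate(rest) if m.role == "user" and not m.is_summary]
    if keep_recent <= 0:
        cut = len(rest)             # keep nothing verbatim
    elif len(starts) <= keep_recent:
        cut = 0                     # too few turns: nothing to summarize
    else:
        cut = starts[-keep_recent]  # first message of the oldest kept turn
    return system_msgs, rest[:cut], rest[cut:]


def compress(context: list[Msg], summarize: Callable[[list[Msg]], str], keep_recent: int):
    """Replace old turns with one summary message; return context unchanged if none."""
    system_msgs, to_summarize, to_keep = partition(context, keep_recent)
    if not to_summarize:
        return context
    summary = Msg("user", summarize(to_summarize), is_summary=True)
    return system_msgs + [summary] + to_keep


context = [
    Msg("user", "find the report"),
    Msg("assistant", "calling filesystem"),
    Msg("tool", "report.pdf"),
    Msg("user", "summarize it"),
    Msg("assistant", "done"),
    Msg("user", "thanks"),
    Msg("assistant", "welcome"),
]
result = compress(context, lambda msgs: f"{len(msgs)} messages summarized", keep_recent=2)
```

Because the cut always falls on a user message, an assistant message and its tool results stay on the same side of the partition. Here the first turn (three messages) is summarized and the last two turns are kept verbatim.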

If compression fails, the exception propagates to `CompressionWorker`, which logs a warning and continues — compression failure is non-fatal.
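A minimal sketch of this non-fatal handling, assuming an async worker: `compress_safely`, the dict-based session, and the logger name are hypothetical and only illustrate the log-and-continue pattern.

```python
import asyncio
import logging

logger = logging.getLogger("compression_sketch")  # hypothetical logger name


async def compress_safely(session: dict, compress) -> bool:
    """Run one compression attempt; never raise to the caller."""
    try:
        session["context"] = await compress(session["context"])
        session["context_token_count"] = 0  # counter restarts after compression
        return True
    except Exception as exc:
        # Failure is logged and swallowed; the session keeps its old context.
        logger.warning("context compression failed: %s", exc)
        return False


async def failing_compress(context):
    raise RuntimeError("LLM unavailable")


session = {"context": ["turn 1", "turn 2"], "context_token_count": 6000}
ok = asyncio.run(compress_safely(session, failing_compress))
```

When the attempt fails, the context and token count are left untouched, so a later trigger can retry.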

### What is never compressed

- `session.messages` — full history is always intact.
- The last `context_keep_recent` conversational turns.
- System messages (never stored in context anyway).

---

## Session file uploads

Files uploaded via `POST /sessions/{id}/files` are stored in `session_files/{session_id}/`.

- Max size: `session_files_max_size_mb` (default: 200 MB)
- A background `cleanup_loop` (started on FastAPI startup) deletes orphaned session directories, i.e. those whose DB session no longer exists.
- Executable files (`.sh`, `.py`, `.exe`, etc.) are rejected.
- Duplicate filenames get a numeric suffix.
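The last two rules can be sketched like this. The blocked-extension set contains only the three examples named above (the real list is longer), and the `_1`, `_2` suffix format is an assumption; `is_allowed` and `unique_name` are hypothetical helper names.

```python
from pathlib import Path

# Partial list taken from the examples above; the real list is longer.
BLOCKED_SUFFIXES = {".sh", ".py", ".exe"}


def is_allowed(filename: str) -> bool:
    """Reject files whose extension marks them as executable."""
    return Path(filename).suffix.lower() not in BLOCKED_SUFFIXES


def unique_name(filename: str, existing: set[str]) -> str:
    """Append a numeric suffix until the name no longer collides."""
    if filename not in existing:
        return filename
    path = Path(filename)
    n = 1
    while f"{path.stem}_{n}{path.suffix}" in existing:
        n += 1
    return f"{path.stem}_{n}{path.suffix}"


stored = {"report.pdf", "report_1.pdf"}
new_name = unique_name("report.pdf", stored)
```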

When files are uploaded via the UI, their paths are appended to the user message content:
```
[Uploaded files on disk:
- filename.pdf → session_files/{id}/filename.pdf]
```

This lets the agent use `filesystem` or `code_exec` to access the files.
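Building that note could look like the sketch below. `with_upload_note` is a hypothetical helper, and the blank line separating the user text from the note is an assumption; only the bracketed format comes from the example above.

```python
def with_upload_note(content: str, session_id: str, filenames: list[str]) -> str:
    """Append the uploaded-files note to the user message content."""
    if not filenames:
        return content
    lines = [f"- {name} → session_files/{session_id}/{name}" for name in filenames]
    return f"{content}\n\n[Uploaded files on disk:\n" + "\n".join(lines) + "]"


message = with_upload_note("Summarize this document", "abc123", ["filename.pdf"])
```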

`workspace/` is separate from session files. Use `workspace/` for persistent private working files and `session_files/{session_id}/` for files that belong to the current chat.

Two tools expose session files to the user:
- `share_file` copies an existing local file into the session directory and returns a download link. It requires an absolute source path and has its own size limit (`SHARE_FILE_MAX_SIZE_MB`, default 1024 MB).
- `content_publish` registers a file that already exists in the session directory and exposes it as an inline viewer/card through `/sessions/{id}/files/{filename}`. It does not copy files.

---

## Debug endpoints

- `GET /sessions/{id}/context` — returns what the LLM actually sees (may differ from `messages` after compression).
- `GET /sessions/{id}/planning` — returns `session.planning_logs`: raw planning phase outputs per turn.
