Add context compression: rolling summarization when context fills up

root / navi-1


Mechanism:
- After streaming ends, if context_tokens >= threshold (80% of num_ctx),
  compress old turns into a summary message using the same LLM
- Partition: keep system msg + last N turns verbatim (default 6);
  everything older goes to the summarizer
- Tool call groups (assistant + tool results) never split across boundary
- Existing summary messages folded into new compression pass — no stack growth
- Summary stored as Message(role=user, is_summary=True) after system msg
- On failure: logged, session left unchanged (non-fatal)
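The trigger and partition rules above can be sketched roughly as follows. This is a minimal illustration, not the code in navi/core/compressor.py: the Message class is a hypothetical stand-in, the function signatures are guesses, and a "turn" is assumed to mean one non-tool message together with any tool results that follow it.

```python
from dataclasses import dataclass


@dataclass
class Message:
    # Hypothetical stand-in for the project's Message type.
    role: str  # "system", "user", "assistant", or "tool"
    content: str = ""
    is_summary: bool = False


def should_compress(context_tokens: int, num_ctx: int, threshold: float = 0.80) -> bool:
    """True once the context has reached the given fraction of num_ctx."""
    return num_ctx > 0 and context_tokens >= threshold * num_ctx


def partition_messages(messages, keep_recent: int = 6):
    """Split a session into (system, to_summarize, recent)."""
    system = [m for m in messages if m.role == "system"]
    rest = [m for m in messages if m.role != "system"]

    # Attach tool results to the message before them, so a tool call
    # group can never straddle the summarize/keep boundary.
    groups = []
    for m in rest:
        if m.role == "tool" and groups:
            groups[-1].append(m)
        else:
            groups.append([m])

    if len(groups) <= keep_recent:
        return system, [], rest  # nothing old enough to summarize

    cut = len(groups) - keep_recent
    old = [m for g in groups[:cut] for m in g]
    recent = [m for g in groups[cut:] for m in g]
    return system, old, recent
```

Because an earlier summary message sits right after the system message, it is always among the oldest groups and falls into the to_summarize part, which is one way to get the "no stack growth" behaviour described above.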

New files:
- navi/core/compressor.py: should_compress, partition_messages,
  compress_session (pure logic, testable without agent)

New config (navi/config.py):
- context_compression_enabled: bool = True
- context_compression_threshold: float = 0.80
- context_keep_recent: int = 6
- context_summary_temperature: float = 0.3
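For illustration, the four settings could be grouped as below. The field names and defaults come from the list above; the CompressionSettings class and its two helper methods are hypothetical, and the real navi/config.py may be organized differently.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CompressionSettings:
    # Hypothetical container; only the field names and defaults are from the commit.
    context_compression_enabled: bool = True
    context_compression_threshold: float = 0.80
    context_keep_recent: int = 6
    context_summary_temperature: float = 0.3

    def trigger_tokens(self, num_ctx: int) -> int:
        """Smallest token count that meets the threshold for a given window."""
        return math.ceil(self.context_compression_threshold * num_ctx)

    def wants_compression(self, context_tokens: int, num_ctx: int) -> bool:
        """Honor the on/off switch before comparing against the threshold."""
        if not self.context_compression_enabled:
            return False
        return context_tokens >= self.context_compression_threshold * num_ctx
```

With the defaults and an 8192-token window, compression would start at 6554 tokens.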

New agent event: ContextCompressed(messages_before, messages_after)
Message.is_summary: bool field marks compressed history blocks
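A rough sketch of how the summary message and the event might fit together. The rebuild_session helper and the summarize callable are hypothetical, the Message class is a simplified stand-in, and the call is synchronous here for brevity even though the real summarizer talks to the LLM.

```python
from __future__ import annotations

import logging
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Message:
    # Simplified stand-in for the project's Message type.
    role: str
    content: str
    is_summary: bool = False


@dataclass
class ContextCompressed:
    messages_before: int
    messages_after: int


def rebuild_session(
    system: list[Message],
    old: list[Message],
    recent: list[Message],
    summarize: Callable[[list[Message]], str],  # hypothetical LLM wrapper
) -> tuple[list[Message], Optional[ContextCompressed]]:
    """Replace old messages with one summary; on failure return the session as it was."""
    before = system + old + recent
    if not old:
        return before, None
    try:
        text = summarize(old)
    except Exception:
        # Non-fatal: log and leave the session unchanged.
        logging.exception("context compression failed")
        return before, None
    summary = Message(role="user", content=text, is_summary=True)
    after = system + [summary] + recent  # summary goes right after the system msg
    return after, ContextCompressed(len(before), len(after))
```

Returning the event alongside the new message list lets the caller decide whether to emit anything: None means no compression took place.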

Client:
- context_compressed WS event → subtle inline notice in message list
- loadHistory: is_summary messages rendered as collapsible summary cards
- style.css: .summary-card, .compression-notice

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feature/navi-code master vmkdemo

1 parent 72f5a6f commit 802c186a667ff187d584c999741e2431f5b44f3d

Eugene Sukhodolskiy authored on 8 Apr

Showing 9 changed files

- client/js/app.js
- client/js/chat.js
- client/style.css
- navi/api/websocket.py
- navi/config.py
- navi/core/__init__.py
- navi/core/agent.py
- navi/core/compressor.py (new file, mode 0 → 100644)
- navi/llm/base.py