History for navi-1 / navi / core / agent.py

2026-04-17	3ddd995 Fix core subagent misuse: enforce 1 plan step = 1 spawn_agent call
		Root cause: nowhere was it stated that each AGENT step in the plan maps to a separate spawn_agent call. Navi was bundling all AGENT steps into a single call, dumping the full plan on one subagent.
		spawn_agent description:
		- Lead with: "Delegate EXACTLY ONE step of your plan"
		- Explicit: "3 AGENT steps = 3 spawn_agent calls"
		- Remove "multi-step sub-task" wording that invited bundling
		- briefing: clarify as static context only (credentials, paths, instructions)
		  Dynamic findings from prior steps → context_transfer, not briefing
		Planning Phase 2 prompt:
		- Add AGENT scoping rules: each step = one focused unit, not "do everything"
		- Add good/bad examples of AGENT step granularity
		- Show multiple AGENT steps in the format example
		Secretary & server_admin system prompts:
		- Add explicit 1:1 rule with counter-example
		- Show correct multi-agent execution pattern with code example
		- Clarify briefing vs context_transfer boundary everywhere
		Persona:
		- "ONE PLAN STEP = ONE spawn_agent CALL" as first sentence in SUB-AGENTS
		- Field descriptions tightened: briefing = static, context_transfer = dynamic
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 17 Apr
	b9bef33 Subagent system prompt rework: separate from parent, briefing as system context
		run_ephemeral:
		- Add briefing param (passed from spawn_agent, injected into system prompt)
		- Subagent system prompt is now completely separate from parent's system_prompt:
		  1. profile.subagent_system_prompt (executor persona)
		  2. custom_system_prompt (role specialisation for this task)
		  3. briefing (task context as system-level instruction)
		  Fallback to profile.system_prompt only if subagent_system_prompt is not defined
		spawn_agent:
		- task → user message only (the goal)
		- briefing → system prompt (credentials, context, instructions)
		- system_prompt → role specialisation injected alongside briefing
		- Removed old user-message composition (## Context / ## Task split)
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 17 Apr
	9c8ef3d Fix NameError in _run_planning: session.context → context after refactor
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 17 Apr
	73cab8a Improve subagent system: isolated tools, custom prompts, context transfer, timeout
		AgentProfile:
		- New fields: subagent_tools, subagent_planning_enabled, subagent_system_prompt
		- loader.py: loads subagent_tools/subagent_planning_enabled from config.json, reads optional subagent_system_prompt.txt per profile
		Profiles:
		- Each profile now has a dedicated subagent_tools list (focused subset, no admin tools)
		- subagent_planning_enabled: false (configurable per profile)
		- New subagent_system_prompt.txt per profile with executor-focused instructions
		run_ephemeral:
		- Uses profile.subagent_tools instead of enabled_tools
		- Builds subagent context without persona or profiles block (focused executor)
		- Injects subagent_system_prompt after profile.system_prompt
		- Accepts context_transfer: priming exchange injected before task message
		- Wall-clock timeout (default 5 min) checked per iteration
		- Returns (result_text, completed: bool) instead of bare string
		- Optionally runs planning phase if profile.subagent_planning_enabled
		spawn_agent:
		- Removed briefing param; task is now fully self-contained
		- Added system_prompt param: custom injected prompt for this specific task
		- Auto-reads parent scratchpad context_transfer section via get_section()
		- Result prefixed with [STATUS: completed|limit_reached]
		- Timeout 300s
		scratchpad:
		- Added get_section(session_id, section) helper for cross-session reads
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 17 Apr
	0c3dc98 Planning phases, context compression, and tool improvements
		Agent:
		- Planning now a 3-phase async generator: Analysis → Execution plan → AIHelper critic
		- Yield PlanningStatus events before each phase (UI progress labels)
		- Phase 1 runs with think=True for deeper analysis
		- Phase 2 includes available tool list so executor assignments are accurate
		- Phase 3: independent critic pass validates and corrects TOOL: names against real tool list
		- Planning converted from list return to async generator (fixes token accounting)
		Backend:
		- Context compression threshold: 80% → 70% to trigger earlier
		- Compressor summary prompt: structured sections (goal, work state, key facts, outputs, errors)
		- Terminal output capped at 5000 chars to prevent context flooding
		- Web search: region=wt-wt for DDG, country=ALL for Brave, language=all for SearxNG
		- Scratchpad: mandate writing a 'goal' section at start of multi-step tasks
		- secretary max_iterations: 40→25, temperature: 0.7→0.5
		- server_admin max_iterations: 40→20
		Webclient:
		- ThinkingCard strips <thought> XML tags leaked by Ollama
		- planning_status WS event wired to chat.onPlanningStatus()
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 17 Apr
2026-04-16	b1dd9ca Count AIHelper tokens in session metrics
		Adds prompt/completion token fields to LLMResponse, populated by OllamaBackend.complete(). AIHelper emits AIHelperTokensUsed into the current event sink after each LLM call; run_stream drains it into _subagent_tokens so AIHelper usage is reflected in the turn token delta.
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 16 Apr
	533f9ee Add AIHelper + filesystem query/smart_edit AI actions
		AIHelper (navi/core/ai_helper.py):
		- Reusable LLM utility for AI-enhanced tools: ask() and ask_json()
		- Reads current_model ContextVar (set per-turn) so tools always use the session's active model without extra wiring
		- Robust JSON extraction: strips markdown fences, bracket-matching fallback
		current_model ContextVar (navi/tools/base.py):
		- New ContextVar set by run_stream() and run_ephemeral() after profile is resolved; AIHelper reads it to pick the right model automatically
		filesystem query action:
		- Natural language question about any file, chunked at ~20k tokens of content (~80k chars) with 30-line overlap between chunks
		- Single-chunk: one LLM call; multi-chunk: partial answers accumulated then synthesized in a final call
		filesystem smart_edit action:
		- Natural language edit instruction on files up to ~200k chars
		- LLM outputs JSON patch ops: replace / delete / insert (1-based lines)
		- Ops validated then applied bottom-up to preserve line numbers
		- Returns unified diff of changes; preserves trailing newline
		registry: AIHelper created once, OllamaBackend reused (no double init), FilesystemTool receives ai_helper at construction
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 16 Apr
	59cdf7f Make profile switching autonomous: switch immediately, inform after
		Previously Navi asked for permission before switching profiles. Updated both the injected profiles block in the system prompt and the switch_profile tool description to explicitly say: switch on your own judgment, do not ask, then inform the user which profile is active and why.
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 16 Apr
	62ad39f Add profile discoverability: list_profiles tool + system prompt injection
		- AgentProfile: new short_description (1-line) and full_description (dict with specialization / when_to_use / key_tools) fields
		- All 3 profile configs: structured descriptions added; list_profiles added to enabled_tools
		- _build_system_prompt: now accepts full AgentProfile; injects compact "Available profiles" block into every system prompt so Navi always knows what other profiles exist and when to switch — dynamically, no hardcoding
		- ListProfilesTool: new built-in; returns structured per-profile details (specialization, when_to_use, key_tools); accepts optional profile_id for single-profile lookup
		- registry: register list_profiles_tool after profiles registry is built
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 16 Apr
	af8dfdb Fix metrics: net token delta, subagent aggregation, ContextBar always visible
		- run_stream: track _prev_tokens baseline before turn loop; compute net token cost as (context_tokens - prev) + subagent_tokens for per-message cost
		- run_stream: intercept SubagentComplete in sink drain loop to accumulate subagent token and tool-call counts into the parent turn's totals
		- run_ephemeral: already emitting SubagentComplete (from prior session)
		- msg-meta-row: remove margin-left:auto from .msg-meta-time so time groups inline with elapsed/tools/tokens instead of floating right
		- ContextBar: remove v-if guard so bar is always visible (not only after first LLM response with token data)
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 16 Apr
	a338f8b Add response metrics: elapsed time, tool calls, token count
		Server:
		- Message model: elapsed_seconds, tool_call_count, token_count fields (display-only, excluded from LLM context via exclude_none)
		- StreamEnd event: carries same three fields
		- agent.run_stream: tracks turn start time, counts ToolEvent completions, writes metrics onto the final assistant Message before saving to DB
		- WebSocket: forwards metrics in stream_end payload
		Client:
		- chat.onStreamEnd: attaches elapsed_seconds, tool_call_count, token_count to the streaming message on completion
		- buildMessageList: scans each assistant group for metrics from history
		- AssistantMessage: renders .msg-meta-row below the response — timer icon + Xs · wrench icon + N tools · coins icon + Nk tokens · time (each item only shown if present; time pushed right via margin-left: auto)
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 16 Apr
	ea5766e Persist thinking and plan cards across session reloads
		- Message: add thinking and is_plan fields (display-only, not sent to LLM)
		- Agent main loop: accumulate thinking per iteration, save with assistant message
		- _run_planning: also append plan to session.messages with is_plan=True so UI can render plan cards after page reload (context already had the plan)
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 16 Apr
2026-04-15	23e0a5d Fix Ollama connection leak and empty message bug in agent
		- _iter_stream_guarded: track chunk_task as nullable, cancel in finally block to prevent zombie HTTP connections accumulating under load
		- Final turn: use `content or None` so empty text isn't saved to DB
		- client/index.html: point to new Vue webclient build
		- profiles: add email_manager tool
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 15 Apr
2026-04-14	33b2880 Improve filesystem, web search, context guard, and subagent narration
		filesystem: add find (glob), info (stat), move, append actions; read now supports offset/limit with hard 1MB guard; list shows sizes, dates, optional recursion.
		web_search: retry DDG across auto/html/lite backends; add optional Brave Search API and SearXNG fallbacks configured via .env.
		agent: fix ContextTooLargeError to surface as Navi response instead of raw system error; fix _check_context_size to calculate from remaining budget (window - output_reserve) rather than a fixed 92% threshold.
		persona: add ReAct narration instruction to subagent briefing template.
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 14 Apr
	8c88f49 Fix LLM hang: stop button during prefill, context guard, timeouts
		Root cause: during prefill (processing input tokens), Ollama emits no HTTP chunks. The `async for chunk in stream_complete()` loop body never executes, so stop_event is never checked — Stop button has no effect. Same issue with complete() calls (planning, compression): blocking await with no cancellation path.
		Fixes:
		_iter_stream_guarded() (agent.py, module-level):
		  Wraps any stream_complete() generator. Polls stop_event every 1s while waiting for the next chunk using asyncio.wait() — so Stop works even during multi-minute prefill. On stop or timeout, calls aclose() on the generator which closes the HTTP connection to Ollama → generation halts → GPU drops to idle. Applied to both run_stream() and run_ephemeral().
		_check_context_size() (Agent method):
		  Estimates context tokens (chars/4 + 500 per image) before every LLM call. Raises ContextTooLargeError (new NaviError subclass) at 92% of ollama_num_ctx — before Ollama ever receives the request.
		_run_planning() timeouts:
		  Both complete() calls (phase 1 and 2) wrapped with asyncio.wait_for(). Timeout logged and planning skipped gracefully — execution continues.
		New config (config.py):
		  llm_complete_timeout = 120s
		  llm_stream_first_chunk_timeout = 180s (prefill budget)
		  llm_stream_chunk_timeout = 60s (inter-token budget)
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 14 Apr
	594ad9e Improve planning: two-phase pipeline and orchestrator discipline
		agent.py:
		- _run_planning() now runs two sequential LLM calls:
		  Phase 1 (analysis): reformulate task, identify subtasks and unknowns; skip immediately if DIRECT.
		  Phase 2 (execution plan): assign each subtask an executor — TOOL/AGENT/SELF — using a structured ## Plan format.
		  Phase 2 context = analysis (embedded in system prompt) + last user message only; full history excluded to keep focus on plan structure.
		- Warn in logs when plan lacks TOOL/AGENT/SELF executor assignments.
		persona.txt:
		- MANDATORY sequence: step 0 = scratchpad init before anything else; todo tasks must mirror plan steps exactly (same order, same executors).
		- PLAN → EXECUTION BINDING: explicit rule — never switch an AGENT step to inline execution silently.
		- SCRATCHPAD: initialize sections at task start, not after first tool call; write context to scratchpad before briefing subagents.
		- Fix typo in BRIEFING ("sub-lagent" → "sub-agent").
		- Replace stale Knowledge Retrieval Protocol with accurate one-liner.
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 14 Apr
2026-04-11	b292ddd Strengthen Navi planning/delegation, unify toolsets, isolate subagent scratchpad
		persona.txt:
		- DELEGATION: 'default to spawning, not to doing inline' — stronger default, clearer triggers, explicit when-not-to-spawn rules
		- PLANNING: ties automatic planning phase to mandatory todo(op='set') as first tool call; reconciles pre-loop plan with in-loop execution discipline
		- SCRATCHPAD: new section — when to write, section naming conventions, mandatory read before final answer
		Profiles (secretary, server_admin, smart_home):
		- All three now share the same 18-tool set (each file independent)
		- planning_enabled=True on all three
		- scratchpad and web_search added to smart_home
		- System prompts updated with scratchpad/todo execution discipline sections
		agent.py run_ephemeral:
		- Each subagent gets a unique session ID (subagent_<uuid>) for scratchpad isolation — parallel or sequential subagents no longer share working notes
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 11 Apr
	efb870f Skip planning phase for simple/direct requests
		The planning prompt now asks the model to respond with "DIRECT" if the request doesn't need multiple steps. Added a regex fallback: if the response has no numbered steps it's also discarded. This prevents plan cards appearing for conversational replies that would just duplicate the final message.
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 11 Apr
	fe6d7bc Add planning phase and scratchpad tool for smarter task execution
		- ScratchpadTool: session-scoped working notepad with named sections (write/append/read/clear). Lets Navi capture intermediate findings between tool calls instead of losing track of them.
		- Planning phase: when profile.planning_enabled=True, a fast pre-loop LLM call (think=False, no tools) outlines a numbered plan before any actions are taken. The plan is injected into session context as an assistant message so the model naturally continues from it.
		- PlanReady event + plan_ready WebSocket message + plan card in UI (green-tinted, collapsible, mirroring thinking card design).
		- secretary and server_admin profiles: planning_enabled=True, scratchpad added to enabled_tools, system prompts updated with explicit execution discipline instructions.
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 11 Apr
2026-04-10	86402e0 Add stop button and fix context compression hang
		Stop generation:
		- Client: send button toggles to red ■ during streaming; sends {type:stop} via WS
		- Server: _stream_recv concurrently reads incoming messages during streaming using asyncio.wait — stop signal is handled immediately without polling
		- Cooperative stop via asyncio.Event (current_stop_event ContextVar): agent breaks out of LLM async-for cleanly so aclose() fires → Ollama stream closes gracefully, model stays in VRAM. No task.cancel() which would eject the model.
		- StreamStopped event propagates through run_stream/run_ephemeral; sub-agents stop via the same shared stop_event inherited through task context
		Context compression fix:
		- compress_context passes think=False to llm.complete() — no extended reasoning during summarization which caused GPU hang
		- Input truncated to 12k chars before sending to summarizer
		- LLMBackend.complete() / OllamaBackend.complete() accept think: bool | None override
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 10 Apr
	5386b61 Fix profile switch: reload tools/schema after switch_profile tool call
		switch_profile updates profile_id in DB, but run_stream() held a stale local session object — the final save would overwrite the change, and subsequent LLM calls in the same turn still used the old tool schemas.
		After each tool-call iteration, compare DB profile_id with the local session object. On mismatch: update session.profile_id, reload profile, tools, tool_schemas, and llm backend so the next LLM call gets the correct schema and the final save preserves the new profile.
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 10 Apr
	f68b071 Dynamic system prompt — inject per-call instead of storing in context
		System prompt is no longer stored in session.context. Instead, _build_context() prepends the current profile's system prompt fresh on every LLM call. This means profile switches take effect immediately on the next message — no stale prompt lingering in stored context.
		Also strips any existing system messages from context for migration safety (old sessions that have one stored will still work).
		_with_memory() removed, replaced by _build_context(context, profile, mem). run_ephemeral() context no longer includes system message either.
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 10 Apr
	1e8b65e Major feature batch: visibility, planning, file uploads, streaming
		- stream_complete(): streaming with tools for all LLM turns — thinking now streams as ThinkingDelta/ThinkingEnd in real-time during tool-selection turns, not just on the final response
		- todo built-in tool: session-scoped plan manager (set/view/update/clear); persona + all profiles updated with mandatory planning instructions
		- TurnThinking event: sub-agent thinking forwarded to parent sink as a collapsible block in the spawn_agent card
		- File uploads: non-image files uploaded via XHR, shown as badges in message bubble; SVG treated as regular file (not base64 image)
		- session_files: POST /sessions/{id}/files, TTL cleanup, forbidden exts
		- WebSocket reconnect: _AgentRun broadcast pattern, re-attach mid-stream
		- UI: favicon, sidebar logo, turn-thinking cards, subagent thinking blocks, token counter, draft persistence, file progress bar
		- Removed AgentNote (content is always None alongside tool_calls)
		- Ollama stream_complete: tool_calls captured from non-final chunk (done=False)
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 10 Apr
2026-04-09	4112e36 Live tool visibility: pending cards, sub-agent step log
		Backend:
		- ToolStarted event: emitted before tool execution begins so client can render a pending card with spinner immediately
		- ToolEvent gains is_subagent flag; ToolStarted same
		- current_event_sink ContextVar in tools/base.py — run_stream() sets it to an asyncio.Queue before create_task(); run_ephemeral() reads it and puts ToolStarted/ToolEvent into the queue as each sub-agent step runs
		- run_stream() tool loop: sequential execution via create_task() + polling drain loop (20ms sleep); yields ToolStarted → sub-agent events from sink → ToolEvent (completed) for each tool call
		- run_ephemeral() rewritten to inline sequential tool execution with sink emission (replaces _execute_tool_calls gather)
		- _run_single_tool() helper extracted for run_stream()
		- websocket.py handles tool_started and adds is_subagent to tool_call
		Frontend:
		- appendPendingToolCard(): creates card with spinner; spawn_agent opens body immediately to show sub-agent log as it fills
		- finalizeToolCard(): fills result, removes spinner, adds toggle; strips "[Sub-agent result — ...]" reminder prefix from displayed text
		- appendSubagentStep() / finalizeSubagentStep(): live step log inside spawn_agent card — each sub-agent tool call gets a ↳ row
		- app.js: tool_started → pending card; tool_call → finalize card; is_subagent routing to sub-step vs main card; abandonStream() resets pendingToolCard/pendingSubStep
		- CSS: .spinner-inline for card headers; .subagent-log / .subagent-step for nested step display; .tool-body-open for always-open spawn_agent body; .tool-card.pending suppresses chevron
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 9 Apr
	2efec37 Add spawn_agent: sub-agent delegation with isolated context
		- Agent.run_ephemeral() — runs a sub-agent loop without a persistent session; accepts exclude_tools to block recursion; logs start/complete
		- session_store made Optional in Agent.__init__ (None for ephemeral runs)
		- SpawnAgentTool (navi/tools/spawn_agent.py): spawns an isolated Agent for a focused task; resolves profile from parent session via ContextVar; blocks spawn_agent recursion via exclude_tools=["spawn_agent"]
		- build_default_registries() accepts session_store param; registers SpawnAgentTool after BackendRegistry is built (patches _backend_registry)
		- deps.py passes _session_store to build_default_registries
		- All profiles: spawn_agent added to enabled_tools, max_iterations 10→30
		- persona.txt: DELEGATION section — when/how to use spawn_agent
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 9 Apr
	108f65d SSH connection pooling: per-session, 20-minute TTL
		- Pool keyed by session_id:host:port:username — parallel sessions share no state even when targeting the same server
		- Per-key asyncio.Lock prevents concurrent connection creation races
		- TTL (20 min) and is_closing() checked on every access; expired/closed connections are evicted and replaced transparently
		- On disconnect error during command execution: evict + retry once with fresh connection
		- current_session_id ContextVar (set by Agent before tool calls) is read in ssh_exec to build the pool key without changing tool signatures
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 9 Apr
	56611d7 Add long-term user memory system
		Architecture:
		- navi/memory/store.py: MemoryStore backed by SQLite (memory_facts, memory_summary, session_memory_state tables in navi.db)
		- navi/memory/extractor.py: LLM-based fact extraction from sessions + summary regeneration (triggered after session goes idle >30 min)
		- Fact upsert uses UNIQUE(category, key) — same key always overwrites, no duplicates or stale contradictions
		- Keyword search across category + key + value (LIKE-based, no extra deps)
		Context injection:
		- Memory summary injected as an ephemeral system message on every LLM call via Agent._with_memory() — never persisted to session.context
		Tools (all profiles):
		- memory_search(query): keyword search against fact DB; persona instructs model to call it at session start and before personal-context questions
		- memory_forget(key, category?): delete a specific fact on user request
		Extraction trigger:
		- On new session creation, fire-and-forget background task checks all sessions idle >30 min with unprocessed messages → runs extraction
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 9 Apr
	7b1fa2c Fix context loss: ensure system prompt is always present in LLM context
		Replaced `if not session.context:` with a role-based check so the system message is inserted whenever it is absent — not just for brand-new sessions.
		Root cause: backward-compat sessions (context column was empty) had their context initialised from session.messages, which never contains a system message. The old check (`if not session.context:`) saw a non-empty list and skipped the system prompt, so every subsequent request ran without it — Navi had no persona and no profile instructions.
		Also add context_token_count field to Session (follow-up for token counter fix — persistence wiring comes in next commit).
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 9 Apr
2026-04-08	bdd3786 Review fixes: events module, circular imports, deps, vision-aware compression
		- Extract all AgentEvent dataclasses to navi/core/events.py; import from there in agent.py and __init__.py — eliminates circular import between workers and core
		- workers/compressor.py: remove runtime import hack, use navi.core.events
		- workers/base.py: WorkerResult.events typed as list[AgentEvent] (was Any)
		- api/deps.py: replace @lru_cache on mutable list with module-level singletons (_registries, _workers)
		- core/compressor.py: _format_for_summary returns (text, images); images passed to summarization LLM so vision models describe them in summary; non-vision models silently ignore the images field; docstring updated
		- client/js/app.js: add comment explaining is_summary backward compat branch
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 8 Apr
2026-04-08	261459a Separate display history from LLM context; formalize worker system
		Architecture change:
		- session.messages: full display history, never modified by compression
		- session.context: what the LLM sees, may be compressed by workers
		- System messages go only into context (not display history)
		- Image injections (synthetic) go only into context
		- User/assistant/tool messages go into both
		SQLite: add context column with backward-compat migration (empty context → initialized from messages on load)
		Workers (navi/workers/):
		- Worker ABC + WorkerContext + WorkerResult (base.py)
		- CompressionWorker: compresses session.context when above threshold
		- build_default_workers() returns [CompressionWorker()]
		- Agent accepts workers list, runs them after StreamEnd
		- Workers injected via deps.py get_workers() (lru_cached singleton)
		- WebSocket agent construction also receives workers
		Compressor: compress_context() now takes context[], not messages[]
		Config: context_keep_recent 6 → 10
		Agent: _run_workers() collects events from all workers and yields them
		Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
		Eugene Sukhodolskiy committed on 8 Apr
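
Illustrative sketch for commits 8c88f49 and 23e0a5d (not the code from agent.py, which this history page does not show). It is one plausible way to build the guarded stream wrapper those messages describe: wait for the next chunk with asyncio.wait(), poll a stop event about once per second, apply a longer budget to the first chunk than to later ones, and cancel the pending chunk task in a finally block before closing the generator. The function name, parameter names, and the choice to end iteration quietly on stop are assumptions; the 180s / 60s / 1s defaults are the values quoted in 8c88f49.

```python
import asyncio
from typing import AsyncGenerator, Optional, TypeVar

T = TypeVar("T")


async def iter_stream_guarded(
    stream: AsyncGenerator[T, None],
    stop_event: asyncio.Event,
    first_chunk_timeout: float = 180.0,  # prefill budget (value from 8c88f49)
    chunk_timeout: float = 60.0,         # inter-token budget (value from 8c88f49)
    poll_interval: float = 1.0,          # how often stop_event is re-checked
) -> AsyncGenerator[T, None]:
    """Yield chunks from `stream`, ending early if `stop_event` is set.

    Hypothetical helper: raises asyncio.TimeoutError when no chunk arrives
    within the current budget, and simply stops iterating on a stop request.
    """
    loop = asyncio.get_running_loop()
    chunk_task: Optional[asyncio.Future] = None
    budget = first_chunk_timeout
    try:
        while True:
            # Run the next-chunk read as a task so we can wait on it with a
            # short timeout instead of blocking until the backend responds.
            chunk_task = asyncio.ensure_future(stream.__anext__())
            deadline = loop.time() + budget
            while not chunk_task.done():
                if stop_event.is_set():
                    return  # cleanup happens in the finally block
                remaining = deadline - loop.time()
                if remaining <= 0:
                    raise asyncio.TimeoutError("no chunk within budget")
                await asyncio.wait({chunk_task}, timeout=min(poll_interval, remaining))
            try:
                chunk = chunk_task.result()
            except StopAsyncIteration:
                return  # underlying stream finished normally
            chunk_task = None  # nothing pending while the consumer holds the chunk
            yield chunk
            budget = chunk_timeout
    finally:
        # Cancel a still-pending read first: aclose() on a generator that is
        # mid-__anext__ raises RuntimeError, and a leaked task would keep the
        # underlying connection open.
        if chunk_task is not None and not chunk_task.done():
            chunk_task.cancel()
            try:
                await chunk_task
            except (asyncio.CancelledError, Exception):
                pass
        await stream.aclose()
```

A caller would wrap the backend stream, for example `async for chunk in iter_stream_guarded(llm.stream_complete(...), stop_event): ...`, where `llm.stream_complete` stands in for whatever async generator the backend exposes. After the loop ends, checking `stop_event.is_set()` distinguishes a user stop from normal completion.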