root/navi-1

History for navi-1 / navi / core

2026-07-13	271800d Browse files » mcp: dedicated runner task per client (fix cross-task cancel-scope on shutdown) ... MCP client SDK transports (stdio/sse/streamable_http) + ClientSession are anyio task groups whose cancel scopes require __aenter__/__aexit__ in the SAME asyncio task. McpClient entered the transport in one task (lifespan connect / health-check reconnect / request retry) and exited it in another (lifespan teardown) -> RuntimeError: Attempted to exit cancel scope in a different task than it was entered in. Refactor McpClient to own a single long-running runner task that holds the AsyncExitStack and performs ALL transport enter/exit + list_tools/call_tool. The public async API (connect/disconnect/list_tools/call_tool/mark_disconnected) just enqueues a _Cmd and awaits a Future, so callers from any task no longer cross cancel-scope boundaries. connected/instructions mirror from the runner onto the instance to stay sync-readable. disconnect() enqueues a stop command and awaits shield(runner) so teardown isn't interrupted by lifespan cancel. Also call mcp_manager.stop_health_check() BEFORE disconnect_all() in AppContainer.shutdown() so the health-check task cannot enqueue onto a client whose runner is being torn down. mark_disconnected() is now async (queued) and its manager caller updated. Regression test: connect in one task, list_tools in a second, disconnect in a third — the exact scenario that raised the RuntimeError before. Eugene Sukhodolskiy committed 2 days ago
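The runner-task idea in this commit can be illustrated with a minimal sketch. It is not the project's McpClient: `RunnerClient`, `fake_transport`, and the `(op, arg, future)` tuple are hypothetical stand-ins, and only the shape (one task owns the `AsyncExitStack`, callers enqueue a command and await a Future) follows the commit message.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager

tasks_seen = []  # records which task entered and exited the transport


@asynccontextmanager
async def fake_transport():
    # Hypothetical transport: stands in for an SDK transport + ClientSession.
    tasks_seen.append(asyncio.current_task())

    async def call(payload):
        return payload.upper()

    try:
        yield call
    finally:
        tasks_seen.append(asyncio.current_task())


class RunnerClient:
    """Illustrative only: a single runner task owns the exit stack."""

    def __init__(self, transport_factory):
        self._factory = transport_factory
        self._queue = asyncio.Queue()
        self._runner = None
        self.connected = False  # mirrored from the runner, readable synchronously

    async def _run(self):
        # Every enter/exit of the transport happens inside this one task.
        async with AsyncExitStack() as stack:
            transport = None
            while True:
                op, arg, fut = await self._queue.get()
                try:
                    if op == "connect":
                        transport = await stack.enter_async_context(self._factory())
                        self.connected = True
                        fut.set_result(None)
                    elif op == "call":
                        fut.set_result(await transport(arg))
                    elif op == "stop":
                        fut.set_result(None)
                        break
                except Exception as exc:  # report failures to the caller
                    fut.set_exception(exc)
        self.connected = False

    async def _submit(self, op, arg=None):
        if self._runner is None:
            self._runner = asyncio.create_task(self._run())
        fut = asyncio.get_running_loop().create_future()
        await self._queue.put((op, arg, fut))
        return await fut

    async def connect(self):
        await self._submit("connect")

    async def call(self, payload):
        return await self._submit("call", payload)

    async def disconnect(self):
        await self._submit("stop")
        # shield: a cancelled caller should not interrupt the runner's teardown
        await asyncio.shield(self._runner)


async def demo():
    # Three different caller tasks, mirroring the regression scenario.
    client = RunnerClient(fake_transport)
    await asyncio.create_task(client.connect())
    result = await asyncio.create_task(client.call("ping"))
    await asyncio.create_task(client.disconnect())
    return result, client.connected
```

Running `asyncio.run(demo())` drives connect, call, and disconnect from three separate tasks, while the transport is entered and exited by the runner task only.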
	55fa7ae Browse files » compressor: target hysteresis — shrink to 65% after compression, not just below the trigger ... Compression triggered at 90% but left the context just under the trigger, so a fixed keep_recent (navi_code: 12 turns ≈ 104 messages) re-triggered a couple of messages later — the context yo-yoed at the trigger line instead of gaining headroom. Add context_compression_target (0.65): in turn-based (preturn) mode compress_context shrinks keep_recent until the verbatim kept region fits 65% of the window, folding the extra turns into the same single summary LLM call; a safety net in compress_session token-budget-truncates if the kept region alone still exceeds the target (midturn kept a huge in-flight turn, or the keep_recent floor can't fit). Trigger 90% → target 65% leaves real headroom. Eugene Sukhodolskiy committed 2 days ago
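The trigger/target gap described above can be sketched as a small loop. The function name, the per-turn token list, and the `floor` parameter are assumptions for illustration; only the 0.65 target and the idea of shrinking keep_recent until the kept region fits come from the commit message.

```python
def shrink_keep_recent(turn_tokens, keep_recent, window, target=0.65, floor=1):
    """Return the largest keep_recent (>= floor) whose kept turns fit the target.

    turn_tokens: hypothetical per-turn token counts, oldest first.
    """
    budget = int(window * target)
    while keep_recent > floor and sum(turn_tokens[-keep_recent:]) > budget:
        keep_recent -= 1  # fold one more turn into the summary
    return keep_recent
```

If the loop stops at the floor while the kept region is still over budget, that is the case the commit hands to the token-budget safety net in compress_session.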
	2305129 Browse files » Small-model autonomy: OUTPUT DISCIPLINE, milestone todo, end-of-turn interception ... Two successive passes at one problem: small models (12–30B) in navi_code lose autonomy on hard steps. They "stop to chat" instead of acting, and the distance between plan items is too large. Pass 1 (P1–P3): - P1: OUTPUT DISCIPLINE in the navi_code system prompt: "act, don't announce", without few-shot anti-examples. Turn contract: a response without tool_calls ends the turn, so announcing an intention in text kills autonomy; the rules force the model to call the tool in the same turn. - P2: flat todo + milestone group label + decomposition. _parse_plan_steps returns list[tuple[milestone, text]]; a milestone is a grouping label (not an entity, no status), and "done" is computed at render time; sub-steps = more flat steps (no nesting). The TUI side panel groups by milestone (flat fallback when the milestone is empty). Plan depth: max 15→20 + a decomposition rule. - P3: adaptive re-plan for a "long step": a "split the step" nudge when a step stays in_progress for ≥ a threshold number of iterations without a todo change (threshold 4, earlier than the general anti-stall warning at 8). Pass 2 (steps 1–3, after the 31b vs 12b cloud test): Structural root cause: the whole rescue mechanism (anti-stall warning with an explicit suggestion to reflect, adaptive re-plan) lives only inside the tool loop. The nudge is injected in pre_turn of the next iteration, which does not exist when the model "stopped to chat" (the turn closed via return before post_turn). 31b got stuck by "continuing tool iterations" → pushed on until the warning → recovered; 12b got stuck by "going silent in text" → bypassed every nudge. - Step 1: end-of-turn interception. If the model emitted bare text but the todo has open steps (pending/in_progress) and the limit is not exhausted, do NOT emit StreamEnd; instead save the assistant text to session.context, queue a system nudge, and continue (no StreamEnd, no workers, consistent with multi-iteration tool turns). The final_interceptions counter lives on AgentTurnContext, with limit final_intercept_limit (default 2) and escalating firmness (soft → "second stop"). has_open_steps in todo.py: empty todo → False (protects casual messages); failed/skipped are terminal. Profile flags final_intercept_enabled/limit in base.py + loader + admin. - Step 2: hard reflect trigger in the prompt: "~3 tool attempts on the same step without progress → call reflect IN THIS TURN (tool call, not reasoning aloud)". - Step 3: open up replan for stalls: "call replan when reflect showed the whole approach is dead (not one failed step, but the approach itself)". Tests: 874 passed, 1 skipped. New: has_open_steps (5), final intercept (5), milestone grouping, adaptive long-step nudge, step parser with milestone marker. cloud: navi_code model → gemma4:31b-cloud to test the hypothesis (31b acknowledged being stuck, 12b did not); the .env cloud-host is already gitignored. Co-Authored-By: Claude <noreply@anthropic.com> Eugene Sukhodolskiy committed 2 days ago
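The end-of-turn interception rule from this commit can be written as a small predicate. This is a sketch under assumptions: the todo is modeled as a list of dicts with a "status" key, and `should_intercept` is a hypothetical helper; only the open statuses, the empty-todo behavior, and the default limit of 2 come from the commit message.

```python
OPEN_STATUSES = {"pending", "in_progress"}  # failed/skipped/done count as terminal


def has_open_steps(todo):
    """True when at least one step is still pending or in progress.

    An empty todo returns False, so casual messages are never intercepted.
    """
    return any(step["status"] in OPEN_STATUSES for step in todo)


def should_intercept(has_tool_calls, todo, interceptions, limit=2):
    """Decide whether a bare-text reply should be nudged instead of ending the turn."""
    return (not has_tool_calls) and has_open_steps(todo) and interceptions < limit
```

A caller would increment its interception counter each time this returns True, so the third bare-text reply in a turn ends the turn normally.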
2026-07-12	c65c664 Browse files » compressor: split in-flight turn in midturn mode + flush incrementally ... Two related fixes for long autonomous turns in planning profiles (navi_code): A1 — partition_messages turn-based branch held the in-flight (current) turn verbatim when turns > keep_recent, so midturn compression was shallow (e.g. 150 -> 141 messages). Now, when keep_recent_messages is set, the in-flight turn is split like partition_current_turn_messages: head (user request) + tail (recent tool steps) kept, middle summarized. The adaptive swap is disabled in midturn mode (it could move the in-flight turn into old_turns and summarize the current request whole). First branch (len(turns) <= keep_recent) is untouched, so locked-in midturn tests hold. B1 — agent.run_stream had no try/finally around the for-loop, so an asyncio.CancelledError (server restart/shutdown) unwound the stack with no flush: all in-memory turn messages (sequence_number < 0) were lost, only the user message survived. Add incremental save() after planning, after the assistant tool-call decision, and after each tool result, so a crash loses at most the single in-flight tool call, not the whole turn. B2 (try/except safety-net) was dropped: B1 leaves no window where an append is unsaved. Tests: midturn split with many turns (partition + compress_session), and crash/cancel persistence via a snapshot session store that mirrors the DB boundary. Co-Authored-By: Claude <noreply@anthropic.com> Eugene Sukhodolskiy committed 3 days ago
	b1693d3 Browse files » replan: integrated mid-task re-planning tool ... Add a replan tool that re-runs the planner over the live session context + todo + scratchpad when the plan's structure is stale due to discoveries (NOT failed steps -- that stays [Adaptive re-plan]). Integrated approach: PlanningEngine.run gains is_replan/replan_context (suppresses DIRECT shortcut and observe-skip, frames Phase 1 as a revision); ReplanRunner packs reason/goal/todo/findings/errors and captures PlanReady; the tool is exposed to navi_code/developer/tool_developer via a current_replan_runner ContextVar set per-iteration in run_stream (correct after switch_profile). New plan replaces the todo. Lazy events import breaks the navi.tools -> navi.core -> navi.tools cycle. Eugene Sukhodolskiy committed 3 days ago
	e865a93 Browse files » recall: carry self-instruction (message) on the recall_update wire ... The recall card could show call_type/trigger_at but not the self-instruction (additional_context_message) — it was absent from RecallUpdate.to_wire, so the user couldn't see what future-self was about to do at the scheduled or fired moment. Extend the wire payload. - events.RecallUpdate: add `message` field; to_wire emits "message". - scheduler._publish_recall_update: accept and forward `message`. - Publish sites carry message=recall.additional_context_message: schedule_recall (scheduled) and orchestrator._finalize_recall (rescheduled / fired / cancelled). manage_recall cancel/skip omit it (no recall object handy; already visible in the prior scheduled card). - TUI RecallRenderer: preview the message (first line + "(+N lines)", capped at 80) on a `msg:` body line when present. - Tests: RecallUpdate.to_wire carries message (defaults None); renderer preview (single/multiline/truncate/empty) and scheduled-card renders it. Co-Authored-By: Claude <noreply@anthropic.com> Eugene Sukhodolskiy committed 3 days ago
	1b91fcf Browse files » compression: profile-aware worker + real-token baseline estimator ... Item 2 — thread the active profile into CompressionWorker so compress_context applies per-profile overrides (compression_keep_recent, compression_max_tokens, compression_prompt_file). navi_code now compresses with keep_recent=12 instead of the global 8. Item 1 — estimate the next LLM call's context from the real prompt_tokens of the previous call (bulk) plus a heuristic delta for messages appended since, replacing the chars//3 estimate that undercounts code-heavy tool output and fired midturn compression too late (Navi kept working until the window was exhausted). Baseline is recorded after each stream and cleared after compression; check_context_size and the midturn gate use it, with a heuristic fallback when no baseline exists or the context shrank. Co-Authored-By: Claude <noreply@anthropic.com> Eugene Sukhodolskiy committed 3 days ago
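The baseline estimator described in Item 1 might look roughly like the sketch below. The function and parameter names are hypothetical, messages are modeled as plain strings, and using chars//3 for the delta is an assumption (the commit only says "a heuristic delta").

```python
def estimate_context_tokens(messages, baseline_tokens=None, baseline_count=0):
    """Estimate the next call's prompt size.

    baseline_tokens: real prompt_tokens reported for the previous call (or None).
    baseline_count:  how many messages that previous call covered.
    """
    def heuristic(msgs):
        return sum(len(m) for m in msgs) // 3

    # Fallback: no baseline yet, or the context shrank since it was recorded.
    if baseline_tokens is None or len(messages) < baseline_count:
        return heuristic(messages)
    # Bulk from the real count, plus a heuristic for messages appended since.
    return baseline_tokens + heuristic(messages[baseline_count:])
```

The point of the design is that the heuristic only has to cover the few messages appended since the last call, so its error stays small even for code-heavy tool output.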
	3d9c3da Browse files » compression: fix auto-compress no-op on few-huge-messages + honest status ... Root cause: the compression gate (should_compress) measures tokens, but the partition measures message/turn count, and CompressionStarted was emitted before the attempt. For navi_code's "few very large messages" shape (one big file read = 1 user + assistant + 1 huge tool result, 66k tokens in 3 messages) the gate fired, the UI showed "compression", but partition returned to_summarize=[] -> compress_context None -> nothing shrank. The agent kept going until the window overflowed. It wasn't running "during" compression — there was no compression, just a no-op the user mistook for one. A. Per-message head/tail truncation in context_builder.build(): oversized tool/assistant messages (over context_message_token_budget, 0=num_ctx//6) are capped head+marker+tail in the LLM view only (model_copy — stored history and reloads are never affected). A single huge tool result can no longer alone blow the window; user/system messages are never truncated. B. Token-budget hard-truncate fallback in compress_session: when partition no-ops but tokens exceed the threshold, drop oldest turns to num_ctx*0.5. _hard_truncate is now token-aware (was a fixed message-count floor that no-oped on <=6 messages even when huge). New would_compress() predicts compress_session's real outcome with no LLM call. C. Honest CompressionStarted: _compression_events_midturn/_preturn emit it only after would_compress() confirms the partition (or token-budget fallback) can actually shrink the stored context — no more "compression" status with no ContextCompressed to follow. Bonus: post-turn CompressionWorker now passes keep_recent_messages=max(12, context_keep_recent*2), matching the midturn path, so a single long autonomous turn compresses post-turn too (was always a no-op). Tests (+14): would_compress agreement, token-budget fallback, token-aware hard_truncate, build() truncation (preserves user, no mutation, head+tail), agent no-CompressionStarted-when-nothing-to-compress, worker single-long-turn. Co-Authored-By: Claude <noreply@anthropic.com> Eugene Sukhodolskiy committed 3 days ago
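The head+marker+tail cap from part A can be sketched as follows. The helper name and the marker text are hypothetical, and the budget is counted in characters here as a simplifying assumption, whereas the commit budgets in tokens.

```python
def cap_head_tail(text, budget, marker="\n[... truncated ...]\n"):
    """Return text unchanged if it fits, else its head and tail around a marker.

    The input is never mutated, which mirrors truncating only the LLM view.
    """
    if len(text) <= budget:
        return text
    keep = max(budget - len(marker), 0)
    head = (keep + 1) // 2
    tail = keep - head
    # text[-0:] would return the whole string, so guard the empty-tail case
    return text[:head] + marker + (text[-tail:] if tail else "")
```

Keeping both ends preserves the start of a file read and the most recent output lines, which are usually the parts a model needs to continue.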
2026-07-11	8615de3 Browse files » Navi Code: force context compression via typed /compact control message ... Forced /compact previously sent a chat message, which ran a full agent turn that only produced summary text instead of running the real context compressor. Worse, even when wired correctly, forced compact always reported "Nothing to compact yet — the context is still small" regardless of context size: the typical navi_code shape is a single long autonomous turn (1 user message + many tool iterations = one turn), and partition_messages finds nothing to summarize when turns <= keep_recent. The midturn auto-compress path already bypassed this via keep_recent_messages (intra-turn split), but compact_stream passed keep_recent_messages=None, so the fallback was disabled. Changes: - WS protocol: {"type":"compact"} control message (distinct from {"type": "message"}); rejected while an agent turn is active to avoid racing the agent. - Agent.compact_stream: forced compression that bypasses the token threshold but still runs the real compressor; passes keep_recent_messages=max(12, context_keep_recent*2) so a single long turn compresses via intra-turn split (mirrors midturn auto-compress). Raises NothingToCompactError when context is genuinely too small. - Orchestrator.run_compact + clear_run: broadcast agent events to subscribers, end with done marker, surface NothingToCompactError as an error event. - Terminal client: ws_client.send accepts str|dict; CompactCommand enqueues {"type":"compact"}; TUI distinguishes forced compact (no stream_start) from in-turn auto-compress via the _streaming flag. - Tests: compact_stream (incl. single-long-turn regression), WS handler dispatch/rejection, run_compact event/error broadcasting, ws_client send, compact command. Co-Authored-By: Claude <noreply@anthropic.com> Eugene Sukhodolskiy committed 4 days ago
	acc3adc Browse files » tui: show the changed step's text in the todo update call card ... The todo update call card (→ todo · #3 → done) now shows the text of the step being changed, plus the validation when present. The LLM's update args carry only the index, not the task text, so the text is read from the current plan row before the tool runs and attached to ToolStarted.metadata (client-rendering only, backward-compatible). Backend: - events.ToolStarted gains a metadata dict (mirrors ToolEvent) → to_wire. - navi/tools/todo: step_text_for_update(index, ctx) resolves the step text via _sid/_uid (so sub-agent isolation holds), started_metadata_for_call wraps it for both emit sites. - agent.py (parent) and subagent_runner.py (sub-agent) enrich ToolStarted via the shared helper before emitting. Renderer: - TodoStartedRenderer update card reads msg.metadata.step_text → shows the step text + validation; falls back to 'no validation' on history replay. - Removed the step-text line from the result card (it now lives in the call card) — result card is back to the compact 'plan · X/N done' summary. Co-Authored-By: Claude <noreply@anthropic.com> Eugene Sukhodolskiy committed 4 days ago
	9784ce7 Browse files » Isolate sub-agent todo + render live todo in TUI side panel ... Phase 1 — sub-agent todo isolation (backend): - Add current_todo_session_id ContextVar; subagent_runner scopes it to the sub-agent's ephemeral run id so its auto-populated plan and todo updates land in an isolated KV row instead of clobbering the parent session's todo (which the parent's goal-anchoring reads every iteration). - todo._sid() and planning.set_tasks prefer current_todo_session_id; the parent run leaves it unset, so all existing todo consumers (anti-stall, goal anchor, get_progress_message) behave exactly as before. Phase 2 — live todo in the TUI right column: - TodoUpdated event + emit it from the agent loop after planning auto-populate and after each tool-execution turn. - GET /sessions/{id}/todos reads the parent session's todo KV row (explicit user_id/session_id, optional injected kv). - api.get_todos + TodoList/TodoPanel widgets: status-coloured markers (pending dim, in_progress accent+bold, done success, failed error, skipped dim), progress header, scrollable panel below the auto-height info block. - Hybrid delivery: REST seeds the panel on attach/switch, todo_updated WS events update it live. Sub-agent todos as nested sub-lists is deferred to a later phase. Co-Authored-By: Claude <noreply@anthropic.com> Eugene Sukhodolskiy committed 4 days ago
2026-07-10	2c85e90 Browse files » tui: show the currently-served model in the status panel ... The status panel's Model line was fed the global ollama_default_model, not the session/profile model, and the server never told the client which model actually served a call. Now: - Backends stamp the resolved model onto LLMChunk (first chunk) / LLMResponse. The fallback backend reports the model that survived its server+model priority list (may differ from the profile's first choice). - New ModelInfo event ({"type":"model_info","model":...}) emitted once per turn from agent._consume_stream, re-emitted only when the model changes across iterations. Additive WS event — old clients ignore it. - TUI: attach_session/switch fetch the profile's configured model (first of profile.model) via api.get_profile_model so the panel shows a value before the first request; model_info then refines it to the actually-served model. Not forwarded to the chat panel. raw CLI prints "[model] ...". Co-Authored-By: Claude <noreply@anthropic.com> Eugene Sukhodolskiy committed 5 days ago
2026-07-09	fc0546c Browse files » agent: cwd-aware memory fact filter for bounded autonomy ... Long-term memory stored context-dependent path facts (e.g. "project_root → /home/.../navi-1") as global user facts. search_facts injected them into any session, so when working in another project the agent was told "the project root is navi-1" and drifted there. When scope_boundary_enabled and a session cwd is set, _memory_facts_msg now drops facts whose value is an absolute path outside the session cwd tree. Facts are kept when working inside that path (then they are correct), and non-path/relative facts always pass. Free flight stays reproducible by toggling the flag off. No facts deleted, extractor untouched. Co-Authored-By: Claude <noreply@anthropic.com> Eugene Sukhodolskiy committed 6 days ago
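One possible reading of the cwd-aware filter is sketched below. The helper name is hypothetical, and the rule encodes an assumption about the commit message: an absolute-path fact is kept when it lies inside the session cwd tree or when the cwd lies inside the fact's path, and dropped otherwise.

```python
from pathlib import PurePosixPath


def keep_fact(value, cwd):
    """Return True if a memory fact should be injected for a session at cwd."""
    path = PurePosixPath(value)
    if not path.is_absolute():
        return True  # non-path and relative facts always pass
    cwd_path = PurePosixPath(cwd)
    # Assumed rule: keep when either path contains the other.
    return path.is_relative_to(cwd_path) or cwd_path.is_relative_to(path)
```

Under this reading, a "project_root → /home/u/navi-1" fact is dropped for a session in /home/u/other but kept for a session anywhere inside /home/u/navi-1.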
	2e06b02 Browse files » agent: bounded autonomy — scope boundary + observe-vs-act ... navi_code had unwanted "free flight": an observe request ("look at a directory") triggered the full Phase 3 plan with milestones + auto-todo, and goal_anchoring then drove the agent to finish those steps, climbing into sibling projects and executing milestone docs it found. Two toggleable, default-off profile flags (on for navi_code): - scope_boundary_enabled: injects a standing system message keeping the agent within the literally requested scope; forbids acting on discovered TODO/roadmap/milestone docs (report only). - observe_skips_plan_enabled: Phase 1 classifies MODE: observe|act; an observe request skips Phase 2/3 — no multi-step plan, no auto-todo, no "execute step by step" prompt. The agent just gathers info and answers. Independent of force_plan (observe on the first message still skips). Free flight stays reproducible by flipping both flags off. Co-Authored-By: Claude <noreply@anthropic.com> Eugene Sukhodolskiy committed 6 days ago
	f78ede1 Browse files » session store: lazy persistence — no empty sessions in DB ... POST /sessions no longer inserts a DB row. The Session is registered in an in-memory _pending registry on PgSessionStore and only upserted on the first save() (first user message / meaningful state change). Empty sessions that never receive a message never reach the DB and vanish on server restart or via the hourly pending sweep. - pg_session_store: _pending dict + lock; create() registers, get() checks _pending first, save() upserts the sessions row (INSERT ... ON CONFLICT) and pops _pending so the session_messages FK is satisfied; sweep_pending() drops abandoned entries; pending_sweep_loop() background task. - main.py: start/cancel pending_sweep_loop in lifespan. - tests: 7 new tests for create/get/save/list/sweep semantics; updated existing save() test comments for the upsert. Eugene Sukhodolskiy committed 6 days ago
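The pending-registry flow can be modeled in memory as below. This is a sketch, not PgSessionStore: a dict stands in for the database, sessions are plain dicts, locking is omitted, and the age-based `max_age_s` cutoff in the sweep is an assumption (the commit only says abandoned entries are dropped hourly).

```python
import time


class LazySessionStore:
    """Illustrative store: sessions reach the 'DB' only on first save()."""

    def __init__(self):
        self._pending = {}  # session_id -> (created_at, session)
        self._db = {}       # stand-in for the sessions table

    def create(self, session_id, now=None):
        session = {"id": session_id, "messages": []}
        created = time.monotonic() if now is None else now
        self._pending[session_id] = (created, session)  # no DB row yet
        return session

    def get(self, session_id):
        if session_id in self._pending:  # pending registry is checked first
            return self._pending[session_id][1]
        return self._db.get(session_id)

    def save(self, session):
        self._db[session["id"]] = session        # upsert in the real store
        self._pending.pop(session["id"], None)   # no longer pending

    def sweep_pending(self, max_age_s, now=None):
        now = time.monotonic() if now is None else now
        stale = [sid for sid, (t, _) in self._pending.items() if now - t > max_age_s]
        for sid in stale:
            del self._pending[sid]
        return len(stale)
```

A session that receives a message is saved and survives the sweep; one that never does simply disappears from the registry.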
2026-06-26	eb2f092 Browse files » Navi Code: stop via Esc + project cwd propagation ... - TUI: Esc stops active stream cooperatively via POST /sessions/{id}/stop - TUI: render stream_stopped as status message - CLI/WebSocket: send shell cwd in client->server message field - Orchestrator stores cwd in session.session_metadata - ContextBuilder injects [Working directory] into LLM context - Agent sets current_working_directory ContextVar per turn - tools/base: ToolContext gains cwd field - filesystem/terminal/code_exec resolve relative paths against session cwd - Add bin/navi-code wrapper for PATH symlink; document in README.md - Update docs/websocket.md and tests Full pytest: 544 passed, 1 skipped. Co-Authored-By: Claude <noreply@anthropic.com> Eugene Sukhodolskiy committed 19 days ago
	4519268 Browse files » compressor: structured summaries, profile-aware compression, adaptive keep_recent ... - Replace free-form summary with strict Markdown template (Goal, Active Files, Decisions, Completed Work, Pending Work/Todo, Errors, Key Values). - Keep filesystem/code_exec/terminal tool results and messages with is_compression_critical=True verbatim during compression instead of 300-char truncation. - Make compression profile-aware: AgentProfile gains compression_keep_recent, compression_max_tokens, compression_prompt_file. navi_code uses dedicated compression prompt and larger keep_recent/max_tokens. - Adaptive partition_messages(): important turns (user corrections, errors, critical tools) survive longer; filler/social turns compress sooner. - Increase default context_summary_max_tokens from 3000 to 4000. - Propagate active profile changes to ContextCompressor and SubAgentRunner. Co-Authored-By: Claude <noreply@anthropic.com> Eugene Sukhodolskiy committed 19 days ago
	cb7dfd9 Browse files » agent: skip planning for casual greetings + strengthen DIRECT shortcut in Phase 1 ... - Add fast _is_casual_message heuristic in navi/core/agent.py. Greetings and social chat (e.g. 'привет', 'как дела', 'hi', 'thanks') bypass planning even on the first session message, unless planning_mandatory is enabled. - Strengthen Phase 1 planning prompt in navi/core/planning.py: explicitly require DIRECT output for greetings, simple questions, and one-step instructions. - Fix broken ContextCompressed import in navi/core/compressor.py. - Add unit tests covering the new heuristic and DIRECT prompt. Co-Authored-By: Claude <noreply@anthropic.com> Eugene Sukhodolskiy committed 19 days ago
	77af8fe Browse files » Navi Code TUI: fix input box layout, command palette duplicate IDs, status renderer, and WS input loop ... - clients/terminal/tui/widgets/input_box.py: switch Horizontal to Vertical with width: 100% for Input so it renders and accepts input in real terminals; add refresh on Input.Changed. - clients/terminal/tui/screens/command_palette.py: remove fixed ListItem IDs to avoid DuplicateIds on fast filter. - clients/terminal/tui/chat_model.py + renderers/status.py + widgets/chat_panel.py: render backend status events as dim system messages instead of raw dicts. - clients/terminal/tui/ws_bridge.py: start NaviWebSocketClient.input_loop so enqueued user messages are actually sent to the backend. - clients/terminal/tui/tui_app.py: focus InputBox synchronously in on_mount so typing works immediately. - tests/clients/test_tui_app.py: regression test for visible input text. Co-Authored-By: Claude <noreply@anthropic.com> Eugene Sukhodolskiy committed 19 days ago
2026-06-24	aaba87e Browse files » Fix MCP names and profile API format consistency ... - Replace stale mcp__navi_web__* / mcp__navi_3d__* names with canonical mcp__navi-web__* / mcp__navi-3d__* across prompts and key_tools. - Update /agents/profiles and /admin/profiles endpoints to expose tools.agent / tools.subagent instead of deprecated enabled_tools fields. - Update docs/mechanics.md to reference the new tools structure. - Archive stale docs/visual.html. Co-Authored-By: Claude <noreply@anthropic.com> Eugene Sukhodolskiy committed 21 days ago
2026-05-26	e55e0e8 Browse files » Move terminal_manager to _internal subpackage ... - terminal_manager is an internal helper, not a tool - Update imports in terminal.py, container.py, test_terminal.py Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 26 May
	dc7b6ca Browse files » Fix terminal review issues: security, lifecycle, reactivity ... - TerminalManager.open now accepts exec_tokens to use create_subprocess_exec for restricted commands instead of always using shell - Fix kill/terminate order: SIGTERM first, SIGKILL fallback - Pop closed sessions from _sessions dict to prevent memory leak - Add terminal_manager.shutdown() to AppContainer.shutdown() - Wait for reader tasks in foreground open before returning output - Add _MAX_TERMINALS_PER_SESSION limit (10) - Wrap cleanup_idle tasks in _close_one_safe with error logging - send_input catches BrokenPipeError/ConnectionResetError specifically - Foreground terminals auto-close after gathering output - Vue reactivity: replace terminals object immutably instead of mutating - onTerminalClosed marks matching tool card as no longer pending - Update tests for new behavior (foreground auto-close, max limit) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 26 May
	788da27 Browse files » Add persistent multi-session terminal tool with background support ... - New TerminalManager module: named subprocess sessions per Navi session, background readers, event-sink streaming, idle auto-cleanup - Refactor terminal tool to multi-action: run, open, close, list, status, send_input - Add TerminalOutputDelta and TerminalClosed events for streaming - Wire TerminalManager into AppContainer, orchestrator, and registry - Persist session_metadata in Session model and pg_session_store - Close all session terminals on session delete - Webclient: handle terminal_output/terminal_closed WS events, display live terminal output in tool cards - Update unit tests for new terminal actions Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 26 May
	5a93b2b Browse files » Fix delta loss during thinking-to-text transition and streaming orphaning ... Backend: - agent.py _consume_stream: replace elif with if so a single chunk can carry both the final thinking fragment AND the first text delta. Fixes the 'first token lost after tools' bug. Frontend: - chat.js reloadSession: bail out early when streaming.value is true. Replacing messages.value mid-stream orphans streamingMsg, breaking Vue reactivity for all subsequent deltas and tool cards. - useWebSocket.js session_sync handler: add !chat.streaming guard before calling reloadSession. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 26 May
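The elif-to-if change in _consume_stream can be illustrated with a reduced loop. Chunks are modeled here as dicts with optional "thinking" and "text" keys, which is an assumption about their shape made for the example.

```python
def consume(chunks):
    """Collect thinking and text fragments from a stream of chunk dicts."""
    thinking, text = [], []
    for chunk in chunks:
        if chunk.get("thinking"):
            thinking.append(chunk["thinking"])
        # With `elif` here, a chunk carrying both fields would lose its text delta.
        if chunk.get("text"):
            text.append(chunk["text"])
    return "".join(thinking), "".join(text)
```

With a transition chunk such as `{"thinking": "!", "text": "Hel"}`, both fragments are kept, so the first text token after the thinking phase is not dropped.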
2026-05-25	5e88cf9 Browse files » Fix 19 issues found in full codebase review ... Backend: - Stop session auth bypass: require auth for owned sessions, reject anonymous with 401 - upload_file: stream chunks directly to disk instead of buffering in RAM - MCP config: validate name against path traversal regex - auth deps: cleanup stale refresh locks periodically - auth routes: expire mobile auth states after 10 min to prevent unbounded growth - compressor: meta-summarize existing summaries before compression; preserve assistant content when tool_calls present; rewrite hard_truncate to keep whole turns - orchestrator: configurable WS replay buffer size; async cleanup/remove_websocket/clear_busy; fix run_recall ContextVar order to avoid deadlock on _build_agent failure; await cleanup in finally - agent: persist image_msg in session.messages; remove archived messages from session after archive; remove duplicate StreamStopped yield on tool stop - websocket: try/except around create_task with cleanup on failure; await remove_websocket Frontend: - App.vue: hashchange listener lifecycle in onMounted/onUnmounted - MessageList.vue: passive scroll, flash timeout cleanup, archive scroll snapshot - InputBar.vue: 300 ms debounce on draft save to localStorage - SessionList.vue: remove :key from DynamicScroller to avoid remount jitter Tests: 422 passed, 1 skipped Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 25 May
	182629b Browse files » Add meta-summary for multi-level compression ... When to_summarize contains multiple existing summary messages whose combined length exceeds 8000 chars (~1/3 of max summarizer input), run a quick meta-summary pass first to consolidate them into a single compact summary before the main compression. This prevents information loss when repeated compressions stack up long summary chains. - _meta_summarize(): fast LLM pass (think=False, max_tokens=1500) - compress_context(): detects >1 long summaries and triggers meta pass - Graceful fallback: if meta-summary fails, continue with raw summaries - 3 new unit tests: consolidation, skipped for short summaries, failure fallback Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 25 May
	7dcec4c Browse files » Add archive message pagination, configurable WS replay buffer ... Backend: - Add archive_threshold to Session model and getSession response - Add next_before_seq to archive endpoint for cursor pagination - Make WS replay buffer size configurable via WS_REPLAY_BUFFER_SIZE Webclient: - Add getArchivedMessages API function - Add archive pagination state and loadArchivedMessages to chat store - MessageList: auto-load older messages on scroll-to-top with scroll position preservation and loading spinner Docs: update config.md with new env vars Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 25 May
	6cea761 Browse files » Wire archive trigger into agent after compression ... After _do_compress_and_save finishes, if the total persisted message count (db_next_sequence) exceeds session_messages_window (default 1000), the agent now calls archive_old_messages() to move older rows into session_messages_archive. Adds session_messages_window config and unit tests for archive SQL. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 25 May
	c7c0479 Browse files » Add session message archive table + global sequence_number tracking ... Schema: - session_messages_archive — identical structure, stores old messages. - sessions.next_sequence — monotonic seq counter per session. - sessions.archive_threshold — split point between hot and archive. Behaviour: - get() / _build_sessions() load only seq >= archive_threshold (hot). - save() UPDATEs existing rows (seq >= 0) and INSERTs new ones (seq = -1) with auto-assigned sequence_number = next_sequence, next_sequence+1, ... - archive_old_messages() moves a batch to archive and bumps the threshold. This keeps the hot table bounded so list/get RAM stays flat regardless of total session history size. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 25 May
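The sequence assignment performed by save() can be sketched as a pure function. The helper name and the dict-based message shape are hypothetical; only the convention (negative seq means not yet persisted, new rows take next_sequence, next_sequence+1, ...) comes from the commit message.

```python
def assign_sequence_numbers(messages, next_sequence):
    """Give unsaved messages (seq < 0) consecutive sequence numbers.

    Returns the rows that would be INSERTed and the updated counter.
    """
    new_rows = []
    for msg in messages:
        if msg["seq"] < 0:
            msg["seq"] = next_sequence
            next_sequence += 1
            new_rows.append(msg)
    return new_rows, next_sequence
```

Messages that already carry a non-negative seq are left for the UPDATE path, so only the delta is written.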
	cb22a1d Browse files » Implement delta-save for session messages ... Replace full DELETE/INSERT with efficient delta writes: - Track db_message_count on Session (how many rows already persisted). - On save(): UPDATE mutable flags for existing rows, DELETE only extras (race guard), INSERT new messages via executemany. - Reduces DB write amplification from O(N) to O(delta) per turn. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 25 May