root/navi-1

History for navi-1 / navi / core / compressor.py

2026-05-16	d67992a Extract ContextCompressor, fix STL viewer, expand test suite, add architecture audit docs
    - Extract ContextCompressor from agent.py (Step 1 of god-object refactor)
    - Add retry + hard-truncate fallback logic to ContextCompressor
    - Add unit tests: agent loop (14), compressor (18), KV store (8), auth encrypt (3), auth deps (13), todo/scratchpad/image_view/memory
    - Fix WebGL STL viewer: allow-same-origin sandbox + graceful fallback
    - Add CompressionStarted event and client-side compression notice
    - Add docs/architecture_weak_spots.md and plan_01_god_object_agent.md
    - Update test_events.py and test_agent_context_size.py for new logic
    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 16 May
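The commit names a retry + hard-truncate fallback but does not spell out its rules. The following is only a sketch of one plausible shape, assuming the summarizer is retried a fixed number of times and that, if every attempt fails, the oldest messages are simply dropped. `summarize_or_truncate`, `attempts` and `keep_on_failure` are hypothetical names, not the ContextCompressor API.

```python
from typing import Callable


def summarize_or_truncate(
    old_messages: list[str],
    summarize: Callable[[str], str],
    attempts: int = 3,          # assumed retry count, not from the commit
    keep_on_failure: int = 2,   # assumed number of messages kept on hard truncate
) -> list[str]:
    """Return a one-item summary list, or the tail of the input if all attempts fail."""
    text = "\n".join(old_messages)
    for _ in range(attempts):
        try:
            return [summarize(text)]
        except Exception:
            continue  # retry the summarizer
    # Hard-truncate fallback: keep only the most recent messages.
    return old_messages[-keep_on_failure:] if keep_on_failure > 0 else []
```

With this shape a failing summarizer degrades to losing old history rather than leaving the context over its limit.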
2026-05-11	cebc073 Fix ollama_backends / FallbackOllamaBackend issues
    - registry.py: always use FallbackOllamaBackend (unified backend). Enables model priority lists in all deployments, not just multi-server.
    - agent.py: add missing think=profile.think_enabled to run() (REST endpoint).
    - compressor.py: fix model param type (str → list[str] | str | None).
    - fallback.py: harden load_servers_from_file against missing/bad JSON files and entries without host. Add clear_blacklists() for manual reset.
    - admin.py: add POST /admin/ollama/clear-blacklists endpoint.
    - tech_debt_review: document dead stream() methods.
    - tests: add tests for single-server fallback, bad file handling, missing host skipping, and blacklist clearing.
    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 11 May
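The widened model parameter (str → list[str] | str | None) implies that callers may now pass a single model name, a priority list, or nothing. A minimal sketch of how such a value could be normalized into a priority list; `normalize_models` and `default` are hypothetical names and do not come from compressor.py.

```python
from typing import Optional, Union


def normalize_models(
    model: Optional[Union[list[str], str]],
    default: str,
) -> list[str]:
    """Turn a str, a list of str, or None into a non-empty priority list."""
    if model is None:
        return [default]
    if isinstance(model, str):
        return [model]
    # Copy so the caller's list is never mutated; fall back if it is empty.
    return list(model) or [default]
```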
2026-05-02	8b3ebca Refine 3D modeler workflow
    Eugene Sukhodolskiy committed on 2 May
2026-04-25	e67e7a5 Improve compression and memory prompts
    Eugene Sukhodolskiy committed on 25 Apr
2026-04-20	9704a92 Autonomous reasoning improvements: budget, anchoring, anti-stall, validation
    - AgentProfile: per-profile thinking mechanics flags (think_enabled, iteration_budget_enabled, goal_anchoring, anti_stall, step_validation, planning_reflect, adaptive_replan); all profiles updated in config.json
    - Iteration budget: inject remaining iterations into context so model knows when to wrap up; urgency levels at ≤7 and ≤3 remaining
    - Goal anchoring: inject original goal + todo state every N iterations to prevent drift on long tasks
    - Anti-stall: two signals: no todo progress for N iterations, or identical tool calls repeated N times; warning injected into context
    - Todo step validation: marking done requires a validation field describing how result was verified; failed gets a soft nudge with tip for re-planning
    - stream_complete: add think param to base class, ollama and openai backends
    - Summarizer: raise max_tokens 1024→3000, expand system prompt with user-preferences section and verbatim-value instructions
    - Compression card: persist to session.messages (is_compression flag on Message), show expandable summary in webclient with markdown body
    - ToolResult.to_message_content: always include output on failure so tracebacks and error details reach the model (fixes silent Error: None)
    - Developer profile: fix subagent profile secretary→developer, add write_tool to subagent_tools, clarify write_tool vs filesystem in system prompt
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 20 Apr
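The iteration budget and the repeated-tool-call stall signal from this commit can be illustrated roughly as below. The thresholds ≤7 and ≤3 come from the commit message; the function names, the level labels, the message wording and the default `repeat_limit` are assumptions made for the sketch.

```python
def budget_notice(iteration: int, max_iterations: int) -> str:
    """Build the note injected into context, with urgency at <=7 and <=3 remaining."""
    remaining = max_iterations - iteration
    if remaining <= 3:
        level = "CRITICAL"   # hypothetical label
    elif remaining <= 7:
        level = "WARNING"    # hypothetical label
    else:
        level = "INFO"
    return f"[{level}] {remaining} iterations remaining"


def is_stalled(tool_calls: list[tuple[str, str]], repeat_limit: int = 3) -> bool:
    """True when the last `repeat_limit` (name, arguments) tool calls are identical."""
    if len(tool_calls) < repeat_limit:
        return False
    tail = tool_calls[-repeat_limit:]
    return all(call == tail[0] for call in tail)
```

The second stall signal named in the commit (no todo progress for N iterations) would need the todo state and is left out here.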
2026-04-17	0c3dc98 Planning phases, context compression, and tool improvements
    Agent:
    - Planning now a 3-phase async generator: Analysis → Execution plan → AIHelper critic
    - Yield PlanningStatus events before each phase (UI progress labels)
    - Phase 1 runs with think=True for deeper analysis
    - Phase 2 includes available tool list so executor assignments are accurate
    - Phase 3: independent critic pass validates and corrects TOOL: names against real tool list
    - Planning converted from list return to async generator (fixes token accounting)
    Backend:
    - Context compression threshold: 80% → 70% to trigger earlier
    - Compressor summary prompt: structured sections (goal, work state, key facts, outputs, errors)
    - Terminal output capped at 5000 chars to prevent context flooding
    - Web search: region=wt-wt for DDG, country=ALL for Brave, language=all for SearxNG
    - Scratchpad: mandate writing a 'goal' section at start of multi-step tasks
    - secretary max_iterations: 40→25, temperature: 0.7→0.5
    - server_admin max_iterations: 40→20
    Webclient:
    - ThinkingCard strips <thought> XML tags leaked by Ollama
    - planning_status WS event wired to chat.onPlanningStatus()
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 17 Apr
2026-04-15	4b64763 Add explicit output token budget for summarizer (context_summary_max_tokens)
    Previously there was no num_predict set for the summarization LLM call, so Ollama used its server default (often 128 tokens, so very short summaries).
    - Add max_tokens param to LLMBackend.complete() and OllamaBackend (→ num_predict)
    - Add context_summary_max_tokens: int = 1024 to config
    - Thread it through compress_context() and CompressionWorker
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 15 Apr
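Ollama takes the output token cap as `num_predict` inside the request's `options` object, which is the mapping this commit describes for `max_tokens`. A sketch of that mapping, assuming a hypothetical helper `build_ollama_options`; the real OllamaBackend may assemble its request differently.

```python
from typing import Optional


def build_ollama_options(temperature: float, max_tokens: Optional[int] = None) -> dict:
    """Build the `options` dict of an Ollama request (hypothetical helper)."""
    options: dict = {"temperature": temperature}
    if max_tokens is not None:
        # Without num_predict the server default applies, which may be very short.
        options["num_predict"] = max_tokens
    return options
```

Leaving `num_predict` out when `max_tokens` is None keeps the server default for callers that do not care about the budget.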
2026-04-15	96548a1 Expand summarization budget for better context quality
    - _MAX_SUMMARY_INPUT_CHARS: 12k → 24k chars (2x input fed to summarizer)
    - context_keep_recent: 10 → 8 turns (2 more turns go into each summary batch)
    - Summarizer prompt: replace "Be brief" with "Be thorough": capture code/config snippets and enough detail to continue the conversation without original messages
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 15 Apr
2026-04-14	bbc93b5 Expose compression summary as collapsible debug card in chat UI
    ContextCompressed event now carries the full summary text produced by the LLM. Compression notice in chat becomes a <details> element showing message count (before→after) with the summary expandable on click. Rendered as markdown via marked.js.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 14 Apr
2026-04-10	86402e0 Add stop button and fix context compression hang
    Stop generation:
    - Client: send button toggles to red ■ during streaming; sends {type:stop} via WS
    - Server: _stream_recv concurrently reads incoming messages during streaming using asyncio.wait; stop signal is handled immediately without polling
    - Cooperative stop via asyncio.Event (current_stop_event ContextVar): agent breaks out of LLM async-for cleanly so aclose() fires → Ollama stream closes gracefully, model stays in VRAM. No task.cancel() which would eject the model.
    - StreamStopped event propagates through run_stream/run_ephemeral; sub-agents stop via the same shared stop_event inherited through task context
    Context compression fix:
    - compress_context passes think=False to llm.complete(): no extended reasoning during summarization which caused GPU hang
    - Input truncated to 12k chars before sending to summarizer
    - LLMBackend.complete() / OllamaBackend.complete() accept think: bool | None override
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 10 Apr
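The cooperative stop described in this commit can be sketched with a shared asyncio.Event held in a ContextVar. The name `current_stop_event` is taken from the commit message; `fake_llm_stream`, `consume` and `demo` are illustrative stand-ins, not the agent's code.

```python
import asyncio
from contextvars import ContextVar

# Name from the commit message; everything else in this sketch is illustrative.
current_stop_event: ContextVar[asyncio.Event] = ContextVar("current_stop_event")


async def fake_llm_stream():
    """Hypothetical token stream standing in for the Ollama response."""
    for i in range(100):
        await asyncio.sleep(0)  # yield to the event loop between tokens
        yield f"tok{i}"


async def consume(out: list[str]) -> str:
    stop = current_stop_event.get()
    stream = fake_llm_stream()
    try:
        async for tok in stream:
            out.append(tok)
            if stop.is_set():
                break  # leave the loop cooperatively, no task.cancel()
    finally:
        await stream.aclose()  # close the stream explicitly and gracefully
    return "stopped" if stop.is_set() else "finished"


async def demo() -> tuple[str, int]:
    stop = asyncio.Event()
    current_stop_event.set(stop)
    tokens: list[str] = []
    task = asyncio.create_task(consume(tokens))  # the task copies the current context
    await asyncio.sleep(0)
    await asyncio.sleep(0)
    stop.set()  # e.g. triggered by a {type:stop} message from the client
    return await task, len(tokens)
```

Because tasks copy the context at creation, a sub-task started after `current_stop_event.set(...)` sees the same Event object, which matches the commit's note about sub-agents inheriting the stop signal.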
2026-04-08	bdd3786 Review fixes: events module, circular imports, deps, vision-aware compression
    - Extract all AgentEvent dataclasses to navi/core/events.py; import from there in agent.py and __init__.py (eliminates circular import between workers and core)
    - workers/compressor.py: remove runtime import hack, use navi.core.events
    - workers/base.py: WorkerResult.events typed as list[AgentEvent] (was Any)
    - api/deps.py: replace @lru_cache on mutable list with module-level singletons (_registries, _workers)
    - core/compressor.py: _format_for_summary returns (text, images); images passed to summarization LLM so vision models describe them in summary; non-vision models silently ignore the images field; docstring updated
    - client/js/app.js: add comment explaining is_summary backward compat branch
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 8 Apr
2026-04-08	261459a Separate display history from LLM context; formalize worker system
    Architecture change:
    - session.messages: full display history, never modified by compression
    - session.context: what the LLM sees, may be compressed by workers
    - System messages go only into context (not display history)
    - Image injections (synthetic) go only into context
    - User/assistant/tool messages go into both
    SQLite: add context column with backward-compat migration (empty context → initialized from messages on load)
    Workers (navi/workers/):
    - Worker ABC + WorkerContext + WorkerResult (base.py)
    - CompressionWorker: compresses session.context when above threshold
    - build_default_workers() returns [CompressionWorker()]
    - Agent accepts workers list, runs them after StreamEnd
    - Workers injected via deps.py get_workers() (lru_cached singleton)
    - WebSocket agent construction also receives workers
    Compressor: compress_context() now takes context[], not messages[]
    Config: context_keep_recent 6 → 10
    Agent: _run_workers() collects events from all workers and yields them
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 8 Apr
2026-04-08	802c186 Add context compression: rolling summarization when context fills up
    Mechanism:
    - After streaming ends, if context_tokens >= threshold (80% of num_ctx), compress old turns into a summary message using the same LLM
    - Partition: keep system msg + last N turns verbatim (default 6); everything older goes to the summarizer
    - Tool call groups (assistant + tool results) never split across boundary
    - Existing summary messages folded into new compression pass (no stack growth)
    - Summary stored as Message(role=user, is_summary=True) after system msg
    - On failure: logged, session left unchanged (non-fatal)
    New files:
    - navi/core/compressor.py: should_compress, partition_messages, compress_session (pure logic, testable without agent)
    New config (navi/config.py):
    - context_compression_enabled: bool = True
    - context_compression_threshold: float = 0.80
    - context_keep_recent: int = 6
    - context_summary_temperature: float = 0.3
    New agent event: ContextCompressed(messages_before, messages_after)
    Message.is_summary: bool field marks compressed history blocks
    Client:
    - context_compressed WS event → subtle inline notice in message list
    - loadHistory: is_summary messages rendered as collapsible summary cards
    - style.css: .summary-card, .compression-notice
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 8 Apr
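The threshold check and partition rule from this first compression commit might look roughly like the sketch below. The function names `should_compress` and `partition_messages` and the defaults (0.80, 6) are from the commit message; the `Msg` type, the signatures, and the simplification of counting messages instead of turns are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Msg:
    """Hypothetical stand-in for the project's Message type."""
    role: str            # "system", "user", "assistant" or "tool"
    content: str = ""


def should_compress(context_tokens: int, num_ctx: int, threshold: float = 0.80) -> bool:
    """True once the context fills at least `threshold` of the model window."""
    return context_tokens >= threshold * num_ctx


def partition_messages(messages: list[Msg], keep_recent: int = 6):
    """Split into (system, to_summarize, kept) without cutting a tool-call group."""
    system = [m for m in messages if m.role == "system"]
    rest = [m for m in messages if m.role != "system"]
    cut = max(len(rest) - keep_recent, 0)
    # A tool result at the boundary belongs to an earlier assistant message,
    # so move the boundary back until the kept part no longer starts with one.
    while 0 < cut < len(rest) and rest[cut].role == "tool":
        cut -= 1
    return system, rest[:cut], rest[cut:]
```

Moving the boundary backwards keeps slightly more than `keep_recent` messages in that case, which errs on the side of preserving the assistant call together with its tool results.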