| 2026-06-01 |
Fix CompressionWorker: sync context compression with session messages
...
- Mark removed messages as is_context=False so DB reload doesn't resurrect them
- Persist summary_msg in messages (is_display=False) so summary survives reload
- Add is_compression marker with is_context=False
- Set context_token_count to real estimate instead of zero
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
24 days ago
|
| 2026-05-02 |
Refine 3D modeler workflow
Eugene Sukhodolskiy
committed
on 2 May
|
| 2026-04-29 |
Architecture extensibility — event bus, middleware, auto-discovery, Pydantic profiles
...
- EventBus: async pub/sub for AgentEvents, WebSocket subscribes instead of direct yield
- Declarative serialization: AgentEvent.to_wire() on all event types
- Auto-discovery for LLM backends (_discover_backends) and workers (scan navi/workers/*.py)
- AgentProfile: Pydantic BaseModel with extra='allow', @field_validator for model coercion
- Tool middleware chain: pre/post execute hooks via ToolRegistry.add_middleware()
- LoggingMiddleware: built-in, logs every tool call
- Fix pg_trgm DDL: conditional GIN indexes via DO $$ block, no CREATE EXTENSION
- New files: event_bus.py, middleware.py, logging_middleware.py
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 29 Apr
|
| 2026-04-20 |

Autonomous reasoning improvements: budget, anchoring, anti-stall, validation
...
- AgentProfile: per-profile thinking mechanics flags (think_enabled,
iteration_budget_enabled, goal_anchoring, anti_stall, step_validation,
planning_reflect, adaptive_replan) — all profiles updated in config.json
- Iteration budget: inject remaining iterations into context so model knows
when to wrap up; urgency levels at ≤7 and ≤3 remaining
- Goal anchoring: inject original goal + todo state every N iterations to
prevent drift on long tasks
- Anti-stall: two signals — no todo progress for N iterations, or identical
tool calls repeated N times; warning injected into context
- Todo step validation: marking done requires a validation field describing
how result was verified; failed gets a soft nudge with tip for re-planning
- stream_complete: add think param to base class, ollama and openai backends
- Summarizer: raise max_tokens 1024→3000, expand system prompt with
user-preferences section and verbatim-value instructions
- Compression card: persist to session.messages (is_compression flag on
Message), show expandable summary in webclient with markdown body
- ToolResult.to_message_content: always include output on failure so
tracebacks and error details reach the model (fixes silent Error: None)
- Developer profile: fix subagent profile secretary→developer, add write_tool
to subagent_tools, clarify write_tool vs filesystem in system prompt
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 20 Apr
|
| 2026-04-15 |
Add explicit output token budget for summarizer (context_summary_max_tokens)
...
Previously there was no num_predict set for the summarization LLM call,
so Ollama used its server default (often 128 tokens — very short summaries).
- Add max_tokens param to LLMBackend.complete() and OllamaBackend (→ num_predict)
- Add context_summary_max_tokens: int = 1024 to config
- Thread it through compress_context() and CompressionWorker
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 15 Apr
|
| 2026-04-14 |
Expose compression summary as collapsible debug card in chat UI
...
ContextCompressed event now carries the full summary text produced by the
LLM. Compression notice in chat becomes a <details> element showing
message count (before→after) with the summary expandable on click.
Rendered as markdown via marked.js.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 14 Apr
|
| 2026-04-10 |

Major feature batch: visibility, planning, file uploads, streaming
...
- stream_complete(): streaming with tools for all LLM turns — thinking
now streams as ThinkingDelta/ThinkingEnd in real-time during tool-
selection turns, not just on the final response
- todo built-in tool: session-scoped plan manager (set/view/update/clear);
persona + all profiles updated with mandatory planning instructions
- TurnThinking event: sub-agent thinking forwarded to parent sink as a
collapsible block in the spawn_agent card
- File uploads: non-image files uploaded via XHR, shown as badges in
message bubble; SVG treated as regular file (not base64 image)
- session_files: POST /sessions/{id}/files, TTL cleanup, forbidden exts
- WebSocket reconnect: _AgentRun broadcast pattern, re-attach mid-stream
- UI: favicon, sidebar logo, turn-thinking cards, subagent thinking blocks,
token counter, draft persistence, file progress bar
- Removed AgentNote (content is always None alongside tool_calls)
- Ollama stream_complete: tool_calls captured from non-final chunk (done=False)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 10 Apr
|
| 2026-04-08 |
Review fixes: events module, circular imports, deps, vision-aware compression
...
- Extract all AgentEvent dataclasses to navi/core/events.py; import from
there in agent.py and __init__.py — eliminates circular import between
workers and core
- workers/compressor.py: remove runtime import hack, use navi.core.events
- workers/base.py: WorkerResult.events typed as list[AgentEvent] (was Any)
- api/deps.py: replace @lru_cache on mutable list with module-level
singletons (_registries, _workers)
- core/compressor.py: _format_for_summary returns (text, images); images
passed to summarization LLM so vision models describe them in summary;
non-vision models silently ignore the images field; docstring updated
- client/js/app.js: add comment explaining is_summary backward compat branch
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 8 Apr
|

Separate display history from LLM context; formalize worker system
...
Architecture change:
- session.messages: full display history, never modified by compression
- session.context: what the LLM sees, may be compressed by workers
- System messages go only into context (not display history)
- Image injections (synthetic) go only into context
- User/assistant/tool messages go into both
SQLite: add context column with backward-compat migration
(empty context → initialized from messages on load)
Workers (navi/workers/):
- Worker ABC + WorkerContext + WorkerResult (base.py)
- CompressionWorker: compresses session.context when above threshold
- build_default_workers() returns [CompressionWorker()]
- Agent accepts workers list, runs them after StreamEnd
- Workers injected via deps.py get_workers() (lru_cached singleton)
- WebSocket agent construction also receives workers
Compressor: compress_context() now takes context[], not messages[]
Config: context_keep_recent 6 → 10
Agent: _run_workers() collects events from all workers and yields them
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 8 Apr
|