| 2026-05-21 |
Fix token counting: show only completion tokens, not cumulative prompt+completion
The token_count shown next to assistant messages summed
prompt_tokens + completion_tokens across ALL tool-calling iterations,
which inflated it to hundreds of thousands of tokens in multi-turn conversations.
Now:
- token_count (coins icon) = only completion tokens generated by the model
- context_tokens (ContextBar) = only prompt tokens (context size sent to LLM)
This gives users a realistic measure of how much the model actually generated.
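A minimal sketch of the difference between the old and new counting, assuming each tool-calling iteration reports its own prompt_tokens and completion_tokens. The `Usage` class, both function names, and the choice to take context_tokens from the most recent call are illustrative assumptions, not the project's code.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Usage:
    """Token usage reported by one LLM call (hypothetical shape)."""
    prompt_tokens: int
    completion_tokens: int


def old_token_count(iterations: List[Usage]) -> int:
    # Old behaviour: the full prompt is re-counted on every iteration.
    return sum(u.prompt_tokens + u.completion_tokens for u in iterations)


def summarize_usage(iterations: List[Usage]) -> Tuple[int, int]:
    """Return (token_count, context_tokens) for one assistant message."""
    # token_count: only what the model generated, summed over iterations.
    token_count = sum(u.completion_tokens for u in iterations)
    # context_tokens: prompt size of the most recent call (assumption).
    context_tokens = iterations[-1].prompt_tokens if iterations else 0
    return token_count, context_tokens


# Three tool-calling iterations over a large, slowly growing context.
calls = [Usage(40_000, 120), Usage(41_000, 300), Usage(42_500, 80)]
print(old_token_count(calls))   # 124000
print(summarize_usage(calls))   # (500, 42500)
```

With these made-up numbers the old sum reports 124000 tokens although the model generated only 500.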
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 21 May
|

Migrate MCP tool naming from mcp:server:tool to mcp__server__tool
The colon separator (mcp:server:tool) confuses many LLMs during
tool-calling because colons also appear in schemas and URLs. Switch to
a double-underscore separator (mcp__server__tool) for more robust parsing.
Key changes:
- navi/mcp/tools.py: add build_mcp_name(), parse_mcp_name(), is_mcp_tool()
- navi/core/tool_executor.py: update _resolve_tool() with new helpers and legacy colon fallback for old sessions
- navi/core/tool_utils.py, subagent_runner.py: use build_mcp_name()
- navi/api/routes/{admin,agents}.py: prefix via build_mcp_name()
- navi/tools/{list_tools,reload_tools}.py: migrated
- All profile configs + system_prompt.txt: replace mcp: with mcp__
- manuals/{model_3d,lint_scad,render_3d,spawn_agent}.md: updated
- mcp_servers.d/gnexus-book.json: instructions updated
- docs/{api,profiles,tools,mechanics,visual.html}: updated
- tests: test_tool_executor.py and test_mcp.py aligned
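The three helpers named above could look roughly like the sketch below. The signatures, return types, and the assumption that server names never contain a double underscore are guesses for illustration; only the function names and the two name formats come from this commit.

```python
from typing import Optional, Tuple

MCP_PREFIX = "mcp"
SEP = "__"          # new separator
LEGACY_SEP = ":"    # old separator, still accepted for old sessions


def build_mcp_name(server: str, tool: str) -> str:
    """Compose the public tool name in the new format."""
    return f"{MCP_PREFIX}{SEP}{server}{SEP}{tool}"


def is_mcp_tool(name: str) -> bool:
    """True for names in either the new or the legacy colon format."""
    return name.startswith((MCP_PREFIX + SEP, MCP_PREFIX + LEGACY_SEP))


def parse_mcp_name(name: str) -> Optional[Tuple[str, str]]:
    """Split a name into (server, tool); return None if it is not an MCP name."""
    for sep in (SEP, LEGACY_SEP):
        prefix = MCP_PREFIX + sep
        if name.startswith(prefix):
            # Split on the first separator only, so the tool part may
            # itself contain the separator.
            server, found, tool = name[len(prefix):].partition(sep)
            if found and server and tool:
                return server, tool
    return None


print(build_mcp_name("navi-web", "web_search"))     # mcp__navi-web__web_search
print(parse_mcp_name("mcp__navi-web__web_search"))  # ('navi-web', 'web_search')
print(parse_mcp_name("mcp:navi-web:web_search"))    # ('navi-web', 'web_search')
```

Trying the new separator first and the colon second keeps old session histories resolvable without rewriting them.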
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 21 May
|
| 2026-05-18 |
Make Settings immutable (frozen=True) and fix all test mutations
- Add frozen=True to SettingsConfigDict in navi/config.py
- Convert model_validator to mode="before" since mode="after" cannot mutate frozen instances
- Replace all field-level monkeypatches in tests with whole-Settings object replacement
- Ensure cross-module settings consistency (content_store, session_files, share_file, content_publish, filesystem)
392 passed, 1 skipped
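The test-side change (replace the whole Settings object instead of patching one field) can be illustrated with a standard-library analogue. This sketch uses a frozen dataclass rather than pydantic-settings, and every name in it (`Settings` fields, `config`, `upload_limit_bytes`) is hypothetical. Pydantic reports the failed mutation with its own error type, but the pattern is the same.

```python
import types
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Settings:
    """Stand-in for an immutable settings object (hypothetical fields)."""
    data_dir: str = "/var/lib/app"
    max_upload_mb: int = 10


# Stand-in for a config module that holds the singleton.
config = types.SimpleNamespace(settings=Settings())


def upload_limit_bytes() -> int:
    # Looks the singleton up on every call, so a swapped object is seen.
    return config.settings.max_upload_mb * 1024 * 1024


def try_field_mutation() -> bool:
    """Field-level patching no longer works on a frozen instance."""
    try:
        config.settings.max_upload_mb = 1
    except FrozenInstanceError:
        return False
    return True


# What a test does instead: build a modified copy and swap the whole object.
original = config.settings
config.settings = replace(original, max_upload_mb=1)
patched_limit = upload_limit_bytes()
config.settings = original  # restore, as a fixture teardown would

print(try_field_mutation())  # False
print(patched_limit)         # 1048576
```

The swap only takes effect in modules that read the settings through the holder at call time, which is why consistency across modules matters.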
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 18 May
|
| 2026-05-16 |
Step 4: Extract SubAgentRunner from run_ephemeral()
- Create navi/core/subagent_runner.py with full sub-agent loop logic
- Move _iter_stream_guarded to navi/core/stream_guard.py
- Move _check_context_size to ContextCompressor.check_context_size()
- Extract build_tool_list() and load_user_enabled_tools() to tool_utils.py
- Agent.run_ephemeral() becomes a thin wrapper delegating to SubAgentRunner
- Remove ~310 lines from agent.py
- All existing run_ephemeral tests pass unchanged
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 16 May
|
Step 3: Extract AntiStallMonitor from run_stream()
- Create navi/core/anti_stall.py with AntiStallMonitor class
- Encapsulates stall detection (todo progress + repeated tool calls)
- Encapsulates adaptive re-plan (failed todo step detection)
- Provides init() / pre_turn() / post_turn() two-phase interface
- Remove ~50 lines of stall/replan logic from agent.py run_stream()
- Remove _todo_status_snapshot and _todo_failed_steps helpers from agent.py
- Update AgentTurnContext: remove stall fields (now live in AntiStallMonitor)
- Add 13 unit tests for pre_turn and post_turn behavior
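As a rough illustration of the two stall signals listed above (no todo progress, repeated tool calls), here is a hypothetical monitor with the same init() / pre_turn() / post_turn() shape. The class name, thresholds, argument types, and return value are assumptions. This is not the AntiStallMonitor from navi/core/anti_stall.py, and it leaves out the adaptive re-plan part.

```python
from typing import Dict, List, Optional, Tuple

ToolCall = Tuple[str, str]  # (tool name, serialized arguments)


class StallSketch:
    """Hypothetical stall detector used only for illustration."""

    def __init__(self, max_repeats: int = 3, max_idle_turns: int = 2) -> None:
        self.max_repeats = max_repeats
        self.max_idle_turns = max_idle_turns
        self.init()

    def init(self) -> None:
        """Reset all per-run state."""
        self._snapshot: Dict[str, str] = {}
        self._idle_turns = 0
        self._last_call: Optional[ToolCall] = None
        self._repeats = 0

    def pre_turn(self, todo_status: Dict[str, str]) -> None:
        """Phase 1: remember todo state before the model acts."""
        self._snapshot = dict(todo_status)

    def post_turn(self, todo_status: Dict[str, str], tool_calls: List[ToolCall]) -> bool:
        """Phase 2: return True if the run looks stalled."""
        # Signal 1: the todo list did not change during this turn.
        if todo_status == self._snapshot:
            self._idle_turns += 1
        else:
            self._idle_turns = 0
        # Signal 2: the same tool call repeated back to back.
        for call in tool_calls:
            if call == self._last_call:
                self._repeats += 1
            else:
                self._last_call, self._repeats = call, 1
        return (self._idle_turns >= self.max_idle_turns
                or self._repeats >= self.max_repeats)


monitor = StallSketch(max_repeats=3, max_idle_turns=99)
todo = {"step-1": "pending"}
same_call = ("filesystem", '{"action": "read"}')
results = []
for _ in range(3):
    monitor.pre_turn(todo)
    results.append(monitor.post_turn(todo, [same_call]))
print(results)  # [False, False, True]
```

Keeping the counters inside one object is what lets the stall fields leave the turn context, as the last two bullets describe.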
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 16 May
|
Extract ContextCompressor, fix STL viewer, expand test suite, add architecture audit docs
- Extract ContextCompressor from agent.py (Step 1 of god-object refactor)
- Add retry + hard-truncate fallback logic to ContextCompressor
- Add unit tests: agent loop (14), compressor (18), KV store (8), auth encrypt (3), auth deps (13), todo/scratchpad/image_view/memory
- Fix WebGL STL viewer: allow-same-origin sandbox + graceful fallback
- Add CompressionStarted event and client-side compression notice
- Add docs/architecture_weak_spots.md and plan_01_god_object_agent.md
- Update test_events.py and test_agent_context_size.py for new logic
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 16 May
|
Enhance native toolset and add persistent KV store
- Add PostgreSQL-backed KvStore (navi/store/) for session-scoped data.
- Migrate todo and scratchpad from in-memory dicts to KvStore.
- Filesystem: add copy, grep, diff actions; compress description.
- CodeExec: remove language param, expose working_dir in schema.
- ImageView: resize to 1024px JPEG + Content-Type guard for URLs.
- Memory list: return distinct categories instead of all facts.
- SSH: add scp action with upload/download support.
- Update CLAUDE.md (Postgres-only), docs/tools.md, add docs/store.md.
- Fix agent/planning/context_builder async signatures for todo helpers.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 16 May
|
| 2026-05-15 |
Add self-recall (scheduled callback) system
Core features:
- schedule_recall tool: once/recurring/immediate callbacks
- manage_recall tool: cancel/skip/list scheduled recalls
- Natural-language time parser (ISO, relative, "tomorrow at 09:00")
- PostgreSQL-backed RecallScheduler with lazy pool init
- Background recall_scheduler_loop with asyncio.Semaphore(3)
- _busy_sessions guard prevents user messages during headless runs
- Agent.run() preserves thinking field for session history visibility
- API endpoints: GET/DELETE/POST for session recall, admin list
- Frontend: recall badge, filter, cancel/skip in sidebar and chat header
- Tests: parser, scheduler CRUD, tools, API, scheduler loop (53 tests)
- Manuals: schedule_recall.md and manage_recall.md
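The three input styles the time parser accepts (ISO, relative, "tomorrow at 09:00") could be handled along these lines. The function name `parse_when`, the exact phrases it recognizes, and the decision to ignore time zones are assumptions made for this sketch. The real parser presumably covers more forms.

```python
import re
from datetime import datetime, timedelta


def parse_when(text: str, now: datetime) -> datetime:
    """Turn a small set of time expressions into a naive datetime.

    Raises ValueError for anything it does not understand.
    """
    raw = text.strip()
    lowered = raw.lower()

    # Relative form: "in 30 minutes", "in 2 hours", "in 1 day".
    m = re.fullmatch(r"in (\d+) (minute|hour|day)s?", lowered)
    if m:
        amount, unit = int(m.group(1)), m.group(2) + "s"
        return now + timedelta(**{unit: amount})

    # Named-day form: "tomorrow at 09:00".
    m = re.fullmatch(r"tomorrow at (\d{1,2}):(\d{2})", lowered)
    if m:
        tomorrow = now + timedelta(days=1)
        return tomorrow.replace(hour=int(m.group(1)), minute=int(m.group(2)),
                                second=0, microsecond=0)

    # Fallback: ISO 8601, e.g. "2026-05-20T08:15:00".
    return datetime.fromisoformat(raw)


now = datetime(2026, 5, 15, 14, 30)
print(parse_when("in 2 hours", now))           # 2026-05-15 16:30:00
print(parse_when("tomorrow at 09:00", now))    # 2026-05-16 09:00:00
print(parse_when("2026-05-20T08:15:00", now))  # 2026-05-20 08:15:00
```

Passing `now` in explicitly rather than calling `datetime.now()` inside keeps the parser deterministic under test.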
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 15 May
|
| 2026-05-13 |
Fix gnexus-book MCP instructions and add old-format fallback in tool executor
- Updated mcp_servers.json gnexus-book instructions to use mcp:gnexus-book: prefix instead of the old mcp_gnexus-book_ format.
- Added fallback in tool_executor.py for old underscore format (mcp_server_tool → mcp:server:tool) so transitional models still work.
- Added unit test for the old-format fallback.
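The underscore format is ambiguous when a server or tool name contains an underscore, so a fallback like this probably has to match against the configured servers. A minimal sketch under that assumption follows. The function name `upgrade_legacy_name` and the longest-match rule are illustrative and not taken from tool_executor.py.

```python
from typing import Iterable, Optional


def upgrade_legacy_name(name: str, known_servers: Iterable[str]) -> Optional[str]:
    """Map an old mcp_server_tool name to mcp:server:tool, or return None."""
    legacy_prefix = "mcp_"
    if not name.startswith(legacy_prefix):
        return None
    rest = name[len(legacy_prefix):]
    # Try longer server names first so "navi_web" wins over "navi".
    for server in sorted(known_servers, key=len, reverse=True):
        server_prefix = server + "_"
        if rest.startswith(server_prefix) and len(rest) > len(server_prefix):
            tool = rest[len(server_prefix):]
            return f"mcp:{server}:{tool}"
    return None


servers = ["navi-web", "navi-3d"]
print(upgrade_legacy_name("mcp_navi-web_web_search", servers))  # mcp:navi-web:web_search
print(upgrade_legacy_name("mcp_unknown_tool", servers))         # None
```

Returning None for unknown servers lets the caller fall through to its normal "tool not found" handling.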
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 13 May
|
Rename MCP tools to mcp:server:tool format and restore human-readable names
- Core naming: mcp_server_tool → mcp:server:tool (colon-delimited)
- navi-web tools: search/view/request → web_search/web_view/http_request
- navi-3d tools: compile_scad/render_stl/lint_scad (unchanged names)
- Updated all profile configs, system prompts, docs, manuals, tests
- Added new lint_scad.md manual
- Fixed modeler_3d prompt stale references (scad_lint, model_3d, render_3d)
- All 240 tests pass
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 13 May
|
| 2026-05-12 |
Handle MCP tool aliases robustly
Eugene Sukhodolskiy
committed
on 12 May
|
Clarify knowledge persistence prompts
Eugene Sukhodolskiy
committed
on 12 May
|
| 2026-05-11 |
Fix ollama_backends / FallbackOllamaBackend issues
- registry.py: always use FallbackOllamaBackend (unified backend). Enables model priority lists in all deployments, not just multi-server.
- agent.py: add missing think=profile.think_enabled to run() (REST endpoint).
- compressor.py: fix model param type (str → list[str] | str | None).
- fallback.py: harden load_servers_from_file against missing/bad JSON files and entries without host. Add clear_blacklists() for manual reset.
- admin.py: add POST /admin/ollama/clear-blacklists endpoint.
- tech_debt_review: document dead stream() methods.
- tests: add tests for single-server fallback, bad file handling, missing host skipping, and blacklist clearing.
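The hardening described for load_servers_from_file might look like the following sketch. It assumes the file is a JSON array of objects with a "host" key. That file layout and the name `load_servers_sketch` are assumptions, since the commit does not show the real format.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List, Union


def load_servers_sketch(path: Union[str, Path]) -> List[Dict[str, Any]]:
    """Load server entries, tolerating missing files, bad JSON, and bad entries."""
    try:
        raw = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # OSError covers a missing or unreadable file; ValueError covers
        # json.JSONDecodeError and decoding errors.
        return []
    if not isinstance(raw, list):
        return []
    # Skip anything that is not an object with a non-empty host.
    return [entry for entry in raw if isinstance(entry, dict) and entry.get("host")]


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "servers.json"
    good.write_text(
        json.dumps([{"host": "http://a:11434"}, {"name": "no-host"}, "junk"]),
        encoding="utf-8",
    )
    bad = Path(tmp) / "bad.json"
    bad.write_text("{not json", encoding="utf-8")

    loaded = load_servers_sketch(good)
    from_bad = load_servers_sketch(bad)
    from_missing = load_servers_sketch(Path(tmp) / "missing.json")

print(loaded)        # [{'host': 'http://a:11434'}]
print(from_bad)      # []
print(from_missing)  # []
```

Returning an empty list instead of raising lets the caller decide how to handle a deployment with no usable servers.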
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 11 May
|
| 2026-05-02 |
Refine 3D modeler workflow
Eugene Sukhodolskiy
committed
on 2 May
|
| 2026-05-01 |
Improve 3D modeling validation prompts
Eugene Sukhodolskiy
committed
on 1 May
|
| 2026-04-29 |
Add regression tests for content publishing and LLM timeouts
Eugene Sukhodolskiy
committed
on 29 Apr
|
Add `all` accessor to the context provider registry
Eugene Sukhodolskiy
committed
on 29 Apr
|
Add 62 tests across all planned phases + fix integration flakiness
- Phase 3: 19 API route integration tests (health, agents, sessions, messages)
- Phase 3: 7 WebSocket integration tests (connect, send, replay, invalid, stop)
- Phase 4: 9 agent tests (_check_context_size, _iter_stream_guarded)
- Phase 4: 5 planning tests (_parse_plan_steps)
- Phase 5: 22 tool tests (filesystem 13, code_exec 5, terminal 4)
- Fix flaky integration tests by patching module-level singletons (_session_store, _registries, _workers) instead of getter functions, because FastAPI Depends() captures the original function at import time.
- Update docs/testing.md coverage table (150 total tests)
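The reason patching the getter has no effect can be reproduced without FastAPI. In this sketch a plain variable plays the role of the reference the framework captured at import time. The module object `app` and the getter `get_session_store` are stand-ins invented for the example. Only the singleton name `_session_store` comes from the commit.

```python
import types
from unittest import mock

# Stand-in for an application module with a module-level singleton.
app = types.ModuleType("app_sketch")
app._session_store = {"kind": "real"}


def get_session_store():
    # Reads the singleton from the module at call time.
    return app._session_store


app.get_session_store = get_session_store

# At import time a framework stores a reference to the function object
# itself, not to the module attribute that happens to point at it.
captured_dependency = app.get_session_store

# Patching the getter rebinds the module attribute only; the captured
# reference still points at the original function.
with mock.patch.object(app, "get_session_store", lambda: {"kind": "fake"}):
    via_getter_patch = captured_dependency()

# Patching the singleton changes what the original function returns.
with mock.patch.object(app, "_session_store", {"kind": "fake"}):
    via_singleton_patch = captured_dependency()

print(via_getter_patch)     # {'kind': 'real'}
print(via_singleton_patch)  # {'kind': 'fake'}
```

Both patches are undone when their `with` blocks exit, so later tests see the real store again.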
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 29 Apr
|

Bootstrap test suite — Phase 1 unit tests
- docs/testing.md: testing strategy, mock strategy, phase breakdown
- tests/conftest.py: autouse fixture to reset navi.config.settings per test
- tests/conftest_factory.py: FakeLLMBackend, FakeTool, make_profile, make_registry helpers
- tests/unit/core/test_events.py: wire serialization for all 15 event dataclasses
- tests/unit/core/test_compressor.py: should_compress, partition_messages, format_for_summary, compress_context
- tests/unit/core/test_registry.py: ToolRegistry, ProfileRegistry, BackendRegistry
- tests/unit/core/test_context_builder.py: system prompt caching, persona injection, goal anchor, iteration budget
- tests/unit/profiles/test_base.py: Pydantic model coercion, defaults, extra fields
- navi/core/context_builder.py: use module-level `import navi.config` instead of `from navi.config import settings` so tests can swap the singleton
59 tests passing.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 29 Apr
|