| 2026-05-26 |
Add persistent multi-session terminal tool with background support
...
- New TerminalManager module: named subprocess sessions per Navi session,
background readers, event-sink streaming, idle auto-cleanup
- Refactor terminal tool to multi-action: run, open, close, list,
status, send_input
- Add TerminalOutputDelta and TerminalClosed events for streaming
- Wire TerminalManager into AppContainer, orchestrator, and registry
- Persist session_metadata in Session model and pg_session_store
- Close all session terminals on session delete
- Webclient: handle terminal_output/terminal_closed WS events,
display live terminal output in tool cards
- Update unit tests for new terminal actions
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 26 May
|
| 2026-05-25 |

Fix 19 issues found in full codebase review
...
Backend:
- Stop session auth bypass: require auth for owned sessions, reject anonymous with 401
- upload_file: stream chunks directly to disk instead of buffering in RAM
- MCP config: validate name against path traversal regex
- auth deps: cleanup stale refresh locks periodically
- auth routes: expire mobile auth states after 10 min to prevent unbounded growth
- compressor: meta-summarize existing summaries before compression; preserve assistant content when tool_calls present; rewrite hard_truncate to keep whole turns
- orchestrator: configurable WS replay buffer size; async cleanup/remove_websocket/clear_busy; fix run_recall ContextVar order to avoid deadlock on _build_agent failure; await cleanup in finally
- agent: persist image_msg in session.messages; remove archived messages from session after archive; remove duplicate StreamStopped yield on tool stop
- websocket: try/except around create_task with cleanup on failure; await remove_websocket
Frontend:
- App.vue: hashchange listener lifecycle in onMounted/onUnmounted
- MessageList.vue: passive scroll, flash timeout cleanup, archive scroll snapshot
- InputBar.vue: 300 ms debounce on draft save to localStorage
- SessionList.vue: remove :key from DynamicScroller to avoid remount jitter
Tests: 422 passed, 1 skipped
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 25 May
|
Add meta-summary for multi-level compression
...
When to_summarize contains multiple existing summary messages whose
combined length exceeds 8000 chars (~1/3 of max summarizer input),
run a quick meta-summary pass first to consolidate them into a single
compact summary before the main compression. This prevents information
loss when repeated compressions stack up long summary chains.
- _meta_summarize(): fast LLM pass (think=False, max_tokens=1500)
- compress_context(): detects >1 long summaries and triggers meta pass
- Graceful fallback: if meta-summary fails, continue with raw summaries
- 3 new unit tests: consolidation, skipped for short summaries, failure fallback
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 25 May
|
Wire archive trigger into agent after compression
...
After _do_compress_and_save finishes, if the total persisted message count
(db_next_sequence) exceeds session_messages_window (default 1000), the agent
now calls archive_old_messages() to move older rows into
session_messages_archive.
Adds session_messages_window config and unit tests for archive SQL.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 25 May
|
Review fixes: restore _build_sessions, fix flags, search filter, tests
...
- Restored _load_messages_map and _build_sessions helpers that were
accidentally dropped in Phase 5 (list_all/list_page/search_list
called them but they were missing, causing NameError at runtime)
- _build_sessions now filters messages by is_display so list methods
return consistent display-only history like get()
- count_all/search_list EXISTS subquery now filters to is_display=true
so search only matches visible chat messages
- Updated pg_session_store docstring to remove stale dual-write claim
- compressor summary_msg now defaults to is_display=False
- Added unit tests for message flags (agent, compressor, planning)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 25 May
|
Phase 2: Dual-write with is_context/is_display flags on Message
...
- Message model gets is_context and is_display bools
- PgSessionStore.save() writes flags directly to session_messages
- Agent sets is_context=False on display-only messages, is_display=False on context-only
- Planning: plan context msg is_display=False, plan marker is_context=False
- Compression: summarized messages get is_context=False, summary added to messages with is_display=False
- Tests updated for extra user display+context messages per turn
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 25 May
|
Add subagent progress report on failure
...
When a subagent stops (timeout, max iterations, thinking stall, user stop),
it now returns a structured progress report built from its local message
context, so the parent agent knows what tools were called and what was
accomplished before the stop.
- Add _build_progress_report() to SubAgentRunner
- Report includes: turn number, assistant text, tool calls with results
- Prepended to result_text for every stop reason (completed also gets it)
- Updated test_run_ephemeral_complete to expect the report prefix
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 25 May
|
| 2026-05-24 |
Fix MCP tool spinner bug: match tool_started → tool_call by tool_call_id
...
- Add tool_call_id field to ToolStarted and ToolEvent dataclasses
- Pass tc.id as tool_call_id from agent.py, subagent_runner.py, and tool_executor.py
- Update frontend chat.js onToolStarted/onToolCall to match cards by toolCallId
with fallback to name-matching for backward compatibility
Closes spinner issue where LLM short name ("search_docs") didn't match
resolved MCP name ("mcp__gnexus_book__search_docs").
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 24 May
|
Add auth resilience: user cache, retry, and API token fallback
...
- 30-second in-memory _user_cache to avoid hammering gnexus-auth
- _fetch_user_with_retry: one retry after 1.5s sleep on transient failure
- API token fallback when OAuth cookie is present but refresh fails
- Clear cache/locks in test fixture to prevent cross-test pollution
- Fix registry timeout test after lowering default to 90s
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 24 May
|
Apply review fixes to API token auth system
...
Backend:
- navi/auth/deps.py: replace 3 DB round-trips with single JOIN query for
token resolution; update last_used_at still separate (best-effort)
- navi/api/routes/api_tokens.py: replace asyncpg-specific "UPDATE 1"
string check with RETURNING id fetchrow; increase token_prefix from
8 to 12 chars for better visual identification; add security notes
- tests/unit/auth/test_api_tokens.py: update tests for JOIN query and
RETURNING-based revoke
Frontend:
- webclient/src/components/settings/ShowTokenModal.vue: new modal that
shows the plain token in a readonly field with copy button and
explicit warning — replaces the transient toast notification
- webclient/src/components/settings/ApiKeysPanel.vue: use ShowTokenModal
- webclient/src/composables/useWebSocket.js: add security comment about
localStorage XSS risk and query param log exposure
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 24 May
|

Add API token auth system for headless/micro clients
...
Backend:
- navi/auth/_ddl.py: add api_tokens table with boot-time migration
- navi/auth/deps.py: _resolve_user now falls back to X-Api-Token header
and ?api_token query param for WebSocket auth
- navi/auth/__init__.py: add ApiToken pydantic model
- navi/api/routes/api_tokens.py: CRUD endpoints (POST/GET/DELETE)
- navi/main.py: wire api_tokens router
Frontend:
- webclient/src/App.vue: add #settings hash routing
- webclient/src/components/settings/: SettingsView, ApiKeysPanel,
CreateKeyModal with copy-to-clipboard flow
- webclient/src/api/index.js: token CRUD API functions
- webclient/src/stores/apiTokens.js: Pinia store
- webclient/src/components/sidebar/AppSidebar.vue: settings link
- webclient/src/composables/useWebSocket.js: append ?api_token= when
localStorage token is present
Tests:
- tests/unit/auth/test_api_tokens.py: 10 unit tests covering token
resolution (header + query param), revoke, missing/revoked tokens,
orphan users, and CRUD endpoints
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 24 May
|
| 2026-05-23 |
Unify in-memory session state in AgentSessionOrchestrator
...
Replace scattered _runs + _busy_sessions + _session_sockets with a
single _sessions: dict[str, SessionState] on the orchestrator.
- SessionState dataclass holds run, busy_event, and websockets
- _session_sockets module-level global removed from websocket.py;
socket tracking moved into orchestrator (add/remove_websocket)
- Event bus subscriber _on_recall_update moved into orchestrator
- Per-session asyncio.Lock added to protect concurrent-run guard
- _cleanup() auto-removes empty SessionState entries
Tests updated to reference _sessions instead of legacy _runs.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 23 May
|
Pass explicit ToolContext to tools instead of hidden ContextVars
...
Add ToolContext dataclass (session_id, event_sink, stop_event, model,
user_id, user_role, user_info) and thread it through the execution chain:
Agent._execute_tools_with_sink → ToolExecutor → tool.execute().
All ~25 tools updated to accept ctx parameter. Tools that previously
read ContextVar now prefer ctx when provided, falling back to
ContextVar for backward compatibility.
Tests updated to pass ToolContext explicitly — no more test fixtures
that set current_session_id / current_user_id ContextVars.
ContextVar setters remain as fallback for non-tool consumers
(ai_helper, context_builder, planning) and will be removed in a
follow-up refactor.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 23 May
|
Fix auth race condition causing frequent logouts
...
Add per-session-id asyncio.Lock around token refresh to prevent
parallel requests from simultaneously refreshing the same token.
Re-read the session inside the lock so a second request can use
the token already refreshed by the first one.
Stop deleting the auth session on refresh failure — transient
errors (network, race condition, expired refresh token) were
wiping the session and forcing a full re-login.
+ tests for both behaviours.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 23 May
|
| 2026-05-21 |
Fix token counting: show only completion tokens, not cumulative prompt+completion
...
The token_count displayed next to assistant messages was summing
prompt_tokens + completion_tokens across ALL tool-calling iterations,
giving hundreds of thousands of tokens for multi-turn conversations.
Now:
- token_count (coins icon) = only completion tokens generated by the model
- context_tokens (ContextBar) = only prompt tokens (context size sent to LLM)
This gives users a realistic measure of how much the model actually generated.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 21 May
|

Migrate MCP tool naming from mcp:server:tool to mcp__server__tool
...
The colon separator (mcp:server:tool) confuses many LLMs during
tool-calling because colons appear in schemas and URLs. Switch to
double-underscore separator (mcp__server__tool) for robust parsing.
Key changes:
- navi/mcp/tools.py: add build_mcp_name(), parse_mcp_name(), is_mcp_tool()
- navi/core/tool_executor.py: update _resolve_tool() with new helpers
and legacy colon fallback for old sessions
- navi/core/tool_utils.py, subagent_runner.py: use build_mcp_name()
- navi/api/routes/{admin,agents}.py: prefix via build_mcp_name()
- navi/tools/{list_tools,reload_tools}.py: migrated
- All profile configs + system_prompt.txt: replace mcp: with mcp__
- manuals/{model_3d,lint_scad,render_3d,spawn_agent}.md: updated
- mcp_servers.d/gnexus-book.json: instructions updated
- docs/{api,profiles,tools,mechanics,visual.html}: updated
- tests: test_tool_executor.py and test_mcp.py aligned
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 21 May
|
FallbackOllamaBackend: do not blacklist single server, empty file fallback
...
- When only one Ollama server is configured, LLMConnectionError no longer
adds it to the dead-server blacklist. This fixes the bug where a
transient failure permanently blocked all requests until server restart.
- LLMModelNotFoundError on a single server is also not blacklisted.
- _discover_backends now falls back to settings.ollama_host when the
ollama_backends_file is empty, missing, or returns no valid servers.
- Added unit tests covering single-server no-blacklist, multi-server
blacklist, model-not-found no-blacklist, and empty-file fallback.
400 passed, 1 skipped
Eugene Sukhodolskiy
committed
on 21 May
|
McpTool: auto-inject session_id + normalize navi-3d paths
...
- McpTool.execute() now forces the real session_id from current_session_id
ContextVar, preventing LLM hallucinations of wrong UUIDs (ghost-session bug).
- For navi-3d MCP server, source_path/output_path are normalized to basename
to prevent double path nesting when the LLM passes full relative paths.
- Updated MCP tool descriptions to ask for filenames only.
- Added system prompt instructions in context_builder and subagent_runner
reminding the model to pass bare filenames to navi-3d tools.
396 passed, 1 skipped
Eugene Sukhodolskiy
committed
on 21 May
|
| 2026-05-18 |
Extract single shared Database pool, eliminate 4 duplicated pool creations
...
- Create navi/db.py::Database managing one asyncpg pool
- KvStore, PgSessionStore, MemoryStore, RecallScheduler now accept pool in constructor
- AppContainer holds Database, shutdown closes one pool instead of 4
- create_container creates one pool and passes it to all stores
- All tests updated to set _initialized=True on fakes to skip DDL
392 passed, 1 skipped
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 18 May
|
Make Settings immutable (frozen=True) and fix all test mutations
...
- Add frozen=True to SettingsConfigDict in navi/config.py
- Convert model_validator to mode="before" since mode="after" cannot mutate frozen instances
- Replace all field-level monkeypatches in tests with whole-Settings object replacement
- Ensure cross-module settings consistency (content_store, session_files, share_file, content_publish, filesystem)
392 passed, 1 skipped
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 18 May
|
Extract WebSocket business logic into AgentSessionOrchestrator
...
- Create navi/core/orchestrator.py with AgentSessionOrchestrator and SessionRun
- Orchestrator owns _runs, _busy_sessions, Agent creation, run_agent(), run_recall()
- Transport-agnostic: accepts notify callback from WebSocket handler
- WebSocket handler (websocket.py) now only does serialization/deserialization
- _fire_recall delegates to orchestrator.run_recall() instead of inline logic
- recall_scheduler_loop now accepts orchestrator parameter
- AppContainer gains .orchestrator field, created in create_container()
- deps.py: add get_orchestrator()
- Update integration tests for scheduler_loop and websocket unit tests
All 392 tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 18 May
|
Add WebSocket handler unit tests
...
Tests for reconnect/replay, concurrent-run guard, event buffering,
and session_sync behavior after both normal and recall runs.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 18 May
|

Replace global lazy singletons with explicit AppContainer + lifespan
...
- Create navi/core/container.py with AppContainer dataclass and create_container()
- Rewrite navi/api/deps.py: remove module-level singletons, add _container global
fallback + set_container(), use _resolve_container() for all getters
- Replace @app.on_event with @asynccontextmanager lifespan in main.py
- Update routes to use Depends(get_scheduler) instead of calling get_scheduler()
- Fix FastAPI body parsing bug: remove dataclass parameters from Depends getters
(FastAPI was treating AppContainer sub-dependencies as Body params, forcing
embed=True on all endpoint body params and causing 422 errors)
- Update websocket.py to use _resolve_container() instead of get_registries()
- Update integration test fixtures to build AppContainer and call set_container()
- Remove obsolete tests/unit/test_startup.py (tests removed _on_startup)
- Fix test_scheduler_loop.py fixture (get_registries no longer exists)
All 387 tests pass (excluding websocket hang tests).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 18 May
|
| 2026-05-16 |
Fix session file URLs (add /api prefix) and WebGL error handling in STL viewer
...
Backend:
- _file_url in content_store.py now returns /api/sessions/... instead of /sessions/...
- share_file.py URL construction updated to include /api
- content_publish.py docstring updated to reflect correct endpoint
Frontend:
- contentLinks.js: avoid double /api prefix in dev mode when backend already returns it
- stl.html: replace throw err with return after showing WebGL fallback message
- Rebuild dist
Tests:
- Update expected URLs in test_content_store.py and test_share_file.py
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 16 May
|
Step 4: Extract SubAgentRunner from run_ephemeral()
...
- Create navi/core/subagent_runner.py with full sub-agent loop logic
- Move _iter_stream_guarded to navi/core/stream_guard.py
- Move _check_context_size to ContextCompressor.check_context_size()
- Extract build_tool_list() and load_user_enabled_tools() to tool_utils.py
- Agent.run_ephemeral() becomes a thin wrapper delegating to SubAgentRunner
- Remove ~310 lines from agent.py
- All existing run_ephemeral tests pass unchanged
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 16 May
|
Step 3: Extract AntiStallMonitor from run_stream()
...
- Create navi/core/anti_stall.py with AntiStallMonitor class
- Encapsulates stall detection (todo progress + repeated tool calls)
- Encapsulates adaptive re-plan (failed todo step detection)
- Provides init() / pre_turn() / post_turn() two-phase interface
- Remove ~50 lines of stall/replan logic from agent.py run_stream()
- Remove _todo_status_snapshot and _todo_failed_steps helpers from agent.py
- Update AgentTurnContext: remove stall fields (now live in AntiStallMonitor)
- Add 13 unit tests for pre_turn and post_turn behavior
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 16 May
|
Extract ContextCompressor, fix STL viewer, expand test suite, add architecture audit docs
...
- Extract ContextCompressor from agent.py (Step 1 of god-object refactor)
- Add retry + hard-truncate fallback logic to ContextCompressor
- Add unit tests: agent loop (14), compressor (18), KV store (8),
auth encrypt (3), auth deps (13), todo/scratchpad/image_view/memory
- Fix WebGL STL viewer: allow-same-origin sandbox + graceful fallback
- Add CompressionStarted event and client-side compression notice
- Add docs/architecture_weak_spots.md and plan_01_god_object_agent.md
- Update test_events.py and test_agent_context_size.py for new logic
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 16 May
|
Fix review issues: KV-store NULL, SVG messages, filesystem tests + docs
...
Bugs fixed:
- filesystem.py: add missing `import re` (grep was broken in production)
- image_view.py: consistent SVG rejection message for URL and file paths
- store/__init__.py: normalize user_id None→'' to prevent duplicate rows
in unique constraint for anonymous sessions; add DDL migration for
existing NULL values
Tests:
- Add 10 unit tests for filesystem copy, grep, diff operations
Documentation:
- agent.md: document streaming guard wrapper, system prompt caching,
ContextVar restoration in subagents
- tools.md: document middleware hooks
- websocket.md: document image upload limits and concurrent run guard
- store.md: document user_id normalization
- mechanics.md: mark newly-documented mechanics as documented
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 16 May
|
Fix test_rejects_empty to match current _check_path behavior
...
Empty string is explicitly rejected by _check_path since the
guard clause was added. The test comment was stale.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 16 May
|
Enhance native toolset and add persistent KV store
...
- Add PostgreSQL-backed KvStore (navi/store/) for session-scoped data.
- Migrate todo and scratchpad from in-memory dicts to KvStore.
- Filesystem: add copy, grep, diff actions; compress description.
- CodeExec: remove language param, expose working_dir in schema.
- ImageView: resize to 1024px JPEG + Content-Type guard for URLs.
- Memory list: return distinct categories instead of all facts.
- SSH: add scp action with upload/download support.
- Update CLAUDE.md (Postgres-only), docs/tools.md, add docs/store.md.
- Fix agent/planning/context_builder async signatures for todo helpers.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 16 May
|