root/navi-1

History for navi-1 / navi / api / websocket.py

2026-05-04	bad65a1 Fix legacy session visibility and add WebSocket auth debug logging
    - pg_session_store: remove OR user_id IS NULL from list_all/list_page so legacy sessions are no longer visible to all users
    - auth/deps.py: add debug logging at every step of _resolve_user
    - websocket.py: add debug logging at every stage of websocket_session
    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 4 May
2026-05-03	3014ba6 Multi-user auth via gnexus-auth OAuth + hybrid role/permission model
    - Integrate gnexus-auth-client-py (GAuthClient) for OAuth flow, token refresh, and webhook parsing
    - Add navi/auth/ package: User model, Fernet encryptor, client singleton, deps (get_current_user, require_admin, require_permission)
    - New tables: navi_users, user_auth_sessions (auto-created on startup)
    - Session/memory isolation by user_id with legacy NULL support
    - Cookie-based auth proxy: /auth/login, /callback, /logout, /me
    - Webhook receiver /webhooks/gnexus-auth handling user events, global logout, session revocation, role/permission changes
    - Admin endpoints (/admin/*) gated by role + permissions
    - Webclient auth store with isAdmin/hasPermission guards
    - Admin-only profile filtering in /agents/profiles
    - 200/200 tests passing
    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 3 May
2026-04-29	8f68841 Architecture extensibility — event bus, middleware, auto-discovery, Pydantic profiles
    - EventBus: async pub/sub for AgentEvents, WebSocket subscribes instead of direct yield
    - Declarative serialization: AgentEvent.to_wire() on all event types
    - Auto-discovery for LLM backends (_discover_backends) and workers (scan navi/workers/*.py)
    - AgentProfile: Pydantic BaseModel with extra='allow', @field_validator for model coercion
    - Tool middleware chain: pre/post execute hooks via ToolRegistry.add_middleware()
    - LoggingMiddleware: built-in, logs every tool call
    - Fix pg_trgm DDL: conditional GIN indexes via DO $$ block, no CREATE EXTENSION
    - New files: event_bus.py, middleware.py, logging_middleware.py
    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 29 Apr
2026-04-29	098401a Stability fixes batch — tech debt review 2026-04-29
    Critical:
    - Concurrent WS run race guard (#1)
    - Tool task cancellation on generator teardown (#2)
    - StopAsyncIteration kills fallback chain (#3)
    - Session loading race with _lastLoadId guard (#4)
    - ContentCard .match() crash on non-string result (#5)
    - Image data type guard in buildMessageList (#6)
    High:
    - Cap WS replay buffer at 500 events (#7)
    - Deduplicate memory extraction task with asyncio.Lock (#9)
    - TTL-based fallback blacklisting (5 min) (#10)
    - Subagent tool exception isolation (#11)
    - Inline image size/count validation on WS (#12)
    - Clean up orphaned file on DB insert failure (#13)
    - Deep watch streamingMsg for auto-scroll (#14)
    - WS_SCHEME wss:// support for HTTPS (#15)
    - Sending guard against duplicate message sends (#16)
    - Global unhandledrejection listener in API layer (#17)
    Medium:
    - Cap planning_logs at 20 entries (#22)
    - Store cleanup_loop task reference (#23)
    - BaseException → Exception in _run_with_sentinel (#24)
    - Propagate SystemExit in agent loop (#25)
    - Configurable output_reserve_tokens (#26)
    - Always reloadSession on session_sync (#30)
    - FIFO queue for confirm dialogs (#31)
    - Reset body.overflow on ImageLightbox unmount (#32)
    - try/finally in fallback copy (#33)
    - _isConnecting guard in WS send() (#34)
    Low:
    - Lazy-init deps.py singletons (#36)
    - Replace __import__ with direct imports (#38)
    - Preserve token count 0 in ollama.py (#39)
    - Clear orphaned streamingMsg on reconnect reload (#43)
    - Escape single quote in UserMessage (#44)
    - Polyfill-free findLast replacement (#48)
    - Match <table> tags with attributes in markdown (#49)
    - Attach copy buttons only when msg.done (#50)
    - Fix hasMeta falsy-0 bug (#53)
    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 29 Apr
2026-04-28	cabfce8 Fix system prompt leakage into chat history; polish content cards
    Backend:
    - websocket.py + agent.py: separate user-visible display_message from LLM user_message. System hints (image/file attachments) no longer leak into session.messages and appear after page reload.
    - main.py: add ensure_tables() on startup so session_content table is created before first publish.
    - profiles: add kimi-k2.6:cloud to all model lists as fallback.
    Frontend:
    - ContentCard.vue: remove border-radius, add scrollbar styles, fix metadata fallback parsing so cards survive page reload.
    - content-viewers/*.html: add matching scrollbar styles.
    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 28 Apr
2026-04-28	b88b7c0 Add content hosting system with inline viewers
    Backend:
    - Add navi/content/ directory for published files
    - Add content_store.py with publish/list/delete/cleanup functions
    - Add content_publish tool for publishing files as viewable content
    - Add /content static file mount in main.py
    - Add /content-viewers mount for viewer pages
    - Extend ToolEvent with metadata field
    - Forward metadata through websocket tool_call events
    - Update Agent to include metadata in ToolEvent
    Frontend:
    - Add ContentCard.vue component for displaying published content
    - Add viewer pages: stl.html (Three.js), svg.html, html.html, pdf.html
    - Update AssistantMessage.vue to render ContentCard for content_publish
    - Update chat store to preserve metadata in tool cards
    - Update websocket protocol docs with metadata field
    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 28 Apr
2026-04-28	d9e9f4d Stop image_view hallucinations on inline-attached images
    The model was inventing fake paths/URLs (e.g. files.oaiusercontent.com, /home/ubuntu/navi-1/input_file_0.png) and calling image_view on them when the user attached an image directly in chat — the image was already in the multimodal context, but the tool description and lack of a signal pushed the model to "load" it anyway.
    - websocket.py: when a user message has inline images, append a brief note that they are already in context.
    - image_view.py: soften the description — keep proactive use for paths and URLs the model genuinely cannot see, but tell it inline images don't need this tool.
    Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 28 Apr
2026-04-25	d025bfc Fix websocket.py: unpack 4-tuple from get_registries(), pass cp_registry to Agent
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 25 Apr
2026-04-21	b43428a WebSocket event replay buffer for disconnect resilience
    On reconnect to an active agent run the server now replays all events emitted since the turn started, then switches to live forwarding. This eliminates the gap where tool cards, thinking blocks and stream deltas were permanently lost after a network blip.
    Server (_AgentRun):
    - events: list[dict] buffers every serialised agent event
    - broadcast() serialises and appends before putting in subscriber queues
    - reconnect flow: subscribe → replay_count snapshot → stream_start → replay events[0:replay_count] → live _stream_to_client
    Client:
    - onStreamStart() removes the frozen ghost message instead of marking done=true, so replay cleanly rebuilds the message from scratch
    - replayMode flag suppresses animations during replay
    - onReplayStart/onReplayEnd handlers set/clear the flag and restore animate on the message once live events resume
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 21 Apr
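
The reconnect flow in b43428a (subscribe, snapshot replay_count, send stream_start, replay, then go live) might look roughly like the sketch below. This is an illustration, not the code from websocket.py: the AgentRun class, the reattach() helper and the send callable are hypothetical stand-ins, and only the field and step names come from the commit message.

```python
import asyncio


class AgentRun:
    """Hypothetical stand-in for the _AgentRun named in the commit message."""

    def __init__(self):
        self.events = []          # every serialised event emitted this turn
        self.subscribers = set()  # one asyncio.Queue per attached client

    def broadcast(self, event):
        # Append to the buffer first, then fan out to live subscribers.
        self.events.append(event)
        for queue in self.subscribers:
            queue.put_nowait(event)

    def subscribe(self):
        # Register the queue and snapshot the buffer length with no await in
        # between, so each event lands either in the replay slice or in the
        # queue, never in both and never in neither.
        queue = asyncio.Queue()
        self.subscribers.add(queue)
        return queue, len(self.events)


async def reattach(run, send):
    """Replay buffered events to a reconnecting client, then hand back the live queue."""
    queue, replay_count = run.subscribe()
    await send({"type": "stream_start"})
    for event in run.events[:replay_count]:
        await send(event)
    return queue  # the caller keeps forwarding live events from this queue
```

Subscribing before taking the snapshot is the design point: events broadcast during the replay wait in the queue instead of being lost.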
2026-04-17	59f01b3 Route subagent planning events into spawn_agent card in the UI
    Previously PlanningStatus/PlanReady had no is_subagent flag, so subagent planning spinners and plan cards rendered as top-level Navi planning UI.
    Backend:
    - Add is_subagent field to PlanningStatus and PlanReady events
    - _run_planning accepts is_subagent param, passes it through all yields
    - run_ephemeral calls _run_planning with is_subagent=True
    - websocket.py forwards is_subagent in planning_status and plan_ready messages
    Frontend (chat.js):
    - onPlanningStatus: if is_subagent, set planningLabel on the last spawn_agent card instead of msg.statusLabel
    - onPlanReady: if is_subagent, push plan into spawn card steps and clear planningLabel; otherwise behave as before
    Frontend (ToolCard.vue):
    - Render subagent-planning-indicator (spinner + label) when planningLabel set
    - Render plan cards inside subagent steps using the same plan-card pattern
    Also includes leftover session changes: spawn_agent default 40 in description and manual, updated manual content.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 17 Apr
2026-04-17	0c3dc98 Planning phases, context compression, and tool improvements
    Agent:
    - Planning now a 3-phase async generator: Analysis → Execution plan → AIHelper critic
    - Yield PlanningStatus events before each phase (UI progress labels)
    - Phase 1 runs with think=True for deeper analysis
    - Phase 2 includes available tool list so executor assignments are accurate
    - Phase 3: independent critic pass validates and corrects TOOL: names against real tool list
    - Planning converted from list return to async generator (fixes token accounting)
    Backend:
    - Context compression threshold: 80% → 70% to trigger earlier
    - Compressor summary prompt: structured sections (goal, work state, key facts, outputs, errors)
    - Terminal output capped at 5000 chars to prevent context flooding
    - Web search: region=wt-wt for DDG, country=ALL for Brave, language=all for SearxNG
    - Scratchpad: mandate writing a 'goal' section at start of multi-step tasks
    - secretary max_iterations: 40→25, temperature: 0.7→0.5
    - server_admin max_iterations: 40→20
    Webclient:
    - ThinkingCard strips <thought> XML tags leaked by Ollama
    - planning_status WS event wired to chat.onPlanningStatus()
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 17 Apr
2026-04-16	f83886a Fix WS disconnect and missed stream on reconnect
    Two related problems:
    - During long AIHelper calls (non-streaming LLM), no data flows to the WebSocket and browsers drop the connection after ~30-60s of inactivity. Fixed with a 20s heartbeat: _stream_to_client now uses asyncio.wait_for and sends {"type":"heartbeat"} on timeout to keep the connection alive.
    - After reconnect, if the agent finished while the client was offline, _runs no longer holds the session and no stream_start is sent. Client would reconnect silently with no response shown. Fixed by sending {"type":"session_sync"} on every new WS connection (after reattach completes or immediately when no run is active) so the client knows to reload session history.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 16 Apr
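
A minimal sketch of the heartbeat pattern that f83886a describes, assuming events arrive on an asyncio.Queue and that None marks the end of the run. The stream_to_client() signature, the send callable and the None sentinel are assumptions for illustration. Only the 20s interval, the use of asyncio.wait_for and the {"type":"heartbeat"} payload come from the commit message.

```python
import asyncio

HEARTBEAT_INTERVAL = 20.0  # seconds, per the commit message


async def stream_to_client(queue, send, heartbeat_interval=HEARTBEAT_INTERVAL):
    """Forward events from `queue` to `send`, emitting a heartbeat when idle.

    `send` is any async callable that takes a dict (a hypothetical stand-in
    for a WebSocket send). A None item on the queue ends the stream.
    """
    while True:
        try:
            event = await asyncio.wait_for(queue.get(), timeout=heartbeat_interval)
        except asyncio.TimeoutError:
            # Nothing arrived within the interval: keep the connection alive.
            await send({"type": "heartbeat"})
            continue
        if event is None:
            break
        await send(event)
```

Because the timeout only wraps queue.get(), a slow producer never causes an event to be dropped. The loop simply sends a heartbeat and waits again.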
2026-04-16	a338f8b Add response metrics: elapsed time, tool calls, token count
    Server:
    - Message model: elapsed_seconds, tool_call_count, token_count fields (display-only, excluded from LLM context via exclude_none)
    - StreamEnd event: carries same three fields
    - agent.run_stream: tracks turn start time, counts ToolEvent completions, writes metrics onto the final assistant Message before saving to DB
    - WebSocket: forwards metrics in stream_end payload
    Client:
    - chat.onStreamEnd: attaches elapsed_seconds, tool_call_count, token_count to the streaming message on completion
    - buildMessageList: scans each assistant group for metrics from history
    - AssistantMessage: renders .msg-meta-row below the response — timer icon + Xs · wrench icon + N tools · coins icon + Nk tokens · time (each item only shown if present; time pushed right via margin-left: auto)
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 16 Apr
2026-04-14	bbc93b5 Expose compression summary as collapsible debug card in chat UI
    ContextCompressed event now carries the full summary text produced by the LLM. Compression notice in chat becomes a <details> element showing message count (before→after) with the summary expandable on click. Rendered as markdown via marked.js.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 14 Apr
2026-04-11	fe6d7bc Add planning phase and scratchpad tool for smarter task execution
    - ScratchpadTool: session-scoped working notepad with named sections (write/append/read/clear). Lets Navi capture intermediate findings between tool calls instead of losing track of them.
    - Planning phase: when profile.planning_enabled=True, a fast pre-loop LLM call (think=False, no tools) outlines a numbered plan before any actions are taken. The plan is injected into session context as an assistant message so the model naturally continues from it.
    - PlanReady event + plan_ready WebSocket message + plan card in UI (green-tinted, collapsible, mirroring thinking card design).
    - secretary and server_admin profiles: planning_enabled=True, scratchpad added to enabled_tools, system prompts updated with explicit execution discipline instructions.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 11 Apr
2026-04-11	65428f5 Fix WebSocket state corruption preventing messages after first reply
    Replace concurrent WS reads (_stream_recv + recv_task.cancel()) with HTTP stop endpoint (POST /sessions/{id}/stop). Cancelling a background receive_text() task corrupted Starlette's WS state, breaking all subsequent receives. Now the WS has a single reader at all times.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 11 Apr
2026-04-10	86402e0 Add stop button and fix context compression hang
    Stop generation:
    - Client: send button toggles to red ■ during streaming; sends {type:stop} via WS
    - Server: _stream_recv concurrently reads incoming messages during streaming using asyncio.wait — stop signal is handled immediately without polling
    - Cooperative stop via asyncio.Event (current_stop_event ContextVar): agent breaks out of LLM async-for cleanly so aclose() fires → Ollama stream closes gracefully, model stays in VRAM. No task.cancel() which would eject the model.
    - StreamStopped event propagates through run_stream/run_ephemeral; sub-agents stop via the same shared stop_event inherited through task context
    Context compression fix:
    - compress_context passes think=False to llm.complete() — no extended reasoning during summarization which caused GPU hang
    - Input truncated to 12k chars before sending to summarizer
    - LLMBackend.complete() / OllamaBackend.complete() accept think: bool | None override
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 10 Apr
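
The cooperative stop in 86402e0 can be illustrated with the sketch below. Only the current_stop_event ContextVar name and the asyncio.Event mechanism come from the commit message. fake_llm_stream(), agent_loop(), main() and the closed list are hypothetical stand-ins for the real LLM stream and agent.

```python
import asyncio
from contextvars import ContextVar

# Holds the stop event for the current run. Tasks created after it is set
# inherit the value, because create_task() copies the current context.
current_stop_event = ContextVar("current_stop_event", default=None)

closed = []  # records that the stream's cleanup code actually ran


async def fake_llm_stream():
    """Hypothetical stand-in for a streaming LLM response."""
    try:
        i = 0
        while True:
            await asyncio.sleep(0.01)
            yield f"tok{i}"
            i += 1
    finally:
        closed.append(True)  # runs when the consumer calls aclose()


async def agent_loop():
    stop_event = current_stop_event.get()
    tokens = []
    stream = fake_llm_stream()
    try:
        async for tok in stream:
            if stop_event.is_set():
                break  # leave the loop cleanly instead of being cancelled
            tokens.append(tok)
    finally:
        await stream.aclose()  # close the stream gracefully
    return tokens


async def main():
    stop_event = asyncio.Event()
    current_stop_event.set(stop_event)
    task = asyncio.create_task(agent_loop())
    await asyncio.sleep(0.05)
    stop_event.set()   # what a stop request would do
    return await task  # the task finishes normally, it is never cancelled
```

Here asyncio.run(main()) returns the tokens collected before the stop. The task ends by returning rather than by CancelledError, which is the property the commit relies on.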
2026-04-10	2012de2 Profile switch: emit WS event so client updates UI immediately
    ProfileSwitched event emitted by switch_profile tool via current_event_sink. Client handles profile_switched: updates chat header, profile selector, and local sessions[] — no page refresh needed.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 10 Apr
2026-04-10	1e8b65e Major feature batch: visibility, planning, file uploads, streaming
    - stream_complete(): streaming with tools for all LLM turns — thinking now streams as ThinkingDelta/ThinkingEnd in real-time during tool-selection turns, not just on the final response
    - todo built-in tool: session-scoped plan manager (set/view/update/clear); persona + all profiles updated with mandatory planning instructions
    - TurnThinking event: sub-agent thinking forwarded to parent sink as a collapsible block in the spawn_agent card
    - File uploads: non-image files uploaded via XHR, shown as badges in message bubble; SVG treated as regular file (not base64 image)
    - session_files: POST /sessions/{id}/files, TTL cleanup, forbidden exts
    - WebSocket reconnect: _AgentRun broadcast pattern, re-attach mid-stream
    - UI: favicon, sidebar logo, turn-thinking cards, subagent thinking blocks, token counter, draft persistence, file progress bar
    - Removed AgentNote (content is always None alongside tool_calls)
    - Ollama stream_complete: tool_calls captured from non-final chunk (done=False)
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 10 Apr
2026-04-09	4112e36 Live tool visibility: pending cards, sub-agent step log
    Backend:
    - ToolStarted event: emitted before tool execution begins so client can render a pending card with spinner immediately
    - ToolEvent gains is_subagent flag; ToolStarted same
    - current_event_sink ContextVar in tools/base.py — run_stream() sets it to an asyncio.Queue before create_task(); run_ephemeral() reads it and puts ToolStarted/ToolEvent into the queue as each sub-agent step runs
    - run_stream() tool loop: sequential execution via create_task() + polling drain loop (20ms sleep); yields ToolStarted → sub-agent events from sink → ToolEvent (completed) for each tool call
    - run_ephemeral() rewritten to inline sequential tool execution with sink emission (replaces _execute_tool_calls gather)
    - _run_single_tool() helper extracted for run_stream()
    - websocket.py handles tool_started and adds is_subagent to tool_call
    Frontend:
    - appendPendingToolCard(): creates card with spinner; spawn_agent opens body immediately to show sub-agent log as it fills
    - finalizeToolCard(): fills result, removes spinner, adds toggle; strips "[Sub-agent result — ...]" reminder prefix from displayed text
    - appendSubagentStep() / finalizeSubagentStep(): live step log inside spawn_agent card — each sub-agent tool call gets a ↳ row
    - app.js: tool_started → pending card; tool_call → finalize card; is_subagent routing to sub-step vs main card; abandonStream() resets pendingToolCard/pendingSubStep
    - CSS: .spinner-inline for card headers; .subagent-log / .subagent-step for nested step display; .tool-body-open for always-open spawn_agent body; .tool-card.pending suppresses chevron
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 9 Apr
2026-04-09	56611d7 Add long-term user memory system
    Architecture:
    - navi/memory/store.py: MemoryStore backed by SQLite (memory_facts, memory_summary, session_memory_state tables in navi.db)
    - navi/memory/extractor.py: LLM-based fact extraction from sessions + summary regeneration (triggered after session goes idle >30 min)
    - Fact upsert uses UNIQUE(category, key) — same key always overwrites, no duplicates or stale contradictions
    - Keyword search across category + key + value (LIKE-based, no extra deps)
    Context injection:
    - Memory summary injected as an ephemeral system message on every LLM call via Agent._with_memory() — never persisted to session.context
    Tools (all profiles):
    - memory_search(query): keyword search against fact DB; persona instructs model to call it at session start and before personal-context questions
    - memory_forget(key, category?): delete a specific fact on user request
    Extraction trigger:
    - On new session creation, fire-and-forget background task checks all sessions idle >30 min with unprocessed messages → runs extraction
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 9 Apr
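
The upsert and keyword search described in 56611d7 could be sketched as follows. The memory_facts table name and the UNIQUE(category, key) constraint come from the commit message. The remaining columns and the open_store(), upsert_fact() and search_facts() helpers are assumptions, not the MemoryStore API.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open a store with an assumed minimal memory_facts schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS memory_facts (
               id INTEGER PRIMARY KEY,
               category TEXT NOT NULL,
               key TEXT NOT NULL,
               value TEXT NOT NULL,
               UNIQUE(category, key)
           )"""
    )
    return conn


def upsert_fact(conn, category, key, value):
    # The same (category, key) pair overwrites the stored value instead of
    # adding a second row, so contradictory facts cannot pile up.
    conn.execute(
        "INSERT INTO memory_facts (category, key, value) VALUES (?, ?, ?) "
        "ON CONFLICT(category, key) DO UPDATE SET value = excluded.value",
        (category, key, value),
    )
    conn.commit()


def search_facts(conn, query):
    # Plain LIKE matching across all three text columns, no extra dependencies.
    like = f"%{query}%"
    return conn.execute(
        "SELECT category, key, value FROM memory_facts "
        "WHERE category LIKE ? OR key LIKE ? OR value LIKE ? ORDER BY id",
        (like, like, like),
    ).fetchall()
```

The ON CONFLICT clause needs SQLite 3.24 or newer. On older builds, INSERT OR REPLACE gives similar behaviour but assigns a new row id.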
2026-04-08	261459a Separate display history from LLM context; formalize worker system
    Architecture change:
    - session.messages: full display history, never modified by compression
    - session.context: what the LLM sees, may be compressed by workers
    - System messages go only into context (not display history)
    - Image injections (synthetic) go only into context
    - User/assistant/tool messages go into both
    SQLite: add context column with backward-compat migration (empty context → initialized from messages on load)
    Workers (navi/workers/):
    - Worker ABC + WorkerContext + WorkerResult (base.py)
    - CompressionWorker: compresses session.context when above threshold
    - build_default_workers() returns [CompressionWorker()]
    - Agent accepts workers list, runs them after StreamEnd
    - Workers injected via deps.py get_workers() (lru_cached singleton)
    - WebSocket agent construction also receives workers
    Compressor: compress_context() now takes context[], not messages[]
    Config: context_keep_recent 6 → 10
    Agent: _run_workers() collects events from all workers and yields them
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 8 Apr
2026-04-08	802c186 Add context compression: rolling summarization when context fills up
    Mechanism:
    - After streaming ends, if context_tokens >= threshold (80% of num_ctx), compress old turns into a summary message using the same LLM
    - Partition: keep system msg + last N turns verbatim (default 6); everything older goes to the summarizer
    - Tool call groups (assistant + tool results) never split across boundary
    - Existing summary messages folded into new compression pass — no stack growth
    - Summary stored as Message(role=user, is_summary=True) after system msg
    - On failure: logged, session left unchanged (non-fatal)
    New files:
    - navi/core/compressor.py: should_compress, partition_messages, compress_session (pure logic, testable without agent)
    New config (navi/config.py):
    - context_compression_enabled: bool = True
    - context_compression_threshold: float = 0.80
    - context_keep_recent: int = 6
    - context_summary_temperature: float = 0.3
    New agent event: ContextCompressed(messages_before, messages_after)
    Message.is_summary: bool field marks compressed history blocks
    Client:
    - context_compressed WS event → subtle inline notice in message list
    - loadHistory: is_summary messages rendered as collapsible summary cards
    - style.css: .summary-card, .compression-notice
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 8 Apr
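
A rough sketch of the threshold check and partition step that 802c186 describes. The function names should_compress and partition_messages appear in the commit message, but the signatures, the dict-based message shape and the choice to count keep_recent in messages rather than turns are assumptions made for this illustration.

```python
def should_compress(context_tokens, num_ctx, threshold=0.80):
    """True once the context has reached `threshold` of the model window."""
    return context_tokens >= threshold * num_ctx


def partition_messages(messages, keep_recent=6):
    """Split messages into (system, old, recent).

    `messages` is a list of dicts with a "role" key (assumed shape). The
    leading system message is kept aside, the last `keep_recent` messages
    stay verbatim, and everything older is handed to the summarizer.
    """
    system = [m for m in messages[:1] if m["role"] == "system"]
    body = messages[len(system):]
    cut = max(len(body) - keep_recent, 0)
    # Never start the recent part on a tool result: move the boundary back
    # so an assistant tool call and its results stay on the same side.
    while 0 < cut < len(body) and body[cut]["role"] == "tool":
        cut -= 1
    return system, body[:cut], body[cut:]
```

With the 65536-token default window, the 0.80 threshold corresponds to 52428.8 tokens, so compression would first trigger at 52429 context tokens.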
2026-04-08	9c0c6b3 Add context token counter: 64k default, live UI display
    - config: ollama_num_ctx default 8192 → 65536
    - LLMChunk: add prompt_tokens / completion_tokens fields
    - OllamaBackend.stream: populate token counts from final chunk (prompt_eval_count + eval_count when chunk.done)
    - StreamEnd: add context_tokens and max_context_tokens
    - Agent.run_stream: capture token counts, pass to StreamEnd
    - websocket: include context_tokens / max_context_tokens in stream_end
    - index.html: split chat-header into title span + token-counter span
    - sidebar.js: updateChatHeader targets #chat-header-title, not innerHTML
    - app.js: updateTokenCounter() shows "X/Y (Z%) tokens", colors: gray <50%, amber 50–79%, red ≥80%
    - style.css: .token-counter, .warn, .danger styles
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 8 Apr
2026-04-08	f5f8d90 Server review fixes: profile model routing, sorting, datetime, cleanup
    - LLMBackend.complete/stream: add model param; OllamaBackend uses it over self.model, enabling per-profile model selection
    - BackendRegistry.get(): remove unused model param
    - Agent: pass profile.model to complete() and stream()
    - Profiles: correct model to gemma4:e2b-it-q8_0 (was leftover e4b)
    - InMemorySessionStore.list_all(): fix sort (pinned+newest first, was pinned+oldest) — now consistent with SQLite ORDER BY
    - session.py, sqlite_session_store.py: datetime.utcnow() → datetime.now(timezone.utc) (deprecated since Python 3.12)
    - _base_options(): accept temperature param, remove dead default
    - deps.py: rename _registries → get_registries (public API)
    - websocket.py: update import accordingly
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 8 Apr
2026-04-08	c9ee0ec Add thinking/reasoning streaming support
    Enable Ollama think param and stream reasoning chunks to client. New agent events: ThinkingDelta, ThinkingEnd. Config gains ollama_think and ollama_num_ctx settings. WebSocket protocol updated accordingly.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 8 Apr
2026-04-08	9a8056d Add multimodal image support and client UX improvements
    Server:
    - Add ImageViewTool (load image from file/URL, returns base64)
    - Add images field to Message model with created_at timestamp
    - Agent run/run_stream accept images param; inject image messages after image_view tool calls
    - WebSocket handler accepts images array from client, strips data URI prefix
    - All profiles include image_view tool
    - Fix tool call serialization (model_dump mode=json for datetime)
    - Add no-store cache headers for static files
    Client:
    - Image attachment: file picker button + clipboard paste + preview strip with remove
    - Images rendered in chat bubbles; loaded from history
    - Tool cards rebuilt as div+CSS toggle (fixes details/overflow-hidden collapse bug)
    - Tool cards appear before response bubble (lazy bubble creation on first stream_delta)
    - Typing indicator persists through tool calls, removed only when text starts streaming
    - Tool cards restored from history on page reload
    - Message timestamps stored via created_at field, shown correctly in history
    - Session ID reflected in URL hash for bookmarking; restored on page load
    - Remove localStorage session tracking (server last_active used instead)
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
    Eugene Sukhodolskiy committed on 8 Apr
2026-04-08	41cdab1 Initial implementation of the agent system core
    - FastAPI server with REST API and WebSocket streaming
    - Modular LLM backend abstraction (Ollama implemented, OpenAI stub)
    - Tool system: web_search (ddgs), filesystem, http_request, code_exec, terminal
    - Agent profiles: smart_home, server_admin, secretary
    - Tool-calling loop with concurrent tool execution
    - In-memory session store with SessionStore ABC for future persistence
    - Registry pattern for tools, profiles, and backends
    - Orchestrator stub as foundation for multi-agent scenarios
    Eugene Sukhodolskiy committed on 8 Apr