root/navi-1

History for navi-1 / navi / core

2026-04-21	e9d1e77 Remove code-specific scoping rules from planning prompt ... Keep only the universal comma test heuristic — code-specific rules were too narrow and cluttered the prompt. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 21 Apr
2026-04-21	0db1ea6 Tighten AGENT step scoping in planning prompt ... Added comma test heuristic: if a step description lists things with 'and' or commas, each item is a separate step. Added code-specific guidance: one step = one file or one focused feature addition, never scaffold + logic + helpers combined. Replaced abstract good/bad examples with concrete code implementation examples. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 21 Apr
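
The comma test from 0db1ea6 is described only in prose. A minimal sketch of how such a check might look follows; the function names `comma_test` and `is_single_step` and the exact splitting rule are assumptions for illustration, not code from the repository.

```python
import re

# Hypothetical splitter: breaks a step description on commas and on the
# standalone word "and" (", and" is treated as one separator).
_SEPARATOR = re.compile(r"\s*,\s*(?:and\s+)?|\s+and\s+")


def comma_test(description: str) -> list[str]:
    """Return the individual items a step description lists."""
    parts = [part.strip() for part in _SEPARATOR.split(description)]
    return [part for part in parts if part]


def is_single_step(description: str) -> bool:
    """A description passes the test when it names exactly one item."""
    return len(comma_test(description)) <= 1


items = comma_test("scaffold the module, add parsing logic and write helpers")
```

Here `items` holds three entries, so by the heuristic the description should become three separate AGENT steps.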
2026-04-20	98c0be9 Adaptive re-plan on todo step failure ... When a todo step is newly marked failed, queue a targeted system message for the next iteration prompting the model to revise its remaining pending steps before continuing. Enabled by adaptive_replan_enabled flag (on by default in developer profile). Zero overhead when no failure occurs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 20 Apr
2026-04-20	9704a92 Autonomous reasoning improvements: budget, anchoring, anti-stall, validation ... - AgentProfile: per-profile thinking mechanics flags (think_enabled, iteration_budget_enabled, goal_anchoring, anti_stall, step_validation, planning_reflect, adaptive_replan) — all profiles updated in config.json - Iteration budget: inject remaining iterations into context so model knows when to wrap up; urgency levels at ≤7 and ≤3 remaining - Goal anchoring: inject original goal + todo state every N iterations to prevent drift on long tasks - Anti-stall: two signals — no todo progress for N iterations, or identical tool calls repeated N times; warning injected into context - Todo step validation: marking done requires a validation field describing how result was verified; failed gets a soft nudge with tip for re-planning - stream_complete: add think param to base class, ollama and openai backends - Summarizer: raise max_tokens 1024→3000, expand system prompt with user-preferences section and verbatim-value instructions - Compression card: persist to session.messages (is_compression flag on Message), show expandable summary in webclient with markdown body - ToolResult.to_message_content: always include output on failure so tracebacks and error details reach the model (fixes silent Error: None) - Developer profile: fix subagent profile secretary→developer, add write_tool to subagent_tools, clarify write_tool vs filesystem in system prompt Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 20 Apr
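
Two of the mechanics in 9704a92 reduce to small checks: the iteration budget with urgency levels at ≤7 and ≤3 remaining, and the anti-stall signal for identical tool calls repeated N times. The sketch below illustrates both under assumptions: the level names, the message wording, and the `(tool_name, serialized_args)` history shape are hypothetical; only the thresholds come from the commit message.

```python
def budget_notice(remaining: int) -> str:
    """Build the context line announcing how many iterations are left."""
    if remaining <= 3:
        level = "critical"   # assumed label for the <=3 threshold
    elif remaining <= 7:
        level = "warning"    # assumed label for the <=7 threshold
    else:
        level = "info"
    return f"[budget:{level}] {remaining} iterations remaining"


def repeated_calls(history: list[tuple[str, str]], n: int) -> bool:
    """True when the last n tool calls are identical (name and arguments)."""
    return len(history) >= n and len(set(history[-n:])) == 1


calls = [("filesystem", '{"action": "read"}')] * 3
stalled = repeated_calls(calls, 3)
```

The second anti-stall signal (no todo progress for N iterations) would need the todo state and is left out of this sketch.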
2026-04-20	94e32e9 Planning debug panel, todo auto-populate, scratchpad/persona improvements ... - Planning debug panel: new Planning tab in debug/index.html shows raw phase 1/2 outputs and token counts per planning run, stored in session.planning_logs (new column in both SQLite and PostgreSQL) - New GET /sessions/{id}/planning API endpoint - PlanningDebugData internal event wires _run_planning() output into session storage; never forwarded to WebSocket clients - Phase 3 (plan critic) disabled — to be reworked with reflect integration - Todo tool: auto-populated from plan steps after phase 2; model only needs to call update/view, not set - Scratchpad: clarified description and persona instructions; removed context_transfer from user-facing docs (internal mechanism only) - web_search: switched to ddgs package, SearXNG as primary backend, DDG html-only fallback; added find_up action to filesystem tool - Persona: added SCRATCHPAD and TODO sections with clear usage rules; added NAVI.md project context instructions - chat.js: fixed subagent planning event fallthrough into parent UI; statusLabel cleared on first stream delta Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 20 Apr
2026-04-17	59f01b3 Route subagent planning events into spawn_agent card in the UI ... Previously PlanningStatus/PlanReady had no is_subagent flag, so subagent planning spinners and plan cards rendered as top-level Navi planning UI. Backend: - Add is_subagent field to PlanningStatus and PlanReady events - _run_planning accepts is_subagent param, passes it through all yields - run_ephemeral calls _run_planning with is_subagent=True - websocket.py forwards is_subagent in planning_status and plan_ready messages Frontend (chat.js): - onPlanningStatus: if is_subagent, set planningLabel on the last spawn_agent card instead of msg.statusLabel - onPlanReady: if is_subagent, push plan into spawn card steps and clear planningLabel; otherwise behave as before Frontend (ToolCard.vue): - Render subagent-planning-indicator (spinner + label) when planningLabel set - Render plan cards inside subagent steps using the same plan-card pattern Also includes leftover session changes: spawn_agent default 40 in description and manual, updated manual content. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 17 Apr
2026-04-17	2d2d5c4 Fix subagent planning isolation and raise default max_iterations to 40 ... - run_ephemeral signature default: max_iterations=20 → 40 (consistent with spawn_agent's explicit default) - _run_planning accepts system_prompt_override; when called from run_ephemeral, passes the subagent's isolated system prompt instead of _build_system_prompt(profile) which includes the full orchestrator persona and profiles block — subagents now plan with only their own executor context Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 17 Apr
2026-04-17	3ddd995 Fix core subagent misuse: enforce 1 plan step = 1 spawn_agent call ... Root cause: nowhere was it stated that each AGENT step in the plan maps to a separate spawn_agent call. Navi was bundling all AGENT steps into a single call, dumping the full plan on one subagent. spawn_agent description: - Lead with: "Delegate EXACTLY ONE step of your plan" - Explicit: "3 AGENT steps = 3 spawn_agent calls" - Remove "multi-step sub-task" wording that invited bundling - briefing: clarify as static context only (credentials, paths, instructions) Dynamic findings from prior steps → context_transfer, not briefing Planning Phase 2 prompt: - Add AGENT scoping rules: each step = one focused unit, not "do everything" - Add good/bad examples of AGENT step granularity - Show multiple AGENT steps in the format example Secretary & server_admin system prompts: - Add explicit 1:1 rule with counter-example - Show correct multi-agent execution pattern with code example - Clarify briefing vs context_transfer boundary everywhere Persona: - "ONE PLAN STEP = ONE spawn_agent CALL" as first sentence in SUB-AGENTS - Field descriptions tightened: briefing = static, context_transfer = dynamic Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 17 Apr
2026-04-17	b9bef33 Subagent system prompt rework: separate from parent, briefing as system context ... run_ephemeral: - Add briefing param (passed from spawn_agent, injected into system prompt) - Subagent system prompt is now completely separate from parent's system_prompt: 1. profile.subagent_system_prompt (executor persona) 2. custom_system_prompt (role specialisation for this task) 3. briefing (task context as system-level instruction) Fallback to profile.system_prompt only if subagent_system_prompt is not defined spawn_agent: - task → user message only (the goal) - briefing → system prompt (credentials, context, instructions) - system_prompt → role specialisation injected alongside briefing - Removed old user-message composition (## Context / ## Task split) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 17 Apr
2026-04-17	9c8ef3d Fix NameError in _run_planning: session.context → context after refactor ... Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 17 Apr
2026-04-17	73cab8a Improve subagent system: isolated tools, custom prompts, context transfer, timeout ... AgentProfile: - New fields: subagent_tools, subagent_planning_enabled, subagent_system_prompt - loader.py: loads subagent_tools/subagent_planning_enabled from config.json, reads optional subagent_system_prompt.txt per profile Profiles: - Each profile now has a dedicated subagent_tools list (focused subset, no admin tools) - subagent_planning_enabled: false (configurable per profile) - New subagent_system_prompt.txt per profile with executor-focused instructions run_ephemeral: - Uses profile.subagent_tools instead of enabled_tools - Builds subagent context without persona or profiles block (focused executor) - Injects subagent_system_prompt after profile.system_prompt - Accepts context_transfer: priming exchange injected before task message - Wall-clock timeout (default 5 min) checked per iteration - Returns (result_text, completed: bool) instead of bare string - Optionally runs planning phase if profile.subagent_planning_enabled spawn_agent: - Removed briefing param; task is now fully self-contained - Added system_prompt param: custom injected prompt for this specific task - Auto-reads parent scratchpad context_transfer section via get_section() - Result prefixed with [STATUS: completed|limit_reached] - Timeout 300s scratchpad: - Added get_section(session_id, section) helper for cross-session reads Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 17 Apr
2026-04-17	0c3dc98 Planning phases, context compression, and tool improvements ... Agent: - Planning now a 3-phase async generator: Analysis → Execution plan → AIHelper critic - Yield PlanningStatus events before each phase (UI progress labels) - Phase 1 runs with think=True for deeper analysis - Phase 2 includes available tool list so executor assignments are accurate - Phase 3: independent critic pass validates and corrects TOOL: names against real tool list - Planning converted from list return to async generator (fixes token accounting) Backend: - Context compression threshold: 80% → 70% to trigger earlier - Compressor summary prompt: structured sections (goal, work state, key facts, outputs, errors) - Terminal output capped at 5000 chars to prevent context flooding - Web search: region=wt-wt for DDG, country=ALL for Brave, language=all for SearXNG - Scratchpad: mandate writing a 'goal' section at start of multi-step tasks - secretary max_iterations: 40→25, temperature: 0.7→0.5 - server_admin max_iterations: 40→20 Webclient: - ThinkingCard strips <thought> XML tags leaked by Ollama - planning_status WS event wired to chat.onPlanningStatus() Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 17 Apr
2026-04-17	6e3ab45 Add reflect tool: three parallel expert perspectives ... ReflectTool runs Critic / Pragmatist / Detailer advisors concurrently via asyncio.gather() + AIHelper.ask(). Each role has a distinct system prompt designed to produce genuinely different analysis: - Critic: challenges assumptions, surfaces risks and logical gaps - Pragmatist: finds the simplest path, cuts unnecessary complexity - Detailer: spots missing requirements, edge cases, ambiguities Parameters: situation (required), assumptions (required list — the key input that forces Navi to surface implicit beliefs), tried (optional). Registered as a builtin with AIHelper injection. Added to all three profiles. Persona updated with guidance on when to use it (complex or ambiguous tasks before planning, or when stuck mid-execution). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 17 Apr
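
The concurrent advisor pattern in 6e3ab45 can be sketched with `asyncio.gather()`. Everything named here is an assumption: the real signature of `AIHelper.ask()` is not shown in the log, so the sketch takes a stand-in `ask(system_prompt, question)` coroutine, and the role prompts are paraphrased from the commit message rather than copied from the tool.

```python
import asyncio
from typing import Awaitable, Callable

# Stand-in type for an LLM call; AIHelper.ask() may look different.
Ask = Callable[[str, str], Awaitable[str]]

# Role prompts paraphrased from the commit message (hypothetical wording).
ROLE_PROMPTS = {
    "critic": "Challenge assumptions; surface risks and logical gaps.",
    "pragmatist": "Find the simplest path; cut unnecessary complexity.",
    "detailer": "Spot missing requirements, edge cases, ambiguities.",
}


async def reflect(ask: Ask, situation: str, assumptions: list[str]) -> dict[str, str]:
    """Query all three advisors concurrently and key the answers by role."""
    question = situation + "\nAssumptions:\n" + "\n".join(f"- {a}" for a in assumptions)
    # gather() returns results in the order the awaitables were passed in.
    answers = await asyncio.gather(*(ask(p, question) for p in ROLE_PROMPTS.values()))
    return dict(zip(ROLE_PROMPTS, answers))


async def echo_ask(system_prompt: str, question: str) -> str:
    """Offline stub that echoes its inputs instead of calling a model."""
    await asyncio.sleep(0)
    return f"[{system_prompt}] {question}"


result = asyncio.run(reflect(echo_ask, "Migrate the DB", ["downtime is acceptable"]))
```

Because `gather()` preserves argument order, zipping the answers back onto the role names is safe even when the calls finish out of order.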
2026-04-16	b1dd9ca Count AIHelper tokens in session metrics ... Adds prompt/completion token fields to LLMResponse, populated by OllamaBackend.complete(). AIHelper emits AIHelperTokensUsed into the current event sink after each LLM call; run_stream drains it into _subagent_tokens so AIHelper usage is reflected in the turn token delta. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 16 Apr
2026-04-16	533f9ee Add AIHelper + filesystem query/smart_edit AI actions ... AIHelper (navi/core/ai_helper.py): - Reusable LLM utility for AI-enhanced tools: ask() and ask_json() - Reads current_model ContextVar (set per-turn) so tools always use the session's active model without extra wiring - Robust JSON extraction: strips markdown fences, bracket-matching fallback current_model ContextVar (navi/tools/base.py): - New ContextVar set by run_stream() and run_ephemeral() after profile is resolved; AIHelper reads it to pick the right model automatically filesystem query action: - Natural language question about any file, chunked at ~20k tokens of content (~80k chars) with 30-line overlap between chunks - Single-chunk: one LLM call; multi-chunk: partial answers accumulated then synthesized in a final call filesystem smart_edit action: - Natural language edit instruction on files up to ~200k chars - LLM outputs JSON patch ops: replace / delete / insert (1-based lines) - Ops validated then applied bottom-up to preserve line numbers - Returns unified diff of changes; preserves trailing newline registry: AIHelper created once, OllamaBackend reused (no double init), FilesystemTool receives ai_helper at construction Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 16 Apr
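
The smart_edit note in 533f9ee says patch ops use 1-based lines and are applied bottom-up so that earlier line numbers stay valid. A minimal sketch of that idea follows. The op schema (`op`, `start`, `end`, `after`, `lines`) is an assumption, since the log does not show the actual JSON format, and the ops are assumed to be already validated and non-overlapping.

```python
def apply_ops(lines: list[str], ops: list[dict]) -> list[str]:
    """Apply replace/delete/insert ops (1-based line numbers) bottom-up.

    Hypothetical schema:
      {"op": "replace", "start": 2, "end": 3, "lines": [...]}
      {"op": "delete",  "start": 5, "end": 5}
      {"op": "insert",  "after": 4, "lines": [...]}   # after=0 inserts at the top
    """
    def anchor(op: dict) -> float:
        # An insert after line k sorts just below line k, so it is applied
        # before any op that touches line k itself.
        return op["after"] + 0.5 if op["op"] == "insert" else op["start"]

    out = list(lines)
    # Highest line numbers first: edits never shift lines still to be edited.
    for op in sorted(ops, key=anchor, reverse=True):
        if op["op"] == "insert":
            out[op["after"]:op["after"]] = op["lines"]
        elif op["op"] == "delete":
            del out[op["start"] - 1:op["end"]]
        elif op["op"] == "replace":
            out[op["start"] - 1:op["end"]] = op["lines"]
        else:
            raise ValueError(f"unknown op: {op['op']!r}")
    return out


patched = apply_ops(
    ["a", "b", "c", "d"],
    [
        {"op": "replace", "start": 1, "end": 1, "lines": ["A1", "A2"]},
        {"op": "delete", "start": 3, "end": 3},
        {"op": "insert", "after": 4, "lines": ["e"]},
    ],
)
```

Applied top-down, the two-line replacement at line 1 would shift every later line by one and the delete would hit the wrong line; sorting in descending order avoids that without any offset bookkeeping.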
2026-04-16	59cdf7f Make profile switching autonomous: switch immediately, inform after ... Previously Navi asked for permission before switching profiles. Updated both the injected profiles block in the system prompt and the switch_profile tool description to explicitly say: switch on your own judgment, do not ask, then inform the user which profile is active and why. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 16 Apr
2026-04-16	62ad39f Add profile discoverability: list_profiles tool + system prompt injection ... - AgentProfile: new short_description (1-line) and full_description (dict with specialization / when_to_use / key_tools) fields - All 3 profile configs: structured descriptions added; list_profiles added to enabled_tools - _build_system_prompt: now accepts full AgentProfile; injects compact "Available profiles" block into every system prompt so Navi always knows what other profiles exist and when to switch — dynamically, no hardcoding - ListProfilesTool: new built-in; returns structured per-profile details (specialization, when_to_use, key_tools); accepts optional profile_id for single-profile lookup - registry: register list_profiles_tool after profiles registry is built Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 16 Apr
2026-04-16	af8dfdb Fix metrics: net token delta, subagent aggregation, ContextBar always visible ... - run_stream: track _prev_tokens baseline before turn loop; compute net token cost as (context_tokens - prev) + subagent_tokens for per-message cost - run_stream: intercept SubagentComplete in sink drain loop to accumulate subagent token and tool-call counts into the parent turn's totals - run_ephemeral: already emitting SubagentComplete (from prior session) - msg-meta-row: remove margin-left:auto from .msg-meta-time so time groups inline with elapsed/tools/tokens instead of floating right - ContextBar: remove v-if guard so bar is always visible (not only after first LLM response with token data) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 16 Apr
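
The net token formula from af8dfdb, `(context_tokens - prev) + subagent_tokens`, is easy to check with a worked example. The function name and the sample numbers below are illustrative assumptions; only the formula comes from the commit message.

```python
def net_turn_tokens(context_tokens: int, prev_tokens: int, subagent_tokens: int) -> int:
    """Per-message cost: growth of the parent context plus subagent usage."""
    return (context_tokens - prev_tokens) + subagent_tokens


# Context grew from 9,000 to 12,500 tokens during the turn,
# and subagents spent another 4,200 tokens on their own.
cost = net_turn_tokens(context_tokens=12_500, prev_tokens=9_000, subagent_tokens=4_200)
```

With these numbers the turn costs 3,500 + 4,200 = 7,700 tokens, rather than the full 12,500 that a raw context count would report.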
2026-04-16	a338f8b Add response metrics: elapsed time, tool calls, token count ... Server: - Message model: elapsed_seconds, tool_call_count, token_count fields (display-only, excluded from LLM context via exclude_none) - StreamEnd event: carries same three fields - agent.run_stream: tracks turn start time, counts ToolEvent completions, writes metrics onto the final assistant Message before saving to DB - WebSocket: forwards metrics in stream_end payload Client: - chat.onStreamEnd: attaches elapsed_seconds, tool_call_count, token_count to the streaming message on completion - buildMessageList: scans each assistant group for metrics from history - AssistantMessage: renders .msg-meta-row below the response — timer icon + Xs · wrench icon + N tools · coins icon + Nk tokens · time (each item only shown if present; time pushed right via margin-left: auto) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 16 Apr
2026-04-16	5c34cfd Add session name generation via LLM ... Backend: - Session model gets name: str | None field - SQLite migration: ADD COLUMN name TEXT - PostgreSQL: ADD COLUMN IF NOT EXISTS name TEXT (applied on pool init) - SessionStore: add set_name() abstract method, implemented in all stores - navi/core/name_generator.py: LLM worker that reads user messages and returns a 3–6 word title or None if content isn't substantial yet - POST /sessions/{id}/generate-name endpoint: fires LLM, saves and returns name; skips if session already named or has no user messages - GET /sessions and GET /sessions/{id} now include name field Client: - api.generateSessionName(id) — calls the new endpoint - sessions store: updateName(id, name) mutation - chat store: after stream_end, _tryGenerateName() runs fire-and-forget; skips silently if session already has a name or if request fails - SessionItem already displays session.name (falls back to id prefix) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 16 Apr
2026-04-16	ea5766e Persist thinking and plan cards across session reloads ... - Message: add thinking and is_plan fields (display-only, not sent to LLM) - Agent main loop: accumulate thinking per iteration, save with assistant message - _run_planning: also append plan to session.messages with is_plan=True so UI can render plan cards after page reload (context already had the plan) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 16 Apr
2026-04-15	23e0a5d Fix Ollama connection leak and empty message bug in agent ... - _iter_stream_guarded: track chunk_task as nullable, cancel in finally block to prevent zombie HTTP connections accumulating under load - Final turn: use `content or None` so empty text isn't saved to DB - client/index.html: point to new Vue webclient build - profiles: add email_manager tool Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 15 Apr
2026-04-15	2d2bf84 Migrate storage to PostgreSQL with SQLite fallback; misc fixes ... - Add PgSessionStore (asyncpg pool) and PgMemoryStore replacing aiosqlite - Keep SqliteSessionStore + SqliteMemoryStore for zero-dependency quick start - Selection logic in deps.py: DATABASE_URL set → PG, else → SQLite - Add asyncpg>=0.29 to dependencies; add DATABASE_URL / DB_PATH to config - Add RESPONSE HYGIENE rule to persona: never echo tool output or plan state - Add developer profile user tools: weather, internal_monitor - Update README: developer profile, DB section, current tool/profile state Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 15 Apr
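
The backend selection rule from 2d2bf84 ("DATABASE_URL set → PG, else → SQLite") might look roughly like the sketch below. This is not the code in deps.py: the function name, the returned labels, and the choice to treat an empty `DATABASE_URL` as unset are all assumptions.

```python
import os
from typing import Mapping


def pick_backend(env: Mapping[str, str] = os.environ) -> str:
    """Return "postgres" when DATABASE_URL is set and non-empty, else "sqlite"."""
    # Assumption: an empty DATABASE_URL counts as "not set".
    return "postgres" if env.get("DATABASE_URL") else "sqlite"


with_pg = pick_backend({"DATABASE_URL": "postgresql://navi@localhost/navi"})
without_pg = pick_backend({"DB_PATH": "navi.db"})
```

Passing the environment in as a mapping keeps the rule testable without mutating `os.environ`.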
2026-04-15	2c4b808 Add delete_tool: trash-based tool removal with restore support ... Moves tool files to tools/.trash/ instead of deleting permanently. Actions: remove (trash + unregister), restore (recover + re-register), list. Data files are intentionally left in place on both remove and restore. Available only in the developer profile. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 15 Apr
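
The trash-based removal in 2c4b808 can be illustrated with plain file moves. The sketch below covers only the file handling for remove, restore, and list; the function names and the one-file-per-tool layout (`<name>.py`) are assumptions, and registry updates (unregister and re-register) are omitted.

```python
import shutil
import tempfile
from pathlib import Path


def remove_tool(tools_dir: Path, name: str) -> Path:
    """Move tools/<name>.py into tools/.trash/ instead of deleting it."""
    trash = tools_dir / ".trash"
    trash.mkdir(exist_ok=True)
    target = trash / f"{name}.py"
    shutil.move(str(tools_dir / f"{name}.py"), str(target))
    return target


def restore_tool(tools_dir: Path, name: str) -> Path:
    """Move a trashed tool file back into the tools directory."""
    target = tools_dir / f"{name}.py"
    shutil.move(str(tools_dir / ".trash" / f"{name}.py"), str(target))
    return target


def list_trash(tools_dir: Path) -> list[str]:
    """Names of tools currently in the trash."""
    trash = tools_dir / ".trash"
    return sorted(p.stem for p in trash.glob("*.py")) if trash.exists() else []


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    tools = Path(tmp)
    (tools / "weather.py").write_text("# tool body\n")
    remove_tool(tools, "weather")
    trashed = list_trash(tools)
    restore_tool(tools, "weather")
    restored = (tools / "weather.py").exists()
```

Data files are never touched by either function, which matches the commit's note that they are left in place on both remove and restore.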
2026-04-15	b4a6be8 Add developer profile; replace write_tool pattern with direct filesystem approach ... - New TestToolTool: runs a user tool's execute() from disk in isolation, returns result or full traceback. No stale module cache — always fresh import. - New developer profile: full architecture knowledge in system prompt (format rules, file locations, workflow, data persistence, common mistakes), test_tool + reload_tools + filesystem/terminal/code_exec toolset, spawn_agent for API research only. - Remove write_tool and reload_tools from server_admin and smart_home profiles. - persona.txt: drop SELF-EXTENSION block; add one-liner to switch to developer profile when the user asks to create/edit a tool. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 15 Apr
2026-04-15	4b64763 Add explicit output token budget for summarizer (context_summary_max_tokens) ... Previously there was no num_predict set for the summarization LLM call, so Ollama used its server default (often 128 tokens — very short summaries). - Add max_tokens param to LLMBackend.complete() and OllamaBackend (→ num_predict) - Add context_summary_max_tokens: int = 1024 to config - Thread it through compress_context() and CompressionWorker Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 15 Apr
2026-04-15	96548a1 Expand summarization budget for better context quality ... - _MAX_SUMMARY_INPUT_CHARS: 12k → 24k chars (2x input fed to summarizer) - context_keep_recent: 10 → 8 turns (2 more turns go into each summary batch) - Summarizer prompt: replace "Be brief" with "Be thorough" — capture code/config snippets and enough detail to continue the conversation without original messages Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 15 Apr
2026-04-14	e08b681 Consolidate memory_search/save/forget into single memory tool ... Three separate tools → one tool with action enum (save/search/forget/list). Reduces tool-slot pressure; same functionality, same MemoryStore backend. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 14 Apr
2026-04-14	c24e51d Add memory_save tool for proactive fact persistence ... Navi previously had no way to write to memory mid-conversation — she could only search and forget. Facts were extracted automatically after sessions went idle for 30+ min, so important context shared by the user could be lost or delayed. - New MemorySaveTool (navi/tools/memory_save.py): upsert a fact by category/key/value; overwrites existing key so no separate forget needed - Registered as builtin alongside memory_search/memory_forget - Added to all three profiles (secretary, server_admin, smart_home) - persona.txt: explicit "call memory_save immediately when..." guidance so Navi saves stable facts as they arrive, not only post-session Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 14 Apr
2026-04-14	bbc93b5 Expose compression summary as collapsible debug card in chat UI ... ContextCompressed event now carries the full summary text produced by the LLM. Compression notice in chat becomes a <details> element showing message count (before→after) with the summary expandable on click. Rendered as markdown via marked.js. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Eugene Sukhodolskiy committed on 14 Apr