root/navi-1

History for navi-1 / navi / core / agent.py

2026-04-25	65ffb4d Add context providers: dynamic system message injection per LLM call
	    - navi/context_providers/ registry + built-in public_url provider (global, always injected)
	    - context_providers/ user directory, hot-reloaded via reload_tools
	    - AgentProfile.context_providers field for per-profile opt-in providers
	    - Agent._collect_context_injections() called before every tool-calling loop
	    - reload_tools now reloads both user tools and user context providers
	    - manuals/write_context_provider.md for Navi, docs/context_providers.md reference
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 25 Apr
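A minimal sketch of how a provider registry like the one described in 65ffb4d might collect injections. Everything here is hypothetical (the `register` helper, the two dictionaries, the `[name] text` format); only the split between always-injected global providers and per-profile opt-in providers comes from the commit message.

```python
from typing import Callable, Dict, List, Optional

# A provider returns text to inject, or None to inject nothing for this call.
Provider = Callable[[], Optional[str]]

GLOBAL_PROVIDERS: Dict[str, Provider] = {}  # always injected
OPT_IN_PROVIDERS: Dict[str, Provider] = {}  # injected only when a profile lists them


def register(name: str, provider: Provider, always: bool = False) -> None:
    """Register a provider as global (always=True) or opt-in."""
    target = GLOBAL_PROVIDERS if always else OPT_IN_PROVIDERS
    target[name] = provider


def collect_context_injections(profile_providers: List[str]) -> List[str]:
    """Run global providers plus the opt-in ones named by the profile."""
    selected = list(GLOBAL_PROVIDERS.items())
    selected += [
        (name, OPT_IN_PROVIDERS[name])
        for name in profile_providers
        if name in OPT_IN_PROVIDERS  # unknown names are skipped silently
    ]
    injections = []
    for name, provider in selected:
        text = provider()
        if text:  # providers may decline to inject anything
            injections.append(f"[{name}] {text}")
    return injections
```

Calling the collector right before each tool-calling loop, as the commit describes, lets a hot-reloaded provider take effect on the next turn without restarting the agent.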
2026-04-24	b1a5f44 Set temperature=1.0, top_k=64, top_p=0.95 for all profiles (Google recommended for gemma4)
	    Also fixes discuss profile memory tools: use combined "memory" tool name, not nonexistent split variants.
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 24 Apr
2026-04-24	66fb042 Add per-phase planning flags and planning_mandatory
	    - planning_mandatory: disables DIRECT shortcut, forces all phases to run
	    - planning_phase1_enabled / phase2_enabled / phase3_enabled: per-phase toggles
	    - planning_phase2_enabled replaces planning_reflect_enabled (migrated in loader with backward compat)
	    - Migrate all profile configs; rewrite docs/profiles.md as full config reference
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 24 Apr
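The backward-compatible rename in 66fb042 could look roughly like the sketch below. The function name and the rule that an explicit new key wins over the legacy one are assumptions; the commit only states that `planning_reflect_enabled` is migrated to `planning_phase2_enabled` in the loader.

```python
def migrate_planning_flags(config: dict) -> dict:
    """Return a copy of a profile config with the legacy flag renamed.

    Assumption: if both keys are present, the new key takes precedence.
    """
    migrated = dict(config)
    if "planning_reflect_enabled" in migrated:
        legacy_value = migrated.pop("planning_reflect_enabled")
        # setdefault keeps an explicitly configured new-style value untouched
        migrated.setdefault("planning_phase2_enabled", legacy_value)
    return migrated
```

Returning a copy rather than mutating the input keeps the loader free of side effects on the raw parsed JSON.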
2026-04-21	f7c7a17 Agent improvements: mandatory planning, tool cleanup, smart_edit fixes
	    - Planning now mandatory on first message of every session (force_plan)
	    - RESOURCES, COMMITMENTS, ATOMICITY fields added to planning phase 1
	    - Todo auto-injected at iteration 0 so model tracks steps immediately
	    - Execution trigger injected after plan to prevent model treating plan as response
	    - Split developer profile: tool_developer (Navi tools) vs developer (general code)
	    - Simplified persona.txt: trimmed redundant content now handled by mechanics
	    - AIHelper.ask(): 120s timeout via asyncio.wait_for to prevent smart_edit hangs
	    - filesystem._smart_edit(): atomic write via temp file + os.replace()
	    - Removed 5 junk user tools (game project artifacts, trivial utilities)
	    - Removed instagram tools (to be rewritten); cleaned enabled.json
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 21 Apr
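The "temp file + os.replace()" pattern mentioned for `filesystem._smart_edit()` can be sketched as follows. The helper name `atomic_write` and the `.tmp-` prefix are illustrative, not taken from the repository.

```python
import os
import tempfile


def atomic_write(path: str, text: str) -> None:
    """Write text to path so readers see either the old or the new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push the data to disk before the rename
        os.replace(tmp_path, path)  # atomic swap on POSIX and Windows
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If the process dies mid-write, the target file keeps its previous contents, which matters when an LLM-generated edit is being applied to a source file.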
	e9d1e77 Remove code-specific scoping rules from planning prompt
	    Keep only the universal comma test heuristic; code-specific rules were too narrow and cluttered the prompt.
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 21 Apr
	0db1ea6 Tighten AGENT step scoping in planning prompt
	    Added comma test heuristic: if a step description lists things with 'and' or commas, each item is a separate step.
	    Added code-specific guidance: one step = one file or one focused feature addition, never scaffold + logic + helpers combined.
	    Replaced abstract good/bad examples with concrete code implementation examples.
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 21 Apr
2026-04-20	98c0be9 Adaptive re-plan on todo step failure
	    When a todo step is newly marked failed, queue a targeted system message for the next iteration prompting the model to revise its remaining pending steps before continuing.
	    Enabled by adaptive_replan_enabled flag (on by default in developer profile). Zero overhead when no failure occurs.
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 20 Apr
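One way to detect a "newly failed" step, as described in 98c0be9, is to diff the todo state before and after a tool call. The dictionary shape (step id to status string), the function names, and the message wording below are assumptions made for illustration.

```python
from typing import Dict, List, Optional


def newly_failed_steps(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """Return ids of steps that are failed now but were not failed before."""
    return [
        step_id
        for step_id, status in after.items()
        if status == "failed" and before.get(step_id) != "failed"
    ]


def replan_message(before: Dict[str, str], after: Dict[str, str]) -> Optional[str]:
    """Build the system message to queue for the next iteration, or None if nothing failed."""
    failed = newly_failed_steps(before, after)
    if not failed:
        return None  # no failure, so no extra message is queued
    pending = [step_id for step_id, status in after.items() if status == "pending"]
    return (
        f"Step(s) {', '.join(failed)} failed. "
        f"Revise the remaining pending step(s) ({', '.join(pending) or 'none'}) before continuing."
    )
```

Because the message is only built when the diff is non-empty, the happy path adds no tokens to the context.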
	9704a92 Autonomous reasoning improvements: budget, anchoring, anti-stall, validation
	    - AgentProfile: per-profile thinking mechanics flags (think_enabled, iteration_budget_enabled, goal_anchoring, anti_stall, step_validation, planning_reflect, adaptive_replan); all profiles updated in config.json
	    - Iteration budget: inject remaining iterations into context so model knows when to wrap up; urgency levels at ≤7 and ≤3 remaining
	    - Goal anchoring: inject original goal + todo state every N iterations to prevent drift on long tasks
	    - Anti-stall: two signals (no todo progress for N iterations, or identical tool calls repeated N times); warning injected into context
	    - Todo step validation: marking done requires a validation field describing how result was verified; failed gets a soft nudge with tip for re-planning
	    - stream_complete: add think param to base class, ollama and openai backends
	    - Summarizer: raise max_tokens 1024→3000, expand system prompt with user-preferences section and verbatim-value instructions
	    - Compression card: persist to session.messages (is_compression flag on Message), show expandable summary in webclient with markdown body
	    - ToolResult.to_message_content: always include output on failure so tracebacks and error details reach the model (fixes silent Error: None)
	    - Developer profile: fix subagent profile secretary→developer, add write_tool to subagent_tools, clarify write_tool vs filesystem in system prompt
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 20 Apr
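The iteration-budget thresholds and the repeated-call stall signal from 9704a92 can be illustrated with two small functions. The level names and the default `repeat_threshold` are assumptions; the ≤7 and ≤3 cut-offs and the "identical tool calls repeated N times" rule come from the commit message.

```python
from typing import List, Tuple

ToolCall = Tuple[str, str]  # (tool name, serialized arguments); hypothetical shape


def budget_level(iteration: int, max_iterations: int) -> str:
    """Map remaining iterations to an urgency level."""
    remaining = max_iterations - iteration
    if remaining <= 3:
        return "critical"
    if remaining <= 7:
        return "warning"
    return "normal"


def is_stalled(recent_calls: List[ToolCall], repeat_threshold: int = 3) -> bool:
    """True when the last `repeat_threshold` tool calls are all identical."""
    if len(recent_calls) < repeat_threshold:
        return False
    tail = recent_calls[-repeat_threshold:]
    return all(call == tail[0] for call in tail)
```

Comparing serialized arguments as well as the tool name avoids flagging a legitimate sequence such as reading three different files with the same tool.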
	94e32e9 Planning debug panel, todo auto-populate, scratchpad/persona improvements
	    - Planning debug panel: new Planning tab in debug/index.html shows raw phase 1/2 outputs and token counts per planning run, stored in session.planning_logs (new column in both SQLite and PostgreSQL)
	    - New GET /sessions/{id}/planning API endpoint
	    - PlanningDebugData internal event wires _run_planning() output into session storage; never forwarded to WebSocket clients
	    - Phase 3 (plan critic) disabled; to be reworked with reflect integration
	    - Todo tool: auto-populated from plan steps after phase 2; model only needs to call update/view, not set
	    - Scratchpad: clarified description and persona instructions; removed context_transfer from user-facing docs (internal mechanism only)
	    - web_search: switched to ddgs package, SearXNG as primary backend, DDG html-only fallback; added find_up action to filesystem tool
	    - Persona: added SCRATCHPAD and TODO sections with clear usage rules; added NAVI.md project context instructions
	    - chat.js: fixed subagent planning event fallthrough into parent UI; statusLabel cleared on first stream delta
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 20 Apr
2026-04-17	59f01b3 Route subagent planning events into spawn_agent card in the UI
	    Previously PlanningStatus/PlanReady had no is_subagent flag, so subagent planning spinners and plan cards rendered as top-level Navi planning UI.
	    Backend:
	    - Add is_subagent field to PlanningStatus and PlanReady events
	    - _run_planning accepts is_subagent param, passes it through all yields
	    - run_ephemeral calls _run_planning with is_subagent=True
	    - websocket.py forwards is_subagent in planning_status and plan_ready messages
	    Frontend (chat.js):
	    - onPlanningStatus: if is_subagent, set planningLabel on the last spawn_agent card instead of msg.statusLabel
	    - onPlanReady: if is_subagent, push plan into spawn card steps and clear planningLabel; otherwise behave as before
	    Frontend (ToolCard.vue):
	    - Render subagent-planning-indicator (spinner + label) when planningLabel set
	    - Render plan cards inside subagent steps using the same plan-card pattern
	    Also includes leftover session changes: spawn_agent default 40 in description and manual, updated manual content.
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 17 Apr
	2d2d5c4 Fix subagent planning isolation and raise default max_iterations to 40
	    - run_ephemeral signature default: max_iterations=20 → 40 (consistent with spawn_agent's explicit default)
	    - _run_planning accepts system_prompt_override; when called from run_ephemeral, passes the subagent's isolated system prompt instead of _build_system_prompt(profile), which includes the full orchestrator persona and profiles block, so subagents now plan with only their own executor context
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 17 Apr
	3ddd995 Fix core subagent misuse: enforce 1 plan step = 1 spawn_agent call
	    Root cause: nowhere was it stated that each AGENT step in the plan maps to a separate spawn_agent call. Navi was bundling all AGENT steps into a single call, dumping the full plan on one subagent.
	    spawn_agent description:
	    - Lead with: "Delegate EXACTLY ONE step of your plan"
	    - Explicit: "3 AGENT steps = 3 spawn_agent calls"
	    - Remove "multi-step sub-task" wording that invited bundling
	    - briefing: clarify as static context only (credentials, paths, instructions); dynamic findings from prior steps → context_transfer, not briefing
	    Planning Phase 2 prompt:
	    - Add AGENT scoping rules: each step = one focused unit, not "do everything"
	    - Add good/bad examples of AGENT step granularity
	    - Show multiple AGENT steps in the format example
	    Secretary & server_admin system prompts:
	    - Add explicit 1:1 rule with counter-example
	    - Show correct multi-agent execution pattern with code example
	    - Clarify briefing vs context_transfer boundary everywhere
	    Persona:
	    - "ONE PLAN STEP = ONE spawn_agent CALL" as first sentence in SUB-AGENTS
	    - Field descriptions tightened: briefing = static, context_transfer = dynamic
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 17 Apr
	b9bef33 Subagent system prompt rework: separate from parent, briefing as system context
	    run_ephemeral:
	    - Add briefing param (passed from spawn_agent, injected into system prompt)
	    - Subagent system prompt is now completely separate from parent's system_prompt:
	        1. profile.subagent_system_prompt (executor persona)
	        2. custom_system_prompt (role specialisation for this task)
	        3. briefing (task context as system-level instruction)
	      Fallback to profile.system_prompt only if subagent_system_prompt is not defined
	    spawn_agent:
	    - task → user message only (the goal)
	    - briefing → system prompt (credentials, context, instructions)
	    - system_prompt → role specialisation injected alongside briefing
	    - Removed old user-message composition (## Context / ## Task split)
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 17 Apr
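The three-part composition order and the fallback rule from b9bef33 can be sketched as a pure function. The function name, parameter names, and the blank-line separator are assumptions; the ordering and the fallback to the profile's regular system prompt follow the commit message.

```python
from typing import Optional


def build_subagent_prompt(
    subagent_system_prompt: Optional[str],
    profile_system_prompt: str,
    custom_system_prompt: Optional[str] = None,
    briefing: Optional[str] = None,
) -> str:
    """Compose the subagent system prompt: executor persona, role, then briefing."""
    # Fall back to the profile's regular prompt only when no executor persona exists.
    base = subagent_system_prompt or profile_system_prompt
    parts = [base, custom_system_prompt, briefing]
    # Skip missing or blank parts so no empty sections appear in the prompt.
    return "\n\n".join(part.strip() for part in parts if part and part.strip())
```

Keeping the task itself out of this function mirrors the commit's split: the task travels as the user message, while everything here is system-level context.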
	9c8ef3d Fix NameError in _run_planning: session.context → context after refactor
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 17 Apr
	73cab8a Improve subagent system: isolated tools, custom prompts, context transfer, timeout
	    AgentProfile:
	    - New fields: subagent_tools, subagent_planning_enabled, subagent_system_prompt
	    - loader.py: loads subagent_tools/subagent_planning_enabled from config.json, reads optional subagent_system_prompt.txt per profile
	    Profiles:
	    - Each profile now has a dedicated subagent_tools list (focused subset, no admin tools)
	    - subagent_planning_enabled: false (configurable per profile)
	    - New subagent_system_prompt.txt per profile with executor-focused instructions
	    run_ephemeral:
	    - Uses profile.subagent_tools instead of enabled_tools
	    - Builds subagent context without persona or profiles block (focused executor)
	    - Injects subagent_system_prompt after profile.system_prompt
	    - Accepts context_transfer: priming exchange injected before task message
	    - Wall-clock timeout (default 5 min) checked per iteration
	    - Returns (result_text, completed: bool) instead of bare string
	    - Optionally runs planning phase if profile.subagent_planning_enabled
	    spawn_agent:
	    - Removed briefing param; task is now fully self-contained
	    - Added system_prompt param: custom injected prompt for this specific task
	    - Auto-reads parent scratchpad context_transfer section via get_section()
	    - Result prefixed with [STATUS: completed|limit_reached]
	    - Timeout 300s
	    scratchpad:
	    - Added get_section(session_id, section) helper for cross-session reads
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 17 Apr
	0c3dc98 Planning phases, context compression, and tool improvements
	    Agent:
	    - Planning now a 3-phase async generator: Analysis → Execution plan → AIHelper critic
	    - Yield PlanningStatus events before each phase (UI progress labels)
	    - Phase 1 runs with think=True for deeper analysis
	    - Phase 2 includes available tool list so executor assignments are accurate
	    - Phase 3: independent critic pass validates and corrects TOOL: names against real tool list
	    - Planning converted from list return to async generator (fixes token accounting)
	    Backend:
	    - Context compression threshold: 80% → 70% to trigger earlier
	    - Compressor summary prompt: structured sections (goal, work state, key facts, outputs, errors)
	    - Terminal output capped at 5000 chars to prevent context flooding
	    - Web search: region=wt-wt for DDG, country=ALL for Brave, language=all for SearXNG
	    - Scratchpad: mandate writing a 'goal' section at start of multi-step tasks
	    - secretary max_iterations: 40→25, temperature: 0.7→0.5
	    - server_admin max_iterations: 40→20
	    Webclient:
	    - ThinkingCard strips <thought> XML tags leaked by Ollama
	    - planning_status WS event wired to chat.onPlanningStatus()
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 17 Apr
2026-04-16	b1dd9ca Count AIHelper tokens in session metrics
	    Adds prompt/completion token fields to LLMResponse, populated by OllamaBackend.complete().
	    AIHelper emits AIHelperTokensUsed into the current event sink after each LLM call; run_stream drains it into _subagent_tokens so AIHelper usage is reflected in the turn token delta.
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 16 Apr
	533f9ee Add AIHelper + filesystem query/smart_edit AI actions
	    AIHelper (navi/core/ai_helper.py):
	    - Reusable LLM utility for AI-enhanced tools: ask() and ask_json()
	    - Reads current_model ContextVar (set per-turn) so tools always use the session's active model without extra wiring
	    - Robust JSON extraction: strips markdown fences, bracket-matching fallback
	    current_model ContextVar (navi/tools/base.py):
	    - New ContextVar set by run_stream() and run_ephemeral() after profile is resolved; AIHelper reads it to pick the right model automatically
	    filesystem query action:
	    - Natural language question about any file, chunked at ~20k tokens of content (~80k chars) with 30-line overlap between chunks
	    - Single-chunk: one LLM call; multi-chunk: partial answers accumulated then synthesized in a final call
	    filesystem smart_edit action:
	    - Natural language edit instruction on files up to ~200k chars
	    - LLM outputs JSON patch ops: replace / delete / insert (1-based lines)
	    - Ops validated then applied bottom-up to preserve line numbers
	    - Returns unified diff of changes; preserves trailing newline
	    registry: AIHelper created once, OllamaBackend reused (no double init), FilesystemTool receives ai_helper at construction
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 16 Apr
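Applying line-based patch ops bottom-up, as 533f9ee describes for smart_edit, means an edit never shifts the line numbers that earlier (higher-up) ops refer to. The op schema below (`op`, `start`, `end`, `after`, `lines`) is a hypothetical stand-in for whatever JSON shape the tool actually uses, and overlap checking is left out for brevity.

```python
from typing import Dict, List


def apply_patch_ops(lines: List[str], ops: List[Dict]) -> List[str]:
    """Apply replace/delete/insert ops (1-based line numbers) from the bottom of the file up."""
    total = len(lines)
    for op in ops:  # validate everything before touching the content
        if op["op"] == "insert":
            if not 0 <= op["after"] <= total:
                raise ValueError(f"insert position out of range: {op}")
        elif op["op"] in ("replace", "delete"):
            if not 1 <= op["start"] <= op["end"] <= total:
                raise ValueError(f"line range out of range: {op}")
        else:
            raise ValueError(f"unknown op: {op}")

    def sort_key(op: Dict):
        # 0-based index the op touches; on ties, range ops run before inserts.
        if op["op"] == "insert":
            return (op["after"], 0)
        return (op["start"] - 1, 1)

    result = list(lines)
    for op in sorted(ops, key=sort_key, reverse=True):
        if op["op"] == "insert":
            result[op["after"]:op["after"]] = op["lines"]
        elif op["op"] == "delete":
            result[op["start"] - 1:op["end"]] = []
        else:  # replace
            result[op["start"] - 1:op["end"]] = op["lines"]
    return result
```

For example, replacing line 2, deleting line 3, and inserting after line 4 all use the original numbering, because each op is applied below every op still waiting to run.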
	59cdf7f Make profile switching autonomous: switch immediately, inform after
	    Previously Navi asked for permission before switching profiles. Updated both the injected profiles block in the system prompt and the switch_profile tool description to explicitly say: switch on your own judgment, do not ask, then inform the user which profile is active and why.
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 16 Apr
	62ad39f Add profile discoverability: list_profiles tool + system prompt injection
	    - AgentProfile: new short_description (1-line) and full_description (dict with specialization / when_to_use / key_tools) fields
	    - All 3 profile configs: structured descriptions added; list_profiles added to enabled_tools
	    - _build_system_prompt: now accepts full AgentProfile; injects compact "Available profiles" block into every system prompt so Navi always knows what other profiles exist and when to switch (dynamically, no hardcoding)
	    - ListProfilesTool: new built-in; returns structured per-profile details (specialization, when_to_use, key_tools); accepts optional profile_id for single-profile lookup
	    - registry: register list_profiles_tool after profiles registry is built
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 16 Apr
	af8dfdb Fix metrics: net token delta, subagent aggregation, ContextBar always visible
	    - run_stream: track _prev_tokens baseline before turn loop; compute net token cost as (context_tokens - prev) + subagent_tokens for per-message cost
	    - run_stream: intercept SubagentComplete in sink drain loop to accumulate subagent token and tool-call counts into the parent turn's totals
	    - run_ephemeral: already emitting SubagentComplete (from prior session)
	    - msg-meta-row: remove margin-left:auto from .msg-meta-time so time groups inline with elapsed/tools/tokens instead of floating right
	    - ContextBar: remove v-if guard so bar is always visible (not only after first LLM response with token data)
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 16 Apr
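The per-message cost formula quoted in af8dfdb, written out as a function with a worked number. The function name and the sample figures are illustrative only.

```python
def turn_token_cost(prev_tokens: int, context_tokens: int, subagent_tokens: int = 0) -> int:
    """Net cost of one turn: growth of the parent context plus tokens spent by subagents."""
    return (context_tokens - prev_tokens) + subagent_tokens


# Worked example: context grew from 12,000 to 15,500 tokens during the turn,
# and subagents consumed another 2,300 tokens.
example_cost = turn_token_cost(12_000, 15_500, 2_300)  # 3,500 + 2,300 = 5,800
```

Subtracting the baseline is what keeps a long session from attributing its whole accumulated context to every new message.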
	a338f8b Add response metrics: elapsed time, tool calls, token count
	    Server:
	    - Message model: elapsed_seconds, tool_call_count, token_count fields (display-only, excluded from LLM context via exclude_none)
	    - StreamEnd event: carries same three fields
	    - agent.run_stream: tracks turn start time, counts ToolEvent completions, writes metrics onto the final assistant Message before saving to DB
	    - WebSocket: forwards metrics in stream_end payload
	    Client:
	    - chat.onStreamEnd: attaches elapsed_seconds, tool_call_count, token_count to the streaming message on completion
	    - buildMessageList: scans each assistant group for metrics from history
	    - AssistantMessage: renders .msg-meta-row below the response: timer icon + Xs · wrench icon + N tools · coins icon + Nk tokens · time (each item only shown if present; time pushed right via margin-left: auto)
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 16 Apr
	ea5766e Persist thinking and plan cards across session reloads
	    - Message: add thinking and is_plan fields (display-only, not sent to LLM)
	    - Agent main loop: accumulate thinking per iteration, save with assistant message
	    - _run_planning: also append plan to session.messages with is_plan=True so UI can render plan cards after page reload (context already had the plan)
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 16 Apr
2026-04-15	23e0a5d Fix Ollama connection leak and empty message bug in agent
	    - _iter_stream_guarded: track chunk_task as nullable, cancel in finally block to prevent zombie HTTP connections accumulating under load
	    - Final turn: use `content or None` so empty text isn't saved to DB
	    - client/index.html: point to new Vue webclient build
	    - profiles: add email_manager tool
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 15 Apr
2026-04-14	33b2880 Improve filesystem, web search, context guard, and subagent narration
	    filesystem: add find (glob), info (stat), move, append actions; read now supports offset/limit with hard 1MB guard; list shows sizes, dates, optional recursion.
	    web_search: retry DDG across auto/html/lite backends; add optional Brave Search API and SearXNG fallbacks configured via .env.
	    agent: fix ContextTooLargeError to surface as Navi response instead of raw system error; fix _check_context_size to calculate from remaining budget (window - output_reserve) rather than a fixed 92% threshold.
	    persona: add ReAct narration instruction to subagent briefing template.
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 14 Apr
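A sketch of a pre-flight context check using the remaining-budget rule from 33b2880 together with the "chars/4 + 500 per image" estimate mentioned in 8c88f49. The function signatures, the plain-string message representation, and the local `ContextTooLargeError` class are assumptions for illustration; the real class is described as a NaviError subclass.

```python
from typing import List


class ContextTooLargeError(Exception):
    """Stand-in for the project's error type (hypothetical local definition)."""


def estimate_tokens(messages: List[str], image_count: int = 0) -> int:
    """Rough estimate: about 4 characters per token, plus a flat 500 per image."""
    return sum(len(text) for text in messages) // 4 + 500 * image_count


def check_context_size(
    messages: List[str], window: int, output_reserve: int, image_count: int = 0
) -> int:
    """Raise before calling the LLM if the prompt would not leave room for the reply."""
    budget = window - output_reserve  # tokens available for the prompt itself
    estimated = estimate_tokens(messages, image_count)
    if estimated > budget:
        raise ContextTooLargeError(
            f"estimated {estimated} tokens exceeds prompt budget of {budget}"
        )
    return estimated
```

Deriving the limit from `window - output_reserve` ties the guard to the space the reply actually needs, instead of a percentage that is too strict for large windows and too loose for small ones.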
	8c88f49 Fix LLM hang: stop button during prefill, context guard, timeouts
	    Root cause: during prefill (processing input tokens), Ollama emits no HTTP chunks. The `async for chunk in stream_complete()` loop body never executes, so stop_event is never checked and the Stop button has no effect. Same issue with complete() calls (planning, compression): blocking await with no cancellation path.
	    Fixes:
	    _iter_stream_guarded() (agent.py, module-level):
	      Wraps any stream_complete() generator. Polls stop_event every 1s while waiting for the next chunk using asyncio.wait(), so Stop works even during multi-minute prefill. On stop or timeout, calls aclose() on the generator which closes the HTTP connection to Ollama → generation halts → GPU drops to idle. Applied to both run_stream() and run_ephemeral().
	    _check_context_size() (Agent method):
	      Estimates context tokens (chars/4 + 500 per image) before every LLM call. Raises ContextTooLargeError (new NaviError subclass) at 92% of ollama_num_ctx, before Ollama ever receives the request.
	    _run_planning() timeouts:
	      Both complete() calls (phase 1 and 2) wrapped with asyncio.wait_for(). Timeout logged and planning skipped gracefully; execution continues.
	    New config (config.py):
	      llm_complete_timeout = 120s
	      llm_stream_first_chunk_timeout = 180s (prefill budget)
	      llm_stream_chunk_timeout = 60s (inter-token budget)
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 14 Apr
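The polling technique described for `_iter_stream_guarded()` can be sketched as below. This is not the repository's code: the function names, parameters, and the single `chunk_timeout` (the commit uses separate first-chunk and inter-chunk budgets) are simplifications, and `stream` is assumed to be an async generator that supports `aclose()`.

```python
import asyncio


async def _next_chunk(iterator):
    # Plain coroutine wrapper so the pending read can be scheduled as a Task.
    return await iterator.__anext__()


async def iter_stream_guarded(stream, stop_event, chunk_timeout=60.0, poll_interval=1.0):
    """Yield chunks from `stream`, checking `stop_event` while each chunk is pending."""
    iterator = stream.__aiter__()
    chunk_task = None
    try:
        while True:
            chunk_task = asyncio.create_task(_next_chunk(iterator))
            waited = 0.0  # approximate time spent waiting for this chunk
            while not chunk_task.done():
                if stop_event.is_set():
                    return  # user pressed Stop; cleanup happens in finally
                if waited >= chunk_timeout:
                    raise asyncio.TimeoutError("no chunk within timeout")
                # Wake up at least every poll_interval even if no chunk arrives.
                await asyncio.wait({chunk_task}, timeout=poll_interval)
                waited += poll_interval
            try:
                chunk = chunk_task.result()
            except StopAsyncIteration:
                return  # stream finished normally
            yield chunk
    finally:
        # Cancel a read that is still in flight, then close the underlying stream.
        if chunk_task is not None and not chunk_task.done():
            chunk_task.cancel()
            await asyncio.gather(chunk_task, return_exceptions=True)
        await stream.aclose()
```

A caller would iterate with `async for chunk in iter_stream_guarded(backend_stream, stop_event)`. Because the wait is bounded by `poll_interval`, the stop check runs even when the backend sends nothing for minutes, which is the prefill case the commit targets.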
	594ad9e Improve planning: two-phase pipeline and orchestrator discipline
	    agent.py:
	    - _run_planning() now runs two sequential LLM calls:
	        Phase 1 (analysis): reformulate task, identify subtasks and unknowns; skip immediately if DIRECT.
	        Phase 2 (execution plan): assign each subtask an executor (TOOL/AGENT/SELF) using a structured ## Plan format. Phase 2 context = analysis (embedded in system prompt) + last user message only; full history excluded to keep focus on plan structure.
	    - Warn in logs when plan lacks TOOL/AGENT/SELF executor assignments.
	    persona.txt:
	    - MANDATORY sequence: step 0 = scratchpad init before anything else; todo tasks must mirror plan steps exactly (same order, same executors).
	    - PLAN → EXECUTION BINDING: explicit rule: never switch an AGENT step to inline execution silently.
	    - SCRATCHPAD: initialize sections at task start, not after first tool call; write context to scratchpad before briefing subagents.
	    - Fix typo in BRIEFING ("sub-lagent" → "sub-agent").
	    - Replace stale Knowledge Retrieval Protocol with accurate one-liner.
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 14 Apr
2026-04-11	b292ddd Strengthen Navi planning/delegation, unify toolsets, isolate subagent scratchpad
	    persona.txt:
	    - DELEGATION: 'default to spawning, not to doing inline': stronger default, clearer triggers, explicit when-not-to-spawn rules
	    - PLANNING: ties automatic planning phase to mandatory todo(op='set') as first tool call; reconciles pre-loop plan with in-loop execution discipline
	    - SCRATCHPAD: new section: when to write, section naming conventions, mandatory read before final answer
	    Profiles (secretary, server_admin, smart_home):
	    - All three now share the same 18-tool set (each file independent)
	    - planning_enabled=True on all three
	    - scratchpad and web_search added to smart_home
	    - System prompts updated with scratchpad/todo execution discipline sections
	    agent.py run_ephemeral:
	    - Each subagent gets a unique session ID (subagent_<uuid>) for scratchpad isolation; parallel or sequential subagents no longer share working notes
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 11 Apr
	efb870f Skip planning phase for simple/direct requests
	    The planning prompt now asks the model to respond with "DIRECT" if the request doesn't need multiple steps. Added a regex fallback: if the response has no numbered steps it's also discarded. This prevents plan cards appearing for conversational replies that would just duplicate the final message.
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 11 Apr
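The DIRECT shortcut plus regex fallback from efb870f might be implemented along these lines. The function name and the exact pattern (a line starting with digits followed by `.` or `)`) are assumptions; the commit only says that responses without numbered steps are discarded.

```python
import re
from typing import Optional

# A numbered step: optional indent, digits, "." or ")", then some text.
_NUMBERED_STEP = re.compile(r"^\s*\d+[.)]\s+\S", re.MULTILINE)


def extract_plan(response: str) -> Optional[str]:
    """Return the plan text, or None when the planning output should be discarded."""
    text = response.strip()
    if not text:
        return None
    first_word = text.split()[0].strip(".:!").upper()
    if first_word == "DIRECT":
        return None  # model says the request needs no multi-step plan
    if not _NUMBERED_STEP.search(text):
        return None  # fallback: no numbered steps means it is not a plan
    return text
```

The fallback covers models that ignore the DIRECT instruction and answer conversationally, which would otherwise render as a plan card duplicating the final reply.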
	fe6d7bc Add planning phase and scratchpad tool for smarter task execution
	    - ScratchpadTool: session-scoped working notepad with named sections (write/append/read/clear). Lets Navi capture intermediate findings between tool calls instead of losing track of them.
	    - Planning phase: when profile.planning_enabled=True, a fast pre-loop LLM call (think=False, no tools) outlines a numbered plan before any actions are taken. The plan is injected into session context as an assistant message so the model naturally continues from it.
	    - PlanReady event + plan_ready WebSocket message + plan card in UI (green-tinted, collapsible, mirroring thinking card design).
	    - secretary and server_admin profiles: planning_enabled=True, scratchpad added to enabled_tools, system prompts updated with explicit execution discipline instructions.
	    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
	    Eugene Sukhodolskiy committed on 11 Apr