| 2026-04-25 |
Fix profile prompt inconsistencies
Eugene Sukhodolskiy
committed
on 25 Apr
|
Tune profile sampling configs
Eugene Sukhodolskiy
committed
on 25 Apr
|
temp -
Eugene Sukhodolskiy
committed
on 25 Apr
|
Add context providers: dynamic system message injection per LLM call
- navi/context_providers/ registry + built-in public_url provider (global, always injected)
- context_providers/ user directory, hot-reloaded via reload_tools
- AgentProfile.context_providers field for per-profile opt-in providers
- Agent._collect_context_injections() called before every tool-calling loop
- reload_tools now reloads both user tools and user context providers
- manuals/write_context_provider.md for Navi, docs/context_providers.md reference
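A minimal sketch of the provider idea described above: global providers are always injected, while others run only when a profile opts in. The names `register_provider` and `collect_context_injections`, the provider signature, and the example `clock` provider are hypothetical and not taken from the Navi codebase.

```python
from typing import Callable, Optional

# Hypothetical provider signature: returns text to inject, or None to skip.
Provider = Callable[[], Optional[str]]

# name -> (provider function, is_global)
_PROVIDERS: dict[str, tuple[Provider, bool]] = {}


def register_provider(name: str, fn: Provider, is_global: bool = False) -> None:
    """Add or replace a provider; re-registering is what a hot reload would do."""
    _PROVIDERS[name] = (fn, is_global)


def collect_context_injections(profile_providers: list[str]) -> list[str]:
    """Collect texts from global providers plus those the profile opted into."""
    injections = []
    for name, (fn, is_global) in _PROVIDERS.items():
        if is_global or name in profile_providers:
            text = fn()
            if text:
                injections.append(text)
    return injections


# Illustrative registrations only; the URL and provider bodies are made up.
register_provider("public_url", lambda: "Public URL: https://example.invalid", is_global=True)
register_provider("clock", lambda: "Time: 12:00")
```

A profile that lists no providers would still receive the global `public_url` text; listing `clock` adds the second message.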
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 25 Apr
|
| 2026-04-24 |
Set temperature=1.0, top_k=64, top_p=0.95 for all profiles (Google recommended for gemma4)
Also fixes discuss profile memory tools: use combined "memory" tool name, not nonexistent split variants.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 24 Apr
|
Fix discuss profile planning: phase1 on (DIRECT gate), phase3 off
Model can skip planning via DIRECT; if it doesn't, analysis runs silently
without producing a plan card in the UI.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 24 Apr
|
Add discuss profile; responsive WelcomeScreen for 6+ profiles
- New 'discuss' profile: creative Q&A and idea discussion, temp=1.0,
planning phase 3 only, tools: web_search/view, scratchpad, reflect,
memory, image_view, todo
- WelcomeScreen mobile: 2-column grid for profile cards, compact logo
(row layout with subtitle on second line), reduced padding/gaps
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 24 Apr
|
Add per-phase planning flags and planning_mandatory
- planning_mandatory: disables DIRECT shortcut, forces all phases to run
- planning_phase1_enabled / phase2_enabled / phase3_enabled: per-phase toggles
- planning_phase2_enabled replaces planning_reflect_enabled (migrated in loader with backward compat)
- Migrate all profile configs; rewrite docs/profiles.md as full config reference
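The backward-compatible rename could be handled in the loader roughly as follows. This is a sketch only: `migrate_planning_flags` is a hypothetical helper, and the rule that an explicit new-style key wins over the legacy key is an assumption.

```python
def migrate_planning_flags(cfg: dict) -> dict:
    """Return a copy of cfg with the legacy key mapped onto the new one."""
    out = dict(cfg)
    if "planning_reflect_enabled" in out:
        legacy = out.pop("planning_reflect_enabled")
        # Assumption: an explicit planning_phase2_enabled takes precedence.
        out.setdefault("planning_phase2_enabled", legacy)
    return out
```

Old configs keep working because the legacy value is carried over, while migrated configs no longer contain the old key.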
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 24 Apr
|

Add Ollama multi-server fallback with in-memory blacklisting
- New FallbackOllamaBackend (navi/llm/fallback.py): tries servers and
models in priority order; on LLMConnectionError blacklists the server
for the process lifetime, on LLMModelNotFoundError blacklists the
(server, model) pair — eliminates latency from repeated failed probes
- OllamaBackend now raises typed LLMConnectionError / LLMModelNotFoundError
instead of bare LLMBackendError; accepts list[str] | str | None for model
- AgentProfile.model changed from str to list[str] (str auto-normalised);
all profiles updated to ["gemma4:31b-cloud", "gemma4:26b-a4b-it-q4_K_M"]
- New config field OLLAMA_BACKENDS_FILE: path to [{host, api_key?}] JSON;
when set, registry creates FallbackOllamaBackend instead of OllamaBackend
- ollama_backends.json template added (gitignored — contains API key)
- current_model ContextVar type widened to list[str] | str | None
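The blacklisting behaviour can be illustrated with a small stand-alone sketch. It is not the code in navi/llm/fallback.py: the class `FallbackSketch`, the `call(host, model)` callable, and the server-major iteration order are assumptions, and the two exception classes are local stand-ins for the typed errors named above.

```python
class LLMConnectionError(Exception):
    """Stand-in for the typed connection error."""


class LLMModelNotFoundError(Exception):
    """Stand-in for the typed model-not-found error."""


class FallbackSketch:
    def __init__(self, hosts, call):
        self.hosts = list(hosts)
        self.call = call          # hypothetical: call(host, model) -> str
        self.dead_hosts = set()   # unreachable servers, kept for the process lifetime
        self.dead_pairs = set()   # (host, model) pairs where the model is missing

    def complete(self, models):
        for host in self.hosts:
            if host in self.dead_hosts:
                continue
            for model in models:
                if (host, model) in self.dead_pairs:
                    continue
                try:
                    return self.call(host, model)
                except LLMConnectionError:
                    self.dead_hosts.add(host)
                    break  # no point trying other models on a dead server
                except LLMModelNotFoundError:
                    self.dead_pairs.add((host, model))
        raise RuntimeError("no usable (server, model) pair")
```

After the first request has discovered which servers and pairs fail, later requests skip them without probing again, which is where the latency saving comes from.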
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 24 Apr
|
| 2026-04-22 |
Use gemma4 cloud model by default
Eugene Sukhodolskiy
committed
on 22 Apr
|
Support Ollama Cloud API key
Eugene Sukhodolskiy
committed
on 22 Apr
|
| 2026-04-21 |
Agent improvements: mandatory planning, tool cleanup, smart_edit fixes
- Planning now mandatory on first message of every session (force_plan)
- RESOURCES, COMMITMENTS, ATOMICITY fields added to planning phase 1
- Todo auto-injected at iteration 0 so model tracks steps immediately
- Execution trigger injected after plan to prevent the model from treating the plan as its response
- Split developer profile: tool_developer (Navi tools) vs developer (general code)
- Simplified persona.txt: trimmed redundant content now handled by mechanics
- AIHelper.ask(): 120s timeout via asyncio.wait_for to prevent smart_edit hangs
- filesystem._smart_edit(): atomic write via temp file + os.replace()
- Removed 5 junk user tools (game project artifacts, trivial utilities)
- Removed instagram tools (to be rewritten); cleaned enabled.json
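The temp file plus os.replace() pattern mentioned for filesystem._smart_edit() generally looks like the sketch below. `atomic_write` is a hypothetical helper name; the actual implementation is not shown in this log.

```python
import os
import tempfile


def atomic_write(path: str, text: str) -> None:
    """Write text to path so readers never observe a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so os.replace() stays on
    # one filesystem, where the rename is atomic.
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(text)
        os.replace(tmp, path)
    except BaseException:
        # Clean up the temp file if writing or renaming failed.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

If the process dies mid-write, the original file is left untouched and only a stray temp file may remain.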
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 21 Apr
|
Add instagram_engine and instagram_viewer tools (Navi-generated)
Browser automation tools for scraping public Instagram profiles using
Playwright + stealth. Registered in enabled.json and developer profile.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 21 Apr
|
| 2026-04-20 |
Remove hello_world test tool and incomplete instagram_scraper
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 20 Apr
|
Adaptive re-plan on todo step failure
When a todo step is newly marked failed, queue a targeted system message
for the next iteration prompting the model to revise its remaining pending
steps before continuing. Enabled by adaptive_replan_enabled flag (on by
default in developer profile). Zero overhead when no failure occurs.
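One way to detect a newly failed step is to compare todo statuses before and after an iteration. The sketch below assumes a simple step-name to status mapping; `newly_failed`, `replan_message`, and the message wording are hypothetical.

```python
from typing import Optional


def newly_failed(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Steps that are failed now but were not failed in the previous snapshot."""
    return [s for s, status in after.items()
            if status == "failed" and before.get(s) != "failed"]


def replan_message(before: dict[str, str], after: dict[str, str]) -> Optional[str]:
    """Build the system message to queue, or None when nothing newly failed."""
    failed = newly_failed(before, after)
    if not failed:
        return None  # no failure, so no extra message and no overhead
    pending = [s for s, status in after.items() if status == "pending"]
    return (f"Step(s) failed: {', '.join(failed)}. "
            f"Revise the remaining pending steps ({', '.join(pending) or 'none'}) "
            "before continuing.")
```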
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 20 Apr
|

Autonomous reasoning improvements: budget, anchoring, anti-stall, validation
- AgentProfile: per-profile thinking mechanics flags (think_enabled,
iteration_budget_enabled, goal_anchoring, anti_stall, step_validation,
planning_reflect, adaptive_replan) — all profiles updated in config.json
- Iteration budget: inject remaining iterations into context so model knows
when to wrap up; urgency levels at ≤7 and ≤3 remaining
- Goal anchoring: inject original goal + todo state every N iterations to
prevent drift on long tasks
- Anti-stall: two signals — no todo progress for N iterations, or identical
tool calls repeated N times; warning injected into context
- Todo step validation: marking done requires a validation field describing
how result was verified; failed gets a soft nudge with tip for re-planning
- stream_complete: add think param to base class, ollama and openai backends
- Summarizer: raise max_tokens 1024→3000, expand system prompt with
user-preferences section and verbatim-value instructions
- Compression card: persist to session.messages (is_compression flag on
Message), show expandable summary in webclient with markdown body
- ToolResult.to_message_content: always include output on failure so
tracebacks and error details reach the model (fixes silent Error: None)
- Developer profile: fix subagent profile secretary→developer, add write_tool
to subagent_tools, clarify write_tool vs filesystem in system prompt
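The iteration budget and the repeated-call stall signal can be sketched as below. Only the thresholds of 7 and 3 remaining iterations come from the notes above; the function names, the level labels, and the message format are illustrative assumptions.

```python
def budget_notice(iteration: int, max_iterations: int) -> str:
    """Describe the remaining budget with an urgency level."""
    remaining = max_iterations - iteration
    if remaining <= 3:
        level = "CRITICAL"   # hypothetical label
    elif remaining <= 7:
        level = "WARNING"    # hypothetical label
    else:
        level = "INFO"
    return f"[{level}] {remaining} iterations remaining"


def is_stalled(recent_calls: list[tuple[str, str]], n: int) -> bool:
    """True when the last n (tool name, arguments) calls are identical."""
    if n < 1 or len(recent_calls) < n:
        return False
    tail = recent_calls[-n:]
    return all(call == tail[0] for call in tail)
```

The second stall signal (no todo progress for N iterations) would need a snapshot of todo state per iteration and is left out of this sketch.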
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 20 Apr
|
| 2026-04-17 |
Remove context_transfer from all user-facing prompts — internal mechanism only
context_transfer is the scratchpad section name used internally by spawn_agent
to auto-inject parent state. Navi doesn't control it and doesn't need to know
about it. Removed from: persona, secretary, server_admin, spawn_agent description,
manual. Internal code (spawn_agent.py) still reads the section transparently.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 17 Apr
|

Fix core subagent misuse: enforce 1 plan step = 1 spawn_agent call
Root cause: nowhere was it stated that each AGENT step in the plan
maps to a separate spawn_agent call. Navi was bundling all AGENT steps
into a single call, dumping the full plan on one subagent.
spawn_agent description:
- Lead with: "Delegate EXACTLY ONE step of your plan"
- Explicit: "3 AGENT steps = 3 spawn_agent calls"
- Remove "multi-step sub-task" wording that invited bundling
- briefing: clarify as static context only (credentials, paths, instructions)
Dynamic findings from prior steps → context_transfer, not briefing
Planning Phase 2 prompt:
- Add AGENT scoping rules: each step = one focused unit, not "do everything"
- Add good/bad examples of AGENT step granularity
- Show multiple AGENT steps in the format example
Secretary & server_admin system prompts:
- Add explicit 1:1 rule with counter-example
- Show correct multi-agent execution pattern with code example
- Clarify briefing vs context_transfer boundary everywhere
Persona:
- "ONE PLAN STEP = ONE spawn_agent CALL" as first sentence in SUB-AGENTS
- Field descriptions tightened: briefing = static, context_transfer = dynamic
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 17 Apr
|
Fix subagent instruction conflicts across persona and profiles
Persona:
- Fix [STATUS: completed|limit_reached] reference (format was removed)
- Clarify three fields: task / briefing / system_prompt with distinct roles
- Clarify context_transfer vs briefing: transfer = working state, briefing = credentials
Secretary system_prompt:
- Replace vague "write all context to context_transfer" with explicit field breakdown
- task / briefing / system_prompt each described with their purpose
- context_transfer correctly limited to intermediate findings, not credentials
Server admin system_prompt:
- Same fix: explicit field breakdown for spawn_agent
- Remove dangling "see persona" reference for briefing ending
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 17 Apr
|
Fix spawn_agent: restore briefing, fix status leakage, enable subagent planning
spawn_agent:
- Restore briefing param (task = goal, briefing = context — good separation)
- Add system_prompt as third param for role specialisation per task
- Remove [STATUS: ...] prefix that was leaking into Navi's responses and
causing hallucination — replaced with natural-language headers that are
less likely to be regurgitated verbatim
- completed → neutral header; limit_reached → explicit warning about incompleteness
Profiles:
- subagent_planning_enabled: false → true in all three profiles
(planning is on by default, disable per-profile if needed)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 17 Apr
|
Strengthen orchestration mandate: spawn first, inline last
secretary/server_admin system prompts:
- Explicit spawning rule: MUST spawn for any sub-task requiring 3+ tool calls
- Additional mandatory triggers listed (research, file processing, remote ops, large output)
- "If in doubt — spawn" as explicit fallback
- AGENT steps: "MANDATORY, never execute inline — defeats the orchestrator model"
- context_transfer pattern: write to scratchpad before spawning, injected automatically
persona.txt:
- Updated SUB-AGENT BRIEFING section: renamed to SUB-AGENTS
- Reflects new context_transfer automatic injection (no longer needs to be in task)
- Added: check [STATUS: ...] in result before deciding next action
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 17 Apr
|

Improve subagent system: isolated tools, custom prompts, context transfer, timeout
AgentProfile:
- New fields: subagent_tools, subagent_planning_enabled, subagent_system_prompt
- loader.py: loads subagent_tools/subagent_planning_enabled from config.json,
reads optional subagent_system_prompt.txt per profile
Profiles:
- Each profile now has a dedicated subagent_tools list (focused subset, no admin tools)
- subagent_planning_enabled: false (configurable per profile)
- New subagent_system_prompt.txt per profile with executor-focused instructions
run_ephemeral:
- Uses profile.subagent_tools instead of enabled_tools
- Builds subagent context without persona or profiles block (focused executor)
- Injects subagent_system_prompt after profile.system_prompt
- Accepts context_transfer: priming exchange injected before task message
- Wall-clock timeout (default 5 min) checked per iteration
- Returns (result_text, completed: bool) instead of bare string
- Optionally runs planning phase if profile.subagent_planning_enabled
spawn_agent:
- Removed briefing param; task is now fully self-contained
- Added system_prompt param: custom injected prompt for this specific task
- Auto-reads parent scratchpad context_transfer section via get_section()
- Result prefixed with [STATUS: completed|limit_reached]
- Timeout 300s
scratchpad:
- Added get_section(session_id, section) helper for cross-session reads
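A per-iteration wall-clock check that returns a (result_text, completed) pair might look like this. It is a simplified sketch: `run_with_deadline` is a hypothetical name, and each "iteration" is modelled as a plain callable returning text rather than a real tool-calling loop.

```python
import time


def run_with_deadline(steps, timeout_s=300.0, clock=time.monotonic):
    """Run steps in order; stop before a step once the deadline has passed.

    Returns (last_result_text, completed).
    """
    deadline = clock() + timeout_s
    last = ""
    for step in steps:
        # The deadline is only checked between iterations, so a single
        # long-running step can still overshoot it.
        if clock() >= deadline:
            return last, False
        last = step()
    return last, True
```

Returning the flag alongside the text lets the caller tell a finished sub-task from one that ran out of time, without parsing the text itself.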
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 17 Apr
|

Planning phases, context compression, and tool improvements
Agent:
- Planning now a 3-phase async generator: Analysis → Execution plan → AIHelper critic
- Yield PlanningStatus events before each phase (UI progress labels)
- Phase 1 runs with think=True for deeper analysis
- Phase 2 includes available tool list so executor assignments are accurate
- Phase 3: independent critic pass validates and corrects TOOL: names against real tool list
- Planning converted from list return to async generator (fixes token accounting)
Backend:
- Context compression threshold: 80% → 70% to trigger earlier
- Compressor summary prompt: structured sections (goal, work state, key facts, outputs, errors)
- Terminal output capped at 5000 chars to prevent context flooding
- Web search: region=wt-wt for DDG, country=ALL for Brave, language=all for SearxNG
- Scratchpad: mandate writing a 'goal' section at start of multi-step tasks
- secretary max_iterations: 40→25, temperature: 0.7→0.5
- server_admin max_iterations: 40→20
Webclient:
- ThinkingCard strips <thought> XML tags leaked by Ollama
- planning_status WS event wired to chat.onPlanningStatus()
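The two backend limits above (70% compression threshold, 5000-character terminal cap) reduce to simple checks. The helper names below are hypothetical, and keeping the head of the output rather than the tail is an assumption.

```python
MAX_OUTPUT_CHARS = 5000          # terminal output cap from the notes above
COMPRESSION_THRESHOLD_PCT = 70   # compression trigger from the notes above


def cap_output(text: str, limit: int = MAX_OUTPUT_CHARS) -> str:
    """Truncate long output and say how much was dropped."""
    if len(text) <= limit:
        return text
    omitted = len(text) - limit
    return text[:limit] + f"\n[... {omitted} chars truncated]"


def should_compress(used_tokens: int, context_window: int,
                    threshold_pct: int = COMPRESSION_THRESHOLD_PCT) -> bool:
    """True once context usage reaches the threshold (integer math avoids float edge cases)."""
    return used_tokens * 100 >= threshold_pct * context_window
```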
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 17 Apr
|
Webclient UI improvements + backend fixes
Webclient:
- Draft persistence across page refreshes (localStorage, reactive watch)
- Image lightbox modal using UI kit classes on thumbnail click
- Copy button on user and assistant messages
- Selection reply toolbar: select assistant text → quote inserted into input
- User message rendering: proper HTML escaping, styled blockquote for > replies
- Markdown table fix: preprocessor to inject missing separator rows
- Planning status labels (rebuild dist)
Backend:
- Developer profile: enable subagent delegation, increase max_iterations to 35
- share_file: updated description + manual with absolute path requirement and URL sharing
- persona.txt: instructions for quote replies and GFM table format
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 17 Apr
|
Audit and trim system prompts (~470 tokens saved)
persona.txt:
- Shortened personality paragraph (~30% shorter, no content loss)
- Removed duplicate list_tools instruction
- Removed hardcoded 'developer' profile rule (handled by dynamic profiles block)
- Condensed EXECUTION MODES fundamental blockers to one sentence
- Moved sub-agent briefing boilerplate here (single source of truth)
- Trimmed REFLECTION section (tool description handles the how)
- Removed redundant RESPONSE HYGIENE explanation sentence
- Moved 'never assume file exists' into EXECUTION DISCIPLINE
- Removed DOCUMENTATION section
profiles (all three):
- Replaced ~100-token sub-agent briefing boilerplate with pointer to persona
- developer: removed data persistence code block (covered by _template.py)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 17 Apr
|
Add reflect tool: three parallel expert perspectives
ReflectTool runs Critic / Pragmatist / Detailer advisors concurrently
via asyncio.gather() + AIHelper.ask(). Each role has a distinct system
prompt designed to produce genuinely different analysis:
- Critic: challenges assumptions, surfaces risks and logical gaps
- Pragmatist: finds the simplest path, cuts unnecessary complexity
- Detailer: spots missing requirements, edge cases, ambiguities
Parameters: situation (required), assumptions (required list — the key
input that forces Navi to surface implicit beliefs), tried (optional).
Registered as a builtin with AIHelper injection. Added to all three
profiles. Persona updated with guidance on when to use it (complex or
ambiguous tasks before planning, or when stuck mid-execution).
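The concurrent fan-out described above can be sketched with asyncio.gather(). The role prompts paraphrase the list above; `fake_ask` is a stand-in for AIHelper.ask(), whose real signature is not shown here, and `reflect` is a hypothetical function rather than the actual ReflectTool.

```python
import asyncio

ROLES = {
    "Critic": "Challenge assumptions; surface risks and logical gaps.",
    "Pragmatist": "Find the simplest path; cut unnecessary complexity.",
    "Detailer": "Spot missing requirements, edge cases, ambiguities.",
}


async def fake_ask(system_prompt: str, user_prompt: str) -> str:
    # Stand-in for an LLM call; yields control once, then echoes its input.
    await asyncio.sleep(0)
    return f"{system_prompt} | {user_prompt}"


async def reflect(situation, assumptions, tried=None, ask=fake_ask):
    """Ask all three advisors concurrently and return {role: answer}."""
    prompt = f"Situation: {situation}\nAssumptions: {'; '.join(assumptions)}"
    if tried:
        prompt += f"\nTried: {tried}"
    # gather() returns results in the order the awaitables were passed in,
    # so zipping with ROLES keeps each answer attached to its role.
    answers = await asyncio.gather(*(ask(sp, prompt) for sp in ROLES.values()))
    return dict(zip(ROLES, answers))
```

Running the three calls concurrently means total latency is roughly that of the slowest advisor rather than the sum of all three.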
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 17 Apr
|
| 2026-04-16 |
Add profile discoverability: list_profiles tool + system prompt injection
- AgentProfile: new short_description (1-line) and full_description (dict
with specialization / when_to_use / key_tools) fields
- All 3 profile configs: structured descriptions added; list_profiles added
to enabled_tools
- _build_system_prompt: now accepts full AgentProfile; injects compact
"Available profiles" block into every system prompt so Navi always knows
what other profiles exist and when to switch — dynamically, no hardcoding
- ListProfilesTool: new built-in; returns structured per-profile details
(specialization, when_to_use, key_tools); accepts optional profile_id
for single-profile lookup
- registry: register list_profiles_tool after profiles registry is built
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 16 Apr
|
| 2026-04-15 |
Fix Ollama connection leak and empty message bug in agent
- _iter_stream_guarded: track chunk_task as nullable, cancel in finally
block to prevent zombie HTTP connections accumulating under load
- Final turn: use `content or None` so empty text isn't saved to DB
- client/index.html: point to new Vue webclient build
- profiles: add email_manager tool
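The "nullable task, cancel in finally" pattern from the first bullet can be shown on a single chunk read. This is a sketch, not the actual `_iter_stream_guarded`: `read_first_chunk` and its `fetch` callable are hypothetical.

```python
import asyncio


async def read_first_chunk(fetch, timeout: float):
    """Await one chunk with a timeout, never leaving the task running."""
    chunk_task = None  # nullable: stays None if task creation itself fails
    try:
        chunk_task = asyncio.create_task(fetch())
        # asyncio.wait() does not cancel pending tasks when it times out.
        done, _pending = await asyncio.wait({chunk_task}, timeout=timeout)
        if not done:
            raise TimeoutError("chunk timed out")
        return chunk_task.result()
    finally:
        # Without this, a timed-out or abandoned read would keep its
        # underlying connection open in the background.
        if chunk_task is not None and not chunk_task.done():
            chunk_task.cancel()
```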
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 15 Apr
|
Add autonomous execution mode; clarify code_exec runs locally
persona.txt: EXECUTION MODES section — autonomous mode triggered by user phrase,
handles obstacles independently, only stops on fundamental blockers.
server_admin, developer profiles: explicit note that code_exec / terminal /
filesystem run on the LOCAL machine, never on remote hosts.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 15 Apr
|
Move orchestration from persona to profiles; tune per-profile delegation strategy
persona.txt now contains only: identity, profile switching, workspace,
response hygiene, memory, and documentation. All orchestration instructions
removed from the global scope.
Each profile gets its own orchestration model:
- secretary: full orchestrator — delegate any 2+ tool-call sub-task to agents,
scratchpad as blackboard, todo for milestone tracking
- server_admin: heavy orchestrator — one agent per host / per concern,
parallel delegation, diagnose-before-act discipline
- developer: builder + research delegation — implementation always inline,
spawn only for large API/codebase research tasks
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eugene Sukhodolskiy
committed
on 15 Apr
|