Newer
Older
navi-1 / docs / agent.md

Agent Loop

Core execution engine. File: navi/core/agent.py.

Entry points

run_stream(session_id, user_message)AsyncGenerator[AgentEvent]

Streaming. Yields AgentEvent objects in real time. Used by the WebSocket handler. Runs the planning phase if profile.planning_enabled = True.

run(session_id, user_message)str

Non-streaming. Full tool-calling loop, returns final text. No planning phase.

run_ephemeral(user_message, profile_id)str

Non-persistent subagent. No DB reads/writes. Temporary in-memory context. Called by SpawnAgentTool. Uses session ID subagent_<uuid12> to isolate scratchpad.


Planning phase (_run_planning)

Runs before the tool loop when profile.planning_enabled = True.

Phase 1 — Analysis

LLM receives the user request with a classification prompt. Outputs:

  • DIRECT → skip planning entirely (simple request).
  • A structured analysis + REFLECT: yes/no → continue to Phase 2 or 3.

Phase 2 — Structured review (conditional)

Runs only when planning_phase2_enabled = True AND Phase 1 outputs REFLECT: yes. One LLM call reviews the Phase 1 analysis and returns four sections:

  • Critic — wrong assumptions, risks, contradictions, facts to verify
  • Pragmatist — simpler path, unnecessary steps, better executor choices
  • Detailer — missing requirements, source files/docs/tools to inspect, validation gaps
  • Plan Adjustments — concrete changes Phase 3 must apply

The review is embedded into the Phase 3 prompt.

Phase 3 — Execution plan

LLM produces milestones plus a numbered step list. Each step is assigned an executor:

  • TOOL: tool_name — single tool call
  • AGENT: profile_id — bounded 3+ tool-call subtask delegated to a subagent via spawn_agent
  • SELF — handled inline (synthesis, context-dependent action)

Plan depth is adaptive:

  • simple: 1-3 steps
  • medium: 5-9 steps
  • complex or autonomous: 8-15 steps
  • hard maximum: 15 steps

Comma test (enforced in prompt): if a step description lists multiple things with "and" or commas, each item must be a separate step.

The plan is injected into session.context as an assistant message and saved to session.messages with is_plan=True for UI rendering. The todo list is auto-populated from the plan steps.


Thinking mechanics

All flags live on AgentProfile and can be set per-profile in config.json.

Flag Default What it does
think_enabled true Passes think=True to LLM on every main-loop call (extended reasoning)
iteration_budget_enabled true Injects remaining iteration count into context so model wraps up in time
planning_phase2_enabled false Enables Phase 2 structured review (one extra LLM call when Phase 1 outputs REFLECT: yes)
goal_anchoring_enabled true Injects goal-reminder system message every N iterations
goal_anchoring_interval 5 N for goal anchoring
anti_stall_enabled true Detects looping without todo progress and injects a warning
anti_stall_threshold 8 Consecutive iterations without progress before warning fires
step_validation_enabled false Blocks marking a todo step done without a validation field
adaptive_replan_enabled false When a step is marked failed, queues a re-plan prompt for the next iteration
subagent_planning_enabled false Subagents run their own planning phase

Tool-calling loop

Runs up to profile.max_iterations times.

Each iteration:
  1. Check stop_event → yield StreamStopped if set
  2. Build context: _build_context() injects iteration budget and goal anchor (if due)
  3. Check anti-stall: if stalled, append warning message to context
  4. Inject queued adaptive re-plan message (if a step failed last iteration)
  5. llm.stream_complete(context, tool_schemas)
     → ThinkingDelta/ThinkingEnd events during reasoning
     → TextDelta events during text generation
  6a. No tool calls → save session, yield StreamEnd, run workers, return
  6b. Tool calls → execute each, yield ToolEvent, append results to context
  7. Update anti-stall counters, detect newly-failed todo steps
  8. Check if profile switched → reload profile + tools

Sub-agent event forwarding

When spawn_agent runs a subagent, its events arrive through current_event_sink. The parent drains the queue in real time, yielding subagent events marked with is_subagent=True.

Cooperative stop

Stop is signalled via current_stop_event (an asyncio.Event). Checked before each LLM call, during streaming, and after tool execution. Never use task.cancel() — it corrupts WebSocket state.


Workers

Run sequentially after StreamEnd. Currently: CompressionWorker.

Pre-turn compression also runs at the start of run_stream() if session.context_token_count exceeds the threshold. See sessions.md.


System prompt construction (_build_context)

Every LLM call receives:

  1. System message: persona + "---" + profile.system_prompt (injected fresh, never stored).
  2. Optional memory message: "## What I remember about the user\n...".
  3. session.context messages (system messages stripped to avoid duplication).

Profile switches and persona changes take effect immediately.