Agent Loop

Core execution engine. File: navi/core/agent.py.

Entry points

`run_stream(session_id, user_message)` → `AsyncGenerator[AgentEvent]`

Streaming. Yields AgentEvent objects in real time. Used by the WebSocket handler. Runs the planning phase if profile.planning_enabled = True.

`run(session_id, user_message)` → `str`

Non-streaming. Full tool-calling loop, returns final text. No planning phase.

`run_ephemeral(user_message, profile_id)` → `str`

Non-persistent subagent. No DB reads/writes. Temporary in-memory context. Called by SpawnAgentTool. Uses session ID subagent_<uuid12> to isolate scratchpad.

Planning phase (`_run_planning`)

Runs before the tool loop when profile.planning_enabled = True.

Phase 1 — Analysis

LLM receives the user request with a classification prompt. Outputs:

DIRECT → skip planning entirely (simple request).
A structured analysis + REFLECT: yes/no → continue to Phase 2 or 3.

Phase 2 — Structured review (conditional)

Runs only when planning_phase2_enabled = True AND Phase 1 outputs REFLECT: yes. One LLM call reviews the Phase 1 analysis and returns four sections:

Critic — wrong assumptions, risks, contradictions, facts to verify
Pragmatist — simpler path, unnecessary steps, better executor choices
Detailer — missing requirements, source files/docs/tools to inspect, validation gaps
Plan Adjustments — concrete changes Phase 3 must apply

The review is embedded into the Phase 3 prompt.

Phase 3 — Execution plan

LLM produces milestones plus a numbered step list. Each step is assigned an executor:

TOOL: tool_name — single tool call
AGENT: profile_id — bounded 3+ tool-call subtask delegated to a subagent via spawn_agent
SELF — handled inline (synthesis, context-dependent action)

Plan depth is adaptive:

simple: 1-3 steps
medium: 5-9 steps
complex or autonomous: 8-15 steps
hard maximum: 15 steps

Comma test (enforced in prompt): if a step description lists multiple things with "and" or commas, each item must be a separate step.

The plan is injected into session.context as an assistant message and saved to session.messages with is_plan=True for UI rendering. The todo list is auto-populated from the plan steps.

Thinking mechanics

All flags live on AgentProfile and can be set per-profile in config.json.

Flag	Default	What it does
`think_enabled`	`true`	Passes `think=True` to LLM on every main-loop call (extended reasoning)
`iteration_budget_enabled`	`true`	Injects remaining iteration count into context so model wraps up in time
`planning_phase2_enabled`	`false`	Enables Phase 2 structured review (one extra LLM call when Phase 1 outputs `REFLECT: yes`)
`goal_anchoring_enabled`	`true`	Injects goal-reminder system message every N iterations
`goal_anchoring_interval`	`5`	N for goal anchoring
`anti_stall_enabled`	`true`	Detects looping without todo progress and injects a warning
`anti_stall_threshold`	`8`	Consecutive iterations without progress before warning fires
`step_validation_enabled`	`false`	Blocks marking a todo step `done` without a `validation` field
`adaptive_replan_enabled`	`false`	When a step is marked `failed`, queues a re-plan prompt for the next iteration
`subagent_planning_enabled`	`false`	Subagents run their own planning phase

Tool-calling loop

Runs up to profile.max_iterations times.

Each iteration:
  1. Check stop_event → yield StreamStopped if set
  2. Build context: _build_context() injects iteration budget and goal anchor (if due)
  3. Check anti-stall: if stalled, append warning message to context
  4. Inject queued adaptive re-plan message (if a step failed last iteration)
  5. llm.stream_complete(context, tool_schemas)
     → ThinkingDelta/ThinkingEnd events during reasoning
     → TextDelta events during text generation
  6a. No tool calls → save session, yield StreamEnd, run workers, return
  6b. Tool calls → execute each, yield ToolEvent, append results to context
  7. Update anti-stall counters, detect newly-failed todo steps
  8. Check if profile switched → reload profile + tools

Sub-agent event forwarding

When spawn_agent runs a subagent, its events arrive through current_event_sink. The parent drains the queue in real time, yielding subagent events marked with is_subagent=True.

Cooperative stop

Stop is signalled via current_stop_event (an asyncio.Event). Checked before each LLM call, during streaming, and after tool execution. Never use task.cancel() — it corrupts WebSocket state.

Workers

Run sequentially after StreamEnd. Currently: CompressionWorker.

Pre-turn compression also runs at the start of run_stream() if session.context_token_count exceeds the threshold. See sessions.md.

System prompt construction (`_build_context`)

Every LLM call receives:

System message: persona + "---" + profile.system_prompt (injected fresh, never stored).
Optional memory message: "## What I remember about the user\n...".
session.context messages (system messages stripped to avoid duplication).

Profile switches and persona changes take effect immediately.