# Agent Loop

Core execution engine. File: `navi/core/agent.py`.

## Entry points

### `run_stream(session_id, user_message)` → `AsyncGenerator[AgentEvent]`
Streaming. Yields `AgentEvent` objects in real time. Used by the WebSocket handler. Runs the planning phase if `profile.planning_enabled = True`.

### `run(session_id, user_message)` → `str`
Non-streaming. Full tool-calling loop, returns final text. No planning phase.

### `run_ephemeral(user_message, profile_id)` → `str`
Non-persistent subagent. No DB reads/writes. Temporary in-memory context. Called by `SpawnAgentTool`. Uses session ID `subagent_<uuid12>` to isolate scratchpad.

---

## Planning phase (`_run_planning`)

Runs before the tool loop when `profile.planning_enabled = True`.

### Phase 1 — Analysis
LLM receives the user request with a classification prompt. Outputs:
- `DIRECT` → skip planning entirely (simple request).
- A structured analysis + `REFLECT: yes/no` → continue to Phase 2 or 3.

### Phase 2 — Multi-perspective review (conditional)
Runs only when `profile.planning_reflect_enabled = True` AND Phase 1 outputs `REFLECT: yes`.
Three advisors run in parallel (via `asyncio.gather`), each receiving the full chat context and the Phase 1 analysis:
- **Critic** — what could go wrong
- **Pragmatist** — simpler/more direct path
- **Detailer** — missing requirements

Advisor feedback is embedded into the Phase 3 prompt.

### Phase 3 — Execution plan
LLM produces a numbered step list. Each step is assigned an executor:
- `TOOL: tool_name` — single tool call
- `AGENT: profile_id` — delegated to a subagent via `spawn_agent`
- `SELF` — handled inline (synthesis, context-dependent action)

**Comma test (enforced in prompt):** if a step description lists multiple things with "and" or commas, each item must be a separate step.

The plan is injected into `session.context` as an assistant message and saved to `session.messages` with `is_plan=True` for UI rendering. The todo list is auto-populated from the plan steps.

---

## Thinking mechanics

All flags live on `AgentProfile` and can be set per-profile in `config.json`.

| Flag | Default | What it does |
|---|---|---|
| `think_enabled` | `true` | Passes `think=True` to LLM on every main-loop call (extended reasoning) |
| `iteration_budget_enabled` | `true` | Injects remaining iteration count into context so model wraps up in time |
| `planning_reflect_enabled` | `false` | Enables Phase 2 advisor review (3 parallel LLM calls, adds latency) |
| `goal_anchoring_enabled` | `true` | Injects goal-reminder system message every N iterations |
| `goal_anchoring_interval` | `5` | N for goal anchoring |
| `anti_stall_enabled` | `true` | Detects looping without todo progress and injects a warning |
| `anti_stall_threshold` | `8` | Consecutive iterations without progress before warning fires |
| `step_validation_enabled` | `false` | Blocks marking a todo step `done` without a `validation` field |
| `adaptive_replan_enabled` | `false` | When a step is marked `failed`, queues a re-plan prompt for the next iteration |
| `subagent_planning_enabled` | `false` | Subagents run their own planning phase |

---

## Tool-calling loop

Runs up to `profile.max_iterations` times.

```
Each iteration:
  1. Check stop_event → yield StreamStopped if set
  2. Build context: _build_context() injects iteration budget and goal anchor (if due)
  3. Check anti-stall: if stalled, append warning message to context
  4. Inject queued adaptive re-plan message (if a step failed last iteration)
  5. llm.stream_complete(context, tool_schemas)
     → ThinkingDelta/ThinkingEnd events during reasoning
     → TextDelta events during text generation
  6a. No tool calls → save session, yield StreamEnd, run workers, return
  6b. Tool calls → execute each, yield ToolEvent, append results to context
  7. Update anti-stall counters, detect newly-failed todo steps
  8. Check if profile switched → reload profile + tools
```

### Sub-agent event forwarding
When `spawn_agent` runs a subagent, its events arrive through `current_event_sink`. The parent drains the queue in real time, yielding subagent events marked with `is_subagent=True`.

### Cooperative stop
Stop is signalled via `current_stop_event` (an `asyncio.Event`). Checked before each LLM call, during streaming, and after tool execution. Never use `task.cancel()` — it corrupts WebSocket state.

---

## Workers

Run sequentially after `StreamEnd`. Currently: `CompressionWorker`.

Pre-turn compression also runs at the start of `run_stream()` if `session.context_token_count` exceeds the threshold. See [`sessions.md`](sessions.md).

---

## System prompt construction (`_build_context`)

Every LLM call receives:
1. System message: `persona + "---" + profile.system_prompt` (injected fresh, never stored).
2. Optional memory message: `"## What I remember about the user\n..."`.
3. `session.context` messages (system messages stripped to avoid duplication).

Profile switches and persona changes take effect immediately.
