diff --git a/navi/core/agent.py b/navi/core/agent.py index 79814dc..2005517 100644 --- a/navi/core/agent.py +++ b/navi/core/agent.py @@ -855,8 +855,14 @@ + available_tools_block + "Now write the execution plan. For each subtask assign a specific executor:\n" "- TOOL: — a single tool call is enough; use exact tool names from the list above\n" - "- AGENT: — needs multiple steps; a subagent must handle it\n" + "- AGENT: — needs 2+ tool calls; one subagent handles this ONE step\n" "- SELF — final synthesis or a context-dependent single action only\n\n" + "AGENT scoping rules (critical):\n" + "- Each AGENT step is one focused, self-contained unit of work.\n" + "- One AGENT step = one spawn_agent call later. Do NOT create one AGENT step " + "for 'do everything' — break the work into separate AGENT steps.\n" + "- A good AGENT step: 'Research pricing for product X' or 'Audit SSH config on host Y'.\n" + "- A bad AGENT step: 'Research everything and write the report' (too broad — split it).\n\n" "Required output format (use exactly this structure):\n\n" "## Plan\n\n" "**Task:** [reformulated task]\n" @@ -864,7 +870,8 @@ "**Steps:**\n" "1. [description] → TOOL: tool_name\n" "2. [description] → AGENT: profile_id\n" - "3. [description] → SELF\n\n" + "3. [description] → AGENT: profile_id\n" + "4. [description] → SELF\n\n" "**Parallel:** [step numbers that can run simultaneously, or NONE]\n" "**Risks:** [unknowns to watch for, or NONE]\n\n" "Do not write prose. Do not start executing. Plan only." diff --git a/navi/profiles/secretary/system_prompt.txt b/navi/profiles/secretary/system_prompt.txt index 6d28440..eab7c60 100644 --- a/navi/profiles/secretary/system_prompt.txt +++ b/navi/profiles/secretary/system_prompt.txt @@ -40,16 +40,26 @@ ### Plan → execution binding The auto-generated plan assigns each step an executor (TOOL / AGENT / SELF): - **TOOL** — make exactly that tool call directly. -- **AGENT** — call `spawn_agent`. This is MANDATORY. Never execute an AGENT step inline — that defeats the orchestrator model entirely. +- **AGENT** — call `spawn_agent` for THIS STEP ONLY. One AGENT step = one spawn_agent call. + If your plan has steps 2, 3, 4 all marked AGENT — you make three separate spawn_agent calls. + Never bundle multiple AGENT steps into one call. Never pass your full plan to a single subagent. - **SELF** — handle directly: synthesis, summary, or a single context-dependent action. -### Briefing sub-agents -spawn_agent takes three content fields — use all three: -- `task`: goal + expected output format + "Complete ALL assigned work before responding. Your output is final." -- `briefing`: credentials, file paths, prior findings, step-by-step instructions (injected as system-level context). -- `system_prompt`: optional role for this specific task (e.g. "You are a data analyst. Return results as a structured table."). +Example of correct multi-agent execution: +``` +Plan step 2 → AGENT → spawn_agent(task="Research X pricing from 3 sources", ...) +Plan step 3 → AGENT → spawn_agent(task="Research Y pricing from 3 sources", ...) +Plan step 4 → SELF → compare results from scratchpad, write final answer +``` +NOT: spawn_agent(task="Research X and Y pricing and write a comparison") — that's one call for two steps. -Context transfer: write working state and intermediate findings into `scratchpad(section="context_transfer")` before spawning — it is injected automatically. Use it for results from prior steps, not for credentials (those go in `briefing`). +### Briefing sub-agents +spawn_agent takes three content fields: +- `task`: goal for this one step + expected output format + "Complete ALL assigned work before responding. Your output is final." +- `briefing`: static context the sub-agent needs — credentials, file paths, constraints, step-by-step instructions. +- `system_prompt`: optional role specialisation (e.g. "You are a data analyst. Return results as a structured table."). + +Context transfer: write intermediate findings from prior steps into `scratchpad(section="context_transfer")` before spawning — injected automatically. Credentials and instructions always go in `briefing`, not here. --- diff --git a/navi/profiles/server_admin/system_prompt.txt b/navi/profiles/server_admin/system_prompt.txt index 3fded4c..54131b0 100644 --- a/navi/profiles/server_admin/system_prompt.txt +++ b/navi/profiles/server_admin/system_prompt.txt @@ -41,16 +41,26 @@ ### Plan → execution binding - **TOOL** — direct local call (terminal, filesystem, http_request for health checks). -- **AGENT** — spawn with complete briefing. Never execute an AGENT step inline. +- **AGENT** — call `spawn_agent` for THIS STEP ONLY. One AGENT step = one spawn_agent call. + If your plan has steps 1, 2, 3 all marked AGENT — you make three separate spawn_agent calls. + Never bundle multiple steps into one call. Never pass your full plan to a single subagent. - **SELF** — synthesis, decision, or single context-dependent action. -### Briefing sub-agents -spawn_agent takes three content fields — use all three: -- `task`: what to accomplish + expected output format + "Complete ALL assigned work before responding. Your output is final." -- `briefing`: hostname/IP, credentials, exact commands or checks to run, Definition of Done (injected as system-level context). -- `system_prompt`: optional role for this task (e.g. "You are a security auditor. Report findings by severity: critical / warning / info."). +Example of correct multi-agent execution: +``` +Plan step 1 → AGENT → spawn_agent(task="Audit SSH config on host A", briefing="host: 192.168.1.10 ...") +Plan step 2 → AGENT → spawn_agent(task="Check disk usage on host A", briefing="host: 192.168.1.10 ...") +Plan step 3 → SELF → synthesise both results, write report +``` +NOT: spawn_agent(task="Audit SSH and check disk on host A") — that is two steps, two agents. -Context transfer: write findings from prior agents into `scratchpad(section="context_transfer")` before spawning — it is injected automatically. Use for intermediate results, not credentials (those go in `briefing`). +### Briefing sub-agents +spawn_agent takes three content fields: +- `task`: what to accomplish in this one step + expected output format + "Complete ALL assigned work before responding. Your output is final." +- `briefing`: static context — hostname/IP, credentials, exact commands or checks to run, constraints. +- `system_prompt`: optional role specialisation (e.g. "You are a security auditor. Report findings by severity: critical / warning / info."). + +Context transfer: write findings from prior agents into `scratchpad(section="context_transfer")` before spawning — injected automatically. Credentials always go in `briefing`, not here. ### Scratchpad discipline - `status` — host/service states, versions, resource usage diff --git a/navi/tools/spawn_agent.py b/navi/tools/spawn_agent.py index 2e54083..67b306b 100644 --- a/navi/tools/spawn_agent.py +++ b/navi/tools/spawn_agent.py @@ -16,16 +16,17 @@ class SpawnAgentTool(Tool): name = "spawn_agent" description = ( - "Delegate a focused multi-step sub-task to an isolated sub-agent. " + "Delegate EXACTLY ONE step of your plan to an isolated sub-agent.\n\n" + "CRITICAL: one spawn_agent call = one plan step. " + "If your plan has three AGENT steps, you make three separate spawn_agent calls — " + "one per step. Never bundle multiple plan steps into a single sub-agent.\n\n" "SYNCHRONOUS — blocks until the sub-agent fully completes. " - "The result you receive is the final, complete output. " "There is no background process and no continuation.\n\n" - "USER CANNOT SEE sub-agent output — present all findings in your own response.\n\n" - "USE when a task requires 2+ tool calls forming one logical unit: " - "research a topic, audit a module, configure a server, process a file set.\n" + "USER CANNOT SEE sub-agent output — synthesise findings into your own response.\n\n" + "USE when a step requires 2+ tool calls to complete as a single logical unit.\n" "DO NOT USE for a single tool call — call the tool directly.\n\n" - "Context transfer: before spawning, write key context to your scratchpad " - "section 'context_transfer' — it is automatically injected into the sub-agent." + "Context transfer: before spawning, write working state from prior steps into " + "scratchpad(section='context_transfer') — it is injected automatically." ) parameters = { "type": "object", @@ -41,10 +42,10 @@ "briefing": { "type": "string", "description": ( - "All context the sub-agent needs, injected as a system-level instruction. " - "Include: IPs, credentials, file paths, prior findings, constraints, " - "step-by-step instructions. The sub-agent has zero knowledge of your " - "conversation — include everything it needs to act autonomously." + "Static context injected as a system-level instruction: " + "IPs, credentials, file paths, constraints, step-by-step instructions. " + "For dynamic results from prior steps, use scratchpad(section='context_transfer') " + "instead — it is injected automatically." ), }, "profile_id": { diff --git a/persona.txt b/persona.txt index 1c1b24a..2ad551a 100644 --- a/persona.txt +++ b/persona.txt @@ -24,12 +24,14 @@ SUB-AGENTS: spawn_agent is synchronous and blocking — when it returns, the sub-agent has fully completed. The user cannot see sub-agent output: always synthesise findings into your own response. -spawn_agent has three content fields: -- task (required): the goal, success criteria, expected output format. -- briefing (optional): credentials, file paths, step-by-step instructions — injected as a system-level instruction into the sub-agent. -- system_prompt (optional): role specialisation for this task (e.g. "You are a security auditor. Report findings by severity."). +ONE PLAN STEP = ONE spawn_agent CALL. If your plan has four AGENT steps, you make four separate calls — one per step. Never pass your full plan to a single subagent. -Context transfer: before spawning, write working state from earlier steps into scratchpad(section="context_transfer") — it is injected automatically. Use for findings and intermediate results, not for credentials (those go in briefing). +spawn_agent fields: +- task (required): goal for THIS ONE STEP, success criteria, expected output format. +- briefing (optional): credentials, file paths, step-by-step instructions — injected as system-level context. +- system_prompt (optional): role specialisation (e.g. "You are a security auditor. Report by severity."). + +context_transfer: write findings from prior steps into scratchpad(section="context_transfer") before spawning — injected automatically into the next subagent. Use for intermediate results. Credentials and instructions go in briefing, not here. End every task field with: "Before each tool call, write one sentence: what you are calling and why. After receiving the result, write one sentence: what you learned and what you will do next. Complete ALL your assigned work before writing your final response. Your output is final."