
Testing Strategy

Backend stack

  • pytest + pytest-asyncio (asyncio_mode = auto)
  • pytest-mock (mocker fixture)
  • httpx (TestClient for FastAPI routes)
  • asgi-lifespan — lifespan management in integration tests

Web client stack

  • Vitest + @vue/test-utils — Vue 3 component and composable testing
  • happy-dom — lightweight DOM environment
  • Pinia testing utils (@pinia/testing) for store mocks

Directory layout

tests/                          # Backend + terminal client (pytest)
├── conftest.py
├── conftest_factory.py
├── unit/
│   ├── api/
│   │   └── test_session_files.py
│   ├── auth/
│   │   ├── test_api_tokens.py
│   │   ├── test_deps.py
│   │   └── test_encrypt.py
│   ├── core/
│   │   ├── test_agent.py
│   │   ├── test_agent_context_size.py
│   │   ├── test_agent_stream_guard.py
│   │   ├── test_anti_stall.py
│   │   ├── test_compressor.py
│   │   ├── test_context_builder.py
│   │   ├── test_events.py
│   │   ├── test_pg_session_store.py
│   │   ├── test_planning.py
│   │   ├── test_registry.py
│   │   ├── test_scheduler.py
│   │   └── test_tool_executor.py
│   ├── llm/
│   │   └── test_ollama.py
│   ├── memory/
│   │   ├── test_extractor.py
│   │   └── test_store.py
│   ├── profiles/
│   │   ├── test_base.py
│   │   └── test_overrides.py
│   ├── store/
│   │   └── test_kv_store.py
│   ├── tools/
│   │   ├── test_code_exec.py
│   │   ├── test_content_publish.py
│   │   ├── test_filesystem.py
│   │   ├── test_image_view.py
│   │   ├── test_memory.py
│   │   ├── test_recall_tools.py
│   │   ├── test_scratchpad.py
│   │   └── ...
│   ├── test_content_store.py
│   ├── test_mcp.py
│   └── test_startup.py
├── integration/
│   ├── conftest.py
│   ├── test_api_routes.py
│   ├── test_auth_disabled.py
│   ├── test_mcp_integration.py
│   ├── test_recall_api.py
│   ├── test_scheduler_loop.py
│   └── test_websocket.py
└── clients/                    # Terminal client tests
    ├── test_terminal_client.py
    ├── test_terminal_ws.py
    ├── test_tui_app.py
    ├── test_tui_export.py
    ├── test_tui_sessions_panel.py
    ├── test_tui_settings.py
    ├── test_tui_themes.py
    ├── test_permissions.py
    ├── test_permission_dialog.py
    ├── test_shell_runner.py
    ├── test_file_refs.py
    └── test_diff_artifact_renderers.py

webclient/tests/                # Web client (Vitest)
├── unit/
│   ├── api/
│   │   └── index.test.js
│   ├── stores/
│   │   ├── chat.test.js
│   │   ├── sessions.test.js
│   │   └── profiles.test.js
│   └── composables/
│       └── useWebSocket.test.js

Mock strategy

LLM — FakeLLMBackend

FakeLLMBackend cycles through a list of pre-defined responses and can optionally emit ToolCallRequest objects. This lets us test the agent loop and planning without a running Ollama instance.

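from tests.conftest_factory import FakeLLMBackend


async def test_fake_backend_returns_first_response():
    # Each list is indexed by call number: the first call yields "Hello".
    backend = FakeLLMBackend(
        responses=["Hello", "DIRECT"],
        tool_calls=[None, None],
        thinking=["Hmm", None],
    )
    resp = await backend.complete([])  # → LLMResponse(content="Hello")
    assert resp.content == "Hello"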
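
To illustrate the cycling behaviour described above, here is a minimal, self-contained sketch. It is not the project's FakeLLMBackend: the class CyclingFakeBackend, the SketchResponse dataclass, and their fields are hypothetical stand-ins, and the real helper in tests/conftest_factory.py may behave differently (for example, once the response list is exhausted).

```python
import asyncio
from dataclasses import dataclass
from itertools import cycle
from typing import Any, Optional


@dataclass
class SketchResponse:
    """Hypothetical stand-in for the project's LLMResponse."""
    content: str
    tool_call: Optional[Any] = None
    thinking: Optional[str] = None


class CyclingFakeBackend:
    """Hypothetical fake: replays canned responses in order, then wraps around."""

    def __init__(self, responses, tool_calls=None, thinking=None):
        n = len(responses)
        if n == 0:
            raise ValueError("at least one response is required")
        tool_calls = tool_calls if tool_calls is not None else [None] * n
        thinking = thinking if thinking is not None else [None] * n
        if not (len(tool_calls) == len(thinking) == n):
            raise ValueError("responses, tool_calls and thinking must have equal length")
        # zip keeps the three lists aligned; cycle restarts after the last entry.
        self._script = cycle(list(zip(responses, tool_calls, thinking)))
        self.calls = []  # messages passed to complete(), kept for assertions

    async def complete(self, messages):
        self.calls.append(messages)
        content, tool_call, thinking = next(self._script)
        return SketchResponse(content=content, tool_call=tool_call, thinking=thinking)


async def demo():
    backend = CyclingFakeBackend(responses=["Hello", "DIRECT"], thinking=["Hmm", None])
    # Three calls against two scripted responses: the third wraps around.
    return [(await backend.complete([])).content for _ in range(3)]


contents = asyncio.run(demo())
print(contents)
```

Recording the incoming messages in calls is a design choice of this sketch: it lets a test assert on what the agent sent to the backend, not only on what came back.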

PostgreSQL — FakePool / FakeConnection

Unit tests mock asyncpg.Pool via an in-memory FakePool/FakeConnection. Integration tests may use a real Postgres instance via TEST_DATABASE_URL.

from tests.conftest_factory import FakeConnection, FakeRecord, make_store_with_pool

conn = FakeConnection()
conn.enqueue(42)  # fetchval result
conn.enqueue([FakeRecord(id="1", key="name", value="Eugene")])  # fetch result
store = make_store_with_pool(conn)
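
The enqueue pattern above can be pictured as a simple FIFO of canned results. The sketch below is an assumption about how such a fake could work, not the project's FakeConnection: QueueConnection, the kv table name, and the use of plain dicts in place of FakeRecord are all hypothetical. Only the method names fetch, fetchval, and execute mirror the asyncpg connection API.

```python
import asyncio
from collections import deque


class QueueConnection:
    """Hypothetical fake connection: each query method pops the next queued result."""

    def __init__(self):
        self._results = deque()
        self.queries = []  # (sql, args) pairs, recorded for assertions

    def enqueue(self, result):
        self._results.append(result)

    def _next(self, sql, args):
        self.queries.append((sql, args))
        if not self._results:
            # Fail loudly when a test forgot to script a result.
            raise AssertionError(f"no queued result for query: {sql}")
        return self._results.popleft()

    async def fetchval(self, sql, *args):
        return self._next(sql, args)

    async def fetch(self, sql, *args):
        return self._next(sql, args)

    async def execute(self, sql, *args):
        return self._next(sql, args)


async def demo():
    conn = QueueConnection()
    conn.enqueue(42)  # consumed by the first query
    conn.enqueue([{"id": "1", "key": "name", "value": "Eugene"}])  # consumed by the second
    count = await conn.fetchval("SELECT count(*) FROM kv")
    rows = await conn.fetch("SELECT * FROM kv WHERE key = $1", "name")
    return count, rows, conn.queries


count, rows, queries = asyncio.run(demo())
```

Because results are consumed strictly in order, a test written this way has to enqueue them in the same order the code under test issues its queries.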

Coverage status

The project is covered by backend (pytest), terminal-client (pytest), and web-client (Vitest) tests. Key areas with dedicated tests:

  • Agent loop & planning: tests/unit/core/test_agent*.py, test_planning.py, test_anti_stall.py (all in tests/unit/core/).
  • Sessions, compression, events: test_compressor.py, test_events.py, test_pg_session_store.py (all in tests/unit/core/).
  • Tools: tests/unit/tools/test_*.py.
  • Memory: tests/unit/memory/test_*.py.
  • Auth: tests/unit/auth/test_*.py.
  • MCP & recall: tests/unit/test_mcp.py, tests/integration/test_mcp_integration.py, tests/unit/tools/test_recall_tools.py, tests/unit/core/test_scheduler.py.
  • Terminal client: tests/clients/test_*.py.
  • Web client: webclient/tests/unit/**/*.test.js.

The detailed coverage roadmap lives in the test directories themselves. Add a regression test whenever a real bug is fixed.

Running tests

# Backend tests
pytest                              # all backend tests
pytest tests/unit                   # unit only
pytest -v tests/unit/core           # verbose
pytest -v tests/unit/core/test_events.py::TestToolStarted::test_to_wire  # single test
TEST_DATABASE_URL=postgresql://... pytest tests/integration  # integration tests against a real Postgres

# Web client tests (run from webclient/)
cd webclient && npm test             # all webclient tests
npx vitest run tests/unit/api       # single directory
npx vitest run -t "buildMessageList" # filter by test name

Adding a new test

  1. Create a file in the appropriate tests/unit/, tests/integration/, or tests/clients/ directory.
  2. Use async def for async tests; with asyncio_mode = auto, pytest-asyncio runs them without an explicit marker.
  3. Import helpers from tests.conftest_factory for fakes.
  4. Mutations to navi.config.settings are reset automatically by the autouse fixture in conftest.py.
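
As a rough illustration of steps 1 and 2, a new async test file might look like the sketch below. The function save_note is a hypothetical piece of code under test, defined inline so the example is self-contained. A real test would import from the project instead and, per step 3, pull fakes from tests.conftest_factory.

```python
import asyncio
import tempfile
from pathlib import Path


async def save_note(directory: Path, name: str, text: str) -> Path:
    """Hypothetical code under test: writes a note and returns its path."""
    path = directory / f"{name}.txt"
    # Run the blocking write in a worker thread so the event loop stays free.
    await asyncio.to_thread(path.write_text, text, encoding="utf-8")
    return path


# With asyncio_mode = auto, pytest-asyncio collects this coroutine as a test.
# tmp_path is pytest's built-in per-test temporary directory fixture.
async def test_save_note_writes_file(tmp_path: Path):
    path = await save_note(tmp_path, "todo", "write more tests")
    assert path.parent == tmp_path
    assert path.read_text(encoding="utf-8") == "write more tests"


if __name__ == "__main__":
    # Outside pytest, the same coroutine can be driven by hand.
    with tempfile.TemporaryDirectory() as tmp:
        asyncio.run(test_save_note_writes_file(Path(tmp)))
```

Using tmp_path keeps the filesystem boundary isolated per test, so nothing is written into the repository tree.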

Guidelines

  • Mock at boundaries: LLM calls → FakeLLMBackend, DB → FakePool, filesystem → tmp_path.
  • Avoid real network: Never hit Ollama, OpenAI, or DuckDuckGo in unit tests.
  • Avoid real DB in unit tests: Use in-memory mocks; real Postgres only in tests/integration/.
  • Keep tests deterministic: No randomness, and no time-dependent logic unless datetime.now is monkeypatched to return a fixed value.
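
One way to pin the clock, sketched under assumptions: because datetime.datetime is a C type, its now attribute cannot be reassigned directly, so a common workaround is to replace the datetime name that the code under test looks up with a subclass. The names FrozenDatetime, FIXED_NOW, and is_expired are hypothetical. The example uses unittest.mock from the standard library to stay self-contained; in a pytest test, monkeypatch.setattr on the module under test achieves the same swap.

```python
from datetime import datetime, timedelta
from unittest import mock

FIXED_NOW = datetime(2024, 1, 1, 12, 0, 0)


class FrozenDatetime(datetime):
    """datetime subclass whose now() always returns FIXED_NOW."""

    @classmethod
    def now(cls, tz=None):
        return FIXED_NOW


def is_expired(created_at, ttl):
    """Hypothetical code under test: resolves the module-level `datetime` name."""
    return datetime.now() - created_at > ttl


def test_is_expired_with_frozen_clock():
    # Swap the name that is_expired resolves at call time; patch.dict restores
    # the original binding when the block exits.
    with mock.patch.dict(globals(), {"datetime": FrozenDatetime}):
        assert is_expired(FIXED_NOW - timedelta(hours=2), timedelta(hours=1))
        assert not is_expired(FIXED_NOW - timedelta(minutes=5), timedelta(hours=1))


test_is_expired_with_frozen_clock()
```

The patch targets the module where the name is used, not the datetime module itself, so unrelated code keeps seeing the real clock.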