Add stop button and fix context compression hang
Stop generation:
- Client: send button toggles to a red ■ during streaming; clicking it sends {type:stop} via WS
- Server: _stream_recv reads incoming messages concurrently during streaming using
asyncio.wait, so the stop signal is handled immediately without polling
- Cooperative stop via asyncio.Event (current_stop_event ContextVar): agent breaks
out of the LLM async-for cleanly so aclose() fires, the Ollama stream closes gracefully,
and the model stays in VRAM. task.cancel() is not used because it would eject the model.
- StreamStopped event propagates through run_stream/run_ephemeral; sub-agents stop
via the same shared stop_event inherited through task context
Context compression fix:
- compress_context passes think=False to llm.complete(): extended reasoning
during summarization was causing the GPU hang
- Input truncated to 12k chars before sending to summarizer
- LLMBackend.complete() / OllamaBackend.complete() accept think: bool | None override
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
client/js/app.js
client/js/ws.js
client/style.css
navi/api/websocket.py
navi/core/agent.py
navi/core/compressor.py
navi/core/events.py
navi/llm/base.py
navi/llm/ollama.py
navi/llm/openai_backend.py
navi/profiles/secretary.py
navi/profiles/server_admin.py
navi/profiles/smart_home.py
navi/tools/base.py
navi/tools/terminal.py