Set first-chunk timeout to 90s and retry same server+model before fallback
- config.py: llm_stream_first_chunk_timeout 180s → 90s
- fallback.py stream_complete: wrap gen.__anext__() in asyncio.wait_for()
  with llm_stream_first_chunk_timeout; on TimeoutError or LLMConnectionError
  sleep 2s and retry once on the same server+model before blacklisting/fallback
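
A minimal sketch of the retry described above, not the actual fallback.py code.
The helper name stream_with_retry, the make_stream factory, and the local
LLMConnectionError stand-in are assumptions for illustration; blacklisting and
fallback to another server are left to the caller.

```python
import asyncio
from typing import AsyncIterator, Callable


class LLMConnectionError(Exception):
    """Stand-in for the project's error type (real definition not shown here)."""


async def stream_with_retry(
    make_stream: Callable[[], AsyncIterator[str]],  # hypothetical stream factory
    first_chunk_timeout: float = 90.0,  # mirrors llm_stream_first_chunk_timeout
    retry_delay: float = 2.0,
    max_attempts: int = 2,  # first try plus one retry on the same server+model
) -> AsyncIterator[str]:
    for attempt in range(1, max_attempts + 1):
        gen = make_stream()
        try:
            # Only the wait for the first chunk is bounded by the timeout.
            first = await asyncio.wait_for(gen.__anext__(), timeout=first_chunk_timeout)
        except StopAsyncIteration:
            return  # the stream ended without producing any chunk
        except (asyncio.TimeoutError, LLMConnectionError):
            # Release the failed stream if it supports aclose().
            aclose = getattr(gen, "aclose", None)
            if aclose is not None:
                await aclose()
            if attempt == max_attempts:
                raise  # caller decides on blacklisting or fallback
            await asyncio.sleep(retry_delay)
            continue
        yield first
        # After the first chunk arrives, pass the rest through untouched.
        async for chunk in gen:
            yield chunk
        return
```

A fresh stream is created for each attempt because a generator that was
cancelled by asyncio.wait_for() cannot be resumed.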

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 80ba2f5 commit 4f780995bb41b217874bcfdfa3497505f3b060e2
Eugene Sukhodolskiy authored on 24 May
Showing 2 changed files
navi/config.py
navi/llm/fallback.py