Add eval system Phase 4: read endpoints and background runner

REST surface for the debug UI:
- GET /eval/sessions  — overview list with eval status / latest avg /
  feedback counts (single SQL: sessions ⨝ feedback ⨝ latest run)
- GET /eval/sessions/{id} — session detail with all evaluations
- GET /eval/stats — weekly per-axis means; optional complexity-bucket split
- POST /eval/run — fire-and-forget background eval, returns run_id
- GET /eval/run/{id}, GET /eval/runs — poll progress and history
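
The single-query overview behind GET /eval/sessions could look roughly like the sketch below. It is not the code in db.py: the table names (sessions, feedback, eval_runs), their columns, and the "evaluated"/"pending" status values are all assumptions made for illustration, and SQLite is used only so the example runs on its own.

```python
import sqlite3

# Hypothetical schema: none of these table or column names come from the commit.
SCHEMA = """
CREATE TABLE sessions (id TEXT PRIMARY KEY);
CREATE TABLE feedback (session_id TEXT, rating INTEGER);
CREATE TABLE eval_runs (id INTEGER PRIMARY KEY, session_id TEXT, avg_score REAL);
"""

# One statement: sessions joined to aggregated feedback and to the latest run.
OVERVIEW_SQL = """
SELECT s.id AS session_id,
       r.avg_score AS latest_avg,
       COALESCE(f.n, 0) AS feedback_count
FROM sessions AS s
LEFT JOIN (SELECT session_id, COUNT(*) AS n
           FROM feedback GROUP BY session_id) AS f
       ON f.session_id = s.id
LEFT JOIN eval_runs AS r
       ON r.id = (SELECT MAX(r2.id) FROM eval_runs AS r2
                  WHERE r2.session_id = s.id)
ORDER BY s.id
"""


def session_overview(conn):
    """Return one dict per session; sessions without a run get latest_avg=None."""
    rows = conn.execute(OVERVIEW_SQL).fetchall()
    return [
        {
            "session_id": sid,
            "latest_avg": avg,
            "feedback_count": count,
            # Assumed status values, derived from whether any run exists.
            "eval_status": "evaluated" if avg is not None else "pending",
        }
        for sid, avg, count in rows
    ]


def demo():
    """Build a tiny in-memory database and run the overview query on it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO sessions VALUES (?)", [("a",), ("b",)])
    conn.executemany("INSERT INTO feedback VALUES (?, ?)", [("a", 1), ("a", -1)])
    conn.executemany(
        "INSERT INTO eval_runs (session_id, avg_score) VALUES (?, ?)",
        [("a", 3.0), ("a", 4.5)],
    )
    return session_overview(conn)
```

Aggregating feedback in a subquery before joining keeps the feedback count from being multiplied by the number of runs per session.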

Pulled the runner loop out of cli into runner.py so both the CLI and
the REST endpoint share the same loop. State for in-flight runs lives
in an in-memory registry (single-process, cleared on restart).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 864261a commit 8d5c3510f3bbfc2c7dbbd767efa01c1a80d6d88d
Eugene Sukhodolskiy authored on 26 Apr
Showing 4 changed files
debug/eval/api.py
debug/eval/db.py
debug/eval/runner.py 0 → 100644
debug/eval/schema.py