feat: backend registry, S2-Pro INT4, progressive segmentation, text cleaning
- Reduced VRAM usage from 15.4 GB to 9.6 GB (max_seq_len 32768 -> 4096; KV cache shrinks by 4.7 GB)
- Added a backend registry in tts/__init__.py with a @register() decorator
- Backends register themselves: dummy, s2, fish_speech, f5_tts, xtts_v2
- Simplified server.py: create_engine() builds the engine from the registry; _BACKEND_MAP is gone
- Made _sync_synthesize backend-agnostic (no isinstance checks)
- Added ref_text to the TTSEngine.synthesize base-class method
- Removed the bnb 4-bit code from inference.py (INT4 is auto-detected from the path)
- cleanup_text_for_tts(): cleans up emoji, HTML, URLs, Markdown, special characters
- Progressive segmentation (fast_start_initial=12, fast_start_count=3)
- Documentation: README, .env.example, docs/05_usage.md updated for S2-Pro
- AGENTS.md: updated to reflect the current project state
- .gitignore: .hf_home/, outputs/, voices/*.{m4a,mp4,flac,bak}
1 parent 2bff5aa commit fcb0f02753674e6a7c3ee82bf4cf42663baf451e
Eugene Sukhodolskiy authored 3 hours ago
Showing 20 changed files
.env.example
AGENTS.md
README.md
docs/05_usage.md
docs/06_technical_notes.md
examples/client_browser.html
pyproject.toml
scripts/benchmark_backends.py 0 → 100644
scripts/benchmark_compile.py 0 → 100644
src/voice_tts/api/server.py
src/voice_tts/config.py
src/voice_tts/tts/__init__.py 0 → 100644
src/voice_tts/tts/engine.py
src/voice_tts/tts/f5_backend.py
src/voice_tts/tts/fish_speech_backend.py 0 → 100644
src/voice_tts/tts/s2_backend.py 0 → 100644
src/voice_tts/tts/segmenter.py
src/voice_tts/tts/utils.py
src/voice_tts/tts/xtts_backend.py 0 → 100644
tests/test_fish_speech_backend.py 0 → 100644