feat: backend registry, S2-Pro INT4, progressive segmentation, text cleaning




- VRAM reduced from 15.4 GB to 9.6 GB (max_seq_len 32768 -> 4096, KV cache -4.7 GB)
- Added a backend registry in tts/__init__.py with a @register() decorator
- Backends self-register: dummy, s2, fish_speech, f5_tts, xtts_v2
- server.py simplified: create_engine() comes from the registry, _BACKEND_MAP is gone
- _sync_synthesize is now generic (no isinstance checks)
- ref_text added to TTSEngine.synthesize in the base class
- Removed the bnb 4-bit code from inference.py (INT4 is auto-detected from the path)
- cleanup_text_for_tts(): emoji, HTML, URLs, Markdown, special characters
- Progressive segmentation (fast_start_initial=12, fast_start_count=3)
- Documentation: README, .env.example, docs/05_usage.md updated for S2-Pro
- AGENTS.md: current state of the project
- .gitignore: .hf_home/, outputs/, voices/*.{m4a,mp4,flac,bak}
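
The text cleaning step can be sketched roughly as follows. The commit only names the function `cleanup_text_for_tts()` and the categories it handles (emoji, HTML, URLs, Markdown, special characters); the function name `cleanup_text_sketch`, the regular expressions, the order of the passes and the emoji code point ranges below are all assumptions, and the real implementation may differ.

```python
import html
import re

# All patterns are illustrative assumptions, not the project's actual rules.
_MD_LINK = re.compile(r"\[([^\]]*)\]\([^)]*\)")   # [label](target) -> label
_TAG = re.compile(r"<[^>]+>")                      # HTML tags
_URL = re.compile(r"https?://\S+")                 # bare URLs
_EMOJI = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF\uFE0F]")  # rough emoji ranges
_MD_MARKS = re.compile(r"[*_`#~]+")                # Markdown emphasis and heading marks
_WS = re.compile(r"\s+")


def cleanup_text_sketch(text: str) -> str:
    """Strip content that a TTS model should not try to read aloud."""
    text = _MD_LINK.sub(r"\1", text)   # keep link labels; must run before URL removal
    text = _TAG.sub(" ", text)
    text = html.unescape(text)         # &amp; -> &, and so on
    text = _URL.sub(" ", text)
    text = _EMOJI.sub(" ", text)
    text = _MD_MARKS.sub("", text)
    return _WS.sub(" ", text).strip()  # collapse the gaps left by removals


cleaned = cleanup_text_sketch("**Hello** <b>world</b> 🎉 see https://example.com now")
```

Note that removing every underscore also alters snake_case identifiers in the input; a production cleaner would likely be more selective.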
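
One possible reading of progressive segmentation is sketched below: the first segments are kept short so that audio can start playing early, then segments grow to a normal size. The commit gives only the parameter names and values (`fast_start_initial=12`, `fast_start_count=3`). Everything else here is an assumption: that the unit is words, that the limit doubles over the first `fast_start_count` segments, and the hypothetical `max_words` cap. The actual logic in segmenter.py is not shown on this page.

```python
from typing import List


def progressive_segments(
    text: str,
    fast_start_initial: int = 12,   # assumed: word limit of the first segment
    fast_start_count: int = 3,      # assumed: number of "fast start" segments
    max_words: int = 60,            # hypothetical steady-state segment size
) -> List[str]:
    """Split text into segments whose size ramps up from a small first chunk."""
    if fast_start_initial < 1 or max_words < 1:
        raise ValueError("segment sizes must be positive")

    words = text.split()
    segments: List[str] = []
    pos = 0
    limit = fast_start_initial
    while pos < len(words):
        segments.append(" ".join(words[pos:pos + limit]))
        pos += limit
        if len(segments) < fast_start_count:
            limit = min(limit * 2, max_words)   # still ramping up
        else:
            limit = max_words                   # fast start is over
    return segments


sample = " ".join(f"w{i}" for i in range(200))
sizes = [len(s.split()) for s in progressive_segments(sample)]
```

Under these assumptions a 200-word input is cut into segments of 12, 24, 48, 60 and 56 words, so the first synthesis request stays small regardless of the total text length.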

master

1 parent 2bff5aa commit fcb0f02753674e6a7c3ee82bf4cf42663baf451e

Eugene Sukhodolskiy authored 20 days ago


Showing 20 changed files

- .env.example
- AGENTS.md
- README.md
- docs/05_usage.md
- docs/06_technical_notes.md
- examples/client_browser.html
- pyproject.toml
- scripts/benchmark_backends.py (new file, 0 → 100644)
- scripts/benchmark_compile.py (new file, 0 → 100644)
- src/voice_tts/api/server.py
- src/voice_tts/config.py
- src/voice_tts/tts/__init__.py (new file, 0 → 100644)
- src/voice_tts/tts/engine.py
- src/voice_tts/tts/f5_backend.py
- src/voice_tts/tts/fish_speech_backend.py (new file, 0 → 100644)
- src/voice_tts/tts/s2_backend.py (new file, 0 → 100644)
- src/voice_tts/tts/segmenter.py
- src/voice_tts/tts/utils.py
- src/voice_tts/tts/xtts_backend.py (new file, 0 → 100644)
- tests/test_fish_speech_backend.py (new file, 0 → 100644)
