Local GPU-powered real-time text-to-speech pipeline with a WebSocket API, designed to voice AI agents that stream text in chunks.
models/F5TTS_v1_Base/ downloaded).

    # Create virtual environment (Python 3.10-3.12 recommended)
    python3.11 -m venv .venv
    source .venv/bin/activate

    # Install PyTorch with CUDA 12.6 support first
    pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu126

    # Install remaining dependencies
    pip install -r requirements.txt

    # (Optional) Download the F5-TTS model beforehand
    python scripts/download_f5_tts.py --model F5TTS_v1_Base

    # Run the server
    python -m voice_tts.main

    # Or run in dummy test mode
    TTS_BACKEND=dummy python -m voice_tts.main
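Before starting the server, it can help to confirm that the interpreter falls in the recommended 3.10-3.12 range and that PyTorch can see a CUDA device. The following is a minimal sketch, not part of this repository: the helper names `python_supported` and `cuda_status` are hypothetical, and the only PyTorch calls used are `torch.cuda.is_available()` and `torch.cuda.get_device_name()`.

```python
import importlib.util
import sys

# Minor versions recommended by the setup steps: 3.10, 3.11, 3.12.
SUPPORTED_MINORS = range(10, 13)


def python_supported(version=None):
    """Return True if the (major, minor, ...) version is in the 3.10-3.12 range."""
    major, minor = tuple(version or sys.version_info)[:2]
    return major == 3 and minor in SUPPORTED_MINORS


def cuda_status():
    """Describe whether PyTorch is installed and can see a CUDA device."""
    # Check for the package first so a missing install gives a clear message.
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    if torch.cuda.is_available():
        return f"CUDA available: {torch.cuda.get_device_name(0)}"
    return "torch is installed, but no CUDA device is visible"
```

Calling `print(python_supported(), cuda_status())` inside the activated virtual environment gives a quick sanity check. If no CUDA device is visible, the dummy mode (`TTS_BACKEND=dummy`) is still available for testing.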
The server listens on ws://localhost:8765/ws.
See the full documentation in docs/03_websocket_protocol.md.
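As an illustration of how an agent might stream text to the server in chunks, here is a minimal client sketch. It makes several assumptions: it uses the third-party `websockets` package (`pip install websockets`), the function names are hypothetical, and the JSON message shape (`{"type": "text", "text": ...}`) is a placeholder only. The real message format is defined in docs/03_websocket_protocol.md and may differ. Handling of audio responses is omitted for the same reason.

```python
import json
import re

WS_URL = "ws://localhost:8765/ws"  # default address from this README


def split_into_chunks(text):
    """Split text after sentence-ending punctuation, dropping empty pieces."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def encode_chunk(chunk):
    """Serialize one chunk as JSON.

    ASSUMPTION: this schema is a placeholder, not the documented protocol.
    """
    return json.dumps({"type": "text", "text": chunk})


async def stream_text(text, url=WS_URL):
    """Send text to the server one sentence-sized chunk at a time."""
    # Imported lazily so the helpers above work without the package installed.
    import websockets

    async with websockets.connect(url) as ws:
        for chunk in split_into_chunks(text):
            await ws.send(encode_chunk(chunk))
```

With the server running, `asyncio.run(stream_text("Hello there. This text arrives in chunks."))` would send two messages. Splitting on sentence boundaries is only one possible chunking strategy; an agent that produces tokens incrementally could forward them as they arrive instead.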