Piper TTS Server (Local Text-to-Speech)
This repository contains scripts to generate high-quality Neural TTS audio using Piper and stream it directly to Snapcast for multi-room audio announcements.
⚡ Functionality
The primary script speak_direct.sh performs the following pipeline:
- Input: Text string, Voice Model, Speed.
- Generation: Calls the piper binary to generate a .wav file locally.
- Processing: Uses sox to resample audio to 48 kHz stereo (matching Snapserver).
- Playback: Pipes raw audio data directly to the Snapserver input stream (the /tmp/snapfifo pipe or a TCP port).
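The pipeline above can be sketched roughly as follows. This is not the repository's speak_direct.sh, only an illustration under stated assumptions: the function name speak, the MODEL_DIR and SNAPFIFO variables, and the direct mapping of the speed argument onto piper's --length_scale are all hypothetical, and Snapserver is assumed to expect 48000:16:2 PCM.

```shell
# Sketch only; the real speak_direct.sh may differ.
# Assumptions: piper and sox are on PATH, models live in MODEL_DIR,
# and Snapserver reads 48 kHz / 16-bit / stereo PCM from SNAPFIFO.
SNAPFIFO="${SNAPFIFO:-/tmp/snapfifo}"
MODEL_DIR="${MODEL_DIR:-/data}"

speak() {
  local text="$1" model="$2" speed="${3:-1.0}"
  local wav status
  wav="$(mktemp)" || return 1

  # Generation: piper reads text on stdin and writes a WAV file.
  # Passing the speed value straight to --length_scale is an assumption.
  printf '%s\n' "$text" |
    piper --model "$MODEL_DIR/$model" --length_scale "$speed" \
          --output_file "$wav" || { rm -f "$wav"; return 1; }

  # Processing and playback: convert to 48 kHz stereo, 16-bit signed
  # raw PCM, and write it to the Snapserver pipe.
  sox -t wav "$wav" -r 48000 -c 2 -b 16 -e signed-integer -t raw - > "$SNAPFIFO"
  status=$?

  rm -f "$wav"
  return "$status"
}
```

With this sketch sourced, a call would look like: speak "Text to speak" "model_name.onnx" "1.0". Writing to a named pipe blocks until Snapserver has it open for reading.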
📂 File Manifest
- speak_direct.sh: Main entry point called by Home Assistant via SSH.
  - Usage: ./speak_direct.sh "Text to speak" "model_name.onnx" "1.0"
- test_snapcast.sh: Debug tool to verify the Snapcast pipe connection (generates a sine wave).
- play_wav_to_snapcast.sh: Helper utility to stream existing WAV files to the speakers.
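For orientation, a sine-wave check of the kind test_snapcast.sh performs could look like the sketch below. The function name test_tone, its defaults (3 seconds, 440 Hz), and the SNAPFIFO variable are hypothetical; the actual script may be written differently.

```shell
# Sketch only; not the contents of test_snapcast.sh.
# Assumption: Snapserver reads 48 kHz / 16-bit / stereo PCM from SNAPFIFO.
SNAPFIFO="${SNAPFIFO:-/tmp/snapfifo}"

test_tone() {
  local seconds="${1:-3}" freq="${2:-440}"
  # -n is sox's null input; synth generates the tone as an effect.
  sox -n -r 48000 -c 2 -b 16 -e signed-integer -t raw - \
      synth "$seconds" sine "$freq" > "$SNAPFIFO"
}
```

If a tone from test_tone 3 440 is audible on the speakers, the pipe and sample format are likely configured correctly.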
🛠️ Integration with Home Assistant
Home Assistant triggers these scripts using a shell_command over SSH.
Home Assistant YAML:
shell_command:
  tts_direct_piper: >
    ssh -i /config/ssh/id_rsa -o StrictHostKeyChecking=no user@192.168.20.13
    '/home/user/speech_piper/speak_direct.sh "{{ text }}" "{{ voice }}" "{{ speed }}"'
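Because the text, voice, and speed values arrive from templates over SSH, the receiving script may want to check them before use. The following is a minimal sketch of such a check; validate_args is a hypothetical helper, not something the repository is known to contain, and the allowed character set for voice names is an assumption.

```shell
# Hypothetical argument checks for the receiving script.
validate_args() {
  local text="$1" voice="$2" speed="$3"

  # Text must not be empty.
  [ -n "$text" ] || { echo "error: empty text" >&2; return 1; }

  # Voice: letters, digits, dot, underscore, hyphen only (no slashes).
  case "$voice" in
    ""|*[!A-Za-z0-9._-]*) echo "error: invalid voice: $voice" >&2; return 1 ;;
  esac

  # Speed: digits with at most one decimal point, e.g. 1 or 1.0.
  case "$speed" in
    ""|.|*[!0-9.]*|*.*.*) echo "error: invalid speed: $speed" >&2; return 1 ;;
  esac

  return 0
}
```

Called as validate_args "$1" "$2" "$3" || exit 1 near the top of a script, this rejects values such as a voice containing a path separator or a non-numeric speed.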
📦 Models
Voices are stored in the /data directory (ignored by Git due to size).
Required files for each voice:
- voice_name.onnx (Binary model)
- voice_name.onnx.json (Config file)
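A quick way to confirm that both files are present for a voice is sketched below. The helper check_voice is hypothetical, and the model directory is passed in as an argument rather than assumed.

```shell
# Hypothetical helper: verify that a voice has both required files.
check_voice() {
  local dir="$1" voice="$2" missing=0 f
  for f in "$dir/$voice.onnx" "$dir/$voice.onnx.json"; do
    if [ ! -f "$f" ]; then
      echo "missing: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, check_voice /data en_US-hal_6409-medium returns non-zero and names the missing file if either the model or its config is absent.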
Common Voices:
- en_US-hal_6409-medium (HAL 9000 style)
- en_US-trump-medium (Danny - Parody)
- en_US-picard_7399-medium (Picard)