# Piper TTS Server (Local Text-to-Speech)
This repository contains scripts that generate high-quality neural TTS audio using **Piper** and stream it directly to **Snapcast** for multi-room audio announcements.
## ⚡ Functionality
The primary script `speak_direct.sh` performs the following pipeline:
1. **Input:** Text string, Voice Model, Speed.
2. **Generation:** Calls the `piper` binary to generate a `.wav` file locally.
3. **Processing:** Uses `sox` to resample the audio to 48 kHz stereo (matching the Snapserver format).
4. **Playback:** Pipes the raw audio data directly to Snapserver, via the `/tmp/snapfifo` pipe or a TCP port.
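The four steps above can be sketched as a small shell function. This is a minimal sketch, not the actual `speak_direct.sh`: it assumes `piper` and `sox` are on `PATH`, voices live in `/data`, Snapserver reads `/tmp/snapfifo`, and that your Piper build accepts `--length_scale` (treated here as the inverse of the speed argument) — verify all of these against your setup.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Piper slows speech as length_scale grows, so treat it as the
# inverse of the requested speed factor (assumption: verify locally).
length_scale() { awk -v s="$1" 'BEGIN { printf "%.3f", 1 / s }'; }

speak() {
  local text="$1" model="${2:-voice.onnx}" speed="${3:-1.0}" wav
  wav="$(mktemp --suffix=.wav)"

  # 1+2. Generate a WAV from stdin text with the chosen voice model.
  echo "$text" | piper --model "/data/$model" \
    --length_scale "$(length_scale "$speed")" --output_file "$wav"

  # 3+4. Resample to 48 kHz stereo signed 16-bit raw PCM and
  #      stream it straight into the Snapserver FIFO.
  sox "$wav" -r 48000 -c 2 -b 16 -e signed-integer -t raw - > /tmp/snapfifo
  rm -f "$wav"
}
```

Writing raw PCM to the FIFO only works if the format matches what Snapserver was configured with, which is why the `sox` resample step is not optional.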
## 📂 File Manifest
* **`speak_direct.sh`**: Main entry point called by Home Assistant via SSH.
    * Usage: `./speak_direct.sh "Text to speak" "model_name.onnx" "1.0"`
* **`test_snapcast.sh`**: Debug tool to verify the Snapcast pipe connection (generates a sine wave).
* **`play_wav_to_snapcast.sh`**: Helper utility to stream existing WAV files to the speakers.
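To illustrate what a sine-wave check like `test_snapcast.sh` can look like, here is a hedged sketch using `sox`'s `synth` effect; the tone parameters, PCM format, and FIFO path are assumptions to adjust for your Snapserver configuration:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Write two seconds of a 440 Hz test tone, in the 48 kHz stereo
# signed 16-bit raw format Snapserver expects, into the FIFO.
test_tone() {
  local fifo="${1:-/tmp/snapfifo}"
  sox -n -r 48000 -c 2 -b 16 -e signed-integer -t raw - synth 2 sine 440 > "$fifo"
}
```

If the tone plays on the speakers, the pipe connection and audio format are correct, which isolates any remaining problems to the Piper/`sox` generation side.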
## 🛠️ Integration with Home Assistant
Home Assistant triggers these scripts using a `shell_command` over SSH.
**Home Assistant YAML:**
```yaml
shell_command:
  tts_direct_piper: >
    ssh -i /config/ssh/id_rsa -o StrictHostKeyChecking=no user@192.168.20.13
    '/home/user/speech_piper/speak_direct.sh "{{ text }}" "{{ voice }}" "{{ speed }}"'
```
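A hypothetical automation that calls this `shell_command` might look like the following; the trigger entity, message text, and alias are illustrative placeholders, not part of this repository:

```yaml
automation:
  - alias: "Announce doorbell over Snapcast"
    trigger:
      - platform: state
        entity_id: binary_sensor.front_door
        to: "on"
    action:
      - service: shell_command.tts_direct_piper
        data:
          text: "Someone is at the front door"
          voice: "en_US-hal_6409-medium.onnx"
          speed: "1.0"
```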
## 📦 Models
Voices are stored in the `/data` directory (ignored by Git due to size).
Required files for each voice:
1. `voice_name.onnx` (Binary model)
2. `voice_name.onnx.json` (Config file)
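Since `/data` is not in Git, each voice's two files have to be fetched separately. A sketch of a downloader against the `rhasspy/piper-voices` repository on Hugging Face follows; the URL path layout (`lang/lang_REGION/name/quality/file`) is an assumption to check against the repo, and `lessac` in the test is just an example voice name:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build a download URL for the rhasspy/piper-voices repo on Hugging Face.
# The path layout (lang/lang_REGION/name/quality/file) is an assumption --
# check it against the repository before relying on it.
voice_url() {
  local lang="$1" region="$2" name="$3" quality="$4" file="$5"
  printf 'https://huggingface.co/rhasspy/piper-voices/resolve/main/%s/%s_%s/%s/%s/%s' \
    "$lang" "$lang" "$region" "$name" "$quality" "$file"
}

# Fetch both files a voice needs: the .onnx model and its .onnx.json config.
fetch_voice() {
  local url; url="$(voice_url "$@")"
  curl -fL -o "/data/$5" "$url"
  curl -fL -o "/data/$5.json" "$url.json"
}
```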
**Common Voices:**
* `en_US-hal_6409-medium` (HAL 9000 style)
* `en_US-trump-medium` (Danny - Parody)
* `en_US-picard_7399-medium` (Picard)