2026-01-09 10:28:44 +11:00

Voice Assistant Bridge (Docker Backend)

This repository contains the backend infrastructure for the local Voice Assistant system. It acts as the middleware between the ESP32 audio stream and Home Assistant.

🏗️ Architecture

graph LR
    ESP32[ESP32 Hardware] -->|MQTT Audio Stream| Bridge
    Bridge[Python Bridge] -->|HTTP Request| Whisper[Faster-Whisper Container]
    Whisper -->|Text| Bridge
    Bridge -->|MQTT Text Command| HA[Home Assistant]
    Bridge -->|MQTT Status| ESP32
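
The flow in the diagram can be sketched as a single function with the Whisper call and the MQTT publish injected as callables. This is a minimal illustration, not the code in mqtt_audio_bridge.py: the `handle_utterance` name, the `transcribe` and `publish` parameters, the `voice/status` topic, and the "processing"/"idle" payloads are all assumptions. Only `homeassistant/voice/text` comes from this README.

```python
from typing import Callable, Optional

TEXT_TOPIC = "homeassistant/voice/text"  # named in this README
STATUS_TOPIC = "voice/status"            # hypothetical; not named in this README


def handle_utterance(
    pcm_audio: bytes,
    transcribe: Callable[[bytes], str],
    publish: Callable[[str, str], None],
) -> Optional[str]:
    """Send one buffered utterance to Whisper and publish the results.

    `transcribe` stands in for the HTTP request to the Whisper container,
    `publish` for an MQTT publish taking (topic, payload).
    """
    publish(STATUS_TOPIC, "processing")  # hypothetical status payload
    text = transcribe(pcm_audio).strip()
    if text:
        # Only forward non-empty transcriptions to Home Assistant.
        publish(TEXT_TOPIC, text)
    publish(STATUS_TOPIC, "idle")  # hypothetical status payload
    return text or None
```

Passing the two hops in as callables keeps the routing logic testable without a broker or a running Whisper container.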

🧩 Components

1. docker-compose.yml

Orchestrates two containers:

  • voice-bridge: The logic handler. Subscribes to the MQTT audio stream, buffers it, detects the wake word with OpenWakeWord, and routes the results.
  • whisper-api: A lightweight Flask API wrapping faster-whisper for Speech-to-Text conversion.

2. mqtt_audio_bridge.py

The main Python script running inside the voice-bridge container.

  • Input: Subscribes to raw PCM audio on the voice/audio_stream topic (MQTT broker at host .13).
  • Processing:
    • Uses OpenWakeWord to detect "Hey Jarvis".
    • Buffers audio and sends it to the Whisper API.
    • Applies safety mechanisms (memory limits, log rotation).
  • Output: Publishes transcribed text to the homeassistant/voice/text topic (MQTT broker at host .30).
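
The buffering and memory-limit steps above might look roughly like the following. It is a sketch under stated assumptions, not the actual script: the `BoundedAudioBuffer` and `pcm_to_wav` names are hypothetical, and the audio format (16 kHz, 16-bit, mono) is assumed because this README does not state it.

```python
import io
import wave
from collections import deque

SAMPLE_RATE = 16_000  # assumed: 16 kHz
SAMPLE_WIDTH = 2      # assumed: 16-bit samples (2 bytes)
CHANNELS = 1          # assumed: mono


class BoundedAudioBuffer:
    """Collects raw PCM chunks and drops the oldest ones past a byte limit."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self._chunks = deque()  # queue of bytes objects, oldest first
        self._size = 0

    def append(self, chunk: bytes) -> None:
        self._chunks.append(chunk)
        self._size += len(chunk)
        # Enforce the memory limit by discarding the oldest audio first.
        # A single chunk larger than the limit is discarded entirely.
        while self._size > self.max_bytes and self._chunks:
            self._size -= len(self._chunks.popleft())

    def __len__(self) -> int:
        return self._size

    def drain(self) -> bytes:
        """Return all buffered audio and reset the buffer."""
        data = b"".join(self._chunks)
        self._chunks.clear()
        self._size = 0
        return data


def pcm_to_wav(pcm: bytes) -> bytes:
    """Wrap raw PCM in a WAV container so it can be sent as a file."""
    out = io.BytesIO()
    with wave.open(out, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm)
    return out.getvalue()
```

In an MQTT message callback, each payload would go through `append`, and once the wake word has fired and the utterance has ended, `pcm_to_wav(buffer.drain())` yields the bytes to send to the Whisper API.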

3. app.py

The Whisper API endpoint.

  • Model: small.en, loaded with int8 quantization for CPU inference.
  • Language: Locked to English to reduce hallucinated transcriptions on static or silence.
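
A Flask endpoint with these settings could be sketched as follows. This is an illustration rather than the contents of app.py: the `/transcribe` route, the `audio` form field, and the helper names are assumptions. The `WhisperModel` arguments mirror the model and quantization listed above.

```python
import io


def transcribe_options() -> dict:
    """Decoding options; pinning the language skips auto-detection."""
    return {"language": "en"}


def collect_text(segments) -> str:
    """Join the text of the transcription segments into one string."""
    return " ".join(seg.text.strip() for seg in segments).strip()


def create_app():
    # Imported lazily so the helpers above work without these packages.
    from faster_whisper import WhisperModel
    from flask import Flask, jsonify, request

    app = Flask(__name__)
    # Load the model once at startup, not on every request.
    model = WhisperModel("small.en", device="cpu", compute_type="int8")

    @app.route("/transcribe", methods=["POST"])  # hypothetical route name
    def transcribe():
        upload = request.files.get("audio")  # hypothetical form field name
        if upload is None:
            return jsonify(error="missing 'audio' file"), 400
        segments, _info = model.transcribe(
            io.BytesIO(upload.read()), **transcribe_options()
        )
        return jsonify(text=collect_text(segments))

    return app
```

Loading the model inside `create_app` keeps the cost of reading the weights out of the request path.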

🚀 Deployment

  1. Requirements: Docker & Docker Compose.
  2. Configuration: Update the MQTT broker IP addresses in mqtt_audio_bridge.py to match your network.
  3. Run:
    docker compose up -d
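
As an alternative to editing addresses in the script for step 2, the broker settings could be read from environment variables with a fallback. This is a hedged sketch of that idea and not how mqtt_audio_bridge.py is currently written: the `BrokerConfig` class, the `broker_from_env` helper, and the variable names are hypothetical, and the default hosts are documentation placeholder addresses, not the real brokers.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class BrokerConfig:
    host: str
    port: int = 1883  # standard unencrypted MQTT port


def broker_from_env(prefix: str, default_host: str) -> BrokerConfig:
    """Read <PREFIX>_HOST and <PREFIX>_PORT, falling back to defaults."""
    host = os.environ.get(f"{prefix}_HOST", default_host)
    port = int(os.environ.get(f"{prefix}_PORT", "1883"))
    return BrokerConfig(host=host, port=port)


# Placeholder addresses from the 192.0.2.0/24 documentation range;
# replace them with the addresses of your own brokers.
AUDIO_BROKER = broker_from_env("AUDIO_BROKER", "192.0.2.13")
HA_BROKER = broker_from_env("HA_BROKER", "192.0.2.30")
```

With this layout the addresses could be set per deployment in the compose file's environment section without rebuilding the image.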
    

🔧 Debugging

View Logs:

docker compose logs -f voice-bridge

Restart Stack:

docker compose restart
Description
Voice Assistant uses this. There is a copy of this in the voice assistant.