Add Wyoming voice stack to pyinfra + landscape doc

- Move piper-compose.yaml / whisper-compose.yaml from repo root into
  pyinfra/framework/compose/{piper,whisper}.yml; bind-mount paths now
  point at /srv/docker/{piper,whisper}/data on the host.
- deploy.py registers both stacks and provisions the data dirs.
- Homepage gets a "Voice" group with informational tiles (Wyoming has
  no web UI, so tiles show container status without click-through).
- New VoiceModels.md captures the May 2026 STT/TTS landscape, why the
  current Wyoming defaults aren't SOTA, and concrete upgrade paths
  (whisper-large-v3-turbo + faster-whisper-server, Kokoro, Sesame CSM,
  F5-TTS for cloning).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 13:33:17 -04:00
parent 1816ae2458
commit 36b8cfe835
10 changed files with 292 additions and 32 deletions

@@ -73,6 +73,21 @@
        server: localhost-docker
        container: phoenix
- Voice:
    # Wyoming-protocol services have no web UI; tiles are informational
    # (container status + port). Click-through goes nowhere meaningful.
    - Whisper:
        icon: mdi-microphone-message
        description: Speech-to-text (Wyoming :10300)
        server: localhost-docker
        container: wyoming-whisper
    - Piper:
        icon: mdi-account-voice
        description: Text-to-speech (Wyoming :10200)
        server: localhost-docker
        container: wyoming-piper
- External:
    - SearXNG:
        icon: searxng.svg

@@ -21,6 +21,9 @@ layout:
  Observability:
    style: row
    columns: 3
  Voice:
    style: row
    columns: 3
  External:
    style: row
    columns: 3

@@ -0,0 +1,22 @@
# Wyoming Piper: text-to-speech over the Wyoming protocol.
# https://github.com/rhasspy/wyoming-piper
#
# Wyoming is Home Assistant's voice protocol; it's also consumable by any
# Wyoming client. No web UI: this is a protocol server on TCP :10200.
#
# Voice selection: en_US-lessac-medium is a well-balanced English voice
# (~63 MB, natural prosody); it is downloaded into /srv/docker/piper/data
# on first start. Browse alternatives at
# https://github.com/rhasspy/piper/blob/master/VOICES.md
services:
  piper:
    image: rhasspy/wyoming-piper:latest
    container_name: wyoming-piper
    restart: unless-stopped
    ports:
      - "10200:10200"
    volumes:
      - /srv/docker/piper/data:/data
    command:
      - --voice
      - en_US-lessac-medium

@@ -0,0 +1,26 @@
# Wyoming Whisper: speech-to-text over the Wyoming protocol.
# https://github.com/rhasspy/wyoming-whisper
#
# Wyoming is Home Assistant's voice protocol; it's also consumable by any
# Wyoming client. No web UI: this is a protocol server on TCP :10300.
#
# Model selection: `tiny-int8` is the smallest viable model (~75 MB),
# fast and good enough for command-style transcription. Bump to
# `base-int8` (~140 MB) or `small-int8` (~480 MB) for general dictation.
# Models are downloaded into /srv/docker/whisper/data on first start.
services:
  whisper:
    image: rhasspy/wyoming-whisper:latest
    container_name: wyoming-whisper
    restart: unless-stopped
    ports:
      - "10300:10300"
    volumes:
      - /srv/docker/whisper/data:/data
    command:
      - --model
      - tiny-int8
      - --language
      - en
      - --beam-size
      - "1"
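
Because neither service has a web UI, a quick way to confirm a stack is up is to speak the protocol directly. The sketch below is not part of this commit. It assumes the Wyoming wire format as I understand it: one JSON header line per event, optionally followed by `data_length` bytes of extra JSON and `payload_length` bytes of binary payload, with a `describe` event expected to be answered by an `info` event. The function names are hypothetical.

```python
import json
import socket


def encode_event(event_type: str) -> bytes:
    """Serialize a data-less Wyoming event as a single JSON line."""
    return (json.dumps({"type": event_type}) + "\n").encode("utf-8")


def read_event(stream) -> dict:
    """Read one event from a binary stream.

    Assumed framing: JSON header line, then optional extra JSON data
    (data_length bytes) and optional binary payload (payload_length bytes).
    """
    line = stream.readline()
    if not line:
        raise ConnectionError("connection closed before a header arrived")
    header = json.loads(line)
    data = header.get("data") or {}
    data_length = header.get("data_length") or 0
    if data_length:
        # Extra data is merged over any inline "data" from the header.
        data = {**data, **json.loads(stream.read(data_length))}
    payload_length = header.get("payload_length") or 0
    payload = stream.read(payload_length) if payload_length else b""
    return {"type": header["type"], "data": data, "payload": payload}


def describe(host: str, port: int, timeout: float = 5.0) -> dict:
    """Send a describe event and return the first event the server replies with."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(encode_event("describe"))
        with sock.makefile("rb") as stream:
            return read_event(stream)
```

Run on the Docker host, `describe("localhost", 10200)` would probe Piper and `describe("localhost", 10300)` would probe Whisper; under the assumptions above, a healthy service should return an event whose `type` is `info`.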