localgenai/pyinfra/framework/compose/homepage/services.yaml
noisedestroyers 6db46d8f6a Add OpenAI-compatible voice servers (faster-whisper + Kokoro)
Path B from VoiceModels.md — adds two new compose stacks alongside the
Wyoming pair so OpenWebUI/Conduit get voice without a Wyoming shim:

- compose/faster-whisper.yml — fedirz/faster-whisper-server CPU image,
  large-v3-turbo by default, OpenAI /v1/audio/transcriptions on :8001.
  Built-in web UI for ad-hoc transcription.
- compose/kokoro.yml — ghcr.io/remsky/kokoro-fastapi-cpu, Kokoro-82M,
  OpenAI /v1/audio/speech on :8880.
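A condensed sketch of what the two stacks might look like (the authoritative files are compose/faster-whisper.yml and compose/kokoro.yml; the internal faster-whisper-server port and the `WHISPER__MODEL` variable name are assumptions — verify against each image's docs):

```yaml
# Hypothetical condensed versions of the two compose files.
services:
  faster-whisper:
    image: fedirz/faster-whisper-server:latest-cpu
    container_name: faster-whisper
    ports:
      - "8001:8000"   # host :8001 → assumed internal :8000; /v1/audio/transcriptions
    environment:
      # Model id is an assumption; large-v3-turbo per the commit message.
      - WHISPER__MODEL=deepdml/faster-whisper-large-v3-turbo-ct2
    restart: unless-stopped

  kokoro:
    image: ghcr.io/remsky/kokoro-fastapi-cpu:latest
    container_name: kokoro
    ports:
      - "8880:8880"   # OpenAI /v1/audio/speech
    restart: unless-stopped
```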

Both run alongside (not instead of) Wyoming Whisper + Piper — Wyoming
keeps serving HA Assist, while the OpenAI-compatible pair serves
OpenWebUI / Conduit. The Strix Halo memory budget accommodates all four
voice services plus Qwen3-Coder loaded concurrently, with headroom to
spare.

Homepage gets dedicated tiles for both. README documents the
OpenWebUI Audio configuration that wires the new endpoints. Conduit
inherits voice via OpenWebUI without app-side setup.
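A sketch of that OpenWebUI Audio wiring expressed as environment variables (the `AUDIO_*` names follow OpenWebUI's documented env config as of recent releases; the voice name is an assumption — any installed Kokoro voice works, and the same settings are reachable via Admin Settings → Audio):

```yaml
services:
  openwebui:
    environment:
      - AUDIO_STT_ENGINE=openai
      - AUDIO_STT_OPENAI_API_BASE_URL=http://framework:8001/v1
      - AUDIO_STT_OPENAI_API_KEY=none        # local server; key is unused
      - AUDIO_STT_MODEL=large-v3-turbo
      - AUDIO_TTS_ENGINE=openai
      - AUDIO_TTS_OPENAI_API_BASE_URL=http://framework:8880/v1
      - AUDIO_TTS_OPENAI_API_KEY=none
      - AUDIO_TTS_MODEL=kokoro
      - AUDIO_TTS_VOICE=af_bella             # assumed Kokoro voice
```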

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 14:42:45 -04:00


# Service tiles for the localgenai stack. Edit in place — pyinfra
# ships this once and never overwrites.
#
# Widget reference: https://gethomepage.dev/widgets/
- Inference:
    - Ollama:
        icon: ollama.svg
        href: http://framework:11434
        description: Local model server (Qwen3-Coder-30B and friends)
        server: localhost-docker
        container: ollama
        widget:
          type: ollama
          url: http://framework:11434
    - llama.cpp:
        icon: si-llama
        href: http://framework:8080
        description: Vulkan-backed llama.cpp server (gfx1151)
        server: localhost-docker
        container: llama
        # No native widget; a ping check confirms liveness.
        widget:
          type: customapi
          url: http://framework:8080/health
          refreshInterval: 30000
          mappings:
            - field: status
              label: Status
    - vLLM:
        icon: mdi-server-network
        href: http://framework:8000
        description: Batched OpenAI-compatible serving (ROCm)
        server: localhost-docker
        container: vllm
- Agent UIs:
    - OpenWebUI:
        icon: open-webui.svg
        href: http://framework:3000
        description: Chat UI in front of Ollama, with SearXNG search
        server: localhost-docker
        container: openwebui
    - OpenHands:
        icon: mdi-robot
        href: http://framework:3030
        description: Autonomous coding agent in a Docker sandbox
        server: localhost-docker
        container: openhands
- Observability:
    - Beszel:
        icon: beszel.svg
        href: http://framework:8090
        description: Host + container + AMD GPU dashboard
        server: localhost-docker
        container: beszel
    - OpenLIT:
        icon: mdi-chart-line-variant
        href: http://framework:3001
        description: LLM fleet metrics (cost, tokens, latency)
        server: localhost-docker
        container: openlit
    - Phoenix:
        icon: arize-phoenix.svg
        href: http://framework:6006
        description: Per-trace agent waterfall / flamegraph
        server: localhost-docker
        container: phoenix
- Voice:
    # Wyoming-protocol services have no web UI; tiles are informational.
    # The OpenAI-compatible servers (faster-whisper, Kokoro) have UIs /
    # APIs you can hit directly.
    - Whisper (Wyoming):
        icon: mdi-microphone-message
        description: STT for Home Assistant Assist (Wyoming :10300)
        server: localhost-docker
        container: wyoming-whisper
    - Piper (Wyoming):
        icon: mdi-account-voice
        description: TTS for Home Assistant Assist (Wyoming :10200)
        server: localhost-docker
        container: wyoming-piper
    - faster-whisper:
        icon: mdi-microphone
        href: http://framework:8001
        description: STT (OpenAI API) — large-v3-turbo, used by OpenWebUI/Conduit
        server: localhost-docker
        container: faster-whisper
    - Kokoro:
        icon: mdi-account-music
        href: http://framework:8880/web
        description: TTS (OpenAI API) — Kokoro-82M, used by OpenWebUI/Conduit
        server: localhost-docker
        container: kokoro
- External:
    - SearXNG:
        icon: searxng.svg
        href: https://searxng.n0n.io
        description: Self-hosted metasearch (used by OpenWebUI + OpenCode)