Commit Graph

7 Commits

6db46d8f6a Add OpenAI-compatible voice servers (faster-whisper + Kokoro)
Path B from VoiceModels.md — adds two new compose stacks alongside the
Wyoming pair so OpenWebUI/Conduit get voice without a Wyoming shim:

- compose/faster-whisper.yml — fedirz/faster-whisper-server CPU image,
  large-v3-turbo by default, OpenAI /v1/audio/transcriptions on :8001.
  Built-in web UI for ad-hoc transcription.
- compose/kokoro.yml — ghcr.io/remsky/kokoro-fastapi-cpu, Kokoro-82M,
  OpenAI /v1/audio/speech on :8880.
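
A minimal sketch of what the two stacks look like (service names, image
tags, model setting, and the faster-whisper env var are assumptions;
only the images and host ports come from the commit):

```yaml
# compose/faster-whisper.yml (sketch)
services:
  faster-whisper:
    image: fedirz/faster-whisper-server:latest-cpu
    environment:
      # model identifier is an assumption; the commit defaults to large-v3-turbo
      WHISPER__MODEL: deepdml/faster-whisper-large-v3-turbo-ct2
    ports:
      - "8001:8000"   # host 8001 → server's internal port

# compose/kokoro.yml (sketch)
services:
  kokoro:
    image: ghcr.io/remsky/kokoro-fastapi-cpu:latest
    ports:
      - "8880:8880"   # serves OpenAI-style /v1/audio/speech
```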

Both run alongside (not instead of) Wyoming Whisper + Piper — Wyoming
keeps serving HA Assist while the OpenAI-compatible API serves
OpenWebUI / Conduit. The memory budget on Strix Halo accommodates
everything plus Qwen3-Coder loaded concurrently, with plenty of
headroom.

Homepage gets dedicated tiles for both. README documents the
OpenWebUI Audio configuration that wires the new endpoints. Conduit
inherits voice via OpenWebUI without app-side setup.
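
The OpenWebUI side of that wiring can also be set as environment
variables rather than through Admin Settings → Audio; the exact
variable names and the `framework` hostname below are assumptions
drawn from OpenWebUI's documented config surface:

```yaml
# OpenWebUI service environment (sketch)
environment:
  AUDIO_STT_ENGINE: openai
  AUDIO_STT_OPENAI_API_BASE_URL: http://framework:8001/v1
  AUDIO_TTS_ENGINE: openai
  AUDIO_TTS_OPENAI_API_BASE_URL: http://framework:8880/v1
  AUDIO_TTS_MODEL: kokoro   # model name is an assumption
```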

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 14:42:45 -04:00
36b8cfe835 Add Wyoming voice stack to pyinfra + landscape doc
- Move piper-compose.yaml / whisper-compose.yaml from repo root into
  pyinfra/framework/compose/{piper,whisper}.yml; bind paths shifted to
  /srv/docker/{piper,whisper}/data on the box.
- deploy.py registers both stacks and provisions the data dirs.
- Homepage gets a "Voice" group with informational tiles (Wyoming has
  no web UI, so tiles show container status without click-through).
- New VoiceModels.md captures the May 2026 STT/TTS landscape, why the
  current Wyoming defaults aren't SOTA, and concrete upgrade paths
  (whisper-large-v3-turbo + faster-whisper-server, Kokoro, Sesame CSM,
  F5-TTS for cloning).
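
A sketch of the relocated stacks, assuming the stock Wyoming images and
default Wyoming ports (10300 for Whisper, 10200 for Piper); model and
voice flags are illustrative:

```yaml
# compose/whisper.yml (sketch)
services:
  whisper:
    image: rhasspy/wyoming-whisper
    command: --model base-int8 --language en
    ports:
      - "10300:10300"
    volumes:
      - /srv/docker/whisper/data:/data

# compose/piper.yml (sketch)
services:
  piper:
    image: rhasspy/wyoming-piper
    command: --voice en_US-lessac-medium
    ports:
      - "10200:10200"
    volumes:
      - /srv/docker/piper/data:/data
```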

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 13:33:17 -04:00
1816ae2458 Revert OpenCode dual-export — Phoenix only
The OpenLIT secondary exporter regressed tool-call parsing in OpenCode:
OpenLIT's image doesn't currently host an OTLP receiver on 4328, so
export retries failed silently and the failures cascaded into the AI
SDK's telemetry pipeline. Symptom: model output came through as raw
Qwen3-Coder XML tool-call text instead of being parsed into actual tool
invocations.

Re-add when openlit.yml gets an otel-collector sidecar that actually
listens on the receiver ports.
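
For reference, the missing piece is roughly this collector config — an
OTLP receiver bound to the standard container-side ports (the 4328 host
mapping would stay in openlit.yml); the exporter wiring here is a
placeholder, since a real deployment would feed OpenLIT's backing store:

```yaml
# otel-collector sidecar config (sketch)
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
exporters:
  debug: {}
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [debug]
```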

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 13:19:39 -04:00
5d3fce22a1 Open OpenHands UI to all interfaces
Was bound to 127.0.0.1:3030 — overcautious on a Tailscale-only box
where Phoenix/Beszel/OpenWebUI are all reached the same way. Updated
the homepage tile description and added a security note to the README
in case the host ever leaves the tailnet.
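
The change itself is one line in the compose file; the container-side
port 3000 is an assumption based on the OpenHands image default:

```yaml
services:
  openhands:
    ports:
      # before: "127.0.0.1:3030:3000" — loopback only
      - "3030:3000"   # after: all interfaces; host is tailnet-only anyway
```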

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 12:04:35 -04:00
178d7d3c0f Add Homepage dashboard + dual-export OpenCode traces
Homepage as the front door: single page at framework:7575 with one tile
per service, live widgets where the upstream supports it (Ollama loaded
models, container state via docker.sock, etc.), bookmarks for reference
docs. Config files are pyinfra-managed — source of truth lives in
compose/homepage/, sync by editing there and re-running ./run.sh.
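
A sketch of the tile shape Homepage uses for this, following its stock
services.yaml / docker.yaml layout (group, container, and port values
are illustrative, not taken from the commit):

```yaml
# docker.yaml — lets tiles read container state via the socket
local:
  socket: /var/run/docker.sock

# services.yaml — one tile per service
- AI:
    - OpenWebUI:
        href: http://framework:3000   # port is an assumption
        description: Chat front end
        server: local
        container: openwebui
```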

OpenCode plugin now dual-exports spans to Phoenix and OpenLIT in
parallel. Phoenix remains the per-trace waterfall view; OpenLIT picks
up the same data for fleet-level metrics. Each destination has its own
batch processor so a hiccup at one doesn't block the other.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 12:00:05 -04:00
f23f7b8cc9 Add top-level README
One-page orientation: topology diagram, repo layout, service-port table,
quick-start, and the rationale notes that get glossed over in the
sub-READMEs (bandwidth ceiling, AMD APU monitoring gap, two-layer
observability).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 11:41:25 -04:00
2c4bfefa95 Initial commit: localgenai stack
Containerized local LLM stack for the Framework Desktop / Strix Halo,
plus the OpenCode harness on the Mac side.

- pyinfra/framework/: pyinfra deploy targeting the box
  - llama.cpp (Vulkan), vLLM (ROCm), Ollama (ROCm with HSA override
    for gfx1151), OpenWebUI
  - Beszel (host + container + AMD GPU dashboard via sysfs)
  - OpenLIT (LLM fleet metrics)
  - Phoenix (per-trace agent waterfall)
  - OpenHands (autonomous agent in a Docker sandbox)
- opencode/: OpenCode config + Phoenix bridge plugin (OTel exporter)
  - install.sh deploys to ~/.config/opencode/
- StrixHaloSetup.md / StrixHaloMemory.md / Roadmap.md / TODO.md:
  documentation and planning
- testing/qwen3-coder-30b/: small evaluation harness
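
The HSA override noted above is the usual workaround when ROCm lacks
kernels for a gfx target; a sketch of the Ollama service, where the
override value and volume path are assumptions:

```yaml
services:
  ollama:
    image: ollama/ollama:rocm
    devices:
      - /dev/kfd    # ROCm compute
      - /dev/dri    # GPU render nodes
    environment:
      # spoof a supported gfx target; exact value for gfx1151 is an assumption
      HSA_OVERRIDE_GFX_VERSION: "11.5.1"
    ports:
      - "11434:11434"
    volumes:
      - /srv/docker/ollama:/root/.ollama   # path is illustrative
```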

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 11:35:10 -04:00