Snapshot of where opencode + Qwen3-Coder + MCPs + Kimi-Linear + voice
+ Phoenix tracing land today, plus the in-flight items (oc-tree,
kimi-linear context ramp) and what's next (ComfyUI), with pointers to
per-project NEXT_STEPS.md guides.
apt's btop on 24.04 is 1.3.x, which has no AMD GPU monitoring. 1.4+
adds it but requires C++23, which gcc-13 (24.04 default) doesn't fully
support. Plan:
- Add ubuntu-toolchain-r/test PPA, install g++-14 (C++23-capable).
- Add librocm-smi-dev to the ROCm host-diagnostics package set — btop
  dlopens librocm_smi64 at runtime, but the headers are needed at
  compile time.
- Drop btop from the apt list; build from a pinned BTOP_VERSION tag
  with `make GPU_SUPPORT=true CXX=g++-14 -j` and install to
  /usr/local/bin.
- Idempotent — the step only rebuilds when the installed version
  doesn't match the pin.
After deploy: btop → Esc → Options → "show_gpu_info" → On to enable
the GPU panel.
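A minimal pyinfra sketch of the plan, assuming the stock apt/server
ops; the BTOP_VERSION value and the grep version guard are
illustrative, not the exact deploy.py code:

```python
from pyinfra.operations import apt, server

BTOP_VERSION = "1.4.0"  # hypothetical pin

apt.ppa(name="C++23-capable toolchain", src="ppa:ubuntu-toolchain-r/test")
apt.packages(
    name="g++-14 + ROCm SMI headers",
    packages=["g++-14", "librocm-smi-dev"],
    update=True,
)
server.shell(
    name=f"Build btop {BTOP_VERSION} with GPU support",
    commands=[
        # Idempotence guard: rebuild only when the installed version differs.
        f"/usr/local/bin/btop --version 2>/dev/null | grep -q '{BTOP_VERSION}' || ("
        "rm -rf /tmp/btop && "
        f"git clone --depth 1 --branch v{BTOP_VERSION} "
        "https://github.com/aristocratos/btop /tmp/btop && "
        "make -C /tmp/btop -j GPU_SUPPORT=true CXX=g++-14 && "
        "make -C /tmp/btop install PREFIX=/usr/local)",
    ],
)
```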
Also clean up TODO.md — the box is on 24.04 (noble), not 26.04, and
the stale libxml2 ABI mismatch / "ROCm gap" section is dropped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ubuntu 26.04 ships nvtop 3.0.2 via apt, which predates the gfx1151
sysfs detection improvements that landed in 3.2.x. Symptom: nvtop
runs but the iGPU doesn't appear.
Drop nvtop from the apt package list, add a from-source build step
that pulls a pinned NVTOP_VERSION, builds with -DAMDGPU_SUPPORT=ON,
and installs to /usr/local/bin (which wins over /usr/bin in PATH).
Idempotent: only rebuilds when the installed version doesn't match.
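Same shape as the btop step — a hedged sketch with an illustrative pin
and guard, not the literal deploy code:

```python
from pyinfra.operations import server

NVTOP_VERSION = "3.2.0"  # hypothetical pin

server.shell(
    name=f"Build nvtop {NVTOP_VERSION} with AMDGPU support",
    commands=[
        # Skip when the installed version already matches the pin.
        f"/usr/local/bin/nvtop --version 2>/dev/null | grep -q '{NVTOP_VERSION}' || ("
        "rm -rf /tmp/nvtop && "
        f"git clone --depth 1 --branch {NVTOP_VERSION} "
        "https://github.com/Syllo/nvtop /tmp/nvtop && "
        "cmake -S /tmp/nvtop -B /tmp/nvtop/build -DAMDGPU_SUPPORT=ON && "
        "cmake --build /tmp/nvtop/build -j && "
        "cmake --install /tmp/nvtop/build)",  # default prefix is /usr/local
    ],
)
```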
Run `sudo nvtop` to see container processes — non-root users only
see their own /proc/<pid>/fdinfo entries.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The kokoro-fastapi image runs as UID 1000 and downloads models into
/app/api/src/models on first start. Our 2775 root:docker permissions
on the host dir weren't writable by that UID (the container user isn't
in the docker group). Symptom: PermissionError in download_model.py
and the container crashloops.
Chown the host dir to 1000:1000 to match the image's user.
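In pyinfra terms the fix is one op; the host path here is an
assumption patterned on the other voice stacks, not the literal
deploy.py value:

```python
from pyinfra.operations import files

files.directory(
    name="Kokoro model dir owned by the image's UID",
    path="/srv/docker/kokoro/data",  # assumed bind path
    user="1000",
    group="1000",
    recursive=True,  # chown existing contents too
)
```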
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Path B from VoiceModels.md — adds two new compose stacks alongside the
Wyoming pair so OpenWebUI/Conduit get voice without a Wyoming shim:
- compose/faster-whisper.yml — fedirz/faster-whisper-server CPU image,
large-v3-turbo by default, OpenAI /v1/audio/transcriptions on :8001.
Built-in web UI for ad-hoc transcription.
- compose/kokoro.yml — ghcr.io/remsky/kokoro-fastapi-cpu, Kokoro-82M,
OpenAI /v1/audio/speech on :8880.
Both run alongside (not instead of) Wyoming Whisper + Piper — Wyoming
keeps serving HA Assist while the OpenAI-compatible APIs serve
OpenWebUI / Conduit (a smoke test against both endpoints is sketched
below). The memory budget on Strix Halo accommodates everything, with
Qwen3-Coder loaded concurrently and plenty of headroom.
Homepage gets dedicated tiles for both. README documents the
OpenWebUI Audio configuration that wires the new endpoints. Conduit
inherits voice via OpenWebUI without app-side setup.
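A hypothetical smoke test for the two endpoints — host, model, and
voice names are assumptions, not the shipped config:

```python
from openai import OpenAI

# STT: faster-whisper-server on :8001
stt = OpenAI(base_url="http://framework:8001/v1", api_key="none")
with open("clip.wav", "rb") as audio:
    text = stt.audio.transcriptions.create(
        model="Systran/faster-whisper-large-v3-turbo", file=audio
    ).text

# TTS: kokoro-fastapi on :8880, speaking the transcript back
tts = OpenAI(base_url="http://framework:8880/v1", api_key="none")
tts.audio.speech.create(
    model="kokoro", voice="af_bella", input=text
).write_to_file("reply.mp3")
```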
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Move piper-compose.yaml / whisper-compose.yaml from the repo root
  into pyinfra/framework/compose/{piper,whisper}.yml; bind-mount paths
  move to /srv/docker/{piper,whisper}/data on the box.
- deploy.py registers both stacks and provisions the data dirs
  (sketched after this list).
- Homepage gets a "Voice" group with informational tiles (Wyoming has
no web UI, so tiles show container status without click-through).
- New VoiceModels.md captures the May 2026 STT/TTS landscape, why the
current Wyoming defaults aren't SOTA, and concrete upgrade paths
(whisper-large-v3-turbo + faster-whisper-server, Kokoro, Sesame CSM,
F5-TTS for cloning).
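A hedged sketch of the deploy.py side — the real code likely goes
through a shared compose-stack helper; op names are pyinfra's, paths
are from this commit:

```python
from pyinfra.operations import files, server

for svc in ("piper", "whisper"):
    files.directory(name=f"{svc} data dir", path=f"/srv/docker/{svc}/data")
    files.put(
        name=f"{svc} compose file",
        src=f"pyinfra/framework/compose/{svc}.yml",
        dest=f"/srv/docker/{svc}/compose.yml",
    )
    server.shell(
        name=f"{svc} stack up",
        commands=[f"docker compose -f /srv/docker/{svc}/compose.yml up -d"],
    )
```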
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Was bound to 127.0.0.1:3030 — overcautious on a Tailscale-only box
where Phoenix/Beszel/OpenWebUI are all reached the same way. Updated
the homepage tile description and added a security note to the README
in case the host ever leaves the tailnet.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Homepage as the front door: a single page at framework:7575 with one
tile per service, live widgets where the upstream supports them
(Ollama loaded models, container state via docker.sock, etc.), and
bookmarks for reference docs. Config files are pyinfra-managed — the
source of truth lives in compose/homepage/; sync by editing there and
re-running ./run.sh.
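The sync step amounts to something like this (the remote config path
is an assumption):

```python
from pyinfra.operations import files

files.sync(
    name="Homepage config",
    src="compose/homepage",
    dest="/srv/docker/homepage/config",  # assumed bind path
)
```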
The OpenCode plugin now dual-exports spans to Phoenix and OpenLIT in
parallel. Phoenix remains the per-trace waterfall view; OpenLIT picks
up the same data for fleet-level metrics. Each destination has its own
batch processor, so a hiccup at one doesn't block the other.
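The plugin itself is OpenCode-side TypeScript; the same dual-export
pattern in OTel's Python SDK looks like this, with assumed endpoints:

```python
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

provider = TracerProvider()
# One BatchSpanProcessor per destination: each keeps its own queue and
# export thread, so a stalled endpoint can't back up the other.
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://framework:6006/v1/traces"))  # Phoenix
)
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://framework:4318/v1/traces"))  # OpenLIT
)
```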
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>