noisedestroyers 228fe8d1ac Build btop 1.4 from source with AMD GPU support
apt's btop on 24.04 is 1.3.x, which has no AMD GPU monitoring. 1.4+
adds it but requires C++23, which gcc-13 (24.04 default) doesn't fully
support. Plan:

- Add ubuntu-toolchain-r/test PPA, install g++-14 (C++23-capable).
- Add librocm-smi-dev to ROCm host diagnostics — btop dlopens
  librocm_smi64 at runtime; the headers are needed at compile time.
- Drop btop from apt list, build from a pinned BTOP_VERSION tag with
  GPU_SUPPORT=true CXX=g++-14 -j; install to /usr/local/bin.
- Idempotent — only rebuilds if installed version doesn't match.

After deploy: btop → Esc → Options → "show_gpu_info" → On to enable
the GPU panel.
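The plan above boils down to roughly this (a sketch only: the version value, clone URL, and build paths are illustrative assumptions; the authoritative logic lives in the pyinfra deploy):

```shell
# Sketch of the idempotent build: only rebuild when the installed
# version doesn't match the pinned one. Values here are illustrative.
BTOP_VERSION=1.4.0
if ! /usr/local/bin/btop --version 2>/dev/null | grep -q "$BTOP_VERSION"; then
    git clone --depth 1 --branch "v${BTOP_VERSION}" \
        https://github.com/aristocratos/btop.git /tmp/btop
    make -C /tmp/btop GPU_SUPPORT=true CXX=g++-14 -j"$(nproc)"
    sudo make -C /tmp/btop install    # installs under /usr/local by default
fi
```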

Also clean up TODO.md — the box is on 24.04 (noble), not 26.04. The
libxml2 ABI mismatch / "ROCm gap" section was stale.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 16:34:52 -04:00

localgenai

Local LLM stack on a Framework Desktop (AMD Ryzen AI Max+ 395 / "Strix Halo", Radeon 8060S, 128 GB unified LPDDR5x), driven from a Mac over Tailscale. Coding agents, monitoring, voice — all self-hosted.

Topology

┌─────────────────┐    Tailscale     ┌──────────────────────────────┐
│  Mac            │ ───────────────▶ │  Framework Desktop           │
│                 │                  │  (Ubuntu 24.04, gfx1151 iGPU)│
│  • OpenCode     │                  │  • Inference: Ollama,        │
│    + Phoenix    │                  │    llama.cpp, vLLM           │
│    bridge       │                  │  • Agent UIs: OpenWebUI,     │
│  • install.sh   │                  │    OpenHands                 │
│                 │                  │  • Monitoring: Beszel,       │
│                 │                  │    OpenLIT, Phoenix          │
└─────────────────┘                  └──────────────────────────────┘

Repo layout

Path                      What's there
pyinfra/                  pyinfra deploy targeting the Framework Desktop. Host setup (kernel params, Docker, AMD driver bits) plus all docker-compose files for the services below. See pyinfra/framework/README.md.
opencode/                 OpenCode config + Phoenix bridge plugin. install.sh deploys it to ~/.config/opencode/ on a Mac. Wires up Playwright / SearXNG MCP and ships every prompt's spans to Phoenix.
StrixHaloSetup.md         Original phased bring-up plan for the box.
StrixHaloMemory.md        UMA / GTT / bandwidth notes — "what fits, what's slow, why."
Roadmap.md                What to build on top of the stack to narrow the gap with hosted Claude Code / Cowork / Skills / Design.
TODO.md                   Open questions and follow-ups.
testing/qwen3-coder-30b/  Small evaluation harness for Qwen3-Coder.
VoiceModels.md            STT / TTS landscape and upgrade paths from the Wyoming defaults.

Service ports (on framework)

Port   Service                 Notes
7575   Homepage                Front door — start here. Tile per service with live widgets.
8080   llama.cpp               Vulkan backend, --metrics for Prometheus
8000   vLLM                    ROCm; gfx1151 support varies
11434  Ollama                  ROCm with HSA_OVERRIDE_GFX_VERSION=11.0.0
3000   OpenWebUI               ChatGPT-style UI in front of Ollama
3001   OpenLIT                 LLM fleet metrics dashboard
3030   OpenHands               Autonomous agent + sandbox runtime — Tailscale-only by design
4317   Phoenix OTLP/gRPC       Trace ingestion
6006   Phoenix UI / OTLP/HTTP  Per-trace agent waterfall (also :6006/v1/traces)
8001   faster-whisper          STT (OpenAI API) — large-v3-turbo, for OpenWebUI/Conduit
8090   Beszel                  Host + container + AMD GPU dashboard
8880   Kokoro                  TTS (OpenAI API) — Kokoro-82M, for OpenWebUI/Conduit
10200  Piper                   TTS (Wyoming protocol) — for Home Assistant Assist
10300  Whisper                 STT (Wyoming protocol) — for Home Assistant Assist
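A quick way to sanity-check the HTTP services above from the Mac (assumes the box resolves as framework over Tailscale; the two Wyoming ports speak a raw TCP protocol, not HTTP, so they are skipped here):

```shell
# Probe each HTTP port. Any numeric status means something is listening;
# curl prints 000 when the port is unreachable.
for p in 7575 8080 8000 11434 3000 3001 3030 6006 8001 8090 8880; do
  code=$(curl -s -o /dev/null -m 2 -w '%{http_code}' "http://framework:$p/")
  printf '%-6s %s\n' "$p" "$code"
done
```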

Quick start

On the box (after the pyinfra deploy):

cd /srv/docker/ollama && docker compose up -d
cd /srv/docker/phoenix && docker compose up -d
cd /srv/docker/beszel && docker compose up -d   # then add system per beszel.yml
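The ollama compose service amounts to roughly this docker run (a sketch: the image tag, volume name, and device list are assumptions; the authoritative compose file lives under pyinfra/). The HSA override matches the one in the ports table:

```shell
# ROCm needs the kernel fusion driver and DRM render nodes passed through;
# HSA_OVERRIDE_GFX_VERSION=11.0.0 makes the ROCm runtime treat the gfx1151
# iGPU as the supported gfx1100 target.
docker run -d --name ollama \
  --device /dev/kfd --device /dev/dri \
  -e HSA_OVERRIDE_GFX_VERSION=11.0.0 \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  ollama/ollama:rocm
```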

On the Mac:

cd opencode && ./install.sh
opencode

Send a prompt; watch it land in Phoenix at http://framework:6006.
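To verify ingestion without OpenCode in the loop, a bare OTLP/HTTP trace can be posted straight at the :6006/v1/traces endpoint from the ports table. The IDs, timestamps, and service name below are illustrative:

```shell
# One hand-rolled span in OTLP JSON; it should appear in the Phoenix UI
# under service "smoke-test".
now=$(date +%s)000000000   # nanoseconds since epoch, second resolution
curl -s -X POST http://framework:6006/v1/traces \
  -H 'Content-Type: application/json' \
  -d '{"resourceSpans":[{"resource":{"attributes":[{"key":"service.name",
       "value":{"stringValue":"smoke-test"}}]},"scopeSpans":[{"spans":[{
       "name":"ping","traceId":"0af7651916cd43dd8448eb211c80319c",
       "spanId":"b7ad6b7169203331","kind":1,
       "startTimeUnixNano":"'"$now"'","endTimeUnixNano":"'"$now"'"}]}]}]}'
```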

Why this stack

  • Bandwidth, not VRAM, is the ceiling on Strix Halo. 256 GB/s memory bandwidth means MoE models with small active-params (Qwen3-Coder-30B-A3B, Kimi-Linear-48B-A3B) dominate: at Q4, a dense 30B streams roughly 17 GB of weights per token (~15 tok/s ceiling), while a 3B-active MoE streams roughly 2 GB (~130 tok/s). See StrixHaloMemory.md.
  • AMD APU monitoring is broken upstream. amd-smi returns N/A on gfx1151, which kills the official Prometheus/Grafana exporters. Beszel reads sysfs directly and works. See the monitoring section of pyinfra/framework/README.md.
  • Two-layer observability. OpenLIT is fleet metrics across sessions; Phoenix is per-trace waterfall for "what did this one prompt do."
  • Reproducible. Every host-side config lives in pyinfra/; every Mac-side config in opencode/install.sh. Re-running either is safe.
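The sysfs route Beszel takes is easy to check by hand. Assuming the iGPU shows up as card0 (the index can differ per host), these amdgpu nodes are what a working exporter would scrape:

```shell
# Instantaneous GPU load and memory use, straight from the amdgpu driver,
# no amd-smi involved.
cat /sys/class/drm/card0/device/gpu_busy_percent    # 0-100
cat /sys/class/drm/card0/device/mem_info_vram_used  # bytes of VRAM carve-out
cat /sys/class/drm/card0/device/mem_info_gtt_used   # bytes of GTT (system RAM)
```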