noisedestroyers 37b0cd9a58 Build nvtop 3.2 from source (apt's 3.0.2 doesn't detect gfx1151)
Ubuntu 26.04 ships nvtop 3.0.2 via apt, which predates the gfx1151
sysfs detection improvements that landed in 3.2.x. Symptom: nvtop
runs but the iGPU doesn't appear.

Drop nvtop from the apt package list, add a from-source build step
that pulls a pinned NVTOP_VERSION, builds with -DAMDGPU_SUPPORT=ON,
and installs to /usr/local/bin (which wins over /usr/bin in PATH).
Idempotent: only rebuilds when the installed version doesn't match.

Run `sudo nvtop` to see container processes — non-root users only
see their own /proc/<pid>/fdinfo entries.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 15:56:10 -04:00
2026-05-08 11:35:10 -04:00

localgenai

Local LLM stack on a Framework Desktop (AMD Ryzen AI Max+ 395 / "Strix Halo", Radeon 8060S, 128 GB unified LPDDR5x), driven from a Mac over Tailscale. Coding agents, monitoring, voice — all self-hosted.

Topology

┌─────────────────┐    Tailscale     ┌──────────────────────────────┐
│  Mac            │ ───────────────▶ │  Framework Desktop           │
│                 │                  │  (Ubuntu 26.04, gfx1151 iGPU)│
│  • OpenCode     │                  │  • Inference: Ollama,        │
│    + Phoenix    │                  │    llama.cpp, vLLM           │
│    bridge       │                  │  • Agent UIs: OpenWebUI,     │
│  • install.sh   │                  │    OpenHands                 │
│                 │                  │  • Monitoring: Beszel,       │
│                 │                  │    OpenLIT, Phoenix          │
└─────────────────┘                  └──────────────────────────────┘

Repo layout

| Path | What's there |
| --- | --- |
| pyinfra/ | pyinfra deploy targeting the Framework Desktop. Host setup (kernel params, Docker, AMD driver bits) plus all docker-compose files for the services below. See pyinfra/framework/README.md. |
| opencode/ | OpenCode config + Phoenix bridge plugin. install.sh deploys it to ~/.config/opencode/ on a Mac. Wires up Playwright / SearXNG MCP and ships every prompt's spans to Phoenix. |
| StrixHaloSetup.md | Original phased bring-up plan for the box. |
| StrixHaloMemory.md | UMA / GTT / bandwidth notes: "what fits, what's slow, why." |
| Roadmap.md | What to build on top of the stack to narrow the gap with hosted Claude Code / Cowork / Skills / Design. |
| TODO.md | Open questions and follow-ups. |
| testing/qwen3-coder-30b/ | Small evaluation harness for Qwen3-Coder. |
| VoiceModels.md | STT / TTS landscape and upgrade paths from the Wyoming defaults. |

Service ports (on framework)

| Port | Service | Notes |
| --- | --- | --- |
| 7575 | Homepage | Front door: start here. Tile per service with live widgets. |
| 8080 | llama.cpp | Vulkan backend, --metrics for Prometheus |
| 8000 | vLLM | ROCm; gfx1151 support varies |
| 11434 | Ollama | ROCm with HSA_OVERRIDE_GFX_VERSION=11.0.0 |
| 3000 | OpenWebUI | ChatGPT-style UI in front of Ollama |
| 3001 | OpenLIT | LLM fleet metrics dashboard |
| 3030 | OpenHands | Autonomous agent + sandbox runtime; Tailscale-only by design |
| 4317 | Phoenix OTLP/gRPC | Trace ingestion |
| 6006 | Phoenix UI / OTLP/HTTP | Per-trace agent waterfall (also :6006/v1/traces) |
| 8001 | faster-whisper | STT (OpenAI API): large-v3-turbo, for OpenWebUI/Conduit |
| 8090 | Beszel | Host + container + AMD GPU dashboard |
| 8880 | Kokoro | TTS (OpenAI API): Kokoro-82M, for OpenWebUI/Conduit |
| 10200 | Piper | TTS (Wyoming protocol), for Home Assistant Assist |
| 10300 | Whisper | STT (Wyoming protocol), for Home Assistant Assist |
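
To see which of the services listed above are actually listening, a plain TCP connect check is usually enough. The sketch below is not part of the repo: the helper names `port_open` and `check_services` are hypothetical, and it only proves that a port accepts connections, not that the service behind it is healthy.

```python
import socket

# Port-to-service map copied from the service ports table in this README.
SERVICE_PORTS = {
    7575: "Homepage",
    8080: "llama.cpp",
    8000: "vLLM",
    11434: "Ollama",
    3000: "OpenWebUI",
    3001: "OpenLIT",
    3030: "OpenHands",
    4317: "Phoenix OTLP/gRPC",
    6006: "Phoenix UI / OTLP/HTTP",
    8001: "faster-whisper",
    8090: "Beszel",
    8880: "Kokoro",
    10200: "Piper",
    10300: "Whisper",
}


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_services(host, ports=SERVICE_PORTS, timeout=1.0):
    """Map each service name to whether its port accepted a TCP connection."""
    return {name: port_open(host, port, timeout) for port, name in ports.items()}
```

From the Mac, `check_services("framework")` returns a dict of service name to `True`/`False`, assuming `framework` resolves over Tailscale as it does elsewhere in this README.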

Quick start

On the box (after the pyinfra deploy):

cd /srv/docker/ollama && docker compose up -d
cd /srv/docker/phoenix && docker compose up -d
cd /srv/docker/beszel && docker compose up -d   # then add system per beszel.yml

On the Mac:

cd opencode && ./install.sh
opencode

Send a prompt; watch it land in Phoenix at http://framework:6006.

Why this stack

  • Bandwidth, not VRAM, is the ceiling on Strix Halo. With 256 GB/s of memory bandwidth, MoE models with few active parameters (Qwen3-Coder-30B-A3B, Kimi-Linear-48B-A3B) dominate. See StrixHaloMemory.md.
  • AMD APU monitoring is broken upstream. amd-smi returns N/A on gfx1151, which kills the official Prometheus/Grafana exporters. Beszel reads sysfs directly and works. See the monitoring section of pyinfra/framework/README.md.
  • Two-layer observability. OpenLIT provides fleet metrics across sessions; Phoenix provides a per-trace waterfall for "what did this one prompt do."
  • Reproducible. Every host-side config lives in pyinfra/; every Mac-side config in opencode/install.sh. Re-running either is safe.
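
The bandwidth point in the first bullet can be made concrete with a common rule of thumb: during decoding, each generated token has to stream roughly all active weights from memory once, so tokens per second is bounded by bandwidth divided by active weight bytes. The sketch below is a back-of-the-envelope estimate, not a measurement. The 4-bit quantization and the dense 30B comparison model are assumptions made for illustration, and the bound ignores KV-cache reads, compute, and achievable (as opposed to peak) bandwidth, so real throughput will be lower.

```python
def decode_tokens_per_second(bandwidth_gb_s, active_params_billions, bits_per_weight):
    """Upper bound on decode speed when memory bandwidth is the only limit.

    Assumes every active weight is read exactly once per generated token.
    """
    gb_read_per_token = active_params_billions * bits_per_weight / 8
    return bandwidth_gb_s / gb_read_per_token


BANDWIDTH_GB_S = 256   # figure quoted in this README
BITS_PER_WEIGHT = 4    # assumption: 4-bit quantized weights

# MoE with about 3B active parameters (the "A3B" in Qwen3-Coder-30B-A3B).
moe_bound = decode_tokens_per_second(BANDWIDTH_GB_S, 3, BITS_PER_WEIGHT)

# Hypothetical dense 30B model, where all 30B parameters are active per token.
dense_bound = decode_tokens_per_second(BANDWIDTH_GB_S, 30, BITS_PER_WEIGHT)

print(f"MoE, 3B active: at most {moe_bound:.0f} tok/s")
print(f"Dense 30B:      at most {dense_bound:.0f} tok/s")
```

Under these assumptions the bounds come out to roughly 171 and 17 tokens per second: the same total parameter count is about ten times cheaper to decode when only a tenth of the weights are active per token.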