Containerized local LLM stack for the Framework Desktop / Strix Halo,
plus the OpenCode harness on the Mac side.

- pyinfra/framework/: pyinfra deploy targeting the box
  - llama.cpp (Vulkan), vLLM (ROCm), Ollama (ROCm with HSA override
    for gfx1151), OpenWebUI
  - Beszel (host + container + AMD GPU dashboard via sysfs)
  - OpenLIT (LLM fleet metrics)
  - Phoenix (per-trace agent waterfall)
  - OpenHands (autonomous agent in a Docker sandbox)
- opencode/: OpenCode config + Phoenix bridge plugin (OTel exporter)
  - install.sh deploys to ~/.config/opencode/
- StrixHaloSetup.md / StrixHaloMemory.md / Roadmap.md / TODO.md:
  documentation and planning
- testing/qwen3-coder-30b/: small evaluation harness

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
67 lines
2.7 KiB
YAML
# Beszel: host + container + GPU dashboard.
# https://beszel.dev
#
# Picked over Prometheus+Grafana for this box because:
#   - The agent's `amd_sysfs` collector reads /sys/class/drm/card*/device/
#     directly, which is the only reliable GPU metric source on Strix Halo
#     (gfx1151). AMD's amd-smi / Device Metrics Exporter return N/A for
#     util/power/temp on this APU (ROCm#6035), so the official Prometheus
#     exporter path is dead.
#   - Two containers vs six.
#
# First-time setup (WebSocket connection model, the current Beszel default):
#   1. `docker compose up -d beszel` (start the hub)
#   2. Open http://framework:8090, create the admin account
#   3. Click "Add system"; the dialog gives you a TOKEN and an SSH KEY.
#   4. Edit /srv/docker/beszel/.env (created empty by pyinfra; pyinfra
#      doesn't overwrite). Add:
#        BESZEL_TOKEN=<token-from-dialog>
#        BESZEL_KEY=ssh-ed25519 AAAA…
#   5. `docker compose up -d --force-recreate beszel-agent`
#
# Docker Compose auto-reads the sibling .env file for ${VAR} interpolation
# in the environment block below, so secrets stay out of the compose
# file (which pyinfra overwrites) but the env-var names match exactly
# what the agent expects.
#
# Why both TOKEN and KEY: TOKEN identifies which system this agent is,
# KEY authenticates the agent (the SSH key is reused as the auth secret
# in the WebSocket handshake). Rotate either by editing the .env and
# `docker compose up -d --force-recreate`.
services:
  beszel:
    image: henrygd/beszel:latest
    container_name: beszel
    restart: unless-stopped
    ports:
      - "8090:8090"
    volumes:
      - /srv/docker/beszel/data:/beszel_data

  beszel-agent:
    image: henrygd/beszel-agent:latest
    container_name: beszel-agent
    restart: unless-stopped
    # Host networking so the agent sees real CPU/memory/network counters
    # without bridge-NAT distortion.
    network_mode: host
    volumes:
      # Read-only Docker socket for per-container CPU/mem/net.
      - /var/run/docker.sock:/var/run/docker.sock:ro
      # Sysfs paths the AMD GPU collector reads.
      - /sys/class/drm:/sys/class/drm:ro
      - /sys/class/hwmon:/sys/class/hwmon:ro
    environment:
      # Pulled from /srv/docker/beszel/.env at compose-parse time.
      TOKEN: "${BESZEL_TOKEN:-}"
      KEY: "${BESZEL_KEY:-}"
      # WebSocket dial-out target: the hub on this same host. The agent
      # is on host networking, so localhost is the host machine, where
      # the hub container exposes port 8090.
      HUB_URL: "http://localhost:8090"
      # Optional fallback: legacy SSH listener for hub-initiated probing.
      # Harmless to keep; the hub only uses it if WebSocket is unreachable.
      LISTEN: "45876"
      # Enable the AMD sysfs GPU collector.
      GPU: "true"
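The header comment's point about reading GPU metrics straight from sysfs can be checked by hand. The following is a minimal Python sketch, not Beszel's collector code: the function name `read_amd_gpu_sysfs` is hypothetical, and it assumes the amdgpu driver exposes `gpu_busy_percent`, `mem_info_vram_used`, and `mem_info_vram_total` under `/sys/class/drm/card*/device/`. Which of these files actually exist depends on the kernel and the device, so missing or unreadable files are skipped rather than treated as errors.

```python
from pathlib import Path


def read_amd_gpu_sysfs(drm_root="/sys/class/drm"):
    """Return {card_name: {metric_name: int}} for cards exposing amdgpu sysfs files.

    Hypothetical helper for illustration only; the metric file names are
    assumed to be present, which may not hold on every kernel or device.
    """
    metrics = ("gpu_busy_percent", "mem_info_vram_used", "mem_info_vram_total")
    result = {}
    root = Path(drm_root)
    if not root.is_dir():
        return result
    for entry in sorted(root.iterdir()):
        # Only look at card* entries (card0, card1, ...); connector entries
        # such as card0-DP-1 also match but carry no metric files.
        if not entry.name.startswith("card"):
            continue
        device = entry / "device"
        values = {}
        for name in metrics:
            try:
                values[name] = int((device / name).read_text().strip())
            except (OSError, ValueError):
                # File missing, unreadable, or not a plain integer: skip it.
                continue
        if values:
            result[entry.name] = values
    return result


if __name__ == "__main__":
    for card, values in read_amd_gpu_sysfs().items():
        print(card, values)
```

Running this on the host (or inside the agent container, where `/sys/class/drm` is mounted read-only) gives a quick way to confirm that the sysfs files are populated before debugging the dashboard itself.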