Initial commit: localgenai stack

Containerized local LLM stack for the Framework Desktop / Strix Halo,
plus the OpenCode harness on the Mac side.

- pyinfra/framework/: pyinfra deploy targeting the box
  - llama.cpp (Vulkan), vLLM (ROCm), Ollama (ROCm with HSA override
    for gfx1151), OpenWebUI
  - Beszel (host + container + AMD GPU dashboard via sysfs)
  - OpenLIT (LLM fleet metrics)
  - Phoenix (per-trace agent waterfall)
  - OpenHands (autonomous agent in a Docker sandbox)
- opencode/: OpenCode config + Phoenix bridge plugin (OTel exporter)
  - install.sh deploys to ~/.config/opencode/
- StrixHaloSetup.md / StrixHaloMemory.md / Roadmap.md / TODO.md:
  documentation and planning
- testing/qwen3-coder-30b/: small evaluation harness

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
66
pyinfra/framework/compose/beszel.yml
Normal file
@@ -0,0 +1,66 @@
# Beszel — host + container + GPU dashboard.
# https://beszel.dev
#
# Picked over Prometheus+Grafana for this box because:
#   - The agent's `amd_sysfs` collector reads /sys/class/drm/card*/device/
#     directly, which is the only reliable GPU metric source on Strix Halo
#     (gfx1151). AMD's amd-smi / Device Metrics Exporter return N/A for
#     util/power/temp on this APU (ROCm#6035), so the official Prometheus
#     exporter path is dead.
#   - Two containers vs six.
#
# First-time setup (WebSocket connection model — current Beszel default):
#   1. `docker compose up -d beszel`  (start the hub)
#   2. Open http://framework:8090, create the admin account
#   3. Click "Add system" — the dialog gives you a TOKEN and an SSH KEY.
#   4. Edit /srv/docker/beszel/.env (created empty by pyinfra; pyinfra
#      doesn't overwrite). Add:
#        BESZEL_TOKEN=<token-from-dialog>
#        BESZEL_KEY=ssh-ed25519 AAAA…
#   5. `docker compose up -d --force-recreate beszel-agent`
#
# Docker Compose auto-reads the sibling .env file for ${VAR} interpolation
# in the environment block below — so secrets stay out of the compose
# file (which pyinfra overwrites) but the env-var names match exactly
# what the agent expects.
#
# Why both TOKEN and KEY: TOKEN identifies which system this agent is,
# KEY authenticates the agent (the SSH key is reused as the auth secret
# in the WebSocket handshake). Rotate either by editing the .env and
# `docker compose up -d --force-recreate`.
services:
  beszel:
    image: henrygd/beszel:latest
    container_name: beszel
    restart: unless-stopped
    ports:
      - "8090:8090"
    volumes:
      - /srv/docker/beszel/data:/beszel_data

  beszel-agent:
    image: henrygd/beszel-agent:latest
    container_name: beszel-agent
    restart: unless-stopped
    # Host networking so the agent sees real CPU/memory/network counters
    # without bridge-NAT distortion.
    network_mode: host
    volumes:
      # Read-only Docker socket for per-container CPU/mem/net.
      - /var/run/docker.sock:/var/run/docker.sock:ro
      # Sysfs paths the AMD GPU collector reads.
      - /sys/class/drm:/sys/class/drm:ro
      - /sys/class/hwmon:/sys/class/hwmon:ro
    environment:
      # Pulled from /srv/docker/beszel/.env at compose-parse time.
      TOKEN: "${BESZEL_TOKEN:-}"
      KEY: "${BESZEL_KEY:-}"
      # WebSocket dial-out target — the hub on this same host. The agent
      # is on host networking, so localhost is the host machine, where
      # the hub container exposes port 8090.
      HUB_URL: "http://localhost:8090"
      # Optional fallback: legacy SSH listener for hub-initiated probing.
      # Harmless to keep — hub only uses it if WebSocket is unreachable.
      LISTEN: "45876"
      # Enable the AMD sysfs GPU collector.
      GPU: "true"