folder-per-station

2026-05-07 07:37:40 -04:00
parent f27b96a68e
commit 7b594c71b1
8 changed files with 496 additions and 65 deletions
--- a/README.md
+++ b/README.md
@@ -1,68 +1,12 @@
-# pyinfra: Strix Halo bring-up
+# pyinfra

-Containerized setup for the Framework Desktop (Ryzen AI Max+ 395, Radeon
-8060S, 128 GB). The host stays minimal — kernel + driver + Docker +
-diagnostics. Inference engines (llama.cpp, vLLM, Ollama) run as docker
-compose services, each shipping its own ROCm/Vulkan stack.
+One folder per station. Each subfolder is a self-contained pyinfra
+deploy: `inventory.py`, `deploy.py`, `run.sh`, plus any compose files
+or assets that ship to the host.

-## Manual prerequisites
+| Station | Host | Notes |
+|---------|------|-------|
+| [`framework/`](framework/README.md) | `10.0.0.237` | Framework Desktop (Strix Halo, 128 GB) — local LLM box |

-1. **Phase 0** — update Framework BIOS, set GPU UMA carve-out (96 GB).
-2. **OS install** — Ubuntu Server (24.04 LTS recommended; 26.04 also works
-   but ROCm host-side support is patchy — see `../TODO.md`). Enable SSH,
-   import your laptop key, create user `noise`.
-3. The host must be reachable at `10.0.0.237` over SSH (edit `inventory.py`
-   if it moves).
-4. **NOPASSWD sudo for `noise`** — pyinfra's fact layer doesn't reliably
-   thread sudo passwords. One-time setup:
-   ```sh
-   ssh noise@10.0.0.237 'echo "noise ALL=(ALL) NOPASSWD: ALL" | sudo tee /etc/sudoers.d/noise-nopasswd && sudo chmod 440 /etc/sudoers.d/noise-nopasswd'
-   ```
-
-## Run
-
-```sh
-uv tool install pyinfra
-./run.sh           # equivalent to: pyinfra inventory.py deploy.py
-./run.sh --dry     # any extra args are forwarded to pyinfra
-```
-
-Or run it ephemerally without installing: `uvx pyinfra inventory.py deploy.py`.
-
-## What the deploy does
-
-- Base CLI: tmux, vim, htop, btop, nvtop, radeontop, uv
-- Tailscale (run `sudo tailscale up` on the box once, interactively)
-- Docker engine + compose plugin, user added to `docker` group
-- ROCm host diagnostics only (`rocminfo`, `rocm-smi`) — no full toolchain
-- `/models/<vendor>/` layout
-- `~/docker/{llama,vllm,ollama}/docker-compose.yml` dropped in,
-  not auto-started — you edit the model path then `docker compose up -d`
-
-If a previous run installed the native llama.cpp build / full ROCm /
-native Ollama, those are auto-cleaned the next time `./run.sh` runs.
-
-## After the deploy: starting an inference service
-
-```sh
-ssh noise@10.0.0.237
-sudo tailscale up                            # one-time, interactive
-
-# Drop a GGUF somewhere under /models, then:
-cd ~/docker/llama
-vim docker-compose.yml                       # edit the --model path
-docker compose up -d
-curl localhost:8080/v1/models                # smoke test
-```
-
-Same shape for `vllm` (port 8000) and `ollama` (port 11434, no model edit
-needed — Ollama serves models on demand).
-
-## Tunables
-
-Top of `deploy.py`:
-- `ROCM_VERSION` and `AMDGPU_INSTALL_DEB` — bump when AMD ships a newer
-  release. The .deb filename has a build suffix that doesn't derive from
-  the version; find it at https://repo.radeon.com/amdgpu-install/.
-
-Compose images in `compose/{llama,vllm,ollama}.yml` — pin tags here.
+To bring up a station, `cd` into its folder and run `./run.sh`. See the
+station's own README for prerequisites.
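
The folder-per-station layout described in the new README can be checked mechanically. The sketch below is illustrative only and not part of this commit: `find_stations` and `REQUIRED_FILES` are hypothetical names, and it assumes a station is any direct subfolder containing all three of `inventory.py`, `deploy.py`, and `run.sh`.

```python
from pathlib import Path

# Files the README says every station folder carries (assumed to be the full required set).
REQUIRED_FILES = ("inventory.py", "deploy.py", "run.sh")


def find_stations(root):
    """Return the sorted names of subfolders of `root` that contain every required file."""
    root = Path(root)
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir()
        and all((child / name).is_file() for name in REQUIRED_FILES)
    )


if __name__ == "__main__":
    # Hypothetical usage: list the stations found under the current directory.
    for station in find_stations("."):
        print(station)
```

Run from the repository root, this would be expected to list `framework` once that folder holds all three files; a folder missing any of them is skipped.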