localgenai/TODO.md
noisedestroyers 2c4bfefa95 Initial commit: localgenai stack
Containerized local LLM stack for the Framework Desktop / Strix Halo,
plus the OpenCode harness on the Mac side.

- pyinfra/framework/: pyinfra deploy targeting the box
  - llama.cpp (Vulkan), vLLM (ROCm), Ollama (ROCm with HSA override
    for gfx1151), OpenWebUI
  - Beszel (host + container + AMD GPU dashboard via sysfs)
  - OpenLIT (LLM fleet metrics)
  - Phoenix (per-trace agent waterfall)
  - OpenHands (autonomous agent in a Docker sandbox)
- opencode/: OpenCode config + Phoenix bridge plugin (OTel exporter)
  - install.sh deploys to ~/.config/opencode/
- StrixHaloSetup.md / StrixHaloMemory.md / Roadmap.md / TODO.md:
  documentation and planning
- testing/qwen3-coder-30b/: small evaluation harness

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 11:35:10 -04:00

# TODO
## ROCm / vLLM on Strix Halo (gfx1151)
The Framework Desktop runs **Ubuntu 26.04 LTS**; AMD only ships ROCm
7.2.3 packages for jammy (22.04) and noble (24.04). We installed the
noble repo but pulled only `rocminfo` + `rocm-smi-lib` for host-side
diagnostics — all heavy ROCm work runs in containers, which ship their
own ROCm stack. This sidesteps the host-side libxml2 ABI mismatch (noble
ships `libxml2.so.2`, 26.04 ships `libxml2.so.16`) that broke the native
HIP toolchain.
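
As a quick host-side check, the output of `rocminfo` can be scanned for the `gfx1151` ISA name. The following is a minimal sketch, not part of the deploy: the helper names `gfx_targets` and `host_sees` are hypothetical, and it assumes `rocminfo` prints agent names such as `gfx1151` somewhere in its standard output.

```python
import re
import shutil
import subprocess

# gfx ISA names look like gfx942, gfx90a, gfx1151 (digits plus optional hex letters).
GFX_PATTERN = re.compile(r"\bgfx[0-9a-f]+\b")


def gfx_targets(rocminfo_output: str) -> set[str]:
    """Return every gfx ISA name mentioned in rocminfo-style text."""
    return set(GFX_PATTERN.findall(rocminfo_output))


def host_sees(target: str = "gfx1151") -> bool:
    """Run rocminfo (if installed) and report whether `target` is listed."""
    if shutil.which("rocminfo") is None:
        return False  # diagnostics package not installed on this host
    result = subprocess.run(
        ["rocminfo"], capture_output=True, text=True, check=False
    )
    return target in gfx_targets(result.stdout)


if __name__ == "__main__":
    print("gfx1151 visible:", host_sees())
```

The same parsing function could be pointed at `rocminfo` output captured inside a container to compare what the host and the container each see.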
### Open questions
- **Does `rocm/vllm:latest` actually run on Strix Halo's iGPU?** vLLM's
AMD support officially targets datacenter cards (MI300X / gfx942).
gfx1151 (RDNA 3.5 consumer) is a different ISA. If the stock image
doesn't initialize the device, try `rocm/vllm-dev:nightly` or build
from source against ROCm 7.x with `-DAMDGPU_TARGETS=gfx1151`.
- **AMD support for 26.04:** watch `https://repo.radeon.com/amdgpu-install/<latest>/ubuntu/`
for a directory matching the box's codename. AMD historically lags
Ubuntu LTS by 6-12 months for ROCm packaging.
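
Watching the AMD repo for a new codename directory could be automated along these lines. This is a sketch under stated assumptions: it assumes the repo serves a plain HTML directory index with one `<a href="name/">` link per subdirectory, and the names `listed_codenames`, `has_codename`, and `fetch_index` are hypothetical. The index URL is passed in by the caller, since the `<latest>` path segment has to be filled in by hand.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class _DirLinks(HTMLParser):
    """Collect single-segment directory links (href="name/") from an index page."""

    def __init__(self) -> None:
        super().__init__()
        self.dirs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        name = href[:-1]
        # Keep "noble/" style entries; skip files, "../", and absolute URLs.
        if href.endswith("/") and name and "/" not in name and not name.startswith("."):
            self.dirs.append(name)


def listed_codenames(index_html: str) -> list[str]:
    """Return the subdirectory names found in a directory-index HTML page."""
    parser = _DirLinks()
    parser.feed(index_html)
    return parser.dirs


def has_codename(index_html: str, codename: str) -> bool:
    """True if the index lists a directory for the given Ubuntu codename."""
    return codename in listed_codenames(index_html)


def fetch_index(url: str) -> str:
    """Download the index page; the caller supplies the full URL."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")
```

A weekly job could call `has_codename(fetch_index(url), codename)` with the box's codename and flag the first time it returns true.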
### When 26.04 ROCm packages land
If you ever want to do native ROCm work on the host (rather than via
containers):
1. Bump `ROCM_VERSION` and `AMDGPU_INSTALL_DEB` in `pyinfra/deploy.py`
to the new release.
2. Update the apt source URL path in `deploy.py` if AMD adds a new
release codename (currently hardcoded to `noble`).
3. Add a step that runs `amdgpu-install -y --usecase=rocm --no-dkms`
(the current deploy explicitly avoids this to stay slim).
4. `./run.sh`.
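
Steps 1 to 3 above amount to parameterizing the release and codename instead of hardcoding `noble`. The sketch below is illustrative only and does not reproduce the actual `deploy.py`: the `RocmRelease` class, the keyring path, and the `https://repo.radeon.com/rocm/apt/<version>` URL layout are assumptions, while the `amdgpu-install` flags are the ones given in step 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RocmRelease:
    """Hypothetical container for the values bumped in steps 1 and 2."""

    version: str   # e.g. the value of ROCM_VERSION
    codename: str  # Ubuntu codename used as the apt suite

    def apt_source_line(self, keyring: str = "/etc/apt/keyrings/rocm.gpg") -> str:
        # Assumed repo layout; check it against AMD's install docs before use.
        return (
            f"deb [arch=amd64 signed-by={keyring}] "
            f"https://repo.radeon.com/rocm/apt/{self.version} {self.codename} main"
        )


def native_install_command() -> list[str]:
    """Command from step 3: full ROCm userspace, no DKMS kernel module."""
    return ["amdgpu-install", "-y", "--usecase=rocm", "--no-dkms"]


if __name__ == "__main__":
    release = RocmRelease(version="7.2.3", codename="noble")
    print(release.apt_source_line())
    print(" ".join(native_install_command()))
```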

For container-only workflows (current default), no action is needed:
container images update independently of the host.
## Pick a coding model (StrixHaloSetup Phase 6)
Open question — research current Strix Halo benchmarks before
committing. Candidates: Qwen3-Coder, DeepSeek-Coder-V3.x, GLM-4.6,
Devstral, Kimi-K2. Track Kimi Linear separately via the weekly routine
referenced in `StrixHaloSetup.md`.