2026-05-08 11:35:10 -04:00
# TODO

## ROCm / vLLM on Strix Halo (gfx1151)
2026-05-08 16:34:52 -04:00

The Framework Desktop runs **Ubuntu 24.04 LTS (noble)**, which aligns
with AMD's ROCm 7.x packaging. The deploy installs `rocminfo` and
`librocm-smi-dev` host-side; heavier ROCm bits (full HIP toolchain,
device-mapped libraries) still run inside containers that ship their
own ROCm stack. The host stays slim by design.
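
A minimal sanity check for this split, assuming the deploy's `rocminfo` is on `PATH` and the user is in the `render`/`video` groups; `gfx1151` is the ISA string the iGPU should report, but confirm against actual output:

```shell
# Confirm the slim host stack can still see the iGPU before debugging
# anything inside a container. rocminfo talks to the kernel amdgpu driver,
# so no ROCm userspace beyond the deploy's packages is needed.
if rocminfo | grep -q 'gfx1151'; then
    echo "host runtime sees the gfx1151 iGPU"
else
    echo "gfx1151 not reported: check amdgpu driver and render/video groups" >&2
fi
```

This is a command sketch for the target machine, not something runnable off-host.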
### Open questions
- **Does `rocm/vllm:latest` actually run on Strix Halo's iGPU?** vLLM's
  AMD support officially targets datacenter cards (MI300X / gfx942).
  gfx1151 (RDNA 3.5 consumer) is a different ISA. If the stock image
  doesn't initialize the device, try `rocm/vllm-dev:nightly` or build
  from source against ROCm 7.x with `-DAMDGPU_TARGETS=gfx1151`.
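
One way to answer this empirically is a container smoke test. A sketch, assuming the standard ROCm container device mappings; `HSA_OVERRIDE_GFX_VERSION=11.0.0` (spoofing gfx1100) is a commonly reported workaround for ISAs the prebuilt kernels don't cover, not something verified on gfx1151:

```shell
# Map the iGPU into the stock image and see whether PyTorch's ROCm
# backend (exposed through the torch.cuda API) initializes the device.
docker run --rm \
  --device=/dev/kfd --device=/dev/dri \
  --group-add video --security-opt seccomp=unconfined \
  -e HSA_OVERRIDE_GFX_VERSION=11.0.0 \
  rocm/vllm:latest \
  python3 -c 'import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))'
```

If this prints `True` plus a device name, basic initialization works; actual vLLM serving may still fail on missing gfx1151 kernels.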
### If you ever want full host-side ROCm
For native ROCm work on the host (compiling HIP kernels, full toolchain):
1. Bump `ROCM_VERSION` and `AMDGPU_INSTALL_DEB` in
   `pyinfra/framework/deploy.py` to the latest release.
2. Add a step that runs `amdgpu-install -y --usecase=rocm --no-dkms`
   (currently avoided to stay slim — ~25 GB toolchain).
3. `./run.sh`.

For container-only workflows (current default), no action is needed —
container images update independently of the host.
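
For reference, steps 1–3 above roughly expand to the commands below; `VERSION` is a placeholder (take the real values from `ROCM_VERSION` / `AMDGPU_INSTALL_DEB` in `deploy.py`), and the group/re-login step is an assumption about a fresh host:

```shell
# Sketch only: substitute the pinned version from deploy.py for VERSION.
wget "https://repo.radeon.com/amdgpu-install/VERSION/ubuntu/noble/amdgpu-install_VERSION_all.deb"
sudo apt install ./amdgpu-install_VERSION_all.deb
sudo amdgpu-install -y --usecase=rocm --no-dkms   # the ~25 GB toolchain
sudo usermod -aG render,video "$USER"             # takes effect on next login
./run.sh
```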
## Pick a coding model (StrixHaloSetup Phase 6)

Open question — research current Strix Halo benchmarks before
committing. Candidates: Qwen3-Coder, DeepSeek-Coder-V3.x, GLM-4.6,
Devstral, Kimi-K2. Track Kimi Linear separately via the weekly routine
referenced in `StrixHaloSetup.md`.