How to onboard an existing host into pyinfra

There's no auto-generator that reads a running system and emits a deploy.py: not in pyinfra, not in Ansible, and not in any mainstream imperative IaC tool (the closest thing, NixOS, is covered at the end). Reverse-engineering "what's intentional vs incidental" needs human judgment. Below is the workflow that actually works.

What pyinfra gives you for free

The facts API queries live state without writing any deploy:

```shell
pyinfra @host fact deb.DebPackages           # installed packages
pyinfra @host fact apt.AptSources            # extra apt repos
pyinfra @host fact systemd.SystemdStatus     # service state
pyinfra @host fact server.Crontab            # cron
pyinfra @host fact server.Users              # users
```

Output is JSON. There's no fact → op translator, but the data is easy to grep through and tells you what's actually on the box.
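Post-processing that JSON takes only a few lines. For example, diffing the package fact against the set you intend to manage can be sketched like this (a sketch, assuming the fact prints a JSON object keyed by package name; the exact shape can vary across pyinfra versions):

```python
import json

def diff_packages(fact_json: str, desired: set) -> tuple:
    """Split live package state into (missing, unmanaged) relative to `desired`.

    Assumes fact_json is a JSON object keyed by package name, roughly what
    `pyinfra @host fact deb.DebPackages` prints for a host.
    """
    installed = set(json.loads(fact_json))
    missing = desired - installed      # in the deploy, not on the box
    unmanaged = installed - desired    # on the box, not in the deploy
    return missing, unmanaged

# Toy data standing in for real fact output:
missing, unmanaged = diff_packages(
    '{"nginx": ["1.24.0"], "htop": ["3.2.2"]}', {"nginx", "git"}
)
print(sorted(missing), sorted(unmanaged))  # ['git'] ['htop']
```

The "unmanaged" side is the interesting one during onboarding: it's the list you triage into "worth managing" vs "distro noise".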

Practical workflow

  1. Run a scanner against the live host that dumps the meaningful state — packages explicitly installed, enabled services, custom configs, dotfiles, sudoers fragments, fstab non-defaults. Skip the distro-default noise.
  2. Hand-pick the bits worth managing into a new <station>/deploy.py.
  3. Run with --dry against the live box; pyinfra's diff shows what's still drifting.
  4. Iterate until --dry reports "no changes."

One-shot scanner

```shell
ssh you@host '
echo "=== manually-installed packages ==="
apt-mark showmanual
echo "=== enabled services ==="
systemctl list-unit-files --state=enabled --type=service --no-legend | awk "{print \$1}"
echo "=== extra apt repos ==="
ls /etc/apt/sources.list.d/
echo "=== sudoers fragments ==="
ls /etc/sudoers.d/
echo "=== fstab non-default ==="
grep -vE "^#|^$|/proc|/sys|/dev/pts" /etc/fstab
echo "=== users with login shells ==="
getent passwd | awk -F: "\$3>=1000 && \$3<60000 {print \$1}"
echo "=== docker stacks ==="
ls /srv/docker/ 2>/dev/null
echo "=== home dotfiles ==="
sudo find /home -maxdepth 3 \( -name ".*rc" -o -name ".*config" \) 2>/dev/null
'
```
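Step 2 (the hand-picking) can be jump-started by parsing that output into a deploy skeleton. The sketch below is illustrative: the section markers match the script above, but the noise list is a stub you'd grow by hand, and the emitted ops still need review.

```python
# Sketch: turn the scanner's "=== section ===" output into a deploy.py skeleton.
NOISE = {"ubuntu-minimal", "ubuntu-standard", "linux-generic"}  # grow by hand

def parse_sections(text):
    """Group the scanner's stdout by its '=== name ===' markers."""
    sections, current = {}, None
    for line in text.splitlines():
        if line.startswith("=== ") and line.endswith(" ==="):
            current = line.strip("= ").strip()
            sections[current] = []
        elif line.strip() and current:
            sections[current].append(line.strip())
    return sections

def skeleton(sections):
    """Emit a first-draft deploy.py: packages minus noise, plus enabled services."""
    pkgs = [p for p in sections.get("manually-installed packages", [])
            if p not in NOISE]
    out = ["from pyinfra.operations import apt, systemd", "",
           'apt.packages(name="Base packages", packages=[']
    out += [f'    "{p}",' for p in sorted(pkgs)]
    out.append("])")
    for svc in sections.get("enabled services", []):
        out.append(
            f'systemd.service(name="{svc}", service="{svc}", '
            "running=True, enabled=True)"
        )
    return "\n".join(out)
```

Run it over the scanner's captured output, then prune: the goal is a starting file, not a finished deploy.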

That gets you ~80% of what's worth managing in roughly 30 lines of pyinfra. The remaining 20% is case-by-case (kernel cmdline tweaks, hand-edited /etc files, dotfile contents) that no scanner fully captures — you write those as you re-discover the box's quirks.
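For scale, the first cut of such a deploy might look like the sketch below. Every package, path, and service name is a placeholder; the operations (apt.packages, files.put, systemd.service) are standard pyinfra v2, and the file only runs under the pyinfra CLI, not standalone.

```python
# <station>/deploy.py -- sketch; all names and paths are placeholders.
# Run via the pyinfra CLI, e.g. `pyinfra @host deploy.py --dry`.
from pyinfra.operations import apt, files, systemd

apt.packages(
    name="Hand-picked packages",
    packages=["nginx", "htop", "git"],
    update=True,
)

files.put(
    name="sshd config",
    src="files/sshd_config",      # versioned next to deploy.py
    dest="/etc/ssh/sshd_config",
    mode="644",
)

systemd.service(
    name="nginx running + enabled",
    service="nginx",
    running=True,
    enabled=True,
)
```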

What scanners can't catch

  • The intent behind a config (why this sysctl value, why this cron entry).
  • Hand-edited fragments mixed into otherwise-default files (e.g. a custom line in /etc/ssh/sshd_config).
  • Out-of-band state: data in databases, container volumes, /var/lib/<service>.

For these, treat pyinfra as managing the bones (packages, services, users, configs) and leave the data alone. Re-deploying onto fresh hardware should give you a working system; restoring data is a separate concern.

If "scan and reproduce" is a hard requirement

The native answer is NixOS. The whole OS is declarative: nixos-generate-config reads your hardware and writes the baseline configuration, and everything you install or enable on top lives in the same declarative config, so the system genuinely is reproducible from a file. It's a different paradigm and a big switch, but the right answer if reproducible-from-a-file is core to your workflow and you aren't already deep in Ubuntu-land.