Initial commit: localgenai stack

Containerized local LLM stack for the Framework Desktop / Strix Halo,
plus the OpenCode harness on the Mac side.

- pyinfra/framework/: pyinfra deploy targeting the box
  - llama.cpp (Vulkan), vLLM (ROCm), Ollama (ROCm with HSA override for gfx1151), OpenWebUI
  - Beszel (host + container + AMD GPU dashboard via sysfs)
  - OpenLIT (LLM fleet metrics)
  - Phoenix (per-trace agent waterfall)
  - OpenHands (autonomous agent in a Docker sandbox)
- opencode/: OpenCode config + Phoenix bridge plugin (OTel exporter)
  - install.sh deploys to ~/.config/opencode/
- StrixHaloSetup.md / StrixHaloMemory.md / Roadmap.md / TODO.md: documentation and planning
- testing/qwen3-coder-30b/: small evaluation harness

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 11:35:10 -04:00
commit 2c4bfefa95
36 changed files with 5265 additions and 0 deletions
--- a/opencode/README.md
+++ b/opencode/README.md
@@ -0,0 +1,138 @@
+# opencode setup
+
+Canonical OpenCode config + Phoenix bridge plugin for the localgenai
+stack. `install.sh` deploys it to `~/.config/opencode/` on a Mac.
+
+## What's wired up
+
+- **Local model**: `framework/qwen3-coder:30b` served by Ollama on the
+  Framework Desktop, reachable over Tailscale.
+- **Playwright MCP** ([@playwright/mcp](https://github.com/microsoft/playwright-mcp)) —
+  browser automation. The model can navigate pages, click, fill forms,
+  read DOM snapshots. Closes the agentic-browsing gap.
+- **SearXNG MCP** ([mcp-searxng](https://github.com/ihor-sokoliuk/mcp-searxng)) —
+  web search via your self-hosted instance at <https://searxng.n0n.io>.
+  No external API keys, no rate-limit roulette.
+- **Phoenix bridge plugin** (`.opencode/plugin/phoenix-bridge.js`) —
+  exports OpenTelemetry spans for every LLM call, tool call, and
+  subagent invocation to the Phoenix container running on the Framework
+  Desktop. Per-prompt waterfall / flamegraph viz at
+  <http://framework:6006>.
+
+## Setup
+
+```sh
+./install.sh
+```
+
+Idempotent — re-run after editing `opencode.json` or pulling changes to
+the plugin. Each step checks before doing work. Specifically:
+
+1. Verifies Homebrew is present (won't install it for you)
+2. `brew install node uv jq sst/tap/opencode` (skips if already at latest)
+3. Pre-caches Playwright's Chromium so the first MCP call doesn't wait on a browser download
+4. `npm install` in `.opencode/plugin/` for the Phoenix bridge OTel deps
+5. Generates `~/.config/opencode/opencode.json` from the repo's
+   `opencode.json`, rewriting relative plugin paths to absolute so
+   OpenCode loads the plugin regardless of which directory it's launched
+   from
+
+Step 5 is the reason the deployed config isn't a plain symlink. The
+repo's `opencode.json` uses a relative plugin path (`./...`) so it stays
+valid in place; the deployed copy is generated with that path resolved
+to an absolute one. Edits to the repo's `opencode.json` need a re-run
+of `./install.sh` to take effect.
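
The path rewrite described in step 5 could look roughly like the sketch below. It is not the actual `install.sh` logic: the `render_config` name is hypothetical, and it assumes the plugin paths live in a top-level `plugin` array of strings in `opencode.json`.

```sh
# Hypothetical helper, not taken from install.sh. Requires jq.
# Assumes opencode.json keeps plugin paths in a top-level "plugin" array.
render_config() {
  repo_dir=$1   # absolute path of the repo checkout
  src=$2        # repo copy of opencode.json (relative plugin paths)
  dest=$3       # deployed copy (absolute plugin paths)
  mkdir -p "$(dirname "$dest")"
  # Entries starting with "./" get the repo root prepended;
  # anything else (for example an npm package name) passes through unchanged.
  jq --arg root "$repo_dir" '
    .plugin |= ((. // []) | map(
      if startswith("./") then $root + ltrimstr(".") else . end
    ))
  ' "$src" > "$dest"
}
```

Under those assumptions, `render_config "$PWD" opencode.json ~/.config/opencode/opencode.json` would regenerate the deployed copy from the repo root.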
+
+## Verify
+
+```sh
+# Local model reachable
+curl -s http://framework:11434/v1/models | jq '.data[].id'
+
+# SearXNG instance answers JSON
+curl -s 'https://searxng.n0n.io/search?q=test&format=json' | jq '.results | length'
+```
+
+Then in opencode:
+
+```
+opencode
+> /mcp        # should list playwright and searxng as connected
+> search the web for "qwen3-coder benchmarks"
+> open https://example.com and tell me the H1
+```
+
+## Phoenix tracing
+
+The plugin at `.opencode/plugin/phoenix-bridge.js` boots an OpenTelemetry
+SDK on OpenCode startup and ships every span to Phoenix on the Framework
+Desktop. With `experimental.openTelemetry: true` (already set in
+`opencode.json`), OpenCode emits Vercel AI SDK spans that Phoenix renders
+as a per-turn waterfall: user prompt → main agent's `ai.streamText` →
+each tool call (built-in + MCP) with token counts and latencies inline.
+
+The plugin uses `@opentelemetry/exporter-trace-otlp-proto` (not `-http`)
+because Phoenix's OTLP receiver only accepts protobuf; the JSON-encoded
+`-http` variant is rejected with HTTP 415 (Unsupported Media Type).
+
+Defaults can be overridden via env vars (set before launching opencode):
+
+| Variable | Default | Purpose |
+|---|---|---|
+| `PHOENIX_OTLP_ENDPOINT` | `http://framework:6006/v1/traces` | OTLP/HTTP target |
+| `PHOENIX_SERVICE_NAME` | `opencode` | Phoenix project name |
+| `PHOENIX_OTEL_DEBUG` | unset | `1` to surface OTel internal logs |
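
To preview which values a launch would pick up, the defaults in the table can be mirrored with shell parameter expansion. This is only an illustration: `phoenix_effective_config` is a hypothetical helper, and the plugin resolves these variables itself.

```sh
# Illustrative only: mirrors the defaults table above.
# The plugin does its own lookup; this just shows what it would see.
phoenix_effective_config() {
  printf 'endpoint=%s serviceName=%s debug=%s\n' \
    "${PHOENIX_OTLP_ENDPOINT:-http://framework:6006/v1/traces}" \
    "${PHOENIX_SERVICE_NAME:-opencode}" \
    "${PHOENIX_OTEL_DEBUG:-unset}"
}
```

For example, running `export PHOENIX_SERVICE_NAME=opencode-dev` before `phoenix_effective_config` should report `serviceName=opencode-dev`, and a subsequent `opencode` launch from the same shell inherits that value.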
+
+### Verifying
+
+```sh
+: > /tmp/phoenix-bridge.log    # truncate prior runs
+opencode                       # any directory; CWD doesn't matter
+tail -f /tmp/phoenix-bridge.log    # in a second terminal, while opencode is running
+```
+
+Healthy startup looks like:
+
+```
+plugin function entered
+endpoint=http://framework:6006/v1/traces serviceName=opencode
+OTel imports resolved
+sdk.start() returned
+tracer obtained
+boot span emitted (will flush within ~5s)
+```
+
+Then open <http://framework:6006/projects> — an `opencode` project should
+appear with at least one `phoenix-bridge.boot` span. Send a prompt in
+OpenCode and real LLM-call traces follow.
+
+If the plugin's deps aren't installed, OpenCode logs a warning and the
+plugin no-ops — the rest of OpenCode still works fine.
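
The healthy-startup check can also be scripted instead of eyeballed. The sketch below greps the log for the markers listed above; `bridge_log_healthy` is a hypothetical name, and the marker strings are copied from the sample output, so they would need updating if the plugin's log wording changes.

```sh
# Hypothetical helper: succeeds only if every startup marker is in the log.
# Marker text is copied from the sample output in this README.
bridge_log_healthy() {
  log=${1:-/tmp/phoenix-bridge.log}
  for marker in \
    'plugin function entered' \
    'OTel imports resolved' \
    'sdk.start() returned' \
    'tracer obtained' \
    'boot span emitted'
  do
    # -F matches the marker literally, so "sdk.start()" is not a regex.
    if ! grep -qF -- "$marker" "$log" 2>/dev/null; then
      echo "missing: $marker" >&2
      return 1
    fi
  done
  echo "phoenix-bridge startup looks healthy"
}
```

A missing marker points at the step that failed; a missing log file is reported as the first marker being absent.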
+
+### Known limitations
+
+- **Subagent nesting is best-effort.** The plugin opens a parent span
+  per session and tries to stitch child sessions (Task-tool subagents)
+  under their parent, but Vercel AI SDK spans live in their own OTel
+  trace context. Until [sst/opencode#6142](https://github.com/sst/opencode/issues/6142)
+  exposes `sessionID` in the `chat.system.transform` hook, child-session
+  spans may show as separate traces in Phoenix.
+- **Console output from plugins is swallowed by OpenCode's TUI.** That's
+  why init progress goes to `/tmp/phoenix-bridge.log` rather than stdout.
+
+## Notes
+
+- **SearXNG JSON output** must be enabled on the instance for the MCP
+  server to work. If `format=json` returns HTML or 403, edit
+  `settings.yml` on the SearXNG box: `search.formats: [html, json]`,
+  restart.
+- **Playwright first-run** downloads ~200 MB of browser binaries into
+  `~/Library/Caches/ms-playwright/` unless step 3 of `install.sh` has
+  already cached them. Later runs reuse the cache and skip the download.
+- **Tool-calling reliability** with Qwen3-Coder is decent but not
+  Claude-grade. If a tool call hangs or returns malformed JSON, suspect
+  the model before the MCP server: try the same prompt against a hosted
+  Claude or GPT-5 first, and debug the server only if the failure
+  reproduces there.
+- **Adding more MCP servers**: drop another entry under the `mcp` key
+  using the same `type/command/enabled` shape. The
+  [official MCP registry](https://registry.modelcontextprotocol.io/)
+  and [Awesome MCP Servers](https://mcpservers.org/) catalog options.
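
As a sketch of the "drop another entry under the `mcp` key" note, the helper below appends a server entry with jq. Everything named here is an assumption for illustration: `add_mcp_server` is hypothetical, `example-mcp-server` is a placeholder package, and `"local"` as the `type` value is assumed rather than taken from this repo's `opencode.json`.

```sh
# Hypothetical helper; requires jq.
# Usage: add_mcp_server CONFIG NAME COMMAND [ARGS...]
# Assumes entries follow the type/command/enabled shape, with type "local".
add_mcp_server() {
  config=$1
  name=$2
  shift 2
  # Encode the remaining arguments as a JSON array of strings.
  cmd_json=$(printf '%s\n' "$@" | jq -R . | jq -s -c .)
  out_file=$(mktemp)
  # Write to a temp file first so a jq failure leaves the config untouched.
  jq --arg name "$name" --argjson cmd "$cmd_json" \
    '.mcp[$name] = {type: "local", command: $cmd, enabled: true}' \
    "$config" > "$out_file" && mv "$out_file" "$config"
}
```

For instance, `add_mcp_server opencode.json example npx -y example-mcp-server` would add an `example` entry to the repo copy; re-running `./install.sh` then regenerates the deployed config.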