opencode setup

Canonical OpenCode config + Phoenix bridge plugin for the localgenai stack. install.sh deploys it to ~/.config/opencode/ on a Mac.

What's wired up

  • Local models: two providers, manually switched via /model.

    • framework/qwen3-coder:30b — Qwen3-Coder 30B-A3B via Ollama, the daily-driver coding model. 128K context, served on port 11434.
    • framework-vllm/kimi-linear — Kimi-Linear 48B-A3B via vLLM, the long-context play (hybrid KDA/MLA, MoE 3B active). 32K context for now (ramps further in P3 of the kimi-linear roadmap), served on port 8000. Tools are disabled (tool_call: false) — Kimi-Linear is a research architecture release and isn't strongly tool-trained; the model knows the Kimi-K2 tool tokens but emits unstructured output when given an MCP toolbox. Use it for chat / long-context reasoning; switch to framework/qwen3-coder:30b for agentic work.
  • Playwright MCP (@playwright/mcp) — browser automation. The model can navigate pages, click, fill forms, read DOM snapshots. Closes the agentic-browsing gap.

  • SearXNG MCP (mcp-searxng) — web search via your self-hosted instance at https://searxng.n0n.io. No external API keys, no rate-limit roulette.

  • Serena MCP (oraios/serena) — LSP-backed semantic code navigation (find symbol, references, rename, insert before/after). Cuts the tokens a local 70B-class model burns on grep-style flailing by roughly an order of magnitude. Uses a custom trimmed context (serena-ide-trim.yml) that exposes only the 8 unique-LSP-value tools — JetBrains tools, line-level edits redundant with opencode's Edit, Serena's own memory tools (basic-memory MCP is canonical), and onboarding/meta noise are all excluded. Down from 46 raw → 41 ide-context-filtered → 8 active. Scoped to the cwd via --project-from-cwd.

  • basic-memory MCP (basicmachines-co/basic-memory) — Markdown-backed persistent memory across sessions. Storage lives in ~/Documents/obsidian/AI-memory/ (symlinked from ~/basic-memory), so notes are browsable in Obsidian's graph and search. Replaces Claude Code's auto-memory write-back, which opencode lacks natively.

  • sequential-thinking MCP (modelcontextprotocol/servers/sequentialthinking) — externalizes chain-of-thought as tool calls. Helps weaker local models stay on-plan over multi-step work; near-zero cost when not actively used.

  • github MCP (github/github-mcp-server) — GitHub repo / issue / PR / code-search access. Launched with --read-only and a narrowed --toolsets repos,issues,pull_requests,code_security allowlist. With a classic PAT (ghp_…), GitHub's auto-scope-filtering (Jan 2026) trims tools further by hiding ones whose scopes the token lacks — saves ~23k tokens of tool-list overhead, meaningful for a 70B's effective context. Requires GITHUB_PERSONAL_ACCESS_TOKEN to be exported in your shell env (not in opencode.json). Drop --read-only from opencode.json once you trust the model's tool calls.

    Note: this MCP ships disabled here, since this stack runs against a self-hosted Gitea instance rather than GitHub. (The jq one-liner after this list shows which servers the deployed config actually enables.)

  • task-master MCP (eyaltoledano/claude-task-master) — Workflow / task-gate MCP. File-based: each project gets a .taskmaster/ dir with tasks, complexity, and config — no DB, no external service. OLLAMA_BASE_URL is pre-set in opencode.json so task-master's AI features (parse-prd, expand-task) route through your framework Ollama. The npm-global install also provides a task-master CLI (task-master init to scaffold per-project). Replaces the workflow-gate role originally proposed for Archon, without Supabase.

  • Phoenix bridge plugin (.opencode/plugin/phoenix-bridge.js) — exports OpenTelemetry spans for every LLM call, tool call, and subagent invocation to the Phoenix container running on the Framework Desktop. Per-prompt waterfall / flamegraph viz at http://framework:6006.
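
A quick sanity-check of which of these servers the deployed config actually enables (a sketch, assuming each mcp entry carries the enabled flag described under Notes below):

jq '.mcp | with_entries(.value |= .enabled)' ~/.config/opencode/opencode.json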

Setup

./install.sh

Idempotent — re-run after editing opencode.json or pulling changes to the plugin. Each step checks before doing work. Specifically:

  1. Verifies Homebrew is present (won't install it for you)
  2. brew install node uv jq sst/tap/opencode (skips if already at latest)
  3. Pre-caches Playwright's chromium so the first MCP call is instant
  4. uv tool install serena-agent@latest --prerelease=allow so opencode can launch Serena as a plain serena binary on PATH (faster than re-resolving via uvx on every session)
  5. Creates ~/Documents/obsidian/AI-memory/ and symlinks ~/basic-memory to it, so basic-memory MCP writes into the Obsidian vault by default
  6. brew install github-mcp-server and warns if GITHUB_PERSONAL_ACCESS_TOKEN isn't set in your shell — the MCP needs it to authenticate
  7. npm install -g task-master-ai (workflow MCP, also exposes the task-master CLI for task-master init per project)
  8. npm install in .opencode/plugin/ for the Phoenix bridge OTel deps
  9. Generates ~/.config/opencode/opencode.json from the repo's opencode.json, rewriting relative plugin paths to absolute so OpenCode loads the plugin regardless of which directory it's launched from

Step 9 is the reason the deployed config isn't a plain symlink. The repo's opencode.json uses a relative plugin path (./...) so it stays valid in place; the deployed copy is generated with that path resolved to an absolute one. Edits to the repo's opencode.json need a re-run of ./install.sh to take effect.
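
Roughly what that generation step amounts to, as a sketch (the real install.sh may structure it differently; this assumes plugin is an array of relative path strings):

repo="$PWD"   # run from the repo checkout
jq --arg root "$repo" \
   '.plugin |= map(sub("^\\./"; $root + "/"))' \
   opencode.json > ~/.config/opencode/opencode.json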

Verify

# Local model reachable
curl -s http://framework:11434/v1/models | jq '.data[].id'

# SearXNG instance answers JSON
curl -s 'https://searxng.n0n.io/search?q=test&format=json' | jq '.results | length'
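
# Kimi-Linear vLLM endpoint answers (assumes vLLM's OpenAI-compatible
# /v1/models on port 8000, per the model notes above)
curl -s http://framework:8000/v1/models | jq '.data[].id'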

Then in opencode:

opencode
> /mcp        # should list playwright, searxng, serena, basic-memory,
              # sequential-thinking, task-master as connected (github stays
              # disabled per the note above)
> search the web for "qwen3-coder benchmarks"
> open https://example.com and tell me the H1
> use serena to find the definition of `parse_request`
> remember: this project ships its memory into the Obsidian vault
> /sequentialthinking think through the trade-offs of X vs Y
> list my recent github PRs across all repos   # needs the github MCP re-enabled
> task-master init   # then ask the model to plan tasks for this project

For parallel agents, plain tmux + git worktree is enough at the 70B's ~2-pane concurrency ceiling. A two-line zsh helper covers the "new isolated worktree → split tmux pane → start opencode" loop:

work() {
  local name="${1:?usage: work <branch-name>}"
  local wt="../$(basename "$PWD")-$name"   # sibling dir: <repo>-<branch>
  git worktree add "$wt" -b "$name" && tmux split-window -h -c "$wt" "opencode"
}
# run from inside the worktree pane; resolve the main checkout first, since
# after cd'ing out of the worktree git still needs a repo to operate from
unwork() {
  local wt="$PWD" main
  main="$(git rev-parse --path-format=absolute --git-common-dir)" || return
  cd "${main%/.git}" && git worktree remove --force "$wt"
}
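
For example (branch name illustrative):

work fix-parser    # new worktree + branch, opencode in a fresh pane
# ...iterate; once the branch is merged or abandoned, from inside that pane:
unwork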

Serena's first invocation in a project may take a few seconds — it indexes the workspace via the language server. basic-memory's first write creates the project layout under ~/Documents/obsidian/AI-memory/ which Obsidian will pick up on its next vault scan.

Phoenix tracing

The plugin at .opencode/plugin/phoenix-bridge.js boots an OpenTelemetry SDK on OpenCode startup and ships every span to Phoenix on the Framework Desktop. With experimental.openTelemetry: true (already set in opencode.json), OpenCode emits Vercel AI SDK spans that Phoenix renders as a per-turn waterfall: user prompt → main agent's ai.streamText → each tool call (built-in + MCP) with token counts and latencies inline.

The plugin uses @opentelemetry/exporter-trace-otlp-proto (not -http) because Phoenix's OTLP receiver only speaks protobuf — the JSON variant gets an HTTP 415 (Unsupported Media Type) back.
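
A quick way to confirm that behavior from the shell (this should print 415, per the above):

curl -s -o /dev/null -w '%{http_code}\n' \
  -X POST -H 'Content-Type: application/json' \
  -d '{}' http://framework:6006/v1/traces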

Spans go to Phoenix only. Earlier versions of this plugin dual-exported to OpenLIT as well, but OpenLIT's container doesn't currently host an OTLP receiver — the failing exporter cascaded into OpenCode's tool-call parsing pipeline and broke tool use. Re-enable once openlit.yml adds an otel-collector sidecar.

Defaults can be overridden via env vars (set before launching opencode):

Variable                Default                           Purpose
PHOENIX_OTLP_ENDPOINT   http://framework:6006/v1/traces   Phoenix HTTP target
PHOENIX_SERVICE_NAME    opencode                          Phoenix project name
PHOENIX_OTEL_DEBUG      unset                             1 to surface OTel internal logs
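
For example, to aim a single run at a local Phoenix (endpoint illustrative) with OTel debug logs on:

PHOENIX_OTLP_ENDPOINT=http://localhost:6006/v1/traces \
PHOENIX_OTEL_DEBUG=1 \
opencode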

Verifying

: > /tmp/phoenix-bridge.log    # truncate prior runs
opencode                       # any directory; CWD doesn't matter
tail -f /tmp/phoenix-bridge.log

Healthy startup looks like:

plugin function entered
endpoint=http://framework:6006/v1/traces serviceName=opencode
OTel imports resolved
sdk.start() returned
tracer obtained
boot span emitted (will flush within ~5s)

Then open http://framework:6006/projects — an opencode project should appear with at least one phoenix-bridge.boot span. Send a prompt in OpenCode and real LLM-call traces follow.
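
If no project shows up, rule out plain reachability before digging into the plugin (this should print 200 for the Phoenix UI):

curl -s -o /dev/null -w '%{http_code}\n' http://framework:6006/projects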

If the plugin's deps aren't installed, OpenCode logs a warning and the plugin no-ops — the rest of OpenCode still works fine.

Known limitations

  • Subagent nesting is best-effort. The plugin opens a parent span per session and tries to stitch child sessions (Task-tool subagents) under their parent, but Vercel AI SDK spans live in their own OTel trace context. Until sst/opencode#6142 exposes sessionID in the chat.system.transform hook, child-session spans may show as separate traces in Phoenix.
  • Console output from plugins is swallowed by OpenCode's TUI. That's why init progress goes to /tmp/phoenix-bridge.log rather than stdout.

Notes

  • SearXNG JSON output must be enabled on the instance for the MCP server to work. If format=json returns HTML or a 403, edit settings.yml on the SearXNG box to set search.formats: [html, json], then restart.
  • Playwright first-run downloads ~200 MB of browser binaries into ~/Library/Caches/ms-playwright/. Subsequent runs are instant.
  • Tool-calling reliability with Qwen3-Coder is decent but not Claude-grade. If a tool call hangs or returns malformed JSON, the model is the culprit, not the MCP. Worth trying the same prompt against a hosted Claude or GPT-5 to confirm before debugging the server.
  • Adding more MCP servers: drop another entry under the mcp key using the same type/command/enabled shape (a sketch follows these notes). The official MCP registry and the Awesome MCP Servers list catalog the options.
  • Tool-list bloat is real on a local 70B. Every tool description costs context. Five MCP servers exposing ~10 tools each puts the active-tool list around 50 — manageable, but adding two more full-spectrum servers (e.g. GitHub MCP at ~70 tools without scope filtering, plus Context7) starts crowding effective context. Prefer servers with toolset filtering or per-agent allow-lists in opencode.
  • basic-memory storage path. The symlink ~/basic-memory → ~/Documents/obsidian/AI-memory is created by install.sh only if ~/basic-memory doesn't already exist. If you'd previously run basic-memory before this setup, move that directory's contents into AI-memory/ first, then delete ~/basic-memory and re-run install.sh.
  • Serena PATH gotcha. uv tool install puts serena in ~/.local/bin/. If your shell rc doesn't export that, opencode won't find the binary. The script warns; fix is one line in ~/.zshrc: export PATH="$HOME/.local/bin:$PATH".
  • Serena tool trim (serena-ide-trim.yml). The custom context excludes 28 tools beyond what the built-in ide context already filters. To re-expose any of them, edit serena-ide-trim.yml and remove the entry from excluded_tools, then re-run ./install.sh. The path injection (./serena-ide-trim.yml → absolute) is handled by install.sh's jq pass at deploy time.
  • GitHub PAT. Use a classic PAT (ghp_…) — auto-scope-filtering only kicks in for classic tokens, not fine-grained ones. Without it, the GitHub MCP exposes its full ~70-tool surface, which costs ~23k tokens of context the local 70B can ill afford. Generate at https://github.com/settings/tokens with the scopes you actually want exposed.
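
As a sketch of the type/command/enabled shape from the notes above — adding a hypothetical Context7 entry to the repo's opencode.json with jq (field values and package name are illustrative; re-run ./install.sh to deploy):

jq '.mcp.context7 = {
      type: "local",
      command: ["npx", "-y", "@upstash/context7-mcp"],
      enabled: true
    }' opencode.json > opencode.json.tmp && mv opencode.json.tmp opencode.json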