localgenai/pyinfra/framework/compose/coder/README.md

# Coder — workspace manager (pilot)

Self-hosted control plane at <http://framework:7080> that creates
per-project dev containers from Terraform templates. Each workspace
gets its own container and a browser code-server with the Claude Code
extension pre-installed. There is no host-port bookkeeping (everything
tunnels through the dashboard), and idle autostop makes parked projects
give their RAM back to the inference stacks.

**Pilot status.** Evaluating against the standalone code-server stack
(`/srv/docker/code-server`, kept as-is meanwhile). Verdict after a week
of real use: does the dashboard + autostop + create-from-template earn
an always-on control plane + Postgres? If yes, the standalone stack
retires; if no, Coder goes and a thin `workspace` spawner script
replaces it.

## Bring-up

```sh
cd /srv/docker/coder && docker compose up -d
```
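
A minimal readiness poll for scripting the bring-up, as a sketch only.
It assumes the server answers on `/healthz` (Coder's plain liveness
path, as far as we know; verify against your version) and that `curl`
is installed. `wait_for_coder` is a hypothetical helper name, not
something the repo ships.

```sh
# Poll the control plane until it answers or the attempts run out.
# Usage: wait_for_coder [base-url] [tries]
wait_for_coder() {
  url=${1:-http://framework:7080}
  tries=${2:-30}
  i=0
  while [ "$i" -lt "$tries" ]; do
    # /healthz is an assumption; -f makes curl fail on HTTP error codes.
    if curl -fsS -o /dev/null --max-time 2 "$url/healthz"; then
      echo "coder is up at $url"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "coder did not answer at $url/healthz" >&2
  return 1
}
```

Run `wait_for_coder` right after `docker compose up -d`; a non-zero
exit is the cue to look at `docker compose logs coder`.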

The sibling `.env` (`DOCKER_GROUP_ID`, `CODER_ACCESS_URL`, random
Postgres password) is generated by `deploy.py` on the first `./run.sh`,
so nothing needs filling in by hand. The first visit to
<http://framework:7080> prompts you to create the admin account (pick
anything; it's local to the box).
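
If bring-up fails, a quick sanity check on the generated `.env` can
rule out a half-run deploy. The sketch below only checks the two keys
named above and assumes plain `KEY=value` lines; the Postgres password
variable is left out because its name is not stated here. `check_env`
is a hypothetical helper.

```sh
# Report which of the expected keys are missing or empty in an env file.
# Returns non-zero if any are. Values are never printed.
check_env() {
  env_file=$1
  missing=0
  for key in DOCKER_GROUP_ID CODER_ACCESS_URL; do
    # A key counts as set only if "KEY=" is followed by at least one character.
    if ! grep -q "^${key}=." "$env_file"; then
      echo "missing or empty: $key" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Usage: `check_env /srv/docker/coder/.env && echo ok`.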

> **HTTPS is required for extension panels.** VS Code webviews (the
> Claude Code panel, markdown preview, etc.) run on service workers,
> which browsers only allow in a secure context — over plain
> `http://framework:7080` the editor works but webview panels render
> blank. Fix via Tailscale Serve (real Let's Encrypt cert for the
> tailnet name; enable "HTTPS Certificates" once in the Tailscale
> admin console):
>
> ```sh
> sudo tailscale serve --bg 7080
> # then in .env: CODER_ACCESS_URL=https://framework.<tailnet>.ts.net
> docker compose up -d        # recreate server with new access URL
> # restart any existing workspace — app URLs derive from the access URL
> ```
>
> Localhost is also a secure context, so
> `ssh -L 7080:localhost:7080 framework` plus <http://localhost:7080>
> works in a pinch.
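
The `.env` edit in the HTTPS note can be scripted. This is a sketch
under the assumption that `.env` holds plain `KEY=value` lines;
`set_access_url` is a hypothetical helper, and the URL you pass is your
own tailnet name, which is not filled in here.

```sh
# Set CODER_ACCESS_URL in an env file, replacing any existing entry.
# Usage: set_access_url <env-file> <url>
set_access_url() {
  env_file=$1
  url=$2
  tmp=$(mktemp) || return 1
  # Copy every line except an existing CODER_ACCESS_URL entry...
  grep -v '^CODER_ACCESS_URL=' "$env_file" > "$tmp" || true
  # ...then append the new value (it ends up as the last line).
  printf 'CODER_ACCESS_URL=%s\n' "$url" >> "$tmp"
  # Overwrite with cat rather than mv so the file keeps its owner and mode.
  cat "$tmp" > "$env_file"
  rm -f "$tmp"
}
```

Usage: `set_access_url .env 'https://framework.<tailnet>.ts.net'`
(quoted, with the real tailnet name substituted), followed by
`docker compose up -d`. Filtering and re-appending avoids `sed`
escaping problems with the slashes in the URL.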

## Push the template (one-time + after edits)

The `coder` CLI ships inside the server image; authenticate it once:

```sh
docker compose exec coder coder login http://localhost:7080
# prints a /cli-auth URL on localhost (the container's view); from your
# own browser open http://framework:7080/cli-auth instead, copy the
# session token, paste it back
```

Then push:

```sh
docker compose exec coder coder templates push code-server \
  --directory /templates/code-server --yes
```

Template source of truth is the repo
(`pyinfra/framework/compose/coder/templates/code-server/main.tf`) —
edit there, `./run.sh`, re-push. Edits on the box get overwritten.
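
For repeated pushes, the command above can be wrapped in a small
function. A sketch, assuming each template lives in
`/templates/<name>` inside the server container (true for
`code-server` above; an assumption for any other template) and that it
is run from `/srv/docker/coder`. `push_template` is a hypothetical
name.

```sh
# Push one template through the CLI inside the server container.
# Usage: push_template [template-name]   (defaults to code-server)
push_template() {
  name=${1:-code-server}
  # The directory layout /templates/<name> is assumed, not guaranteed.
  docker compose exec coder coder templates push "$name" \
    --directory "/templates/$name" --yes
}
```

Typical loop after editing `main.tf` in the repo: `./run.sh`, then
`push_template` on the box.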

## First workspace

Dashboard → Workspaces → Create → `code-server` template → name it
after the project. Open the code-server app tile, then one-time
Claude sign-in: the extension shows an OAuth URL → open in another tab
→ approve → paste the code back. Credentials live in `~/.claude` on
the workspace's home volume and survive stop/start and rebuilds.

Set **idle autostop** under Template → Settings → Schedule (suggest
1–2 h of inactivity). Activity means an open code-server tab, an SSH
session, or a web terminal.

## What persists where

| Thing                                                  | Where                                               |
| ------------------------------------------------------ | --------------------------------------------------- |
| Workspace home (repos, `~/.claude`, extensions)        | named volume `coder-<workspace-id>-home`            |
| Control-plane state (users, templates, workspace defs) | Postgres → `/srv/docker/coder/postgres`             |
| Template source                                        | this repo, shipped to `/srv/docker/coder/templates` |

A stopped workspace's container is deleted; only the home volume
remains. Run `docker volume ls | grep coder-` to audit.
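
A slightly stricter audit, as a sketch: the plain `grep coder-` also
matches any other volume with `coder-` in its name, so this filter
keeps only names of the `coder-<workspace-id>-home` shape from the
table above. `home_volumes` is a hypothetical helper that reads volume
names on stdin.

```sh
# Keep only names matching the coder-<workspace-id>-home pattern.
# Usage: docker volume ls --format '{{.Name}}' | home_volumes
home_volumes() {
  grep -E '^coder-.+-home$'
}
```

The exit status is non-zero when no home volumes exist, which makes the
pipeline usable in an `if`.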

## Security

Same posture as OpenHands: the server container holds the docker
socket and spawns code-running containers — root-equivalent on the
box. Tailscale-only exposure is the mitigation; never forward :7080
anywhere else. Coder does have real auth (the admin account), but
treat that as defense-in-depth, not as permission to expose it.

## Notes

- Workspaces reach the sibling services via `host.docker.internal`
  (Ollama :11434, LiteLLM :4000, Phoenix :6006, ...).
- Long Claude runs: same rules as anywhere — the process lives in the
  workspace container, so it survives laptop/browser disconnects, but
  **autostop will kill an idle-looking workspace mid-run**. For
  multi-hour unattended tasks either bump the workspace's TTL in the
  dashboard or use the host-side tmux + `claude remote-control`
  pattern (framework README, "Claude Code on the box").
- The base image is `codercom/enterprise-base:ubuntu` (sudo-enabled,
  common toolchain). Per-project images are a later refinement —
  swap the `image` in main.tf or parameterize with `coder_parameter`.
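
To confirm from inside a workspace that the sibling services in the
first note are reachable, something like the following can be run in
the workspace terminal. It is a sketch: it covers only the three ports
listed above, assumes `curl` is present in the image, and treats any
HTTP response as "reachable" without checking that the service is
healthy. `check_siblings` is a hypothetical helper.

```sh
# Probe each sibling service and report reachability.
# Usage: check_siblings [host]   (defaults to host.docker.internal)
check_siblings() {
  host=${1:-host.docker.internal}
  failed=0
  for svc in ollama:11434 litellm:4000 phoenix:6006; do
    name=${svc%%:*}
    port=${svc##*:}
    # -f is deliberately omitted: any HTTP reply, even an error status,
    # proves the port is reachable from the workspace.
    if curl -s -o /dev/null --max-time 3 "http://$host:$port/"; then
      echo "ok    $name :$port"
    else
      echo "FAIL  $name :$port"
      failed=1
    fi
  done
  return "$failed"
}
```

A `FAIL` on every line usually points at name resolution for
`host.docker.internal` rather than at the individual services.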