v1 softlaunch

2026-05-05 10:39:59 -04:00
parent 6023fe5214
commit 6fd8b0317e
26 changed files with 1070 additions and 10 deletions

docs/deploy.md
# Deploying TouchBase
A small, opinionated runbook for deploying TouchBase to a single self-hosted host. v1 assumes one practice, one host, no orchestration. When that stops being enough, the path forward is straightforward — Postgres goes managed, the app+worker images go to a container registry, and the same compose file or any orchestrator runs them.
## Prerequisites
- A Linux host with Docker and Docker Compose v2+ (the `docker compose` plugin, or a standalone `docker-compose` binary)
- A real domain pointed at the host's public IP (e.g. `book.your-practice.com`)
- Ports 80 and 443 open inbound (Caddy needs both for ACME)
- Outbound to: Postgres (if external), Resend SMTP (smtp.resend.com:587), Stripe API (api.stripe.com:443)
- `git`, `pnpm`, and `node 22+` for migrations on first install (only needed once)
## What runs in production
| Service | Image | Command | Port | Notes |
|---|---|---|---|---|
| `postgres` | `postgres:16-alpine` | (default) | 5432 (internal) | Volume `pgdata` |
| `app` | `touchbase/app:latest` | `pnpm start` | 3000 (internal) | Healthchecked at `/api/health` |
| `worker` | `touchbase/app:latest` | `pnpm worker` | — | Long-lived; pg-boss handlers |
| `caddy` | `caddy:2-alpine` | (default) | 80, 443 (host) | Auto-TLS via ACME |
Mailpit is **not** in the prod profile — production uses real SMTP (Resend). Stop the dev `mailpit` container if you brought it up.
## First-time setup
```bash
# On the host
git clone <your fork> /opt/touchbase
cd /opt/touchbase
# 1. Create a real .env (see "Required env vars" below)
cp .env.example .env
$EDITOR .env
# 2. Bring up just Postgres so we can run migrations
docker-compose up -d postgres
./scripts/db-bootstrap.sh # idempotent: ensures touchbase_test exists + extensions
# in prod you only need the prod DB; trim the script if desired
# 3. Run migrations from the host (one-time)
pnpm install --frozen-lockfile
pnpm exec prisma migrate deploy
# 4. (Optional) seed an admin user — replace the seed entirely for prod, or
# do it manually via psql / a one-off script.
# Minimal admin row:
docker exec -i touchbase-postgres-1 psql -U touchbase -d touchbase_dev <<SQL
INSERT INTO "User" (id, email, name, role, "createdAt", "updatedAt")
VALUES (gen_random_uuid()::text, 'admin@your-practice.com', 'Admin', 'ADMIN', now(), now());
SQL
# 5. Point Caddy at your domain
$EDITOR caddy/Caddyfile # set the site address (replace `localhost`)
export APP_DOMAIN=book.your-practice.com
export ACME_EMAIL=you@your-practice.com
# 6. Build and start the prod stack
docker-compose --profile prod build
docker-compose --profile prod up -d
# 7. Verify
curl -sf https://book.your-practice.com/api/health
docker-compose --profile prod logs -f app worker
```
## Required env vars
For production these must all be set (in `.env` or as host env vars):
| Var | Example | Notes |
|---|---|---|
| `DATABASE_URL` | `postgresql://touchbase:STRONG_PW@postgres:5432/touchbase_dev?schema=public` | Inside compose, host is the `postgres` service. For external/managed Postgres, point at the real host. **Use a strong password.** |
| `APP_URL` | `https://book.your-practice.com` | Used in email links and Stripe `return_url` |
| `APP_TZ` | `America/Detroit` | All WorkingHours math uses this |
| `AUTH_SECRET` | (random 32 bytes, base64) | Generate with `openssl rand -base64 32`. **Different from dev.** |
| `SMTP_HOST` | `smtp.resend.com` | |
| `SMTP_PORT` | `587` | |
| `SMTP_USER` | `resend` | Literal username, per Resend's SMTP docs |
| `SMTP_PASS` | `re_…` | Resend API key |
| `SMTP_FROM` | `TouchBase <bookings@your-practice.com>` | Must be a verified Resend sender domain |
Optional (only when payments are wired):
| Var | Notes |
|---|---|
| `STRIPE_SECRET_KEY` | `sk_live_…` for prod, `sk_test_…` for staging |
| `STRIPE_PUBLISHABLE_KEY` | must match the secret key's environment (`pk_live_…` with `sk_live_…`, `pk_test_…` with `sk_test_…`) |
| `STRIPE_WEBHOOK_SECRET` | from your Stripe webhook configuration in the Stripe Dashboard, NOT the CLI |
Optional (tweakable):
| Var | Default | Notes |
|---|---|---|
| `REMINDER_LEAD_MIN` | `1440` (24h) | Minutes before each appointment to send the reminder email. Set on **both** the app and worker containers: the app (producer) reads it when scheduling the job, and the worker (handler) fires when that schedule comes due. |
If the `STRIPE_*` vars are absent, the app skips the deposit branch entirely: bookings proceed straight to CONFIRMED with no payment. If you intend to require deposits, **make sure they are set before launch**.
## Migrations
Migrations live in `prisma/migrations/`. Apply on each deploy that adds them:
```bash
docker-compose --profile prod exec app pnpm exec prisma migrate deploy
```
Prisma's `migrate deploy` is non-destructive (no resets, no prompts). It applies any unapplied migrations in order and is safe to run on every deploy.
## Where things log
Production stdout/stderr from each container is captured by Docker:
```bash
docker-compose --profile prod logs -f app
docker-compose --profile prod logs -f worker
docker-compose --profile prod logs -f caddy
docker-compose --profile prod logs -f postgres
```
For long-term retention, point Docker at a logging driver (json-file with rotation, journald, or a remote sink). Out of scope here.
## Backups
```bash
# Daily, e.g. via host cron:
docker exec touchbase-postgres-1 pg_dump -U touchbase touchbase_dev \
| gzip > /var/backups/touchbase-$(date +%F).sql.gz
# Encrypted off-host (recommended):
... | age -r <recipient> | aws s3 cp - s3://your-backups/touchbase-$(date +%F).sql.gz.age
```
Test restore quarterly. The exclusion-constraint migration depends on the `btree_gist` extension — ensure your restore target has it (the `db/init` script does, plus `pgcrypto`).
## Rollback
If a deploy breaks production:
```bash
# Roll the app + worker back to the previous image tag
docker-compose --profile prod stop app worker
git checkout <previous good commit>
docker-compose --profile prod build app worker
docker-compose --profile prod up -d app worker
```
Postgres data is unaffected (it's in a volume). **Migrations are not auto-rolled-back** — Prisma doesn't generate down-migrations. If a migration is the breaking change, write a corrective migration in code and apply it forward; only resort to manual SQL for incidents.
## Healthcheck
```bash
curl -sf https://book.your-practice.com/api/health | jq
# {
# "status": "ok",
# "version": "dev",
# "time": "2026-…",
# "checks": { "app": "ok", "db": "ok" }
# }
```
A 503 with an error in `checks.db` means the app can't reach Postgres. The Docker HEALTHCHECK on the app service polls this endpoint every 30s.
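The status logic can be sketched roughly as follows. This is a hypothetical reconstruction of the endpoint's response shape (`healthPayload` is an illustrative name, not the actual route code, which also performs the Postgres ping and caches the result):

```typescript
// Hypothetical sketch of the /api/health response logic.
type HealthChecks = { app: "ok"; db: "ok" | "error" };

function healthPayload(dbOk: boolean): { status: number; body: object } {
  const checks: HealthChecks = { app: "ok", db: dbOk ? "ok" : "error" };
  return {
    // 503 tells both monitors and Docker's HEALTHCHECK the container is unhealthy
    status: dbOk ? 200 : 503,
    body: {
      status: dbOk ? "ok" : "degraded",
      version: process.env.APP_VERSION ?? "dev",
      time: new Date().toISOString(),
      checks,
    },
  };
}
```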
## Common operations
| Task | Command |
|---|---|
| See all bookings (admin) | https://book.your-practice.com/admin/bookings |
| Run a one-off SQL query | `docker exec -it touchbase-postgres-1 psql -U touchbase -d touchbase_dev` |
| Check pg-boss queue | `docker exec touchbase-postgres-1 psql -U touchbase -d touchbase_dev -c "SELECT name, state, COUNT(*) FROM pgboss.job GROUP BY 1, 2;"` |
| Force a reminder to fire now | `UPDATE pgboss.job SET start_after = now() WHERE name='booking-reminder' AND state='created';` |
| Make a user admin | `UPDATE "User" SET role='ADMIN' WHERE email='someone@example.com';` |
| Restart just the worker | `docker-compose --profile prod restart worker` |
## What's not yet automated
- **Image registry**: `touchbase/app:latest` is local-build only. For multi-host or CI deploys, push to a registry (GHCR, ECR, etc.) and pin tags by commit SHA in compose.
- **Secret management**: `.env` on the host is fine for one-host. Beyond that, use Docker secrets, SOPS-encrypted env files, or your platform's secret store.
- **Observability**: stdout logs only. Add Sentry/GlitchTip + Pino structured logs when the practice has appetite.
- **CI**: there isn't one. Add a GitHub Actions workflow to run `pnpm test`, `pnpm lint`, `pnpm exec tsc --noEmit`, and `docker build` on every PR; tag-based release builds.
These are all "next step" items, not v1 blockers.

# 2026-05-03 — CI + Reminder Lead Configurability
> Companion to `BuildLog.md`. Predecessor: `2026-05-03-production-prep.md`.
## Milestone
CI workflow at `.github/workflows/ci.yml` runs on every push/PR — lint, typecheck, tests against a real Postgres service container, plus a parallel docker-build job that exercises the `Dockerfile`. Reminder lead time is now env-configurable via `REMINDER_LEAD_MIN` (defaults to 1440 = 24h).
## What landed
| Path | Role |
|---|---|
| `.github/workflows/ci.yml` | Two jobs: **test** (Postgres 16 service container with btree_gist + pgcrypto, prisma generate, prisma migrate deploy, lint, tsc, test) and **docker** (buildx with GHA cache, builds the runtime image without pushing). Concurrency-cancelled on superseded refs. |
| `src/lib/reminders.ts` | `REMINDER_LEAD_MS` constant replaced with `reminderLeadMs()` reading `REMINDER_LEAD_MIN` env (default 1440). Set on both web and worker so producer-side scheduling and handler agree. |
| `.env.example` | Added `REMINDER_LEAD_MIN=1440` example. |
| `docs/deploy.md` | Documented `REMINDER_LEAD_MIN` in the optional env-vars table. |
## What's verified
- `pnpm test`**92/92 green**
- `pnpm lint` — clean
- `pnpm exec tsc --noEmit` — clean
- CI workflow YAML is syntactically valid (Next.js / pnpm versions match repo). Live verification awaits a push to GitHub.
## Decisions ratified
| Decision | Resolution |
|---|---|
| CI runner | `ubuntu-latest`. Standard, fast, free for public repos and reasonable for private. |
| Postgres in CI | Service container `postgres:16` (matches our prod image), btree_gist + pgcrypto installed via a `psql` step. |
| Test DB strategy in CI | One DB (`touchbase_test`); migrations applied via `prisma migrate deploy`; tests run against the same connection. Reason: simplest possible setup. Per-test parallel DBs is overkill at our test count. |
| Docker job | Builds the image but **doesn't push**. Reason: no registry yet. Push gets added when we have a release process. |
| Cache strategy | `pnpm` action caches the store via Node setup; `docker/build-push-action` uses GHA cache backend. Reason: keeps CI under 2 min on warm cache. |
| Concurrency | Superseded runs cancelled on the same ref. Reason: avoid wasting CI minutes on stale commits. |
| `REMINDER_LEAD_MIN` default | 1440 (24h) — same as the previous hardcoded value. Reason: behavior unchanged absent override. |
| `REMINDER_LEAD_MIN` validation | Falls back to 1440 if env value is invalid (NaN, negative, missing). Reason: don't crash on a typo; do something reasonable. |
| Where to set `REMINDER_LEAD_MIN` | Both web tier (producer) AND worker (handler reads via the same module). Documented in deploy.md. |
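The parse-with-fallback behavior ratified above can be sketched like this. A hypothetical reconstruction of `reminderLeadMs()`, not the actual `src/lib/reminders.ts` source:

```typescript
// Hypothetical sketch: env-configurable reminder lead time with safe fallback.
const DEFAULT_LEAD_MIN = 1440; // 24h, same as the previous hardcoded value

function reminderLeadMs(env: Record<string, string | undefined> = process.env): number {
  const parsed = Number(env.REMINDER_LEAD_MIN);
  // Fall back on anything invalid (missing, NaN, zero/negative) rather than crash on a typo.
  const minutes = Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_LEAD_MIN;
  return minutes * 60_000;
}
```

Because both the web tier and the worker import the same module, producer-side scheduling and the handler stay in agreement as long as both containers see the same env value.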
## Gotchas hit
None.
## Open questions
1. Customer-visible brand name (still pending)
2. Currency
3. Stripe account ownership
4. Real domain + Caddy config
5. Image registry choice (still deferred)
6. **NEW**: When should CI start running — push to GitHub now, or wait until you cut a v1 branch? Doesn't affect code; just operational.
## Roadmap status
- v1 feature-complete
- Production prep done
- **CI done 2026-05-03 (this session)**
- Reminder lead time now configurable
Outstanding gates before opening to real customers (operational, not code):
1. Live Stripe verification in test mode (when keys arrive)
2. Brand + policy decisions
3. First production deploy
## Recommended next step
The remaining code-track polish items are all genuinely optional:
- **Sentry / GlitchTip integration** for production error tracking. Half-day, more if we want sourcemaps uploaded by CI. Useful once we're live; not before.
- **CSP / security headers** via Next config or Caddy. Half-day. Worth doing before opening to real customers.
- **Rate limiting** on `/api/auth/*` and the booking endpoint. We currently rely on Postgres exclusion constraints to prevent double-booking races, but a determined caller could spam magic-link emails (Resend has its own quota, but still). Half-day to wire `@upstash/ratelimit` or a simple in-memory limiter.
- **Audit logging for admin actions**. We have an `AuditLog` table from §4 of BuildLog.md but no calls to it. Half-day to wire into the admin CRUD actions.
My pick if continuing: **rate limiting** (concrete safety win) or **audit logging** (load-bearing for any compliance conversation later). Sentry is high-value but easier to add post-launch when we know what kinds of errors occur.
If pausing: this is a clean stopping point.
## How to resume
```bash
cd /Users/noise/Documents/code/touchbase
pnpm test # 92/92
pnpm lint # clean
pnpm exec tsc --noEmit # clean
# CI will run on push:
git push origin main # then check Actions tab on GitHub
```

# 2026-05-03 — Production Prep
> Companion to `BuildLog.md`. Predecessor: `2026-05-03-reminders.md`.
## Milestone
A self-contained, reproducible production deployment story. The app + worker now build into a single Docker image (`touchbase/app:latest`, ~1.7 GB). Compose's `prod` profile brings up `postgres` + `app` + `worker` + `caddy` with sensible env defaults; the runbook in `docs/deploy.md` walks through first-time setup and ongoing operations.
## What landed
| Path | Role |
|---|---|
| `Dockerfile` | Multi-stage build (deps → builder → runner). Single image runs both `pnpm start` (web) and `pnpm worker` per command override. Non-root `nextjs` user, exposes 3000. |
| `.dockerignore` | Excludes node_modules, .next, .git, .env, IDE/OS junk, and docs/progress (they ship in the repo but are kept out of the build context to keep it lean) |
| `compose.yaml` | Updated `app` + `worker` to use `env_file: .env` with explicit `DATABASE_URL` override that points at the `postgres` service hostname (so prod profile works without manual env tweaking). Added Docker `HEALTHCHECK` on app via `/api/health`. |
| `src/app/api/health/route.ts` | Liveness/readiness endpoint. Returns `{status, version, time, checks: {app, db}}`. 200 if Postgres reachable; 503 otherwise. Cached 5s. |
| `next.config.ts` | Added `turbopack.root` to silence Next 16's "multiple lockfiles" warning at build time. |
| `src/lib/seed.ts` | Refuses to run in `NODE_ENV=production` unless `ALLOW_SEED_IN_PRODUCTION=1` is set. Reason: `seed()` `TRUNCATE`s every table; one accidental prod run wipes the database. |
| `docs/deploy.md` | First-time setup walkthrough, required env vars table, migrations / logs / backups / rollback / common ops sections, "what's not yet automated" section pointing at registry, secrets, observability, CI as next steps. |
## What's verified
- `pnpm test`**92/92 green**
- `pnpm lint` — clean
- `pnpm exec tsc --noEmit` — clean
- `docker-compose --profile prod build app` — succeeds; produces `touchbase/app:latest` at 1.73 GB
- Live container smoke:
- Started `app` container against the dev Postgres → `/api/health` returns `{"status":"ok","checks":{"app":"ok","db":"ok"}}` and `/` returns 200 HTML
- Started `worker` container with `pnpm worker` command override → "[worker] pg-boss started; handlers registered, idling"
## Decisions ratified
| Decision | Resolution |
|---|---|
| Image strategy | **Single image, full deps** for both web and worker. Container picks via `command:` override. Reason: simpler than two images, ~1.7 GB is acceptable for v1. Optimization to standalone-mode + slim image deferred. |
| Base image | `node:22-bookworm-slim` (Debian) over `node:22-alpine`. Reason: better native-deps compatibility (sharp, esbuild) at the cost of ~80 MB. |
| pnpm in runner stage | `npm install --global pnpm@10.18.3` instead of `corepack prepare`. Reason: corepack writes to user-specific cache; the non-root `nextjs` user can't write to `~/.cache` if corepack-prepare ran as root in the image. Global npm install is simpler and works for any UID. |
| Healthcheck endpoint | `/api/health` with `db.$queryRaw\`SELECT 1\`` ping. 200 healthy / 503 degraded. Cached 5s so flapping monitors don't hammer the DB. |
| Compose env strategy | `env_file: .env` for convenience; explicit `DATABASE_URL: ${DATABASE_URL:-postgresql://...postgres:5432/...}` override so prod-profile-locally just works (uses internal docker network hostname). For prod deploy, user `export DATABASE_URL=...` to override before `docker-compose up`. |
| Seed safety in prod | `seed()` throws if `NODE_ENV=production` unless `ALLOW_SEED_IN_PRODUCTION=1`. Reason: forgetting that `seed` wipes data is a foot-gun; the env var makes "I really mean it" explicit. |
| Migration strategy | `pnpm exec prisma migrate deploy` from the host or via `docker-compose exec app`. Non-destructive; no resets. Documented in `docs/deploy.md`. |
| Where logs go | Container stdout/stderr → Docker's logging driver. Default json-file is fine for v1; structured logging (Pino) + remote sink is a "next step" item. |
| Image tag in compose | `touchbase/app:latest` (was `:dev`). Reason: clearer naming for prod. Real tag-by-SHA discipline waits until we have an image registry. |
## Gotchas hit
### corepack + non-root user
First boot of the app container crashed with `EACCES: permission denied, mkdir '/home/nextjs/.cache/node/corepack/v1'`. Corepack (which is bundled with Node 22) downloads pnpm to a user-specific cache on first invocation; the runner stage activated pnpm as root, so when the runtime user `nextjs` tried to invoke `pnpm start`, corepack tried to download into nextjs's home dir, which was empty/unwritable. Fix: install pnpm globally via `npm install --global pnpm@10.18.3` in the runner stage so it's on PATH for any UID.
### Prisma openssl warnings
Build emits `Prisma failed to detect the libssl/openssl version to use, and may not work as expected. Defaulting to "openssl-1.1.x".` These are harmless — Prisma 7 with the driver-adapter pattern doesn't use the native query engine binary at runtime. The warning fires during `prisma generate` checking for binary fallback compatibility. Could silence by `apt-get install -y openssl` in the builder stage; deferred.
### Image size (1.73 GB)
Bigger than ideal. Sources of bulk: full `node_modules` (we keep dev deps for `tsx` to run the worker, plus `prisma` CLI for migrations); the regular Next build vs standalone output. Optimization paths:
- **Standalone Next build** + separate slim worker image (~150 MB web, ~300 MB worker, total ~450 MB vs 1.7 GB)
- Compile worker to a single bundle with esbuild → drop tsx from runtime
- Strip dev deps from runtime — would also need to compile `scripts/seed.ts` and any other tsx-only entries
Deferred to "production prep v2" if/when image size becomes a constraint (it isn't for one-host deploys).
## Open questions
1. Customer-visible brand name (still pending)
2. Currency
3. Stripe account ownership
4. Real domain + Caddy config
5. **NEW**: image registry choice (GHCR, ECR, Docker Hub)? Defer until we deploy to more than one host.
6. **NEW**: CI provider (GitHub Actions presumably) — when the practice wants formal release discipline.
## Roadmap status
- Backend 1–4 done
- 5a + 5b + 5c.1 + 5c.2 done
- UX phases A–E done
- 5d Stripe — scaffolded; awaits real keys for live verification
- 5e reminders — done
- **Production prep — done 2026-05-03 (this session)**
**v1 is now soft-launchable.** Remaining gates before opening to real customers:
1. **Live Stripe verification in test mode** — user provides keys, we run the deposit flow end-to-end.
2. **Brand + policy decisions** — customer-visible name, ToS/cancellation copy, real domain.
3. **First production deploy** — stand up the host, run through `docs/deploy.md`, point a real domain at it.
Code-wise, nothing else is needed for soft launch.
## Recommended next step
Two-track:
- **Code track**: nothing critical. Optional polish: 24h reminder is hard-coded — make `REMINDER_LEAD_MS` configurable; add a Sentry/GlitchTip integration for prod error tracking; CI workflow with `pnpm test`/`lint`/`tsc`. Each is half-day-ish.
- **Operational track**: depends on you — Stripe keys, domain, deploy host, brand decisions.
Pause is appropriate here. Pick up when you want to either verify Stripe live, deploy somewhere, or polish a specific thing.
## How to resume
```bash
cd /Users/noise/Documents/code/touchbase
docker-compose --profile prod build # builds app + worker images (~3 min)
docker-compose --profile prod up -d # starts postgres + app + worker + caddy
curl -sf http://localhost:3000/api/health # confirms app + db reachable
docker-compose --profile prod logs -f
```
Or for a single-component test:
```bash
docker-compose up -d postgres # dev-style postgres
docker run --rm --network touchbase_default \
-e DATABASE_URL="postgresql://touchbase:touchbase@postgres:5432/touchbase_dev?schema=public" \
-e AUTH_SECRET=test -p 3001:3000 \
touchbase/app:latest
# /api/health on :3001
```

# 2026-05-05 — Rate Limiting + Audit Logging
> Companion to `BuildLog.md`. Predecessor: `2026-05-03-ci-and-polish.md`.
## Milestone
Two safety-net additions: rate limiting on the magic-link signin endpoint (5/min/IP) to prevent email-spam abuse, and audit logging on every booking lifecycle transition + admin entity create/update + sign-in. AuditLog rows now accumulate as the practice operates — load-bearing for any future compliance conversation.
## What landed
### Rate limiting
| Path | Role |
|---|---|
| `src/lib/rate-limit.ts` | In-memory sliding-window limiter. `check(key, limit, windowMs)``{ ok, remaining }` or `{ ok: false, retryAfterSec }`. Periodic cleanup of expired buckets so the map doesn't grow forever. |
| `src/middleware.ts` | Next.js middleware. Matches `/api/auth/signin/:path*` POSTs; 5 per minute per IP. Returns 429 with `Retry-After` header when over. |
| `test/rate-limit.test.ts` | 5 tests covering under/at/over limit, per-key independence, window reset, remaining count. |
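The sliding-window core can be sketched as follows. This is a hypothetical reduction of `src/lib/rate-limit.ts` (the real module also runs the periodic cleanup of expired buckets mentioned above):

```typescript
// Hypothetical sketch: in-memory sliding-window limiter.
type CheckResult = { ok: true; remaining: number } | { ok: false; retryAfterSec: number };

const hits = new Map<string, number[]>(); // key -> timestamps of recent hits

function check(key: string, limit: number, windowMs: number, now = Date.now()): CheckResult {
  // Keep only hits still inside the window.
  const recent = (hits.get(key) ?? []).filter((t) => now - t < windowMs);
  if (recent.length >= limit) {
    hits.set(key, recent);
    // Oldest hit expiring is when a slot frees up.
    return { ok: false, retryAfterSec: Math.ceil((recent[0] + windowMs - now) / 1000) };
  }
  recent.push(now);
  hits.set(key, recent);
  return { ok: true, remaining: limit - recent.length };
}
```

The `retryAfterSec` value feeds the 429's `Retry-After` header; because state is per-process, limits reset on restart, which matches the single-host v1 model.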
### Audit logging
| Path | Role |
|---|---|
| `src/lib/audit.ts` | `audit(db, { actorId, action, entityType, entityId, meta? })`. Best-effort (catches and logs write failures). Pulls IP + UA from request headers via `next/headers` when available; nulls when called outside a request context (CLI, worker, tests). |
| Action wiring | Booking lifecycle: `booking.created` (public + admin + reschedule), `booking.cancelled` (customer + admin), `booking.rescheduled`, `booking.completed`, `booking.no_show`. Admin CRUD: `service.created`/`updated`, `room.created`/`updated`, `therapist.created`/`updated`. Auth: `user.signed_in` via Auth.js `events.signIn` (lazy-imported). |
| `test/audit.test.ts` | 4 tests covering row creation, null IP/UA outside request context, null actorId for system events, best-effort no-throw semantics. |
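The best-effort write semantics can be sketched as below. A hypothetical reduction: `AuditSink` stands in for the Prisma client, and the real `src/lib/audit.ts` additionally pulls IP/UA from `next/headers` when a request context exists:

```typescript
// Hypothetical sketch: best-effort audit write that never throws.
interface AuditEntry {
  actorId: string | null;   // null for system events
  action: string;           // dot-separated entity.verb, e.g. "booking.created"
  entityType: string;
  entityId: string;
  meta?: unknown;
}

interface AuditSink {
  create(entry: AuditEntry): Promise<void>; // stands in for db.auditLog.create
}

async function audit(db: AuditSink, entry: AuditEntry): Promise<void> {
  try {
    await db.create(entry);
  } catch (err) {
    // Best-effort: a misbehaving audit table must not break the booking it describes.
    console.error("[audit] write failed", err);
  }
}
```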
## What's verified
- `pnpm test`**101/101 green** (was 92; +5 rate-limit + +4 audit)
- `pnpm lint` — clean
- `pnpm exec tsc --noEmit` — clean
- Live smoke (curl + Mailpit, dev server):
- **Rate limit**: 6 rapid signin POSTs to `/api/auth/signin/nodemailer` → first 5 return 302 (success), 6th returns **429**.
- **Audit**: signed in as `admin@touchbase.local` via magic link → `AuditLog` row written: `action='user.signed_in', entityType='User', ip='::1', ua=<UA string>`.
## Decisions ratified
| Decision | Resolution |
|---|---|
| Limiter implementation | **In-memory** sliding window. Single-host only; resets per process. Reason: matches our v1 deployment model (one host). When we go multi-instance, swap for Redis or pg-backed; the call site stays the same. |
| Where to apply rate limit | `/api/auth/signin/*` POSTs. Reason: highest-value abuse target — sends email per request. Booking flow has DB exclusion-constraint protection already; magic-link spam has none until now. |
| Limit value | **5 per minute per IP**. Reason: tight enough to prevent abuse, loose enough that a user retrying because of a typo isn't blocked. Tighten if we see abuse. |
| IP extraction | `x-forwarded-for[0]` → `x-real-ip` → `"unknown"`. Reason: covers the standard proxy headers (Caddy sets `x-forwarded-for`); the `"unknown"` fallback prevents undefined-key buckets. |
| Audit semantics | **Best-effort** writes — failures are logged but don't propagate. Reason: a misbehaving audit table must not break a booking. |
| Audit `meta` typing | `Prisma.InputJsonValue` (the typed JSON input). Null encoded as `Prisma.JsonNull` per Prisma 7 convention. |
| Auth signIn audit | `events.signIn` callback in Auth.js config, **lazy-imports** `@/lib/audit` and `@/lib/db`. Reason: importing Prisma at the top of `src/auth.ts` adds noticeable cold-start to the auth handler; lazy import only pays the cost on actual sign-in. |
| Action vocabulary | Dot-separated `entity.verb`: `booking.created`, `service.updated`, `user.signed_in`. Documented at the top of `src/lib/audit.ts`. |
| What's instrumented | Booking lifecycle (5 actions) + admin CRUD entity create/update (services, rooms, therapists) + signin. **Not yet instrumented**: working-hours / overrides / room-blocks edits (high-volume, low compliance value); customer notes views (no UI for that yet); Stripe events (covered separately by Notification + Payment rows). |
| Where audit is stored | The existing `AuditLog` table from BuildLog.md §4.8. Append-only. No retention policy yet — documented as a future concern. |
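The IP-extraction order ratified above amounts to a small helper along these lines. A hypothetical sketch (`clientIp` is an illustrative name; the real middleware reads the request's `Headers` object, which has the same `get` API):

```typescript
// Hypothetical sketch: client-IP extraction behind a proxy.
function clientIp(headers: Map<string, string>): string {
  const xff = headers.get("x-forwarded-for");
  if (xff) return xff.split(",")[0].trim(); // first entry = original client
  return headers.get("x-real-ip") ?? "unknown"; // fallback prevents undefined-key buckets
}
```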
## Gotchas hit
### `Prisma.InputJsonValue` vs `Record<string, unknown>` typing
First pass typed `meta` as `Record<string, unknown>` and Prisma's generated `meta` field rejected it (`'Record<string, unknown>' is not assignable to type 'NullableJsonNullValueInput | InputJsonValue | undefined'`). Switched to `Prisma.InputJsonValue` and used `Prisma.JsonNull` for the absent case. Required `import { Prisma } from "..."` (value, not type).
## Open questions
1. Customer-visible brand name (still pending)
2. Currency
3. Stripe account ownership
4. Real domain
5. **NEW**: AuditLog retention policy. Currently rows accumulate forever. For a compliance/legal conversation later, the practice will probably want a documented retention window (e.g. 7 years for healthcare-adjacent records). Defer.
6. **NEW**: Rate-limit tightening — current 5/min/IP might be too generous for a busy practice; revisit once we have data.
## Roadmap status
- v1 feature-complete
- Production prep done
- CI in place
- Reminder lead time configurable
- **Rate limiting + audit logging done 2026-05-05 (this session)**
Outstanding gates before opening to real customers (operational, not code):
1. Live Stripe verification in test mode
2. Brand + policy decisions
3. First production deploy
Code-wise, the remaining "next step" items are all genuinely optional polish:
- Sentry / GlitchTip for prod error tracking
- CSP / security headers via Caddy
- Audit logging for working-hours / overrides / room-blocks if we want full coverage
- AuditLog retention job (pg-boss recurring delete-older-than-X)
- Push the docker image to a registry once we have one
## Recommended next step
This is a clean place to **pause** for operational / brand decisions. The code stack is genuinely production-ready for a soft launch.
If continuing on the code side, my pick: **Sentry/GlitchTip** as the last load-bearing prod-readiness item. Half-day. Everything else is nice-to-have.
## How to resume
```bash
cd /Users/noise/Documents/code/touchbase
docker-compose up -d postgres mailpit
pnpm db:seed
pnpm dev
# Test rate limiting (returns 429 on 6th attempt within a minute):
TOKEN=$(curl -s -c /tmp/tb.txt http://localhost:3000/api/auth/csrf | jq -r .csrfToken)
for i in 1 2 3 4 5 6; do
curl -s -b /tmp/tb.txt -X POST -d "csrfToken=$TOKEN&email=test$i@x.com" \
-o /dev/null -w "%{http_code}\n" \
http://localhost:3000/api/auth/signin/nodemailer
done
# Inspect audit log:
docker exec touchbase-postgres-1 psql -U touchbase -d touchbase_dev \
-c 'SELECT action, "entityType", "entityId", ip, "createdAt" FROM "AuditLog" ORDER BY "createdAt" DESC LIMIT 20;'
```