# 2026-05-05 — Rate Limiting + Audit Logging

> Companion to `BuildLog.md`. Predecessor: `2026-05-03-ci-and-polish.md`.

## Milestone

Two safety-net additions: rate limiting on the magic-link sign-in endpoint (5 requests per minute per IP) to prevent email-spam abuse, and audit logging on every booking lifecycle transition, admin entity create/update, and sign-in. `AuditLog` rows now accumulate as the practice operates, which makes them load-bearing for any future compliance conversation.

## What landed

### Rate limiting

| Path | Role |
|---|---|
| `src/lib/rate-limit.ts` | In-memory sliding-window limiter. `check(key, limit, windowMs)` → `{ ok, remaining }` or `{ ok: false, retryAfterSec }`. Periodic cleanup of expired buckets so the map doesn't grow forever. |
| `src/middleware.ts` | Next.js middleware. Matches `/api/auth/signin/:path*` POSTs; 5 per minute per IP. Returns 429 with a `Retry-After` header when over the limit. |
| `test/rate-limit.test.ts` | 5 tests covering under/at/over limit, per-key independence, window reset, remaining count. |

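
The limiter described in the table can be sketched roughly as follows. This is a minimal sliding-window log written for illustration, not the contents of `src/lib/rate-limit.ts`: the module-level `buckets` map, the `sweep` helper, and the optional `now` parameter (added so the behaviour can be tested without real timers) are all assumptions.

```typescript
// Result shape taken from the table above: success carries `remaining`,
// rejection carries `retryAfterSec`.
type RateLimitResult =
  | { ok: true; remaining: number }
  | { ok: false; retryAfterSec: number };

// Hypothetical storage: one array of hit timestamps (ms) per key.
const buckets = new Map<string, number[]>();

function check(
  key: string,
  limit: number,
  windowMs: number,
  now: number = Date.now(), // assumption: injectable clock for tests
): RateLimitResult {
  const cutoff = now - windowMs;
  // Keep only the hits that still fall inside the sliding window.
  const hits = (buckets.get(key) ?? []).filter((t) => t > cutoff);

  if (hits.length >= limit) {
    buckets.set(key, hits);
    // The caller may retry once the oldest hit leaves the window.
    const oldest = hits[0] ?? now;
    const retryAfterMs = oldest + windowMs - now;
    return { ok: false, retryAfterSec: Math.max(1, Math.ceil(retryAfterMs / 1000)) };
  }

  hits.push(now);
  buckets.set(key, hits);
  return { ok: true, remaining: limit - hits.length };
}

// Hypothetical cleanup pass: drop keys whose hits have all expired,
// so the map does not grow forever.
function sweep(windowMs: number, now: number = Date.now()): void {
  const cutoff = now - windowMs;
  buckets.forEach((hits, key) => {
    if (hits.every((t) => t <= cutoff)) buckets.delete(key);
  });
}
```

A middleware call site would then look something like `check("signin:" + ip, 5, 60_000)`, with `sweep` run on a timer. Because the state lives in process memory, it resets on restart and is not shared between instances.
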
### Audit logging

| Path | Role |
|---|---|
| `src/lib/audit.ts` | `audit(db, { actorId, action, entityType, entityId, meta? })`. Best-effort (catches and logs write failures). Pulls IP + UA from request headers via `next/headers` when available; nulls when called outside a request context (CLI, worker, tests). |
| Action wiring | Booking lifecycle: `booking.created` (public + admin + reschedule), `booking.cancelled` (customer + admin), `booking.rescheduled`, `booking.completed`, `booking.no_show`. Admin CRUD: `service.created`/`updated`, `room.created`/`updated`, `therapist.created`/`updated`. Auth: `user.signed_in` via Auth.js `events.signIn` (lazy-imported). |
| `test/audit.test.ts` | 4 tests covering row creation, null IP/UA outside request context, null actorId for system events, best-effort no-throw semantics. |

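
The best-effort semantics can be illustrated with a small stand-alone sketch. It is not the real `src/lib/audit.ts`: the `AuditWriter` interface stands in for the Prisma client, the explicit `ctx` argument replaces the `next/headers` lookup, and the boolean return value is an assumption added only so the outcome is observable in a test. The action names are the ones listed in the table above.

```typescript
// Action vocabulary as documented: dot-separated `entity.verb`.
type AuditAction =
  | "booking.created" | "booking.cancelled" | "booking.rescheduled"
  | "booking.completed" | "booking.no_show"
  | "service.created" | "service.updated"
  | "room.created" | "room.updated"
  | "therapist.created" | "therapist.updated"
  | "user.signed_in";

interface AuditEntry {
  actorId: string | null; // null for system events
  action: AuditAction;
  entityType: string;
  entityId: string;
  meta?: Record<string, string | number | boolean | null>;
}

interface AuditRow extends AuditEntry {
  ip: string | null;
  userAgent: string | null;
}

// Hypothetical stand-in for the database client.
interface AuditWriter {
  create(row: AuditRow): Promise<void>;
}

// Hypothetical request context; absent when called from a CLI, worker, or test.
interface RequestContext {
  ip?: string;
  userAgent?: string;
}

async function audit(
  writer: AuditWriter,
  entry: AuditEntry,
  ctx?: RequestContext,
): Promise<boolean> {
  try {
    await writer.create({
      ...entry,
      ip: ctx?.ip ?? null,
      userAgent: ctx?.userAgent ?? null,
    });
    return true;
  } catch (err) {
    // Best-effort: log and swallow, so a broken audit table never fails a booking.
    console.error("audit write failed:", entry.action, err);
    return false;
  }
}
```

The point of the `try`/`catch` is that callers can `await audit(...)` inside a booking handler without wrapping it themselves; a failed write costs a log line, not the request.
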
## What's verified

- `pnpm test`: **101/101 green** (was 92; +5 rate-limit, +4 audit)
- `pnpm lint`: clean
- `pnpm exec tsc --noEmit`: clean
- Live smoke (curl + Mailpit, dev server):
  - **Rate limit**: 6 rapid sign-in POSTs to `/api/auth/signin/nodemailer` → first 5 return 302 (success), 6th returns **429**.
  - **Audit**: signed in as `admin@touchbase.local` via magic link → `AuditLog` row written: `action='user.signed_in', entityType='User', ip='::1', ua=<UA string>`.

## Decisions ratified

| Decision | Resolution |
|---|---|
| Limiter implementation | **In-memory** sliding window. Single-host only; resets per process. Reason: matches our v1 deployment model (one host). When we go multi-instance, swap for a Redis- or pg-backed limiter; the call site stays the same. |
| Where to apply rate limit | `/api/auth/signin/*` POSTs. Reason: highest-value abuse target, since each request sends an email. The booking flow already has DB exclusion-constraint protection; magic-link spam had none until now. |
| Limit value | **5 per minute per IP**. Reason: tight enough to prevent abuse, loose enough that a user retrying because of a typo isn't blocked. Tighten if we see abuse. |
| IP extraction | `x-forwarded-for[0]` → `x-real-ip` → `"unknown"`. Reason: covers the standard proxy headers (Caddy sets `x-forwarded-for`); the fallback prevents undefined-key bugs. |
| Audit semantics | **Best-effort** writes: failures are logged but don't propagate. Reason: a misbehaving audit table must not break a booking. |
| Audit `meta` typing | `Prisma.InputJsonValue` (the typed JSON input). Null encoded as `Prisma.JsonNull` per Prisma 7 convention. |
| Auth signIn audit | `events.signIn` callback in Auth.js config, **lazy-imports** `@/lib/audit` and `@/lib/db`. Reason: importing Prisma at the top of `src/auth.ts` adds noticeable cold-start latency to the auth handler; a lazy import only pays that cost on an actual sign-in. |
| Action vocabulary | Dot-separated `entity.verb`: `booking.created`, `service.updated`, `user.signed_in`. Documented at the top of `src/lib/audit.ts`. |
| What's instrumented | Booking lifecycle (5 actions) + admin entity create/update (services, rooms, therapists) + sign-in. **Not yet instrumented**: working-hours / overrides / room-blocks edits (high-volume, low compliance value); customer notes views (no UI for that yet); Stripe events (covered separately by Notification + Payment rows). |
| Where audit is stored | The existing `AuditLog` table from BuildLog.md §4.8. Append-only. No retention policy yet; documented as a future concern. |

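
The IP-extraction fallback chain from the table can be sketched as below. The function name `clientIp` and the `HeaderSource` interface are hypothetical; the interface is just the small part of the standard `Headers` API the helper needs, so a real `Headers` object should fit it as well as the plain objects used in tests.

```typescript
// Minimal header accessor; the Web `Headers` class satisfies this shape.
interface HeaderSource {
  get(name: string): string | null | undefined;
}

function clientIp(headers: HeaderSource): string {
  // `x-forwarded-for` may hold a comma-separated chain; the first entry is the client.
  const forwarded = headers.get("x-forwarded-for");
  const first = forwarded?.split(",")[0]?.trim();
  if (first) return first;

  const realIp = headers.get("x-real-ip")?.trim();
  if (realIp) return realIp;

  // A constant fallback keeps the rate-limit key well-defined.
  return "unknown";
}
```

One consequence of the `"unknown"` fallback is that all requests without either header share a single bucket. Note also that `x-forwarded-for` is only trustworthy when the reverse proxy sets or overwrites it, which is worth confirming in the Caddy config before relying on it for abuse prevention.
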
## Gotchas hit

### `Prisma.InputJsonValue` vs `Record<string, unknown>` typing

The first pass typed `meta` as `Record<string, unknown>`, and Prisma's generated `meta` field rejected it (`'Record<string, unknown>' is not assignable to type 'NullableJsonNullValueInput | InputJsonValue | undefined'`). The fix was to switch to `Prisma.InputJsonValue` and use `Prisma.JsonNull` for the absent case. This requires `import { Prisma } from "..."` as a value import, not a type-only import, because `Prisma.JsonNull` is a runtime value.

## Open questions

1. Customer-visible brand name (still pending)
2. Currency
3. Stripe account ownership
4. Real domain
5. **NEW**: AuditLog retention policy. Currently rows accumulate forever. For a compliance/legal conversation later, the practice will probably want a documented retention window (e.g. 7 years for healthcare-adjacent records). Defer.
6. **NEW**: Rate-limit tightening. The current 5/min/IP might be too generous for a busy practice; revisit once we have data.

## Roadmap status

- v1 feature-complete
- Production prep done
- CI in place
- Reminder lead time configurable
- **Rate limiting + audit logging done 2026-05-05 (this session)**

Outstanding gates before opening to real customers (operational, not code):

1. Live Stripe verification in test mode
2. Brand + policy decisions
3. First production deploy

Code-wise, the remaining "next step" items are all genuinely optional polish:

- Sentry / GlitchTip for prod error tracking
- CSP / security headers via Caddy
- Audit logging for working-hours / overrides / room-blocks if we want full coverage
- AuditLog retention job (pg-boss recurring delete-older-than-X)
- Push the docker image to a registry once we have one

## Recommended next step

This is a clean place to **pause** for operational / brand decisions. The code stack is production-ready for a soft launch.

If continuing on the code side, my pick is **Sentry/GlitchTip** as the last load-bearing prod-readiness item, at roughly half a day of work. Everything else is nice-to-have.

## How to resume

```bash
cd /Users/noise/Documents/code/touchbase
docker-compose up -d postgres mailpit
pnpm db:seed
pnpm dev   # stays in the foreground; run the commands below in a second terminal

# Test rate limiting (returns 429 on 6th attempt within a minute):
# Fetch a CSRF token and store the session cookie for the POSTs below.
TOKEN=$(curl -s -c /tmp/tb.txt http://localhost:3000/api/auth/csrf | jq -r .csrfToken)
for i in 1 2 3 4 5 6; do
  # Print only the HTTP status code of each sign-in attempt.
  curl -s -b /tmp/tb.txt -X POST -d "csrfToken=$TOKEN&email=test$i@x.com" \
    -o /dev/null -w "%{http_code}\n" \
    http://localhost:3000/api/auth/signin/nodemailer
done

# Inspect audit log:
docker exec touchbase-postgres-1 psql -U touchbase -d touchbase_dev \
  -c 'SELECT action, "entityType", "entityId", ip, "createdAt" FROM "AuditLog" ORDER BY "createdAt" DESC LIMIT 20;'
```