# openrun — endurance running analytics

A local-first, open-source endurance-training analysis tool. Garmin data in (live API or official export), SQLite on disk, an in-browser dashboard out — plus a Python API and Jupyter notebooks for power users.

No accounts, no cloud, no telemetry. Your data lives in one file on your machine.

**What it gives you:** Banister CTL/ATL/TSB (fitness/fatigue/form), Pa:HR decoupling per-mile and per-second from FIT, FIT-aware HR zone time-in-zone, GPS route clustering, race-plan projection with forward CTL/ATL/TSB, off-watch activity logging, and a first-run web wizard so non-CLI users can get going in a few minutes.

```
openrun/
├── pyproject.toml              # packaged as `openrun` (src layout, hatchling)
├── openrun.toml                # your config — created by the setup wizard
├── data/garmin.db              # SQLite, created on first run
├── src/openrun/
│   ├── config.py               # UserProfile, HRZones, BanisterParams + TOML reader/writer
│   ├── db.py                   # schema + connect() helper
│   ├── model.py                # loaders + derived metrics
│   ├── plots.py                # matplotlib plot helpers
│   ├── setup.py                # idempotent workspace bootstrap + status
│   ├── ingest/
│   │   ├── auth.py             # Garmin OAuth login
│   │   ├── garmin_api.py       # Path B: incremental sync via garth
│   │   ├── garmin_export.py    # Path A: parse Connect / Takeout export
│   │   ├── fit_linker.py       # match Takeout FITs by session.start_time
│   │   ├── manual.py           # CSV → manual_activities (off-watch sessions)
│   │   └── time_in_zone.py     # per-activity TIZ cache
│   └── web/                    # Streamlit app
│       ├── app.py              # controller — st.navigation builds the sidebar
│       ├── _helpers.py         # cached loaders + sidebar chrome
│       └── pages/              # one file per page
└── examples/notebooks/         # original analysis notebooks (kept as reference)
```

## TL;DR — install and use

```bash
uv sync
uv run openrun-web    # opens http://localhost:8501
```

A first-launch wizard walks you through profile + HR zones + race calendar, then either ingests a Garmin export zip or takes you to manual logging.
See [QUICKSTART.md](QUICKSTART.md) for the step-by-step.

---

## The two interfaces

### Web app (recommended for everyone)

```bash
uv run openrun-web
```

Eight pages, all wired to the same SQLite DB:

| Page | What's on it |
|---|---|
| 📊 Dashboard | CTL/ATL/TSB tiles + chart, weekly km + ACWR, polarized split, recent runs |
| 📋 Activities | Filter by type / date / distance / name → click a row to drill in |
| 🏃 Activity detail | Splits + pace-vs-HR scatter, time-in-zone bars, FIT decoupling chart |
| 📈 Race plan | Editable plan rows, projected PMC through race day, race-day TSB table |
| 📝 Manual log | Form-based logging of off-watch sessions (strength, hikes, unrecorded runs) |
| 🛌 Recovery | Sleep stages, HRV + RHR overlay, training→next-morning-HRV correlation |
| 🫀 Efficiency | Metres per heartbeat with 30-run rolling median, distance × year heatmap |
| 🔄 Sync | Status + in-browser zip ingest with live progress |

### CLI (for scripting / power users)

```bash
openrun-init                  # bootstrap + status dashboard
openrun-auth                  # Garmin OAuth login (live sync)
openrun-sync [--full]         # incremental Garmin Connect sync (downloads per-second FIT for new activities)
openrun-sync --fit-backfill --fit-type running   # pull per-second FITs for past runs, no website export
openrun-sync --no-fit         # sync without the per-second FIT download
openrun-ingest                # parse a Garmin export
openrun-link-fit              # match Takeout FITs by start time
openrun-link-fit --relink     # rewrite paths after moving the export
openrun-time-in-zone          # populate the TIZ cache
openrun-import-manual         # bulk off-watch import
openrun-web                   # launch the Streamlit app
```

### Python API (for notebooks / one-offs)

```python
from openrun import open_conn, load_activities, banister, daily_training_load_series

conn = open_conn()
runs = load_activities(conn, type="running")
pmc = banister(daily_training_load_series(conn, include_manual=True))
```

---

## 1 — Source data

### Two paths in, intentionally overlapping

| Path | Tool | What you get | When to use |
|---|---|---|---|
| **A. Official data export** (recommended for first load) | `openrun-ingest` | Complete history, FIT files, multi-year wellness | First load; periodic refreshes |
| **B. Live API sync** | `openrun-auth` then `openrun-sync` — *or the web app's Sync page* | Last N days, lap-level splits, **per-second FIT files** (downloaded via the API) | Incremental top-ups between exports |

The upsert logic only overwrites the raw-JSON column and `fetched_at` on conflict, so Path B values aren't clobbered by Path A re-ingests.

**No terminal required.** The web app's **Sync** page now logs in to Garmin (email/password, plus an MFA code when the account needs one) and runs the same pull in-process behind a **🔄 Sync now** button — with checkboxes for full backfill, per-second FIT download, and FIT backfill. Credentials go straight to Garmin; only OAuth tokens are cached in `.secrets/`. The CLI (`openrun-auth` + `openrun-sync`) does the identical thing for scripting.

**Per-second from the API.** `openrun-sync` downloads each new activity's original FIT (`/download-service/files/activity/{id}`) into `data/fit/.fit` and links it in `activity_fit_files` — the same table the export-based linker writes, so decoupling, FIT-based time-in-zone, and the route map all work without a website export. Skip it with `--no-fit`. To pull per-second history for activities synced before this existed, run `openrun-sync --fit-backfill [--fit-type running] [--fit-limit N]`, then `openrun-time-in-zone` to refresh the per-second TIZ cache. Downloads are idempotent — skipped when the FIT is already on disk and linked.

### Garmin's two export formats

Garmin has two unrelated formats both called "the export":

1. **Garmin Connect data export** (`connect.zip`) — request via Account Management → Export Your Data. Filenames are `_.fit`, SI units throughout. The web app's Sync page accepts this zip directly.
2. **Garmin Takeout dump** — UUID-named folder (`_N/`) with many domains (aviation, Tacx, InReach…). Running data lives under `DI_CONNECT/`. **Different conventions:**
   - FIT filenames: `_.fit` — upload IDs ≠ activity IDs
   - `summarizedActivities` uses scaled-integer units: distance **cm**, duration **ms**, elevation **cm**, speed **m/s × 0.1**
   - `openrun-ingest` detects and converts these; `openrun-link-fit` links FITs by content (parses `session.start_time` and matches activities by ±60 s) since the filename trick doesn't work.

For Takeout dumps, the full sequence is:

```bash
mkdir -p /DI_CONNECT/DI-Connect-Uploaded-Files/fit
unzip -d /DI_CONNECT/DI-Connect-Uploaded-Files/fit \
    /DI_CONNECT/DI-Connect-Uploaded-Files/UploadedFiles_0-_Part*.zip
uv run openrun-ingest /
uv run openrun-link-fit /
uv run openrun-time-in-zone
```

### Off-watch activities

Recorded Garmin activity isn't the whole picture — strength work, hikes, runs you forgot to record, group runs on someone else's watch. The web app's **Manual log** writes to a separate `manual_activities` table; loaders union it into the PMC when you pass `include_manual=True` (the Dashboard does this by default).

For bulk import, drop a CSV with `activity_date,activity_type,distance_km,duration_min,training_load,notes[,external_id]` and run `uv run openrun-import-manual workouts.csv`. The optional `external_id` makes re-imports idempotent.
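
As a minimal sketch of the CSV shape described above, the snippet below writes a two-row `workouts.csv` with the documented header, using only the standard library. The activity-type strings, the load numbers, the `0` distance for a strength session, and the `external_id` values are made-up illustrations, not values openrun is known to require.

```python
import csv
from pathlib import Path

# Column order taken from the header documented above; external_id is optional.
HEADER = ["activity_date", "activity_type", "distance_km", "duration_min",
          "training_load", "notes", "external_id"]

# Hypothetical sessions: type names, loads, and IDs are illustrative only.
ROWS = [
    ["2026-05-02", "strength", "0", "45", "40", "gym session", "gym-2026-05-02"],
    ["2026-05-03", "hiking", "12.5", "210", "95", "ridge loop", "hike-2026-05-03"],
]


def write_manual_csv(path: Path) -> Path:
    """Write the header plus the sample rows to `path` and return it."""
    with path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(HEADER)
        writer.writerows(ROWS)
    return path


if __name__ == "__main__":
    print(write_manual_csv(Path("workouts.csv")))
```

Giving each session a stable `external_id` is what lets a repeated `uv run openrun-import-manual workouts.csv` stay idempotent, per the note above.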

### Units worth knowing (Connect vs Takeout)

| Field | Live API / Connect | Takeout `summarizedActivities` | Convert by |
|---|---|---|---|
| distance | m | cm | × 0.01 |
| duration / movingDuration | s | ms | × 0.001 |
| elevationGain / Loss | m | cm | × 0.01 |
| averageSpeed / maxSpeed | m/s | m/s × 0.1 | × 10 |
| HR (avg, max), calories, training_effect, vo2_max | bpm / kcal / scalar | — | — |
| cadence (`averageRunCadence`) | both-legs SPM | both-legs SPM | — (do **not** double) |
| strideLength | cm | cm | × 0.01 for m |
| FIT `position_lat/long` | semicircles | semicircles | × (180 / 2³¹) for degrees |
| FIT `enhanced_speed` | m/s | m/s | — |

The cadence trap is real: `averageRunCadence` is already both-legs (~150–180 spm). Doubling it as if it were single-leg yields 300+, which trips any sane filter.

---

## 2 — Schema

`db.py:SCHEMA` is the source of truth; what follows is the gist.

| Table | Grain | Origin | Notes |
|---|---|---|---|
| `activities` | one row per activity | both paths | `raw` JSON has every Garmin field; parsed columns are a curated subset |
| `activity_splits` | one row per lap | Path B only | Auto-laps (~1 km or 1 mi). `raw` has cadence, stride, GPS bounds |
| `activity_fit_files` | one row per linked FIT | linker | `fit_path` is stored **absolute** at link time. Move the export → run `openrun-link-fit --relink` |
| `activity_time_in_zone` | one row per activity | precomputed | `source = 'fit'` (per-second) or `'lap'` (split-average) |
| `manual_activities` | one row per logged session | web form / CSV | Parallel to `activities`, never clobbered by Garmin re-ingest |
| `race_plan` | one row per ISO-Monday week | web editor | Feeds `banister_forecast` for projected PMC |
| `daily_*` (steps / sleep / stress / hrv / body_battery / intensity_minutes / resting_hr) | one row per date | both paths | Each has a `raw` JSON column with the full payload |
| `sync_state` | key/value | both paths | last-sync / last-ingest timestamps |

The `raw` column on every table is the full upstream JSON. If you need a field the schema doesn't surface, unpack it from `raw`:

```python
from openrun import open_conn, load_activities, expand_raw

acts = load_activities(open_conn(), type="running")
fields = expand_raw(acts)   # pd.json_normalize'd companion frame
```

---

## 3 — Metrics & how they're calculated

All loaders and helpers live in [src/openrun/model.py](src/openrun/model.py). Highlights:

### Pace

```
pace_min_per_km = (moving_duration_s / 60) / (distance_m / 1000)
```

Implausible paces (faster than 3 min/km or slower than 30 min/km; the kept range spans roughly elite marathon pace through walks-with-stops) get masked to NaN in `load_activities` rather than dropped, so row count is preserved.

### Acute:Chronic Workload Ratio (ACWR)

```
acute   = sum(training_load) over the last 7 days
chronic = sum(training_load) over the last 28 days, divided by 4 for week-equivalence
ACWR    = acute / chronic
```

Sweet spot ~0.8–1.3; > 1.5 = aggressive ramp / injury risk. Surfaced on the Dashboard.

### Aerobic efficiency — "m/beat"

```
m_per_beat = distance_m / (duration_s * avg_hr / 60)
```

Higher = more metres per heartbeat = fitter aerobic system. Best compared YoY within a tight distance bucket so terrain and intent don't dominate.
The Efficiency page does this with a 30-run rolling median plus a distance-bucket × year heatmap and an "easy runs only" headline view (HR < 75 % LTHR).

### Decoupling (Pa:HR drift)

Friel's method, two resolutions. *Positive* = pace/HR fell off in the second half (cardiac drift, possibly a fuelling deficit). *Negative* = improved (negative split).

**Per-mile (lap level)** — `decoupling(splits, min_splits=6)`:

```
For each activity with ≥ min_splits laps:
    halves         = first / second by lap count
    eff_half       = duration-weighted mean(speed_mps) / mean(heart_rate)
    decoupling_pct = (eff_first / eff_second − 1) × 100
```

**Per-second (FIT)** — `fit_decoupling(records, segments=N, warmup_min=5, cooldown_min=2, min_speed_mps=0.5)`:

```
records = per-second FIT messages
drop first warmup_min and last cooldown_min
drop records where speed_mps < min_speed_mps    # ignore stopped time
slice remaining moving time into N equal-time chunks
for each chunk: efficiency = mean(speed_mps) / mean(heart_rate)
decoupling_pct = (efficiency[0] / efficiency[i] − 1) × 100
```

Friel's thresholds (interpret on **steady aerobic** efforts only):

- **< 5 %** — aerobically developed
- **5–10 %** — sustainable
- **> 10 %** — pacing unsustainable for the distance, OR fuelling shortfall, OR heat

The Activity-detail page plots this with `plot_fit_decoupling`.

### HR zone time-in-zone

Zone boundaries come from `[user.hr_zones]` in your `openrun.toml` (matching what Garmin's `heartRateZones.json` advertises). Two implementations; `openrun-time-in-zone` picks the better one per activity and caches it.

**From FIT (`source='fit'`)** — `time_in_zone_from_fit(records)`: per-second; large gaps (>30 s, paused recording) are clipped to 30 s.

**From splits (`source='lap'`)** — fallback when no FIT is linked: assigns each split's avg HR to a single zone. Biased toward the middle zone (smooths over within-lap variation); the FIT version is ground truth where available.
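
The FIT-based approach above can be sketched as follows. This is a simplified stand-in for `time_in_zone_from_fit`, not its actual code: the function name `tiz_from_samples`, the `(timestamp_s, heart_rate)` tuple input, and the choice to credit each interval to the zone of the sample that starts it are assumptions. The 30 s gap clip follows the text, and the zone bounds are copied from the sample `openrun.toml` in this README.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Zone bounds in bpm (inclusive), copied from the sample openrun.toml in this README.
ZONES: Dict[str, Tuple[int, int]] = {
    "Z1": (100, 120), "Z2": (121, 140), "Z3": (141, 160),
    "Z4": (161, 180), "Z5": (181, 200),
}
MAX_GAP_S = 30.0  # gaps longer than this (paused recording) are clipped


def zone_for(hr: float) -> Optional[str]:
    """Return the zone whose inclusive bounds contain hr, or None if outside all zones."""
    for name, (lo, hi) in ZONES.items():
        if lo <= hr <= hi:
            return name
    return None


def tiz_from_samples(samples: Iterable[Tuple[float, float]]) -> Dict[str, float]:
    """Sum seconds per zone from (timestamp_s, heart_rate) samples sorted by time."""
    totals = {name: 0.0 for name in ZONES}
    pts: List[Tuple[float, float]] = list(samples)
    # Each interval between consecutive samples is credited to the earlier sample's zone.
    for (t0, hr), (t1, _next_hr) in zip(pts, pts[1:]):
        dt = min(t1 - t0, MAX_GAP_S)  # clip pauses to 30 s
        zone = zone_for(hr)
        if zone is not None and dt > 0:
            totals[zone] += dt
    return totals


if __name__ == "__main__":
    # The 120 s hole between t=2 and t=122 counts as only 30 s of Z3.
    demo = [(0, 130), (1, 131), (2, 150), (122, 150), (123, 165)]
    print(tiz_from_samples(demo))
```

In the demo, the two-minute pause contributes 30 s instead of 120 s, which is the effect the clipping is meant to have on paused recordings.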

### Polarized split

```
easy     = Z1 + Z2    (recovery + easy aerobic)
moderate = Z3         (tempo)
hard     = Z4 + Z5    (threshold + VO₂)
```

Seiler's polarised target is roughly 80 % easy and 20 % moderate-plus-hard, with the moderate share kept small (< 10 %). The Dashboard shows your last-12-week split as tiles plus a stacked bar.

### Banister fitness / fatigue / form (PMC)

The standard endurance lens (TrainingPeaks PMC), in `banister(daily_load, ctl_tau=42, atl_tau=7)`:

```
CTL_today = CTL_yesterday · exp(−1/τ_CTL) + load_today · (1 − exp(−1/τ_CTL))    τ_CTL = 42d
ATL_today = ATL_yesterday · exp(−1/τ_ATL) + load_today · (1 − exp(−1/τ_ATL))    τ_ATL = 7d
TSB_today = CTL_yesterday − ATL_yesterday    # yesterday's values (TP convention)
```

`daily_training_load_series(conn, *, include_manual=True)` is the canonical input — it unions Garmin-recorded TL with `manual_activities`. Rest days are filled with 0; both EWMAs still update.

TSB interpretation (used across the app):

| TSB | meaning |
|---|---|
| < −30 | severely fatigued — injury risk |
| −10 to −30 | productive overload — heart of a build |
| −10 to 0 | balanced building |
| 0 to +10 | sharpening |
| **+10 to +25** | **fresh / peaked — race-day target** |
| > +25 | detrained (taper too long) |

`banister_forecast(history, future, *, today=None)` splices historical load with planned future load and runs the same recursion forward, so the Race-plan page can project CTL/ATL/TSB through race day. The planned-future series comes from `plan_to_daily_load(plan, *, tl_per_km, race_day_tl_per_km, race_dates)`, which converts your weekly plan rows into per-day load using a long-run-Saturday-heavy distribution (Sat 40 %, Tue 20 %, Thu 20 %, Mon 10 %, Wed 10 %), with race weeks treating race day specially.

`calibrate_tl_per_km(conn)` reports the empirical median + IQR of `training_load / km` from your history — that's the number to feed into the plan, rather than a hard-coded constant.
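
The CTL/ATL/TSB recursion above can be sketched in a few lines of plain Python. This illustrates the formulas only and is not the package's `banister` implementation: the function name `banister_sketch`, the list-of-floats input, and the zero starting values for CTL and ATL are assumptions.

```python
import math
from typing import List, Tuple


def banister_sketch(
    daily_load: List[float], ctl_tau: float = 42.0, atl_tau: float = 7.0
) -> List[Tuple[float, float, float]]:
    """Return (CTL, ATL, TSB) per day; TSB uses the previous day's CTL and ATL."""
    k_ctl = math.exp(-1.0 / ctl_tau)  # daily decay factor for fitness
    k_atl = math.exp(-1.0 / atl_tau)  # daily decay factor for fatigue
    ctl = atl = 0.0                   # assumed cold start (hypothetical seeding)
    out: List[Tuple[float, float, float]] = []
    for load in daily_load:           # rest days arrive as 0.0 and still decay both
        tsb = ctl - atl               # yesterday's values, taken before today's update
        ctl = ctl * k_ctl + load * (1.0 - k_ctl)
        atl = atl * k_atl + load * (1.0 - k_atl)
        out.append((ctl, atl, tsb))
    return out


if __name__ == "__main__":
    for day, (ctl, atl, tsb) in enumerate(banister_sketch([100.0, 0.0, 100.0, 0.0]), 1):
        print(f"day {day}: CTL={ctl:.1f} ATL={atl:.1f} TSB={tsb:+.1f}")
```

Because ATL reacts about six times faster than CTL, a single hard day pushes the next day's TSB negative, and a cold start at zero understates fitness until several weeks of history have accumulated.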

### Personal records

`personal_records(activities, distance_bins_km=(5, 10, 21.0975, 42.195, 50), tolerance=0.05)` picks the fastest run within ±5 % of each bin distance.

### GPS route clustering

`cluster_routes(lats, lons, radius_km=0.25)` — greedy haversine-radius clustering of run starts. Per-route pace trends control for terrain (the same loop in 2023 vs 2025 says more about fitness than raw monthly pace). Adequate for hundreds of starts; the README's old TODO for DBSCAN at thousands is still open.

---

## 4 — Reference notebooks (kept around for power-user analysis)

Each notebook under [examples/notebooks/](examples/notebooks/) bootstraps the same way:

```python
from openrun import open_conn, load_activities, ...
conn = open_conn()
```

| # | Notebook | Covers | Web equivalent |
|---|---|---|---|
| 01 | Overview | Data coverage, recent activities | Dashboard |
| 02 | Running | Weekly volume, pace-HR, PMC, PRs | Dashboard + Activities |
| 03 | Recovery | Sleep composition, HRV, RHR, body battery | Recovery |
| 04 | Efficiency | YoY m/beat | Efficiency |
| 05 | Intra-run | Decoupling, cadence, route clusters, TIZ | Activity detail |
| 06 | Race plan | Periodised plan + projected PMC | Race plan |

Notebooks 05 and 06 are generated from `_build_05.py` / `_build_06.py` build scripts — edit those, then `uv run python examples/notebooks/_build_NN.py` regenerates the .ipynb. They're kept as a reference for the math the web app implements.

---

## 5 — Setup

```bash
uv sync               # creates .venv from pyproject.toml
uv run openrun-web    # launch the web app
```

The first time you open the app, a setup wizard collects your profile + HR zones, optionally takes a race calendar, and offers to ingest a Garmin export zip in-browser. Your config is written to `openrun.toml` in the current directory — edit by hand any time.

If you'd rather work from the CLI directly:

```bash
uv run openrun-init          # bootstrap (no input required)
uv run openrun-ingest
# or
uv run openrun-auth          # OAuth login (Path B)
uv run openrun-sync --full   # 365-day backfill
```

`garth` (the third-party library behind live sync) is deprecated upstream (). If the live sync ever stops working, fall back to Path A.

---

## 6 — Configuration

[openrun.toml](openrun.toml) — created by the setup wizard, hand-editable.

```toml
[user]
name = "Athlete"
hr_max = 200
lthr = 175
resting_hr = 50

[user.hr_zones]
Z1 = [100, 120]
Z2 = [121, 140]
Z3 = [141, 160]
Z4 = [161, 180]
Z5 = [181, 200]

[db]
path = "data/garmin.db"

[banister]
ctl_tau_days = 42.0
atl_tau_days = 7.0
race_day_tl_per_km = 7.0

[[races]]
label = "wk 4 — 30K"
date = "2026-06-13"
```

Config discovery walks up from cwd looking for `openrun.toml`, falls back to `$OPENRUN_CONFIG`, then `~/.config/openrun/config.toml`, then built-in defaults.

---

## 7 — Tests

```bash
uv run pytest
```

[tests/unit/](tests/unit/) is the pure-function surface (loaders, derivations, helpers). [tests/integration/](tests/integration/) covers the ingest paths and cross-cutting behaviour (the `tmp_conn` fixture in [tests/conftest.py](tests/conftest.py) builds an in-memory schema). The Streamlit pages are smoke-tested via `streamlit.testing.v1.AppTest`; the test invocations are listed in [ROADMAP.md](ROADMAP.md).