openrun/ROADMAP.md
2026-06-12 05:48:30 -04:00

openrun — roadmap

Living backlog. Items are grouped by phase; within a phase, top to bottom is rough priority. Edit this file as we go — items removed because they turned out to be misguided are more valuable signal than items completed.


Done — highlights from past sessions

The major arcs that have already landed (see git history for per-commit detail):

Project structure & test scaffolding

  • src/openrun/ package split out: config (UserProfile / HRZones / BanisterParams + TOML reader & writer), db, model, plots, setup, and ingest/* (auth, garmin_api, garmin_export, fit_linker, manual, time_in_zone).
  • openrun.toml is the single source of user-specific config — no athlete numbers anywhere in src/openrun/.
  • pytest + pytest-cov in [dependency-groups.dev], tests/unit/ and tests/integration/, tmp_conn fixture for in-memory SQLite. 88 tests passing at last count.

Ingest correctness & robustness

  • fit_linker.record_link + relink — absolute paths stored at link time; openrun-link-fit <new_root> --relink rewrites the table after an export moves. _resolve_fit_path is now an O(1) existence check with a clear "run --relink" hint on miss.
  • link() accepts an injected fit_iter for unit testing without real FIT files; warn-and-skip on activity-id collision.
  • handle_fit skips Takeout-style filenames cleanly and surfaces a "run openrun-link-fit" hint in the summary.
  • Schema round-trip tests: ingest JSON → DB row → loader → DataFrame for activities + 7 wellness tables.
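
A minimal sketch of the relink idea described above, not the project's actual code. The table name activity_fit_files appears elsewhere in this file, but the column names (activity_id, fit_path) and the match-by-filename strategy are assumptions made for illustration.

```python
import sqlite3
from pathlib import Path


def relink(conn, new_root):
    """Rewrite stored FIT paths to files found under new_root (matched by filename)."""
    new_root = Path(new_root).resolve()
    # Index the new export once so each stored row is a dictionary lookup.
    by_name = {p.name: p for p in new_root.rglob("*.fit")}
    rows = conn.execute("SELECT activity_id, fit_path FROM activity_fit_files").fetchall()
    updated = 0
    for activity_id, old_path in rows:
        candidate = by_name.get(Path(old_path).name)
        if candidate is not None and str(candidate) != old_path:
            conn.execute(
                "UPDATE activity_fit_files SET fit_path = ? WHERE activity_id = ?",
                (str(candidate), activity_id),
            )
            updated += 1
    return updated


def resolve_fit_path(conn, activity_id):
    """Single existence check on the stored absolute path, with a relink hint on miss."""
    row = conn.execute(
        "SELECT fit_path FROM activity_fit_files WHERE activity_id = ?", (activity_id,)
    ).fetchone()
    if row is None:
        raise LookupError(f"no FIT file linked for activity {activity_id}")
    path = Path(row[0])
    if not path.exists():
        raise FileNotFoundError(
            f"{path} is missing; run openrun-link-fit <new_root> --relink"
        )
    return path
```

Storing absolute paths keeps the lookup to one stat call per activity; the cost is that a moved export needs an explicit relink pass like the one sketched here.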

Derived metrics & helpers

  • banister_forecast(history, future, *, today=None) — splices historical load with planned future load and runs the same EWMA; load-bearing splice invariant tested.
  • plan_to_daily_load(plan, *, tl_per_km, race_day_tl_per_km, race_dates) — converts weekly km plan rows into daily load with race-week-aware distribution.
  • calibrate_tl_per_km(conn) — empirical median/IQR of TL/km from history, replaces hard-coded constants in race-plan flows.
  • personal_records(activities, distance_bins_km, tolerance) — fastest run within ±tolerance of each bin.
  • weekly_time_in_zone(conn) — ISO-week pivot of the cached TIZ table.
  • load_sleep_stages(conn) — deep/light/rem/awake seconds + percentages, with NaN-aware invariant (present-stage pcts sum to 1).
  • plot_fit_decoupling(records, *, segments) — new openrun.plots submodule (lazy matplotlib import).
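
The splice invariant behind banister_forecast can be sketched as follows. This is an illustration under assumptions, not the project's implementation: the function names, the 42-day and 7-day time constants (conventional defaults, where the real values would come from BanisterParams), and the TSB = CTL - ATL convention are all placeholders.

```python
import math


def ewma(load, tau_days, start=0.0):
    """Exponentially weighted moving average with time constant tau_days."""
    alpha = 1.0 - math.exp(-1.0 / tau_days)
    level = start
    out = []
    for x in load:
        level += alpha * (x - level)
        out.append(level)
    return out


def banister_forecast_sketch(history, future, *, ctl_tau=42.0, atl_tau=7.0):
    """Splice history and planned load, then run one EWMA over the whole series."""
    spliced = list(history) + list(future)
    ctl = ewma(spliced, ctl_tau)  # chronic load ("fitness")
    atl = ewma(spliced, atl_tau)  # acute load ("fatigue")
    tsb = [c - a for c, a in zip(ctl, atl)]  # assumed same-day convention
    return ctl, atl, tsb


history = [50.0, 0.0, 80.0, 60.0]
future = [40.0, 40.0, 0.0]
ctl, atl, tsb = banister_forecast_sketch(history, future)

# Splice invariant: the historical part of the forecast equals the EWMA
# computed on history alone, because the future cannot affect the past.
assert ctl[: len(history)] == ewma(history, 42.0)
```

The same ewma helper also gives the closed-form checks mentioned under Test conventions: a unit impulse decays as alpha * exp(-k / tau), and a unit step rises as 1 - exp(-(k + 1) / tau).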

Per-second data from the live API (Path B)

  • openrun-sync downloads each new activity's original FIT via /download-service/files/activity/{id}, stores it in data/fit/<id>.fit, and links it through fit_linker.record_link — so decoupling, FIT-based TIZ, and the route map work without a website export.
  • _extract_fit_bytes handles both the zip-wrapped and bare-FIT responses; download_fit is idempotent (skips when on-disk + linked); backfill_fits pulls per-second history for past activities (--fit-backfill [--fit-type] [--fit-limit]).
  • Tested at the garth.download boundary (tests/unit/test_fit_download.py); network call mocked per ROADMAP conventions.
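
One way to handle both response shapes is sketched below. The function name mirrors _extract_fit_bytes, but the body is an assumption for illustration; the real helper may detect the formats differently. The check relies on the FIT file header carrying the ASCII signature ".FIT" at byte offsets 8 to 11.

```python
import io
import zipfile


def extract_fit_bytes(payload: bytes) -> bytes:
    """Return raw FIT bytes from either a zip-wrapped or a bare-FIT payload."""
    if zipfile.is_zipfile(io.BytesIO(payload)):
        with zipfile.ZipFile(io.BytesIO(payload)) as zf:
            fit_names = [n for n in zf.namelist() if n.lower().endswith(".fit")]
            if not fit_names:
                raise ValueError("zip archive contains no .fit member")
            # Assumption: one activity per archive, so take the first match.
            return zf.read(fit_names[0])
    # Bare FIT: the header stores ".FIT" at bytes 8..11.
    if len(payload) >= 12 and payload[8:12] == b".FIT":
        return payload
    raise ValueError("payload is neither a zip archive nor a FIT file")
```

Rejecting anything that is neither format keeps an HTML error page from being written to data/fit/ as if it were an activity.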

Off-watch volume integration

  • New manual_activities table + openrun-import-manual <csv> CLI + web form on the Manual log page.
  • daily_training_load_series(..., include_manual=True) unions both sources via UNION ALL; same-day rows sum.
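
The union-then-sum behaviour can be sketched with plain SQLite. The column names (day, training_load) and the simplified two-column tables are assumptions; only the manual_activities table name and the include_manual flag come from the notes above.

```python
import sqlite3


def daily_load(conn, *, include_manual=True):
    """Sum training load per day across watch and (optionally) manual activities."""
    sources = ["SELECT day, training_load FROM activities"]
    if include_manual:
        sources.append("SELECT day, training_load FROM manual_activities")
    sql = (
        "SELECT day, SUM(training_load) AS total_load FROM ("
        + " UNION ALL ".join(sources)
        + ") GROUP BY day ORDER BY day"
    )
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE activities (day TEXT, training_load REAL);
    CREATE TABLE manual_activities (day TEXT, training_load REAL);
    INSERT INTO activities VALUES ('2026-06-01', 80.0), ('2026-06-02', 30.0);
    INSERT INTO manual_activities VALUES ('2026-06-02', 30.0), ('2026-06-03', 20.0);
    """
)
rows = daily_load(conn)
# [('2026-06-01', 80.0), ('2026-06-02', 60.0), ('2026-06-03', 20.0)]
```

UNION ALL matters here: a plain UNION would collapse the two identical ('2026-06-02', 30.0) rows into one and under-report that day's load.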

Web app (Streamlit)

  • openrun-web launches a multipage browser UI on localhost:8501.
  • Pages: Home, Dashboard, Activities (with row-click drill-in to Activity detail), Race plan (editable + projected PMC), Manual log, Recovery, Efficiency, Sync (in-browser zip ingest), Welcome wizard.
  • st.navigation-driven sidebar: pages grouped into sections; Welcome only appears when openrun.toml is missing or the DB is empty.
  • First-run wizard writes openrun.toml, runs init_workspace, and offers to ingest a zip in the same flow — no terminal needed.

Phase 1 — code gaps still open

1.2 Unhandled Takeout JSON categories

The Takeout dump includes several JSON categories we currently mark unrecognized: TrainingReadinessDTO, EnduranceScore, HillScore, RunRacePredictions (skip HydrationLog — low value).

  • Priority: TrainingReadinessDTO first (Garmin's own readiness score — useful as a sanity check against derived TSB) and RunRacePredictions (a free baseline for the projected-PMC plan).
  • Each needs a new SQLite table + dispatch entry + a fixture JSON pulled from a real Takeout dump.
  • Test plan: per category, test_garmin_export.py::test_handle_<category> does fixture-driven insert; one schema-roundtrip test per table.
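
The "table + dispatch entry" shape could look roughly like this. Everything here is hypothetical: the registry, the decorator, the training_readiness table, and especially the JSON field names (calendarDate, score), which would need to be checked against a real Takeout fixture before use.

```python
import json
import sqlite3

HANDLERS = {}  # hypothetical registry: category substring -> handler


def handles(category):
    """Register a handler for Takeout files whose name contains `category`."""
    def register(fn):
        HANDLERS[category] = fn
        return fn
    return register


@handles("TrainingReadinessDTO")
def handle_training_readiness(conn, records):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS training_readiness "
        "(day TEXT PRIMARY KEY, score INTEGER)"
    )
    # Field names below are placeholders, not confirmed Takeout keys.
    conn.executemany(
        "INSERT OR REPLACE INTO training_readiness VALUES (?, ?)",
        [(r["calendarDate"], r["score"]) for r in records],
    )
    return len(records)


def dispatch(conn, filename, payload):
    """Route a JSON payload to its handler; return None if the category is unrecognized."""
    for category, handler in HANDLERS.items():
        if category in filename:
            return handler(conn, json.loads(payload))
    return None
```

With this shape, each new category is one decorated function plus one fixture, which lines up with the per-category test plan.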

1.6 Schema round-trip leftovers

activity_splits, activity_fit_files, and activity_time_in_zone aren't reached through a single Takeout JSON handler (splits are sync-only, FITs are linker-driven, TIZ is precomputed). Each needs a different fixture/path. The existing helper-level tests in test_fit_linker.py and test_weekly_tiz.py cover most of what these would, but a true round-trip test would close the gap.


Phase 2 — analytical features still open

2.1 DBSCAN route clustering at scale

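The greedy haversine clusterer is fine for hundreds of starts; switch to sklearn.cluster.DBSCAN(metric='haversine') once a dataset reaches thousands. Note that sklearn's haversine metric expects [lat, lon] in radians, so eps must also be an angular distance (km divided by Earth's radius). Add cluster_routes(..., method='dbscan') as an alternate path; keep greedy as the baseline.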

  • Adds dependency: scikit-learn. Defer until there's a concrete dataset that needs it.
  • Test plan: test_geo.py::test_haversine_known_pairs (two cities vs published value within 0.1 km) and test_cluster_routes_greedy_vs_dbscan_agree on a synthetic 200-point dataset.
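
For reference, a dependency-free sketch of the greedy baseline and the distance it rests on. The function names, the 0.2 km default radius, and the first-seed-wins assignment rule are assumptions; the project's clusterer may differ. The last helper shows the unit conversion a DBSCAN path would need, since sklearn's haversine metric works in radians.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def cluster_greedy(points, radius_km=0.2):
    """Assign each (lat, lon) to the first cluster seed within radius_km."""
    seeds, labels = [], []
    for lat, lon in points:
        for idx, (slat, slon) in enumerate(seeds):
            if haversine_km(lat, lon, slat, slon) <= radius_km:
                labels.append(idx)
                break
        else:
            # No seed close enough: this point starts a new cluster.
            seeds.append((lat, lon))
            labels.append(len(seeds) - 1)
    return labels


def dbscan_eps_radians(radius_km):
    """Convert a km radius to the angular eps a haversine-metric DBSCAN expects."""
    return radius_km / EARTH_RADIUS_KM


starts = [(52.0, 4.0), (52.0005, 4.0), (48.0, 2.0)]
labels = cluster_greedy(starts)  # [0, 0, 1]
```

The greedy pass compares every point against every seed, which is why it is fine for hundreds of starts and worth replacing at thousands.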

Phase 3 — web app polish (post-MVP)

The web app covers the full workflow but a few rough edges are worth picking off when there's time.

3.1 Browser-driven Garmin live sync — done

garth logs in by scripting Garmin's email/password/MFA web login rather than using a real OAuth flow. The Sync page now drives the full multi-step flow in the browser: email/password → (if required) MFA code → token store in .secrets/ → 🔄 Sync now (activities + per-second FIT + wellness), with a log-out that forgets tokens.

  • openrun.ingest.auth gained web-friendly, input()-free helpers: has_tokens, resume, current_user, begin_login (returns ("ok", user) or ("needs_mfa", state)), complete_mfa. The MFA client_state is held in st.session_state between the password and code steps.
  • openrun.ingest.garmin_api.run_sync is the auth-free orchestrator shared by the CLI main() and the Sync page; it streams step progress via a progress callback.
  • Done in tests: test_auth.py (token-store helpers + no-MFA and MFA-required login paths, mocked at garth.sso), test_run_sync.py (orchestration contract).
  • Possible follow-up: a Streamlit AppTest smoke test for the page (the repo has none yet); auto-run openrun-time-in-zone after a sync that pulled new FITs so the TIZ cache never lags.
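
The two-step login can be read as a small state machine. The sketch below assumes begin_login(email, password) and complete_mfa(state, code) call shapes; only the return tuples ("ok", user) and ("needs_mfa", state) come from the notes above, and login_step itself is a hypothetical helper. A plain dict stands in for st.session_state.

```python
def login_step(session, begin_login, complete_mfa, *, email=None, password=None, code=None):
    """Advance the login flow by one step; return the user once logged in, else None."""
    pending = session.get("mfa_state")
    if pending is not None:
        if code is None:
            return None  # still waiting for the user to enter the MFA code
        user = complete_mfa(pending, code)
        # Clear the pending state only after the code was accepted.
        del session["mfa_state"]
        session["user"] = user
        return user

    status, value = begin_login(email, password)
    if status == "needs_mfa":
        session["mfa_state"] = value  # keep the client state for the next rerun
        return None
    session["user"] = value
    return value
```

Passing the auth functions in as arguments keeps the step logic testable without mocking garth at all.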

3.2 Route map on Activity detail

Each activity with linked FIT has per-second position_lat/long (semicircles). Render the route on a map (pydeck or folium), highlight HR-zone or pace via segment colouring. Pure visualisation; the data is already there.

  • Adds dependency: pydeck or folium.
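
Whichever map library is chosen, the semicircle values need converting to degrees first. A minimal sketch, assuming records arrive as dicts keyed by the FIT field names position_lat and position_long (the function names are illustrative):

```python
SEMICIRCLE_TO_DEG = 180.0 / 2**31  # 2**31 semicircles span 180 degrees


def semicircles_to_degrees(value):
    """Convert a FIT semicircle coordinate to decimal degrees."""
    return value * SEMICIRCLE_TO_DEG


def route_degrees(records):
    """Return [(lat, lon)] in degrees, skipping records without a GPS fix."""
    points = []
    for rec in records:
        lat = rec.get("position_lat")
        lon = rec.get("position_long")
        if lat is None or lon is None:
            continue
        points.append((semicircles_to_degrees(lat), semicircles_to_degrees(lon)))
    return points


semicircles_to_degrees(2**30)  # 90.0
```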

3.3 Activity-detail polish

  • Heart rate over time (5-min rolling), with zone bands shaded.
  • Splits sortable.
  • "Compare against another activity" — overlay this run's pace/HR curve on a prior run of similar distance.

3.4 Race-plan UX

  • Inline race-week badges in the data-editor table.
  • "Suggest a plan" button that auto-generates a ramp given the race calendar + current CTL.
  • Save vs Reset — currently the editor blows away the plan table on every save.

3.5 Multi-athlete support

The config already accepts a per-user profile. Promoting openrun.toml to allow multiple [athletes.<name>] blocks + a profile selector in the sidebar is straightforward but premature for the single-user case the README is written for.


Phase 4 — distribution

When the tool is solid enough to share with non-Python folks:

  • One-command installer. A make install / ./run.sh that handles uv sync + first launch. The only terminal step left after the wizard.
  • Electron/Tauri desktop wrap. Bundle a Python runtime + the wheel + a thin native shell that opens the Streamlit server in a webview. Mechanical; the HTTP server design is already wrap-ready.
  • Public release. PyPI, GitHub releases, screenshots in README. Pick an actual name (openrun is provisional — squat-check before shipping).

Test conventions

  • Layout:
    • tests/unit/ — pure-function metrics (model.py public surface). One file per concept: test_banister.py, test_race_plan.py, test_sleep_stages.py, etc.
    • tests/integration/ — ingest pipelines, loaders, cross-cutting flows.
    • tests/fixtures/ — pinned JSON / FIT / CSV samples. Anonymised, small.
  • Granularity: one assertion per test when reasonable; closed-form math checks for derivations (impulse response, step response).
  • What we don't test:
    • garth.connectapi — network-dependent, fragile, upstream-deprecated. Mock at _safe_call if we ever need to.
    • Notebook content. The build scripts are easier to read than the generated JSON; we test what they import.
    • Streamlit visual output. The intended coverage is an AppTest smoke test per page (zero exceptions on a real DB), which the repo does not have yet; we don't visual-regress.
  • Run: uv run pytest. pytest-cov is in dev deps; coverage isn't a target.

Working agreement

When picking up an item: write the failing test first against the API the test plan describes, then make it pass. If the test plan turns out to be wrong (the function shouldn't behave that way after all), update this file in the same PR. Items removed because they turned out to be misguided are more valuable signal than items completed.