Quality Assessment -- 2026-04-11

Score: 78.3 / 100 -- Grade: B

Baseline assessment. Strong architecture and documentation (9.0 each) and a solid test suite (240/240 passing), held back by lint debt (89 ruff violations) and type errors (49 under mypy --strict). Clear path to B+ with low-effort fixes.

| Dimension | Score |
|---|---|
| Test Health | 8.0 |
| Type Safety | 6.5 |
| Lint Hygiene | 6.0 |
| Architecture | 9.0 |
| Documentation | 9.0 |
| Complexity | 7.0 |
| Security | 8.5 |
| Maintainability | 8.5 |

Version: 0.1.0
Assessed by: Claude Opus 4.6
Previous assessment: None (baseline)


Inventory

| Metric | Value |
|---|---|
| Source files | 72 |
| Source lines | 10,325 |
| Test files | 30 |
| Test lines | 2,736 |
| Test:source ratio | 0.26 |
| Direct dependencies | 10 core + 1 optional + 4 dev |

Raw Metrics

Test Suite

240 passed, 7 warnings in 9.15s
  • Tests collected: 240
  • Tests passed: 240
  • Tests failed: 0
  • Test duration: 9.15s

Type Safety (mypy --strict)

Found 49 errors in 20 files (checked 72 source files)
  • Total errors: 49
  • Files with errors: 20 / 72 (72% clean)
  • Top error categories:
    • [type-arg] 17 -- missing generic parameters on dict, list
    • [attr-defined] 9 -- attribute access on loosely typed objects
    • [no-any-return] 5 -- returning Any from typed functions
    • [var-annotated] 3
    • [import-untyped] 3
    • [assignment] 3
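The [type-arg] category is usually the cheapest to clear: under mypy --strict, a bare `dict` or `list` annotation is reported as missing its type parameters. A minimal before/after sketch follows; the function and channel names are hypothetical and not taken from the impakt source.

```python
# Before (reported under --strict as [type-arg], missing type parameters):
#     def peak_values(samples: dict) -> list: ...

# After: the generics are spelled out, so the checker knows what is inside.
def peak_values(samples: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Return (channel name, peak absolute value) pairs, sorted by name."""
    return [
        (name, max(abs(v) for v in values))
        for name, values in sorted(samples.items())
        if values  # skip empty channels; max() would raise on them
    ]


peaks = peak_values({"HEAD_ACC_X": [0.5, -3.2, 1.1], "CHEST_ACC_X": [2.0, -1.0]})
print(peaks)
```

The change is annotation-only, so runtime behaviour is unaffected.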

Lint (ruff)

Found 89 errors.
[*] 73 fixable with the `--fix` option (9 hidden fixes can be enabled with the `--unsafe-fixes` option).
  • Total violations: 89
  • Auto-fixable: 73 (82%)
  • Top violation rules:
    • F401 61 -- unused imports
    • I001 9 -- unsorted imports
    • F601 8 -- duplicate dictionary keys
    • E501 5 -- line too long
    • F841 3 -- unused variables
    • F541 1 -- f-string without placeholder
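Of these rules, F601 is the one most likely to hide a real bug: when a dictionary literal repeats a key, Python keeps only the last value and raises no error. The sketch below shows that behaviour and a rough AST-based finder for repeated literal keys. It is an illustration of what the rule looks for, not ruff's implementation, and the helper name and sample mapping are hypothetical.

```python
import ast

# The later value silently replaces the earlier one.
assert {"a": 1, "a": 2} == {"a": 2}


def duplicate_literal_keys(source: str) -> list[tuple[int, object]]:
    """Return (line number, key) for each repeated constant key in a dict literal."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Dict):
            continue
        seen = set()
        for key in node.keys:
            # key is None for ``**other`` unpacking; only constants are compared.
            if isinstance(key, ast.Constant):
                if key.value in seen:
                    found.append((key.lineno, key.value))
                seen.add(key.value)
    return found


# Hypothetical lookup table with one repeated key (on line 5 of the string).
SAMPLE = '''
LOOKUP = {
    "HEAD_ACC_X": "Head acceleration X",
    "CHEST_ACC_X": "Chest acceleration X",
    "HEAD_ACC_X": "Head accel. X (duplicate)",
}
'''

print(duplicate_literal_keys(SAMPLE))
```

Each hit needs a manual decision about which value is the intended one, which is why these violations are not covered by the safe auto-fixes.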

Complexity

  • File size (lines): min=1 / median=133 / mean=143 / max=693
  • Files >300 lines: 6 / 72
  • High-complexity files (branch density >15):
  80  src/impakt/io/mme.py (693 lines)           -- ISO 13499 parser, justified
  44  src/impakt/web/components/criteria.py (343) -- UI assembly with protocol logic
  30  src/impakt/channel/model.py (456)           -- core data model, multiple classes
  27  src/impakt/web/state.py (274)               -- app state with multi-test support
  27  src/impakt/protocol/euro_ncap.py (238)      -- sliding-scale scoring tables
  25  src/impakt/web/callbacks/plot_callbacks.py (249) -- transform pipeline orchestration
  21  src/impakt/protocol/iihs.py (180)           -- G/A/M/P rating logic
  20  src/impakt/plot/engine.py (257)             -- Plotly rendering with corridors
  19  src/impakt/script/cli.py (140)              -- CLI arg parsing
  17  src/impakt/web/components/channel_grid.py (368) -- DataTable assembly
  16  src/impakt/web/callbacks/channel_callbacks.py (195) -- selection/filter callbacks

Documentation

  • Docstring coverage: 414 / 454 definitions (91%)
  • Modules with __all__: 6 / 11 public modules
  • README: 1,266 lines with 16+ Mermaid diagrams
  • Architectural diagrams: yes

Security

  • eval/exec (sandboxed): 1 -- math_expr.py:151, restricted builtins {} + token blocklist
  • eval/exec (unsandboxed): 0
  • subprocess: 0
  • Hardcoded secrets: 0
  • Bare except: 0

Maintainability

  • TODO: 0
  • FIXME: 0
  • HACK: 0
  • Logging calls: 48
  • try/except blocks: 52
  • Internal imports (coupling): 190

Scorecard

| # | Dimension | Weight | Score | Weighted | Justification |
|---|---|---|---|---|---|
| 1 | Test Health | 20% | 8.0 | 16.0 | 240/240 pass. Test:source ratio 0.26 (below the 0.5 ideal). Integration tests with 5 real datasets. No coverage % configured. |
| 2 | Type Safety | 15% | 6.5 | 9.8 | mypy strict enabled (strong). 49 errors remain, concentrated in web layer. The largest category, type-arg (17 of 49), is cosmetic. |
| 3 | Lint Hygiene | 10% | 6.0 | 6.0 | 89 violations but 82% auto-fixable. Dominated by unused imports (F401). 8 duplicate dict keys need manual fix. |
| 4 | Architecture | 15% | 9.0 | 13.5 | Clean 4-layer design. Plugin system. Immutable data patterns. 6/11 modules export `__all__`. No layer violations detected. |
| 5 | Documentation | 10% | 9.0 | 9.0 | 91% docstring coverage. Comprehensive README with Mermaid diagrams. STATUS.md. No generated API reference. |
| 6 | Complexity | 10% | 7.0 | 7.0 | Median file size 133 lines (healthy). 6 files >300 lines. mme.py (693 lines, branch density 80) is the outlier -- justified as a format parser. |
| 7 | Security | 10% | 8.5 | 8.5 | Single eval with `{"__builtins__": {}}` + blocklist. No subprocess, no secrets. AST-based eval would eliminate residual risk. |
| 8 | Maintainability | 10% | 8.5 | 8.5 | Zero debt markers. Zero bare excepts. Consistent logging. Modern tooling (uv, hatchling, ruff, mypy). |

Composite Score: 78.3 / 100

Grade: B
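The composite is the weighted sum of the eight dimension scores. The sketch below reproduces the arithmetic from the scorecard values. The exact sum is 78.25, so half-up rounding to one decimal is assumed here in order to match the reported 78.3; the report does not state its rounding rule.

```python
from decimal import Decimal, ROUND_HALF_UP

# dimension -> (weight in percent, score out of 10), copied from the scorecard
SCORECARD = {
    "Test Health": (20, "8.0"),
    "Type Safety": (15, "6.5"),
    "Lint Hygiene": (10, "6.0"),
    "Architecture": (15, "9.0"),
    "Documentation": (10, "9.0"),
    "Complexity": (10, "7.0"),
    "Security": (10, "8.5"),
    "Maintainability": (10, "8.5"),
}


def composite(scorecard: dict[str, tuple[int, str]]) -> Decimal:
    """Weighted sum on a 0-100 scale: weight% * score / 10 per dimension."""
    total = sum(Decimal(weight) * Decimal(score) / 10 for weight, score in scorecard.values())
    # Assumption: round half up to one decimal place.
    return total.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


print(composite(SCORECARD))
```

Decimal arithmetic is used so the tie at 78.25 is rounded deterministically rather than depending on binary floating-point representation.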


Delta from Previous Assessment

| Dimension | Previous | Current | Change |
|---|---|---|---|
| Test Health | -- | 8.0 | -- |
| Type Safety | -- | 6.5 | -- |
| Lint Hygiene | -- | 6.0 | -- |
| Architecture | -- | 9.0 | -- |
| Documentation | -- | 9.0 | -- |
| Complexity | -- | 7.0 | -- |
| Security | -- | 8.5 | -- |
| Maintainability | -- | 8.5 | -- |
| Composite | -- | 78.3 | baseline |

Top Improvements Since Last Assessment

Baseline assessment -- no prior data.


Recommended Actions

| # | Action | Effort | Impact | Dimensions Affected |
|---|---|---|---|---|
| 1 | Run `uv run ruff check --fix src/` to clear 73 auto-fixable violations | 1 min | +2.0 lint | Lint |
| 2 | Fix 8 duplicate dict keys in channel/lookup.py (F601) | 10 min | +0.5 lint | Lint |
| 3 | Add `--cov --cov-report=term` to pytest, target 80%+ | 30 min | +1.0 test | Test Health |
| 4 | Resolve 17 [type-arg] mypy errors (add `dict[str, X]` generics) | 1 hr | +1.0 type | Type Safety |
| 5 | Add `__all__` to io, plugin, report, script, template | 30 min | +0.5 arch | Architecture |

Projected score after all 5 actions: ~83 (B+)
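The ~83 projection matches the baseline plus the sum of the Impact column if those values are read as composite points. That reading is an assumption, since the report does not say which scale the column uses. The sketch below shows it next to the alternative reading, in which each impact is a change to the 0-10 dimension score and is scaled by that dimension's weight.

```python
from decimal import Decimal

BASELINE = Decimal("78.3")

# (dimension, impact as listed, dimension weight in percent)
ACTIONS = [
    ("Lint", Decimal("2.0"), 10),
    ("Lint", Decimal("0.5"), 10),
    ("Test Health", Decimal("1.0"), 20),
    ("Type Safety", Decimal("1.0"), 15),
    ("Architecture", Decimal("0.5"), 15),
]

# Reading 1 (assumed by the ~83 figure): impacts are composite points.
as_composite = BASELINE + sum(impact for _, impact, _ in ACTIONS)

# Reading 2: impacts are dimension-score points, scaled by weight% / 10.
as_dimension = BASELINE + sum(impact * weight / 10 for _, impact, weight in ACTIONS)

print(as_composite, as_dimension)
```

Under the first reading the projection is 83.3; under the second it would be about 85, so it is worth stating the scale explicitly in future assessments.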


Notes

  • Architecture score is qualitative. Import graph was inspected: no layer violations found (data layer does not import from web/plot). The web module correctly sits at the top of the dependency tree.
  • Complexity scoring for mme.py was lenient because format parsers inherently have high branch density. If it grows beyond ~800 lines, consider extracting sub-parsers.
  • The eval in math_expr.py is sandboxed (empty __builtins__, token blocklist for import, exec, eval, subprocess, os, sys, __). An AST-based evaluator would be safer but is lower priority given the blocklist approach.
  • Test:source ratio of 0.26 is common for alpha-stage projects with stable computation modules. The ratio should improve as edge-case tests are added. Protocol and criteria modules are the highest-value targets for additional test coverage.
  • Unused imports (F401) account for 69% of all lint violations. These are likely remnants from refactoring and are trivially fixable.
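The AST-based evaluator mentioned in the note on math_expr.py could look roughly like the sketch below. It is not the project's implementation: `safe_eval`, its parameters, and the set of supported operators are hypothetical. The idea is to walk the parsed expression and allow only an explicit list of node types, instead of blocking known-bad tokens.

```python
import ast
import operator
from typing import Mapping, Optional

# Allowlisted operators. Exponentiation is deliberately left out here,
# because very large powers can consume unbounded time and memory.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_eval(expr: str, variables: Optional[Mapping[str, float]] = None) -> float:
    """Evaluate an arithmetic expression using only allowlisted AST nodes."""
    names = variables or {}

    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.Name) and node.id in names:
            return names[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        # Calls, attribute access, subscripts, unknown names, etc. end up here.
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return _eval(ast.parse(expr, mode="eval"))


print(safe_eval("2 * (x + 1)", {"x": 3.0}))
```

Because anything outside the allowlist is rejected, an input such as `__import__('os')` fails with `ValueError` before any code runs, without relying on a token blocklist.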