Added QA process - subagent, md template, scorecards.
docs/QA-2026-04-11_0528.md
@@ -0,0 +1,210 @@
# Quality Assessment -- 2026-04-11

> **Score: 78.3 / 100 -- Grade: B**
> No code changes since baseline. All metrics identical. Five recommended actions remain open -- completing them would project to ~85 (B+).

| Dimension | Score |
|-----------|-------|
| Test Health | 8.0 |
| Type Safety | 6.5 |
| Lint Hygiene | 6.0 |
| Architecture | 9.0 |
| Documentation | 9.0 |
| Complexity | 7.0 |
| Security | 8.5 |
| Maintainability | 8.5 |

**Version:** 0.1.0
**Assessed by:** Claude Opus 4.6
**Previous assessment:** QA-2026-04-11_0459.md

---

## Inventory

| Metric | Value |
|--------|-------|
| Source files | 72 |
| Source lines | 10,325 |
| Test files | 30 |
| Test lines | 2,736 |
| Test:source ratio | 0.26 |
| Direct dependencies | 10 core + 1 optional + 4 dev |

---

## Raw Metrics

### Test Suite

```
240 passed, 7 warnings in 9.26s
```

- Tests collected: 240
- Tests passed: 240
- Tests failed: 0
- Test duration: 9.26s

### Type Safety (mypy --strict)

```
Found 49 errors in 20 files (checked 72 source files)
```

- Total errors: 49
- Files with errors: 20 / 72 (72% clean)
- Top error categories:
  - `[type-arg]` 17 -- missing generic parameters on `dict`, `list`
  - `[attr-defined]` 9 -- attribute access on loosely typed objects
  - `[no-any-return]` 5 -- returning `Any` from typed functions
  - `[var-annotated]` 3
  - `[import-untyped]` 3
  - `[assignment]` 3
  - `[valid-type]` 2
  - `[return-value]` 2
  - `[no-untyped-call]` 2
  - `[unused-ignore]` 1
  - `[no-untyped-def]` 1
  - `[comparison-overlap]` 1
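
The `[type-arg]` category comes from bare `dict` and `list` annotations, which `mypy --strict` rejects because the element types are left implicit. A minimal sketch of the before/after shape of such a fix follows; `peak_values` and its channel mapping are hypothetical and are not taken from the impakt source.

```python
# Hypothetical example; names are illustrative, not from the impakt codebase.


def peak_values_untyped(channels: dict) -> list:
    """Bare `dict` and `list`: each one triggers [type-arg] under mypy --strict."""
    return [max(samples) for samples in channels.values()]


def peak_values(channels: dict[str, list[float]]) -> list[float]:
    """Same logic with the generic parameters spelled out."""
    return [max(samples) for samples in channels.values()]


print(peak_values({"HEAD_X": [0.5, 12.0, 3.1], "HEAD_Y": [1.0, 0.2]}))
```

Runtime behaviour is unchanged by the annotations; only the type checker sees the difference.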

### Lint (ruff)

```
Found 89 errors.
[*] 73 fixable with the `--fix` option (9 hidden fixes can be enabled with the `--unsafe-fixes` option).
```

- Total violations: 89
- Auto-fixable: 73 (82%)
- Top violation rules:
  - `F401` 61 -- unused imports
  - `I001` 9 -- unsorted imports
  - `F601` 8 -- duplicate dictionary keys
  - `E501` 5 -- line too long
  - `F841` 3 -- unused variables
  - `P035` 2 -- string concatenation in f-string
  - `F541` 1 -- f-string without placeholder
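
Duplicate dictionary keys (`F601`) are the one category here that can hide a real bug: when a dict literal repeats a key, the last value silently replaces the earlier one. The sketch below is a rough AST-based detector that illustrates what the rule looks for. It is not ruff's implementation, and the sample mapping is hypothetical rather than taken from the codebase.

```python
import ast


def duplicate_dict_keys(source: str) -> list[tuple[int, object]]:
    """Return (line, key) for each constant key repeated within one dict literal."""
    duplicates = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Dict):
            continue
        seen = set()
        for key in node.keys:
            # `key` is None for `**mapping` unpacking; non-constant keys are skipped too.
            if not isinstance(key, ast.Constant):
                continue
            if key.value in seen:
                duplicates.append((key.lineno, key.value))
            seen.add(key.value)
    return duplicates


# Hypothetical lookup table with a repeated key.
SAMPLE = '''
units = {
    "HEAD": "g",
    "NECK": "N",
    "HEAD": "m/s^2",
}
'''

print(duplicate_dict_keys(SAMPLE))
```

Each reported entry still needs a manual decision about which value was intended, which is why these are not covered by `--fix`.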

### Complexity

- File size: min=1 / median=133 / mean=143 / max=693
- Files >300 lines: 6 / 72
- High-complexity files (branch density >15):

```
80 src/impakt/io/mme.py (693 lines) -- ISO 13499 parser, justified
44 src/impakt/web/components/criteria.py (343) -- UI assembly with protocol logic
30 src/impakt/channel/model.py (456) -- core data model, multiple classes
27 src/impakt/web/state.py (274) -- app state with multi-test support
27 src/impakt/protocol/euro_ncap.py (238) -- sliding-scale scoring tables
25 src/impakt/web/callbacks/plot_callbacks.py (249) -- transform pipeline orchestration
21 src/impakt/protocol/iihs.py (180) -- G/A/M/P rating logic
20 src/impakt/plot/engine.py (257) -- Plotly rendering with corridors
19 src/impakt/script/cli.py (140) -- CLI arg parsing
17 src/impakt/web/components/channel_grid.py (368) -- DataTable assembly
16 src/impakt/web/callbacks/channel_callbacks.py (195) -- selection/filter callbacks
```

### Documentation

- Docstring coverage: 414 / 454 definitions (91%)
- Modules with `__all__`: 6 / 11 public modules
  - channel: YES
  - criteria: YES
  - io: NO
  - plot: YES
  - plugin: NO
  - protocol: YES
  - report: NO
  - script: NO
  - template: NO
  - transform: YES
  - web: YES
- README: 1,266 lines with 20 Mermaid diagram references
- Architectural diagrams: yes
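
The report does not say how the `__all__` column was produced. One way such a check could be done, sketched here as an assumption rather than as the method actually used, is to parse a module's source and look for a top-level assignment to `__all__`. The sample sources are hypothetical.

```python
import ast


def defines_all(source: str) -> bool:
    """Return True if the module source assigns `__all__` at top level."""
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign):
            targets = node.targets
        elif isinstance(node, (ast.AnnAssign, ast.AugAssign)):
            targets = [node.target]
        else:
            continue
        if any(isinstance(t, ast.Name) and t.id == "__all__" for t in targets):
            return True
    return False


# Hypothetical module sources for illustration.
WITH_ALL = '__all__ = ["read_file"]\n\ndef read_file(path):\n    return path\n'
WITHOUT_ALL = "def read_file(path):\n    return path\n"

print(defines_all(WITH_ALL), defines_all(WITHOUT_ALL))
```

Parsing the source avoids importing the package, so the check has no side effects and works even when optional dependencies are missing.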

### Security

- eval/exec (sandboxed): 1 -- `math_expr.py:151`, restricted builtins `{}` + token blocklist (flagged `# noqa: S307`)
- eval/exec (unsandboxed): 0
- subprocess: 0 (the string "subprocess" appears only as a blocklist entry in `math_expr.py`)
- Hardcoded secrets: 0
- Bare except: 0

### Maintainability

- TODO: 0
- FIXME: 0
- HACK: 0
- Logging calls: 48
- try/except blocks: 52
- Bare excepts: 0
- Internal imports (coupling): 190

---

## Scorecard

| # | Dimension | Weight | Score | Weighted | Justification |
|---|-----------|--------|-------|----------|---------------|
| 1 | Test Health | 20% | 8.0/10 | 16.0 | 240/240 pass. 0.26 ratio (within 0.2-0.5 band). Integration tests with real datasets. No coverage % configured. |
| 2 | Type Safety | 15% | 6.5/10 | 9.75 | mypy strict enabled. 49 errors remain in 20 files, concentrated in web layer. Mostly cosmetic (`type-arg` 17). Between 6 (<50 errors) and 8 (<10 errors). |
| 3 | Lint Hygiene | 10% | 6.0/10 | 6.0 | 89 violations, 82% auto-fixable. Dominated by unused imports (F401=61). 8 duplicate dict keys (F601) need manual fix. Rubric: <100, mostly auto-fixable = 6. |
| 4 | Architecture | 15% | 9.0/10 | 13.5 | Clean 4-layer design (data -> transform -> protocol -> web). Plugin system. No layer violations found. 6/11 modules export `__all__`. Docked 1 point for 5 missing `__all__`. |
| 5 | Documentation | 10% | 9.0/10 | 9.0 | 91% docstring coverage (>90%). README with 20 Mermaid diagrams. No generated API reference docs, so not a full 10. |
| 6 | Complexity | 10% | 7.0/10 | 7.0 | Median 133 (<150). 6 files >300 lines. `mme.py` (693 lines, branch density 80) is the outlier -- justified as a format parser. Between 8 (<=3 files >300) and 6 (<=10 files >300). |
| 7 | Security | 10% | 8.5/10 | 8.5 | Single eval sandboxed with `{"__builtins__": {}}` + 16-item token blocklist. No subprocess, no secrets. Between 9 (sandboxed) and 7 (partially sandboxed). |
| 8 | Maintainability | 10% | 8.5/10 | 8.5 | Zero debt markers. Zero bare excepts. 48 logging calls across codebase. Modern tooling (uv, hatchling, ruff, mypy). Between 10 (perfect) and 8 (<5 markers). |

### Composite Score: **78.3 / 100**
### Grade: **B**

Calculation: (8.0*0.20 + 6.5*0.15 + 6.0*0.10 + 9.0*0.15 + 9.0*0.10 + 7.0*0.10 + 8.5*0.10 + 8.5*0.10) * 10 = (1.60 + 0.975 + 0.60 + 1.35 + 0.90 + 0.70 + 0.85 + 0.85) * 10 = 78.25 -> 78.3
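
The same calculation can be reproduced in a few lines, which is handy when a weight or score changes. This is a sketch, not part of the QA tooling: the weights and scores are copied from the scorecard, and `Decimal` with `ROUND_HALF_UP` is an assumption made to match the report's 78.25 -> 78.3 rounding, since Python's built-in `round()` rounds exact ties to even.

```python
from decimal import Decimal, ROUND_HALF_UP

# (weight, score out of 10), copied from the scorecard table.
SCORECARD = {
    "Test Health": ("0.20", "8.0"),
    "Type Safety": ("0.15", "6.5"),
    "Lint Hygiene": ("0.10", "6.0"),
    "Architecture": ("0.15", "9.0"),
    "Documentation": ("0.10", "9.0"),
    "Complexity": ("0.10", "7.0"),
    "Security": ("0.10", "8.5"),
    "Maintainability": ("0.10", "8.5"),
}


def composite(scorecard: dict[str, tuple[str, str]]) -> Decimal:
    """Weighted sum of the dimension scores, scaled to 0-100."""
    weights = [Decimal(weight) for weight, _ in scorecard.values()]
    if sum(weights) != 1:
        raise ValueError("weights must sum to 100%")
    total = sum(Decimal(weight) * Decimal(score) for weight, score in scorecard.values())
    return total * 10


raw = composite(SCORECARD)
reported = raw.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
print(raw, reported)
```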

---

## Delta from Previous Assessment

| Dimension | Previous | Current | Change |
|-----------|----------|---------|--------|
| Test Health | 8.0 | 8.0 | 0.0 |
| Type Safety | 6.5 | 6.5 | 0.0 |
| Lint Hygiene | 6.0 | 6.0 | 0.0 |
| Architecture | 9.0 | 9.0 | 0.0 |
| Documentation | 9.0 | 9.0 | 0.0 |
| Complexity | 7.0 | 7.0 | 0.0 |
| Security | 8.5 | 8.5 | 0.0 |
| Maintainability | 8.5 | 8.5 | 0.0 |
| **Composite** | **78.3** | **78.3** | **0.0** |

---

## Top Improvements Since Last Assessment

No code changes since previous assessment -- all metrics are identical.

---

## Recommended Actions (Priority Order)

| # | Action | Effort | Impact | Dimensions Affected |
|---|--------|--------|--------|---------------------|
| 1 | Run `uv run ruff check --fix src/` to clear 73 auto-fixable violations | 1 min | +2.0 lint -> 8.0 | Lint Hygiene |
| 2 | Fix 8 duplicate dict keys in `channel/lookup.py` (F601) and remaining manual lint fixes | 15 min | +1.0 lint -> 9.0+ | Lint Hygiene |
| 3 | Add `--cov --cov-report=term` to pytest config, target 80%+ branch coverage | 30 min | +1.0 test | Test Health |
| 4 | Resolve 17 `[type-arg]` mypy errors (add `dict[str, X]` generics to web layer) | 1 hr | +1.0 type | Type Safety |
| 5 | Add `__all__` to `io`, `plugin`, `report`, `script`, `template` modules | 30 min | +0.5 arch | Architecture |

**Projected score after actions 1-5: ~85 (B+)**
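
The projection can be checked by weighting each action's claimed gain with its dimension's scorecard weight. The sketch below takes the Impact column at face value and counts "lint -> 9.0+" as exactly 9.0; both are assumptions. Under them the total comes to 85.5, in line with the ~85 quoted.

```python
from decimal import Decimal

BASELINE = Decimal("78.25")  # unrounded composite from the scorecard

# (action number, dimension weight, claimed score gain on the 0-10 scale)
ACTIONS = [
    (1, "0.10", "2.0"),  # ruff --fix: lint 6.0 -> 8.0
    (2, "0.10", "1.0"),  # manual lint fixes: lint 8.0 -> 9.0
    (3, "0.20", "1.0"),  # coverage reporting: test health
    (4, "0.15", "1.0"),  # [type-arg] fixes: type safety
    (5, "0.15", "0.5"),  # __all__ exports: architecture
]


def projected(baseline: Decimal, actions: list[tuple[int, str, str]]) -> Decimal:
    """Add each weighted gain to the baseline, scaled to the 0-100 composite."""
    gain = sum(Decimal(weight) * Decimal(delta) for _, weight, delta in actions)
    return baseline + gain * 10


print(projected(BASELINE, ACTIONS))
```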

---

## Notes

- **No code changes detected** between this assessment and the prior one (QA-2026-04-11_0459). All raw metrics are identical, yielding the same scores. The recommended actions from the baseline remain open.
- **Architecture score is qualitative.** Import graph was inspected: no layer violations found (data layer does not import from web/plot). The `web` module sits at the top of the dependency tree as expected.
- **Security eval in `math_expr.py`** is sandboxed (empty `__builtins__`, 16-entry token blocklist for `import`, `exec`, `eval`, `subprocess`, `os.`, `sys.`, `__`, etc.). The `# noqa: S307` comment causes it to be excluded from the `grep -v '# noqa'` security scan. An AST-based evaluator would be safer but is lower priority given the blocklist approach.
- **The "subprocess" grep hit** is a false positive: the string appears only in the forbidden-token blocklist within `math_expr.py`, not as an actual subprocess invocation.
- **Complexity scoring for `mme.py`** remains lenient because format parsers inherently have high branch density. If it grows beyond ~800 lines, consider extracting sub-parsers.
- **Test:source ratio of 0.26** has not changed. Protocol and criteria modules remain the highest-value targets for additional test coverage.
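
For the `math_expr.py` note, an AST-based evaluator replaces the blocklist with an allowlist: the expression is parsed, and only node types that are explicitly handled are evaluated. The sketch below is a minimal illustration under assumed requirements (arithmetic over named variables plus a few functions). `safe_eval`, the operator set, and the function table are all hypothetical and do not reflect what `math_expr.py` currently supports.

```python
import ast
import math
import operator
from typing import Optional

# Allowlisted operators and functions (assumed set, for illustration only).
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}
_FUNCTIONS = {"sqrt": math.sqrt, "abs": abs, "min": min, "max": max}


def safe_eval(expression: str, variables: Optional[dict[str, float]] = None) -> float:
    """Evaluate an arithmetic expression by walking its AST; no eval() involved."""
    names = variables or {}

    def visit(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.Name) and node.id in names:
            return names[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](visit(node.operand))
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in _FUNCTIONS
            and not node.keywords
        ):
            return _FUNCTIONS[node.func.id](*(visit(arg) for arg in node.args))
        # Anything not handled above (attributes, subscripts, lambdas, ...) is rejected.
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return visit(ast.parse(expression, mode="eval"))


print(safe_eval("sqrt(x * x + y * y)", {"x": 3.0, "y": 4.0}))
```

Exponentiation is left out of the operator table on purpose, because unbounded powers can be used to exhaust CPU or memory; it could be added with an explicit bound on the operands.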