---
name: solar-health-check
description: >-
  Top-level health snapshot of the whole solar/power install (2 LVX6048
  inverters, 6 EG4 LifePower4 packs, and the OpenEVSE charger) with cross-checks
  and a green/yellow/red verdict. Use when the user asks "how's the solar /
  battery / power system doing", "is everything ok", "check the install", wants a
  status report, or as the first step before deeper troubleshooting. For deep
  dives into one subsystem, hand off to troubleshoot-inverter, troubleshoot-battery,
  or power-usage.
---

# solar-health-check

A fast, read-only sweep of every subsystem that ends in a clear verdict. Do NOT
change settings here. The one permitted action is restarting a failed or silent
daemon (see §6 and the action policy in REFERENCE).

## 0. Load context
Skills run with the shell cwd at the repo root, so anchor paths there:
```bash
ROOT="$(git rev-parse --show-toplevel)"; SNAP="$ROOT/.claude/skills/lib/solar-snapshot"; HIST="$ROOT/.claude/skills/lib/ha-history"
```
Read `$ROOT/.claude/skills/REFERENCE.md` (system map, entity names, snapshot helper,
action policy) before proceeding.

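
Before moving on, it can help to confirm that both helper paths resolved to real executables. A minimal sketch, assuming nothing beyond the paths set above; `require_helpers` is a hypothetical name, not something the repo provides:

```bash
# Hypothetical helper (not part of the repo): fail early if any given path
# is not an existing, executable regular file.
require_helpers() {
  for f in "$@"; do
    if [ ! -f "$f" ] || [ ! -x "$f" ]; then
      echo "missing or not executable: $f" >&2
      return 1
    fi
  done
}

# Usage, once ROOT/SNAP/HIST are set:
#   require_helpers "$SNAP" "$HIST" || echo "fix the helper paths before continuing"
```
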
## 1. Services up?
```bash
systemctl is-active powermon.service powermon2.service eg4-battery.service lvx-control.service lvx-resolve-links.service
```
The command prints one state per line, in the order the units were given.
`lvx-resolve-links` is a oneshot → expect `active` (`systemctl status` shows it as
`active (exited)`), not `failed`. Any `failed`/`inactive` on the others is RED. For a
wedged data-plane daemon, a restart is allowed (see §6).

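
The rule above can be sketched as a small classifier over the word that `systemctl is-active` prints for each unit. `classify_service` is a hypothetical name. Two parts are assumptions, not stated policy: the YELLOW verdict for transitional states such as `activating`, and treating only `failed` as fatal for the oneshot.

```bash
# Hypothetical helper: map a unit name plus its `is-active` word to a verdict.
classify_service() {
  unit="$1"; state="$2"
  case "$unit" in
    lvx-resolve-links.service)
      # Oneshot: only an outright failure counts against it (assumption).
      case "$state" in
        failed) echo RED ;;
        *)      echo GREEN ;;
      esac ;;
    *)
      # Long-running daemons: must be active; failed/inactive is RED.
      case "$state" in
        active)          echo GREEN ;;
        failed|inactive) echo RED ;;
        *)               echo YELLOW ;;  # assumption: transitional states
      esac ;;
  esac
}

# Usage:
#   for u in powermon.service powermon2.service eg4-battery.service; do
#     printf '%s %s\n' "$u" "$(classify_service "$u" "$(systemctl is-active "$u")")"
#   done
```
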
## 2. Capture live telemetry
```bash
"$SNAP" -w 10 -g 'lvx6048_[12]_(device_mode|fault_code|battery_voltage|battery_capacity|ac_output_active_power|mppt1_input_power|mppt2_input_power|grid_voltage|inverter_heat_sink_temperature|parallel_instance_number)/' 'homeassistant/sensor/+/state'
"$SNAP" -w 16 -g 'lifepower4_[1-6]_(soc|pack_voltage|pack_current|cell_voltage_delta_mv|temperature_pcb)/' 'homeassistant/sensor/+/state'
"$SNAP" -w 6 'openevse/status' 'openevse/amp' 'openevse/power' 'openevse/session_energy'
```
If a family returns "(no messages)", the feeding daemon is silent, so that subsystem
is RED regardless of `is-active` (it is running but not publishing). The exception is
the EVSE: publishing nothing while idle or unplugged is normal; confirm via
`openevse/status`.

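
One way to apply the silent-family rule is to capture a snapshot's output and test it for the marker. This is a sketch under the assumption that the helper prints the literal text "(no messages)" quoted above; `family_silent` is a hypothetical name.

```bash
# Hypothetical helper: succeed (exit 0) when captured snapshot text contains
# the literal "(no messages)" marker, i.e. the family published nothing.
family_silent() {
  printf '%s\n' "$1" | grep -qF '(no messages)'
}

# Usage:
#   out="$("$SNAP" -w 16 -g 'lifepower4_[1-6]_(soc|pack_voltage)/' 'homeassistant/sensor/+/state')"
#   if family_silent "$out"; then echo "battery family: RED (daemon silent)"; fi
```
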
## 3. Cross-checks (the value-add: single sensors can each look fine)
- **Battery voltage agreement**: each inverter's `battery_voltage` should be within
  ~1 V of the pack stack voltage (`pack_voltage` ≈ 51–55 V). **Known anomaly:** the
  inverter reading is *intermittently* wrong (correct ~54 V on 2026-06-20, ~9–10 V
  after the Jun 22 reboot). This is a post-reboot glitch, not a permanent bug. If it
  reads ~10 V, note it and suggest a powermon restart; use the `lifepower4_*` pack
  entities, never the inverter reading, for any battery math (see REFERENCE known-issues).
- **Cross-unit PV production (catches a silently-dead inverter)**: compare
  `lvx6048_1_mppt1_input_power` vs `lvx6048_2_mppt1_input_power`. In daylight (the
  *other* unit clearly producing), one unit pinned at **0 W** means that inverter is
  down and being masked by its sibling: RED → troubleshoot-inverter. This is exactly
  the 2026-06-20 fault-08 failure mode (unit 1 sat at 0 W for ~1.8 days). At night or
  in heavy shade, both units at 0 W is normal.
- **SoC spread across packs**: `max(soc) - min(soc)` over the 6 packs. BUT first
  cross-check against `pack_voltage`/`cell_voltage_max`: the packs are paralleled, so
  if all `pack_voltage` readings agree (±0.1 V) the packs are physically at the same
  charge and any SoC spread is **counter drift**, not real imbalance (on 2026-06-24,
  pack 6 reported 76 % while reading the same 53.4 V / 3.337 V/cell as packs at
  50–55 %). Real imbalance means the pack voltages actually diverge. Drift → note it
  and recommend a calibration charge; >20 % spread with diverging voltages = RED →
  troubleshoot-battery.
- **Cell imbalance**: any pack with `cell_voltage_delta_mv` > 50 = YELLOW, > 100 = RED.
- **Parallel master/slave**: exactly one inverter should report
  `parallel_instance_number` 0 (master); the other should report 1 or higher. Two
  masters or two slaves = RED.
- **Faults**: any `fault_code` non-zero, or `device_mode` = Fault = RED → troubleshoot-inverter.
- **Temps**: pack `temperature_pcb` > 55 °C or inverter heat-sink > 75 °C = YELLOW.
- **Power balance sanity**: PV in (`mppt*_input_power`), AC out
  (`ac_output_active_power`), and battery power (`pack_current` × `pack_voltage`)
  should roughly balance. Gross mismatch = investigate via power-usage.

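
Three of the checks above reduce to simple arithmetic, sketched here with `awk` since the readings are decimals. All function names are hypothetical. Three choices are assumptions rather than stated rules: the 100 W default for "clearly producing", reading "±0.1 V" as a max-minus-min of at most 200 mV, and the `diverging` label for the case the list leaves open.

```bash
# Cell imbalance: > 50 mV is YELLOW, > 100 mV is RED (thresholds from the list).
cell_delta_verdict() {
  awk -v mv="$1" 'BEGIN {
    if (mv > 100) print "RED"
    else if (mv > 50) print "YELLOW"
    else print "GREEN"
  }'
}

# Cross-unit PV: args are unit-1 watts, unit-2 watts, optional "clearly
# producing" floor in watts (default 100 is an assumption).
pv_cross_check() {
  awk -v a="$1" -v b="$2" -v min="${3:-100}" 'BEGIN {
    if (a == 0 && b >= min) print "RED unit 1 silent"
    else if (b == 0 && a >= min) print "RED unit 2 silent"
    else print "OK"
  }'
}

# SoC spread: reads one "soc pack_voltage" pair per line on stdin.
# Voltages within 200 mV overall are treated as agreeing (assumption).
soc_spread_class() {
  awk '
    NR == 1 { smin = smax = $1; vmin = vmax = $2 }
    { if ($1 < smin) smin = $1; if ($1 > smax) smax = $1
      if ($2 < vmin) vmin = $2; if ($2 > vmax) vmax = $2 }
    END {
      if (NR == 0) { print "no-data" }
      else {
        soc = smax - smin
        mv = int((vmax - vmin) * 1000 + 0.5)   # round to whole millivolts
        if (mv <= 200) cls = (soc > 0) ? "drift" : "ok"
        else cls = (soc > 20) ? "RED" : "diverging"
        printf "soc_spread=%g v_spread_mv=%d class=%s\n", soc, mv, cls
      }
    }'
}
```

Feeding `soc_spread_class` the 2026-06-24 situation (one pack at 76 %, the rest at 50–55 %, all at 53.4 V) classifies the spread as `drift`, which matches the reasoning in the SoC bullet.
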
## 4. Verdict
Print a compact table (subsystem → state → one-line reason), then an overall
GREEN / YELLOW / RED with the top 1–3 issues and which deeper skill to run.

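
A minimal sketch of the table-plus-verdict step. It assumes the overall state is simply the worst per-subsystem state, which the text above does not spell out. `print_verdict` and the `subsystem|state|reason` row format are hypothetical.

```bash
# Hypothetical helper: read "subsystem|state|reason" rows on stdin, print them
# as aligned columns, then print the worst state seen (RED > YELLOW > GREEN).
print_verdict() {
  awk -F'|' '
    BEGIN { rank["GREEN"] = 0; rank["YELLOW"] = 1; rank["RED"] = 2; worst = "GREEN" }
    {
      printf "%-12s %-7s %s\n", $1, $2, $3
      if (rank[$2] > rank[worst]) worst = $2
    }
    END { print "OVERALL: " worst }'
}

# Usage:
#   printf '%s\n' 'inverters|GREEN|both producing' \
#                 'battery|YELLOW|pack 6 SoC counter drift' \
#                 'evse|GREEN|idle' | print_verdict
```
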
## 5. Recent error scan (only if anything looked off)
```bash
for s in powermon powermon2 eg4-battery lvx-control; do
  echo "== $s =="
  # Show the last 5 suspicious log lines from the past 15 minutes, if any.
  journalctl -u "$s.service" --since "15 min ago" --no-pager | grep -iE 'error|timeout|fail|crc|nak|reconnect' | tail -5
done
```

## 5b. Historical sanity: did anything fail while unattended? (needs HA token)
Live snapshots miss faults that already cleared and silent-unit spells that ended.
If `~/.config/ha/token` exists (see REFERENCE), scan the recorder for the last few
days. Use the REAL HA entity_ids (doubled slug; see REFERENCE), not MQTT names:
```bash
"$HIST" -s "5 days ago" -m fault sensor.lvx6048_01_lvx6048_1_fault_code sensor.lvx6048_02_lvx6048_2_fault_code
# Silent-unit hunt: sample midday PV from both units across recent days. One unit pinned
# at 0 while the other produced means it was down. E.g. check a midday window per day:
"$HIST" -s "2 days ago" sensor.lvx6048_lvx6048_1_mppt1_input_power sensor.lvx6048_lvx6048_2_mppt1_input_power | head -40
```
Any fault-08 / silent-unit episode → report with timestamps and hand off to
troubleshoot-inverter §2–§5. No token → say so and point the user at REFERENCE to add one.

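
The token gate can be sketched as a one-line test. `have_ha_token` is a hypothetical name; the optional path argument exists only so the check can be tried against a scratch file, and requiring a non-empty file is an assumption.

```bash
# Hypothetical helper: succeed only when the HA token file exists and is
# non-empty. Defaults to the path named above.
have_ha_token() {
  [ -s "${1:-$HOME/.config/ha/token}" ]
}

# Usage:
#   if have_ha_token; then
#     "$HIST" -s "5 days ago" -m fault sensor.lvx6048_01_lvx6048_1_fault_code
#   else
#     echo "No HA token found; see REFERENCE to add one. Skipping history scan."
#   fi
```
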
## 6. Allowed remediation
If a daemon is `failed` or running-but-silent, restarting it is permitted:
```bash
sudo systemctl restart eg4-battery.service  # or powermon / powermon2 / lvx-control
```
Re-run the relevant snapshot to confirm data resumes. Anything beyond a restart
(settings, flash, cabling) → report and hand the user the exact command. Never
publish to `solar/control/lvx6048/*` from this skill.
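
To keep restarts inside the policy, a guard like the following could sit in front of the command. It is a sketch: `restart_allowed` is a hypothetical name, and limiting the list to the four daemons named above (leaving out the `lvx-resolve-links` oneshot) is an assumption drawn from the example comment.

```bash
# Hypothetical guard: succeed only for the daemons this skill may restart.
# Accepts the name with or without the ".service" suffix.
restart_allowed() {
  case "${1%.service}" in
    powermon|powermon2|eg4-battery|lvx-control) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   svc=eg4-battery
#   if restart_allowed "$svc"; then
#     sudo systemctl restart "$svc.service"
#   else
#     echo "restart of $svc is outside this skill's policy; report instead"
#   fi
```
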