Four Skill-tool skills under .claude/skills/ that let an agent monitor and
troubleshoot the install (2x LVX6048, 6x EG4 LifePower4, OpenEVSE), grounded
in the real MQTT/HA topology rather than generic advice:
- solar-health-check : whole-system sweep + cross-checks + R/Y/G verdict,
incl. cross-unit "silently-dead inverter" detection
- troubleshoot-inverter: FWS fault decode, parallel sync, USB link recovery
- troubleshoot-battery : per-pack imbalance vs SoC-counter-drift, RS485 silence
- power-usage : PV/load/grid/battery balance + EVSE sessions
Shared lib:
- solar-snapshot : live MQTT capture (creds from powermon.yaml, no hardcoding)
- ha-history : HA recorder lookback (token from ~/.config/ha/token)
REFERENCE.md documents topology, real HA entity_ids (doubled slug), known
issues, and a safe-remediation-only action policy (restarts yes; setters no).
Action boundary: diagnose + restart wedged daemons / recover USB links;
never touches inverter/battery setters or flash.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
| name | description |
|---|---|
| solar-health-check | Top-level health snapshot of the whole solar/power install — 2 LVX6048 inverters, 6 EG4 LifePower4 packs, and the OpenEVSE charger — with cross-checks and a green/yellow/red verdict. Use when the user asks "how's the solar / battery / power system doing", "is everything ok", "check the install", wants a status report, or as the first step before deeper troubleshooting. For deep dives into one subsystem, hand off to troubleshoot-inverter, troubleshoot-battery, or power-usage. |
solar-health-check
A fast, read-only sweep of every subsystem that ends in a clear verdict. Do NOT change settings here; if something needs a restart, that's allowed (see policy).
0. Load context
Skills run with the shell cwd at the repo root, so anchor paths there:
ROOT="$(git rev-parse --show-toplevel)"; SNAP="$ROOT/.claude/skills/lib/solar-snapshot"; HIST="$ROOT/.claude/skills/lib/ha-history"
Read $ROOT/.claude/skills/REFERENCE.md (system map, entity names, snapshot helper,
action policy) before proceeding.
1. Services up?
systemctl is-active powermon.service powermon2.service eg4-battery.service lvx-control.service lvx-resolve-links.service
lvx-resolve-links is a oneshot → expect active/exited (not failed). Any
failed/inactive on the others is RED. For a wedged data-plane daemon, a
restart is allowed (see §6).
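The oneshot exception above can be sketched as a small classifier. This is a minimal illustration, not part of the skill: `classify_service` is a hypothetical name, it takes the state string that `systemctl is-active` prints, and it assumes that any non-active state on a long-running daemon counts as RED.

```shell
# Hypothetical helper: map one `systemctl is-active` result to a verdict.
# $1 = unit name, $2 = state string printed by `systemctl is-active`.
classify_service() {
  unit="$1"; state="$2"
  case "$unit" in
    lvx-resolve-links.service)
      # oneshot: only an explicit "failed" is a problem
      if [ "$state" = "failed" ]; then echo RED; else echo GREEN; fi
      ;;
    *)
      # long-running daemons: assumption that anything but "active" is RED
      if [ "$state" = "active" ]; then echo GREEN; else echo RED; fi
      ;;
  esac
}

# Example wiring (not executed here):
#   for u in powermon.service powermon2.service eg4-battery.service \
#            lvx-control.service lvx-resolve-links.service; do
#     printf '%-28s %s\n' "$u" "$(classify_service "$u" "$(systemctl is-active "$u")")"
#   done
```

Passing the state in as an argument keeps the rule separate from the systemctl call, so it can be checked without touching the live units.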
2. Capture live telemetry
"$SNAP" -w 10 -g 'lvx6048_[12]_(device_mode|fault_code|battery_voltage|battery_capacity|ac_output_active_power|mppt1_input_power|mppt2_input_power|grid_voltage|inverter_heat_sink_temperature|parallel_instance_number)/' 'homeassistant/sensor/+/state'
"$SNAP" -w 16 -g 'lifepower4_[1-6]_(soc|pack_voltage|pack_current|cell_voltage_delta_mv|temperature_pcb)/' 'homeassistant/sensor/+/state'
"$SNAP" -w 6 'openevse/status' 'openevse/amp' 'openevse/power' 'openevse/session_energy'
If a family returns "(no messages)": the feeding daemon is silent → that subsystem
is RED regardless of is-active (running but not publishing). EVSE idle/unplugged
publishing nothing is normal — confirm via openevse/status.
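A rough sketch of the silence rule, assuming the helper prints the literal text "(no messages)" as described above. `family_state` and the `CHECK` label are hypothetical; the label only marks the EVSE case as needing the manual openevse/status confirmation.

```shell
# Hypothetical helper: derive a family's state from captured snapshot text.
# $1 = family label (inverter | battery | evse), $2 = captured snapshot output.
family_state() {
  family="$1"; output="$2"
  case "$output" in
    *"(no messages)"*)
      # An idle or unplugged EVSE may publish nothing, so do not call it RED outright.
      if [ "$family" = "evse" ]; then echo CHECK; else echo RED; fi
      ;;
    *)
      echo OK
      ;;
  esac
}

# Example wiring (not executed here):
#   out="$("$SNAP" -w 6 'openevse/status' 'openevse/amp')"
#   family_state evse "$out"
```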
3. Cross-checks (this is the value-add — single sensors can each look fine)
- Battery voltage agreement: each inverter's battery_voltage should be within ~1 V of the pack stack voltage (pack_voltage ≈ 51–55 V). Known anomaly: the inverter reading is intermittently wrong (correct ~54 V on 2026-06-20, ~9–10 V after the Jun 22 reboot); this is a post-reboot glitch, not a permanent bug. If it reads ~10 V, note it and suggest a powermon restart; use the lifepower4_* pack entities, never the inverter reading, for any battery math (see REFERENCE known-issues).
- Cross-unit PV production (catches a silently-dead inverter): compare lvx6048_1_mppt1_input_power vs lvx6048_2_mppt1_input_power. In daylight (the other unit clearly producing), one unit pinned at 0 W = that inverter is down and being masked by its sibling: RED → troubleshoot-inverter. This is exactly the 2026-06-20 fault-08 failure mode (unit 1 sat at 0 W for ~1.8 days). At night/heavy shade both at 0 W is normal.
- SoC spread across packs: max(soc) - min(soc) over the 6 packs. BUT first cross-check against pack_voltage / cell_voltage_max: the packs are paralleled, so if all pack_voltage agree (±0.1 V) the packs are physically at the same charge and any SoC spread is counter drift, not real imbalance (pack 6 ran 76 % while reading the same 53.4 V / 3.337 V/cell as packs at 50–55 % on 2026-06-24). Real imbalance = pack voltages actually diverge. Drift → note it, recommend a calibration charge; >20 % spread with diverging voltages = RED → troubleshoot-battery.
- Cell imbalance: any pack with cell_voltage_delta_mv > 50 = YELLOW, > 100 = RED.
- Parallel master/slave: exactly one inverter should report parallel_instance_number 0 (master); the other 1+. Two masters or two slaves = RED.
- Faults: any fault_code non-zero, or device_mode = Fault = RED → troubleshoot-inverter.
- Temps: pack temperature_pcb > 55 °C or inverter heat-sink > 75 °C = YELLOW.
- Power balance sanity: PV in (mppt*_input_power) vs AC out vs pack pack_current should roughly conserve. Gross mismatch = investigate via power-usage.
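Two of the cross-checks above, sketched as shell functions over plain numbers. Everything named here is hypothetical: `soc_spread_verdict` reads one "soc pack_voltage" pair per line, `pv_silent_check` takes the two mppt1 readings in watts, and the 200 W "clearly producing" floor is an assumed default. The ±0.1 V agreement is interpreted as max - min ≤ 0.1 V, and `WATCH` covers the case the list does not specify (diverging voltages with ≤ 20 % spread).

```shell
# Hypothetical: classify SoC spread as counter drift vs real imbalance.
# stdin: one "soc pack_voltage" pair per line; $1 = voltage tolerance (default 0.1 V).
soc_spread_verdict() {
  awk -v tol="${1:-0.1}" '
    { s = $1 + 0; v = $2 + 0
      if (NR == 1) { smin = smax = s; vmin = vmax = v }
      if (s < smin) smin = s; if (s > smax) smax = s
      if (v < vmin) vmin = v; if (v > vmax) vmax = v }
    END {
      if (NR == 0) { print "NO_DATA"; exit }
      spread = smax - smin; dv = vmax - vmin
      if (dv <= tol + 1e-9) {
        # voltages agree: any spread is attributed to SoC counter drift
        verdict = (spread > 0) ? "DRIFT" : "OK"
      } else if (spread > 20) {
        verdict = "RED"      # diverging voltages and >20 % spread
      } else {
        verdict = "WATCH"    # assumption: not specified in the list above
      }
      printf "%s spread=%g dv=%.2f\n", verdict, spread, dv
    }'
}

# Hypothetical: flag a unit pinned at 0 W while its sibling is producing.
# $1 = unit 1 mppt1 W, $2 = unit 2 mppt1 W, $3 = "clearly producing" floor (assumed 200 W).
pv_silent_check() {
  awk -v a="$1" -v b="$2" -v min="${3:-200}" 'BEGIN {
    a += 0; b += 0; min += 0
    if (a <= 0 && b >= min)      print "RED unit1"
    else if (b <= 0 && a >= min) print "RED unit2"
    else                         print "OK"
  }'
}
```

Feeding the 2026-06-24 readings (`printf '76 53.4\n50 53.4\n55 53.4\n' | soc_spread_verdict`) reports DRIFT, since the voltages agree despite the 26-point spread.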
4. Verdict
Print a compact table (subsystem → state → one-line reason), then an overall GREEN / YELLOW / RED with the top 1–3 issues and which deeper skill to run.
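One way to roll per-subsystem states into the overall verdict, assuming a "worst state wins" rule (the aggregation rule is not stated above, so treat it as an assumption). `overall_verdict` is a hypothetical name.

```shell
# Hypothetical: overall verdict = worst of the per-subsystem states.
# Args: any number of GREEN / YELLOW / RED strings; other values are ignored.
overall_verdict() {
  worst=GREEN
  for s in "$@"; do
    case "$s" in
      RED)    worst=RED ;;
      YELLOW) [ "$worst" = RED ] || worst=YELLOW ;;
    esac
  done
  echo "$worst"
}

# Example table row plus roll-up (illustrative values only):
#   printf '%-10s %-7s %s\n' inverters RED "unit 1 pinned at 0 W in daylight"
#   overall_verdict RED GREEN YELLOW
```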
5. Recent error scan (only if anything looked off)
for s in powermon powermon2 eg4-battery lvx-control; do
  echo "== $s =="
  journalctl -u "$s.service" --since "15 min ago" --no-pager | grep -iE 'error|timeout|fail|crc|nak|reconnect' | tail -5
done
5b. Historical sanity — did anything fail while unattended? (needs HA token)
Live snapshots miss faults that already cleared and silent-unit spells that ended.
If ~/.config/ha/token exists (see REFERENCE), scan the recorder for the last few
days. Use the REAL HA entity_ids (doubled slug — see REFERENCE), not MQTT names:
"$HIST" -s "5 days ago" -m fault sensor.lvx6048_01_lvx6048_1_fault_code sensor.lvx6048_02_lvx6048_2_fault_code
# silent-unit hunt: sample midday PV both units across recent days; one pinned 0 while
# the other produced = it was down. e.g. check a midday window per day:
"$HIST" -s "2 days ago" sensor.lvx6048_lvx6048_1_mppt1_input_power sensor.lvx6048_lvx6048_2_mppt1_input_power | head -40
Any fault-08 / silent-unit episode → report with timestamps and hand off to troubleshoot-inverter §2–§5. No token → say so and point the user at REFERENCE to add one.
6. Allowed remediation
If a daemon is failed or running-but-silent, restarting it is permitted:
sudo systemctl restart eg4-battery.service # or powermon / powermon2 / lvx-control
Re-run the relevant snapshot to confirm data resumes. Anything beyond a restart
(settings, flash, cabling) → report and hand the user the exact command. Never
publish to solar/control/lvx6048/* from this skill.
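A possible guard for the restart boundary, assuming the allowlist is exactly the four daemons named above (the oneshot lvx-resolve-links is deliberately left out here as an assumption). `is_restartable` is a hypothetical name.

```shell
# Hypothetical guard: succeed only for daemons this skill may restart.
# $1 = full unit name, e.g. eg4-battery.service
is_restartable() {
  case "$1" in
    powermon.service|powermon2.service|eg4-battery.service|lvx-control.service)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Example wiring (not executed here):
#   unit=eg4-battery.service
#   if is_restartable "$unit"; then
#     sudo systemctl restart "$unit"
#   else
#     echo "refusing to restart $unit: report it to the user instead"
#   fi
```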