Add solar monitoring/troubleshooting skills for agents
Four Skill-tool skills under .claude/skills/ that let an agent monitor and
troubleshoot the install (2x LVX6048, 6x EG4 LifePower4, OpenEVSE), grounded
in the real MQTT/HA topology rather than generic advice:
- solar-health-check : whole-system sweep + cross-checks + R/Y/G verdict,
incl. cross-unit "silently-dead inverter" detection
- troubleshoot-inverter: FWS fault decode, parallel sync, USB link recovery
- troubleshoot-battery : per-pack imbalance vs SoC-counter-drift, RS485 silence
- power-usage : PV/load/grid/battery balance + EVSE sessions
Shared lib:
- solar-snapshot : live MQTT capture (creds from powermon.yaml, no hardcoding)
- ha-history : HA recorder lookback (token from ~/.config/ha/token)
REFERENCE.md documents topology, real HA entity_ids (doubled slug), known
issues, and a safe-remediation-only action policy (restarts yes; setters no).
Action boundary: diagnose + restart wedged daemons / recover USB links;
never touches inverter/battery setters or flash.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 11:46:20 -04:00

# Solar install: system map (shared reference for the solar skills)

This file is the ground truth the `solar-*` / `troubleshoot-*` / `power-usage`
skills build on. Read it once at the start of any solar task. Everything below
was verified live on this host (the monitoring Pi) on 2026-06-23; re-verify
anything load-bearing before acting on it.

## Topology

```
6× EG4 LifePower4 v2 packs ──RS485 (1 FTDI each)──┐
2× MPP Solar LVX6048 inverters ──USB-HID/PI18─────┤ this Pi ──MQTT──► HA broker
1× OpenEVSE charger (10.0.0.249) ──its own WiFi───┘ (daemons)         10.0.0.41:1883
```

All telemetry lands on the **MQTT broker at 10.0.0.41:1883** under HA
auto-discovery (`homeassistant/<class>/<entity>/config` retained, `.../state`
republished each poll cycle; **state topics are NOT retained**, so to read
current values you must listen for a window: use `lib/solar-snapshot`).

Broker credentials live in `~/.config/powermon/powermon.yaml`
(`mqttbroker.{name,port,username,password}`). **Never hardcode them**: every
tool here, including `lib/solar-snapshot`, reads them from that file.

## The snapshot helper

`./lib/solar-snapshot` (relative to this skills dir) captures the latest value of
every matching MQTT topic over a short window and prints a table. This is the
primary read tool; prefer it over raw `mosquitto_sub`.

```
solar-snapshot [-w SECONDS] [-g GREP_RE] [-f] TOPIC_FILTER...
```
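
Conceptually, a capture like this is a fold over the messages seen during the window that keeps the last payload per topic. The sketch below illustrates that idea only; it is not the helper's source. `latest_values` and the sample payloads are hypothetical, and the regex argument merely mirrors what `-g` is described as doing.

```python
import re

def latest_values(messages, grep_re=None):
    """Keep the last payload seen per topic, optionally narrowed by a regex.

    `messages` is an iterable of (topic, payload) pairs in arrival order,
    standing in for what a subscriber would receive during the window.
    """
    pattern = re.compile(grep_re) if grep_re else None
    latest = {}
    for topic, payload in messages:
        if pattern and not pattern.search(topic):
            continue
        latest[topic] = payload  # later messages overwrite earlier ones
    return latest

# Hypothetical messages; real payloads come from the broker.
sample = [
    ("homeassistant/sensor/lifepower4_1_soc/state", "54"),
    ("homeassistant/sensor/lvx6048_1_device_mode/state", "Battery"),
    ("homeassistant/sensor/lifepower4_1_soc/state", "55"),
]
for topic, value in sorted(latest_values(sample, r"lifepower4_[1-6]_soc/").items()):
    print(f"{topic}\t{value}")
```

Because state topics are not retained, a topic that does not publish inside the window is simply absent from the result, which is different from a topic that published an empty value.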

MQTT `+` matches one WHOLE level, so `lifepower4_+` matches nothing. Subscribe to
`homeassistant/sensor/+/state` and narrow with `-g`:

```
solar-snapshot -g 'lvx6048_1_' 'homeassistant/sensor/+/state'
solar-snapshot -w 16 -g 'lifepower4_[1-6]_soc/' 'homeassistant/sensor/+/state'
solar-snapshot 'openevse/#'   # EVSE publishes on-change; idle when unplugged
```

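To make the whole-level rule concrete, here is a small sketch of MQTT filter matching. It is a simplified reimplementation for illustration (it ignores `$`-prefixed topics), and `topic_matches` is a hypothetical name, not part of `solar-snapshot`.

```python
def topic_matches(topic_filter, topic):
    """Return True if an MQTT topic filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: matches the parent and everything below it.
            return True
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            # '+' is only a wildcard when it is the entire level.
            return False
    return len(filter_levels) == len(topic_levels)

topic = "homeassistant/sensor/lifepower4_1_soc/state"
print(topic_matches("homeassistant/sensor/+/state", topic))
print(topic_matches("homeassistant/sensor/lifepower4_+/state", topic))
print(topic_matches("openevse/#", "openevse/amp"))
```

The second filter never matches because `lifepower4_+` is compared as a literal level, which is why the narrowing is done with `-g` on the client side instead.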
## The history helper

`solar-snapshot` only sees *now*. For "when did X last happen / show last week",
use `./lib/ha-history`, which queries **Home Assistant's recorder** (the only
store that keeps history: local journald is volatile, ~1 day, wiped on reboot;
no solar data goes to InfluxDB). Default window 7 days; HA recorder default
retention is 10 days.

```
ha-history [-s SINCE] [-e END] [-m REGEX] [-a] ENTITY...
ha-history -s "10 days ago" sensor.lvx6048_lvx6048_1_device_mode sensor.lvx6048_lvx6048_2_device_mode
ha-history -s "10 days ago" -m fault sensor.lvx6048_01_lvx6048_1_fault_code sensor.lvx6048_02_lvx6048_2_fault_code
```

**HA entity_ids ≠ MQTT object names.** powermon's hass output doubles the device
slug and is inconsistent across commands, so you must use the real ids, e.g.:

- device mode: `sensor.lvx6048_lvx6048_{1,2}_device_mode` (device slug `lvx6048`)
- fault code: `sensor.lvx6048_0{1,2}_lvx6048_{1,2}_fault_code` (slug `lvx6048_01`/`_02`)
- PV/batt/load: `sensor.lvx6048_lvx6048_1_{mppt1_input_power,mppt1_input_voltage,battery_voltage,ac_output_active_power}`
- EG4 packs follow the same doubling, e.g. `sensor.lifepower4_*`. When unsure, list
  them: `curl -s -H "Authorization: Bearer $(cat ~/.config/ha/token)" "$HA_URL/api/states"
  | python3 -c 'import sys,json;[print(s["entity_id"]) for s in json.load(sys.stdin) if "lvx6048" in s["entity_id"]]'`

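Since the two inverter slugs differ, it can help to build the ids from the unit number rather than typing them. The helpers below are hypothetical and cover only the two patterns listed above; any other entity should be looked up from `/api/states`.

```python
def device_mode_entity(unit):
    """Entity id for an inverter's device mode (device slug `lvx6048`)."""
    if unit not in (1, 2):
        raise ValueError("this install has inverters 1 and 2 only")
    return f"sensor.lvx6048_lvx6048_{unit}_device_mode"

def fault_code_entity(unit):
    """Entity id for an inverter's fault code (slug `lvx6048_01` / `lvx6048_02`)."""
    if unit not in (1, 2):
        raise ValueError("this install has inverters 1 and 2 only")
    return f"sensor.lvx6048_{unit:02d}_lvx6048_{unit}_fault_code"

for unit in (1, 2):
    print(device_mode_entity(unit), fault_code_entity(unit))
```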

Auth: `ha-history` reads a long-lived token from `~/.config/ha/token` (mode 600) or
`$HA_TOKEN`; never on the command line, never hardcoded. Base URL: `$HA_URL`, else
`~/.config/ha/url`, else `http://10.0.0.41:8123`. If it reports "no token", the user
must create one (HA → Profile → Security → Long-lived access tokens) and write it
to `~/.config/ha/token`; tell them which file, don't ask them to paste it in chat.

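The lookup order described above can be sketched as follows. This is an illustration, not the `ha-history` source: the function names are hypothetical, and the token precedence (file first, then `$HA_TOKEN`) is an assumption, since the text names both sources without ranking them.

```python
import os
from pathlib import Path

DEFAULT_URL = "http://10.0.0.41:8123"

def _read_nonempty(path):
    """Return the stripped file contents, or None if missing or empty."""
    if path.is_file():
        text = path.read_text().strip()
        if text:
            return text
    return None

def resolve_base_url(env, config_dir):
    """$HA_URL, else <config_dir>/url, else the documented default."""
    return env.get("HA_URL") or _read_nonempty(Path(config_dir) / "url") or DEFAULT_URL

def resolve_token(env, config_dir):
    """ASSUMED order: token file, then $HA_TOKEN. None means 'no token'."""
    return _read_nonempty(Path(config_dir) / "token") or env.get("HA_TOKEN") or None

config_dir = Path.home() / ".config" / "ha"
print("base url:", resolve_base_url(os.environ, config_dir))
# Report only whether a token exists; never print the token itself.
print("token present:", resolve_token(os.environ, config_dir) is not None)
```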

Recorder excludes (per `eg4battery/homeassistant/recorder.yaml`) drop EG4
per-cell/register/string entities, so those have no history; the inverter
`device_mode`/`fault_code` and pack `soc`/`pack_voltage` etc. are recorded.

## Services (this Pi)

| Service | Role | Entities it feeds |
|---|---|---|
| `powermon.service` | LVX6048 #1 poller (PI18/USB) | `lvx6048_1_*` |
| `powermon2.service` | LVX6048 #2 poller (PI18/USB) | `lvx6048_2_*` |
| `lvx-resolve-links.service` | oneshot: maps `/dev/hidraw*` → `/dev/lvx6048-{1,2}` by PI18 serial; runs before powermon | (links) |
| `lvx-control.service` | bridges `solar/control/lvx6048/*` → powermon adhoc queue | (control) |
| `eg4-battery.service` | polls all 6 packs over RS485/Modbus | `lifepower4_1..6_*` |

Quick health: `systemctl is-active powermon.service powermon2.service eg4-battery.service lvx-control.service`

Logs: `journalctl -u <svc> --since "10 min ago" --no-pager`

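When scripting the quick-health check, the exit status alone is not enough: `systemctl is-active` exits 0 as long as at least one listed unit is active, so the per-unit lines need to be read. A minimal sketch, with hypothetical helper names:

```python
import subprocess

SERVICES = [
    "powermon.service",
    "powermon2.service",
    "eg4-battery.service",
    "lvx-control.service",
]

def parse_is_active(units, output):
    """Pair each unit with the state line systemctl printed for it, in order."""
    states = output.split()
    if len(states) != len(units):
        raise ValueError(f"expected {len(units)} state lines, got {len(states)}")
    return dict(zip(units, states))

def unhealthy(states):
    """Units whose state is anything other than 'active'."""
    return [unit for unit, state in states.items() if state != "active"]

def check_services(units=SERVICES):
    # No check=True: a non-zero exit is an expected outcome here, not an error.
    result = subprocess.run(
        ["systemctl", "is-active", *units], capture_output=True, text=True
    )
    return parse_is_active(units, result.stdout)

# Parsing a canned output, so the example runs without systemd:
print(unhealthy(parse_is_active(SERVICES, "active\nactive\nfailed\nactive\n")))
```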
## Entities cheat-sheet

**Inverters** `lvx6048_{1,2}_*` (PI18 GS/MOD/PIRI/FWS/ET):
`device_mode` (Power-On/Standby/Bypass/Battery/Fault/Charge…), `fault_code`,
`battery_voltage`, `battery_capacity` (%), `ac_output_active_power` (W),
`ac_output_voltage`, `grid_voltage`, `mppt1_input_power`/`mppt2_input_power` (W, PV),
`inverter_heat_sink_temperature`, `parallel_instance_number` (0 = master, 1+ = slave).

**Packs** `lifepower4_{1..6}_*` (Modbus): `soc`, `soc_alt`, `pack_voltage`,
`pack_current` (signed, + = charging), `cell_01..16_voltage`,
`cell_voltage_delta_mv` (imbalance), `cell_voltage_min`/`max`, `capacity_ah`,
`temperature_01..04`, `temperature_pcb`, `model`, `firmware_version`,
`firmware_date`, warning/protection bits, `register_NN` raw. There are 16 cells/pack.

**EVSE** `openevse/<key>` and `openevse_*` HA entities: `power` (W), `voltage`,
`amp` (mA raw → A in HA), `pilot`, `max_current`, `session_energy` (Wh),
`total_energy`, `status` (active/sleeping/disabled…), `state`, `temp`,
`vehicle` (plug). Charger HTTP UI at http://10.0.0.249.

Derived HA template sensors (`lifepower4_N_pack_power`, `_temperature_max`,
`_cell_imbalance_pct`, `lifepower4_stack_*`) are computed **inside HA**, not on
MQTT, so compute them yourself from the raw entities when working off the Pi.

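A rough sketch of recomputing the derived values from raw pack readings follows. The HA template definitions are not reproduced in this file, so treat the formulas as assumptions: power as voltage times signed current, max temperature as a plain maximum, and imbalance percent as cell spread over mean cell voltage. The function names are hypothetical.

```python
def pack_power_w(pack_voltage, pack_current):
    """Signed pack power in watts; positive means charging (same sign as current)."""
    return pack_voltage * pack_current

def temperature_max(temperatures):
    """Hottest of the readings passed in (e.g. temperature_01..04)."""
    return max(temperatures)

def cell_imbalance_pct(cell_voltages):
    """ASSUMED definition: (max - min) / mean * 100 over the 16 cell voltages."""
    mean = sum(cell_voltages) / len(cell_voltages)
    return (max(cell_voltages) - min(cell_voltages)) / mean * 100

def stack_power_w(packs):
    """ASSUMED definition: sum of per-pack powers over (voltage, current) pairs."""
    return sum(pack_power_w(v, i) for v, i in packs)

# Illustrative readings, not captured data.
print(pack_power_w(53.4, -10.0))
print(temperature_max([21.0, 22.5, 21.5, 22.0]))
print(cell_imbalance_pct([3.337] * 16))
```

If the numbers need to match HA exactly, check the template sensor definitions in HA first and adjust the formulas to agree with them.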
## Known issues / gotchas (check memory for the canonical versions)

- **Inverter `battery_voltage` is INTERMITTENTLY wrong**: it read a correct ~54 V on
  2026-06-20 (verified via HA history), but ~9–10 V on 2026-06-23/24 after the Jun 22
  14:18 reboot, with packs steady at ~52–53 V throughout. So it's a post-reboot /
  re-init glitch (the inverter or PI18 GS field not settling after restart), NOT a
  permanent scaling bug. Implication: treat the inverter battery reading as
  untrustworthy and use the `lifepower4_*` pack entities for any battery math; if it
  reads ~10 V right now, a powermon (or inverter) restart may clear it (worth testing).

Add calibration-charge skill to fix EG4 SoC counter drift (improvement #1)
The everyday profile caps grid charging at 54V, so the bank can go weeks
without a full charge and the EG4 BMS coulomb counters drift (proven: pack 6
read 76% SoC while at the same 53.4V/3.337V-per-cell as packs reading 50-55%
— all paralleled, so physically equal charge; the spread is pure drift).
- profiles/eg4-lp4-v2-calibration.yaml: temporary profile, identical to
canonical except stop_charge_voltage 54.0 -> 0 (Full), so grid can finish a
full charge to the 56.4V absorption hold that re-anchors every pack to 100%.
- calibration-charge skill: guided runbook (pre-flight safety, two methods
solar-only / grid-assist, live monitoring with cell-voltage/temp aborts,
re-anchor verification, mandatory revert).
- REFERENCE: scoped action-policy exception (this skill alone may flip
stop_charge, both units, user-confirmed, must revert); corrected pack-6 /
SoC-drift notes to the verified equal-voltage-different-SoC signature.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 12:11:47 -04:00

- **Pack 6 is an oddball**: Modbus addr `0x01` @ 115200 (packs 1–5 are `0x40` @
  9600). It reads SoC high (76 % on 2026-06-24 vs 50–55 % on packs 1–5) but at the
  SAME pack_voltage (53.4 V) and cell voltage (3.337 V), so that's **counter drift,
  not real imbalance**: all packs are paralleled and physically at the same charge.
- **EG4 SoC never re-anchors** (drifts because packs rarely hit 100 % to reset the
  coulomb counter). Verified live via the equal-voltage/different-SoC signature above.
  Fix = the `calibration-charge` skill (periodic full charge). See memory
  `project_eg4_soc_drift_remediation`.
- **RS485 daisy-chain silences slave packs**: each pack needs its own FTDI; an
  inter-pack chain demotes slaves. See memory `project_eg4_daisy_chain_silences_slaves`.
- **No per-day inverter energy**: PI18 only gives `ET` (lifetime Wh); `ED`/`EM`/`EY` are NAKed.
  Daily kWh must come from HA recorder or ET deltas.
- **Parallel cluster**: changing inverter settings on only one unit risks fault 86
  (desync). `lvx-control` always mirrors to both, which is why setters go through it.

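The equal-voltage/different-SoC signature described in the pack 6 and SoC re-anchor items can be checked mechanically. The sketch below is one way to do it; the function name and both thresholds are assumptions chosen for illustration, and the per-pack SoC values for packs 1 to 5 are made-up points inside the 50–55 % range quoted above.

```python
def classify_soc_spread(packs, max_voltage_spread_v=0.1, min_soc_spread_pct=10.0):
    """Classify a paralleled bank from per-pack `pack_voltage` and `soc`.

    Thresholds are illustrative assumptions, not values from the BMS.
    """
    voltages = [p["pack_voltage"] for p in packs.values()]
    socs = [p["soc"] for p in packs.values()]
    voltage_spread = max(voltages) - min(voltages)
    soc_spread = max(socs) - min(socs)
    if soc_spread < min_soc_spread_pct:
        return "ok"
    if voltage_spread <= max_voltage_spread_v:
        # Same voltage but different reported SoC: the counters disagree,
        # not the cells.
        return "counter-drift"
    return "investigate"

bank = {n: {"pack_voltage": 53.4, "soc": soc}
        for n, soc in zip(range(1, 7), [50, 52, 53, 55, 51, 76])}
print(classify_soc_spread(bank))
```

A result of "investigate" only means the voltages also differ; it does not by itself prove a real imbalance.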
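For the ET-delta route to daily kWh mentioned in the list above, one possible approach is to sum the increase between consecutive lifetime readings and credit it to the day of the later reading. The function name and sample readings are hypothetical, and skipping negative deltas (a counter reset) is an assumption.

```python
from collections import defaultdict
from datetime import datetime

def daily_kwh_from_et(samples):
    """Daily energy in kWh from (timestamp, lifetime_wh) readings of ET."""
    ordered = sorted(samples)
    per_day_wh = defaultdict(int)
    for (_, prev_wh), (ts, wh) in zip(ordered, ordered[1:]):
        delta = wh - prev_wh
        if delta < 0:
            continue  # assumed counter reset; ignore rather than subtract
        per_day_wh[ts.date()] += delta
    return {day: wh / 1000 for day, wh in per_day_wh.items()}

# Hypothetical readings for illustration only.
readings = [
    (datetime(2026, 6, 23, 0, 0), 1_000_000),
    (datetime(2026, 6, 23, 12, 0), 1_004_000),
    (datetime(2026, 6, 23, 23, 59), 1_009_500),
    (datetime(2026, 6, 24, 12, 0), 1_012_500),
]
for day, kwh in sorted(daily_kwh_from_et(readings).items()):
    print(day, kwh)
```

Accuracy depends on how often ET is sampled around midnight; with sparse samples, energy produced late one day is credited to the next.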
## Action policy for these skills

**Allowed (safe remediation):**
- Read anything: `solar-snapshot`, `mosquitto_sub`, `journalctl`, `systemctl status/is-active`.
- Restart the data-plane daemons when they're wedged:
  `sudo systemctl restart powermon.service` / `powermon2.service` / `eg4-battery.service` / `lvx-control.service`
- Recover inverter USB links: `sudo systemctl restart lvx-resolve-links.service`
  or `sudo /usr/local/sbin/lvx-resolve-links`.

**Forbidden (escalate to the user instead: propose the exact command, don't run it):**
- Any inverter/battery **setter**: `solar/control/lvx6048/*` publishes
  (charger priority, max charge current, output priority, …).
- `lvx-flash/flash.py apply` and `dump`/`compare`/`sync-check`: they contend for
  exclusive USB and stop powermon; advanced, user-driven only.
- Anything that writes battery thresholds, output mode, or factory resets.
- Power-cycling hardware, moving cables, breaker changes.

When a fix is outside the allowed set, report the finding and hand the user the
precise command(s) to run.

**Scoped exception (`calibration-charge` skill only):** that one skill may change
exactly one setting (`stop_charge_voltage` → Full and back) via the prepared
`eg4-lp4-v2-calibration.yaml` profile, on BOTH inverters, and ONLY after explicit
in-session user confirmation, and it must REVERT afterward. No other skill and no
other setting. This does not loosen the policy above for anything else.
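
One way to make the allowed set harder to overstep is a default-deny check before running anything. The sketch below is hypothetical and deliberately narrow: it recognises only the commands named in the Allowed list, does not model the `calibration-charge` exception, and is an aid rather than a substitute for reading the policy.

```python
import shlex

RESTARTABLE = {
    "powermon.service",
    "powermon2.service",
    "eg4-battery.service",
    "lvx-control.service",
    "lvx-resolve-links.service",
}
READ_ONLY_TOOLS = {"solar-snapshot", "mosquitto_sub", "journalctl"}

def is_allowed(command):
    """Return True only for commands the policy explicitly allows."""
    argv = shlex.split(command)
    if argv and argv[0] == "sudo":
        argv = argv[1:]
    if not argv:
        return False
    program = argv[0].rsplit("/", 1)[-1]  # accept absolute paths too
    if program in READ_ONLY_TOOLS or program == "lvx-resolve-links":
        return True
    if program == "systemctl" and len(argv) >= 2:
        verb, units = argv[1], argv[2:]
        if verb in {"status", "is-active"}:
            return True
        if verb == "restart":
            return bool(units) and all(unit in RESTARTABLE for unit in units)
    return False  # default deny: escalate to the user instead

print(is_allowed("sudo systemctl restart powermon.service"))
print(is_allowed("mosquitto_pub -t solar/control/lvx6048/example -m 1"))
```

Anything that returns False should be reported to the user with the exact command, as the policy requires.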