shaggy-solar/.claude/skills/REFERENCE.md
noise 56b2cc2bf1 Add calibration-charge skill to fix EG4 SoC counter drift (improvement #1)
The everyday profile caps grid charging at 54V, so the bank can go weeks
without a full charge and the EG4 BMS coulomb counters drift (proven: pack 6
read 76% SoC while at the same 53.4V/3.337V-per-cell as packs reading 50-55%
— all paralleled, so physically equal charge; the spread is pure drift).

- profiles/eg4-lp4-v2-calibration.yaml: temporary profile, identical to
  canonical except stop_charge_voltage 54.0 -> 0 (Full), so grid can finish a
  full charge to the 56.4V absorption hold that re-anchors every pack to 100%.
- calibration-charge skill: guided runbook (pre-flight safety, two methods
  solar-only / grid-assist, live monitoring with cell-voltage/temp aborts,
  re-anchor verification, mandatory revert).
- REFERENCE: scoped action-policy exception (this skill alone may flip
  stop_charge, both units, user-confirmed, must revert); corrected pack-6 /
  SoC-drift notes to the verified equal-voltage-different-SoC signature.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 12:11:47 -04:00

# Solar install — system map (shared reference for the solar skills)
This file is the ground truth the `solar-*` / `troubleshoot-*` / `power-usage`
skills build on. Read it once at the start of any solar task. Everything below
was verified live on this host (the monitoring Pi) on 2026-06-23; re-verify
anything load-bearing before acting on it.
## Topology
```
6× EG4 LifePower4 v2 packs ──RS485 (1 FTDI each)──┐
2× MPP Solar LVX6048 inverters ──USB-HID/PI18─────┤ this Pi ──MQTT──► HA broker
1× OpenEVSE charger (10.0.0.249) ──its own WiFi───┘ (daemons) 10.0.0.41:1883
```
All telemetry lands on the **MQTT broker at 10.0.0.41:1883** under HA
auto-discovery (`homeassistant/<class>/<entity>/config` retained, `.../state`
republished each poll cycle — **state topics are NOT retained**, so to read
current values you must listen for a window: use `lib/solar-snapshot`).
Broker credentials live in `~/.config/powermon/powermon.yaml`
(`mqttbroker.{name,port,username,password}`). **Never hardcode them** — every
tool here reads them from that file. `lib/solar-snapshot` does too.
## The snapshot helper
`./lib/solar-snapshot` (relative to this skills dir) captures the latest value of
every matching MQTT topic over a short window and prints a table. This is the
primary read tool — prefer it over raw `mosquitto_sub`.
```
solar-snapshot [-w SECONDS] [-g GREP_RE] [-f] TOPIC_FILTER...
```
MQTT `+` matches one WHOLE level, so `lifepower4_+` matches nothing. Subscribe to
`homeassistant/sensor/+/state` and narrow with `-g`:
```
solar-snapshot -g 'lvx6048_1_' 'homeassistant/sensor/+/state'
solar-snapshot -w 16 -g 'lifepower4_[1-6]_soc/' 'homeassistant/sensor/+/state'
solar-snapshot 'openevse/#' # EVSE publishes on-change; idle when unplugged
```
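The whole-level rule for `+` can be checked offline with a small matcher. This is a minimal sketch of standard MQTT filter matching, not code taken from `solar-snapshot`; the function name `topic_matches` is hypothetical, and the regex step assumes `-g` is applied to the topic string.

```python
import re


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if `topic` matches the MQTT subscription `topic_filter`."""
    f_levels = topic_filter.split("/")
    t_levels = topic.split("/")
    for i, f in enumerate(f_levels):
        if f == "#":
            # Multi-level wildcard: only valid as the last level; matches the rest.
            return i == len(f_levels) - 1
        if i >= len(t_levels):
            return False
        if f == "+":
            # Single-level wildcard: stands in for exactly one whole level.
            continue
        if f != t_levels[i]:
            # Anything else is compared literally, so "lifepower4_+" is just
            # an odd literal string here (real brokers reject it as invalid).
            return False
    return len(f_levels) == len(t_levels)


topic = "homeassistant/sensor/lifepower4_1_soc/state"
broad = topic_matches("homeassistant/sensor/+/state", topic)
partial = topic_matches("homeassistant/sensor/lifepower4_+/state", topic)
# Assumption: -g narrows by running a regex search over the topic.
narrowed = bool(re.search(r"lifepower4_[1-6]_soc/", topic))
```

Here `broad` and `narrowed` are true while `partial` is false, which is why the examples above subscribe broadly and filter with `-g`.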
## The history helper
`solar-snapshot` only sees *now*. For "when did X last happen / show last week",
use `./lib/ha-history`, which queries **Home Assistant's recorder** (the only
store that keeps history — local journald is volatile, ~1 day, wiped on reboot;
no solar data goes to InfluxDB). Default window 7 days; HA recorder default
retention is 10 days.
```
ha-history [-s SINCE] [-e END] [-m REGEX] [-a] ENTITY...
ha-history -s "10 days ago" sensor.lvx6048_lvx6048_1_device_mode sensor.lvx6048_lvx6048_2_device_mode
ha-history -s "10 days ago" -m fault sensor.lvx6048_01_lvx6048_1_fault_code sensor.lvx6048_02_lvx6048_2_fault_code
```
**HA entity_ids ≠ MQTT object names.** powermon's hass output doubles the device
slug and is inconsistent across commands, so you must use the real ids, e.g.:
- device mode: `sensor.lvx6048_lvx6048_{1,2}_device_mode` (device slug `lvx6048`)
- fault code: `sensor.lvx6048_0{1,2}_lvx6048_{1,2}_fault_code` (slug `lvx6048_01`/`_02`)
- PV/batt/load: `sensor.lvx6048_lvx6048_1_{mppt1_input_power,mppt1_input_voltage,battery_voltage,ac_output_active_power}`
- EG4 packs follow the same doubling, e.g. `sensor.lifepower4_*`. When unsure, list
them: `curl -s -H "Authorization: Bearer $(cat ~/.config/ha/token)" $HA/api/states
| python3 -c 'import sys,json;[print(s["entity_id"]) for s in json.load(sys.stdin) if "lvx6048" in s["entity_id"]]'`
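To avoid retyping the doubled slugs, the two inverter patterns listed above can be wrapped in small helpers. This is a sketch built only from the ids shown in this section; the function names are hypothetical, and the patterns should be re-checked against `/api/states` if powermon's hass output changes.

```python
def _check_unit(unit: int) -> None:
    # This install has exactly two inverters, numbered 1 and 2.
    if unit not in (1, 2):
        raise ValueError(f"unknown inverter unit: {unit!r}")


def device_mode_entity(unit: int) -> str:
    """HA entity_id for an inverter's device mode (device slug `lvx6048`)."""
    _check_unit(unit)
    return f"sensor.lvx6048_lvx6048_{unit}_device_mode"


def fault_code_entity(unit: int) -> str:
    """HA entity_id for an inverter's fault code (slug `lvx6048_01` / `_02`)."""
    _check_unit(unit)
    return f"sensor.lvx6048_{unit:02d}_lvx6048_{unit}_fault_code"


# Entity lists suitable for passing to ha-history as ENTITY arguments.
mode_entities = [device_mode_entity(u) for u in (1, 2)]
fault_entities = [fault_code_entity(u) for u in (1, 2)]
```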
Auth: reads a long-lived token from `~/.config/ha/token` (mode 600) or `$HA_TOKEN`
— never on the command line, never hardcoded. Base URL `$HA_URL` else
`~/.config/ha/url` else `http://10.0.0.41:8123`. If it reports "no token", the user
must create one (HA → Profile → Security → Long-lived access tokens) and write it
to `~/.config/ha/token`; tell them which file, don't ask them to paste it in chat.
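The base-URL fallback chain described above (`$HA_URL`, then `~/.config/ha/url`, then the default) might look like the following. This is an illustrative sketch, not the actual `ha-history` code; `resolve_ha_url` and its parameters are hypothetical, and stripping a trailing slash is an added assumption.

```python
import os
from pathlib import Path

DEFAULT_HA_URL = "http://10.0.0.41:8123"


def resolve_ha_url(env=None, url_file=None) -> str:
    """Pick the HA base URL: $HA_URL, else the url file, else the default."""
    env = os.environ if env is None else env
    url_file = Path.home() / ".config/ha/url" if url_file is None else Path(url_file)

    from_env = env.get("HA_URL", "").strip()
    if from_env:
        return from_env.rstrip("/")

    if url_file.is_file():
        from_file = url_file.read_text().strip()
        if from_file:
            return from_file.rstrip("/")

    return DEFAULT_HA_URL
```

The `env` and `url_file` parameters exist only so the lookup can be exercised without touching the real environment or home directory.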
Recorder excludes (per `eg4battery/homeassistant/recorder.yaml`) drop EG4
per-cell/register/string entities — those have no history; the inverter
`device_mode`/`fault_code` and pack `soc`/`pack_voltage` etc. are recorded.
## Services (this Pi)
| Service | Role | Entities it feeds |
|---|---|---|
| `powermon.service` | LVX6048 #1 poller (PI18/USB) | `lvx6048_1_*` |
| `powermon2.service` | LVX6048 #2 poller (PI18/USB) | `lvx6048_2_*` |
| `lvx-resolve-links.service` | oneshot: maps `/dev/hidraw*` → `/dev/lvx6048-{1,2}` by PI18 serial; runs before powermon | (links) |
| `lvx-control.service` | bridges `solar/control/lvx6048/*` → powermon adhoc queue | (control) |
| `eg4-battery.service` | polls all 6 packs over RS485/Modbus | `lifepower4_1..6_*` |
Quick health: `systemctl is-active powermon.service powermon2.service eg4-battery.service lvx-control.service`
Logs: `journalctl -u <svc> --since "10 min ago" --no-pager`
## Entities cheat-sheet
**Inverters** `lvx6048_{1,2}_*` (PI18 GS/MOD/PIRI/FWS/ET):
`device_mode` (Power-On/Standby/Bypass/Battery/Fault/Charge…), `fault_code`,
`battery_voltage`, `battery_capacity` (%), `ac_output_active_power` (W),
`ac_output_voltage`, `grid_voltage`, `mppt1_input_power`/`mppt2_input_power` (W, PV),
`inverter_heat_sink_temperature`, `parallel_instance_number` (0 = master, 1+ = slave).
**Packs** `lifepower4_{1..6}_*` (Modbus): `soc`, `soc_alt`, `pack_voltage`,
`pack_current` (signed, + = charging), `cell_01..16_voltage`,
`cell_voltage_delta_mv` (imbalance), `cell_voltage_min`/`max`, `capacity_ah`,
`temperature_01..04`, `temperature_pcb`, `model`, `firmware_version`,
`firmware_date`, warning/protection bits, `register_NN` raw. There are 16 cells/pack.
**EVSE** `openevse/<key>` and `openevse_*` HA entities: `power` (W), `voltage`,
`amp` (mA raw → A in HA), `pilot`, `max_current`, `session_energy` (Wh),
`total_energy`, `status` (active/sleeping/disabled…), `state`, `temp`,
`vehicle` (plug). Charger HTTP UI at http://10.0.0.249.
Derived HA template sensors (`lifepower4_N_pack_power`, `_temperature_max`,
`_cell_imbalance_pct`, `lifepower4_stack_*`) are computed **inside HA**, not on
MQTT — compute them yourself from the raw entities when working off the Pi.
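When working off the Pi, rough equivalents of the derived values can be computed from a snapshot of the raw pack entities. The HA template definitions are not reproduced in this file, so the formulas below are assumptions: power as voltage × current (sign follows `pack_current`, + = charging), the maximum over the temperature probes, and the cell spread in millivolts. The function names are hypothetical, and `_cell_imbalance_pct` is left out because its formula is not stated here.

```python
def pack_power_w(pack_voltage: float, pack_current: float) -> float:
    """Assumed pack power in watts: V x I, positive while charging."""
    return pack_voltage * pack_current


def temperature_max(temperatures: list[float]) -> float:
    """Hottest reading across a pack's temperature probes."""
    return max(temperatures)


def cell_delta_mv(cell_voltages: list[float]) -> int:
    """Spread between the highest and lowest cell, in millivolts."""
    return round((max(cell_voltages) - min(cell_voltages)) * 1000)


def stack_power_w(packs: list[dict]) -> float:
    """Assumed stack power: the sum of per-pack power over all packs."""
    return sum(pack_power_w(p["pack_voltage"], p["pack_current"]) for p in packs)


# Illustrative values only, not measurements from this install.
example_packs = [
    {"pack_voltage": 53.4, "pack_current": 5.0},
    {"pack_voltage": 53.4, "pack_current": -2.0},
]
example_total = stack_power_w(example_packs)
```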
## Known issues / gotchas (check memory for the canonical versions)
- **Inverter `battery_voltage` is INTERMITTENTLY wrong**: it read a correct ~54 V on
2026-06-20 (verified via HA history), but ~9–10 V on 2026-06-23/24 after the Jun 22
14:18 reboot, with packs steady at ~52–53 V throughout. So it's a post-reboot /
re-init glitch (the inverter or PI18 GS field not settling after restart), NOT a
permanent scaling bug. Implication: treat the inverter battery reading as
untrustworthy and use the `lifepower4_*` pack entities for any battery math; if it
reads ~10 V right now, a powermon (or inverter) restart may clear it, which is worth testing.
- **Pack 6 is an oddball**: Modbus addr `0x01` @ 115200 (packs 1–5 are `0x40` @
9600). It reads SoC high (76 % on 2026-06-24 vs 50–55 % on packs 1–5), but at the
SAME pack_voltage (53.4 V) and cell voltage (3.337 V), so that's **counter drift,
not real imbalance**: all packs are paralleled and physically at the same charge.
- **EG4 SoC never re-anchors** (drifts because packs rarely hit 100 % to reset the
coulomb counter). Verified live via the equal-voltage/different-SoC signature above.
Fix = the `calibration-charge` skill (periodic full charge). See memory
`project_eg4_soc_drift_remediation`.
- **RS485 daisy-chain silences slave packs** — each pack needs its own FTDI; an
inter-pack chain demotes slaves. See memory `project_eg4_daisy_chain_silences_slaves`.
- **No per-day inverter energy** — PI18 only gives `ET` (lifetime Wh); ED/EM/EY NAK.
Daily kWh must come from HA recorder or ET deltas.
- **Parallel cluster**: changing inverter settings on only one unit risks fault 86
(desync). `lvx-control` always mirrors to both — that's why setters go through it.
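The equal-voltage/different-SoC signature from the pack notes above can be turned into a quick check over a snapshot. This is a minimal sketch: the function name and both thresholds are assumptions chosen for illustration, and the per-pack SoC values in the example are made up within the 50–55 % range reported above (only pack 6's 76 % and the shared 53.4 V come from this file).

```python
def soc_drift_suspected(
    packs: dict[int, dict[str, float]],
    max_voltage_spread_v: float = 0.1,   # assumed: "same voltage" tolerance
    min_soc_spread_pct: float = 10.0,    # assumed: SoC gap worth flagging
) -> bool:
    """Flag coulomb-counter drift: paralleled packs at near-equal voltage
    whose reported SoC values nevertheless disagree widely."""
    voltages = [p["pack_voltage"] for p in packs.values()]
    socs = [p["soc"] for p in packs.values()]
    voltage_spread = max(voltages) - min(voltages)
    soc_spread = max(socs) - min(socs)
    return voltage_spread <= max_voltage_spread_v and soc_spread >= min_soc_spread_pct


# Illustrative snapshot: packs 1-5 placeholders, pack 6 as reported on 2026-06-24.
snapshot = {
    1: {"pack_voltage": 53.4, "soc": 50.0},
    2: {"pack_voltage": 53.4, "soc": 52.0},
    3: {"pack_voltage": 53.4, "soc": 53.0},
    4: {"pack_voltage": 53.4, "soc": 54.0},
    5: {"pack_voltage": 53.4, "soc": 55.0},
    6: {"pack_voltage": 53.4, "soc": 76.0},
}
drift = soc_drift_suspected(snapshot)
```

A positive result is a prompt to consider the `calibration-charge` skill, not proof of drift; LiFePO4 voltage is flat across mid-range SoC, so the thresholds need tuning against real data.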
## Action policy for these skills
**Allowed (safe remediation):**
- Read anything: `solar-snapshot`, `mosquitto_sub`, `journalctl`, `systemctl status/is-active`.
- Restart the data-plane daemons when they're wedged:
`sudo systemctl restart powermon.service` / `powermon2.service` / `eg4-battery.service` / `lvx-control.service`
- Recover inverter USB links: `sudo systemctl restart lvx-resolve-links.service`
or `sudo /usr/local/sbin/lvx-resolve-links`.
**Forbidden (escalate to the user instead — propose the exact command, don't run it):**
- Any inverter/battery **setter**: `solar/control/lvx6048/*` publishes
(charger priority, max charge current, output priority, …).
- `lvx-flash/flash.py apply` and `dump`/`compare`/`sync-check` — they contend for
exclusive USB and stop powermon; advanced, user-driven only.
- Anything that writes battery thresholds, output mode, or factory resets.
- Power-cycling hardware, moving cables, breaker changes.
When a fix is outside the allowed set, report the finding and hand the user the
precise command(s) to run.
**Scoped exception — `calibration-charge` skill only:** that one skill may change
exactly one setting (`stop_charge_voltage` → Full and back) via the prepared
`eg4-lp4-v2-calibration.yaml` profile, on BOTH inverters, and ONLY after explicit
in-session user confirmation, and it must REVERT afterward. No other skill and no
other setting. This does not loosen the policy above for anything else.
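Part of the policy above can be expressed as a pre-flight guard that a skill runs before acting. This sketch covers only two rules (which services may be restarted, and that publishes under `solar/control/lvx6048/` are setters); it deliberately does not model the `calibration-charge` exception, and the names are hypothetical.

```python
# Services the policy allows restarting when they are wedged.
RESTARTABLE_SERVICES = frozenset({
    "powermon.service",
    "powermon2.service",
    "eg4-battery.service",
    "lvx-control.service",
    "lvx-resolve-links.service",
})

# Any publish under this prefix is an inverter setter and must be escalated.
SETTER_TOPIC_PREFIX = "solar/control/lvx6048/"


def may_restart(service: str) -> bool:
    """True if the policy allows restarting this systemd unit."""
    return service in RESTARTABLE_SERVICES


def must_escalate_publish(topic: str) -> bool:
    """True if publishing to this MQTT topic is a forbidden setter."""
    return topic.startswith(SETTER_TOPIC_PREFIX)
```

Anything the guard rejects should be reported to the user together with the exact command to run, as the policy requires.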