Compare commits

10 Commits

Author SHA1 Message Date
8dafce7dfe Docs: reflect this session's findings across the repo
- top-level README.md (new): system overview, subsystem map, skills pointer,
  notable findings.
- eg4battery README/NOTES: 3 -> 6 packs (pack 6 oddball 0x01/115200); SoC drift
  + calibration section; closed-loop comms evaluated and rejected (loses per-pack
  telemetry, no native protocol, doesn't fix drift); how to force a grid charge
  via output-priority SUB.
- LVX6048 README: closed-loop pending item -> resolved decision; new "SoC
  calibration & known firmware quirks" section (POP single-digit/POP01, MCHGC
  charge-lock, re_discharge=re-discharge can't exceed float, PIRI lag, powermon
  adhoc wedge); skills pointer.
- lvx-control README: POP encoding fix, POP crc-but-applies quirk, verify-by-
  behavior, grid-charge-via-SUB usage.
- troubleshoot-inverter skill: corrected the stale "dead string per inverter"
  claim — both strings healthy; low PV is tilt/heat/shade/curtailment.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 21:46:02 -04:00
4bfa021719 Fix POP01 encoding bug + harden grid-cal revert verification
Root cause of the grid-calibration auto-revert silently failing: lvx-control
and flash.py encode output_priority solar_battery_utility as "POP01", but PI18
POP is single-digit — the inverter silently rejects "POP01" (raw "POP1" works,
matches the POP_PIRI decoder). Compounded by powermon's adhoc queue wedging,
which dropped commands entirely until a restart. So the monitor logged "revert
done" while the cluster sat in SUB/grid mode for ~1.5h (no harm: battery full,
just running loads on grid).

- lvx-control + flash.py: POP_MAP "01" -> "1" (also patched the live
  /usr/local/bin/lvx-control + restarted; verified it now emits POP1).
- grid-cal-monitor: revert now VERIFIES via behavior (line_power_direction
  leaves 'input'), and on failure restarts powermon and re-sends raw POP1/PCP0,0,
  with a loud manual-fallback message. No more trust-the-publish.

Recovery for the live run: restarted powermon (unstuck adhoc) + raw POP1 + PCP0,0;
confirmed POP=Solar-Battery-Utility, PCP=Solar First, mode=Battery, line_dir=donothing.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 21:37:16 -04:00
76765a95ed Grid calibration: correct lever is output-priority SUB, add grid-cal-monitor
Discovered live 2026-06-25 driving an actual grid calibration: forcing a full
grid charge is done via OUTPUT PRIORITY, not voltage thresholds.
- SBU (everyday) won't grid-charge unless the bank is critically low; setting
  charger_priority=solar_and_utility alone does nothing at 52V.
- SUB (output_priority=solar_utility_battery) runs loads on grid AND charges the
  battery to full. Combined with charger_priority=solar_and_utility, grid charging
  engages (device_mode->Hybrid/Line, line_dir->input, pack current jumps to ~120A).
- Both POP/PCP set via lvx-control (all-mode-safe, atomic, no flash/USB). Revert
  POP->solar_battery_utility, PCP->solar_first when done.

The re_discharge/flash.py approach is dead (firmware NAKs stop_charge>float);
profile eg4-lp4-v2-calibration.yaml marked DEPRECATED.

- grid-cal-monitor: supervises a SUB grid charge, safety aborts (cell>3.60V/
  temp>45C), detects re-anchor (all 6 packs ->100%), auto-reverts POP+PCP (trap).
- calibration-charge skill §3 rewritten to the POP lever.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 12:35:51 -04:00
34d34f6e6c Add solar-morning-run controller (solar-only calibration default)
Unattended morning runner for the calibration top-off. DEFAULT is solar-only
@ 60A: no setter, reads telemetry, weather-gates (PV<4kW by 10:30 -> abort),
monitors the charge with cell>3.65V / temp>45C aborts, verifies all 6 packs
re-anchor to 100%. Validated end-to-end via --dry-run against live HA.

Key firmware finding baked in (confirmed live): MCHGC is LOCKED while charging
(NAKs even in device_mode 'Battery' when charger_status='charging') -- so the
80A throttle test is opt-in (THROTTLE=1), gated on a true pre-charge idle
window, with retry-on-revert and a guaranteed-safe fallback (cap stays 80A
until idle if revert NAKs). No clean noon A/B is possible; documented as such.

Also handles the HA pack-temperature unit trap (entities report degF; the
script reads unit_of_measurement and converts to degC for the safety check).

REFERENCE: documented the MCHGC charging-lock under known issues.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 08:07:49 -04:00
4d6c6c109b Correct PV analysis: no down string; peak is charge-cap-limited
Clear-noon peak (2026-06-24 13:44, HA recorder): each inverter ~4.7kW @ ~300V
@ ~16A. A down string reads ~10A; ~16A = both parallel strings live. The
"one string down per inverter" belief was from confounded afternoon/rainy
samples + transient pv_loss_warning (now off) and is refuted.

~66% of nameplate at peak is explained by 45-deg tilt (wrong for high June sun),
heat derate, and tree shading. AND the peak was demand-limited: battery charge
pinned at the 120A cap (121A) at 65% SoC with only 3.3kW load -- reversing the
earlier off-peak "not clipping" note. Documented the throttle test to resolve
true array ceiling before raising max_charging_current.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 14:41:32 -04:00
5e175d4d0b Fix calibration grid-assist lever: firmware NAKs stop_charge=0/Full
Live run 2026-06-24: flash.py apply NAK'd BUCD480,000 on both inverters — the
firmware rejects stop_charge_voltage=0 ("Full"). flash.py aborts on first setter
failure, so nothing changed and the cluster stayed in sync (verified).

The field flash.py calls stop_charge_voltage is actually the inverter's
battery_re_discharge_voltage (HA: sensor.lvx6048_*_battery_re_discharge_voltage):
the V at which loads switch back to battery after grid charging. 54.0 tops grid
charge to ~54V; raising to 56.0 is the corrected (but UNVALIDATED) lever and may
band-oscillate rather than hold absorption.

- calibration profile: 0 -> 56.0, with the finding documented.
- skill: solar-only is now the RECOMMENDED/known-good method; grid-assist demoted
  to advanced/unvalidated with a mandatory diff-preview gate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 14:14:35 -04:00
f1128e807a Record decision to keep 48.0V discharge floor (improvement #3)
Reviewed the cutoff/stop-discharge floor; deliberately kept at 48.0V (3.00
V/cell) for cycle-life margin. Lowering to 46-47V unlocks only a few % of LFP
capacity and isn't worth deeper cycling on this large, non-capacity-constrained
bank. Comment only; no value change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 12:28:44 -04:00
b5b69f95c1 Correct EG4 profile sizing 3->6 packs; keep 60A charge cap (improvement #2)
- eg4-lp4-v2.yaml: rationale updated 300Ah->600Ah/~30.7kWh; 120A combined is
  now ~0.2C. Charge current deliberately NOT raised to 80A: not clipping — at
  solar noon on a clear day (2026-06-24, ~5.8kW PV) the bank took only 71A of
  the 120A cap. Real harvest limiter is PV (5.8 of 14.4kW nameplate, suspected
  down strings), not the ceiling.
- REFERENCE: pack HA entity_ids are triple-prefixed
  (sensor.eg4_lifepower4_lifepower4_1_lifepower4_1_*) — discover, don't construct.

No setting values changed; documentation/accuracy only.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 12:25:48 -04:00
56b2cc2bf1 Add calibration-charge skill to fix EG4 SoC counter drift (improvement #1)
The everyday profile caps grid charging at 54V, so the bank can go weeks
without a full charge and the EG4 BMS coulomb counters drift (proven: pack 6
read 76% SoC while at the same 53.4V/3.337V-per-cell as packs reading 50-55%
— all paralleled, so physically equal charge; the spread is pure drift).

- profiles/eg4-lp4-v2-calibration.yaml: temporary profile, identical to
  canonical except stop_charge_voltage 54.0 -> 0 (Full), so grid can finish a
  full charge to the 56.4V absorption hold that re-anchors every pack to 100%.
- calibration-charge skill: guided runbook (pre-flight safety, two methods
  solar-only / grid-assist, live monitoring with cell-voltage/temp aborts,
  re-anchor verification, mandatory revert).
- REFERENCE: scoped action-policy exception (this skill alone may flip
  stop_charge, both units, user-confirmed, must revert); corrected pack-6 /
  SoC-drift notes to the verified equal-voltage-different-SoC signature.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 12:11:47 -04:00
aa97d65b0c Add solar monitoring/troubleshooting skills for agents
Four Skill-tool skills under .claude/skills/ that let an agent monitor and
troubleshoot the install (2x LVX6048, 6x EG4 LifePower4, OpenEVSE), grounded
in the real MQTT/HA topology rather than generic advice:

- solar-health-check  : whole-system sweep + cross-checks + R/Y/G verdict,
                        incl. cross-unit "silently-dead inverter" detection
- troubleshoot-inverter: FWS fault decode, parallel sync, USB link recovery
- troubleshoot-battery : per-pack imbalance vs SoC-counter-drift, RS485 silence
- power-usage         : PV/load/grid/battery balance + EVSE sessions

Shared lib:
- solar-snapshot : live MQTT capture (creds from powermon.yaml, no hardcoding)
- ha-history     : HA recorder lookback (token from ~/.config/ha/token)
REFERENCE.md documents topology, real HA entity_ids (doubled slug), known
issues, and a safe-remediation-only action policy (restarts yes; setters no).

Action boundary: diagnose + restart wedged daemons / recover USB links;
never touches inverter/battery setters or flash.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 11:46:20 -04:00
20 changed files with 1522 additions and 25 deletions

2
.claude/skills/.gitignore vendored Normal file

@@ -0,0 +1,2 @@
__pycache__/
*.pyc

167
.claude/skills/REFERENCE.md Normal file

@@ -0,0 +1,167 @@
# Solar install — system map (shared reference for the solar skills)
This file is the ground truth the `solar-*` / `troubleshoot-*` / `power-usage`
skills build on. Read it once at the start of any solar task. Everything below
was verified live on this host (the monitoring Pi) on 2026-06-23; re-verify
anything load-bearing before acting on it.
## Topology
```
6× EG4 LifePower4 v2 packs ──RS485 (1 FTDI each)──┐
2× MPP Solar LVX6048 inverters ──USB-HID/PI18─────┤ this Pi ──MQTT──► HA broker
1× OpenEVSE charger (10.0.0.249) ──its own WiFi───┘ (daemons) 10.0.0.41:1883
```
All telemetry lands on the **MQTT broker at 10.0.0.41:1883** under HA
auto-discovery (`homeassistant/<class>/<entity>/config` retained, `.../state`
republished each poll cycle — **state topics are NOT retained**, so to read
current values you must listen for a window: use `lib/solar-snapshot`).
Broker credentials live in `~/.config/powermon/powermon.yaml`
(`mqttbroker.{name,port,username,password}`). **Never hardcode them** — every
tool here reads them from that file. `lib/solar-snapshot` does too.
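As a rough illustration of reading the credentials from the file instead of hardcoding them, here is a minimal stdlib-only sketch. It assumes the `mqttbroker:` block is a flat top-level mapping with the four keys named above; `read_mqtt_broker` and the sample values are hypothetical, and the real tools may parse the file differently.

```python
import re

def read_mqtt_broker(yaml_text):
    """Extract name/port/username/password from the top-level mqttbroker: block.

    Assumption: flat 'key: value' lines indented under 'mqttbroker:'.
    """
    creds = {}
    in_block = False
    for line in yaml_text.splitlines():
        if re.match(r"^\S", line):  # any top-level key starts or ends the block
            in_block = line.startswith("mqttbroker:")
            continue
        m = re.match(r"^\s+(name|port|username|password):\s*(\S+)", line)
        if in_block and m:
            creds[m.group(1)] = m.group(2).strip("'\"")
    creds["port"] = int(creds.get("port", 1883))  # default MQTT port if absent
    return creds

# Hypothetical file content, for illustration only.
sample = """\
device:
  name: lvx6048_1
mqttbroker:
  name: 10.0.0.41
  port: 1883
  username: example_user
  password: example_pass
"""
print(read_mqtt_broker(sample))
```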
## The snapshot helper
`./lib/solar-snapshot` (relative to this skills dir) captures the latest value of
every matching MQTT topic over a short window and prints a table. This is the
primary read tool — prefer it over raw `mosquitto_sub`.
```
solar-snapshot [-w SECONDS] [-g GREP_RE] [-f] TOPIC_FILTER...
```
MQTT `+` matches one WHOLE level, so `lifepower4_+` matches nothing. Subscribe to
`homeassistant/sensor/+/state` and narrow with `-g`:
```
solar-snapshot -g 'lvx6048_1_' 'homeassistant/sensor/+/state'
solar-snapshot -w 16 -g 'lifepower4_[1-6]_soc/' 'homeassistant/sensor/+/state'
solar-snapshot 'openevse/#' # EVSE publishes on-change; idle when unplugged
```
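To make the wildcard rule concrete, the sketch below implements MQTT topic-filter matching level by level and then narrows with a regex, the way `-g` is described above. `topic_matches` is a hypothetical helper, not part of `solar-snapshot`; it treats a partial-level pattern such as `lifepower4_+` as a literal (strictly, such a filter is invalid in MQTT), and it ignores `$`-prefixed system topics.

```python
import re

def topic_matches(topic_filter, topic):
    """Return True if an MQTT topic matches a filter using + and # wildcards."""
    f_levels = topic_filter.split("/")
    t_levels = topic.split("/")
    for i, f in enumerate(f_levels):
        if f == "#":  # multi-level wildcard: matches everything from here on
            return True
        if i >= len(t_levels):
            return False
        if f != "+" and f != t_levels[i]:  # + must stand for one WHOLE level
            return False
    return len(f_levels) == len(t_levels)

topics = [
    "homeassistant/sensor/lifepower4_1_soc/state",
    "homeassistant/sensor/lifepower4_1_pack_voltage/state",
    "homeassistant/sensor/lvx6048_1_device_mode/state",
]
# Broad subscription, then regex narrowing (same idea as the -g option).
hits = [t for t in topics
        if topic_matches("homeassistant/sensor/+/state", t)
        and re.search(r"lifepower4_[1-6]_soc/", t)]
print(hits)
```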
## The history helper
`solar-snapshot` only sees *now*. For "when did X last happen / show last week",
use `./lib/ha-history`, which queries **Home Assistant's recorder** (the only
store that keeps history — local journald is volatile, ~1 day, wiped on reboot;
no solar data goes to InfluxDB). Default window 7 days; HA recorder default
retention is 10 days.
```
ha-history [-s SINCE] [-e END] [-m REGEX] [-a] ENTITY...
ha-history -s "10 days ago" sensor.lvx6048_lvx6048_1_device_mode sensor.lvx6048_lvx6048_2_device_mode
ha-history -s "10 days ago" -m fault sensor.lvx6048_01_lvx6048_1_fault_code sensor.lvx6048_02_lvx6048_2_fault_code
```
**HA entity_ids ≠ MQTT object names.** powermon's hass output doubles the device
slug and is inconsistent across commands, so you must use the real ids, e.g.:
- device mode: `sensor.lvx6048_lvx6048_{1,2}_device_mode` (device slug `lvx6048`)
- fault code: `sensor.lvx6048_0{1,2}_lvx6048_{1,2}_fault_code` (slug `lvx6048_01`/`_02`)
- PV/batt/load: `sensor.lvx6048_lvx6048_1_{mppt1_input_power,mppt1_input_voltage,battery_voltage,ac_output_active_power}`
- EG4 packs are TRIPLE-prefixed (even worse), e.g.
`sensor.eg4_lifepower4_lifepower4_1_lifepower4_1_pack_current` — device slug
`eg4_lifepower4_lifepower4_1` + object `lifepower4_1_pack_current`. Always discover,
don't hand-construct. When unsure, list them:
`curl -s -H "Authorization: Bearer $(cat ~/.config/ha/token)" $HA/api/states
| python3 -c 'import sys,json;[print(s["entity_id"]) for s in json.load(sys.stdin) if "lvx6048" in s["entity_id"]]'`
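The same discovery step can be sketched as a small Python function that filters an already-fetched `/api/states` payload (a JSON list of objects with an `entity_id` field). `find_entity_ids` and the sample payload are hypothetical; fetching and auth are left out on purpose.

```python
def find_entity_ids(states, needle, suffix=""):
    """Return sorted entity_ids containing `needle` and ending with `suffix`."""
    return sorted(
        s["entity_id"] for s in states
        if needle in s["entity_id"] and s["entity_id"].endswith(suffix)
    )

# Hypothetical excerpt of a /api/states response, for illustration only.
states = [
    {"entity_id": "sensor.eg4_lifepower4_lifepower4_1_lifepower4_1_pack_current", "state": "12.3"},
    {"entity_id": "sensor.eg4_lifepower4_lifepower4_1_lifepower4_1_soc", "state": "55"},
    {"entity_id": "sensor.lvx6048_lvx6048_1_device_mode", "state": "Battery"},
]
print(find_entity_ids(states, "lifepower4_1", suffix="_soc"))
```

Looking the id up this way avoids guessing how many times the slug is repeated.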
Auth: reads a long-lived token from `~/.config/ha/token` (mode 600) or `$HA_TOKEN`
— never on the command line, never hardcoded. Base URL `$HA_URL` else
`~/.config/ha/url` else `http://10.0.0.41:8123`. If it reports "no token", the user
must create one (HA → Profile → Security → Long-lived access tokens) and write it
to `~/.config/ha/token`; tell them which file, don't ask them to paste it in chat.
Recorder excludes (per `eg4battery/homeassistant/recorder.yaml`) drop EG4
per-cell/register/string entities — those have no history; the inverter
`device_mode`/`fault_code` and pack `soc`/`pack_voltage` etc. are recorded.
## Services (this Pi)
| Service | Role | Entities it feeds |
|---|---|---|
| `powermon.service` | LVX6048 #1 poller (PI18/USB) | `lvx6048_1_*` |
| `powermon2.service` | LVX6048 #2 poller (PI18/USB) | `lvx6048_2_*` |
| `lvx-resolve-links.service` | oneshot: maps `/dev/hidraw*` → `/dev/lvx6048-{1,2}` by PI18 serial; runs before powermon | (links) |
| `lvx-control.service` | bridges `solar/control/lvx6048/*` → powermon adhoc queue | (control) |
| `eg4-battery.service` | polls all 6 packs over RS485/Modbus | `lifepower4_1..6_*` |
Quick health: `systemctl is-active powermon.service powermon2.service eg4-battery.service lvx-control.service`
Logs: `journalctl -u <svc> --since "10 min ago" --no-pager`
## Entities cheat-sheet
**Inverters** `lvx6048_{1,2}_*` (PI18 GS/MOD/PIRI/FWS/ET):
`device_mode` (Power-On/Standby/Bypass/Battery/Fault/Charge…), `fault_code`,
`battery_voltage`, `battery_capacity` (%), `ac_output_active_power` (W),
`ac_output_voltage`, `grid_voltage`, `mppt1_input_power`/`mppt2_input_power` (W, PV),
`inverter_heat_sink_temperature`, `parallel_instance_number` (0 = master, 1+ = slave).
**Packs** `lifepower4_{1..6}_*` (Modbus): `soc`, `soc_alt`, `pack_voltage`,
`pack_current` (signed, + = charging), `cell_01..16_voltage`,
`cell_voltage_delta_mv` (imbalance), `cell_voltage_min`/`max`, `capacity_ah`,
`temperature_01..04`, `temperature_pcb`, `model`, `firmware_version`,
`firmware_date`, warning/protection bits, `register_NN` raw. There are 16 cells/pack.
**EVSE** `openevse/<key>` and `openevse_*` HA entities: `power` (W), `voltage`,
`amp` (mA raw → A in HA), `pilot`, `max_current`, `session_energy` (Wh),
`total_energy`, `status` (active/sleeping/disabled…), `state`, `temp`,
`vehicle` (plug). Charger HTTP UI at http://10.0.0.249.
Derived HA template sensors (`lifepower4_N_pack_power`, `_temperature_max`,
`_cell_imbalance_pct`, `lifepower4_stack_*`) are computed **inside HA**, not on
MQTT — compute them yourself from the raw entities when working off the Pi.
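A minimal sketch of recomputing two of those derived values from raw readings follows. It assumes pack power is simply `pack_voltage * pack_current` and that the maximum temperature is taken over every `temperature_*` reading; the actual HA templates are not shown here and may differ. The `stack_current` / `stack_power` key names are hypothetical.

```python
def derive_pack(raw):
    """Derive per-pack values from raw readings (assumed formulas, see lead-in)."""
    temps = [v for k, v in raw.items() if k.startswith("temperature_")]
    return {
        "pack_power": round(raw["pack_voltage"] * raw["pack_current"], 1),  # W, + = charging
        "temperature_max": max(temps),
    }

def derive_stack(raw_packs):
    """Sum current and power across all paralleled packs."""
    derived = [derive_pack(r) for r in raw_packs]
    return {
        "stack_current": round(sum(r["pack_current"] for r in raw_packs), 1),
        "stack_power": round(sum(d["pack_power"] for d in derived), 1),
    }

# Hypothetical readings, for illustration only.
raw = {"pack_voltage": 53.4, "pack_current": 10.0,
       "temperature_01": 24.0, "temperature_pcb": 31.0}
print(derive_pack(raw))
print(derive_stack([raw, raw]))
```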
## Known issues / gotchas (check memory for the canonical versions)
- **Inverter `battery_voltage` is INTERMITTENTLY wrong** — read a correct ~54 V on
2026-06-20 (verified via HA history), but ~9–10 V on 2026-06-23/24 after the Jun 22
14:18 reboot, with packs steady at ~52–53 V throughout. So it's a post-reboot /
re-init glitch (the inverter or PI18 GS field not settling after restart), NOT a
permanent scaling bug. Implication: treat the inverter battery reading as
untrustworthy and use the `lifepower4_*` pack entities for any battery math; if it
reads ~10 V right now, a powermon (or inverter) restart may clear it — worth testing.
- **Pack 6 is an oddball**: Modbus addr `0x01` @ 115200 (packs 1–5 are `0x40` @
9600). It reads SoC high (76 % on 2026-06-24 vs 50–55 % on packs 1–5), but at the
SAME pack_voltage (53.4 V) and cell voltage (3.337 V), so that's **counter drift,
not real imbalance**: all packs are paralleled and physically at the same charge.
- **EG4 SoC never re-anchors** (drifts because packs rarely hit 100 % to reset the
coulomb counter). Verified live via the equal-voltage/different-SoC signature above.
Fix = the `calibration-charge` skill (periodic full charge). See memory
`project_eg4_soc_drift_remediation`.
- **RS485 daisy-chain silences slave packs** — each pack needs its own FTDI; an
inter-pack chain demotes slaves. See memory `project_eg4_daisy_chain_silences_slaves`.
- **No per-day inverter energy** — PI18 only gives `ET` (lifetime Wh); ED/EM/EY NAK.
Daily kWh must come from HA recorder or ET deltas.
- **Parallel cluster**: changing inverter settings on only one unit risks fault 86
(desync). `lvx-control` always mirrors to both — that's why setters go through it.
- **MCHGC (max_charging_current) is firmware-LOCKED while charging** — confirmed live
2026-06-25: a cap change NAKs ("Failed") on BOTH units whenever `mppt1_charger_status`
= `charging`, even though `device_mode` still reads `Battery`. So the cap is only
settable in a true pre-charge idle window (dawn) and revertible only once charging
stops. Detect charging via `charger_status`, NOT `device_mode`. This is why
`solar-morning-run` defaults to solar-only @ 60 A and gates the 80 A throttle behind
an idle check. Same lock applies via `flash.py` (it's an inverter-side lock).
- **MCHGC `0`/Full equivalent — see** `battery_re_discharge_voltage` gotcha in the
calibration notes (`stop_charge_voltage` is really re-discharge; firmware NAKs 0).
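For the "daily kWh from ET deltas" point in the list above, one possible approach is sketched here. It assumes you already have timestamped lifetime-Wh samples (for example from the HA recorder) with ISO timestamps in one consistent format; `daily_kwh` is a hypothetical helper and does not handle counter resets or glitches.

```python
from datetime import datetime

def daily_kwh(samples):
    """Per-day energy in kWh from (iso_timestamp, lifetime_wh) samples.

    Each day's energy is the last reading of that day minus the last reading
    of the previous day; the first day is measured from the first sample.
    """
    ordered = sorted(samples)
    last_of_day = {}
    for ts, wh in ordered:
        last_of_day[datetime.fromisoformat(ts).date()] = wh
    result = {}
    prev = ordered[0][1]  # baseline: first reading in the window
    for day in sorted(last_of_day):
        result[day.isoformat()] = (last_of_day[day] - prev) / 1000.0
        prev = last_of_day[day]
    return result

# Hypothetical ET samples, for illustration only.
samples = [
    ("2026-06-23T00:05:00", 1000000),
    ("2026-06-23T23:55:00", 1021500),
    ("2026-06-24T23:55:00", 1050000),
]
print(daily_kwh(samples))
```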
## Action policy for these skills
**Allowed (safe remediation):**
- Read anything: `solar-snapshot`, `mosquitto_sub`, `journalctl`, `systemctl status/is-active`.
- Restart the data-plane daemons when they're wedged:
`sudo systemctl restart powermon.service` / `powermon2.service` / `eg4-battery.service` / `lvx-control.service`
- Recover inverter USB links: `sudo systemctl restart lvx-resolve-links.service`
or `sudo /usr/local/sbin/lvx-resolve-links`.
**Forbidden (escalate to the user instead — propose the exact command, don't run it):**
- Any inverter/battery **setter**: `solar/control/lvx6048/*` publishes
(charger priority, max charge current, output priority, …).
- `lvx-flash/flash.py apply` and `dump`/`compare`/`sync-check` — they contend for
exclusive USB and stop powermon; advanced, user-driven only.
- Anything that writes battery thresholds, output mode, or factory resets.
- Power-cycling hardware, moving cables, breaker changes.
When a fix is outside the allowed set, report the finding and hand the user the
precise command(s) to run.
**Scoped exception — `calibration-charge` skill only:** that one skill may change
exactly one setting (`stop_charge_voltage` → Full and back) via the prepared
`eg4-lp4-v2-calibration.yaml` profile, on BOTH inverters, and ONLY after explicit
in-session user confirmation, and it must REVERT afterward. No other skill and no
other setting. This does not loosen the policy above for anything else.

@@ -0,0 +1,126 @@
---
name: calibration-charge
description: >-
Guided runbook to re-anchor the 6 EG4 LifePower4 pack SoC counters by driving a
full charge to absorption and verifying every pack resets to 100%. Use when pack
SoC readings have drifted (e.g. one pack reads much higher/lower than the others
while all pack voltages agree), when the user asks to "calibrate / balance / fix
SoC / do a full charge", or on a monthly cadence. This is the ONE skill that
changes inverter settings — and only the grid-charge ceiling, only on explicit
user confirmation, and it reverts them. Everything else is read/monitor/verify.
---
# calibration-charge
Re-anchors drifted EG4 SoC counters. The BMS only resets SoC to 100% on a real
full-charge termination (high cell voltage + low taper current); the everyday
profile caps grid charging at 54 V, so the bank can go weeks without a full charge
and the counters drift. This runbook guarantees one full charge, then verifies.
## Action-policy exception (read this)
Unlike the troubleshoot-* skills, this one MAY change an inverter setting, but
ONLY `stop_charge_voltage` (54.0 → 0/Full) via the prepared calibration profile,
applied to BOTH units, and ONLY after the user explicitly confirms in-session. You
present the exact `flash.py` commands; the user runs them (or confirms you may). You
own pre-flight, monitoring, verification, and the REVERT. Never change anything else.
## 0. Load context
```bash
ROOT="$(git rev-parse --show-toplevel)"; SNAP="$ROOT/.claude/skills/lib/solar-snapshot"; HIST="$ROOT/.claude/skills/lib/ha-history"
FLASH="$ROOT/LVX6048/lvx-flash" # flash.py + profiles/ live here
```
Read `$ROOT/.claude/skills/REFERENCE.md`. The two profiles: `eg4-lp4-v2.yaml`
(canonical/everyday) and `eg4-lp4-v2-calibration.yaml` (temporary; identical except
`stop_charge_voltage: 0`).
## 1. Pre-flight (must pass before charging)
```bash
"$SNAP" -w 16 -g 'lifepower4_[1-6]_(soc|pack_voltage|cell_voltage_max|cell_voltage_delta_mv|temperature_pcb|protection)' 'homeassistant/sensor/+/state'
```
Record and check:
- **Capture the "before" SoC spread** — this is what we're fixing. Confirm it's *drift*
not real imbalance: if all 6 `pack_voltage` agree (±0.1 V) but SoC readings differ,
it's counter drift (the target). If voltages actually diverge, STOP — that's real
imbalance → troubleshoot-battery first.
- **Temps in charge range**: every `temperature_*` between ~5 °C and 45 °C. **Never
charge LFP below 0 °C** (BMS should block, but verify). Abort if any pack > 45 °C.
- **No protection bits set**; cells reasonably balanced (`cell_voltage_delta_mv` < ~50).
- **Forecast/grid**: solar-only needs a sunny low-load day; grid-assist works anytime.
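The pre-flight checks above can be sketched as one function. This is an illustration under assumptions: "agree (±0.1 V)" is read as each pack within 0.1 V of the mean, "SoC readings differ" is read as a spread of at least 5 points (a made-up threshold), and the dict keys (`temperatures_c`, `protection`) are hypothetical names for values you have already collected and converted to °C.

```python
def preflight(packs, v_tol=0.1, soc_spread_min=5.0, delta_mv_max=50,
              t_lo=5.0, t_hi=45.0):
    """Classify drift vs imbalance and list anything that blocks the charge."""
    volts = [p["pack_voltage"] for p in packs]
    socs = [p["soc"] for p in packs]
    mean_v = sum(volts) / len(volts)
    voltages_agree = all(abs(v - mean_v) <= v_tol for v in volts)
    blockers = []
    if not voltages_agree:
        blockers.append("pack voltages diverge: real imbalance, troubleshoot-battery first")
    if any(not (t_lo <= t <= t_hi) for p in packs for t in p["temperatures_c"]):
        blockers.append("temperature outside charge range")
    if any(p["cell_voltage_delta_mv"] >= delta_mv_max for p in packs):
        blockers.append("cell delta too high")
    if any(p["protection"] for p in packs):
        blockers.append("protection bit set")
    soc_spread = max(socs) - min(socs)
    return {
        "soc_spread": soc_spread,  # the "before" number to record
        "drift": voltages_agree and soc_spread >= soc_spread_min,
        "blockers": blockers,
    }

# Hypothetical readings: equal voltage, pack 6 reading high.
packs = [{"soc": s, "pack_voltage": 53.4, "cell_voltage_delta_mv": 12,
          "temperatures_c": [24.0, 25.0], "protection": False}
         for s in (50, 52, 53, 55, 54, 76)]
print(preflight(packs))
```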
## 2. Choose the method (ask the user)
- **Solar-only — RECOMMENDED (no setting change, free, known-good):** on a sunny,
low-load day with a full day ahead, solar drives the bank through the full CC/CV
curve to bulk 56.4 V and holds absorption on its own — exactly the clean termination
the BMS needs to re-anchor. No flash, no risk. Just monitor §4 and verify §5; skip §3.
This is the method to default to.
- **Grid-assist — for cloudy days (CORRECT lever, validated 2026-06-25):** force a full
grid charge by switching **output priority to SUB**, NOT by touching voltage
thresholds. In the everyday SBU mode the inverter won't grid-charge unless the bank is
critically low; SUB makes it run loads on grid AND charge the battery to full. Both
setters go through lvx-control (all-mode-safe, atomic, no flash). Needs §3.
(Do NOT use `re_discharge`/flash.py — firmware NAKs it; see memory
`project_lvx6048_grid_charge_lever`.)
## 3. Grid-assist: enable grid charging via output priority (USER-CONFIRMED)
Publish via lvx-control (atomic to both units, no powermon stop):
```bash
B=... # broker creds from ~/.config/powermon/powermon.yaml
mosquitto_pub ... -t solar/control/lvx6048/charger_priority -m solar_and_utility
mosquitto_pub ... -t solar/control/lvx6048/output_priority -m solar_utility_battery # SUB
```
Confirm within ~1 min: `device_mode` -> Hybrid/Line, `line_power_direction` -> input,
pack current jumps. Verify BOTH units match (output_source_priority) — parallel sync.
Then run `lib/grid-cal-monitor` (detached) to drive/verify/auto-revert; it reverts
`output_priority->solar_battery_utility` + `charger_priority->solar_first` on completion
(trap-guaranteed). Skip §3 entirely for the solar-only method.
Mirror to BOTH inverters (parallel cluster — mismatched settings throw fault 86).
`flash.py apply` stops powermon for exclusive USB, so MQTT telemetry pauses briefly.
```bash
cd "$FLASH"
sudo systemctl stop powermon.service powermon2.service
./flash.py apply --device /dev/lvx6048-1 --profile profiles/eg4-lp4-v2-calibration.yaml --confirm
./flash.py apply --device /dev/lvx6048-2 --profile profiles/eg4-lp4-v2-calibration.yaml --confirm
./flash.py compare --device-a /dev/lvx6048-1 --device-b /dev/lvx6048-2 # must match
sudo systemctl start powermon.service powermon2.service
```
Confirm via MQTT the new ceiling took (both units): `battery_recharge`/`stop_charge`
readback reflects Full, others unchanged.
## 4. Drive + monitor the charge (this is the agent's job — poll periodically)
```bash
"$SNAP" -w 16 -g 'lifepower4_[1-6]_(soc|pack_voltage|pack_current|cell_voltage_max|temperature_pcb)' 'homeassistant/sensor/+/state'
"$SNAP" -w 10 -g 'lvx6048_[12]_(device_mode|mppt1_input_power|ac_output_active_power)/' 'homeassistant/sensor/+/state'
```
Watch for, and re-poll every ~15–30 min as it climbs:
- `pack_voltage` rising toward ~56 V; `device_mode` should show charging/absorption.
- **SAFETY — abort and revert (§6) if:** any `cell_voltage_max` > **3.60 V** (BMS
protects ~3.65; don't ride it), any pack temp > 45 °C, or any protection bit sets.
- **Absorption/taper:** once pack voltage holds near bulk and `pack_current` tapers to
~5 A/pack (≈0.05 C, < ~30 A total), the BMS will flip SoC to 100%. This hold is the
part that actually re-anchors — let it finish, don't cut it early.
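One poll of the watch list above could be evaluated roughly like this. The abort limits come from the list; the "near bulk" threshold of 56.2 V is an assumption borrowed from `grid-cal-monitor`'s `VFULL`, and the 30 A total taper limit follows the list (the monitor script itself uses a stricter 20 A). `charge_step` and the `temperature_max` key (°C) are hypothetical.

```python
def charge_step(packs, cell_abort=3.60, temp_abort=45.0,
                v_bulk=56.2, taper_total_a=30.0):
    """Return 'abort: ...', 'tapered', or 'charging' for one poll of all packs."""
    if any(p["cell_voltage_max"] > cell_abort for p in packs):
        return "abort: cell voltage"
    if any(p["temperature_max"] > temp_abort for p in packs):
        return "abort: temperature"
    if any(p.get("protection", False) for p in packs):
        return "abort: protection bit"
    total_a = sum(p["pack_current"] for p in packs)
    at_bulk = max(p["pack_voltage"] for p in packs) >= v_bulk
    if at_bulk and total_a < taper_total_a:
        return "tapered"  # absorption hold; let the BMS finish re-anchoring
    return "charging"

# Hypothetical polls, for illustration only.
bulk = [{"cell_voltage_max": 3.45, "temperature_max": 30.0,
         "pack_current": 20.0, "pack_voltage": 55.0}] * 6
hold = [{"cell_voltage_max": 3.52, "temperature_max": 31.0,
         "pack_current": 4.0, "pack_voltage": 56.3}] * 6
print(charge_step(bulk), charge_step(hold))
```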
## 5. Verify the re-anchor (the whole point)
```bash
"$SNAP" -w 16 -g 'lifepower4_[1-6]_(soc|pack_voltage|cell_voltage_delta_mv)' 'homeassistant/sensor/+/state'
```
PASS = **all 6 packs read SoC 100%** (≥99) with pack voltages converged (spread
< ~0.1 V) and cell deltas still tight. The previously-drifted pack (e.g. pack 6) now
matching the others = counter re-anchored. Report before→after SoC spread.
If a pack still lags, it didn't terminate — extend the hold or investigate that pack.
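The PASS criterion can be written down as a short check. This is a sketch using the thresholds stated above (SoC ≥ 99, voltage spread under about 0.1 V); `reanchor_result` is a hypothetical helper and the readings are made up.

```python
def reanchor_result(packs, soc_min=99.0, v_spread_max=0.1):
    """Return (passed, lagging_pack_numbers) for the post-charge verification."""
    lagging = [i for i, p in enumerate(packs, start=1) if p["soc"] < soc_min]
    volts = [p["pack_voltage"] for p in packs]
    converged = (max(volts) - min(volts)) < v_spread_max
    return (not lagging and converged, lagging)

# Hypothetical after-charge readings: pack 6 has not terminated yet.
after = [{"soc": 100, "pack_voltage": 56.35} for _ in range(5)]
after.append({"soc": 97, "pack_voltage": 56.33})
print(reanchor_result(after))
```

Returning the lagging pack numbers makes it clear which pack to extend the hold for or investigate.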
## 6. REVERT (mandatory if you did §3 — do NOT leave the ceiling lifted)
```bash
cd "$FLASH"
sudo systemctl stop powermon.service powermon2.service
./flash.py apply --device /dev/lvx6048-1 --profile profiles/eg4-lp4-v2.yaml --confirm
./flash.py apply --device /dev/lvx6048-2 --profile profiles/eg4-lp4-v2.yaml --confirm
./flash.py compare --device-a /dev/lvx6048-1 --device-b /dev/lvx6048-2
./flash.py sync-check --device-a /dev/lvx6048-1 --device-b /dev/lvx6048-2
sudo systemctl start powermon.service powermon2.service
```
Confirm `stop_charge` is back to 54.0 on both units via MQTT.
## 7. Record
Note the date (for the ~monthly cadence) and before→after SoC spread. If drift returns
fast, consider the opt-in `soc_estimator` daemon backstop (see memory
`project_eg4_soc_drift_remediation`) as a longer-term fix.

@@ -0,0 +1,107 @@
#!/usr/bin/env bash
# grid-cal-monitor — supervise a grid-assisted calibration charge to full, then
# auto-revert the inverter back to normal (battery-priority, solar-only charging).
#
# Assumes the operator already set, via lvx-control:
# output_priority = solar_utility_battery (SUB: grid powers loads + charges batt)
# charger_priority = solar_and_utility
# This script ONLY monitors and then REVERTS those two (POP->solar_battery_utility,
# PCP->solar_first). It changes nothing else. Reverting is trap-guaranteed on exit.
#
# Done = all 6 packs report SoC>=99 (BMS re-anchored), or pack_V>=56.2 with combined
# current tapered <20A for 2 polls. Safety: abort+revert if any cell>3.60V or temp>45C.
# Max runtime guard then revert regardless.
set -uo pipefail
HA="http://10.0.0.41:8123"; TOKEN_FILE="$HOME/.config/ha/token"
POWERMON_CONF="$HOME/.config/powermon/powermon.yaml"
CELL_ABORT=3.60; TEMP_ABORT=45; SOC_DONE=99
VFULL=56.2; ITAPER=20; MAX_HOURS=8; POLL_S=300
RUNDIR="$HOME/solar-runs"; mkdir -p "$RUNDIR"
LOG="$RUNDIR/gridcal-$(date +%Y%m%d).log"
REVERTED=0
log(){ printf '%s %s\n' "$(date '+%F %T')" "$*" | tee -a "$LOG"; }
read -r BHOST BPORT BUSER BPASS < <(awk '/^[^[:space:]]/{i=0}/^mqttbroker:/{i=1;next} i&&/^[[:space:]]+name:/{h=$2} i&&/^[[:space:]]+port:/{p=$2} i&&/^[[:space:]]+username:/{u=$2} i&&/^[[:space:]]+password:/{w=$2} END{print h,(p?p:1883),u,w}' "$POWERMON_CONF")
TOKEN="$(cat "$TOKEN_FILE")"
ha(){ curl -s -H "Authorization: Bearer $TOKEN" "$HA/api/states/$1"; }
st(){ ha "$1" | python3 -c 'import sys,json
try:print(json.load(sys.stdin).get("state",""))
except:print("")'; }
tc(){ ha "$1" | python3 -c 'import sys,json
try:
    d=json.load(sys.stdin);s=float(d["state"]);u=d["attributes"].get("unit_of_measurement","")
    print(round((s-32)*5/9,1) if "F" in u else round(s,1))
except:print("")'; }
P(){ echo "sensor.eg4_lifepower4_lifepower4_${1}_lifepower4_${1}_${2}"; }
pub(){ mosquitto_pub -h "$BHOST" -p "$BPORT" -u "$BUSER" -P "$BPASS" -t "solar/control/lvx6048/$1" -m "$2"; }
raw(){ for u in 1 2; do mosquitto_pub -h "$BHOST" -p "$BPORT" -u "$BUSER" -P "$BPASS" -t "powermon/lvx6048_${u}/addcommand" -m "$1"; done; }
# revert VERIFIES it actually took (the friendly path can silently fail — lvx-control
# used to encode POP01 which the inverter rejects, and powermon's adhoc queue can wedge).
# Ground truth = behavior: in SBU with a full bank, line_power_direction leaves 'input'.
reverted_ok(){
local ld pop; ld=$(st sensor.lvx6048_lvx6048_1_line_power_direction)
pop=$(st sensor.lvx6048_lvx6048_1_output_source_priority)
{ [ -n "$ld" ] && [ "$ld" != "input" ]; } || echo "$pop" | grep -q "Battery - Utility"
}
revert(){
[ "$REVERTED" = 1 ] && return 0
log "REVERT: output_priority->solar_battery_utility, charger_priority->solar_first"
pub output_priority solar_battery_utility; sleep 3; pub charger_priority solar_first; sleep 12
if reverted_ok; then log "REVERT verified (line_dir=$(st sensor.lvx6048_lvx6048_1_line_power_direction))"; REVERTED=1; return 0; fi
# escalate: powermon adhoc may be wedged and/or friendly encode rejected -> restart + raw
for try in 1 2; do
log "REVERT not effective yet — restart powermon + raw POP1/PCP0,0 (try $try)"
sudo systemctl restart powermon.service powermon2.service 2>/dev/null; sleep 12
raw POP1; sleep 3; raw PCP0,0; sleep 15
if reverted_ok; then log "REVERT verified after escalation"; REVERTED=1; return 0; fi
done
log "REVERT: !!! COULD NOT CONFIRM — still grid-priority. Manually run: raw POP1 to both addcommand topics (POP1, not POP01)."
REVERTED=1
}
trap 'revert; log "exit"' EXIT INT TERM
# read packs -> "minSoC maxSoC maxcell maxtemp minV maxV totI ndone"
read_packs(){
local socs=() cells=() temps=() vs=() is=() i s t tmax
for i in 1 2 3 4 5 6; do
socs+=("$(st "$(P $i soc)")"); cells+=("$(st "$(P $i cell_voltage_max)")")
vs+=("$(st "$(P $i pack_voltage)")"); is+=("$(st "$(P $i pack_current)")")
tmax=0
for s in temperature_pcb temperature_01 temperature_02 temperature_03; do
t="$(tc "$(P $i $s)")"; t=${t%.*}; [[ "$t" =~ ^-?[0-9]+$ ]] && [ "$t" -gt "$tmax" ] && tmax=$t
done
temps+=("$tmax")
done
python3 - "${socs[*]}" "${cells[*]}" "${temps[*]}" "${vs[*]}" "${is[*]}" <<'PY'
import sys
f=lambda a:[float(x) for x in a.split() if x]
soc,cell,tmp,v,i=map(f,sys.argv[1:6])
print(f"{min(soc):.0f} {max(soc):.0f} {max(cell):.3f} {max(tmp):.0f} {min(v):.2f} {max(v):.2f} {sum(i):.0f} {len([s for s in soc if s>=99])}")
PY
}
log "=== grid-cal-monitor start (auto-revert on full/abort/exit) ==="
START=$(date +%s); taper_hits=0
while :; do
read MNS MXS MXCELL MXT MNV MXV TOTI NDONE <<<"$(read_packs)"
el=$(( ($(date +%s)-START)/60 ))
log "[+${el}m] SoC ${MNS}-${MXS}% packs@100=${NDONE}/6 | packV ${MNV}-${MXV} | totI ${TOTI}A | maxcell ${MXCELL}V maxtemp ${MXT}C"
# SAFETY
if (( $(python3 -c "print(1 if $MXCELL>$CELL_ABORT or $MXT>$TEMP_ABORT else 0)") )); then
log "!!! SAFETY ABORT: maxcell ${MXCELL}V / maxtemp ${MXT}C — reverting now"; exit 2; fi
# DONE: all re-anchored
if [ "$NDONE" = 6 ]; then log "COMPLETE: all 6 packs >=${SOC_DONE}% — re-anchored"; break; fi
# DONE backstop: at bulk + tapered for 2 consecutive polls
if (( $(python3 -c "print(1 if $MXV>=$VFULL and $TOTI<$ITAPER else 0)") )); then
taper_hits=$((taper_hits+1)); log " (at bulk + tapered, ${taper_hits}/2)"
[ "$taper_hits" -ge 2 ] && { log "COMPLETE: bulk reached + current tapered"; break; }
else taper_hits=0; fi
# TIMEOUT
if [ "$el" -ge $((MAX_HOURS*60)) ]; then log "TIMEOUT ${MAX_HOURS}h — reverting at packV ${MXV}"; break; fi
sleep "$POLL_S"
done
read MNS MXS _ _ _ MXV _ NDONE <<<"$(read_packs)"
log "RESULT: SoC ${MNS}-${MXS}%, packs@100=${NDONE}/6, packV up to ${MXV}"
# revert runs via trap

.claude/skills/lib/ha-history Executable file

@@ -0,0 +1,184 @@
#!/usr/bin/env python3
"""ha-history — pull state history for HA entities from the Home Assistant
recorder, and print a compact change-point timeline. Companion to solar-snapshot
(which only sees live values); this is the historic-lookback tool.
Auth: reads a long-lived access token from ~/.config/ha/token (or $HA_TOKEN).
No secret is ever passed on the command line or hardcoded.
Base URL: $HA_URL, else ~/.config/ha/url, else http://10.0.0.41:8123.
Usage:
ha-history [-s SINCE] [-e END] [-m REGEX] [-a] ENTITY [ENTITY ...]
ENTITY entity_id; a bare name with no dot is auto-prefixed `sensor.`
e.g. `lvx6048_1_device_mode` -> `sensor.lvx6048_1_device_mode`
-s SINCE start of window. default "7 days ago".
accepts: "7 days ago", "7d", "36h", "90m", an ISO timestamp,
or a date "2026-06-16".
-e END end of window. default: now. same formats as -s.
-m REGEX only show change-points whose state matches REGEX (case-insensitive);
the per-entity header still reports the full count. e.g. -m fault
-a show every recorded point, not just state *changes*.
Examples:
ha-history lvx6048_1_device_mode lvx6048_2_device_mode
ha-history -s "10 days ago" -m fault lvx6048_1_fault_code lvx6048_2_fault_code
ha-history -s 2026-06-16 -e 2026-06-23 lvx6048_1_device_mode
"""
import sys, os, re, json, argparse, urllib.request, urllib.parse, urllib.error
from datetime import datetime, timedelta, timezone

CONF_DIR = os.path.expanduser("~/.config/ha")
DEFAULT_URL = "http://10.0.0.41:8123"

def die(msg, code=1):
    print(f"ha-history: {msg}", file=sys.stderr)
    sys.exit(code)

def load_token():
    # $HA_TOKEN wins; otherwise fall back to the token file.
    tok = os.environ.get("HA_TOKEN")
    if tok:
        return tok.strip()
    path = os.path.join(CONF_DIR, "token")
    if not os.path.exists(path):
        die("no token. Create a Long-Lived Access Token in HA "
            "(Profile -> Security), then:\n"
            "  mkdir -p ~/.config/ha && install -m600 /dev/stdin ~/.config/ha/token\n"
            "or set $HA_TOKEN.")
    with open(path) as f:
        tok = f.read().strip()
    if not tok:
        die(f"{path} is empty")
    return tok

def base_url():
    # $HA_URL, else ~/.config/ha/url, else the built-in default.
    if os.environ.get("HA_URL"):
        return os.environ["HA_URL"].rstrip("/")
    p = os.path.join(CONF_DIR, "url")
    if os.path.exists(p):
        with open(p) as f:
            u = f.read().strip()
        if u:
            return u.rstrip("/")
    return DEFAULT_URL

def parse_when(s, *, default_now=False):
    if s is None:
        return datetime.now(timezone.utc).astimezone() if default_now else None
    s = s.strip()
    # Relative forms: "7 days ago", "7d", "36h", "90m".
    m = re.fullmatch(r"(\d+)\s*(d|h|m)(?:ays?|ours?|in(?:ute)?s?)?(?:\s*ago)?", s, re.I)
    if m:
        n, unit = int(m.group(1)), m.group(2).lower()
        delta = {"d": timedelta(days=n), "h": timedelta(hours=n), "m": timedelta(minutes=n)}[unit]
        return datetime.now(timezone.utc).astimezone() - delta
    # ISO timestamp or bare date
    try:
        dt = datetime.fromisoformat(s)
    except ValueError:
        die(f"can't parse time {s!r}. Use '7 days ago', '36h', ISO, or 'YYYY-MM-DD'.")
    if dt.tzinfo is None:  # assume local tz
        dt = dt.astimezone()
    return dt

def fetch(url, token):
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    try:
        with urllib.request.urlopen(req, timeout=30) as r:
            return json.load(r)
    except urllib.error.HTTPError as e:
        if e.code == 401:
            die("401 Unauthorized: token rejected. Regenerate it in HA and rewrite "
                "~/.config/ha/token.")
        die(f"HTTP {e.code} from HA: {e.reason}")
    except urllib.error.URLError as e:
        die(f"cannot reach HA at {url.split('/api')[0]}: {e.reason}")

def fmt_local(iso):
    """HA returns UTC ISO; show local time, second precision."""
    try:
        return datetime.fromisoformat(iso).astimezone().strftime("%Y-%m-%d %H:%M:%S")
    except (ValueError, TypeError):
        return str(iso)

def main():
    ap = argparse.ArgumentParser(add_help=False)
    ap.add_argument("-s", "--since", default="7 days ago")
    ap.add_argument("-e", "--end", default=None)
    ap.add_argument("-m", "--match", default=None)
    ap.add_argument("-a", "--all-points", action="store_true")
    ap.add_argument("-h", "--help", action="store_true")
    ap.add_argument("entities", nargs="*")
    a = ap.parse_args()
    if a.help or not a.entities:
        print(__doc__.strip())
        sys.exit(0 if a.help else 2)
    ents = [e if "." in e else f"sensor.{e}" for e in a.entities]
    start = parse_when(a.since)
    end = parse_when(a.end, default_now=True)
    matcher = re.compile(a.match, re.I) if a.match else None
    token = load_token()
    url = (f"{base_url()}/api/history/period/"
           f"{urllib.parse.quote(start.isoformat())}"
           f"?end_time={urllib.parse.quote(end.isoformat())}"
           f"&filter_entity_id={urllib.parse.quote(','.join(ents))}"
           f"&minimal_response&no_attributes")
    data = fetch(url, token)
    print(f"# HA history {start.strftime('%Y-%m-%d %H:%M')} -> "
          f"{end.strftime('%Y-%m-%d %H:%M')} ({base_url()})\n")
    # Index each returned series by the entity_id carried on its first item.
    by_id = {}
    for series in data or []:
        if series:
            by_id[series[0].get("entity_id")] = series
    for ent in ents:
        series = by_id.get(ent)
        if not series:
            print(f"{ent}\n (no recorded history in window; entity wrong, "
                  f"excluded from recorder, or purged)\n")
            continue
        # Build (time, state) points, collapsing consecutive identical states
        # unless --all-points.
        points, prev = [], object()
        for item in series:
            st = item.get("state")
            ts = item.get("last_changed") or item.get("last_updated")
            if a.all_points or st != prev:
                points.append((ts, st))
            prev = st
        shown = [(ts, st) for ts, st in points if not matcher or matcher.search(str(st))]
        label = "points" if a.all_points else "changes"
        extra = f", {len(shown)} match -m" if matcher else ""
        print(f"{ent} ({len(points)} {label}{extra})")
        if not shown:
            print(" (nothing matched)\n")
            continue
        for i, (ts, st) in enumerate(shown):
            mark = " <<< FAULT" if re.search(r"fault", str(st), re.I) and st not in ("No fault",) else ""
            # duration until next change-point in the *full* timeline
            # (only computed without -m, where shown and points share indices)
            dur = ""
            if not matcher:
                nxt = points[i + 1][0] if i + 1 < len(points) else None
                if nxt:
                    try:
                        d = datetime.fromisoformat(nxt) - datetime.fromisoformat(ts)
                        secs = int(d.total_seconds())
                        dur = f" ({secs//3600}h{secs%3600//60:02d}m)" if secs >= 3600 else f" ({secs//60}m)"
                    except (ValueError, TypeError):
                        pass
            print(f" {fmt_local(ts)} {st}{dur}{mark}")
        print()

if __name__ == "__main__":
    main()


@@ -0,0 +1,225 @@
#!/usr/bin/env bash
# solar-morning-run — unattended solar calibration top-off (+ opt-in throttle test).
#
# DEFAULT (scheduled): SOLAR-ONLY at the existing 60 A cap. No setting is changed.
# Drives the natural solar charge to a full absorption that re-anchors the 6 EG4 pack
# SoC counters, with a weather gate, safety monitoring, and re-anchor verification.
#
# OPT-IN throttle mode (THROTTLE=1 env): also raises the charge cap 60->80 A to test
# whether the cap throttles midday harvest. IMPORTANT FIRMWARE CONSTRAINT (confirmed
# live 2026-06-24): MCHGC is LOCKED while the inverter is charging — a cap change NAKs
# in 'Battery' mode whenever charger_status='charging'. So the cap can only be set in a
# true pre-charge idle window (very early dawn), and the revert can only complete once
# charging stops (evening / battery full). cleanup() retries the revert; if it can't
# (still charging), it WARNS and the cap stays 80 A (safe) until an idle window. Because
# of this lock there is NO clean noon A/B — throttle mode only logs today's 80 A-cap
# peak vs the historical 60 A baseline (weather-confounded). Use supervised, not blind.
#
# Safety: monitor-abort if any cell > 3.65 V or pack temp > 45 C. NOTE the script CANNOT
# force-stop a charge (the relevant setters are locked while charging) — the pack BMS
# over-voltage/over-temp protection is the real safety net, as always.
# Weather gate: if PV hasn't ramped past 4 kW by 10:30, too cloudy — exit (revert if raised).
#
# solar-morning-run [--dry-run] # default: solar-only @ 60 A
# THROTTLE=1 solar-morning-run # also raise cap to 80 A (supervised dawn use)
set -uo pipefail # NOT -e: we handle errors explicitly so cleanup/revert always runs
DRYRUN=0; [ "${1:-}" = "--dry-run" ] && DRYRUN=1
# ---- config ----
HA="http://10.0.0.41:8123"
TOKEN_FILE="$HOME/.config/ha/token"
POWERMON_CONF="$HOME/.config/powermon/powermon.yaml"
CTRL_TOPIC="solar/control/lvx6048/max_charging_current"
CAP_HIGH=80; CAP_LOW=60
CELL_ABORT=3.65 # V, any pack cell_voltage_max above this -> abort
TEMP_ABORT=45 # C, any pack temp above this -> abort
SOC_DONE=99 # all 6 packs >= this -> re-anchored
PV_SUN_GATE=4000 # W combined; below this by SUN_DEADLINE -> too cloudy
SUN_DEADLINE="10:30"
MAX_HOURS=13
POLL_S=600 # monitor loop period
[ "$DRYRUN" = 1 ] && POLL_S=2 # fast loop for validation
BASE_PEAK_W=9459 # yesterday's 60A-cap noon peak, for the throttle comparison
BASE_PEAK_A=120 # ...and the cap it was pinned at
RUNDIR="$HOME/solar-runs"; mkdir -p "$RUNDIR"
LOG="$RUNDIR/run-$(date +%Y%m%d).log"
LOCK="$RUNDIR/.lock"
RAISED=0
log(){ printf '%s %s\n' "$(date '+%F %T')" "$*" | tee -a "$LOG"; }
# ---- broker creds from powermon.yaml (same source as the other tools) ----
read -r BHOST BPORT BUSER BPASS < <(awk '
/^[^[:space:]]/{i=0} /^mqttbroker:/{i=1;next}
i&&/^[[:space:]]+name:/{h=$2} i&&/^[[:space:]]+port:/{p=$2}
i&&/^[[:space:]]+username:/{u=$2} i&&/^[[:space:]]+password:/{w=$2}
END{print h,(p?p:1883),u,w}' "$POWERMON_CONF")
TOKEN="$(cat "$TOKEN_FILE" 2>/dev/null)"
[ -z "$TOKEN" ] && { log "FATAL: no HA token at $TOKEN_FILE"; exit 1; }
# ---- HA point reads ----
ha_json(){ curl -s -H "Authorization: Bearer $TOKEN" "$HA/api/states/$1"; }
ha_state(){ ha_json "$1" | python3 -c 'import sys,json
try:d=json.load(sys.stdin);print(d.get("state",""))
except:print("")'; }
ha_temp_c(){ ha_json "$1" | python3 -c 'import sys,json
try:
    d=json.load(sys.stdin);s=float(d["state"]);u=d["attributes"].get("unit_of_measurement","")
    print(round((s-32)*5/9,1) if "F" in u else round(s,1))
except:print("")'; }
PACK(){ echo "sensor.eg4_lifepower4_lifepower4_${1}_lifepower4_${1}_${2}"; }
INV(){ echo "sensor.lvx6048_lvx6048_${1}_${2}"; }
# combined live PV (both units, mppt1)
pv_now(){
local a b
a=$(ha_state "$(INV 1 mppt1_input_power)"); b=$(ha_state "$(INV 2 mppt1_input_power)")
python3 - "$a" "$b" <<'PY'
import sys
def f(x):
    try: return float(x)
    except: return 0.0
print(int(f(sys.argv[1]) + f(sys.argv[2])))
PY
}
# ---- cap change via lvx-control (atomic mirror to both units) ----
set_cap(){ # $1 = amps
local amps="$1"
if [ "$DRYRUN" = 1 ]; then log "[dry-run] would publish $amps to $CTRL_TOPIC"; return 0; fi
mosquitto_pub -h "$BHOST" -p "$BPORT" -u "$BUSER" -P "$BPASS" -t "$CTRL_TOPIC" -m "$amps"
# confirm via the two result topics (fail shows 'Failed')
local res; res=$(timeout 17 mosquitto_sub -h "$BHOST" -p "$BPORT" -u "$BUSER" -P "$BPASS" \
-W 15 -t powermon/lvx6048_1/result -t powermon/lvx6048_2/result 2>/dev/null)
log "cap->$amps result: $(echo "$res" | tr '\n' '|')"
echo "$res" | grep -qi 'fail' && return 1
return 0
}
cleanup(){
local rc=$? tries=0
while [ "$RAISED" = 1 ] && [ "$tries" -lt 4 ]; do
tries=$((tries+1)); log "CLEANUP: revert cap to ${CAP_LOW}A (try $tries)"
if set_cap "$CAP_LOW"; then
sleep 8
log "CLEANUP: live cap now $(ha_state $(INV 1 max_charging_current))/$(ha_state $(INV 2 max_charging_current)) A"
RAISED=0
else
log "CLEANUP: revert NAK'd (inverter still charging?) — wait 120s + retry"
sleep 120
fi
done
[ "$RAISED" = 1 ] && log "CLEANUP: !!! COULD NOT REVERT — cap still ${CAP_HIGH}A (safe). Restore ${CAP_LOW}A via lvx-control once charging stops (evening)."
log "run exit rc=$rc"
flock -u 9 2>/dev/null || true
}
trap cleanup EXIT INT TERM
# ---- read all 6 packs; echo "minSoC maxSoC maxCell maxTempC spreadV ndone" ----
ABORT_REASON=""
read_packs(){
local socs=() cells=() temps=() pv=() i t pvolt
for i in 1 2 3 4 5 6; do
socs+=("$(ha_state "$(PACK $i soc)")")
cells+=("$(ha_state "$(PACK $i cell_voltage_max)")")
# hottest of the pack's sensors (pcb + 01..03), unit-corrected to C
local tmax=0 s
for s in temperature_pcb temperature_01 temperature_02 temperature_03; do
t="$(ha_temp_c "$(PACK $i $s)")"; t=${t%.*}
[[ "$t" =~ ^-?[0-9]+$ ]] && [ "$t" -gt "$tmax" ] && tmax=$t
done
temps+=("$tmax")
pv+=("$(ha_state "$(PACK $i pack_voltage)")")
done
python3 - "${socs[*]}" "${cells[*]}" "${temps[*]}" "${pv[*]}" "$SOC_DONE" <<'PY'
import sys
socs=[float(x) for x in sys.argv[1].split() if x]
cells=[float(x) for x in sys.argv[2].split() if x]
temps=[float(x) for x in sys.argv[3].split() if x]
pv=[float(x) for x in sys.argv[4].split() if x]
done_at=float(sys.argv[5])  # SOC_DONE threshold, passed in rather than hardcoded
print(f"{min(socs):.0f} {max(socs):.0f} {max(cells):.3f} {max(temps):.0f} {max(pv)-min(pv):.2f} {len([s for s in socs if s>=done_at])}")
PY
}
# ============================ run ============================
exec 9>"$LOCK"
flock -n 9 || { log "another run holds the lock; exiting"; exit 0; }
log "=== solar-morning-run start (dry-run=$DRYRUN) ==="
# PHASE 0 — pre-flight
read MNS MXS MXCELL MXT SPREAD NDONE <<<"$(read_packs)"
log "PRE-FLIGHT: SoC ${MNS}-${MXS}% (spread $((MXS-MNS)) pts), maxcell ${MXCELL}V, maxtemp ${MXT}C, packV spread ${SPREAD}V, packs@100=${NDONE}/6"
BEFORE_SPREAD=$((MXS-MNS))
if (( $(python3 -c "print(1 if $MXCELL>$CELL_ABORT or $MXT>$TEMP_ABORT else 0)") )); then
log "PRE-FLIGHT FAIL: cell/temp already at limit — not starting"; exit 1; fi
if (( $(python3 -c "print(1 if $SPREAD>0.3 else 0)") )); then
log "PRE-FLIGHT WARN: pack voltage spread ${SPREAD}V >0.3 — possible REAL imbalance, not just drift. Proceeding read-only, NOT raising cap."; SKIP_CAP=1; fi
# PHASE 1 — arm: raise cap to 80A ONLY in throttle mode AND a true pre-charge idle
# window (MCHGC is firmware-locked while charging — detect via charger_status, NOT
# device_mode, which reads 'Battery' even mid-charge).
if [ "${THROTTLE:-0}" = 1 ] && [ "${SKIP_CAP:-0}" != 1 ]; then
c1=$(ha_state "$(INV 1 mppt1_charger_status)"); c2=$(ha_state "$(INV 2 mppt1_charger_status)")
log "throttle mode; charger_status: u1='$c1' u2='$c2'"
if echo "$c1$c2" | grep -qiE 'charg'; then
log "already charging -> MCHGC locked; staying solar-only @ ${CAP_LOW}A (no cap change)"
elif set_cap "$CAP_HIGH"; then
[ "$DRYRUN" = 1 ] || RAISED=1
log "ARMED: cap raised to ${CAP_HIGH}A (live: $(ha_state $(INV 1 max_charging_current))/$(ha_state $(INV 2 max_charging_current)))"
else
log "cap raise NAK'd -> solar-only @ ${CAP_LOW}A"
fi
else
log "SOLAR-ONLY mode: cap stays ${CAP_LOW}A, no setting changed (set THROTTLE=1 to opt in)"
fi
# PHASE 2 — sun gate + throttle observation
START=$(date +%s)
log "waiting for PV to ramp (gate ${PV_SUN_GATE}W by ${SUN_DEADLINE})..."
while :; do
P=$(pv_now); now=$(date +%H:%M)
log " PV=${P}W at $now"
(( $(python3 -c "print(1 if $P>=$PV_SUN_GATE else 0)") )) && { log "sun gate passed (${P}W)"; break; }
[ "$DRYRUN" = 1 ] && { log "[dry-run] skip sun wait -> exercise monitor loop"; break; }
if [[ "$now" > "$SUN_DEADLINE" ]]; then log "TOO CLOUDY: PV ${P}W < ${PV_SUN_GATE}W by ${SUN_DEADLINE} — aborting day"; exit 0; fi
sleep 300
done
# throttle observation (only meaningful if the cap was actually raised this run)
PKW=$(pv_now)
if [ "$RAISED" = 1 ]; then
log "THROTTLE: PV(@${CAP_HIGH}A cap)=${PKW}W now; baseline 60A-cap noon peak was ${BASE_PEAK_W}W(@${BASE_PEAK_A}A). If today's clear-noon peak climbs well above baseline, the 60A cap WAS throttling -> raise permanently. (Tracked each poll.)"
else
log "solar-only: PV ${PKW}W (no throttle comparison — cap unchanged at ${CAP_LOW}A)"
fi
# PHASE 3 — monitor charge to full
log "monitoring charge to full (poll ${POLL_S}s, abort cell>${CELL_ABORT}V/temp>${TEMP_ABORT}C)..."
iter=0
while :; do
iter=$((iter+1))
read MNS MXS MXCELL MXT SPREAD NDONE <<<"$(read_packs)"
P=$(pv_now)
log " [$iter] SoC ${MNS}-${MXS}% packs@100=${NDONE}/6 maxcell ${MXCELL}V maxtemp ${MXT}C PV ${P}W"
# SAFETY
if (( $(python3 -c "print(1 if $MXCELL>$CELL_ABORT or $MXT>$TEMP_ABORT else 0)") )); then
log "!!! SAFETY ABORT: maxcell ${MXCELL}V / maxtemp ${MXT}C exceeded limit — reverting + stopping"; exit 2; fi
# COMPLETE
if [ "$NDONE" = 6 ]; then log "COMPLETE: all 6 packs >= ${SOC_DONE}% — re-anchored"; break; fi
# TIMEOUT / sun gone
el=$(( ($(date +%s)-START)/3600 ))
if [ "$el" -ge "$MAX_HOURS" ]; then log "TIMEOUT: ${MAX_HOURS}h elapsed, not full — reverting"; break; fi
if (( $(python3 -c "print(1 if $P<300 else 0)") )) && [ "$iter" -gt 2 ]; then
log "SUN GONE: PV ${P}W and not full — incomplete, reverting (retry another clear day)"; break; fi
if [ "$DRYRUN" = 1 ] && [ "$iter" -ge 2 ]; then log "[dry-run] stop after 2 iters"; break; fi
sleep "$POLL_S"
done
# PHASE 4 — verify
read MNS MXS _ _ _ NDONE <<<"$(read_packs)"
log "RESULT: SoC spread ${BEFORE_SPREAD}pts -> $((MXS-MNS))pts, packs@100=${NDONE}/6"
[ "$NDONE" = 6 ] && log "SUCCESS: all packs re-anchored to 100%." || log "PARTIAL: ${NDONE}/6 re-anchored — rerun on a clear day."
# cleanup() runs on exit and reverts the cap


@@ -0,0 +1,97 @@
#!/usr/bin/env bash
# solar-snapshot — capture the latest retained/published value of every MQTT
# topic matching a filter, over a short listen window, and print a clean table.
#
# Why a listen window: powermon/eg4-battery STATE topics are NOT retained — they
# are republished every poll cycle (GS ~5s, packs ~one cycle, EVSE on-change).
# So we subscribe for a few seconds and keep the last value seen per topic.
# (HA discovery `.../config` topics ARE retained and show up immediately.)
#
# Broker credentials are read from ~/.config/powermon/powermon.yaml (the same
# source the openevse + lvx-control tools use) so nothing is hardcoded here.
#
# NOTE on MQTT wildcards: `+` matches exactly ONE whole level, so it cannot be
# used as a name prefix. `homeassistant/sensor/lifepower4_+/state` matches NOTHING.
# To grab a family of entities, subscribe to the level wildcard and filter with -g:
# solar-snapshot -g lifepower4 'homeassistant/sensor/+/state'
#
# Usage:
# solar-snapshot [-w SECONDS] [-f] [-g GREP_RE] TOPIC_FILTER [TOPIC_FILTER ...]
# -w SECONDS listen window (default 12)
# -f print full topic path (default: strip homeassistant/<class>/ prefix)
# -g GREP_RE keep only topics whose path matches this extended-regex
#
# Examples:
# solar-snapshot -g 'lvx6048_1' 'homeassistant/sensor/+/state'
# solar-snapshot -w 18 -g 'lifepower4_[1-6]_soc' 'homeassistant/sensor/+/state'
# solar-snapshot 'openevse/#'
# solar-snapshot -w 6 'homeassistant/sensor/lvx6048_1_battery_voltage/state' \
# 'homeassistant/sensor/lifepower4_1_pack_voltage/state'
#
# Exit status reflects the formatting stage, not mosquitto_sub's benign -W
# window-expiry code, so callers don't misread a normal capture as a failure.
set -eu
WINDOW=12
FULL=0
GREP_RE=""
while getopts "w:fg:" opt; do
case "$opt" in
w) WINDOW="$OPTARG" ;;
f) FULL=1 ;;
g) GREP_RE="$OPTARG" ;;
*) echo "usage: solar-snapshot [-w SECONDS] [-f] [-g GREP_RE] TOPIC_FILTER..." >&2; exit 2 ;;
esac
done
shift $((OPTIND - 1))
if [ "$#" -lt 1 ]; then
echo "usage: solar-snapshot [-w SECONDS] [-f] [-g GREP_RE] TOPIC_FILTER..." >&2
exit 2
fi
CONF="${POWERMON_CONF:-$HOME/.config/powermon/powermon.yaml}"
if [ ! -r "$CONF" ]; then
echo "solar-snapshot: cannot read broker config $CONF" >&2
exit 1
fi
# Pull host/port/user/pass from the mqttbroker: block of powermon.yaml.
# Keys are anchored to leading whitespace + exact key so `name:` doesn't also
# match `username:`.
read -r HOST PORT USER PASS < <(awk '
/^[^[:space:]]/ { inblk=0 }
/^mqttbroker:/ { inblk=1; next }
inblk && /^[[:space:]]+name:/ { h=$2 }
inblk && /^[[:space:]]+port:/ { p=$2 }
inblk && /^[[:space:]]+username:/ { u=$2 }
inblk && /^[[:space:]]+password:/ { w=$2 }
END { print h, (p?p:1883), u, w }
' "$CONF")
if [ -z "${HOST:-}" ]; then
echo "solar-snapshot: no mqttbroker.name found in $CONF" >&2
exit 1
fi
# Build -t args from filters.
TARGS=()
for f in "$@"; do TARGS+=(-t "$f"); done
# Subscribe for the window, then reduce to last-value-per-topic.
timeout "$((WINDOW + 2))" mosquitto_sub -h "$HOST" -p "$PORT" -u "$USER" -P "$PASS" \
-W "$WINDOW" -v "${TARGS[@]}" 2>/dev/null \
| { if [ -n "$GREP_RE" ]; then grep -E -- "$GREP_RE" || true; else cat; fi; } \
| awk -v full="$FULL" '
{ t=$1; $1=""; sub(/^ /,""); v=$0; last[t]=v; order[t]=NR }
END {
n=0
for (t in last) { keys[n++]=t }
# stable-ish sort by topic name
for (i=0;i<n;i++) for (j=i+1;j<n;j++) if (keys[j]<keys[i]) { tmp=keys[i];keys[i]=keys[j];keys[j]=tmp }
for (i=0;i<n;i++) {
t=keys[i]; disp=t
if (!full) { sub(/^homeassistant\/[^/]+\//,"",disp); sub(/\/state$/,"",disp) }
printf "%-44s %s\n", disp, last[t]
}
if (n==0) print "(no messages in window — topics idle, broker unreachable, or filter wrong)"
}'


@@ -0,0 +1,71 @@
---
name: power-usage
description: >-
Analyze where the power is going across the install — load vs PV generation vs
grid vs battery flow, plus EVSE charging sessions. Use when the user asks "why is
my battery draining / how much am I using / where are the watts going / is the car
charging / what's my solar production / power consumption", or wants an energy
balance or breakdown. Read-only; this skill measures and explains, it does not
change anything.
---
# power-usage
## 0. Load context
Shell cwd is the repo root; anchor paths there:
```bash
ROOT="$(git rev-parse --show-toplevel)"; SNAP="$ROOT/.claude/skills/lib/solar-snapshot"
```
Read `$ROOT/.claude/skills/REFERENCE.md` for entity names. Key sign conventions: pack
`pack_current` is signed (**+ = charging, − = discharging**); inverter
`mppt*_input_power` is PV in (W); `ac_output_active_power` is load out (W).
## 1. Instantaneous energy balance
```bash
# Generation (PV) + load, per inverter:
"$SNAP" -w 10 -g 'lvx6048_[12]_(mppt1_input_power|mppt2_input_power|ac_output_active_power|grid_voltage|device_mode)/' 'homeassistant/sensor/+/state'
# Battery flow, per pack (sum the pack_power = V×I yourself):
"$SNAP" -w 16 -g 'lifepower4_[1-6]_(pack_voltage|pack_current|soc)/' 'homeassistant/sensor/+/state'
# EV charger:
"$SNAP" -w 8 'openevse/status' 'openevse/power' 'openevse/amp' 'openevse/voltage' 'openevse/session_energy'
```
Then state the balance in words:
- **PV in** = sum of all `mppt*_input_power`.
- **Battery** = sum of (pack_voltage × pack_current) over 6 packs. Negative total =
discharging (load exceeds PV+grid); positive = charging.
- **Load out** = sum of inverter `ac_output_active_power`.
- **EVSE** = `openevse/power` — and the EVSE load is a *subset* of total load, so a
draining battery with the car plugged usually explains itself here.
- **Grid**: `device_mode` Bypass/Line means grid is carrying/supplementing; Battery
mode means running off the bank. The LVX6048 has no clean grid-power entity, so
infer grid = load − PV − battery_discharge.
Sanity: PV + grid + battery_discharge ≈ load (within metering noise). A big residual
means one feed is mis-reported — note it (e.g. the known `lvx6048_1_battery_voltage`
~10 V glitch will corrupt any pack-power math that uses the *inverter's* battery
reading; always use the **pack** entities for battery flow).
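
The balance above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `energy_balance` and the sample readings are hypothetical, conversion losses are ignored, and grid power is inferred exactly as described (load − PV − battery_discharge), not measured.

```python
def energy_balance(pv_w, load_w, packs):
    """Infer grid power from PV, load, and per-pack (voltage, current) pairs.

    packs: list of (pack_voltage_V, pack_current_A); + current = charging.
    Hypothetical helper, not part of the repo.
    """
    battery_w = sum(v * i for v, i in packs)      # + = charging, - = discharging
    battery_discharge_w = -battery_w              # power the bank is delivering
    grid_w = load_w - pv_w - battery_discharge_w  # inferred, not measured
    return {"pv_w": pv_w, "load_w": load_w,
            "battery_w": battery_w, "grid_w": grid_w}

# Made-up readings: six packs each discharging 10 A at 53.0 V.
packs = [(53.0, -10.0)] * 6
balance = energy_balance(pv_w=1000.0, load_w=4500.0, packs=packs)
print(balance)
```

A `grid_w` near zero in Battery mode, or clearly positive in Bypass/Line mode, is what a consistent set of readings should produce; a large unexplained value points at a mis-reported feed.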
## 2. "Why is the battery draining?"
Walk the chain: is PV low (night, shade, or the other low-PV causes in §5 of troubleshoot-inverter)? Is
load high (check `ac_output_active_power` and EVSE `power`)? Is the inverter in
Battery mode instead of using grid (`device_mode`)? Pin the drain on the largest
negative contributor and say which.
## 3. EVSE sessions
```bash
"$SNAP" -w 10 'openevse/status' 'openevse/state' 'openevse/session_energy' 'openevse/total_energy' 'openevse/vehicle' 'openevse/pilot' 'openevse/max_current'
```
- `status` active = charging; sleeping/disabled = not drawing. `vehicle` = plugged.
- `session_energy` (Wh) this plug-in; `pilot`/`max_current` = the current cap the
EVSE is signalling. Idle EVSE publishes little — a short empty capture is normal.
- For history/trends (daily kWh, past sessions), the data lives in **Home Assistant's
recorder**, not on MQTT — direct the user to the HA Energy dashboard /
`sensor.openevse_total_day|week|month`. PI18 has no per-day inverter energy
(memory `project_lvx6048_no_daily_energy_query`); only `ET` lifetime Wh exists.
## 4. Report
Give the live balance (PV / load / battery / grid / EVSE, with numbers and signs),
the headline ("you're pulling X W from the bank because load Y W > PV Z W, car is
taking W W"), and point at HA recorder for anything historical. This skill never
changes settings — if the answer is "shift charging to solar hours" etc., suggest it
as advice, don't actuate.


@@ -0,0 +1,103 @@
---
name: solar-health-check
description: >-
Top-level health snapshot of the whole solar/power install — 2 LVX6048
inverters, 6 EG4 LifePower4 packs, and the OpenEVSE charger — with cross-checks
and a green/yellow/red verdict. Use when the user asks "how's the solar /
battery / power system doing", "is everything ok", "check the install", wants a
status report, or as the first step before deeper troubleshooting. For deep
dives into one subsystem, hand off to troubleshoot-inverter, troubleshoot-battery,
or power-usage.
---
# solar-health-check
A fast, read-only sweep of every subsystem that ends in a clear verdict. Do NOT
change settings here; if something needs a restart, that's allowed (see policy).
## 0. Load context
Skills run with the shell cwd at the repo root, so anchor paths there:
```bash
ROOT="$(git rev-parse --show-toplevel)"; SNAP="$ROOT/.claude/skills/lib/solar-snapshot"; HIST="$ROOT/.claude/skills/lib/ha-history"
```
Read `$ROOT/.claude/skills/REFERENCE.md` (system map, entity names, snapshot helper,
action policy) before proceeding.
## 1. Services up?
```bash
systemctl is-active powermon.service powermon2.service eg4-battery.service lvx-control.service lvx-resolve-links.service
```
`lvx-resolve-links` is a oneshot → expect `active`/`exited` (not `failed`). Any
`failed`/`inactive` on the others is RED. For a wedged data-plane daemon, a
restart is allowed (see §6).
## 2. Capture live telemetry
```bash
"$SNAP" -w 10 -g 'lvx6048_[12]_(device_mode|fault_code|battery_voltage|battery_capacity|ac_output_active_power|mppt1_input_power|mppt2_input_power|grid_voltage|inverter_heat_sink_temperature|parallel_instance_number)/' 'homeassistant/sensor/+/state'
"$SNAP" -w 16 -g 'lifepower4_[1-6]_(soc|pack_voltage|pack_current|cell_voltage_delta_mv|temperature_pcb)/' 'homeassistant/sensor/+/state'
"$SNAP" -w 6 'openevse/status' 'openevse/amp' 'openevse/power' 'openevse/session_energy'
```
If a family returns "(no messages)": the feeding daemon is silent → that subsystem
is RED regardless of `is-active` (running but not publishing). EVSE idle/unplugged
publishing nothing is normal — confirm via `openevse/status`.
## 3. Cross-checks (this is the value-add — single sensors can each look fine)
- **Battery voltage agreement**: each inverter's `battery_voltage` should be within
~1 V of the pack stack voltage (`pack_voltage` ≈ 51–55 V). **Known anomaly:** the
inverter reading is *intermittently* wrong (correct ~54 V on 2026-06-20, ~9–10 V
after the Jun 22 reboot) — a post-reboot glitch, not a permanent bug. If it reads
~10 V, note it and suggest a powermon restart; use the `lifepower4_*` pack entities,
never the inverter reading, for any battery math (see REFERENCE known-issues).
- **Cross-unit PV production (catches a silently-dead inverter)**: compare
`lvx6048_1_mppt1_input_power` vs `lvx6048_2_mppt1_input_power`. In daylight (the
*other* unit clearly producing), one unit pinned at **0 W** = that inverter is down
and being masked by its sibling — RED → troubleshoot-inverter. This is exactly the
2026-06-20 fault-08 failure mode (unit 1 sat at 0 W for ~1.8 days). At night/heavy
shade both at 0 W is normal.
- **SoC spread across packs**: `max(soc) - min(soc)` over the 6 packs. BUT first
cross-check against `pack_voltage`/`cell_voltage_max`: the packs are paralleled, so
if all `pack_voltage` agree (±0.1 V) the packs are physically at the same charge and
any SoC spread is **counter drift**, not real imbalance (pack 6 ran 76 % while
reading the same 53.4 V / 3.337 V/cell as packs at 50–55 % on 2026-06-24). Real
imbalance = pack voltages actually diverge. Drift → note it, recommend a calibration
charge; >20 % spread with diverging voltages = RED → troubleshoot-battery.
- **Cell imbalance**: any pack with `cell_voltage_delta_mv` > 50 = YELLOW, > 100 = RED.
- **Parallel master/slave**: exactly one inverter should report
`parallel_instance_number` 0 (master); the other 1+. Two masters or two slaves = RED.
- **Faults**: any `fault_code` non-zero, or `device_mode` = Fault = RED → troubleshoot-inverter.
- **Temps**: pack `temperature_pcb` > 55 °C or inverter heat-sink > 75 °C = YELLOW.
- **Power balance sanity**: PV in (`mppt*_input_power`) vs AC out vs pack
`pack_current` should roughly conserve. Gross mismatch = investigate via power-usage.
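
The drift-versus-imbalance rule in the SoC cross-check can be sketched as a small classifier. Treat this as a hedged illustration: `classify_soc_spread` and its labels are hypothetical, "voltages agree (±0.1 V)" is interpreted here as a max-min spread of at most 0.2 V, the 10 % "worth noting" level is borrowed from troubleshoot-battery §2, and the label for diverging voltages below 20 % spread is an assumption.

```python
def classify_soc_spread(socs, pack_voltages, v_tol=0.1,
                        note_spread=10.0, red_spread=20.0):
    """Return 'ok', 'drift', 'check-voltages', or 'RED' (hypothetical labels)."""
    soc_spread = max(socs) - min(socs)
    v_spread = max(pack_voltages) - min(pack_voltages)
    voltages_agree = v_spread <= 2 * v_tol  # assumption: +/-0.1 V band
    if voltages_agree:
        # Paralleled packs at the same voltage are at the same charge,
        # so any SoC spread is counter drift, not real imbalance.
        return "drift" if soc_spread > note_spread else "ok"
    if soc_spread > red_spread:
        return "RED"             # diverging voltages and >20 % spread
    return "check-voltages"      # assumption: diverging but modest spread

# The 2026-06-24 case: pack 6 at 76 %, the rest at 50-55 %, same 53.4 V.
drift_case = classify_soc_spread([50, 52, 53, 55, 54, 76], [53.4] * 6)
# Made-up case where voltages really diverge.
red_case = classify_soc_spread([40, 62, 61, 63, 60, 62],
                               [52.1, 53.4, 53.4, 53.3, 53.4, 53.4])
print(drift_case, red_case)
```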
## 4. Verdict
Print a compact table (subsystem → state → one-line reason), then an overall
GREEN / YELLOW / RED with the top 1–3 issues and which deeper skill to run.
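
One way to derive the overall verdict is "worst state wins", sketched below. The `verdict` helper, the row layout, and the sample rows are all hypothetical; the skill itself does not prescribe an implementation.

```python
SEVERITY = {"GREEN": 0, "YELLOW": 1, "RED": 2}

def verdict(rows):
    """rows: list of (subsystem, state, reason). Returns (overall, top_issues)."""
    overall = max((state for _, state, _ in rows),
                  key=SEVERITY.__getitem__, default="GREEN")
    # Non-green rows, worst first; sorted() is stable so ties keep input order.
    issues = sorted((r for r in rows if r[1] != "GREEN"),
                    key=lambda r: -SEVERITY[r[1]])
    return overall, issues[:3]

# Made-up findings for illustration.
rows = [
    ("packs", "YELLOW", "SoC spread but pack voltages agree (drift)"),
    ("inverter 1", "RED", "mppt1 pinned at 0 W while unit 2 produces"),
    ("evse", "GREEN", "idle"),
]
for name, state, reason in rows:
    print(f"{name:<12} {state:<7} {reason}")
overall, top = verdict(rows)
print(f"OVERALL: {overall}; top issue: {top[0][0]}")
```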
## 5. Recent error scan (only if anything looked off)
```bash
for s in powermon powermon2 eg4-battery lvx-control; do
echo "== $s =="; journalctl -u $s.service --since "15 min ago" --no-pager | grep -iE 'error|timeout|fail|crc|nak|reconnect' | tail -5
done
```
## 5b. Historical sanity — did anything fail while unattended? (needs HA token)
Live snapshots miss faults that already cleared and silent-unit spells that ended.
If `~/.config/ha/token` exists (see REFERENCE), scan the recorder for the last few
days. Use the REAL HA entity_ids (doubled slug — see REFERENCE), not MQTT names:
```bash
"$HIST" -s "5 days ago" -m fault sensor.lvx6048_lvx6048_1_fault_code sensor.lvx6048_lvx6048_2_fault_code
# silent-unit hunt: sample midday PV both units across recent days; one pinned 0 while
# the other produced = it was down. e.g. check a midday window per day:
"$HIST" -s "2 days ago" sensor.lvx6048_lvx6048_1_mppt1_input_power sensor.lvx6048_lvx6048_2_mppt1_input_power | head -40
```
Any fault-08 / silent-unit episode → report with timestamps and hand off to
troubleshoot-inverter §2–§5. No token → say so and point the user at REFERENCE to add one.
## 6. Allowed remediation
If a daemon is `failed` or running-but-silent, restarting it is permitted:
```bash
sudo systemctl restart eg4-battery.service # or powermon / powermon2 / lvx-control
```
Re-run the relevant snapshot to confirm data resumes. Anything beyond a restart
(settings, flash, cabling) → report and hand the user the exact command. Never
publish to `solar/control/lvx6048/*` from this skill.


@@ -0,0 +1,77 @@
---
name: troubleshoot-battery
description: >-
Diagnose the 6× EG4 LifePower4 v2 battery packs — per-pack SoC, cell imbalance,
SoC drift, RS485/Modbus comms silence, temperature, and warning/protection bits.
Use when a pack reads oddly, packs disagree on SoC, a pack stopped reporting,
cells look imbalanced, the user mentions "battery problem / pack down / SoC wrong
/ imbalance / one battery", or after solar-health-check flags the stack. Read-only
plus a safe eg4-battery daemon restart; never writes BMS settings.
---
# troubleshoot-battery
## 0. Load context
Shell cwd is the repo root; anchor paths there:
```bash
ROOT="$(git rev-parse --show-toplevel)"; SNAP="$ROOT/.claude/skills/lib/solar-snapshot"
```
Read `$ROOT/.claude/skills/REFERENCE.md`. There are **6 packs** `lifepower4_1..6_*`,
all served by `eg4-battery.service` (one FTDI RS485 adapter per pack). Pack config:
`~/.config/eg4-battery/eg4-battery.yaml`.
## 1. Are all 6 packs reporting?
```bash
systemctl is-active eg4-battery.service
"$SNAP" -w 16 -g 'lifepower4_[1-6]_(soc|pack_voltage|pack_current)/' 'homeassistant/sensor/+/state'
```
- Fewer than 6 packs in the output → a pack is **silent on RS485**, go to §4.
- All 6 present → go to §2/§3.
## 2. SoC spread & drift
```bash
"$SNAP" -w 16 -g 'lifepower4_[1-6]_(soc|soc_alt|pack_voltage)/' 'homeassistant/sensor/+/state'
```
- Compute `max(soc) - min(soc)`. >10 % = imbalance worth noting; >20 % = significant.
**Pack 6 historically runs high** (it's the oddball: Modbus addr `0x01`/115200 vs
`0x40`/9600 for packs 1-5) — judge it on its own, don't assume it tracks packs 1-5.
- **SoC drift is a known design limitation**, not a live fault: the coulomb counter
never re-anchors because the bank rarely reaches 100 % to reset. See memory
`project_eg4_soc_drift_remediation`. If SoC looks wrong but `pack_voltage` is
sane, suspect drift, not a dead pack. Voltage→SoC sanity for LFP at rest:
~51.2 V ≈ low, ~53.5 V ≈ mid, ~54+ V ≈ high (loaded/charging skews this).
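The spread check above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `soc_spread` and the sample readings are hypothetical, and the 10 / 20 point thresholds are the ones quoted in this section.

```python
from __future__ import annotations


def soc_spread(soc_by_pack: dict[int, float]) -> tuple[float, str]:
    """Return (max - min SoC in percentage points, verdict) for a set of packs."""
    values = list(soc_by_pack.values())
    spread = max(values) - min(values)
    if spread > 20:
        verdict = "significant"
    elif spread > 10:
        verdict = "worth noting"
    else:
        verdict = "ok"
    return spread, verdict


# Made-up readings for illustration only (not measured values).
readings = {1: 62.0, 2: 60.0, 3: 58.0, 4: 61.0, 5: 59.0, 6: 76.0}

whole_stack = soc_spread(readings)
# Pack 6 is judged on its own, so also look at packs 1-5 alone.
packs_1_to_5 = soc_spread({p: s for p, s in readings.items() if p != 6})
print(whole_stack, packs_1_to_5)
```

Computing the spread twice (with and without pack 6) keeps the oddball pack from masking or exaggerating imbalance among the other five.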
## 3. Cell imbalance, temperature, protection bits
```bash
"$SNAP" -w 16 -g 'lifepower4_[1-6]_(cell_voltage_delta_mv|cell_voltage_min|cell_voltage_max|temperature_pcb)/' 'homeassistant/sensor/+/state'
# For a specific suspect pack N, pull all 16 cells + bits:
"$SNAP" -w 14 -g 'lifepower4_3_(cell_[0-9]+_voltage|warning|protection|temperature)' 'homeassistant/sensor/+/state'
```
- `cell_voltage_delta_mv`: <30 mV good, 30-50 mV watch, >50 mV imbalanced, >100 mV
bad (a weak/failing cell, or the pack simply needs a long absorb to balance).
- Any warning/protection bit set → read its name; over-/under-voltage,
over-temp, and over-current protections will also explain a pack dropping current.
- Temps: `temperature_pcb` > 55 °C = watch.
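As a rough sketch, the delta and temperature thresholds above map to a small classifier. The function names are hypothetical; only the numeric bands come from this section.

```python
def classify_delta_mv(delta_mv: float) -> str:
    """Bucket a pack's cell_voltage_delta_mv using the bands listed above."""
    if delta_mv > 100:
        return "bad"
    if delta_mv > 50:
        return "imbalanced"
    if delta_mv >= 30:
        return "watch"
    return "good"


def pcb_temp_watch(temperature_pcb_c: float) -> bool:
    """True when temperature_pcb is above the 55 degC watch level."""
    return temperature_pcb_c > 55


for sample in (12, 30, 50, 75, 140):
    print(sample, classify_delta_mv(sample))
```

A pack landing in "imbalanced" or "bad" is not automatically a failing cell; as noted above, it may simply need a long absorb to balance.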
## 4. RS485 / Modbus comms silence (a pack missing from §1)
```bash
journalctl -u eg4-battery.service --since "15 min ago" --no-pager | grep -iE 'timeout|crc|nak|error|no response|pack|addr' | tail -30
ls -l /dev/serial/by-id/ | grep -i ft232 # all 6 FTDI adapters enumerated?
grep -E 'name:|port:|address|baud' ~/.config/eg4-battery/eg4-battery.yaml
```
- Missing FTDI under `by-id` → USB/adapter/cable issue for that pack (hardware →
report to user; don't unplug things yourself).
- FTDI present but pack times out → check it isn't demoted by an inter-pack
daisy-chain (memory `project_eg4_daisy_chain_silences_slaves`: each pack must be on
its own dongle; a chain silences slaves). Also confirm the pack's `address`/`baud`
in config match the unit (pack 6 legitimately differs).
- Daemon wedged after a USB re-enumerate → restart is ALLOWED:
```bash
sudo systemctl restart eg4-battery.service
```
Then re-run §1 to confirm all 6 return.
## 5. Report
Per-pack table (SoC, voltage, current, delta_mv, max temp, any bits) + the stack
spread, whether it's drift vs a real fault, comms state, what you restarted, and any
hardware action left for the user. Do not write BMS registers or thresholds.

@@ -0,0 +1,118 @@
---
name: troubleshoot-inverter
description: >-
Diagnose the 2× MPP Solar LVX6048 inverters — faults/warnings (FWS), operating
mode, parallel-cluster master/slave sync, PV/MPPT input, USB-HID link loss, and
powermon daemon health. Use when an inverter shows a fault, is in the wrong mode,
stopped publishing, the two units disagree, PV looks low, or the user says
"inverter problem / fault code / no solar / one inverter is down". Read-only plus
safe link/daemon recovery; never changes inverter settings.
---
# troubleshoot-inverter
## 0. Load context
Shell cwd is the repo root; anchor paths there:
```bash
ROOT="$(git rev-parse --show-toplevel)"; SNAP="$ROOT/.claude/skills/lib/solar-snapshot"; HIST="$ROOT/.claude/skills/lib/ha-history"
```
Read `$ROOT/.claude/skills/REFERENCE.md`. Inverter entities are `lvx6048_1_*`
(powermon.service) and `lvx6048_2_*` (powermon2.service).
## 1. Is it a data problem or a device problem?
```bash
systemctl is-active powermon.service powermon2.service lvx-resolve-links.service
ls -l /dev/lvx6048-1 /dev/lvx6048-2 # symlinks present? point at hidraw?
"$SNAP" -w 12 -g 'lvx6048_[12]_(device_mode|fault_code|battery_voltage|ac_output_active_power)/' 'homeassistant/sensor/+/state'
```
- Service active + data flowing → **device/config** issue, go to §2.
- Service active but a unit's entities are silent, or a `/dev/lvx6048-*` symlink is
missing/dangling → **USB-HID link** issue, go to §4.
## 2. Faults & mode
```bash
"$SNAP" -w 12 -g 'lvx6048_[12]_(device_mode|fault_code|inverter_heat_sink_temperature)/' 'homeassistant/sensor/+/state'
journalctl -u powermon.service -u powermon2.service --since "20 min ago" --no-pager | grep -iE 'fault|warn|FWS|mode|error' | tail -20
```
- `device_mode` values: Power-On / Standby / Bypass / Battery / Line / Charge / Fault.
`Bypass`/`Line` = passing grid through (normal when grid present + low PV).
`Fault` = stop, decode the fault.
- The FWS fault/warning bit → label mapping lives in the patched driver
`$ROOT/LVX6048/powermon-patches/pi18.py` (search `FWS`, `fault`, `warning`).
Read it to translate a raw `fault_code`. MOD code labels are there too.
Quick refs: 02 over-temp, 03/04 battery V high/low, 07 overload timeout,
08 bus voltage too high, 56 battery connection open, 71 parallel version
different, 80-86 parallel-cluster faults (86 = output setting mismatch).
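For a quick first pass, the quick refs above can be held in a lookup table. This is a convenience sketch only: the authoritative mapping is the patched `pi18.py` driver, and the names `FAULT_QUICK_REF` and `describe_fault` are hypothetical.

```python
# Mirrors only the quick refs listed above; pi18.py remains the source of truth.
FAULT_QUICK_REF = {
    2: "over-temp",
    3: "battery voltage high",
    4: "battery voltage low",
    7: "overload timeout",
    8: "bus voltage too high",
    56: "battery connection open",
    71: "parallel version different",
    86: "parallel-cluster fault: output setting mismatch",
}


def describe_fault(code) -> str:
    """Translate a raw fault_code (int or zero-padded string) to a short label."""
    number = int(code)
    if number in FAULT_QUICK_REF:
        return FAULT_QUICK_REF[number]
    if 80 <= number <= 86:
        return "parallel-cluster fault"
    return "unknown: look it up in pi18.py"


print(describe_fault("08"))
```

Anything that falls through to "unknown" should be decoded from the driver rather than guessed.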
### Historic faults — "when did it last happen / show me last week"
Local logs only reach the last reboot (`journalctl` here is volatile, ~1 day), so
for anything older query HA's recorder via `ha-history`. It needs `~/.config/ha/token`
(see REFERENCE); if the token is absent, tell the user how to create it, don't block.
Use the REAL HA entity_ids (NOT the MQTT object names — see REFERENCE; the slug is
doubled and differs per command):
```bash
"$HIST" -s "10 days ago" sensor.lvx6048_lvx6048_1_device_mode sensor.lvx6048_lvx6048_2_device_mode
"$HIST" -s "10 days ago" -m fault sensor.lvx6048_01_lvx6048_1_fault_code sensor.lvx6048_02_lvx6048_2_fault_code
```
Each `Fault`/non-`No fault` change-point prints with a local timestamp and how long
it lasted (marked `<<< FAULT`). To pin the cause, re-query the same window with `-a`
for the surrounding conditions — `mppt1_input_voltage`, `mppt1_input_power`,
`battery_voltage`, `ac_output_active_power`:
- **Normal PV V (~300 V) + normal battery (~54 V) at the fault** → it's an internal
DC-bus transient, NOT input/battery over-voltage (rules out the cold-Voc theory).
- **Fault on ONE unit only, repeatedly** → unit-specific weakness (bus regulation /
cap / sensor / slave-CPU FW), not environmental (which hits both paralleled units).
- **`mppt1_input_power` flatlines at 0 after the fault and stays there** → that unit
silently stopped producing; check it didn't sit dead until a reboot (a real 2026-06-20
occurrence: unit 1 fault 08 at 17:20 → 0 W PV until the Jun 22 reboot, ~half the
array offline ~1.8 days while unit 2 masked it). Cross-check the *other* unit's PV
over the same span to catch this.
## 3. Parallel-cluster sync
The two units run paralleled; desync throws **fault 86**.
```bash
"$SNAP" -w 12 -g 'lvx6048_[12]_(parallel_instance_number|ac_output_active_power|ac_output_voltage)/' 'homeassistant/sensor/+/state'
```
- Exactly one unit should report `parallel_instance_number` 0 (master); the other ≥1.
Two masters (both 0) or two slaves (neither 0) → cluster confused → likely needs a
coordinated power cycle (user action: propose it, don't do it).
- Healthy parallel = both units sharing load roughly symmetrically
(`ac_output_active_power` comparable). One at ~0 W while the other carries
everything = that unit dropped out of the cluster.
- Deeper sync/settings comparison via `$ROOT/LVX6048/lvx-flash/flash.py sync-check` /
`compare` exists but **stops powermon and grabs the USB** — advanced, user-run
only. Propose the command; don't execute.
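The two health rules above (exactly one master, load shared rather than carried by one unit) can be sketched as follows. The function names and the 50 W idle threshold are assumptions for illustration, not values from the tooling.

```python
from __future__ import annotations


def find_masters(instance_numbers: dict[str, int]) -> list[str]:
    """Units reporting parallel_instance_number 0."""
    return sorted(unit for unit, n in instance_numbers.items() if n == 0)


def cluster_verdict(
    instance_numbers: dict[str, int],
    power_w: dict[str, float],
    idle_w: float = 50.0,  # assumed cutoff for "about 0 W"
) -> str:
    masters = find_masters(instance_numbers)
    if len(masters) != 1:
        return f"confused: {len(masters)} master(s)"
    idle = sorted(u for u, w in power_w.items() if w < idle_w)
    busy = [u for u, w in power_w.items() if w >= idle_w]
    # One unit idle while another carries load means it dropped out.
    if idle and busy:
        return "dropped out: " + ", ".join(idle)
    return "ok"


print(cluster_verdict({"lvx6048_1": 0, "lvx6048_2": 1},
                      {"lvx6048_1": 1500.0, "lvx6048_2": 1450.0}))
```

Both units idle at the same time is treated as "ok" here, since that is just a period with no load.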
## 4. USB-HID link recovery (ALLOWED)
Symptom: one unit's entities stale/absent, or `/dev/lvx6048-*` missing/dangling.
The resolver maps hidraw→stable symlink by PI18 serial and restarts powermon.
```bash
sudo systemctl restart lvx-resolve-links.service # remaps + restarts powermon{,2}
# or run the resolver directly:
sudo /usr/local/sbin/lvx-resolve-links
ls -l /dev/lvx6048-* # both symlinks back?
sudo systemctl restart powermon.service powermon2.service # if still wedged
```
Then re-run §1 snapshot to confirm both units publish again. Note: after an
inverter power-cycle the udev rule normally self-heals; manual restart is the
fallback when it didn't.
## 5. PV / MPPT looks low
```bash
"$SNAP" -w 12 -g 'lvx6048_[12]_(mppt1_input_power|mppt2_input_power|device_mode)/' 'homeassistant/sensor/+/state'
```
Cross-reference array memory (`project_solar_array_config`): single MPPT per inverter,
9s2p into it. **"Low" PV is mostly expected, not a fault** (verified 2026-06-24): at a
clear-noon peak each inverter made ~16 A @ ~300 V — a down string would read ~10 A, so
both parallel strings are live. The ~66%-of-nameplate output is explained by the 45° tilt
(wrong for high summer sun), heat derate, tree shading, AND **demand-throttling** (at noon
the battery charge pins at its cap, so the array is curtailed, not weak). To judge the
array, measure on a clear noon with the bank *hungry* (uncurtailed): ~16 A @ 300 V then
is healthy. Zero PV at night/heavy shade is normal. Don't diagnose "low PV" off a
single off-peak or demand-limited sample.
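The nameplate comparison above is simple arithmetic, worked here as a sketch. It assumes 400 W panels (the figure used in the top-level README) and takes the 9s2p layout as 18 panels per inverter; the 16 A and 300 V values are the clear-noon readings quoted above.

```python
PANEL_W = 400                      # per-panel nameplate, from the top-level README
PANELS_PER_INVERTER = 9 * 2        # 9s2p: two parallel strings of nine

nameplate_w = PANEL_W * PANELS_PER_INVERTER   # per-inverter nameplate
measured_w = 16 * 300                         # ~16 A at ~300 V, clear-noon peak
fraction = measured_w / nameplate_w

print(f"{measured_w} W of {nameplate_w} W = {fraction:.0%}")
```

That works out to about two thirds of nameplate per inverter, which is the figure this section attributes to tilt, heat, shade, and curtailment rather than a dead string.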
## 6. Report
State: which unit, mode, fault (decoded), cluster health, link state, what you
recovered (if anything), and any remaining action that needs the user (power cycle,
flash.py, combiner check) as exact commands/steps. Never publish to
`solar/control/lvx6048/*` here.

@@ -196,9 +196,42 @@ then restart the three services.
PF) are implemented in `lvx-flash/flash.py` for offline use, but aren't
exposed as HA button/select entities. Deferred until monitoring has been
stable for at least a week.
- **Closed-loop BMS comms.** Currently open-loop — inverters estimate SoC
from voltage, batteries don't push real-time SoC / charge limits to the
inverter. Closed-loop would give better SoC accuracy and dynamic CC
tapering near full. Path is the dedicated CAN port on the master pack to the
inverter BMS port (separate cable from the inter-pack daisy-chain).
Deferred.
- **Closed-loop BMS comms — evaluated 2026-06, NOT pursuing.** Stays open-loop.
Closed-loop would need the LP4V2 to emulate Pylontech-CAN (no native EG4
support on the LVX BMS port) and would require the inter-pack daisy-chain that
silences slave packs' RS485 ports — i.e. trade away the per-pack/per-cell
telemetry that's our best diagnostic. And it wouldn't fix SoC accuracy: it just
forwards the BMS's own drifted SoC. The cure for drift is a periodic full charge
(see below), done open-loop. See `../eg4battery/NOTES.md` §"Closed-loop BMS comms".
## SoC calibration & known firmware quirks
**SoC drift fix.** The EG4 pack SoC counters drift because the conservative profile
rarely drives a true full charge. A periodic **full charge to absorption** re-anchors
every pack to 100%. Automated by the `calibration-charge` skill —
[`../.claude/skills/calibration-charge/`](../.claude/skills/calibration-charge/):
solar-only on a clear day, or grid-assisted via **output-priority SUB** on a cloudy
day (`../.claude/skills/lib/grid-cal-monitor`, which safety-monitors and auto-reverts).
**LVX6048 firmware quirks** (learned the hard way; baked into the tooling):
- **PI18 `POP` is single-digit.** `solar_battery_utility` encodes as `POP1`, NOT
`POP01` — the inverter *silently rejects* the malformed `POP01` (no error on the
result topic). Fixed in `lvx-control` + `lvx-flash/flash.py` `POP_MAP`. A `POP` set
returns "crc check fails" on the result topic but still applies; `PCP` returns "Succeeded".
- **`MCHGC`/`MUCHGC` are locked while charging** — a charge-current change NAKs whenever
`mppt1_charger_status=charging` (even though `device_mode` reads `Battery`). Set them
only in a pre-charge idle window.
- **`stop_charge_voltage` is really `battery_re_discharge_voltage`** and can't exceed
`float`; firmware NAKs `0`/"Full". Don't use it to force a grid charge — use POP=SUB.
- **PIRI readback lags ~5 min**, so verify a setter by *behavior*
(`line_power_direction`, `device_mode`), not the readback entity.
- **powermon's adhoc queue can wedge** (commands stop landing on the result topic);
`sudo systemctl restart powermon.service powermon2.service` clears it.
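The `POP` single-digit quirk can be guarded against in code. The sketch below reuses the corrected `POP_MAP` values from this change; `encode_pop` is a hypothetical helper, and command framing and CRC are deliberately left out.

```python
# Corrected map: PI18 POP takes a single digit.
POP_MAP = {"solar_utility_battery": "0", "solar_battery_utility": "1"}


def encode_pop(priority: str) -> str:
    """Build the POP command body, refusing anything that is not one digit."""
    code = POP_MAP[priority]
    if len(code) != 1 or not code.isdigit():
        # A value like "01" would be silently rejected by the inverter.
        raise ValueError(f"POP code must be a single digit, got {code!r}")
    return f"POP{code}"


print(encode_pop("solar_battery_utility"))
print(encode_pop("solar_utility_battery"))
```

Failing loudly at encode time is a design choice for this sketch: the inverter gives no error for a malformed `POP01`, so the check has to live on the sending side.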
## Monitoring & troubleshooting skills
Agent-runnable skills for this install live in
[`../.claude/skills/`](../.claude/skills/) (see `REFERENCE.md` there for the system
map): `solar-health-check`, `troubleshoot-inverter`, `troubleshoot-battery`,
`power-usage`, and `calibration-charge`, plus helpers `lib/solar-snapshot` (live MQTT)
and `lib/ha-history` (HA recorder lookback).

@@ -41,18 +41,35 @@ Risky setters (battery thresholds, type, output mode, factory reset) are
intentionally **not** exposed here — those should go through
`lvx-flash/flash.py apply` with an explicit profile and confirmation.
### Known limitation
### Known limitations & quirks
`max_charging_current` (MCHGC) and `max_utility_charging_current` (MUCHGC)
return `Failed` via PI18 when the inverter is actively charging (mode 06)
— the firmware appears to lock these setters during charge cycles. Other
setters (POP / PCP / PSP / PEI / PDI) work in all observed modes. If you
need to reliably change the charge-current caps, either:
return `Failed` whenever the inverter is **actively charging** (charger_status
= `charging`, even while `device_mode` reads `Battery`) — the firmware locks
these charge-current setters during charge. Set them only in a pre-charge idle
window, or via the LCD (Programs 02 / 11), or `lvx-flash/flash.py apply`.
- wait for the inverter to settle into Standby (mode 01) and retry, or
- change via the LCD (Programs 02 / 11), or
- use `lvx-flash/flash.py apply` (it stops the powermon services first,
giving exclusive USB access).
Other quirks (confirmed 2026-06-25):
- **`POP` is single-digit.** `output_priority=solar_battery_utility` must encode
as `POP1`, NOT `POP01` — the inverter *silently rejects* the malformed `POP01`
(nothing on the `result` topic, no effect). Fixed in `POP_MAP` here and in
`flash.py`. If reinstalling, make sure the live `/usr/local/bin/lvx-control`
has the fix.
- A **`POP` set returns "crc check fails"** on the `result` topic but still
applies; `PCP` returns a clean "Succeeded". So verify a POP change by
**behavior** (`device_mode` / `line_power_direction`), not the result string.
- **PIRI readback lags ~5 min** — don't trust the `*_output_source_priority`
sensor to confirm a just-issued change; watch the behavior instead.
- If commands stop landing on the `result` topic entirely, powermon's adhoc
queue has wedged → `sudo systemctl restart powermon.service powermon2.service`.
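Verifying by behavior rather than by the result string can be sketched as a polling loop. This is a minimal illustration, not the `grid-cal-monitor` code: `wait_for_revert` and its parameters are hypothetical, the reader callable stands in for however `line_power_direction` is actually fetched, and the only value taken from this change is that a successful revert means the direction leaves `input`.

```python
import time
from typing import Callable


def wait_for_revert(
    read_direction: Callable[[], str],   # hypothetical: returns line_power_direction
    timeout_s: float = 120.0,            # assumed timeout
    poll_s: float = 5.0,                 # assumed poll interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True once line_power_direction is no longer 'input', False on timeout."""
    deadline = clock() + timeout_s
    while True:
        if read_direction() != "input":
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


# Simulated readings for illustration; "output" is a placeholder non-input value.
samples = iter(["input", "input", "output"])
print(wait_for_revert(lambda: next(samples), sleep=lambda _s: None))
```

On a `False` result the caller would escalate (restart powermon, resend the raw command) instead of trusting that the publish succeeded.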
### Forcing a full grid charge (calibration)
To grid-charge the bank to full (e.g. SoC calibration on a cloudy day), set
`output_priority=solar_utility_battery` (SUB) so the inverter runs loads off
grid and charges the battery to full, plus `charger_priority=solar_and_utility`.
Revert to `solar_battery_utility` + `solar_first` when done. Automated and
safety-monitored by `../../.claude/skills/lib/grid-cal-monitor`.
Track the `result` topic to see the actual outcome of each command.

@@ -44,7 +44,7 @@ ADHOC_TOPICS = (
# Intentionally a strict subset of flash.py's SCHEDULE — only the safe,
# day-to-day knobs. The risky-end calibration setters live in flash.py.
POP_MAP = {"solar_utility_battery": "0", "solar_battery_utility": "01"}
POP_MAP = {"solar_utility_battery": "0", "solar_battery_utility": "1"} # PI18 POP is single-digit; "01" is malformed -> inverter silently rejects (confirmed 2026-06-25, broke a calibration auto-revert)
PCP_MAP = {"solar_first": "0", "solar_and_utility": "1", "solar_only": "2"}
PSP_MAP = {"battery_load_utility_ac": "0", "load_battery_utility": "1"}
ALLOWED_MCHGC = (10, 20, 30, 40, 50, 60, 70, 80)

@@ -79,7 +79,7 @@ KEY_DOCS: dict[str, str] = {
"grid_tie": "enum: enabled | disabled (PEI/PDI)",
}
POP_MAP = {"solar_utility_battery": "0", "solar_battery_utility": "01"}
POP_MAP = {"solar_utility_battery": "0", "solar_battery_utility": "1"} # PI18 POP is single-digit; "01" is malformed -> inverter silently rejects (confirmed 2026-06-25)
PCP_MAP = {"solar_first": "0", "solar_and_utility": "1", "solar_only": "2"}
PSP_MAP = {"battery_load_utility_ac": "0", "load_battery_utility": "1"}
PBT_MAP = {"AGM": "0", "FLOODED": "1", "USER": "2"}

@@ -0,0 +1,67 @@
# LVX6048 settings profile — TEMPORARY calibration charge for the EG4 LP4 v2 bank.
#
# Purpose: re-anchor drifted EG4 pack SoC counters (and top-balance) by letting the
# bank reach a FULL charge with absorption hold. The EG4 BMS resets SoC to 100% only
# on a real full-charge termination (high cell voltage + low taper current); the
# conservative everyday profile stops grid charging at 54.0 V (mid-knee), so on cloudy
# / high-load stretches the bank may go weeks without a full charge and the coulomb
# counters drift (e.g. pack 6 read 76% while physically at ~53% on 2026-06-24).
#
# !!! DEPRECATED 2026-06-25 — DO NOT USE. The re_discharge lever does not work for
# grid calibration: firmware NAKs stop_charge_voltage above float (56 > 54), and it's
# the wrong mechanism anyway. The CORRECT grid-charge lever is output priority -> SUB
# (solar_utility_battery) via lvx-control — see the calibration-charge skill §3 and
# memory project_lvx6048_grid_charge_lever. Kept only as a record of the dead end.
#
# !!! 2026-06-24 FINDING — grid-assist lever corrected, still UNVALIDATED !!!
# The original idea (stop_charge_voltage: 0 = "Full") was REJECTED by the firmware:
# `flash.py apply` got an inverter NAK on `BUCD480,000` on BOTH units (no change made).
# The field flash.py calls `stop_charge_voltage` is really the inverter's
# **battery_re_discharge_voltage** (HA: sensor.lvx6048_*_battery_re_discharge_voltage) —
# the voltage at which the inverter switches loads back to battery after grid charging.
# At 54.0 V, grid tops the bank only to ~54 V. Raising it (below) lets grid charge
# higher, BUT it may band-oscillate near the setpoint rather than hold a clean
# absorption, so it's NOT guaranteed to give the full-charge termination the BMS needs
# to re-anchor. SOLAR-ONLY is the known-good method (solar follows the full CC/CV curve
# to bulk + absorption); use this grid profile only as a supervised experiment.
#
# Corrected (candidate) change vs canonical: stop_charge_voltage 54.0 -> 56.0 (was 0).
# bulk_voltage stays 56.4 (absorption target).
#
# USE: this is a TEMPORARY profile driven by the `calibration-charge` skill. Apply to
# BOTH inverters, run the full charge, verify all 6 packs hit 100%, then REVERT to
# eg4-lp4-v2.yaml. Do not leave this profile applied — it removes the everyday
# grid-charge ceiling.
#
# sudo systemctl stop powermon.service powermon2.service
# ./flash.py apply --device /dev/lvx6048-1 --profile profiles/eg4-lp4-v2-calibration.yaml --confirm
# ./flash.py apply --device /dev/lvx6048-2 --profile profiles/eg4-lp4-v2-calibration.yaml --confirm
# ./flash.py compare --device-a /dev/lvx6048-1 --device-b /dev/lvx6048-2
# sudo systemctl start powermon.service powermon2.service
# # ... drive + verify the charge (see calibration-charge skill) ...
# # REVERT when all packs read 100%:
# sudo systemctl stop powermon.service powermon2.service
# ./flash.py apply --device /dev/lvx6048-1 --profile profiles/eg4-lp4-v2.yaml --confirm
# ./flash.py apply --device /dev/lvx6048-2 --profile profiles/eg4-lp4-v2.yaml --confirm
# sudo systemctl start powermon.service powermon2.service
battery_type: USER
cutoff_voltage: 48.0
stop_discharge_voltage: 48.0
# re-discharge voltage. 54.0 (canonical) tops grid charge to ~54 V; 56.0 lets grid
# charge higher. NOT 0 — firmware NAKs 0/"Full". Range 48.0..58.0. UNVALIDATED lever.
stop_charge_voltage: 56.0
bulk_voltage: 56.4
float_voltage: 54.0
max_charging_current: 60
max_utility_charging_current: 30
output_source_priority: solar_battery_utility
charger_priority: solar_first
solar_power_priority: battery_load_utility_ac
grid_tie: disabled

@@ -1,6 +1,8 @@
# LVX6048 settings profile — EG4 LifePower4 v2 LiFePO4 bank, open-loop.
#
# Designed for: 3× EG4 LP4 v2 100 Ah in parallel (300 Ah, ~15.4 kWh).
# Designed for: 6× EG4 LP4 v2 100 Ah in parallel (600 Ah, ~30.7 kWh).
# (Was 3 packs originally; 3 more added — commit 38ac9ca. Values below are
# unchanged and remain correct for 6 packs — they're only MORE conservative now.)
# Apply identically to both inverters in a parallel pair:
# sudo systemctl stop powermon.service powermon2.service
# ./flash.py apply --device /dev/lvx6048-1 --profile profiles/eg4-lp4-v2.yaml --confirm
@@ -15,11 +17,22 @@
# stop_charge 54.0 V — grid charges only to ~mid-knee; solar handles the bulk top-off
# stop_dis 48.0 V = 3.000 V/cell — soft "switch to grid" point
# cutoff 48.0 V = 3.000 V/cell — inverter hard shutdown floor (BMS still protects below)
# (Floor reviewed 2026-06-24: deliberately kept at 48.0/3.00 V for cycle-life margin.
# Lowering to 46-47 V would unlock only a few % of usable LFP capacity and isn't
# worth the deeper cycling on this large, non-capacity-constrained bank.)
# Conservative off-grid policy: keep grid as a soft top-up, let solar do the bulk work.
#
# Current rationale (300 Ah bank):
# 60 A/unit × 2 units = 120 A combined ≈ 0.4 C — well under bank's continuous spec
# 30 A/unit MUCHGC keeps grid-charging conservative (60 A combined)
# Current rationale (600 Ah bank):
# 60 A/unit × 2 units = 120 A combined ≈ 0.2 C — very gentle on a 600 Ah bank.
# 30 A/unit MUCHGC keeps grid-charging conservative (60 A combined).
# Charge-current headroom — REVISIT (corrected 2026-06-24): an off-peak sample
# (71 A of 120 A) first suggested no clipping, but at the actual solar PEAK (13:44,
# ~9.5 kW PV) the bank was pinned AT the 120 A cap (121 A) with SoC only 65% and
# load just 3.3 kW. So at noon the charge cap likely IS throttling harvest. Before
# raising to 80 A/unit, run the throttle test: on a clear noon, lift the cap (or add
# load) and see if combined PV climbs above ~9.5 kW (MPPT voltage staying at Vmp). If
# it does, raise max_charging_current; if PV stays flat, the array is tilt/heat/shade
# limited (45 deg tilt + trees), not cap limited. See memory project_solar_array_config.
# enum: AGM | FLOODED | USER
# USER required to enable the per-cell custom voltages below.

README.md

@@ -0,0 +1,51 @@
# solar — home power monitoring & control
Monitoring, control, and calibration tooling for an off-grid-leaning solar + storage
install, all published to one Home Assistant MQTT broker (`10.0.0.41`).
## The system
```
6× EG4 LifePower4 v2 packs ──RS485 (1 FTDI each)──┐
2× MPP Solar LVX6048 inverters ──USB-HID/PI18──────┤ monitoring Pi ──MQTT──► Home Assistant
1× OpenEVSE charger (10.0.0.249) ───────────────────┘ (daemons) 10.0.0.41:1883
```
- **14.4 kW PV** (36×400 W, 4×9s strings, 9s2p per inverter, 45° south) → 2 paralleled
LVX6048 inverters → **~30 kWh** EG4 LifePower4 bank (6× 100 Ah, 16S).
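The headline sizes above follow from the component counts. The sketch below reproduces them, assuming a nominal 3.2 V per LFP cell (a typical nominal value, not a figure stated in this repo).

```python
PANELS = 36
PANEL_W = 400
PACKS = 6
PACK_AH = 100
CELLS_IN_SERIES = 16
NOMINAL_CELL_V = 3.2   # assumption: typical LFP nominal cell voltage

pv_kw = PANELS * PANEL_W / 1000
pack_nominal_v = CELLS_IN_SERIES * NOMINAL_CELL_V
bank_kwh = PACKS * PACK_AH * pack_nominal_v / 1000

print(f"PV: {pv_kw:.1f} kW, bank: {bank_kwh:.1f} kWh at {pack_nominal_v:.1f} V nominal")
```

Usable energy will be lower than this nominal figure, since the profile keeps the bank between conservative voltage limits.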
## Subsystems
| Dir | What | Entities |
|-----|------|----------|
| [`LVX6048/`](./LVX6048/) | 2 inverters via powermon (PI18/USB) + `lvx-flash` settings tool + `lvx-control` HA→PI18 bridge | `lvx6048_{1,2}_*` |
| [`eg4battery/`](./eg4battery/) | 6 battery packs via RS485/Modbus daemon | `lifepower4_{1..6}_*` |
| [`openevse/`](./openevse/) | EV charger HA-discovery publisher | `openevse_*` |
| [`battery/`](./battery/) | legacy V1 protocol decoder (historical) | — |
Each subsystem has its own `README.md` / `Install.md` / `NOTES.md`.
## Agent skills
Agent-runnable skills for monitoring, troubleshooting, and calibration live in
[`.claude/skills/`](./.claude/skills/) — start with
[`.claude/skills/REFERENCE.md`](./.claude/skills/REFERENCE.md) (system map, real HA
entity ids, known firmware quirks, action policy):
- `solar-health-check` — whole-system sweep + cross-checks + R/Y/G verdict
- `troubleshoot-inverter` / `troubleshoot-battery` — subsystem deep-dives
- `power-usage` — load vs PV vs grid vs battery balance
- `calibration-charge` — re-anchor drifted EG4 SoC via a full charge
- helpers: `lib/solar-snapshot` (live MQTT), `lib/ha-history` (HA recorder lookback),
`lib/grid-cal-monitor` (supervised grid calibration with auto-revert)
## Notable findings (see per-subsystem docs + the skills' REFERENCE)
- **EG4 SoC drifts** (counters never re-anchor without a full charge) → fixed by the
`calibration-charge` skill.
- **"Low" PV is mostly geometry + curtailment, not a fault** — both strings are healthy
(~16 A @ 300 V at clear-noon peak); the shortfall is 45° tilt, heat, trees, and the
battery charge cap throttling midday harvest.
- **LVX6048 firmware quirks**: PI18 `POP` is single-digit (`POP1`, not `POP01`);
MCHGC locked while charging; force a full grid charge via output-priority **SUB**, not
the voltage thresholds; PIRI readback lags ~5 min (verify by behavior).

@@ -148,7 +148,11 @@ come from decoding the Comm1/Comm2 hub bus instead (a future mode).
### Adapters
On this host, three USB-FTDI adapters are plugged into the three packs' RS485 ports:
On this host there are now **6 packs**, each with its own USB-FTDI adapter on its
RS485 port. The config of record (all 6 `packs:` entries with their
`/dev/serial/by-id/...` paths, addresses, and bauds) is
`~/.config/eg4-battery/eg4-battery.yaml`, with an example at
`config/eg4-battery.yaml.example`. Packs 1-5 are addr `0x40` @ 9600; **pack 6** is
the odd one, addr `0x01` @ 115200. First three (historical):
| Adapter ID | Pack | `/dev/serial/by-id/...` |
|------------------|----------------|--------------------------------------------------------|
@@ -156,12 +160,34 @@ On this host, three USB-FTDI adapters are plugged into the three packs' RS485 po
| A994XGUY | bat2 (RS485) | `usb-FTDI_FT232R_USB_UART_A994XGUY-if00-port0` |
| A994XMBR | bat3 (RS485) | `usb-FTDI_FT232R_USB_UART_A994XMBR-if00-port0` |
Each pack gets polled on its own bus → no shared-bus arbitration, no master/slave coordination needed, pack Modbus address is 0x40 for all of them.
Each pack gets polled on its own bus → no shared-bus arbitration, no master/slave coordination needed.
## LVX6048 compatibility (still true)
LVX6048 BMS port protocols: `PYL` (Pylontech), `LIb` (MPP LIO), `WEC` (WECO), `SOL` (Soltaro), `VSC` (Pylontech-CAN), `USE` (voltage-only). **No native EG4 LP4V2 support.** For inverter↔battery comms, set `P05/P14 = USE` and manage charge profile via `lvx-flash`. See DIY Solar Forum threads 67496 & 96019, LVX6048WP manual §9-2.
### Closed-loop BMS comms — evaluated, NOT recommended (2026-06)
Going closed-loop (daisy-chain packs → master CAN → inverter) was assessed and
**rejected** for this install:
1. **No native protocol** — would rely on the LP4V2 emulating Pylontech-CAN; unverified.
2. **Loses per-pack monitoring** — closed-loop needs the inter-pack Comm daisy-chain,
which silences slave packs' RS485 ports (see "RS485 only works standalone" above).
The 6-FTDI per-pack/per-cell telemetry — our best diagnostic — collapses to
master-only. Bad trade.
3. **Doesn't fix the real pain (SoC drift)** — closed-loop just forwards the BMS's own
(drifted) SoC to the inverter. The cure is a periodic full charge (see eg4battery
README §"SoC drift & calibration"), doable open-loop today.
### Forcing a full GRID charge (for calibration on a cloudy day)
The lever is the inverter's **output priority**, NOT the voltage thresholds: switch
POP to **SUB** (`solar_utility_battery`) so the inverter runs loads off grid AND
charges the bank to full from grid+solar; pair with `charger_priority=solar_and_utility`.
Revert to `solar_battery_utility` + `solar_first` when done. Both via lvx-control.
(Raising `re_discharge`/`stop_charge_voltage` does NOT work — firmware NAKs it.)
Automated + safety-monitored + auto-reverting in `../.claude/skills/lib/grid-cal-monitor`.
## Bring-up checklist (when a new pack goes live)
1. Wire: plug USB-FTDI adapter (stock pin-1-2 cable) into the pack's **RS485** port.

@@ -5,8 +5,9 @@ RS-485 and publishes per-pack telemetry to MQTT with HA auto-discovery.
## Status: live
All 3 packs publishing in `modbus_per_pack` mode, each on its own FTDI
RS-485 adapter. Per pack, ~70 named entities + 136 raw `register_NN` series:
All **6 packs** publishing in `modbus_per_pack` mode, each on its own FTDI
RS-485 adapter (packs 1-5 at addr `0x40`/9600; pack 6 is an oddball at addr
`0x01`/115200). Per pack, ~70 named entities + 136 raw `register_NN` series:
```
lifepower4_1_pack_voltage 52.56 V (16 cells × 3.285 V)
@@ -41,6 +42,18 @@ Set by `bus.mode` in `~/.config/eg4-battery/eg4-battery.yaml`:
See [`NOTES.md`](./NOTES.md) for architecture, register map, LVX6048
compatibility findings, and bring-up checklist.
## SoC drift & calibration
The per-pack BMS SoC is coulomb-counted and **drifts** because the conservative
LVX6048 charge profile rarely drives a true full charge, so the counters never
re-anchor to 100% (observed 8-70% spread across packs at an identical resting
voltage — they're all physically at the same charge; the spread is pure drift).
Fix is a periodic **full charge to absorption**, which re-anchors every pack to
100%. Automated by the `calibration-charge` skill (solar-only, or grid-assisted
via output-priority SUB on a cloudy day) — see
[`../.claude/skills/calibration-charge/`](../.claude/skills/calibration-charge/)
and `../.claude/skills/lib/grid-cal-monitor`.
## What's in the box
```