Add solar monitoring/troubleshooting skills for agents
Four Skill-tool skills under .claude/skills/ that let an agent monitor and
troubleshoot the install (2x LVX6048, 6x EG4 LifePower4, OpenEVSE), grounded
in the real MQTT/HA topology rather than generic advice:
- solar-health-check : whole-system sweep + cross-checks + R/Y/G verdict,
incl. cross-unit "silently-dead inverter" detection
- troubleshoot-inverter: FWS fault decode, parallel sync, USB link recovery
- troubleshoot-battery : per-pack imbalance vs SoC-counter-drift, RS485 silence
- power-usage : PV/load/grid/battery balance + EVSE sessions
Shared lib:
- solar-snapshot : live MQTT capture (creds from powermon.yaml, no hardcoding)
- ha-history : HA recorder lookback (token from ~/.config/ha/token)
REFERENCE.md documents topology, real HA entity_ids (doubled slug), known
issues, and a safe-remediation-only action policy (restarts yes; setters no).
Action boundary: diagnose + restart wedged daemons / recover USB links;
never touches inverter/battery setters or flash.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2
.claude/skills/.gitignore
vendored
Normal file
@@ -0,0 +1,2 @@
__pycache__/
*.pyc
145
.claude/skills/REFERENCE.md
Normal file
@@ -0,0 +1,145 @@
# Solar install — system map (shared reference for the solar skills)

This file is the ground truth the `solar-*` / `troubleshoot-*` / `power-usage`
skills build on. Read it once at the start of any solar task. Everything below
was verified live on this host (the monitoring Pi) on 2026-06-23; re-verify
anything load-bearing before acting on it.

## Topology

```
6× EG4 LifePower4 v2 packs ──RS485 (1 FTDI each)──┐
2× MPP Solar LVX6048 inverters ──USB-HID/PI18─────┤ this Pi ──MQTT──► HA broker
1× OpenEVSE charger (10.0.0.249) ──its own WiFi───┘ (daemons)         10.0.0.41:1883
```

All telemetry lands on the **MQTT broker at 10.0.0.41:1883** under HA
auto-discovery (`homeassistant/<class>/<entity>/config` retained, `.../state`
republished each poll cycle — **state topics are NOT retained**, so to read
current values you must listen for a window: use `lib/solar-snapshot`).

Broker credentials live in `~/.config/powermon/powermon.yaml`
(`mqttbroker.{name,port,username,password}`). **Never hardcode them** — every
tool here reads them from that file. `lib/solar-snapshot` does too.

## The snapshot helper

`./lib/solar-snapshot` (relative to this skills dir) captures the latest value of
every matching MQTT topic over a short window and prints a table. This is the
primary read tool — prefer it over raw `mosquitto_sub`.

```
solar-snapshot [-w SECONDS] [-g GREP_RE] [-f] TOPIC_FILTER...
```
MQTT `+` matches one WHOLE level, so `lifepower4_+` matches nothing. Subscribe to
`homeassistant/sensor/+/state` and narrow with `-g`:
```
solar-snapshot -g 'lvx6048_1_' 'homeassistant/sensor/+/state'
solar-snapshot -w 16 -g 'lifepower4_[1-6]_soc/' 'homeassistant/sensor/+/state'
solar-snapshot 'openevse/#' # EVSE publishes on-change; idle when unplugged
```
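
The wildcard rule above can be sketched in a few lines of Python. This is only an
illustration of MQTT filter matching, not code from `solar-snapshot`:
`topic_matches` is a hypothetical helper, it ignores the special handling of
`$`-prefixed topics, and it treats a malformed level such as `lifepower4_+` as a
literal name (a real broker may reject that subscription outright). The topic
list is made up from entity names in this file.

```python
import re


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a concrete topic.

    `+` stands for exactly one whole level; `#` (last level only) stands for
    the parent level and everything below it.
    """
    f_levels = topic_filter.split("/")
    t_levels = topic.split("/")
    for i, f in enumerate(f_levels):
        if f == "#":
            # Multi-level wildcard is only valid as the final level.
            return i == len(f_levels) - 1
        if i >= len(t_levels):
            return False
        if f != "+" and f != t_levels[i]:
            return False
    return len(f_levels) == len(t_levels)


topics = [
    "homeassistant/sensor/lifepower4_1_soc/state",
    "homeassistant/sensor/lifepower4_1_soc_alt/state",
    "homeassistant/sensor/lvx6048_1_device_mode/state",
    "openevse/power",
]

# Subscribe broadly with the level wildcard, then narrow by regex (the -g step).
level_filter = "homeassistant/sensor/+/state"
grep_re = re.compile(r"lifepower4_[1-6]_soc/")
selected = [t for t in topics if topic_matches(level_filter, t) and grep_re.search(t)]
print(selected)

# A "+" glued to a name prefix is not a wildcard, so nothing matches.
print(any(topic_matches("homeassistant/sensor/lifepower4_+/state", t) for t in topics))
```

The trailing `/` in the regex is what keeps `lifepower4_1_soc_alt` out of the
result.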

## The history helper

`solar-snapshot` only sees *now*. For "when did X last happen / show last week",
use `./lib/ha-history`, which queries **Home Assistant's recorder** (the only
store that keeps history — local journald is volatile, ~1 day, wiped on reboot;
no solar data goes to InfluxDB). Default window 7 days; HA recorder default
retention is 10 days.
```
ha-history [-s SINCE] [-e END] [-m REGEX] [-a] ENTITY...
ha-history -s "10 days ago" sensor.lvx6048_lvx6048_1_device_mode sensor.lvx6048_lvx6048_2_device_mode
ha-history -s "10 days ago" -m fault sensor.lvx6048_01_lvx6048_1_fault_code sensor.lvx6048_02_lvx6048_2_fault_code
```
**HA entity_ids ≠ MQTT object names.** powermon's hass output doubles the device
slug and is inconsistent across commands, so you must use the real ids, e.g.:
- device mode: `sensor.lvx6048_lvx6048_{1,2}_device_mode` (device slug `lvx6048`)
- fault code: `sensor.lvx6048_0{1,2}_lvx6048_{1,2}_fault_code` (slug `lvx6048_01`/`_02`)
- PV/batt/load: `sensor.lvx6048_lvx6048_1_{mppt1_input_power,mppt1_input_voltage,battery_voltage,ac_output_active_power}`
- EG4 packs follow the same doubling, e.g. `sensor.lifepower4_*`. When unsure, list
  them: `curl -s -H "Authorization: Bearer $(cat ~/.config/ha/token)" "${HA_URL:-http://10.0.0.41:8123}/api/states"
  | python3 -c 'import sys,json;[print(s["entity_id"]) for s in json.load(sys.stdin) if "lvx6048" in s["entity_id"]]'`

Auth: reads a long-lived token from `~/.config/ha/token` (mode 600) or `$HA_TOKEN`
— never on the command line, never hardcoded. Base URL `$HA_URL` else
`~/.config/ha/url` else `http://10.0.0.41:8123`. If it reports "no token", the user
must create one (HA → Profile → Security → Long-lived access tokens) and write it
to `~/.config/ha/token`; tell them which file, don't ask them to paste it in chat.
Recorder excludes (per `eg4battery/homeassistant/recorder.yaml`) drop EG4
per-cell/register/string entities — those have no history; the inverter
`device_mode`/`fault_code` and pack `soc`/`pack_voltage` etc. are recorded.

## Services (this Pi)

| Service | Role | Entities it feeds |
|---|---|---|
| `powermon.service` | LVX6048 #1 poller (PI18/USB) | `lvx6048_1_*` |
| `powermon2.service` | LVX6048 #2 poller (PI18/USB) | `lvx6048_2_*` |
| `lvx-resolve-links.service` | oneshot: maps `/dev/hidraw*` → `/dev/lvx6048-{1,2}` by PI18 serial; runs before powermon | (links) |
| `lvx-control.service` | bridges `solar/control/lvx6048/*` → powermon adhoc queue | (control) |
| `eg4-battery.service` | polls all 6 packs over RS485/Modbus | `lifepower4_1..6_*` |

Quick health: `systemctl is-active powermon.service powermon2.service eg4-battery.service lvx-control.service`
Logs: `journalctl -u <svc> --since "10 min ago" --no-pager`

## Entities cheat-sheet

**Inverters** `lvx6048_{1,2}_*` (PI18 GS/MOD/PIRI/FWS/ET):
`device_mode` (Power-On/Standby/Bypass/Battery/Fault/Charge…), `fault_code`,
`battery_voltage`, `battery_capacity` (%), `ac_output_active_power` (W),
`ac_output_voltage`, `grid_voltage`, `mppt1_input_power`/`mppt2_input_power` (W, PV),
`inverter_heat_sink_temperature`, `parallel_instance_number` (0 = master, 1+ = slave).

**Packs** `lifepower4_{1..6}_*` (Modbus): `soc`, `soc_alt`, `pack_voltage`,
`pack_current` (signed, + = charging), `cell_01..16_voltage`,
`cell_voltage_delta_mv` (imbalance), `cell_voltage_min`/`max`, `capacity_ah`,
`temperature_01..04`, `temperature_pcb`, `model`, `firmware_version`,
`firmware_date`, warning/protection bits, `register_NN` raw. There are 16 cells/pack.

**EVSE** `openevse/<key>` and `openevse_*` HA entities: `power` (W), `voltage`,
`amp` (mA raw → A in HA), `pilot`, `max_current`, `session_energy` (Wh),
`total_energy`, `status` (active/sleeping/disabled…), `state`, `temp`,
`vehicle` (plug). Charger HTTP UI at http://10.0.0.249.

Derived HA template sensors (`lifepower4_N_pack_power`, `_temperature_max`,
`_cell_imbalance_pct`, `lifepower4_stack_*`) are computed **inside HA**, not on
MQTT — compute them yourself from the raw entities when working off the Pi.

## Known issues / gotchas (check memory for the canonical versions)

- **Inverter `battery_voltage` is INTERMITTENTLY wrong** — read a correct ~54 V on
  2026-06-20 (verified via HA history), but ~9–10 V on 2026-06-23/24 after the Jun 22
  14:18 reboot, with packs steady at ~52–53 V throughout. So it's a post-reboot /
  re-init glitch (the inverter or PI18 GS field not settling after restart), NOT a
  permanent scaling bug. Implication: treat the inverter battery reading as
  untrustworthy and use the `lifepower4_*` pack entities for any battery math; if it
  reads ~10 V right now, a powermon (or inverter) restart may clear it — worth testing.
- **Pack 6 is an oddball**: Modbus addr `0x01` @ 115200 (packs 1–5 are `0x40` @
  9600); ran 65 % SoC while 1–5 sat 40–44 %. Treat as a distinct member.
- **EG4 SoC never re-anchors** (drifts because packs rarely hit 100 % to reset the
  coulomb counter). See memory `project_eg4_soc_drift_remediation`.
- **RS485 daisy-chain silences slave packs** — each pack needs its own FTDI; an
  inter-pack chain demotes slaves. See memory `project_eg4_daisy_chain_silences_slaves`.
- **No per-day inverter energy** — PI18 only gives `ET` (lifetime Wh); ED/EM/EY NAK.
  Daily kWh must come from HA recorder or ET deltas.
- **Parallel cluster**: changing inverter settings on only one unit risks fault 86
  (desync). `lvx-control` always mirrors to both — that's why setters go through it.
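
The "ET deltas" route for daily energy can be sketched as follows. This is a
minimal illustration, assuming you already have timestamped `ET` readings
(lifetime Wh) from some source such as the HA recorder; `daily_kwh_from_et` is a
hypothetical helper, and the sample readings are invented for the example.

```python
from collections import defaultdict
from datetime import datetime


def daily_kwh_from_et(samples):
    """Turn (timestamp, lifetime_wh) samples into {date: kWh}.

    Each positive step between consecutive samples is credited to the calendar
    day of the later sample; negative steps (counter reset or a bad read) are
    skipped rather than subtracted.
    """
    per_day = defaultdict(float)
    ordered = sorted(samples)
    for (_, prev_wh), (ts, wh) in zip(ordered, ordered[1:]):
        step = wh - prev_wh
        if step > 0:
            per_day[ts.date()] += step / 1000.0
    return dict(per_day)


# Invented lifetime-Wh readings, for illustration only.
samples = [
    (datetime(2026, 6, 23, 6, 0), 1_250_000),
    (datetime(2026, 6, 23, 12, 0), 1_256_500),
    (datetime(2026, 6, 23, 20, 0), 1_262_000),
    (datetime(2026, 6, 24, 12, 0), 1_268_000),
]
for day, kwh in sorted(daily_kwh_from_et(samples).items()):
    print(day, f"{kwh:.1f} kWh")
```

Crediting each step to the later sample's day keeps the sketch short; with sparse
samples around midnight the split between two days is only approximate.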

## Action policy for these skills

**Allowed (safe remediation):**
- Read anything: `solar-snapshot`, `mosquitto_sub`, `journalctl`, `systemctl status/is-active`.
- Restart the data-plane daemons when they're wedged:
  `sudo systemctl restart powermon.service` / `powermon2.service` / `eg4-battery.service` / `lvx-control.service`
- Recover inverter USB links: `sudo systemctl restart lvx-resolve-links.service`
  or `sudo /usr/local/sbin/lvx-resolve-links`.

**Forbidden (escalate to the user instead — propose the exact command, don't run it):**
- Any inverter/battery **setter**: `solar/control/lvx6048/*` publishes
  (charger priority, max charge current, output priority, …).
- `lvx-flash/flash.py apply` and `dump`/`compare`/`sync-check` — they contend for
  exclusive USB and stop powermon; advanced, user-driven only.
- Anything that writes battery thresholds, output mode, or factory resets.
- Power-cycling hardware, moving cables, breaker changes.

When a fix is outside the allowed set, report the finding and hand the user the
precise command(s) to run.
184
.claude/skills/lib/ha-history
Executable file
@@ -0,0 +1,184 @@
#!/usr/bin/env python3
"""ha-history — pull state history for HA entities from the Home Assistant
recorder, and print a compact change-point timeline. Companion to solar-snapshot
(which only sees live values); this is the historic-lookback tool.

Auth: reads a long-lived access token from ~/.config/ha/token (or $HA_TOKEN).
      No secret is ever passed on the command line or hardcoded.
Base URL: $HA_URL, else ~/.config/ha/url, else http://10.0.0.41:8123.

Usage:
  ha-history [-s SINCE] [-e END] [-m REGEX] [-a] ENTITY [ENTITY ...]

  ENTITY     entity_id; a bare name with no dot is auto-prefixed `sensor.`
             e.g. `lvx6048_1_device_mode` -> `sensor.lvx6048_1_device_mode`.
             Real HA ids double the device slug (see REFERENCE.md), so pass
             the full entity_id when the bare name does not resolve.
  -s SINCE   start of window. default "7 days ago".
             accepts: "7 days ago", "7d", "36h", "90m", an ISO timestamp,
             or a date "2026-06-16".
  -e END     end of window. default: now. same formats as -s.
  -m REGEX   only show change-points whose state matches REGEX (case-insensitive);
             the per-entity header still reports the full count. e.g. -m fault
  -a         show every recorded point, not just state *changes*.

Examples:
  ha-history sensor.lvx6048_lvx6048_1_device_mode sensor.lvx6048_lvx6048_2_device_mode
  ha-history -s "10 days ago" -m fault sensor.lvx6048_01_lvx6048_1_fault_code sensor.lvx6048_02_lvx6048_2_fault_code
  ha-history -s 2026-06-16 -e 2026-06-23 sensor.lvx6048_lvx6048_1_device_mode
"""
import sys, os, re, json, argparse, urllib.request, urllib.parse, urllib.error
from datetime import datetime, timedelta, timezone

CONF_DIR = os.path.expanduser("~/.config/ha")
DEFAULT_URL = "http://10.0.0.41:8123"


def die(msg, code=1):
    print(f"ha-history: {msg}", file=sys.stderr)
    sys.exit(code)


def load_token():
    tok = os.environ.get("HA_TOKEN")
    if tok:
        return tok.strip()
    path = os.path.join(CONF_DIR, "token")
    if not os.path.exists(path):
        die("no token. Create a Long-Lived Access Token in HA "
            "(Profile -> Security), then:\n"
            " mkdir -p ~/.config/ha && install -m600 /dev/stdin ~/.config/ha/token\n"
            "or set $HA_TOKEN.")
    with open(path) as f:
        tok = f.read().strip()
    if not tok:
        die(f"{path} is empty")
    return tok


def base_url():
    if os.environ.get("HA_URL"):
        return os.environ["HA_URL"].rstrip("/")
    p = os.path.join(CONF_DIR, "url")
    if os.path.exists(p):
        with open(p) as f:
            u = f.read().strip()
        if u:
            return u.rstrip("/")
    return DEFAULT_URL


def parse_when(s, *, default_now=False):
    if s is None:
        return datetime.now(timezone.utc).astimezone() if default_now else None
    s = s.strip()
    m = re.fullmatch(r"(\d+)\s*(d|h|m)(?:ays?|ours?|in(?:ute)?s?)?(?:\s*ago)?", s, re.I)
    if m:
        n, unit = int(m.group(1)), m.group(2).lower()
        delta = {"d": timedelta(days=n), "h": timedelta(hours=n), "m": timedelta(minutes=n)}[unit]
        return datetime.now(timezone.utc).astimezone() - delta
    # ISO timestamp or bare date
    try:
        dt = datetime.fromisoformat(s)
    except ValueError:
        die(f"can't parse time {s!r}. Use '7 days ago', '36h', ISO, or 'YYYY-MM-DD'.")
    if dt.tzinfo is None:  # assume local tz
        dt = dt.astimezone()
    return dt


def fetch(url, token):
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    try:
        with urllib.request.urlopen(req, timeout=30) as r:
            return json.load(r)
    except urllib.error.HTTPError as e:
        if e.code == 401:
            die("401 Unauthorized — token rejected. Regenerate it in HA and rewrite "
                "~/.config/ha/token.")
        die(f"HTTP {e.code} from HA: {e.reason}")
    except urllib.error.URLError as e:
        die(f"cannot reach HA at {url.split('/api')[0]}: {e.reason}")


def fmt_local(iso):
    """HA returns UTC ISO; show local time, second precision."""
    try:
        return datetime.fromisoformat(iso).astimezone().strftime("%Y-%m-%d %H:%M:%S")
    except (ValueError, TypeError):
        return str(iso)


def main():
    ap = argparse.ArgumentParser(add_help=False)
    ap.add_argument("-s", "--since", default="7 days ago")
    ap.add_argument("-e", "--end", default=None)
    ap.add_argument("-m", "--match", default=None)
    ap.add_argument("-a", "--all-points", action="store_true")
    ap.add_argument("-h", "--help", action="store_true")
    ap.add_argument("entities", nargs="*")
    a = ap.parse_args()
    if a.help or not a.entities:
        print(__doc__.strip())
        sys.exit(0 if a.help else 2)

    ents = [e if "." in e else f"sensor.{e}" for e in a.entities]
    start = parse_when(a.since)
    end = parse_when(a.end, default_now=True)
    matcher = re.compile(a.match, re.I) if a.match else None

    token = load_token()
    url = (f"{base_url()}/api/history/period/"
           f"{urllib.parse.quote(start.isoformat())}"
           f"?end_time={urllib.parse.quote(end.isoformat())}"
           f"&filter_entity_id={urllib.parse.quote(','.join(ents))}"
           f"&minimal_response&no_attributes")
    data = fetch(url, token)

    print(f"# HA history {start.strftime('%Y-%m-%d %H:%M')} -> "
          f"{end.strftime('%Y-%m-%d %H:%M')} ({base_url()})\n")

    by_id = {}
    for series in data or []:
        if series:
            by_id[series[0].get("entity_id")] = series

    for ent in ents:
        series = by_id.get(ent)
        if not series:
            print(f"{ent}\n (no recorded history in window — entity wrong, "
                  f"excluded from recorder, or purged)\n")
            continue
        # Build (time, state) points, collapsing consecutive identical states
        # unless --all-points.
        points, prev = [], object()
        for item in series:
            st = item.get("state")
            ts = item.get("last_changed") or item.get("last_updated")
            if a.all_points or st != prev:
                points.append((ts, st))
            prev = st
        shown = [(ts, st) for ts, st in points if not matcher or matcher.search(str(st))]
        label = "points" if a.all_points else "changes"
        extra = f", {len(shown)} match /m" if matcher else ""
        print(f"{ent} ({len(points)} {label}{extra})")
        if not shown:
            print(" (nothing matched)\n")
            continue
        for i, (ts, st) in enumerate(shown):
            mark = " <<< FAULT" if re.search(r"fault", str(st), re.I) and st not in ("No fault",) else ""
            # duration until next change-point in the *full* timeline
            dur = ""
            if not matcher:
                nxt = points[i + 1][0] if i + 1 < len(points) else None
                if nxt:
                    try:
                        d = datetime.fromisoformat(nxt) - datetime.fromisoformat(ts)
                        secs = int(d.total_seconds())
                        dur = f" ({secs//3600}h{secs%3600//60:02d}m)" if secs >= 3600 else f" ({secs//60}m)"
                    except (ValueError, TypeError):
                        pass
            print(f" {fmt_local(ts)} {st}{dur}{mark}")
        print()


if __name__ == "__main__":
    main()
97
.claude/skills/lib/solar-snapshot
Executable file
@@ -0,0 +1,97 @@
#!/usr/bin/env bash
# solar-snapshot — capture the latest retained/published value of every MQTT
# topic matching a filter, over a short listen window, and print a clean table.
#
# Why a listen window: powermon/eg4-battery STATE topics are NOT retained — they
# are republished every poll cycle (GS ~5s, packs ~one cycle, EVSE on-change).
# So we subscribe for a few seconds and keep the last value seen per topic.
# (HA discovery `.../config` topics ARE retained and show up immediately.)
#
# Broker credentials are read from ~/.config/powermon/powermon.yaml (the same
# source the openevse + lvx-control tools use) so nothing is hardcoded here.
#
# NOTE on MQTT wildcards: `+` matches exactly ONE whole level, so it cannot be
# used as a name prefix. `homeassistant/sensor/lifepower4_+/state` matches NOTHING.
# To grab a family of entities, subscribe to the level wildcard and filter with -g:
#   solar-snapshot -g lifepower4 'homeassistant/sensor/+/state'
#
# Usage:
#   solar-snapshot [-w SECONDS] [-f] [-g GREP_RE] TOPIC_FILTER [TOPIC_FILTER ...]
#     -w SECONDS   listen window (default 12)
#     -f           print full topic path (default: strip homeassistant/<class>/ prefix)
#     -g GREP_RE   keep only topics whose path matches this extended-regex
#
# Examples:
#   solar-snapshot -g 'lvx6048_1' 'homeassistant/sensor/+/state'
#   solar-snapshot -w 18 -g 'lifepower4_[1-6]_soc' 'homeassistant/sensor/+/state'
#   solar-snapshot 'openevse/#'
#   solar-snapshot -w 6 'homeassistant/sensor/lvx6048_1_battery_voltage/state' \
#                       'homeassistant/sensor/lifepower4_1_pack_voltage/state'
#
# Exit status reflects the formatting stage, not mosquitto_sub's benign -W
# window-expiry code, so callers don't misread a normal capture as a failure.
set -eu

WINDOW=12
FULL=0
GREP_RE=""
while getopts "w:fg:" opt; do
  case "$opt" in
    w) WINDOW="$OPTARG" ;;
    f) FULL=1 ;;
    g) GREP_RE="$OPTARG" ;;
    *) echo "usage: solar-snapshot [-w SECONDS] [-f] [-g GREP_RE] TOPIC_FILTER..." >&2; exit 2 ;;
  esac
done
shift $((OPTIND - 1))
if [ "$#" -lt 1 ]; then
  echo "usage: solar-snapshot [-w SECONDS] [-f] [-g GREP_RE] TOPIC_FILTER..." >&2
  exit 2
fi

CONF="${POWERMON_CONF:-$HOME/.config/powermon/powermon.yaml}"
if [ ! -r "$CONF" ]; then
  echo "solar-snapshot: cannot read broker config $CONF" >&2
  exit 1
fi

# Pull host/port/user/pass from the mqttbroker: block of powermon.yaml.
# Keys are anchored to leading whitespace + exact key so `name:` doesn't also
# match `username:`.
read -r HOST PORT USER PASS < <(awk '
  /^[^[:space:]]/ { inblk=0 }
  /^mqttbroker:/ { inblk=1; next }
  inblk && /^[[:space:]]+name:/ { h=$2 }
  inblk && /^[[:space:]]+port:/ { p=$2 }
  inblk && /^[[:space:]]+username:/ { u=$2 }
  inblk && /^[[:space:]]+password:/ { w=$2 }
  END { print h, (p?p:1883), u, w }
' "$CONF")

if [ -z "${HOST:-}" ]; then
  echo "solar-snapshot: no mqttbroker.name found in $CONF" >&2
  exit 1
fi

# Build -t args from filters.
TARGS=()
for f in "$@"; do TARGS+=(-t "$f"); done

# Subscribe for the window, then reduce to last-value-per-topic.
timeout "$((WINDOW + 2))" mosquitto_sub -h "$HOST" -p "$PORT" -u "$USER" -P "$PASS" \
    -W "$WINDOW" -v "${TARGS[@]}" 2>/dev/null \
  | { [ -n "$GREP_RE" ] && grep -E "$GREP_RE" || cat; } \
  | awk -v full="$FULL" '
    { t=$1; $1=""; sub(/^ /,""); v=$0; last[t]=v; order[t]=NR }
    END {
      n=0
      for (t in last) { keys[n++]=t }
      # stable-ish sort by topic name
      for (i=0;i<n;i++) for (j=i+1;j<n;j++) if (keys[j]<keys[i]) { tmp=keys[i];keys[i]=keys[j];keys[j]=tmp }
      for (i=0;i<n;i++) {
        t=keys[i]; disp=t
        if (!full) { sub(/^homeassistant\/[^/]+\//,"",disp); sub(/\/state$/,"",disp) }
        printf "%-44s %s\n", disp, last[t]
      }
      if (n==0) print "(no messages in window — topics idle, broker unreachable, or filter wrong)"
    }'
71
.claude/skills/power-usage/SKILL.md
Normal file
@@ -0,0 +1,71 @@
---
name: power-usage
description: >-
  Analyze where the power is going across the install — load vs PV generation vs
  grid vs battery flow, plus EVSE charging sessions. Use when the user asks "why is
  my battery draining / how much am I using / where are the watts going / is the car
  charging / what's my solar production / power consumption", or wants an energy
  balance or breakdown. Read-only; this skill measures and explains, it does not
  change anything.
---

# power-usage

## 0. Load context
Shell cwd is the repo root; anchor paths there:
```bash
ROOT="$(git rev-parse --show-toplevel)"; SNAP="$ROOT/.claude/skills/lib/solar-snapshot"
```
Read `$ROOT/.claude/skills/REFERENCE.md` for entity names. Key sign conventions: pack
`pack_current` is signed (**+ = charging, − = discharging**); inverter
`mppt*_input_power` is PV in (W); `ac_output_active_power` is load out (W).

## 1. Instantaneous energy balance
```bash
# Generation (PV) + load, per inverter:
"$SNAP" -w 10 -g 'lvx6048_[12]_(mppt1_input_power|mppt2_input_power|ac_output_active_power|grid_voltage|device_mode)/' 'homeassistant/sensor/+/state'
# Battery flow, per pack (sum the pack_power = V×I yourself):
"$SNAP" -w 16 -g 'lifepower4_[1-6]_(pack_voltage|pack_current|soc)/' 'homeassistant/sensor/+/state'
# EV charger:
"$SNAP" -w 8 'openevse/status' 'openevse/power' 'openevse/amp' 'openevse/voltage' 'openevse/session_energy'
```
Then state the balance in words:
- **PV in** = sum of all `mppt*_input_power`.
- **Battery** = sum of (pack_voltage × pack_current) over 6 packs. Negative total =
  discharging (load exceeds PV+grid); positive = charging.
- **Load out** = sum of inverter `ac_output_active_power`.
- **EVSE** = `openevse/power` — and the EVSE load is a *subset* of total load, so a
  draining battery with the car plugged usually explains itself here.
- **Grid**: `device_mode` Bypass/Line means grid is carrying/supplementing; Battery
  mode means running off the bank. The LVX6048 has no clean grid-power entity, so
  infer grid = load − PV − battery_discharge.

Sanity: PV + grid + battery_discharge ≈ load (within metering noise). A big residual
means one feed is mis-reported — note it (e.g. the known `lvx6048_1_battery_voltage`
~10 V glitch will corrupt any pack-power math that uses the *inverter's* battery
reading; always use the **pack** entities for battery flow).
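
The balance arithmetic above can be sketched in Python. This is a minimal
illustration under stated assumptions, not part of the skill: `energy_balance` is
a hypothetical helper, the readings are invented, and conversion losses are
ignored, so expect some residual even when every feed is correct.

```python
def energy_balance(pv_w, load_w, packs):
    """Return the instantaneous balance in watts.

    pv_w   -- sum of all mppt*_input_power readings
    load_w -- sum of both inverters' ac_output_active_power
    packs  -- iterable of (pack_voltage, pack_current); + current = charging
    """
    battery_w = sum(v * i for v, i in packs)       # + = charging, - = discharging
    battery_discharge_w = -battery_w
    grid_w = load_w - pv_w - battery_discharge_w   # inferred, not measured
    return {
        "pv_w": pv_w,
        "load_w": load_w,
        "battery_w": battery_w,
        "grid_w": grid_w,
    }


# Invented readings: six packs each discharging 5 A at 53 V.
packs = [(53.0, -5.0)] * 6
balance = energy_balance(pv_w=1200.0, load_w=3000.0, packs=packs)
for name, watts in balance.items():
    print(f"{name:<10} {watts:8.1f} W")
```

Battery power is built only from the pack entities, in line with the warning about
the inverter's own `battery_voltage` reading.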

## 2. "Why is the battery draining?"
Walk the chain: is PV low (night, shade, or a dead string; see §5 of
troubleshoot-inverter)? Is load high (check `ac_output_active_power` and EVSE
`power`)? Is the inverter in Battery mode instead of using grid (`device_mode`)?
Pin the drain on the largest negative contributor and say which.

## 3. EVSE sessions
```bash
"$SNAP" -w 10 'openevse/status' 'openevse/state' 'openevse/session_energy' 'openevse/total_energy' 'openevse/vehicle' 'openevse/pilot' 'openevse/max_current'
```
- `status` active = charging; sleeping/disabled = not drawing. `vehicle` = plugged.
- `session_energy` (Wh) this plug-in; `pilot`/`max_current` = the current cap the
  EVSE is signalling. Idle EVSE publishes little — a short empty capture is normal.
- For history/trends (daily kWh, past sessions), the data lives in **Home Assistant's
  recorder**, not on MQTT — direct the user to the HA Energy dashboard /
  `sensor.openevse_total_day|week|month`. PI18 has no per-day inverter energy
  (memory `project_lvx6048_no_daily_energy_query`); only `ET` lifetime Wh exists.

## 4. Report
Give the live balance (PV / load / battery / grid / EVSE, with numbers and signs),
the headline ("you're pulling X W from the bank because load Y W > PV Z W, car is
taking W W"), and point at HA recorder for anything historical. This skill never
changes settings — if the answer is "shift charging to solar hours" etc., suggest it
as advice, don't actuate.
103
.claude/skills/solar-health-check/SKILL.md
Normal file
@@ -0,0 +1,103 @@
---
name: solar-health-check
description: >-
  Top-level health snapshot of the whole solar/power install — 2 LVX6048
  inverters, 6 EG4 LifePower4 packs, and the OpenEVSE charger — with cross-checks
  and a green/yellow/red verdict. Use when the user asks "how's the solar /
  battery / power system doing", "is everything ok", "check the install", wants a
  status report, or as the first step before deeper troubleshooting. For deep
  dives into one subsystem, hand off to troubleshoot-inverter, troubleshoot-battery,
  or power-usage.
---

# solar-health-check

A fast, read-only sweep of every subsystem that ends in a clear verdict. Do NOT
change settings here; if something needs a restart, that's allowed (see policy).

## 0. Load context
Skills run with the shell cwd at the repo root, so anchor paths there:
```bash
ROOT="$(git rev-parse --show-toplevel)"; SNAP="$ROOT/.claude/skills/lib/solar-snapshot"; HIST="$ROOT/.claude/skills/lib/ha-history"
```
Read `$ROOT/.claude/skills/REFERENCE.md` (system map, entity names, snapshot helper,
action policy) before proceeding.

## 1. Services up?
```bash
systemctl is-active powermon.service powermon2.service eg4-battery.service lvx-control.service lvx-resolve-links.service
```
`lvx-resolve-links` is a oneshot → expect `active`/`exited` (not `failed`). Any
`failed`/`inactive` on the others is RED. For a wedged data-plane daemon, a
restart is allowed (see §6).

## 2. Capture live telemetry
```bash
"$SNAP" -w 10 -g 'lvx6048_[12]_(device_mode|fault_code|battery_voltage|battery_capacity|ac_output_active_power|mppt1_input_power|mppt2_input_power|grid_voltage|inverter_heat_sink_temperature|parallel_instance_number)/' 'homeassistant/sensor/+/state'
"$SNAP" -w 16 -g 'lifepower4_[1-6]_(soc|pack_voltage|pack_current|cell_voltage_delta_mv|temperature_pcb)/' 'homeassistant/sensor/+/state'
"$SNAP" -w 6 'openevse/status' 'openevse/amp' 'openevse/power' 'openevse/session_energy'
```
If a family returns "(no messages)": the feeding daemon is silent → that subsystem
is RED regardless of `is-active` (running but not publishing). EVSE idle/unplugged
publishing nothing is normal — confirm via `openevse/status`.
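
The rule above (a daemon can be `active` and still silent) can be written down as
a small decision function. This is a sketch under stated assumptions:
`subsystem_verdict` and its parameters are hypothetical names, and "OK" only
means this one check passed, not that the subsystem is GREEN overall.

```python
def subsystem_verdict(messages_seen, service_state=None, idle_ok=False):
    """Combine `systemctl is-active` output with what the capture returned.

    messages_seen -- number of topics captured for this family
    service_state -- string printed by `systemctl is-active`, or None when the
                     source has no daemon on this Pi (the EVSE)
    idle_ok       -- True for sources that legitimately go quiet when idle
    """
    if service_state in ("failed", "inactive"):
        return "RED"
    if messages_seen == 0 and not idle_ok:
        return "RED"  # running but not publishing
    return "OK"


print(subsystem_verdict(0, service_state="active"))   # silent daemon
print(subsystem_verdict(0, idle_ok=True))              # unplugged EVSE
print(subsystem_verdict(12, service_state="active"))   # publishing normally
```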

## 3. Cross-checks (this is the value-add: single sensors can each look fine)
- **Battery voltage agreement**: each inverter's `battery_voltage` should be within
  ~1 V of the pack stack voltage (`pack_voltage` ≈ 51–55 V). **Known anomaly:** the
  inverter reading is *intermittently* wrong (correct ~54 V on 2026-06-20, ~9–10 V
  after the 2026-06-22 reboot); this is a post-reboot glitch, not a permanent bug. If it
  reads ~10 V, note it and suggest a powermon restart; use the `lifepower4_*` pack entities,
  never the inverter reading, for any battery math (see REFERENCE known-issues).
- **Cross-unit PV production (catches a silently-dead inverter)**: compare
  `lvx6048_1_mppt1_input_power` vs `lvx6048_2_mppt1_input_power`. In daylight (the
  *other* unit clearly producing), one unit pinned at **0 W** = that inverter is down
  and being masked by its sibling: RED → troubleshoot-inverter. This is exactly the
  2026-06-20 fault-08 failure mode (unit 1 sat at 0 W for ~1.8 days). At night or in
  heavy shade, both at 0 W is normal.
- **SoC spread across packs**: `max(soc) - min(soc)` over the 6 packs. BUT first
  cross-check against `pack_voltage`/`cell_voltage_max`: the packs are paralleled, so
  if all `pack_voltage` agree (±0.1 V) the packs are physically at the same charge and
  any SoC spread is **counter drift**, not real imbalance (pack 6 ran 76 % while
  reading the same 53.4 V / 3.337 V/cell as packs at 50–55 % on 2026-06-24). Real
  imbalance = pack voltages actually diverge. Drift → note it, recommend a calibration
  charge; >20 % spread with diverging voltages = RED → troubleshoot-battery.
- **Cell imbalance**: any pack with `cell_voltage_delta_mv` > 50 = YELLOW, > 100 = RED.
- **Parallel master/slave**: exactly one inverter should report
  `parallel_instance_number` 0 (master); the other 1+. Two masters or two slaves = RED.
- **Faults**: any `fault_code` non-zero, or `device_mode` = Fault = RED → troubleshoot-inverter.
- **Temps**: pack `temperature_pcb` > 55 °C or inverter heat-sink > 75 °C = YELLOW.
- **Power balance sanity**: PV in (`mppt*_input_power`) vs AC out vs pack current
  (`pack_current`) should roughly balance. Gross mismatch = investigate via power-usage.
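The cross-unit PV check in the list above can be sketched as follows. This is a hedged illustration: `pv_cross_check` is a hypothetical helper, and the 200 W floor for "clearly producing" is an assumed default, not a value from the skill.

```bash
# Hypothetical helper: compare the two units' mppt1_input_power (watts).
# Third argument: minimum watts that counts as "clearly producing"
# (assumed default 200; tune for the array).
pv_cross_check() {
  awk -v p1="$1" -v p2="$2" -v min="${3:-200}" 'BEGIN {
    p1 += 0; p2 += 0; min += 0
    if (p1 == 0 && p2 >= min) print "RED unit1"
    else if (p2 == 0 && p1 >= min) print "RED unit2"
    else print "OK"
  }'
}
```

`pv_cross_check 0 1500` prints `RED unit1`; `pv_cross_check 0 0` prints `OK`, which matches the night or heavy-shade case.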

## 4. Verdict
Print a compact table (subsystem → state → one-line reason), then an overall
GREEN / YELLOW / RED with the top 1–3 issues and which deeper skill to run.
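One way to derive the overall colour is "worst state wins". The sketch below assumes that rule (the skill does not spell out the reduction) and uses a hypothetical helper name.

```bash
# Hypothetical helper: reduce per-subsystem states (one per argument)
# to the worst one. Assumes RED > YELLOW > GREEN.
overall_verdict() {
  local worst=GREEN s
  for s in "$@"; do
    case "$s" in
      RED) worst=RED ;;
      YELLOW) [ "$worst" = RED ] || worst=YELLOW ;;
    esac
  done
  echo "$worst"
}
```

`overall_verdict GREEN YELLOW GREEN` prints `YELLOW`; a single RED anywhere makes the whole sweep RED.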

## 5. Recent error scan (only if anything looked off)
```bash
for s in powermon powermon2 eg4-battery lvx-control; do
  echo "== $s =="; journalctl -u "$s.service" --since "15 min ago" --no-pager | grep -iE 'error|timeout|fail|crc|nak|reconnect' | tail -5
done
```

## 5b. Historical sanity: did anything fail while unattended? (needs HA token)
Live snapshots miss faults that already cleared and silent-unit spells that ended.
If `~/.config/ha/token` exists (see REFERENCE), scan the recorder for the last few
days. Use the REAL HA entity_ids (doubled slug, see REFERENCE), not MQTT names:
```bash
"$HIST" -s "5 days ago" -m fault sensor.lvx6048_01_lvx6048_1_fault_code sensor.lvx6048_02_lvx6048_2_fault_code
# silent-unit hunt: sample midday PV on both units across recent days; one pinned at 0
# while the other produced means it was down. E.g. check a midday window per day:
"$HIST" -s "2 days ago" sensor.lvx6048_lvx6048_1_mppt1_input_power sensor.lvx6048_lvx6048_2_mppt1_input_power | head -40
```
Any fault-08 / silent-unit episode → report with timestamps and hand off to
troubleshoot-inverter §2–§5. No token → say so and point the user at REFERENCE to add one.

## 6. Allowed remediation
If a daemon is `failed` or running-but-silent, restarting it is permitted:
```bash
sudo systemctl restart eg4-battery.service   # or powermon / powermon2 / lvx-control
```
Re-run the relevant snapshot to confirm data resumes. Anything beyond a restart
(settings, flash, cabling) → report and hand the user the exact command. Never
publish to `solar/control/lvx6048/*` from this skill.
77
.claude/skills/troubleshoot-battery/SKILL.md
Normal file
@@ -0,0 +1,77 @@
---
name: troubleshoot-battery
description: >-
  Diagnose the 6× EG4 LifePower4 v2 battery packs: per-pack SoC, cell imbalance,
  SoC drift, RS485/Modbus comms silence, temperature, and warning/protection bits.
  Use when a pack reads oddly, packs disagree on SoC, a pack stopped reporting,
  cells look imbalanced, the user mentions "battery problem / pack down / SoC wrong
  / imbalance / one battery", or after solar-health-check flags the stack. Read-only
  plus a safe eg4-battery daemon restart; never writes BMS settings.
---

# troubleshoot-battery

## 0. Load context
Shell cwd is the repo root; anchor paths there:
```bash
ROOT="$(git rev-parse --show-toplevel)"; SNAP="$ROOT/.claude/skills/lib/solar-snapshot"
```
Read `$ROOT/.claude/skills/REFERENCE.md`. There are **6 packs** `lifepower4_1..6_*`,
all served by `eg4-battery.service` (one FTDI RS485 adapter per pack). Pack config:
`~/.config/eg4-battery/eg4-battery.yaml`.

## 1. Are all 6 packs reporting?
```bash
systemctl is-active eg4-battery.service
"$SNAP" -w 16 -g 'lifepower4_[1-6]_(soc|pack_voltage|pack_current)/' 'homeassistant/sensor/+/state'
```
- Fewer than 6 packs in the output → a pack is **silent on RS485**, go to §4.
- All 6 present → go to §2/§3.

## 2. SoC spread & drift
```bash
"$SNAP" -w 16 -g 'lifepower4_[1-6]_(soc|soc_alt|pack_voltage)/' 'homeassistant/sensor/+/state'
```
- Compute `max(soc) - min(soc)`. >10 % = imbalance worth noting; >20 % = significant.
  **Pack 6 historically runs high** (it's the oddball: Modbus addr `0x01`/115200 vs
  `0x40`/9600 for packs 1–5); judge it on its own, don't assume it tracks 1–5.
- **SoC drift is a known design limitation**, not a live fault: the coulomb counter
  never re-anchors because the bank rarely reaches 100 % to reset. See memory
  `project_eg4_soc_drift_remediation`. If SoC looks wrong but `pack_voltage` is
  sane, suspect drift, not a dead pack. Voltage→SoC sanity for LFP at rest:
  ~51.2 V ≈ low, ~53.5 V ≈ mid, ~54+ V ≈ high (loaded/charging skews this).
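The spread-versus-drift reasoning can be sketched in awk. This is an illustration under stated assumptions: `soc_spread` is a hypothetical helper, its input format (`<pack> <soc_%> <pack_voltage_V>` per line) is invented for the example, and "voltages agree" is taken as a max-min window of 0.2 V, i.e. the ±0.1 V figure used by solar-health-check.

```bash
# Hypothetical helper: stdin has one line per pack:
#   <pack> <soc_percent> <pack_voltage_volts>
# Prints the SoC spread, the voltage spread, and a rough verdict.
soc_spread() {
  awk '
    {
      s = $2 + 0; v = $3 + 0
      if (NR == 1) { smin = smax = s; vmin = vmax = v }
      if (s < smin) smin = s
      if (s > smax) smax = s
      if (v < vmin) vmin = v
      if (v > vmax) vmax = v
    }
    END {
      if (NR == 0) { print "no data"; exit }
      spread = smax - smin
      vspread = vmax - vmin
      # Small epsilon guards against float noise at the 0.2 V boundary.
      if (spread <= 10) verdict = "ok"
      else if (vspread <= 0.2 + 1e-9) verdict = "drift"
      else if (spread > 20) verdict = "significant"
      else verdict = "imbalance"
      printf "spread=%g%% vspread=%.2fV %s\n", spread, vspread, verdict
    }'
}
```

Fed three packs at 50 %, 55 % and 76 % that all read 53.4 V, it reports a 26 % spread with 0.00 V voltage spread and the verdict `drift`, the pack 6 pattern described above.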

## 3. Cell imbalance, temperature, protection bits
```bash
"$SNAP" -w 16 -g 'lifepower4_[1-6]_(cell_voltage_delta_mv|cell_voltage_min|cell_voltage_max|temperature_pcb)/' 'homeassistant/sensor/+/state'
# For a specific suspect pack, pull all 16 cells + bits (pack 3 shown; substitute the pack number):
"$SNAP" -w 14 -g 'lifepower4_3_(cell_[0-9]+_voltage|warning|protection|temperature)' 'homeassistant/sensor/+/state'
```
- `cell_voltage_delta_mv`: <30 mV good, 30–50 mV watch, >50 mV imbalanced, >100 mV
  bad (a weak/failing cell, or the pack simply needs a long absorb to balance).
- Any warning/protection bit set → read its name; over-/under-voltage,
  over-temp, and over-current protections will also explain a pack dropping current.
- Temps: `temperature_pcb` > 55 °C = watch.
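The `cell_voltage_delta_mv` bands above map directly onto a small classifier. `classify_delta_mv` is a hypothetical helper name; the thresholds are the ones listed in this section.

```bash
# Hypothetical helper: classify a pack's cell_voltage_delta_mv reading
# using the bands from this section.
classify_delta_mv() {
  awk -v d="$1" 'BEGIN {
    d += 0
    if (d > 100) print "bad"
    else if (d > 50) print "imbalanced"
    else if (d >= 30) print "watch"
    else print "good"
  }'
}
```

`classify_delta_mv 12` prints `good` and `classify_delta_mv 120` prints `bad`.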

## 4. RS485 / Modbus comms silence (a pack missing from §1)
```bash
journalctl -u eg4-battery.service --since "15 min ago" --no-pager | grep -iE 'timeout|crc|nak|error|no response|pack|addr' | tail -30
ls -l /dev/serial/by-id/ | grep -i ft232      # all 6 FTDI adapters enumerated?
grep -E 'name:|port:|address|baud' ~/.config/eg4-battery/eg4-battery.yaml
```
- Missing FTDI under `by-id` → USB/adapter/cable issue for that pack (hardware →
  report to user; don't unplug things yourself).
- FTDI present but pack times out → check it isn't demoted by an inter-pack
  daisy-chain (memory `project_eg4_daisy_chain_silences_slaves`: each pack must be on
  its own dongle; a chain silences slaves). Also confirm the pack's `address`/`baud`
  in config match the unit (pack 6 legitimately differs).
- Daemon wedged after a USB re-enumerate → restart is ALLOWED:
  ```bash
  sudo systemctl restart eg4-battery.service
  ```
  Then re-run §1 to confirm all 6 return.
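The "all 6 FTDI adapters enumerated?" question can be turned into a count. A minimal sketch, assuming the adapters' `by-id` names contain `ft232` (the same assumption the `grep` above makes). `count_ftdi` and `check_ftdi` are hypothetical helpers, and the directory is a parameter so the check can be run against a test tree.

```bash
# Hypothetical helpers: count FTDI adapters under a by-id directory and
# compare against the expected number of packs (default 6).
count_ftdi() {
  local dir="${1:-/dev/serial/by-id}"
  # grep -c prints 0 and exits 1 when nothing matches; keep the exit clean.
  ls "$dir" 2>/dev/null | grep -ci 'ft232' || true
}

check_ftdi() {
  local n expected="${2:-6}"
  n="$(count_ftdi "$1")"
  if [ "$n" -lt "$expected" ]; then
    echo "MISSING: $n of $expected FTDI adapters enumerated"
  else
    echo "OK: $n FTDI adapters enumerated"
  fi
}
```

Run with no arguments, `check_ftdi` inspects `/dev/serial/by-id`. A `MISSING` result is a hardware finding to report, not something to fix from the skill.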

## 5. Report
Per-pack table (SoC, voltage, current, delta_mv, max temp, any bits) + the stack
spread, whether it's drift vs a real fault, comms state, what you restarted, and any
hardware action left for the user. Do not write BMS registers or thresholds.
113
.claude/skills/troubleshoot-inverter/SKILL.md
Normal file
@@ -0,0 +1,113 @@
---
name: troubleshoot-inverter
description: >-
  Diagnose the 2× MPP Solar LVX6048 inverters: faults/warnings (FWS), operating
  mode, parallel-cluster master/slave sync, PV/MPPT input, USB-HID link loss, and
  powermon daemon health. Use when an inverter shows a fault, is in the wrong mode,
  stopped publishing, the two units disagree, PV looks low, or the user says
  "inverter problem / fault code / no solar / one inverter is down". Read-only plus
  safe link/daemon recovery; never changes inverter settings.
---

# troubleshoot-inverter

## 0. Load context
Shell cwd is the repo root; anchor paths there:
```bash
ROOT="$(git rev-parse --show-toplevel)"; SNAP="$ROOT/.claude/skills/lib/solar-snapshot"; HIST="$ROOT/.claude/skills/lib/ha-history"
```
Read `$ROOT/.claude/skills/REFERENCE.md`. Inverter entities are `lvx6048_1_*`
(powermon.service) and `lvx6048_2_*` (powermon2.service).

## 1. Is it a data problem or a device problem?
```bash
systemctl is-active powermon.service powermon2.service lvx-resolve-links.service
ls -l /dev/lvx6048-1 /dev/lvx6048-2          # symlinks present? point at hidraw?
"$SNAP" -w 12 -g 'lvx6048_[12]_(device_mode|fault_code|battery_voltage|ac_output_active_power)/' 'homeassistant/sensor/+/state'
```
- Service active + data flowing → **device/config** issue, go to §2.
- Service active but a unit's entities are silent, or a `/dev/lvx6048-*` symlink is
  missing/dangling → **USB-HID link** issue, go to §4.

## 2. Faults & mode
```bash
"$SNAP" -w 12 -g 'lvx6048_[12]_(device_mode|fault_code|inverter_heat_sink_temperature)/' 'homeassistant/sensor/+/state'
journalctl -u powermon.service -u powermon2.service --since "20 min ago" --no-pager | grep -iE 'fault|warn|FWS|mode|error' | tail -20
```
- `device_mode` values: Power-On / Standby / Bypass / Battery / Line / Charge / Fault.
  `Bypass`/`Line` = passing grid through (normal when grid present + low PV).
  `Fault` = stop, decode the fault.
- The FWS fault/warning bit → label mapping lives in the patched driver
  `$ROOT/LVX6048/powermon-patches/pi18.py` (search `FWS`, `fault`, `warning`).
  Read it to translate a raw `fault_code`. MOD code labels are there too.
  Quick refs: 02 over-temp, 03/04 battery V high/low, 07 overload timeout,
  08 bus voltage too high, 56 battery connection open, 71 parallel version
  different, 80–86 parallel-cluster faults (86 = output setting mismatch).
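The quick refs above fit in a lookup function. This sketch covers only the codes listed in this section; it is not the driver's full table, `fault_label` is a hypothetical name, and anything it does not know should still be decoded from `pi18.py`.

```bash
# Hypothetical helper: label a fault_code using only the quick refs above.
# Accepts the code with or without a leading zero.
fault_label() {
  case "$1" in
    2|02) echo "over-temp" ;;
    3|03) echo "battery voltage high" ;;
    4|04) echo "battery voltage low" ;;
    7|07) echo "overload timeout" ;;
    8|08) echo "bus voltage too high" ;;
    56) echo "battery connection open" ;;
    71) echo "parallel version different" ;;
    86) echo "parallel-cluster fault: output setting mismatch" ;;
    80|81|82|83|84|85) echo "parallel-cluster fault" ;;
    *) echo "not in quick refs: look it up in pi18.py" ;;
  esac
}
```

`fault_label 08` prints `bus voltage too high`, the code behind the 2026-06-20 episode.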

### Historic faults: "when did it last happen / show me last week"
Local logs only reach the last reboot (`journalctl` here is volatile, ~1 day), so
for anything older query HA's recorder via `ha-history` (needs `~/.config/ha/token`,
see REFERENCE; if absent, tell the user how to create it, don't block).
Use the REAL HA entity_ids (NOT the MQTT object names; see REFERENCE. The slug is
doubled and differs per command):
```bash
"$HIST" -s "10 days ago" sensor.lvx6048_lvx6048_1_device_mode sensor.lvx6048_lvx6048_2_device_mode
"$HIST" -s "10 days ago" -m fault sensor.lvx6048_01_lvx6048_1_fault_code sensor.lvx6048_02_lvx6048_2_fault_code
```
Each `Fault`/non-`No fault` change-point prints with a local timestamp and how long
it lasted (marked `<<< FAULT`). To pin the cause, re-query the same window with `-a`
for the surrounding conditions (`mppt1_input_voltage`, `mppt1_input_power`,
`battery_voltage`, `ac_output_active_power`):
- **Normal PV V (~300 V) + normal battery (~54 V) at the fault** → it's an internal
  DC-bus transient, NOT input/battery over-voltage (rules out the cold-Voc theory).
- **Fault on ONE unit only, repeatedly** → unit-specific weakness (bus regulation /
  cap / sensor / slave-CPU FW), not environmental (which hits both paralleled units).
- **`mppt*_input_power` flatlines at 0 after the fault and stays there** → that unit
  silently stopped producing; check it didn't sit dead until a reboot (a real 2026-06-20
  occurrence: unit 1 fault 08 at 17:20 → 0 W PV until the 2026-06-22 reboot, ~half the
  array offline ~1.8 days while unit 2 masked it). Cross-check the *other* unit's PV
  over the same span to catch this.

## 3. Parallel-cluster sync
The two units run paralleled; desync throws **fault 86**.
```bash
"$SNAP" -w 12 -g 'lvx6048_[12]_(parallel_instance_number|ac_output_active_power|ac_output_voltage)/' 'homeassistant/sensor/+/state'
```
- Exactly one unit = `parallel_instance_number` 0 (master); other ≥1. Two masters
  (both 0) or two slaves (both ≥1) → cluster confused → likely needs a coordinated
  power cycle (user action: propose it, don't do it).
- Healthy parallel = both units sharing load roughly symmetrically
  (`ac_output_active_power` comparable). One at ~0 W while the other carries
  everything = that unit dropped out of the cluster.
- Deeper sync/settings comparison via `$ROOT/LVX6048/lvx-flash/flash.py sync-check` /
  `compare` exists but **stops powermon and grabs the USB**, so it is advanced, user-run
  only. Propose the command; don't execute.
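The master/slave rule from the first bullet can be checked mechanically. A minimal sketch with a hypothetical helper name; it assumes both `parallel_instance_number` values arrive as plain integers.

```bash
# Hypothetical helper: given both units' parallel_instance_number,
# confirm exactly one master (0) and one slave (>= 1).
parallel_check() {
  local a="$1" b="$2"
  if [ "$a" -eq 0 ] && [ "$b" -eq 0 ]; then
    echo "RED two masters"
  elif [ "$a" -ne 0 ] && [ "$b" -ne 0 ]; then
    echo "RED no master"
  else
    echo "OK"
  fi
}
```

`parallel_check 0 1` prints `OK`; `parallel_check 0 0` prints `RED two masters`. Either RED result is a finding to report, since the fix is a user-run power cycle.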

## 4. USB-HID link recovery (ALLOWED)
Symptom: one unit's entities stale/absent, or `/dev/lvx6048-*` missing/dangling.
The resolver maps hidraw→stable symlink by PI18 serial and restarts powermon.
```bash
sudo systemctl restart lvx-resolve-links.service   # remaps + restarts powermon{,2}
# or run the resolver directly:
sudo /usr/local/sbin/lvx-resolve-links
ls -l /dev/lvx6048-*                                # both symlinks back?
sudo systemctl restart powermon.service powermon2.service   # if still wedged
```
Then re-run the §1 snapshot to confirm both units publish again. Note: after an
inverter power-cycle the udev rule normally self-heals; a manual restart is the
fallback when it doesn't.

## 5. PV / MPPT looks low
```bash
"$SNAP" -w 12 -g 'lvx6048_[12]_(mppt1_input_power|mppt2_input_power|device_mode)/' 'homeassistant/sensor/+/state'
```
Cross-reference array memory (`project_solar_array_config`): 9s2p per inverter, and
there's a **suspected dead string per inverter**. One MPPT reading ~half the other,
or roughly half nameplate in full sun, is consistent with that and worth confirming
at the combiner; it is not a software bug. Zero PV at night or in heavy shade is normal.
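The "one MPPT reading ~half the other" pattern can be flagged with a ratio test. This is a sketch under assumptions: `mppt_ratio_check` is a hypothetical helper, and both the 500 W floor (below which light is too low to judge) and the 0.6 ratio cutoff are illustrative defaults, not values from the skill.

```bash
# Hypothetical helper: compare one inverter's mppt1/mppt2 input power (watts).
# Args: mppt1 mppt2 [min_watts=500] [ratio=0.6]  (defaults are assumptions)
mppt_ratio_check() {
  awk -v a="$1" -v b="$2" -v min="${3:-500}" -v ratio="${4:-0.6}" 'BEGIN {
    a += 0; b += 0; min += 0; ratio += 0
    hi = (a > b) ? a : b
    lo = (a > b) ? b : a
    if (hi < min) print "inconclusive (low light)"
    else if (lo <= hi * ratio) print "suspect string"
    else print "balanced"
  }'
}
```

`mppt_ratio_check 2000 1000` prints `suspect string`; that is a prompt to confirm at the combiner, not proof of a dead string.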

## 6. Report
State: which unit, mode, fault (decoded), cluster health, link state, what you
recovered (if anything), and any remaining action that needs the user (power cycle,
flash.py, combiner check) as exact commands/steps. Never publish to
`solar/control/lvx6048/*` here.