Skip to content

Latest commit

 

History

History
506 lines (306 loc) · 28.6 KB

File metadata and controls

506 lines (306 loc) · 28.6 KB

Energy Management System Improvements - Prioritized Implementation Plan

�🔴 CRITICAL PRIORITY (System Reliability)

0. Fix Battery Discharge Power Control Bug

Impact: High | Effort: Medium | Dependencies: Growatt inverter control

Description: Discharge power seems to always be 100% leading to higher export than intended during EXPORT_ARBITRAGE operations.

1. Extended Horizon Optimization ✅ Completed (v7.2.0-v7.3.0, PRs #21-#22)

Impact: High | Effort: Medium | Dependencies: Price manager, DP algorithm

Description: Use tomorrow's electricity prices to extend the intraday optimization horizon to cover today + tomorrow (up to 192 periods), enabling true cross-day arbitrage decisions.

Background: The intraday optimizer (runs every 15 min) currently uses only get_today_prices(), giving a shrinking horizon toward midnight. A zero terminal value caused the optimizer to export battery energy late in the day rather than holding it for self-consumption overnight/tomorrow. Option 3 (terminal value from tomorrow's mean price) was deployed as a minimal fix — it stops premature exporting but does not enable true cross-day arbitrage.

Implementation:

  • ✅ Modify OptimizationManager to fetch tomorrow's prices when available (PR #21)
  • ✅ Extend DP algorithm input to handle 48+ hour horizons (PR #21, 192 periods)
  • ✅ Update ViewBuilder to display extended optimization results (PR #22 dashboard, PR #23 inverter)
  • N/A Multi-day TOU schedule management — Growatt TOU segments are dateless (HH:MM), so multi-day TOU isn't possible at the hardware level. The optimizer uses tomorrow's data for better decisions, but only today's schedule is deployed.

🟡 HIGH PRIORITY (Core Functionality)

Add SolaX Modbus inverter support

Impact: High | Effort: High | Dependencies: Inverter abstraction layer, GrowattScheduleManager

Description: Add support for the homeassistant-solax-modbus integration alongside the existing Growatt integration. This would allow BESS Manager to be used with SolaX inverters, significantly expanding the supported hardware.

Implementation:

  • Introduce an inverter abstraction layer (interface/protocol) that GrowattScheduleManager and a new SolaXScheduleManager both implement
  • Implement SolaXScheduleManager using SolaX Modbus entities for TOU schedule deployment and battery mode control
  • Add inverter type selection to settings (config.yaml, BatterySettings, setup wizard)
  • Route schedule deployment through the selected inverter manager in BatterySystemManager
  • Verify sensor mapping: SolaX entities for battery SOC, charge/discharge power, and grid import/export may differ from Growatt names
  • Update health checks to validate the correct inverter entities based on selected type
  • Add SolaX-specific documentation to docs/INSTALLATION.md

Files: core/bess/growatt_schedule_manager.py (extract interface), new core/bess/solax_schedule_manager.py, core/bess/battery_system_manager.py, core/bess/settings.py, backend/settings_store.py, config.yaml


Rename strategic_intent to battery_intent throughout the codebase

Impact: Low | Effort: Low | Dependencies: decision_intelligence.py, dp_battery_algorithm.py, sph_schedule.py, models.py, frontend

Description: The term "strategic intent" has been replaced with "battery intent" in the software design document. Rename accordingly in code:

  • StrategicIntent enum → BatteryIntent (dp_battery_algorithm.py)
  • strategic_intent field → battery_intent in DecisionData (models.py)
  • All assignments and references in decision_intelligence.py, sph_schedule.py, battery_system_manager.py, and any API serialization
  • Frontend: any display label or type referencing strategicIntent / strategic_intent

Investigate redundant power gate in strategic intent detection

Impact: Low | Effort: Low | Dependencies: decision_intelligence.py, dp_battery_algorithm.py

Description: In create_decision_data (decision_intelligence.py), strategic intent is determined by an outer power < -0.1 / power > 0.1 check followed by inner energy flow checks (battery_to_grid, grid_to_battery). The power check is likely redundant: the detailed flows in EnergyData are derived automatically via _calculate_detailed_flows() from battery_charged/battery_discharged, so if power < -0.1 then battery_discharged > 0 and the flow checks already handle the distinction. The inner flow thresholds (0.1 kWh) also provide the same noise filtering as the outer power threshold. Verify whether the outer power gate can be removed and intent determined solely from energy flows.

Improve InfluxDB health check error messages in UI ✅ Completed (v8.2.1)

Resolution: HTTP 401/403/404 now show actionable messages; ConnectionError shows the URL that failed.

1. Improve Battery SOC and Actions Component

Impact: Medium-High | Effort: High | Dependencies: Backend cost calculations

Description: Core feature enhancement showing detailed battery optimization reasoning

Current State: BatteryLevelChart.tsx exists with SOC and battery action visualization. Missing: cost breakdown table and actual/predicted timeline split.

Implementation:

  • Add actual/predicted timeline split with visual distinction
  • Add detailed cost breakdown table:
  Base case cost:     65.69 SEK (Actual: 27.06 + Predicted: 38.63)
  Grid cost:         -14.90 SEK (Actual: 28.13 + Predicted: -43.03)
  Battery wear cost:   9.89 SEK (Actual: 2.60 + Predicted: 7.29)
  Total savings:      80.59 SEK (Actual: -1.07 + Predicted: 81.67)

Technical Tasks:

  • Update BatteryLevelChart.tsx for actual/predicted split
  • Create cost breakdown table component
  • Add hover tooltips with detailed calculations
  • Integrate with backend hourly cost data

2. Complete Decision Intelligence Implementation

Impact: Medium-High | Effort: Medium | Dependencies: decision_intelligence.py, sensor_collector.py, dp_battery_algorithm.py

Vision: Transform the DP battery optimization from a "black box" into a transparent, educational system that helps users understand complex energy economics and multi-hour optimization strategies. Users should see real SEK values for each energy pathway and understand why the optimizer made each decision — not just what it decided.

What is working (future/predicted hours only):

  • Advanced flow pattern recognition: SOLAR_TO_HOME_AND_BATTERY, GRID_TO_HOME_PLUS_BATTERY_TO_GRID, etc.
  • Economic chain explanations: multi-hour strategy reasoning with real SEK values
  • Future target hours: identifies when arbitrage opportunities occur
  • Frontend DecisionFramework.tsx component is complete and consuming enhanced data

Gap 1: Historical hours show fallback values

Past periods (already executed) still show:

  • advanced_flow_pattern: "NO_PATTERN_DETECTED"
  • detailed_flow_values: {}
  • economic_chain: "Historical data - basic strategic intent"

Root cause: the historical data pipeline (SensorCollectorHistoricalDataStore) does not run through decision_intelligence.py. Fix: apply create_decision_data() when recording historical periods, using actual energy flow data and prices from that period.

Gap 2: Future economic values showing 0.00 SEK

Future arbitrage calculations (the "expected arbitrage value" in economic chain explanations) show 0.00 SEK. Needs investigation in DP algorithm economic chain value computation and how future target hour values are propagated.

Files: core/bess/decision_intelligence.py, core/bess/dp_battery_algorithm.py, core/bess/sensor_collector.py, frontend/src/components/DecisionFramework.tsx


3. Move Relevant Parts of Daily Summary to Dashboard

Impact: Medium | Effort: Low-Medium | Dependencies: Dashboard layout

Current State: SavingsPage contains energy independence metrics that belong on Dashboard

Implementation:

  • Extract Energy Independence Card: Self-sufficiency %, Grid independence time, Solar utilization %
  • Remove duplicates: Eliminate redundant cost/savings between Dashboard and Savings pages

Technical Tasks:

  • Create EnergyIndependenceCard.tsx component
  • Extract logic from SavingsPage.tsx
  • Add to DashboardPage.tsx
  • Remove duplicate information

5. Enhance Insights Page with Decision Detail

Impact: Medium | Effort: High | Dependencies: Backend decision logging

Current State: InsightsPage.tsx renders PredictionAnalysisView but lacks decision reasoning, algorithm transparency, and confidence metrics

Implementation:

  • Add detailed decision analysis: Why each battery action was chosen
  • Algorithm transparency: DP optimization steps, price arbitrage reasoning
  • Alternative scenarios: Options considered, confidence metrics

Technical Tasks:

  • Extend backend to capture decision reasoning
  • Create decision timeline component
  • Add interactive decision trees
  • Include confidence metrics display

6. Demo Mode for Users Without Configured Sensors

Impact: Medium | Effort: Medium | Dependencies: Backend architecture, Mock data

Description: Allow users to run and explore the system without requiring fully configured Home Assistant sensors. This enables evaluation, development, and troubleshooting scenarios.

Implementation:

  • Enhanced mock data generation: Create realistic synthetic energy data, battery states, and pricing
  • Demo mode toggle: Configuration option to enable full demo mode vs partial sensor availability
  • Graceful degradation: System operates with missing sensors using reasonable defaults
  • Demo data scenarios: Multiple realistic scenarios (high solar, EV charging, peak pricing days)
  • Visual indicators: Clear UI indication when running in demo/mock mode

Benefits: Users can evaluate the system before full HA integration, developers can test without hardware, easier onboarding experience

Current State: ha_api_controller.py has a basic test_mode / set_test_mode() infrastructure but no synthetic data generation or UI indicators.

Technical Tasks:

  • Extend existing test mode functionality in ha_api_controller.py
  • Create comprehensive mock data generators for all sensor types
  • Add demo mode configuration to config.yaml
  • Update frontend to show demo mode indicators
  • Ensure optimization algorithms work with mock data

📄 DOCUMENTATION IMPROVEMENTS

Improve Consumption Forecast Documentation

Impact: Medium | Effort: Low | Dependencies: docs/INSTALLATION.md

Current gap: Step 3 explains how to create the 48h average sensor but not why it works or how to tune it for different households.

What to add:

  • Explain what BESS does with the sensor: the DP optimizer uses the current sensor value as the predicted consumption for all future periods in the optimization horizon.
  • Explain why the battery-active filter matters: without it, battery discharge power inflates the apparent home consumption, causing the optimizer to over-predict load and charge more aggressively than needed.
  • Explain how to tune for your household:
    • The 48h window is a good default — it captures both weekday and weekend patterns.
    • If your consumption has strong seasonal variation (e.g. heat pump), consider a shorter window (12-24h) so the average adapts faster.
    • If you have large predictable loads (sauna, hot tub), the average smooths these out — the optimizer will not plan for a 3 kW sauna spike at 19:00 specifically.
    • EV charging: whether to include or exclude depends on whether BESS should see EV charging as "normal home load" (include → optimizer plans for it) or as a separate managed load (exclude → optimizer ignores it, relies on discharge inhibit sensor instead).

Improve EV Charging / Discharge Inhibit Documentation

Impact: Medium | Effort: Low | Dependencies: docs/INSTALLATION.md

Current gap: Line 172 says "EV charging: Exclude if managed separately. Include if you want BESS to optimize around it." — too brief. The discharge inhibit sensor is not explained at all.

What to add:

BESS does not control EV charging — it is designed to work in parallel with it. Normally both the car and the battery charge when electricity is cheap, so there is no conflict.

The exception is Tibber grid rewards (and similar grid balancing programs). Grid rewards can start EV charging even when prices are not at their lowest, because Tibber compensates you separately for supporting grid balancing.

BESS auto-detects any binary_sensor whose entity ID ends with _charging or _is_charging (e.g. binary_sensor.zap263668_charging) and treats it as a discharge inhibit signal. When the sensor is on, BESS will not discharge the battery even if the schedule says to.

If BESS were to discharge the battery at the same time, that energy would flow to the car instead of from the grid — you would miss out on the grid reward income while also losing the battery support you would have had for the home.

The discharge inhibit only blocks discharging — it does not change the TOU schedule, trigger charging, or interfere with the EV charging session in any way.


🟢 LOW PRIORITY (Polish)

7. Add Prediction accuracy and history

8. Intent is not always correct for historical data

Current State: The inverter sometimes charges/discharges small amounts like 0.1kW. Or its a rounding error or inefficiencies losses when calculating flows. I don't think its a strategic intent, but it is interpreted as one.

9. Add multi day view ✅ Completed (v7.2.0-v7.3.0, PRs #21-#22)

Problem: Today we only operate on 24h intervals. But at noon every day we get tomorrows schedule. We could use this information to take better economic decisions. It would mean changing a lot of places where 24h is hard coded.

Resolution: The DP optimizer now considers up to 192 periods (2 days) when tomorrow's prices are available (PR #21). Dashboard charts (PR #22) and inverter schedule overview (PR #23) display the extended horizon. TOU deployment remains today-only due to Growatt hardware limitations.

🔵 ROBUSTNESS IMPROVEMENTS (System Observability)

Complete or Remove EV Energy Meter Integration

Impact: Low | Effort: Low-Medium | Dependencies: energy_flow_calculator.py, sensor_collector.py, ha_api_controller.py

Current state: The EV charger energy meter (ev_energy_meter) is partially wired in but the feature is incomplete:

  • sensor_collector.py: ev_energy_meter is in cumulative_sensor_keys and sensor_method_map (→ get_ev_energy)
  • energy_flow_calculator.py: maps to aux_load and calculates the delta per period
  • ha_api_controller.py: has get_ev_energy_meter in METHOD_SENSOR_MAP but no corresponding def get_ev_energy_meter() method
  • aux_load is computed but never consumed — nothing in the API, savings calculations, or dashboard uses it

The sensor is intentionally excluded from health checks to avoid spurious warnings.

Options:

  1. Finish it: Implement def get_ev_energy_meter(self) in ha_api_controller.py, fix the method name mismatch (get_ev_energy vs get_ev_energy_meter in sensor_collector), surface aux_load in the dashboard and savings breakdown, add to optional health checks
  2. Remove it: Delete the dead ev_energy_meter entries from cumulative_sensor_keys, sensor_method_map, METHOD_SENSOR_MAP, and energy_flow_calculator.py

Improve InfluxDB Health Check to Verify Sensor Coverage

Impact: Medium | Effort: Low-Medium | Dependencies: health_check.py, influxdb_helper.py

Problem: The "Historical Data Access" health check reports OK as long as InfluxDB is reachable and the bucket contains any row. It uses a limit(n: 1) probe — equivalent to pinging the database. It does not verify that the specific sensors the BESS system needs are actually present. As a result, it reported OK on 2026-04-03 even though 4 of 10 required sensors had no data in InfluxDB.

Note: A sensor showing value 0 (e.g. battery_input_energy at day start) is valid — cumulative sensors legitimately start at 0. The check must verify existence (any data point in past 7 days), not recent non-zero values.

Current behavior (influxdb_helper.py:test_influxdb_connection()):

from(bucket: "...")
  |> range(start: -24h)
  |> limit(n: 1)

Passes if any measurement returns a row. No knowledge of which sensors were found.

Desired behavior:

For each sensor configured in the BESS system (from METHOD_SENSOR_MAP), run a targeted query:

from(bucket: "...")
  |> range(start: -7d)
  |> filter(fn: (r) => r["entity_id"] == "sensor.battery_input_energy")
  |> limit(n: 1)

Report:

  • OK: all core sensors found in InfluxDB
  • WARNING: optional sensors missing (configured but no InfluxDB data)
  • ERROR: core energy sensors missing (battery, grid, consumption)

Technical Tasks:

  • Extend test_influxdb_connection() to accept a list of entity IDs to probe
  • Pass the configured sensor entity IDs from METHOD_SENSOR_MAP (or a defined "core" subset)
  • Return per-sensor results so check_historical_data_access() can report which sensors are missing
  • Distinguish core sensors (battery_input_energy, battery_output_energy, grid_import, grid_export, load_energy) from optional (ev_energy_meter, solar forecasts)
  • Surface missing optional sensors as WARNING, missing core sensors as ERROR
  • This will make "Historical Data Access" reflect actual data availability, not just connectivity

🔄 ARCHITECTURAL IMPROVEMENTS (From Historical Design Analysis)

10. Machine Learning Predictions

Impact: Medium | Effort: High | Dependencies: Historical data, ML framework

Description: ML-based consumption and solar predictions to improve optimization accuracy beyond current HA sensor forecasts.

Implementation:

  • Integrate with existing PredictionProvider framework
  • Historical data analysis for pattern recognition (weather, season, usage patterns)
  • Adaptive prediction models with confidence scoring
  • Accuracy tracking and model performance metrics

11. Performance Monitoring and Metrics

Impact: Medium | Effort: Medium | Dependencies: Analytics framework

Description: Comprehensive performance tracking for optimization effectiveness and system reliability.

Implementation:

  • Optimization accuracy tracking (predicted vs actual savings)
  • Battery performance degradation monitoring
  • Energy balance validation metrics and alerts
  • Component timing and performance metrics collection
  • Automated reporting and alerting for anomalies

12. Data Export and Analysis Tools

Impact: Low | Effort: Medium | Dependencies: Data stores

Description: Export capabilities for external analysis and system backup.

Implementation:

  • JSON/CSV export of historical energy data and optimization decisions
  • Configuration backup/restore functionality
  • Optimization decision logs with reasoning export
  • Integration with external analytics tools (Grafana, etc.)

🟠 POTENTIAL IMPROVEMENTS

Full Arbitrage Cycle Savings Display

Impact: Medium | Effort: Medium | Dependencies: models.py, savings page UI

Description: The savings table shows per-hour P&L (solar_only_cost - hourly_cost). This is correct and honest, but charging hours show negative savings and discharge savings appear in later hours (or the next day for overnight cycles). The daily total can appear negative when charging happened today but discharge is scheduled for tomorrow.

Idea: Add a "full cycle savings" summary somewhere in the savings page or dashboard that aggregates completed charge→discharge cycles and shows the net arbitrage profit per cycle. This would complement the existing per-hour P&L without changing the underlying formula.

Optimizer vs Dashboard Savings Baseline Mismatch

Impact: Medium | Effort: Medium | Dependencies: DP algorithm, models.py, daily_view_builder

Description: The optimizer and dashboard use different baselines for calculating savings, which causes confusing discrepancies between predicted and actual savings numbers.

The Two Calculations:

Optimizer (dp_battery_algorithm.py:897-906) Dashboard (models.py:231-242)
Baseline Grid-only: consumption × buy_price Solar-only: (consumption - solar) × buy_price - excess_solar × sell_price
Solar in baseline? No — set to zero (solar_only_cost=total_base_cost, # Simplified) Yes — uses real solar production data
Formula total_base_cost - total_optimized_cost solar_only_cost - hourly_cost
Used for Profitability gate decision (line 933) Dashboard total_savings display

Why This Matters:

The dashboard metric is correct — it answers "did the battery save money vs just having solar?" The optimizer's metric conflates solar savings with battery savings. When the optimizer reports +46 SEK, that includes value from solar production that you'd earn regardless of battery operation.

Concrete Risk: The profitability gate (grid_to_battery_solar_savings < min_action_profit_threshold) compares against the grid-only baseline. On sunny days with high solar production, this could approve battery schedules that appear profitable (because solar savings are included) but are actually unprofitable when measured by the dashboard's correct solar-only baseline.

In winter (low solar), the impact is negligible since both baselines converge. In summer (high solar), the optimizer could systematically overestimate battery profitability.

Potential Fix: Change the optimizer's total_base_cost to use the solar-only baseline:

# Current (grid-only baseline):
total_base_cost = sum(home_consumption[i] * buy_price[i] for i in range(len(buy_price)))

# Proposed (solar-only baseline - matches dashboard):
total_base_cost = sum(
    max(0, home_consumption[i] - solar_production[i]) * buy_price[i]
    - max(0, solar_production[i] - home_consumption[i]) * sell_price[i]
    for i in range(len(buy_price))
)

This would make the profitability gate compare apples-to-apples with the dashboard savings, and prevent approving battery operations that lose money relative to the solar-only baseline.


🔧 TECHNICAL DEBT

Simplify Health Check Severity Model

Impact: Low | Effort: Low-Medium | Dependencies: health_check.py, power_monitor.py, all callers of perform_health_check()

Description: The current model has two independent knobs that are easy to misconfigure:

  • is_required — marks the component as critical to the system (used only by the dashboard banner to set has_critical_errors)
  • required_methods — controls whether a failing sensor inside the component shows ERROR vs WARNING on the component card

These concepts are orthogonal but interact in non-obvious ways. The correct mapping of is_required → severity was never enforced, which caused Power Monitoring (is_required=False) to show ERROR instead of WARNING because required_methods=all_methods was passed. Fixed by hand, but the underlying design is fragile.

Proposed simplification: Derive required_methods automatically from is_required instead of requiring callers to pass both:

  • is_required=True → all methods are required → failure → ERROR
  • is_required=False → no methods are required → failure → WARNING

This eliminates the required_methods parameter entirely and makes the policy self-consistent: optional components can never show ERROR, required components always show ERROR on failure.

Files: core/bess/health_check.py, core/bess/power_monitor.py, core/bess/health_check.py (all perform_health_check() call sites)


Move charging_power_rate out of BatterySettings

Impact: Low | Effort: Medium | Dependencies: power_monitor.py, settings_store.py, migration

Description: charging_power_rate is stored in BatterySettings but it is not a battery hardware characteristic — it is a live Growatt number entity (battery_charge_power_limit) that is read and written via HA at runtime. It ended up in BatterySettings only because power_monitor.py needs an initial value before HA is read. That is weak justification for treating it as a user-facing battery setting.

What needs to change:

  • Remove charging_power_rate from BatterySettings dataclass (core/bess/settings.py)
  • Update power_monitor.py to use a local constant or read the initial value from HA directly
  • Update schema migration in settings_store.py (remove from battery section, or move to growatt section)
  • Verify _BATTERY_MODEL_ATTRS in api.py updates automatically (it is derived from the dataclass, so it will)
  • Update any tests that reference battery_settings.charging_power_rate

Files: core/bess/settings.py, core/bess/power_monitor.py, backend/settings_store.py


FormattingContext Architecture

Impact: Low | Effort: Low (45 min) | Dependencies: None

Description: Replace currency parameter passing with FormattingContext dataclass for better extensibility and i18n support.

Current State: Currency passed as string parameter through call chain

Implementation: Create frozen FormattingContext dataclass, update create_formatted_value() and dataclass from_internal() methods, modify API endpoints to create context from settings

Benefits: Type safety, extensibility for locale/timezone/precision without signature changes, future-proof for internationalization

Files: backend/api_dataclasses.py, backend/api.py

Hardcoded Fallback Values Violating CLAUDE.md ✅ Completed (v8.2.1)

Resolution: All hasattr guards and hardcoded fallback values removed from api.py. System now accesses battery_settings, controller, price_manager, _schedule_manager, and has_critical_sensor_failures directly. Added _get_intent_description() to SphScheduleManager to eliminate polymorphism-related hasattr.

Other Technical Debt

  • Refactor all API endpoints to use dataclass-based serialization (with robust mapping for all field variants) for consistent, type-safe, and future-proof API responses. Ensure all details and fields are preserved as in the original dict-based implementation.
  • Check if all sensors in config.yaml are actually needed and used (lifetime e.g.)

TOU Segment Matching is Fragile: The current TOU comparison uses exact matching on start_time, end_time, batt_mode. If a segment shifts by 15 minutes (e.g., 00:00-00:59 → 00:15-01:14), it's seen as completely different, resulting in 2 hardware writes (disable old + add new) instead of 1 update. Consider:

  • Overlap-based matching: If segments overlap significantly and have same mode, treat as "same"
  • Smart merging: Detect when segments can be extended/shortened rather than replaced

Remove Hourly Aggregation Legacy: With 15-min TOU resolution implemented, the hourly aggregation code is now legacy. Power rates are already set per-period in _apply_period_settings(). The following should be refactored or removed:

  • _calculate_hourly_settings_with_strategic_intents() - aggregates 15-min periods back to hourly
  • get_hourly_settings() - returns hourly settings (used by power monitor and display)
  • _get_hourly_intent() - majority voting for hourly intent (no longer needed for TOU)
  • hourly_settings dict - stores the aggregated hourly data

To remove:

  1. Update adjust_charging_power() in battery_system_manager.py to use period-based settings
  2. Update schedule display table to show 15-min periods (or keep hourly summary for readability)
  3. Update get_strategic_intent_summary() to work directly with periods
  4. Remove the hourly aggregation methods listed above

Sensor Collector InfluxDB Usage: Based on the code analysis: The function _get_hour_readings in SensorCollector is called by collect_energy_data(hour). This is not called every hour automatically by the system; it is called when the system wants to collect and record data for a specific hour. The actual historical data for the dashboard is served from the HistoricalDataStore, which is an in-memory store populated by calls to record_energy_data (which uses the output of collect_energy_data).

The _get_hour_readings (and thus the InfluxDB query) is called at startup (to reconstruct history) and whenever a new hour is completed and needs to be recorded. It is not called every hour by a scheduler, but it is called for each hour that needs to be reconstructed or recorded.