
RFC: ML energy consumption predictor #31

Closed
pookey wants to merge 2 commits into johanzander:main from pookey:ml-upstream-pr


Contributor

@pookey pookey commented Mar 6, 2026

Summary

This adds an optional ML module for predicting 24h energy consumption at 15-minute resolution using XGBoost. It integrates with the battery optimizer via a new consumption_strategy config setting.


Important caveats — please read before reviewing:

  • This is an RFC / experimental feature, not necessarily something that belongs in core. In my testing so far, the ML predictions are actually worse than a simple weekly average profile (influxdb_profile strategy). The model needs significant training data and tuning to outperform naive baselines.
  • Startup time impact: The ML module adds ~10-15s to boot (model training + prediction generation). This is acceptable for my use case but may not be for all users.
  • Docker image size: xgboost and scikit-learn add roughly 100 MB or more to the image. The base image also switches from Alpine to Debian (required for prebuilt xgboost wheels).
  • The ml config section is fully optional — if omitted, the system behaves exactly as before with zero overhead.

I'm happy to discuss whether this should live in core, be a separate add-on, or be structured differently. Opening this PR mainly to share the work and get your thoughts on the approach.

What's included

  • ml/ module (6 files, ~2500 lines): Standalone XGBoost predictor with CLI (train, predict, evaluate, report)
  • 4 consumption strategies: sensor (default, unchanged), fixed, influxdb_profile (7-day weekly average from InfluxDB), ml_prediction
  • ML Report tab: Frontend dashboard showing model metrics, feature importance, forecast vs baselines
  • /api/ml-report endpoint: Model status, predictions, and comparison data
  • Config: New ml section in config.yaml (location, feature sensors, training params)
  • Daily retrain: Cron job at 23:00, predictions cached with date-based invalidation
  • Infrastructure: InfluxDB batch power sensor fetching, historical data persistence, sensor gap-filling
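To illustrate the "predictions cached with date-based invalidation" item, here is a minimal sketch of how such a cache could behave. `DailyPredictionCache` and its `generate` callback are hypothetical names for illustration and are not taken from this PR; the only detail drawn from the description is the 24h forecast at 15-minute resolution (96 slots).

```python
from datetime import date
from typing import Callable, List, Optional

SLOTS_PER_DAY = 24 * 4  # 24 hours at 15-minute resolution


class DailyPredictionCache:
    """Holds one day's forecast (hypothetical helper, not the PR's class)."""

    def __init__(self, generate: Callable[[date], List[float]]) -> None:
        self._generate = generate
        self._cached_for: Optional[date] = None
        self._values: List[float] = []

    def get(self, today: date) -> List[float]:
        # Regenerate only when the calendar date differs from the cached one.
        if self._cached_for != today:
            values = self._generate(today)
            if len(values) != SLOTS_PER_DAY:
                raise ValueError(
                    f"expected {SLOTS_PER_DAY} slots, got {len(values)}"
                )
            self._values = values
            self._cached_for = today
        return self._values


# Usage with a stand-in for the model (assumed, for illustration only).
calls: List[date] = []


def fake_model(day: date) -> List[float]:
    calls.append(day)
    return [0.25] * SLOTS_PER_DAY


cache = DailyPredictionCache(fake_model)
cache.get(date(2026, 3, 6))  # generates
cache.get(date(2026, 3, 6))  # served from cache
cache.get(date(2026, 3, 7))  # date changed, regenerates
```

Keying the cache on the calendar date keeps repeated reads (logging, the report endpoint) cheap, which is the same concern as the "logging doesn't trigger heavy queries" review item.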

Changes from previous PR #23

All 9 review items from @johanzander's feedback on #23 have been addressed:

  • ✅ No hardcoded sensor IDs — config moved to config.yaml ml section with placeholder values
  • ✅ Timezone from config (ml.location.timezone), not hardcoded
  • ✅ Coordinates match timezone (configurable)
  • ✅ consumption_strategy loaded in from_ha_config() with direct dict access (no silent fallback)
  • ✅ No .get() with fallback defaults for required settings — fails loudly if missing
  • ✅ consumptionStrategy field has no default in APIBatterySettings
  • ✅ ML packages in shared requirements.txt (needed at runtime for strategy dispatch)
  • ✅ Logging doesn't trigger heavy queries — uses cached predictions
  • ✅ No duplicate CHANGELOG sections (CHANGELOG not modified in this PR)
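For readers unfamiliar with the "fails loudly" convention in the list above, the difference between direct dict access and `.get()` with a fallback can be sketched as follows. `read_consumption_strategy` is a hypothetical helper, not the actual `from_ha_config()` code, and the extra check against the four strategy names is an assumption added for illustration.

```python
from typing import Any, Mapping

# The four strategy names listed in this PR's description.
VALID_STRATEGIES = {"sensor", "fixed", "influxdb_profile", "ml_prediction"}


def read_consumption_strategy(battery_config: Mapping[str, Any]) -> str:
    """Hypothetical helper showing direct access with no silent fallback."""
    # Direct indexing raises KeyError when the key is missing, whereas
    # battery_config.get("consumption_strategy", "sensor") would hide it.
    strategy = battery_config["consumption_strategy"]
    # Assumption: unknown values are rejected rather than ignored.
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown consumption_strategy: {strategy!r}")
    return strategy


print(read_consumption_strategy({"consumption_strategy": "ml_prediction"}))
```

A missing key surfaces at startup as a `KeyError` naming the setting, instead of the system quietly running with a default the user never chose.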

What's NOT included (intentionally excluded)

  • Chart fixes (EnergyFlowChart, BatteryLevelChart)
  • Locale/currency cleanup
  • Test refactoring
  • Docs restructuring
  • Schedule verification / deployment failure tracking

Test plan

  • ml report command completes and generates HTML dashboard
  • Feature flags properly exclude features when disabled
  • System boots and optimizes normally with consumption_strategy: "sensor" (no ML overhead)
  • System boots with consumption_strategy: "ml_prediction" and uses ML forecasts
  • influxdb_profile strategy produces reasonable weekly average profiles
  • ML Report tab renders in frontend
  • Verify with fresh install (no existing model artifacts)
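Since the influxdb_profile baseline currently beats the ML model, a rough sketch of what a weekly average profile could look like may help reviewers judge what "reasonable" means in the test plan. This assumes the profile is the mean of each 15-minute slot across the last seven days of samples; the actual query and aggregation in this PR may differ, and `weekly_average_profile` and `slot_of` are hypothetical names.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Tuple

SLOTS_PER_DAY = 24 * 4  # 15-minute resolution


def slot_of(ts: datetime) -> int:
    """Index of the 15-minute slot within the day (0..95)."""
    return ts.hour * 4 + ts.minute // 15


def weekly_average_profile(
    samples: Iterable[Tuple[datetime, float]],
) -> List[float]:
    """Average each slot of the day over all samples provided.

    Assumption: the caller passes the last seven days of readings.
    """
    buckets: Dict[int, List[float]] = defaultdict(list)
    for ts, value in samples:
        buckets[slot_of(ts)].append(value)
    missing = [s for s in range(SLOTS_PER_DAY) if s not in buckets]
    if missing:
        # Fail loudly rather than inventing values for empty slots.
        raise ValueError(f"no samples for slots: {missing}")
    return [sum(buckets[s]) / len(buckets[s]) for s in range(SLOTS_PER_DAY)]


# Usage: seven synthetic days where every reading equals the day index.
start = datetime(2026, 2, 27)
history = [
    (start + timedelta(days=d, minutes=15 * s), float(d))
    for d in range(7)
    for s in range(SLOTS_PER_DAY)
]
profile = weekly_average_profile(history)
```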

ipc-zpg and others added 2 commits March 6, 2026 19:42
Standalone ML module (ml/) using XGBoost to predict 24h energy consumption
at 15-minute resolution. Integrates with the battery optimizer via a new
consumption_strategy setting that supports four modes: sensor (default),
fixed, influxdb_profile (7-day weekly average), and ml_prediction.

Key components:
- ml/ module: CLI tool with train, predict, evaluate, and report commands
- MLReportPage: Frontend dashboard tab showing model metrics and forecasts
- /api/ml-report endpoint for model status and prediction data
- Configurable via config.yaml ml section (location, features, training params)
- Daily retrain cron job at 23:00, predictions cached with date-based invalidation
- InfluxDB batch power sensor fetching for training data
- Historical data persistence across container restarts

Infrastructure changes:
- Switched Docker base from Alpine to Debian (required for xgboost wheels)
- Added scikit-learn, xgboost, astral to requirements
- Power sensor gap-filling in SensorCollector for historical backfill
- Nordpool price sensor methods in HA controller

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Two bugs prevented ML predictions from working:

1. battery_system_manager.py: _retrain_ml_model() and
   _generate_ml_predictions() called load_config() with nonexistent
   keyword args (ml_section, influxdb_url, etc.), causing silent
   failures. Fixed to use load_config(app_options=self._addon_options).

2. ml/config.py: Used HA_TOKEN env var for HA API auth, but the HA
   supervisor sets HASSIO_TOKEN. Weather forecast calls failed with
   401 errors. Now checks HASSIO_TOKEN first, falling back to
   HA_TOKEN for local development.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
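The token lookup order described in the second fix can be sketched roughly like this. `resolve_ha_token` is a hypothetical name and not the function in ml/config.py; raising when neither variable is set is an assumption, since the commit message does not say what happens in that case.

```python
import os
from typing import Mapping, Optional


def resolve_ha_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the HA API token, preferring the supervisor-provided one."""
    env = os.environ if env is None else env
    # HASSIO_TOKEN is set by the HA supervisor; HA_TOKEN is the
    # fallback for local development, as described in the commit.
    for name in ("HASSIO_TOKEN", "HA_TOKEN"):
        token = env.get(name)
        if token:
            return token
    # Assumption: a missing token is treated as a hard error.
    raise RuntimeError("neither HASSIO_TOKEN nor HA_TOKEN is set")


print(resolve_ha_token({"HASSIO_TOKEN": "supervisor", "HA_TOKEN": "local"}))
```

Passing the environment as a parameter keeps the lookup testable without mutating `os.environ`.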
Contributor Author

pookey commented Mar 7, 2026

Superseded by #33 (consumption strategies) and #34 (ML predictor). Split into two focused PRs for easier review.

@pookey pookey closed this Mar 7, 2026

2 participants