NOTE: This project is actively developed. Expect frequent updates and changes.
MandVModeling is an open-source Python project for modeling and visualizing changepoint models. It provides tools for analyzing trends in time series data. It uses research-driven findings to modify and extend CUNYBPL's changepointmodel GitHub repository.
Section 5.1.3.1.3.1 of ASHRAE Guideline 14 states the following for weekly, daily, and hourly data:
The use of more granular or detailed energy use data may decrease or increase the uncertainty in the computed savings. The uncertainty of regression models is inversely related to the number of points in the model, favoring a model with more granular data, but the aggregated data will have a reduced scatter and associated coefficient of variation of the root-mean-square error [CV(RMSE)], favoring a model with less granular data. Therefore, whether more or less granular data will be better is dependent on the number of points available and the scatter in the data for the chosen model type. With more granular data, however, there is often a need to track more independent variables to model the energy use and demand. For example, with daily data, there may be a need to account for different day types, since energy use may be different on weekdays and weekends. Such categorical (noncontinuous or nonnumeric) variables will often require separate models for each category.
Any additional, continuous independent variables that may need to be added with more granular data should generally be recorded at sufficiently granular time intervals to be able to be placed on coincident times with the energy data. Ideally, they would be measured at the same time as the energy data. However, it is common for weather data to be obtained from nearby weather sites, and such data will be interpolated to be placed on the same timestamp as the energy data. This introduces some uncertainty, but current approaches neglect this added uncertainty, with the implicit assumption that it is minor.
Regression models using hourly data points are allowed; however, there are situations that warrant aggregating hourly data into subdaily or daily occupied/unoccupied periods. Again, since uncertainty is inversely related to the number of points in the model, in some cases, it may be preferable to group the data into common categories, such as occupied or unoccupied, but keep the individual points separate rather than summing the energy use over the category.
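To make the CV(RMSE) mentioned in the excerpt above concrete, here is a minimal sketch of how it is commonly computed: the root-mean-square error, with the degrees of freedom reduced by the number of model parameters, divided by the mean of the observed values. The function name `cv_rmse` and the sample numbers are hypothetical and are not part of this package's API.

```python
import math


def cv_rmse(observed, predicted, n_params):
    """Return CV(RMSE) as a fraction (multiply by 100 for a percentage).

    Assumes the common form sqrt(SSE / (n - p)) / mean(observed), where
    n is the number of points and p is the number of model parameters.
    """
    n = len(observed)
    if n != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if n <= n_params:
        raise ValueError("need more data points than model parameters")
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    rmse = math.sqrt(sse / (n - n_params))
    return rmse / (sum(observed) / n)


# Hypothetical energy readings and model predictions for illustration only.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 11.0, 15.0, 15.0]
result = cv_rmse(observed, predicted, n_params=2)
print(f"CV(RMSE) = {result:.4f}")
```

Because the denominator of the error term is `n - n_params`, adding points lowers the penalty for model complexity, while aggregation tends to lower the scatter in the numerator. That is the trade-off the guideline describes.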
This package was developed to work with higher-granularity data, since cunybpl/changepointmodel was designed and tested only for monthly data. It supports both monthly and daily granularity, with corresponding tests. As ASHRAE Guideline 14 points out, daily data can be split into weekdays and weekends, which yields a separate model for each day type. This allows for more fine-grained research and reporting.
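As a rough illustration of the weekday/weekend split described above, the following sketch partitions daily readings by day type using only the standard library. The helper `split_by_day_type` and the sample readings are hypothetical and do not reflect this package's actual interface.

```python
from datetime import date, timedelta


def split_by_day_type(records):
    """Partition (date, energy) pairs into weekday and weekend lists."""
    weekdays, weekends = [], []
    for day, energy in records:
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        target = weekends if day.weekday() >= 5 else weekdays
        target.append((day, energy))
    return weekdays, weekends


# Hypothetical daily readings for two weeks starting Monday, 2024-01-01.
start = date(2024, 1, 1)
records = [(start + timedelta(days=i), 100.0 + i) for i in range(14)]

weekdays, weekends = split_by_day_type(records)
print(len(weekdays), len(weekends))  # 10 4
```

Each of the two lists could then be fitted with its own changepoint model, so that weekday and weekend behavior is not averaged together.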
- Supports various changepoint models (two-parameter, three-parameter, four-parameter, five-parameter)
- Generates changepoint coordinates
- Handles time series data analysis
- Formatted and checked using Ruff
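
For readers unfamiliar with changepoint models, here is a minimal sketch of a three-parameter cooling model: energy use stays at a baseline below a changepoint temperature and rises linearly above it. The fit below uses a simple grid search over candidate changepoints with ordinary least squares at each one. This is an illustrative technique under stated assumptions, not the fitting method or API of this package, and all names and data are hypothetical.

```python
def three_param_cooling(x, baseline, slope, changepoint):
    """Flat at `baseline` below the changepoint, linear above it."""
    return baseline + slope * max(0.0, x - changepoint)


def fit_three_param_cooling(xs, ys, candidates):
    """Grid-search the changepoint; solve baseline and slope by least squares."""
    n = len(xs)
    y_mean = sum(ys) / n
    best = None
    for cp in candidates:
        # With the changepoint fixed, the model is linear in z = max(0, x - cp).
        zs = [max(0.0, x - cp) for x in xs]
        z_mean = sum(zs) / n
        szz = sum((z - z_mean) ** 2 for z in zs)
        if szz == 0.0:
            continue  # every point lies below this candidate; slope is undefined
        slope = sum((z - z_mean) * (y - y_mean) for z, y in zip(zs, ys)) / szz
        baseline = y_mean - slope * z_mean
        sse = sum((y - (baseline + slope * z)) ** 2 for z, y in zip(zs, ys))
        if best is None or sse < best[0]:
            best = (sse, baseline, slope, cp)
    if best is None:
        raise ValueError("no candidate changepoint produced a valid fit")
    return best[1], best[2], best[3]


# Hypothetical noise-free data generated from a known model.
xs = list(range(40, 95, 5))
ys = [three_param_cooling(x, 50.0, 2.0, 65.0) for x in xs]

baseline, slope, changepoint = fit_three_param_cooling(xs, ys, range(45, 90, 5))
print(baseline, slope, changepoint)
```

On noise-free data the search recovers the generating parameters up to floating-point error. Real data would call for a finer candidate grid and checks on the fit statistics.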
- [ ] Ensure that pydantic v2 is being used and is compatible with the entire package.
- [ ] Ensure PEP 484 compliance: everything should have a specified type.
- [ ] Add badges to this README to show relevant information about the package.
- [ ] Use GitHub Actions to set up a Python workflow.
This project is licensed under the MIT License - see the LICENSE file for details.
This README provides an overview of the MandVModeling project structure, installation instructions, usage examples, features, contribution guidelines, license information, and acknowledgments. It serves as a starting point for users and contributors alike, providing essential information about the project's purpose, functionality, and how to engage with it.