VIEWS Stepshifter is a package that contains modeling procedure, evaluation and forecasting using Darts

views-platform/views-stepshifter

Stepshifter: Time-series Forecasting Model 🔮

Part of the VIEWS Platform ecosystem for large-scale conflict forecasting.


📚 Table of Contents

  1. Overview
  2. Role in the VIEWS Pipeline
  3. Features
  4. Installation
  5. Usage
  6. Architecture
  7. Project Structure
  8. Contributing
  9. License
  10. Acknowledgements

🧠 Overview

Stepshifter is a machine learning model for time-series forecasting built on Darts. It handles regression and classification tasks.

Key Capabilities:

  • Output Types: Binary outputs and point predictions.
  • Learning Approach: One of the following models:
    1. LinearRegressionModel
    2. RandomForest
    3. LightGBMModel
    4. XGBModel
    5. HurdleModel
  • Integration-Ready: Built to integrate into the larger VIEWS Pipeline.

🌍 Role in the VIEWS Pipeline

Stepshifter is one component of the Violence & Impacts Early Warning System (VIEWS) pipeline. It works alongside other VIEWS repositories, including views-pipeline-core and views-evaluation.

Integration Workflow

Stepshifter fits into the pipeline as follows:

  1. Data Input: Takes preprocessed data from views-pipeline-core.
  2. Model Execution: This modeling approach involves shifting all independent variables in time, in order to train models that can predict future values of the dependent variable.
  3. Post-Processing: Outputs are sent to views-evaluation for further analysis.

✨ Features

  • Darts models: The StepshifterModel class supports multiple Darts forecasting models, including LinearRegressionModel, RandomForest, LightGBMModel, and XGBModel.
  • Automated Data Cleanup: The StepshifterModel class automatically handles missing data and enforces consistent multi-index formatting for time-series data.
  • Hurdle model: The HurdleModel class inherits from StepshifterModel. A hurdle model consists of two stages:
    1. Binary stage: Predicts whether the target variable is 0 or > 0.
    2. Positive stage: Predicts the value of the target variable when it is > 0.

⚙️ Installation

Prerequisites

  • Python >= 3.11
  • Access to views-pipeline-core.

Steps

See the organization- and pipeline-level documentation.


🚀 Usage

1. Run Training Locally

See the organization- and pipeline-level documentation.

2. Use in the VIEWS Pipeline

Stepshifter integrates seamlessly with the VIEWS pipeline. After processing, outputs can be passed to views-evaluation for further calibration or ensembling.


🏗 Architecture

1. Stepshifter Model

This modeling approach involves shifting all independent variables in time, in order to train models that can predict future values of the dependent variable. More details can be found in Appendix A of Hegre et al. (2020).
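The shifting idea can be illustrated with a minimal, library-free sketch. It is not the package's implementation: the function name `build_step_training_pairs`, the tuple layout, and the use of integer month identifiers are assumptions made for illustration. For a given step, features observed at month t are paired with the target observed at month t + step, so a model trained on these pairs learns to predict that many months ahead.

```python
from collections import defaultdict


def build_step_training_pairs(rows, step):
    """Pair features at month t with the target at month t + step.

    rows: iterable of (unit_id, month_id, features, target) tuples,
          where month_id is an integer (an assumption for this sketch).
    Returns a list of (features_at_t, target_at_t_plus_step) pairs.
    """
    # Index observations by unit and month so future months can be looked up.
    by_unit = defaultdict(dict)
    for unit_id, month_id, features, target in rows:
        by_unit[unit_id][month_id] = (features, target)

    pairs = []
    for months in by_unit.values():
        for month_id in sorted(months):
            future = months.get(month_id + step)
            if future is None:
                continue  # no observation `step` months ahead, so no pair
            features, _ = months[month_id]
            _, future_target = future
            pairs.append((features, future_target))
    return pairs


# Hypothetical toy data: two units observed over three months.
rows = [
    ("A", 1, [0.0], 0), ("A", 2, [1.0], 3), ("A", 3, [2.0], 5),
    ("B", 1, [5.0], 1), ("B", 2, [4.0], 0), ("B", 3, [3.0], 2),
]
pairs_step_1 = build_step_training_pairs(rows, step=1)
pairs_step_2 = build_step_training_pairs(rows, step=2)
```

In this sketch one set of pairs is built per step, which mirrors the idea of training a separate model for each forecast horizon. Larger steps leave fewer usable pairs because the last months have no future target yet.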

2. Hurdle Model

This approach differs from a traditional implementation in three respects:

  1. In the first stage, a regression model is used because Darts doesn't support classification models. Its estimates are not strictly bounded between 0 and 1, but this is acceptable for the purpose of this step.
  2. To decide whether an observation counts as "positive," we apply a threshold. The default threshold is 1, meaning that predictions above this value are treated as positive outcomes. It is not set to 0 because most predictions won't be exactly 0. The threshold is a tunable hyperparameter.
  3. In the second stage, a regression model predicts values for the selected time series. Because Darts time series require continuous timestamps, we cannot drop the individual timestamps that the first stage predicted as negative, as a traditional implementation would. Instead, we include the entire time series for every country or PRIO grid cell where the first stage yielded at least one positive prediction.
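The selection logic in points 2 and 3 can be sketched as follows. This is an illustrative simplification, not the HurdleModel code: the function name `hurdle_predict` and the dictionary layout are hypothetical, and assigning zeros to series that never clear the threshold is an assumption, since the text above does not say how unselected series are handled.

```python
def hurdle_predict(binary_stage_preds, positive_stage_preds, threshold=1.0):
    """Combine two regression stages, one whole series at a time.

    Both arguments map a series id (a country or PRIO grid cell) to a list
    of predictions, one per time step. Names and layout are hypothetical.
    """
    combined = {}
    for series_id, stage_one in binary_stage_preds.items():
        # Keep the entire series if any time step exceeds the threshold,
        # because individual timestamps cannot be dropped from a series.
        if any(pred > threshold for pred in stage_one):
            combined[series_id] = list(positive_stage_preds[series_id])
        else:
            # Assumption for this sketch: unselected series predict zero.
            combined[series_id] = [0.0] * len(stage_one)
    return combined


# Hypothetical first-stage and second-stage predictions for two series.
stage_one = {"A": [0.2, 1.4, 0.1], "B": [0.0, 0.3, 0.9]}
stage_two = {"A": [2.0, 7.5, 1.0], "B": [4.0, 4.0, 4.0]}
result = hurdle_predict(stage_one, stage_two, threshold=1.0)
```

Series "A" is kept in full because one of its first-stage predictions exceeds the threshold, even though the other two do not. Series "B" never clears it and is zeroed out in this sketch.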

🚦 Workflow

  1. Input: VIEWS historical conflict data.
  2. Processing: Converting to Darts time series data.
  3. Prediction: Regression predictions.
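The processing step depends on each unit having a gap-free sequence of time steps. The sketch below shows one way long-format records could be grouped into continuous per-unit series using only the standard library. The function name `to_continuous_series`, the record layout, and filling gaps with 0.0 are all assumptions for illustration. The package itself builds Darts time series and may treat missing values differently.

```python
def to_continuous_series(records, fill_value=0.0):
    """Group (unit_id, month_id, value) records into gap-free series.

    Returns {unit_id: (first_month, values)} with one value per month
    between the unit's first and last observed month.
    """
    # Collect observed values per unit, keyed by integer month id.
    grouped = {}
    for unit_id, month_id, value in records:
        grouped.setdefault(unit_id, {})[month_id] = value

    series = {}
    for unit_id, by_month in grouped.items():
        first, last = min(by_month), max(by_month)
        # Fill months with no record (assumption: use fill_value).
        values = [by_month.get(m, fill_value) for m in range(first, last + 1)]
        series[unit_id] = (first, values)
    return series


# Hypothetical records: unit "A" has no observation for month 2.
records = [("A", 1, 2.0), ("A", 3, 5.0), ("B", 2, 1.0), ("B", 3, 0.0)]
series = to_continuous_series(records)
```

Each unit keeps its own start month, so series of different lengths can coexist.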

Refer to Appendix A of Hegre et al. (2020) for an in-depth explanation.

For more detailed information about the VIEWS Stepshifter models themselves, refer to the VIEWS models catalog.


🗂 Project Structure

views-stepshifter/
├── README.md          # Documentation
├── tests              # Unit and integration tests
├── views-stepshifter  # Main source code
│   ├── manager        # Management of stepshifter model lifecycle
│   ├── models         # Model algorithms
│   ├── src            # Folder template
│   └── __init__.py    # Package initialization
├── .gitignore         # Git ignore rules
├── pyproject.toml     # Poetry project file
└── poetry.lock        # Dependency lock file

🤝 Contributing

We welcome contributions to this project! Please follow the guidelines in the VIEWS Documentation.


📜 License

This project is licensed under the terms in the LICENSE file.


💬 Acknowledgements

Special thanks to the VIEWS MD&D Team for their collaboration and support.
