The OpenSTARLab PhaseLearn package is the fundamental library for modeling and analyzing play phases in sports. It provides a framework for training play phase models and running inference with spatio-temporal architectures: Transformer, GCN-Transformer, GAT-Transformer, and baller2vec.
- Definition of Play Phase: The estimation targets are Phases of Play as defined by FIFA.
- This package supports data preprocessed by the OpenSTARLab PreProcessing package.
- Install PyTorch (recommended: version 2.4.0, Linux, pip, Python 3.9, CUDA 11.8)
pip install torch torchvision torchaudio
- To install this package via PyPI
pip install openstarlab-phaselearn
- To install manually
git clone git@github.com:open-starlab/PhaseLearn.git
cd ./PhaseLearn
pip install -e .
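After installation, it can help to confirm what actually landed in the environment. The sketch below is not part of PhaseLearn: `report_environment` is a hypothetical helper that uses only the standard library to report the Python version and the installed versions of the `torch` and `openstarlab-phaselearn` distributions, returning `None` for any that are missing.

```python
import sys
from importlib import metadata


def report_environment(packages=("torch", "openstarlab-phaselearn")):
    """Return the Python version and the installed version of each distribution.

    Hypothetical helper, not part of PhaseLearn. Missing packages map to None.
    """
    report = {"python": f"{sys.version_info.major}.{sys.version_info.minor}"}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for name, version in report_environment().items():
        print(f"{name}: {version or 'not installed'}")
```

Comparing the printed values against the recommended setup (PyTorch 2.4.0, Python 3.9) is a quick way to spot a mismatched environment before training or inference.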
Get started with inference using pre-trained models.
- Download Pre-trained Model: Download the model weights from MODEL URL and place them in the model/ directory.
- Generate Phase Data: Before estimating, you must convert the raw tracking data into the Phase Data format.
- Run Inference: Perform phase estimation on your data.
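Before running inference, it may be worth checking that the downloaded weights sit where they are expected. The following is a minimal sketch, not PhaseLearn's own API: `find_checkpoints` is a hypothetical helper, and it assumes the weights follow the `model/<architecture>/<mode>/<timestamp>/<run>/best.pth` layout shown in this README's directory tree.

```python
from pathlib import Path


def find_checkpoints(model_root="model", filename="best.pth"):
    """List run directories under model_root that contain a checkpoint file.

    Hypothetical helper, not part of PhaseLearn. Assumes the layout
    model/<architecture>/<mode>/<timestamp>/<run>/best.pth from this README.
    """
    root = Path(model_root)
    if not root.is_dir():
        return []
    # rglob walks every subdirectory, so the exact nesting depth does not matter.
    return sorted(path.parent for path in root.rglob(filename))


if __name__ == "__main__":
    runs = find_checkpoints()
    if not runs:
        print("No best.pth found under model/; download the weights first.")
    for run_dir in runs:
        has_config = (run_dir / "hyperparameters.json").is_file()
        print(f"{run_dir} (hyperparameters.json present: {has_config})")
```

Searching recursively rather than hard-coding the path keeps the check usable even if the timestamp or run folder names differ from the example in the tree.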
Follow these steps to train your own play phase model using the SoccerTrack-v2 dataset.
- Data Acquisition:
  - Tracking Data: Obtain the raw data from SoccerTrack-v2.
    - Required Files Guide (Note: Different files are required for Match IDs >= 130000 and < 130000).
  - Phase Annotation Data: Download the play phase annotation data from DATA URL and place it in the data/ directory.
- Generate Phase Data: Before training, you must convert the raw tracking data into the Phase Data format.
- Training & Evaluation: Once the data is prepared, you can execute the training pipeline.
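Because the required SoccerTrack-v2 files differ for Match IDs below 130000 and for those at or above it, sorting your matches into the two groups first can make the download step less error-prone. The sketch below only illustrates that split; `split_match_ids` is a hypothetical helper, and apart from 117092 the IDs in the usage example are made up for illustration. Which files each group needs is described in the Required Files Guide, not here.

```python
THRESHOLD = 130000  # boundary stated in the Required Files Guide note


def split_match_ids(match_ids, threshold=THRESHOLD):
    """Split match IDs into those below the threshold and those at or above it.

    Hypothetical helper, not part of PhaseLearn. IDs may be ints or numeric strings.
    """
    below, at_or_above = [], []
    for match_id in match_ids:
        numeric_id = int(match_id)
        if numeric_id >= threshold:
            at_or_above.append(numeric_id)
        else:
            below.append(numeric_id)
    return {"below": sorted(below), "at_or_above": sorted(at_or_above)}


if __name__ == "__main__":
    # 117092 appears in this README; the other IDs are illustrative only.
    groups = split_match_ids(["117092", 130000, 130001])
    print(groups)
```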
PhaseLearn/
├── phase/sports/soccer/
│   ├── dataloaders/data_module.py
│   ├── inference/inference.py
│   ├── main_class_soccer/main.py
│   ├── models/
│   │   ├── model_yaml/
│   │   │   ├── train_baller2vec.yaml
│   │   │   ├── train_gat_transformer.yaml
│   │   │   ├── train_gcn_transformer.yaml
│   │   │   └── train_transformer.yaml
│   │   ├── baller2vec.py
│   │   ├── gat_transformer.py
│   │   ├── gcn_transformer.py
│   │   └── transformer.py
│   ├── trainers/train.py
│   └── utils/
│       ├── evaluation.py
│       ├── load_train_data.py
│       └── preprocessing.py
├── data/                          # Data storage (gitignored except .gitkeep)
│   ├── phase_annotation_data/
│   │   ├── 117092/
│   │   │   ├── 117092_00_01-04_18_annotation
│   │   │   ├── ...
│   │   ├── ...
│   ├── phase_data/
│   │   ├── bepro/
│   │   │   ├── 117092/117092_main_data
│   │   │   ├── ...
│   │   └── statsbomb_skillcorner/
│   │       ├── 3894537_1018887/3894537_1018887_main_data
│   │       ├── ...
│   └── train_data/bepro/          # generated by preprocessing_data()
│       ├── label_np.bpy
│       └── sequence_np.bpy
└── model/                         # Model storage (gitignored except .gitkeep)
    ├── baller2vec/
    │   ├── ...
    ├── gat_transformer/
    │   ├── 2team_mode/
    │   │   ├── 20251221_155257/run1
    │   │   │   ├── best.pth
    │   │   │   ├── hyperparameters.json
    │   │   │   ├── loss.csv
    │   │   │   └── model_stats.txt
    │   │   ├── ...
    │   ├── ...
    ├── gcn_transformer/
    │   ├── ...
    └── transformer/
        ├── ...
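To see which matches already have generated Phase Data, the `data/phase_data/` part of the tree above can be walked with a few lines of Python. This is a sketch under assumptions, not part of PhaseLearn: `list_phase_data` is a hypothetical helper, and it assumes each match lives at `data/phase_data/<provider>/<match_id>/` with at least one entry whose name starts with `<match_id>_main_data`.

```python
from pathlib import Path


def list_phase_data(phase_root="data/phase_data"):
    """Map each provider directory to the match IDs that have main data.

    Hypothetical helper, not part of PhaseLearn. Assumes the layout
    data/phase_data/<provider>/<match_id>/<match_id>_main_data*.
    """
    root = Path(phase_root)
    found = {}
    if not root.is_dir():
        return found
    for provider in sorted(p for p in root.iterdir() if p.is_dir()):
        found[provider.name] = [
            match_dir.name
            for match_dir in sorted(provider.iterdir())
            if match_dir.is_dir()
            and any(match_dir.glob(f"{match_dir.name}_main_data*"))
        ]
    return found


if __name__ == "__main__":
    for provider, match_ids in list_phase_data().items():
        print(f"{provider}: {len(match_ids)} match(es) with main data")
```

A provider that appears with an empty list has match folders but no main data yet, which usually means the Phase Data generation step still needs to be run for it.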
- Release the package
- Provide pre-trained models
- Provide phase dataset
Development torch version
Version 0.0.2, Linux, pip, Python 3.9, CUDA 11.8
If you use PhaseLearn in your research, please cite it as follows:
@article{playphase,
title={Estimating Probability Distributions of FIFA-Defined Phases of Play Based on Inter-Analyst Diversity},
author={Kento Kuroda and Keisuke Fujii and Yoshinari Kameda},
journal={International Journal of Computer Science in Sport},
year={2026},
volume={XX},
pages={XXX-XXX},
publisher={Publisher}
}
Kento Kuroda 💻 | Kenjiro Ide 💻 | Keisuke Fujii 🧑💻