FinBench is a collection of tools, datasets and example implementations to evaluate and experiment with models and algorithms in the financial domain (time-series forecasting, ranking, portfolio simulation, factor modeling, etc.). The repository aims to provide a reproducible foundation for research, benchmarking and rapid prototyping in quantitative finance and financial machine learning.
- Structured datasets and data loaders for common financial tasks.
- Preprocessing and feature engineering utilities (technical indicators, rolling statistics, factor calculation); a small example follows this list.
- Baseline model implementations across multiple tasks (classification, regression, ranking, portfolio optimization).
- Evaluation and backtesting tools for reproducible experiment comparison.
- Per-model example training scripts and requirements to reproduce results.
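To give a flavor of the feature-engineering utilities mentioned above, here is a minimal sketch of rolling-statistics features computed with pandas. The column names, window size, and function name are illustrative assumptions, not the repository's actual API:

```python
import pandas as pd

def add_rolling_features(prices: pd.DataFrame, window: int = 20) -> pd.DataFrame:
    """Append simple technical features to a per-ticker price DataFrame.

    Assumes a 'close' column; names and windows are illustrative only.
    """
    out = prices.copy()
    out["return_1d"] = out["close"].pct_change()                 # daily return
    out["sma"] = out["close"].rolling(window).mean()             # simple moving average
    out["rolling_vol"] = out["return_1d"].rolling(window).std()  # rolling volatility
    out["zscore"] = (out["close"] - out["sma"]) / out["close"].rolling(window).std()
    return out
```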
| Type | Model | Loss Function | Data Normalization |
|---|---|---|---|
| Classification | THGNN | Cross-Entropy Loss | - |
| Classification | MAN-SF | Cross-Entropy Loss | Relative Price Scaling (High/Low divided by previous Adjusted Close) |
| Classification | Adv-ALSTM | Hinge Loss | - |
| Classification | HGTAN | Cross-Entropy Loss | Per-Ticker Max Scaling (Max Normalization) |
| Classification | CNNPred2D / CNNPred3D | MSE Loss | Standardization (Z-score scaling via StandardScaler), Missing Value Imputation (fillna(0)) |
| Classification | DGDNN | Cross-Entropy Loss | - |
| Regression | D-Va | MSE + Regularization + Variational + Denoising Loss | - |
| Regression | ESTIMATE | RMSE Loss | Per-Ticker Max Scaling (Max Normalization) |
| Regression | StockMixer | MSE Loss + Pointwise Ranking Loss | Per-Ticker Max Scaling (Max Normalization) |
| Regression | MASTER | MSE Loss | Daily Cross-Sectional Z-score Normalization (label only), Drop-Last Strategy in Training |
| Regression | MATCC | MSE Loss | Robust Standardization (Robust Z-score using median & IQR), Drop-Last Strategy in Training |
| Regression | HIST | MSE Loss | Robust Standardization + Daily Cross-Sectional Z-score, Missing Value Handling (Drop NaN Labels + fillna(0)) |
| Regression | DiscoverPLF | Reconstruction + Prediction + KL Divergence Loss | Robust Standardization + Daily Cross-Sectional Z-score, Missing Value Handling (Drop NaN Labels + fillna(0)) |
| Regression | FactorVAE | Negative Log-Likelihood + KL Divergence Loss | Robust Standardization + Daily Cross-Sectional Z-score, Missing Value Handling (Drop NaN Labels + fillna(0)) |
| Regression | FinFormer | Concordance Correlation Coefficient (CCC) Loss | Robust Z-score Normalization + Missing Value Imputation + Label Filtering + Cross-Sectional Rank Normalization (CSRank) |
| Regression | SAMBA | MAE Loss | Min-Max Scaling |
| Ranking | STHAN-SR | MSE Loss + Pointwise Ranking Loss | Per-Ticker Max Scaling (Max Normalization) |
| Ranking | SVAT | MSE Loss + Pointwise Ranking Loss | Per-Ticker Max Scaling (Max Normalization) |
| Ranking | RT-GCN | MSE Loss + Pointwise Ranking Loss | Per-Ticker Max Scaling (Max Normalization) |
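To make the normalization column above concrete, the sketch below shows three of the listed schemes (per-ticker max scaling, robust z-score via median and IQR, daily cross-sectional z-score), assuming a long-format DataFrame with `ticker`, `date`, and price columns. It mirrors the ideas only; the exact preprocessing lives in each model's code:

```python
import pandas as pd

def per_ticker_max_scale(df: pd.DataFrame) -> pd.DataFrame:
    """Scale each ticker's price columns by that ticker's own maximum."""
    cols = ["open", "high", "low", "close"]  # assumed column names
    return df.groupby("ticker")[cols].transform(lambda col: col / col.max())

def robust_zscore(s: pd.Series) -> pd.Series:
    """Robust standardization: center by median, scale by interquartile range."""
    iqr = s.quantile(0.75) - s.quantile(0.25)
    return (s - s.median()) / iqr

def daily_cross_sectional_zscore(df: pd.DataFrame, col: str) -> pd.Series:
    """Z-score a column across all tickers within each trading day."""
    return df.groupby("date")[col].transform(lambda x: (x - x.mean()) / x.std())
```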
- `Classification/`: Multiple classification model implementations and training scripts (e.g., Adv-ALSTM, CNNPred, DGDNN, HGTAN, MAN-SF, THGNN).
- `Ranking/`: Ranking models and related training pipelines.
- `Regression/`: Regression and forecasting models (FinFormer, FactorVAE, HIST and more).
- `Evaluation/`: Evaluation and backtesting utilities, evaluation scripts and configuration templates.
Note: Each model implementation includes its own `requirements.txt` and example training scripts.
- Clone the repository:

  ```bash
  git clone https://github.com/softlab-unimore/finbench.git
  cd finbench
  ```

- Create and activate a Python virtual environment for each model and for the `Evaluation/` package (see the example after this list).

- Install dependencies:

  - Global evaluation tools (used by `Evaluation/`):

    ```bash
    pip install -r Evaluation/requirements.txt
    ```

  - Per-model dependencies: each model folder (for example `Classification/Adv-ALSTM/`) contains a `requirements.txt` with the packages needed for training and evaluation of that model. Follow the instructions in each model folder.
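For the virtual-environment step above, a typical per-model setup might look like the following (the environment name is illustrative; pick the Python version each model's README asks for):

```bash
# Hypothetical example: a dedicated environment for one model
python3.10 -m venv .venv-adv-alstm
source .venv-adv-alstm/bin/activate
pip install -r Classification/Adv-ALSTM/requirements.txt
```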
- Data loading: `Evaluation/main.py` provides the script to extract data from the data sources and prepare it for training and evaluation. Run it from the root directory:

  ```bash
  cd Evaluation
  python3 main.py
  ```

- Model training: every model provides a `train.py` (or `train_2D.py` / `train_3D.py`) script inside its folder. Typical usage (adjust per-model arguments):

  ```bash
  cd ../<Type>/<Model_Folder>
  python3 train.py [<params>]
  ```

  Replace `<Type>` and `<Model_Folder>` with the appropriate values; `<Type>` must be one of `Ranking`, `Classification`, `Regression`. Check the model folder for specific training instructions and required arguments.
- Extract task-level metrics: use the provided tool to collect best validation runs and produce per-model CSV metric summaries.

  - Verify your results layout. Results must follow the pattern (see the sketch after this list):

    ```
    <Type>/<Model>/results/<Universe>/<Config>/<Seed>/<Year>/*
    ```

  - Run the extractor. From the repository root, run:

    ```bash
    python extract_model_metrics.py --type <TYPE> --model <MODEL_NAME>
    ```

    Replace `TYPE` and `MODEL_NAME` with the appropriate type and model folder; `TYPE` must be one of `Ranking`, `Classification`, `Regression`.

  - The script will create:

    - `<Type>/<Model>/best_results.json`: best test metrics selected by validation score.
    - `<Type>/<Model>/metrics.csv`: tab-separated table of metrics per (Year, Seed, Universe) for common sl/pl configurations.
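For intuition on the results layout above, the snippet below simply enumerates run directories matching the documented pattern. The path components come from the pattern itself, while the model name is just an example from the table; the real selection logic lives in `extract_model_metrics.py`:

```python
import glob
import os

# Enumerate runs following <Type>/<Model>/results/<Universe>/<Config>/<Seed>/<Year>/*
# "Regression/MASTER" is only an example model from the table above.
pattern = os.path.join("Regression", "MASTER", "results", "*", "*", "*", "*")
for run_dir in sorted(glob.glob(pattern)):
    universe, config, seed, year = run_dir.split(os.sep)[-4:]
    print(f"universe={universe} config={config} seed={seed} year={year}")
```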
- Evaluation:

  - `evaluation.py` provides mechanisms to compute portfolio metrics on model predictions:

    ```bash
    cd Evaluation
    python3 evaluation.py --type <TYPE> --model <MODEL_NAME> --universe <UNIVERSE> --sl <SL> --pl <PL> --initial_year <YEAR> --top_k <K> --short_k <SK>
    ```

  - `quintile_analysis.py` provides tools to compute quintile-based metrics and visualizations:

    ```bash
    cd Evaluation
    python3 quintile_analysis.py --type <TYPE> --model <MODEL_NAME> --universe <UNIVERSE> --initial_year <YEAR>
    ```

  Replace `TYPE`, `MODEL_NAME` and the other placeholders with the appropriate values. Check the `Evaluation/README.md` file for specific instructions and available arguments.
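As a rough illustration of the kind of portfolio metric `evaluation.py` exposes via `--top_k`/`--short_k`, the sketch below computes a naive daily top-k long / bottom-k short return from a predictions table. The DataFrame columns are assumptions made for the example, not the repository's actual evaluation code:

```python
import pandas as pd

def topk_longshort_returns(preds: pd.DataFrame, top_k: int = 5, short_k: int = 5) -> pd.Series:
    """Daily return of longing the top_k and shorting the bottom short_k predictions.

    Assumes columns 'date', 'prediction', and 'return' (realized next-period return).
    Illustrative only; see Evaluation/README.md for the actual inputs and metrics.
    """
    def one_day(day: pd.DataFrame) -> float:
        ranked = day.sort_values("prediction", ascending=False)
        return ranked.head(top_k)["return"].mean() - ranked.tail(short_k)["return"].mean()

    return preds.groupby("date").apply(one_day)
```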
Check the docs or the training script in the model folder for model-specific flags and data requirements.
Almost all models were tested with Python 3.10; however, some exceptions (e.g., Adv-ALSTM) required different Python versions due to library compatibility issues.
Check the README.md in each model folder for specific Python version requirements and installation instructions.
This repository includes a LICENSE file at the project root. Review it for terms and conditions before using the code in production.
