- Install Python >= 3.8 and PyTorch 1.8.1. The following packages are required:
- Numpy (`numpy`) v1.15.2;
- Matplotlib (`matplotlib`) v3.0.0;
- Orange (`Orange`) v3.18.0;
- Pandas (`pandas`) v1.4.2;
- Weka (`python-weka-wrapper3`) v0.1.6 for multivariate time series (requires Oracle JDK 8 or OpenJDK 8);
- PyTorch (`torch`) v1.8.1 with CUDA 11.0;
- Scikit-learn (`scikit-learn`) v1.0.2;
- Scipy (`scipy`) v1.7.3;
- Huggingface (`transformers`) v4.30.1;
- Absl-py (`absl-py`) v1.2.0;
- Einops (`einops`) v0.4.1;
- H5PY (`h5py`) v3.7.0;
- KeOps (`keopscore`) v2.1;
- Opt-einsum (`opt-einsum`) v3.3.0;
- Pytorch-wavelet (`PyWavelets`) v1.4.1;
- Scikit-image (`scikit-image`) v0.19.3;
- Statsmodels (`statsmodels`) v0.13.2;
- Sympy (`sympy`) v1.11.1.
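The Python dependencies can then be installed with pip in one go (assuming the parenthesized identifiers above are the PyPI package names; the CUDA 11.0 build of torch may require the official PyTorch wheel index):

pip install numpy==1.15.2 matplotlib==3.0.0 Orange==3.18.0 pandas==1.4.2 python-weka-wrapper3==0.1.6 torch==1.8.1 scikit-learn==1.0.2 scipy==1.7.3 transformers==4.30.1 absl-py==1.2.0 einops==0.4.1 h5py==3.7.0 keopscore==2.1 opt-einsum==3.3.0 PyWavelets==1.4.1 scikit-image==0.19.3 statsmodels==0.13.2 sympy==1.11.1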
The datasets used in this code can be downloaded from the following locations:
- the UCR archive: https://www.cs.ucr.edu/~eamonn/time_series_data_2018/;
- the UEA archive: http://www.timeseriesclassification.com/;
- the long-term forecasting archive: https://github.com/thuml/Time-Series-Library.
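The forecasting datasets are plain CSV files; a quick sanity check with pandas, assuming the Time-Series-Library layout of a leading date column followed by one column per variable:

```python
import pandas as pd

# Assumption: Time-Series-Library CSVs start with a 'date' column,
# followed by one column per recorded variable.
df = pd.read_csv("path/to/traffic/folder/traffic.csv")
print(df.shape)        # (num_timesteps, 1 + num_variables)
print(df.columns[:3])  # 'date' plus the first variable columns
values = df.drop(columns=["date"]).to_numpy(dtype="float32")
```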
The repository is organized as follows:
- `datasets` folder: data and related methods;
- `encoders` folder: implements the encoder and its building blocks (dilated convolutions, causal CNN; see the sketch after this list);
- `losses` folder: implements the triplet loss, both for training sets where all time series have the same length and for training sets with time series of unequal lengths;
- `models` folder: implements LLM4TS and its building blocks (encoder + GPT attention + output head);
- `utils` folder: utility functions;
- `main_encoder` file: handles learning for encoders (see usage below);
- `main_LLM4TS` file: handles learning for LLM4TS; the prerequisite is a well-trained encoder (see usage below);
- `optimizers` file: optimizer methods for training models;
- `options` file: input arguments;
- `running` file: methods to train and test models.
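As a reference for the encoder building blocks, here is a minimal sketch of a dilated causal convolution in PyTorch (an illustration of the idea, not the repository's exact module):

```python
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    """Dilated 1D convolution that only looks at past time steps:
    left-pads by (kernel_size - 1) * dilation so output length == input length."""
    def __init__(self, in_channels, out_channels, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(in_channels, out_channels,
                              kernel_size, dilation=dilation)

    def forward(self, x):                        # x: (batch, channels, time)
        x = nn.functional.pad(x, (self.pad, 0))  # pad on the left only
        return self.conv(x)

# Stacking layers with dilations 1, 2, 4, ... grows the receptive field exponentially.
layer = CausalConv1d(in_channels=1, out_channels=8, kernel_size=3, dilation=2)
out = layer(torch.randn(4, 1, 100))  # -> (4, 8, 100)
```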
Download the LLM from Hugging Face.
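For example, GPT-2 can be fetched with the transformers library and saved to a local folder that --llm_model_dir will later point to (the path is a placeholder):

```python
from transformers import GPT2Model, GPT2Tokenizer

# Download GPT-2 weights from the Hugging Face hub and cache them locally.
model = GPT2Model.from_pretrained("gpt2")
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model.save_pretrained("path/to/llm/folder/")
tokenizer.save_pretrained("path/to/llm/folder/")
```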
To select text prototypes from GPT-2:
python main_encoder.py --data_dir path/to/EthanolConcentration/folder/ --gpu 0
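With --provide False, prototypes are picked at random; a rough sketch of that idea, sampling rows of GPT-2's token-embedding matrix (an illustration, not the repository's exact script):

```python
import torch
from transformers import GPT2Model

gpt2 = GPT2Model.from_pretrained("path/to/llm/folder/")
wte = gpt2.wte.weight.detach()  # (vocab_size, 768) token embeddings

# Random selection: sample --number_of_prototype embedding rows.
n_prototypes = 10
idx = torch.randperm(wte.size(0))[:n_prototypes]
prototypes = wte[idx]           # (10, 768)
torch.save(prototypes, "prototypes.pt")  # hypothetical output file name
```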
To train an encoder on the EthanolConcentration dataset from the UEA archive with a specific GPU:
python main_encoder.py --data_dir path/to/EthanolConcentration/folder/ --gpu 0
Adding the --load_encoder option loads a model from the specified save path.
Set --gpu -1 to use the CPU.
To train LLM4TS on the EthanolConcentration dataset from the UEA archive with a specific GPU:
python main_LLM4TS.py --output_dir experiments --comment "classification from Scratch" --name EthanolConcentration --records_file Classification_records.xls --data_dir path/to/EthanolConcentration/folder/ --data_class tsra --pattern TRAIN --val_pattern TEST --epochs 50 --lr 0.001 --patch_size 8 --stride 8 --optimizer RAdam --d_model 768 --pos_encoding learnable --task classification --key_metric accuracy --gpu 0
Set --gpu -1 to use the CPU.
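The --patch_size 8 --stride 8 pair controls how each series is cut into patches before embedding; assuming the usual patching semantics for these flags, the slicing amounts to:

```python
import torch

x = torch.randn(4, 512)  # batch of 4 univariate series of length 512

# Non-overlapping patches: stride == patch_size.
patches = x.unfold(dimension=1, size=8, step=8)
print(patches.shape)  # torch.Size([4, 64, 8]): 64 patches of 8 time steps
```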
To train an encoder on the traffic dataset with a specific GPU:
python main_encoder.py --root_path path/to/traffic/folder/ --data_path traffic.csv --model_id traffic --name traffic --data custom --seq_len 512 --output_dir ./experiments_encoder --gpu 0
To train LLM4TS on the traffic dataset with a specific GPU:
python main_LLM4TS.py --root_path path/to/traffic/folder/ --data_path traffic.csv --model_id traffic --name traffic --data custom --seq_len 512 --label_len 48 --pred_len 96 --output_dir ./experiments --gpu 0
Set --gpu -1 to use the CPU.
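Assuming --seq_len, --label_len, and --pred_len follow the usual Informer/Autoformer windowing convention (which these flags mirror), each training window is built like this:

```python
import numpy as np

seq_len, label_len, pred_len = 512, 48, 96
series = np.random.randn(10000, 862)  # random stand-in for traffic: (time, variables)

s = 0  # window start index
x = series[s : s + seq_len]                                   # model input: 512 steps
y = series[s + seq_len - label_len : s + seq_len + pred_len]  # 48 known + 96 future steps
print(x.shape, y.shape)  # (512, 862) (144, 862)
```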