Using strain data segments from a network of ground-based GW detectors, predict whether a GW signal is present in the strain segment.
Binary classification of multivariate time-series data using SOTA deep neural networks.
Source: G2Net Gravitational Wave Detection (https://www.kaggle.com/competitions/g2net-gravitational-wave-detection/overview)
Each data sample (an npy file) contains 3 time series (one per detector); each spans 2 seconds and is sampled at 2,048 Hz. A minimal loading sketch follows the file list below.
- `train/` - the training set files, one npy file per observation; labels are provided in the file listed below
- `test/` - the test set files; you must predict the probability that each observation contains a gravitational wave
- `training_labels.csv` - target values indicating whether each training observation contains a gravitational wave
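A minimal loading sketch based on the description above. The sample id is a placeholder, and the flat `train/` path and the `id`/`target` column names are assumptions about the download layout, which may nest files differently.

```python
import numpy as np
import pandas as pd

# Hypothetical sample id; real ids come from training_labels.csv.
sample_id = "00000e74ad"

# Each npy file holds 3 detector time series of 2 s sampled at 2,048 Hz.
strain = np.load(f"train/{sample_id}.npy")
assert strain.shape == (3, 2 * 2048)  # (detectors, samples) -> (3, 4096)

# training_labels.csv is assumed to map an observation id to a 0/1 target.
labels = pd.read_csv("training_labels.csv")
target = labels.loc[labels["id"] == sample_id, "target"].item()
print(strain.shape, target)
```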
- `src`: Scripts folder.
  - `GettingStarted.ipynb`: Start here; it walks through EDA and the modelling pipeline.
  - `configs`: Config files used by the training scripts to provide parameters for dataloaders, models, etc.
    - Subfolder `train` holds the config JSON files used to perform experimental runs with `train.py` or `train_pl.py`: `base.json` contains the basic config (choice of optimizer, number of training epochs, etc.), `optim.json` the parameters for the chosen optimizer, `stop_early.json` the parameters for the early-stopping criteria, and `lr_schd.json` the parameters for the chosen learning-rate scheduler. The remaining JSON files correspond to the models, one file per model. All relevant config files are read at the beginning of the training script (see the config-loading sketch after this list).
    - Subfolder `sweep` contains similar JSON files, adapted to work with `wandb sweep`, for hyperparameter sweeps.
  - `dataloaders`: Scripts that implement useful functions to load data from a local download.
  - `models`: Implementations of SOTA DL models for time-series classification. Contains a `tsai` folder with the source code of TSAI (https://github.com/timeseriesAI/tsai), which provides SOTA model implementations, and a `pytorch` folder for other custom implementations which may or may not use `tsai` modules (see the model sketch after this list).
  - `wandb_sweep.py`: Entry-point script for model training and hyperparameter tuning with W&B logging.
  - `train.py`: Entry-point script for a single training-and-evaluation run using vanilla PyTorch with W&B logging. Produces a run directory under `results` with test-set evaluation results and, optionally, model weights.
  - `train_pl.py`: Entry-point script for a single training-and-evaluation run using PyTorch Lightning with W&B logging. Produces a run directory under `results` with test-set evaluation results and, optionally, model weights.
- `results`: Folder to organize run results.
- `environment.yml`: File to create the Python environment.
- `wandb_api_key.txt`: File to hold your wandb API key for logging to your wandb dashboard (see the login sketch below). Instructions: 1. Create a wandb account at wandb.ai. 2. Create a new project. 3. Copy your API key and paste it on the first line of this file.
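A minimal sketch of how the `configs/train` JSON files might be read at the start of a run. The file names come from the list above; the `src/configs/train` path and the merge-into-one-dict approach are assumptions, not necessarily the scripts' actual logic.

```python
import json
from pathlib import Path

CONFIG_DIR = Path("src/configs/train")  # assumed location of the run configs

def load_config(model_name: str) -> dict:
    """Read the shared config files plus one model-specific file into a dict."""
    cfg = {}
    for name in ("base", "optim", "stop_early", "lr_schd", model_name):
        with open(CONFIG_DIR / f"{name}.json") as f:
            cfg[name] = json.load(f)
    return cfg

# "inception_time" is a placeholder for one of the per-model JSON files.
config = load_config("inception_time")
print(config.keys())
```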
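The model input shape follows from the data description (3 channels x 4,096 samples). A sketch using TSAI's InceptionTime as one illustrative example; the import assumes a pip-installed `tsai` rather than the vendored copy under `src/models/tsai`, where paths may differ.

```python
import torch
from tsai.models.InceptionTime import InceptionTime  # assumes pip-installed tsai

# 3 input channels (one per detector), 2 output classes (signal / no signal).
model = InceptionTime(c_in=3, c_out=2)

# Dummy batch shaped (batch, channels, time): 2 s at 2,048 Hz -> 4,096 samples.
x = torch.randn(8, 3, 4096)
logits = model(x)
print(logits.shape)  # torch.Size([8, 2])
```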
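A sketch of how a script might consume `wandb_api_key.txt`. Whether this repo's scripts read the file exactly this way is an assumption, but `wandb.login(key=...)` and `wandb.init(project=...)` are the standard W&B calls.

```python
import wandb

# Read the API key from the first line of wandb_api_key.txt (see above).
with open("wandb_api_key.txt") as f:
    api_key = f.readline().strip()

wandb.login(key=api_key)  # authenticate this session with W&B
run = wandb.init(project="g2net-gw-detection")  # project name is a placeholder
run.finish()
```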
Note: Make sure Anaconda or Miniconda is installed on your system.
- Clone this repository.
- In `environment.yml`, specify an appropriate env name (first line; default: `gwsearchenv`) and path (last line); a standard practice is to set this to `<path/to/anaconda or miniconda/dir>/envs/<name_of_env>`.
- Create the environment by running `conda env create -f environment.yml`.
- Activate your newly created environment: `conda activate gwsearchenv`.
- Run `GettingStarted.ipynb`.