Training code for RibonanzaNet.
If you don't want to retrain RibonanzaNet from scratch and would rather use pretrained checkpoints, we have created example notebooks:

- Finetuning: https://www.kaggle.com/code/shujun717/ribonanzanet-2d-structure-finetune
- Secondary structure inference: https://www.kaggle.com/code/shujun717/ribonanzanet-2d-structure-inference
- Chemical mapping inference: https://www.kaggle.com/code/shujun717/ribonanzanet-inference
You just need `train_data.csv`, `test_sequences.csv`, and `sample_submission.csv` from
https://www.kaggle.com/competitions/stanford-ribonanza-rna-folding/data
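If you have the Kaggle CLI installed and configured, `kaggle competitions download -c stanford-ribonanza-rna-folding` will fetch the competition files (you need to have joined the competition and accepted its rules on Kaggle first).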
Create the environment from the environment file `env.yml`:

```
conda env create -f env.yml
```
Install the Ranger optimizer:

```
conda activate torch
git clone https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer
cd Ranger-Deep-Learning-Optimizer
pip install -e .
```
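A quick way to check the install is to construct the optimizer in Python. This is a minimal sketch using a placeholder model (not RibonanzaNet itself), assuming the package exposes the `Ranger` class as described in its repository README:

```python
import torch
from ranger import Ranger  # installed above from lessw2020/Ranger-Deep-Learning-Optimizer

# Placeholder model, just to demonstrate constructing and stepping the optimizer
model = torch.nn.Linear(16, 2)
optimizer = Ranger(model.parameters(), lr=1e-3, weight_decay=1e-4)

loss = model(torch.randn(4, 16)).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```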
To train, first activate the environment:

```
conda activate torch
```

Set up `accelerate` by running `accelerate config` in the terminal, or point `accelerate launch` at a config file with its `--config_file` option. For an example accelerate config file, see `accelerate_config.yaml`.

Then train, run inference, and build a submission:

```
accelerate launch run.py --config_path configs/pairwise.yaml
accelerate launch inference.py --config_path configs/pairwise.yaml
python make_submission.py --config_path configs/pairwise.yaml
```
This section explains the parameters and settings in the RibonanzaNet configuration file (e.g. `configs/pairwise.yaml`).

- `learning_rate: 0.001`
  The learning rate for the optimizer. Determines the step size at each iteration while moving toward a minimum of the loss function.
- `batch_size: 2`
  Number of samples processed per GPU per batch during training.
- `test_batch_size: 8`
  Number of samples processed per GPU per batch during testing.
- `epochs: 40`
  Total number of training epochs.
- `dropout: 0.05`
  Dropout rate for regularization to prevent overfitting. It is the proportion of neurons randomly dropped from the network during training.
- `weight_decay: 0.0001`
  Regularization strength that prevents overfitting by penalizing large weights.
- `k: 5`
  1D convolution kernel size.
- `ninp: 256`
  Size of the input dimension.
- `nlayers: 9`
  Number of RibonanzaNet blocks.
- `nclass: 2`
  Number of classes for classification tasks.
- `ntoken: 5`
  Number of tokens (AUGC plus a padding/N token) used in the model.
- `nhead: 8`
  Number of heads in multi-head attention.
- `use_flip_aug: true`
  Whether flip augmentation is used during training and inference.
- `gradient_accumulation_steps: 2`
  Number of batches over which gradients are accumulated before each optimizer update.
- `use_triangular_attention: false`
  Whether to use triangular attention mechanisms in the model.
- `pairwise_dimension: 64`
  Dimension of pairwise interactions in the model.
- `use_data_percentage: 1`
  Fraction of the dataset used for training (1 = full data).
- `use_dirty_data: true`
  Whether to include training data that has only one of the 2A3/DMS profiles with signal-to-noise > 1.
- `fold: 0`
  The current fold in use when the data is split into folds for cross-validation.
- `nfolds: 6`
  Total number of folds for cross-validation.
- `input_dir: "../../input/"`
  Directory for input data. Put `train_data.csv`, `test_sequences.csv`, and `sample_submission.csv` here.
- `gpu_id: "0"`
  Identifier of the GPU used for training. Useful in a single-GPU setup.
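Putting these together, a config file in this format would look roughly like the following. This is a sketch assembled from the values listed above, not a verbatim copy of `configs/pairwise.yaml`; the actual file may contain additional fields or different values.

```yaml
learning_rate: 0.001
batch_size: 2
test_batch_size: 8
epochs: 40
dropout: 0.05
weight_decay: 0.0001
k: 5
ninp: 256
nlayers: 9
nclass: 2
ntoken: 5
nhead: 8
use_flip_aug: true
gradient_accumulation_steps: 2
use_triangular_attention: false
pairwise_dimension: 64
use_data_percentage: 1
use_dirty_data: true
fold: 0
nfolds: 6
input_dir: "../../input/"
gpu_id: "0"
```

Note that with `batch_size: 2` and `gradient_accumulation_steps: 2`, the effective batch size is 4 per GPU, multiplied by the number of processes launched by accelerate.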
Training outputs:

- `logs` has the CSV log file with train/val loss,
- `models` has model weights and optimizer states,
- `oofs` has the validation (out-of-fold) predictions.
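To inspect a training run, the CSV log can be read with pandas. A minimal sketch; the filename and column names below are assumptions and should be adjusted to whatever the log in `logs` actually contains:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical filename: check the logs directory for the CSV written by your run
log = pd.read_csv("logs/fold0.csv")

# Column names are assumptions; match them to the CSV header
ax = log.plot(x="epoch", y=["train_loss", "val_loss"])
ax.set_ylabel("loss")
plt.savefig("loss_curve.png")
```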