This project implements predictive models that estimate the Retention Time (RT) of analytes from their chemical features.
This repository contains a set of scripts for predicting RT in chromatography data. The scripts handle data preprocessing, model training, and generation of the prediction submission.
To install the RT-Analyzer project, follow these steps:
- Clone the repository:

    git clone https://github.com/your-username/RT-Analyzer.git

- Install the following dependencies: pandas, numpy, scikit-learn, keras, pytorch (published on PyPI as torch), skorch, xgboost, rdkit, matplotlib

By running:

    pip install numpy pandas scikit-learn keras torch skorch xgboost rdkit matplotlib

To obtain the predictions submitted on Kaggle, run the following file:

    python main.py

The main.py script features the following parameters:
- model paths: Paths to the submission files of the individual models ('nn.csv', 'keras.csv', 'gb.csv').
- output path: Path where the submission file is stored (default: 'submission.csv').
- submit: Flag indicating whether to generate the Kaggle submission.
- plot: Flag to enable or disable plotting during model training.
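
The parameters above suggest that main.py combines the per-model submission files into one final file. The README does not say how the files are combined, so the following is only a minimal sketch that assumes a plain average of the predictions. The command-line flag names (`--model-paths`, `--output-path`, `--submit`, `--plot`), the CSV column names (`id`, `RT`), and every function name are hypothetical and may differ from the actual script.

```python
import argparse
import csv


def parse_args(argv=None):
    # Hypothetical flag names mirroring the parameters listed in this README.
    parser = argparse.ArgumentParser(description="Blend per-model submission files.")
    parser.add_argument("--model-paths", nargs="+",
                        default=["nn.csv", "keras.csv", "gb.csv"])
    parser.add_argument("--output-path", default="submission.csv")
    parser.add_argument("--submit", action="store_true")
    parser.add_argument("--plot", action="store_true")  # unused in this sketch
    return parser.parse_args(argv)


def read_predictions(path, id_col="id", rt_col="RT"):
    # Assumed file layout: one header row, then one (id, RT) pair per row.
    with open(path, newline="") as handle:
        return {row[id_col]: float(row[rt_col]) for row in csv.DictReader(handle)}


def average_predictions(per_model):
    # Assumption: the ensemble is an unweighted mean over the models.
    ids = list(per_model[0])
    for preds in per_model[1:]:
        if set(preds) != set(ids):
            raise ValueError("submission files do not cover the same ids")
    return {i: sum(p[i] for p in per_model) / len(per_model) for i in ids}


def write_submission(path, blended, id_col="id", rt_col="RT"):
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([id_col, rt_col])
        writer.writerows(blended.items())


def main(argv=None):
    args = parse_args(argv)
    blended = average_predictions([read_predictions(p) for p in args.model_paths])
    if args.submit:
        # Only write the final file when a submission is requested.
        write_submission(args.output_path, blended)
    return blended
```

Under these assumptions, calling `main(["--model-paths", "nn.csv", "gb.csv", "--submit"])` would average the two files and write the result to 'submission.csv'.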
To test individual models implemented in this project, follow these steps:
- Open the relevant script file (e.g., models.py, non_linear_models.py, linear_models.py).
- Uncomment the code section pertaining to the model of interest.
- Adjust the model testing parameters as needed:

    # Model testing parameters
    submit = False
    plot = True
    cddd = False
    file_path = 'nn.csv'

- Run the modified script to evaluate the selected model.
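
As an illustration of how the `submit` and `file_path` parameters might steer a test run, here is a minimal sketch using only the standard library. It assumes that `submit = False` means "score the model on a held-out split" and `submit = True` means "train on everything and write predictions to `file_path`"; the actual scripts may behave differently. The function names, the 80/20 split, the mean-absolute-error metric, and the `id`/`RT` column names are all hypothetical, and the `plot` and `cddd` flags are not modeled here.

```python
import csv
import random


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_mean_baseline(train_x, train_y):
    # Stand-in model: always predicts the mean training RT.
    mean = sum(train_y) / len(train_y)
    return lambda rows: [mean for _ in rows]


def run_model_test(fit, features, targets, test_features=None,
                   submit=False, file_path="nn.csv", seed=0):
    """`fit(train_x, train_y)` must return a `predict(rows)` callable."""
    if submit:
        # Submission mode: train on all labelled data, predict the test set.
        predict = fit(features, targets)
        with open(file_path, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["id", "RT"])
            writer.writerows(enumerate(predict(test_features)))
        return None
    # Evaluation mode: shuffle, hold out 20% of the rows, report the error.
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    cut = int(0.8 * len(indices))
    train, held = indices[:cut], indices[cut:]
    predict = fit([features[i] for i in train], [targets[i] for i in train])
    preds = predict([features[i] for i in held])
    return mean_absolute_error([targets[i] for i in held], preds)
```

Swapping `fit_mean_baseline` for a wrapper around any of the project's models would keep the same two modes.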