EMIT-Lab/CRADLE

Chiral Nanoparticle g-Factor Machine Learning Prediction

This project focuses on predicting the g-factor of chiral nanoparticles using various machine learning models and feature encoding strategies. The repository contains the complete workflow from data preprocessing and feature encoding to model training, validation, and visualization.

Encoding Strategies

Three different encoding approaches are implemented and compared:
1. Chemical Encoding (chemical_encoding/): Domain-specific feature representation that describes chiral nanoparticles through chemical descriptors and properties
2. One-Hot Encoding (onehot_encoding/): Categorical variable encoding that creates one binary column per category
3. Ordinal Encoding (ordinal_encoding/): Ordered categorical encoding that maps categories to integers while preserving inherent relationships between values
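To illustrate the difference between the one-hot and ordinal approaches, here is a minimal sketch using pandas. The column name `shape`, its categories, and the category order are hypothetical and are not taken from raw_data.csv; the notebooks in this repository may encode the data differently.

```python
import pandas as pd

# Hypothetical toy data; column name and categories are illustrative only.
df = pd.DataFrame({"shape": ["rod", "sphere", "cube", "rod"]})

# One-hot encoding: one binary column per category.
onehot = pd.get_dummies(df["shape"], prefix="shape", dtype=int)

# Ordinal encoding: map each category to an integer following an explicit order.
order = ["sphere", "cube", "rod"]  # assumed ordering, for illustration only
ordinal = pd.Categorical(df["shape"], categories=order, ordered=True).codes.tolist()

print(onehot)
print(ordinal)
```

One-hot encoding adds a column per category and implies no order, while ordinal encoding keeps a single column but imposes an order that the model may treat as meaningful.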

Each encoding folder contains:
A processed_data.csv file with the encoded dataset
A complete analysis workflow in Jupyter notebooks (0-9)

Installation

  1. Create and activate a virtual environment (run the activation command for your platform):
python -m venv .venv
source .venv/bin/activate  # Linux/Mac
.venv\Scripts\activate    # Windows
  2. Install dependencies:
pip install numpy pandas scikit-learn matplotlib seaborn joblib tqdm
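After installation, a quick check along the following lines can confirm that the dependencies are importable. This is a sketch, not part of the repository; the helper name `missing_packages` is hypothetical. Note that scikit-learn is imported as `sklearn`.

```python
import importlib.util

# Import names of the dependencies listed above (scikit-learn imports as "sklearn").
REQUIRED = ["numpy", "pandas", "sklearn", "matplotlib", "seaborn", "joblib", "tqdm"]


def missing_packages(modules=REQUIRED):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


print(missing_packages() or "all dependencies importable")
```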

Project Structure

├── raw_data.csv                   # Original dataset (raw data)
├── chemical_encoding/             # Chemical feature encoding results and analysis
│   ├── processed_data.csv         # Processed data after chemical encoding
│   ├── 0_Dataset_Description.ipynb
│   ├── 1_Scaling_and_Transforming.ipynb
│   ├── 2_size_aug_model.ipynb
│   ├── 3_augmentation.ipynb
│   ├── 4_g_aug_model.ipynb
│   ├── 5_Corelation.ipynb
│   ├── 6_Single_Output.ipynb
│   ├── 7_k-fold_cross_validation.ipynb
│   ├── 8_PCA.ipynb
│   └── 9_Plot.ipynb
├── onehot_encoding/               # One-hot encoding results and analysis
│   ├── processed_data.csv         # Processed data after one-hot encoding
│   ├── 0_Dataset_Description.ipynb
│   ├── 1_Scaling_and_Transforming.ipynb
│   ├── ... (same notebook structure as above)
│   └── 9_Plot.ipynb
├── ordinal_encoding/              # Ordinal encoding results and analysis
│   ├── processed_data.csv         # Processed data after ordinal encoding
│   ├── 0_Dataset_Description.ipynb
│   ├── ... (same notebook structure as above)
│   └── 9_Plot.ipynb
└── README.md                      # This file

Usage

raw_data.csv
    │
    ├── chemical_encoding/ → processed_data.csv → Analysis (0-9 notebooks)
    ├── onehot_encoding/   → processed_data.csv → Analysis (0-9 notebooks)
    └── ordinal_encoding/  → processed_data.csv → Analysis (0-9 notebooks)
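The workflow above can be sketched in code as locating the processed dataset for each encoding before opening its notebooks. The helper `processed_data_paths` is hypothetical and assumes the script runs from the repository root; it only builds paths and reports whether each file is present.

```python
from pathlib import Path

# Encoding folders as listed in the project structure.
ENCODINGS = ("chemical_encoding", "onehot_encoding", "ordinal_encoding")


def processed_data_paths(repo_root="."):
    """Map each encoding folder name to the path of its processed_data.csv."""
    root = Path(repo_root)
    return {name: root / name / "processed_data.csv" for name in ENCODINGS}


for name, path in processed_data_paths().items():
    status = "found" if path.exists() else "missing"
    print(f"{name}: {path} ({status})")
```

Each path that exists can then be loaded with `pandas.read_csv(path)` for use in the corresponding notebooks.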
