Benchmarking Band Gap Prediction For Semiconductor Materials Using Multimodal And Multi-fidelity Data

This repository contains the PyTorch Lightning implementation of the benchmark described in our paper "Benchmarking Band Gap Prediction For Semiconductor Materials Using Multimodal And Multi-fidelity Data". We compiled a new multimodal, multi-fidelity dataset from the Materials Project and BandgapDatabase1, consisting of 60,218 low-fidelity computational band gaps and 1,183 high-fidelity experimental band gaps. We evaluated seven ML models: three traditional methods (linear regression, random forest regression, and support vector regression) and four GNNs (CGCNN, CartNet, LEFTNet-Z, and LEFTNet-Prop).

Repository Structure

cif_file.zip - Contains .cif files and the atomic encoding file used in the benchmark.

data/ - Directory containing MPIDs and corresponding band gap values:

pretrain_data.json - 60,218 PBE band gap values.
fine_tune/ - Experimental band gap values.
data_by_type/ - Data used for "leave-one-material-out" splits, categorized by material type.

configs/ - Configuration files for training models.

models/ - Implementations of baseline models.

loaddata/ - Data preparation, splitting, and processing.

leave_one_material_out/ - Scripts and data for running leave-one-material-out experiments.

saved_models/ - Pretrained models.
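
The "leave-one-material-out" splits mentioned above can be illustrated with a minimal sketch: group the records by material type, then hold out one type at a time for testing. This is not the code in leave_one_material_out/; the record layout, the "material_type" key, and the toy values are assumptions made purely for illustration.

```python
from collections import defaultdict


def leave_one_material_out(records):
    """Yield (held_out_type, train, test) once per material type.

    `records` is a list of dicts carrying a hypothetical "material_type" key;
    the actual files in data_by_type/ may be organised differently.
    """
    by_type = defaultdict(list)
    for rec in records:
        by_type[rec["material_type"]].append(rec)

    for held_out in sorted(by_type):
        test = by_type[held_out]
        # Every record whose type differs from the held-out one is used for training.
        train = [r for t, recs in by_type.items() if t != held_out for r in recs]
        yield held_out, train, test


# Toy records: identifiers, types, and band gaps are made up.
records = [
    {"id": "toy-1", "material_type": "oxide", "band_gap": 3.0},
    {"id": "toy-2", "material_type": "oxide", "band_gap": 2.5},
    {"id": "toy-3", "material_type": "nitride", "band_gap": 1.8},
    {"id": "toy-4", "material_type": "sulfide", "band_gap": 2.1},
]

for held_out, train, test in leave_one_material_out(records):
    print(f"held out {held_out}: {len(train)} train / {len(test)} test")
```

Holding out a whole material type, rather than random samples, tests how well a model transfers to a class of materials it has never seen.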

Installation

Install dependencies with:

pip install -r requirements.txt
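
The prediction command further down passes --cif_folder cif_file, so cif_file.zip presumably has to be unpacked before use. The sketch below does this with Python's standard zipfile module. The function name is hypothetical, and whether the archive unpacks into a cif_file/ directory is an assumption that should be checked against the actual archive.

```python
import zipfile
from pathlib import Path


def extract_cif_archive(archive="cif_file.zip", target="."):
    """Unpack `archive` into `target`; return how many .cif files are now under `target`."""
    target = Path(target)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    # Count recursively, since the archive layout is not documented here.
    return sum(1 for _ in target.rglob("*.cif"))


if __name__ == "__main__":
    if Path("cif_file.zip").exists():
        print(extract_cif_archive(), ".cif files available")
```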

Training

To train a model, run the command below. By default it performs k-fold training; add --pretrain to run pretraining once instead:

python main.py --cfg configs/PATH_TO_YOUR_CONFIG.yaml
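
For readers unfamiliar with k-fold training, here is a generic sketch of how sample indices are partitioned so that each fold serves once as the validation set. It is not the splitting logic from loaddata/. The number of folds, the seed, and the function name are assumptions chosen for illustration.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, val_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    folds = [indices[i::k] for i in range(k)]  # k disjoint, near-equal slices
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


for fold, (train, val) in enumerate(k_fold_indices(n_samples=10, k=5)):
    print(f"fold {fold}: {len(train)} train / {len(val)} val")
```

With k folds, the model is trained k times, and every sample is used for validation exactly once.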

After training, predictions can be generated using:

python test_model.py --cfg configs/PATH_TO_YOUR_CONFIG.yaml --checkpoint saved_models/PATH_TO_YOUR_MODEL.ckpt --cif_folder cif_file --test_data data/fine_tune/test_data.json
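
When several checkpoints need to be evaluated, the command above can be assembled programmatically. The sketch below uses only the flags shown in that command. The helper names are hypothetical, and the config and checkpoint paths are the same placeholders as above, so they must be replaced with real files before a non-dry run.

```python
import subprocess
import sys


def build_test_command(cfg, checkpoint, cif_folder="cif_file",
                       test_data="data/fine_tune/test_data.json"):
    """Return the argument list for one test_model.py invocation."""
    return [
        sys.executable, "test_model.py",
        "--cfg", cfg,
        "--checkpoint", checkpoint,
        "--cif_folder", cif_folder,
        "--test_data", test_data,
    ]


def run_for_checkpoints(cfg, checkpoints, dry_run=True):
    """Print (dry run) or execute the test command for each checkpoint."""
    commands = [build_test_command(cfg, ckpt) for ckpt in checkpoints]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # raises if the script exits non-zero
    return commands


# Placeholder paths, as in the command above.
run_for_checkpoints(
    "configs/PATH_TO_YOUR_CONFIG.yaml",
    ["saved_models/PATH_TO_YOUR_MODEL.ckpt"],
)
```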

Shef-AIRE/bandgap-benchmark
