Reproducible amino acid–diamond interface dataset for scientific and ML applications.
Features:
- 20 amino acids
- 2 diamond facets: (100), (111)
- 2 terminations: H, C=O
- boron doping (per structure)
- 828 total structures with atomic detail
- exports: CIF, XYZ, CSV, NumPy, metadata
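
The categorical axes in the list above span 20 × 2 × 2 = 80 base combinations. How boron doping and any other variations bring the total to 828 structures is not stated here, so the following minimal sketch only enumerates the base grid. The three-letter residue codes and the names `AMINO_ACIDS`, `FACETS`, `TERMINATIONS`, and `base_combinations` are illustrative assumptions, not identifiers taken from the notebook.

```python
from itertools import product

# Assumption: the "20 amino acids" are the 20 standard proteinogenic ones,
# written here with their conventional three-letter codes.
AMINO_ACIDS = [
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
]
FACETS = ["100", "111"]        # diamond facets listed in the feature list
TERMINATIONS = ["H", "C=O"]    # surface terminations listed in the feature list


def base_combinations():
    """Return every (amino acid, facet, termination) triple."""
    return list(product(AMINO_ACIDS, FACETS, TERMINATIONS))


combos = base_combinations()
print(len(combos))  # 20 * 2 * 2 = 80
```

A grid like this is handy for checking that every combination is represented once the metadata table has been loaded.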
1. Open amidiam_dataset.ipynb in Google Colab or JupyterLab.
2. Run all cells.
   - All structure files, features, and metadata are written to amidiam_dataset/.
   - The complete dataset is also saved as a zip archive, amidiam_dataset.zip.
3. Demo visualization is included.
   - Several representative atomic interfaces are viewable interactively via py3Dmol in the notebook.
   - After running, download amidiam_dataset.zip to get all CIF/XYZ files, features, fingerprints, and metadata for further use.
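
After downloading the archive, a quick sanity check is to count its members by file extension. The sketch below uses only the Python standard library; the internal folder layout of amidiam_dataset.zip is not described here, so the helper `count_by_extension` (a hypothetical name) makes no assumption about it and simply walks every member.

```python
import zipfile
from collections import Counter
from pathlib import Path, PurePosixPath


def count_by_extension(zip_path):
    """Count the files in a zip archive, grouped by lowercase extension."""
    counts = Counter()
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries
            suffix = PurePosixPath(info.filename).suffix.lower() or "(none)"
            counts[suffix] += 1
    return counts


# Only runs if the archive is present in the working directory.
archive_path = Path("amidiam_dataset.zip")
if archive_path.exists():
    for ext, n in sorted(count_by_extension(archive_path).items()):
        print(f"{ext}: {n}")
```

Comparing the `.cif` and `.xyz` counts against the 828 structures quoted above is one way to confirm that the download is complete, assuming each structure is exported once per format.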
This repository provides structure generation and basic geometric features. Advanced analysis, computational results and extended feature engineering are detailed in the associated thesis and will be released upon publication.
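
The specific geometric features computed by the notebook are not listed here. As a hedged illustration of the kind of simple descriptor that can be derived from an exported XYZ file, the sketch below parses the standard XYZ layout (atom count, comment line, then one `element x y z` row per atom) and computes the atom count, elemental composition, centroid, and extent along z. The functions `parse_xyz` and `simple_descriptors` and the toy molecule are assumptions made for this example and are not part of the dataset.

```python
from collections import Counter


def parse_xyz(text):
    """Parse standard XYZ text into (symbols, coords)."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0].split()[0])
    symbols, coords = [], []
    # Line 0 is the atom count, line 1 is a free-form comment.
    for line in lines[2:2 + n_atoms]:
        parts = line.split()
        symbols.append(parts[0])
        coords.append(tuple(float(v) for v in parts[1:4]))
    if len(symbols) != n_atoms:
        raise ValueError(f"expected {n_atoms} atoms, found {len(symbols)}")
    return symbols, coords


def simple_descriptors(symbols, coords):
    """Return a few basic geometric descriptors (illustrative only)."""
    n = len(coords)
    centroid = tuple(sum(c[i] for c in coords) / n for i in range(3))
    z_values = [c[2] for c in coords]
    return {
        "n_atoms": n,
        "composition": dict(Counter(symbols)),
        "centroid": centroid,
        "z_extent": max(z_values) - min(z_values),
    }


# Toy input, not taken from the dataset.
example_xyz = """3
toy three-atom molecule
O 0.0 0.0 0.0
H 0.0 0.8 0.6
H 0.0 -0.8 0.6
"""

symbols, coords = parse_xyz(example_xyz)
descriptors = simple_descriptors(symbols, coords)
print(descriptors["n_atoms"], descriptors["composition"])
```

For real files, read the text with `Path(...).read_text()` and pass it to `parse_xyz`; for periodic structures the CIF exports carry the cell information that plain XYZ lacks.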
MIT License (see LICENSE)
@mastersthesis{onyshchenko2025,
  title={Simulation studies of receptor-target biosensory interactions supported by machine learning models},
  author={Onyshchenko, Olena},
  year={2025},
  school={Gdańsk University of Technology (Gdańsk Tech)}
}