A Python tool for generating molecular conformations using RDKit's ETKDGv3 algorithm and optimizing them with UFF and MMFF force fields.
This project provides a simple and flexible way to generate multiple conformers of molecules, align them, optimize with different force fields, and output the results in SDF format with RMSD values for each conformer.
-
First, clone the repository:
git clone https://github.com/your-username/conformation-generator.git cd conformation-generator -
Install the required dependencies (RDKit): RDKit can be installed via conda:
conda create -c conda-forge -n rdkit-env rdkit conda activate rdkit-env
-
If you're using another environment, ensure that RDKit and Python dependencies are properly installed.
You can generate molecular conformations by running the `generate_conformer.py` script.
python generate_conformer.py --sdf <input_sdf> --out <output_sdf> --num <number_of_conformers> --max_iter <max_iterations> --n_cpu <number_of_cpu> --verbosepython generate_conformer.py --sdf example/molecule.sdf --out example/molecule_conf.sdf --num 100 --max_iter 200 --n_cpu 4 --verbose- Input: The script accepts a molecule in SDF format (`.sdf`) as input.
- Conformer Generation: It generates multiple conformers using the ETKDGv3 algorithm.
- Optimization: Two force fields (UFF and MMFF) are applied to optimize the generated conformers.
- Alignment and RMSD Calculation: The conformers are aligned, and the RMSD values for each conformer are computed.
- Output: The results are saved in an SDF file with each conformer and its associated RMSD value.
| Argument | Description | Default | Required |
|---|---|---|---|
| --sdf | Input SDF file with the molecule | N/A | Yes |
| --out | Output SDF file to save the generated conformers | _conf.sdf | No |
| --num | Number of conformations to generate | 100 | No |
| --max_iter | Maximum iterations for force field optimization | 200 | No |
| --n_cpu | Number of CPU cores to use for the calculations | 0 (all cores) | No |
| --verbose | Enable verbose output to display detailed information during execution | False | No |
This repository comes with an example directory that includes a sample molecule (`molecule.sdf`) and a sample run script (`run.sh`).
To try it out, simply run:
cd example
bash run.shHere’s the breakdown of what happens:
- The script reads the input molecule from
molecule.sdf. - It generates 100 conformers of the molecule.
- UFF and MMFF optimizations are applied.
- Conformers are aligned and written to
molecule_conf.sdf, with the RMSD values stored in each conformer entry.
The output will be an SDF file containing:
- All the generated conformers.
- RMSD values for each conformer, calculated based on the alignment with the reference conformer.
Each conformer in the output SDF file will have a property named RMSD that indicates how far it is (in terms of RMSD) from the first conformer. This allows you to easily assess the diversity of the generated conformations.
This project is licensed under the MIT License - see the LICENSE file for details.
Feel free to open issues or submit pull requests if you encounter any problems or have suggestions for improvements.