A pipeline for analyzing protein structures to identify cryptic pockets.
Create and activate the conda environment:
conda env create -f environment.yml
conda activate database

Before running the pipeline, set FETCH_PATH in monomer_calcs.py to the local directory that holds your CIF files.
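As an illustration, the edit to FETCH_PATH might look like the sketch below. The path shown is a placeholder. That FETCH_PATH is a module-level variable, and that the script accepts a pathlib.Path rather than a plain string, are both assumptions to check against monomer_calcs.py itself. The existence check is an optional addition, not part of the original script.

```python
from pathlib import Path

# Placeholder location: replace with the directory that holds your CIF files.
# Assumption: FETCH_PATH is a module-level variable holding a directory path.
FETCH_PATH = Path.home() / "cif_files"

# Optional sanity check before starting a long run (hypothetical addition).
if not FETCH_PATH.is_dir():
    print(f"Warning: FETCH_PATH does not exist yet: {FETCH_PATH}")
```

If the script builds file names by string concatenation, a plain string with a trailing separator may be needed instead.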
Run the pipeline:
python monomer_calcs.py [options]

Options:
--n_jobs: Number of parallel jobs (default: 16)
--input_list: Use predefined PDB IDs
--update_previous: Update existing data
--max_res: Maximum resolution threshold in Å (default: 2.5)
--custom_folder: Custom data folder (default: data)
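
To make the option types and defaults concrete, here is a minimal sketch of how these options could be declared with Python's argparse. It is not taken from monomer_calcs.py: whether the script uses argparse at all, and whether --input_list and --update_previous are simple on/off flags, are assumptions. Only the option names and the listed defaults come from the text above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options."""
    parser = argparse.ArgumentParser(prog="monomer_calcs.py")
    parser.add_argument("--n_jobs", type=int, default=16,
                        help="Number of parallel jobs")
    # Assumption: treated here as boolean flags; the real script may differ.
    parser.add_argument("--input_list", action="store_true",
                        help="Use predefined PDB IDs")
    parser.add_argument("--update_previous", action="store_true",
                        help="Update existing data")
    parser.add_argument("--max_res", type=float, default=2.5,
                        help="Maximum resolution threshold in Å")
    parser.add_argument("--custom_folder", default="data",
                        help="Custom data folder")
    return parser


# Equivalent to: python monomer_calcs.py --n_jobs 8 --max_res 2.0
args = build_parser().parse_args(["--n_jobs", "8", "--max_res", "2.0"])
print(args.n_jobs, args.max_res, args.custom_folder)
```

Options that are not passed keep their defaults, so in this example custom_folder stays "data" and both flags stay off.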

monomer_calcs.py: Main pipeline for data collection and processing
scoring_function.py: Scoring system for pocket analysis
get_sites.py: Identifies and clusters binding sites
refine_smiles.py: Processes and refines ligand information

data/: Main data directory
monomer_calcs/: Processed structure data
xyz_files/: Structure coordinates
xyz_files_local/: Local structure coordinates
checkpoints/: Processing checkpoints