Docking script fails with ZeroDivisionError due to data loader issue #40

@ojtgogogo

Description

Hi,

First, thank you for developing and sharing this exciting tool.

I have been trying to set up and run the SurfDock pipeline. After an extensive debugging process, I have successfully configured the environment and the entire data preparation pipeline. However, I'm encountering a persistent ZeroDivisionError at the final docking step, which seems to stem from the data loader.

Crucially, this error occurs not only with my own data but also with the provided 1a0q official sample data.

Environment Details
OS: Ubuntu 24.04.1 LTS (running on WSL2)

GPU: NVIDIA GeForce RTX 3060

NVIDIA-SMI Output:

(SurfDock) ojt@pc:~/final_test/protein$ nvidia-smi
Tue Jul 8 11:13:25 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.75 Driver Version: 566.24 CUDA Version: 12.7 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3060 On | 00000000:01:00.0 On | N/A |
| 38% 34C P8 15W / 153W | 702MiB / 12288MiB | 6% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
Conda Environment: Created with python=3.10 and all dependencies from environment.yaml were installed.

Problem Description
I have created a modular 3-step pipeline: (1) Protein preparation, (2) Surface (.ply) generation with all features, and (3) Docking/Scoring.

Steps 1 and 2 run successfully. For any given protein (e.g., 1a0q), the pipeline correctly generates a feature-rich .ply file, including successful execution of MSMS and APBS.
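For reference, the way I sanity-check that a generated .ply actually carries feature properties is a small header dump (stdlib only; the exact property names depend on the surface-generation script, so treat this as a sketch rather than part of SurfDock):

```python
def read_ply_header(path):
    """Return the header lines of a PLY file (ASCII or binary), up to and
    including the 'end_header' marker, so element/property declarations
    such as 'element vertex N' can be inspected."""
    lines = []
    with open(path, "rb") as fh:
        for raw in fh:
            line = raw.decode("ascii", errors="replace").strip()
            lines.append(line)
            if line == "end_header":
                break
    return lines
```

Printing these lines quickly shows whether the vertex elements declare the expected chemical/geometric feature properties.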

In Step 3, the construct_csv_input.py script also runs successfully, creating a non-empty CSV file with all the required columns (protein_path, ligand_path, protein_surface, etc.).
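To rule out a malformed CSV as the cause, I verify the header with a quick stdlib check; the column names below are the ones from my run, so adjust them if your construct_csv_input.py emits different ones:

```python
import csv

# Columns I expect in docking_tasks.csv (from my run; adjust as needed)
REQUIRED = {"protein_path", "ligand_path", "protein_surface"}

def check_columns(path, required=REQUIRED):
    """Return the set of required column names missing from the CSV header.
    An empty set means the header looks complete."""
    with open(path, newline="") as fh:
        header = set(next(csv.reader(fh)))
    return required - header
```

In my case this returns an empty set, so the CSV itself does not appear to be the problem.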

The ESM embedding scripts are then called and also appear to complete successfully.

The pipeline then fails when running inference_accelerate.py.

Error Analysis
The log for inference_accelerate.py shows Loading data ........... followed by an empty progress bar (0it [00:00, ?it/s]).

This indicates that the ScreenDataset data loader is returning zero valid molecules, even when using the official 1a0q_ligand_for_Screen.sdf library.

This leads to the following ZeroDivisionError, as the script tries to divide by the number of loaded molecules, which is 0. The subsequent rescoring script also fails as no docking output is generated.
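As a sanity check that the loader's silence is not simply an unreadable SDF, the record count can be verified without RDKit by counting the `$$$$` record terminators (a rough check; it assumes a well-formed multi-record SDF):

```python
def count_sdf_records(path):
    """Count molecule records in an SD file by its '$$$$' record
    terminators. A result of 0 would explain an empty data loader."""
    with open(path) as fh:
        return sum(1 for line in fh if line.strip() == "$$$$")
```

The official 1a0q_ligand_for_Screen.sdf does contain records by this measure, so the molecules are being dropped somewhere inside ScreenDataset rather than being absent from the file.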

Request
Since this issue persists even with the official sample data, it suggests a potential bug or a very specific, undocumented dependency/environment incompatibility within the data loader of inference_accelerate.py.

Could you please provide any insight or guidance on what might be causing the data loader to fail silently?

Thank you for your time and help.
Juntaek

(SurfDock) ojt@pc:~/final_test/protein$ bash ~/SurfDock/bash_scripts/03_run_docking.sh 1a0q ~/final_test/ligands/1a0q_ligand_for_Screen.sdf

Step 3: starting docking and rescoring...
Using PLY file: /home/ojt/final_test/ply_files/1a0q.ply

Creating task list (CSV)...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 26051.58it/s]
CSV creation finished. Wrote 1 rows to /home/ojt/Screen_result/processed_data/1a0q_final_run/input_csv_files//docking_tasks.csv
Generating ESM embeddings...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 28.01it/s]
Transferred model to GPU
Read /home/ojt/Screen_result/processed_data/1a0q_final_run/esmbedding/protein.fasta with 2 sequences
Processing 1 of 1 batches (2 sequences)
have done ,just skip!: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 8473.34it/s]
Assertion_list: []
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 388.24it/s]
Starting docking sampling...
The following values were not passed to accelerate launch and had defaults used instead:
--num_machines was set to a value of 1
--mixed_precision was set to a value of 'no'
--dynamo_backend was set to a value of 'no'
To avoid this warning pass in values for each of the problematic parameters or run accelerate config.
/home/ojt/precomputed
/home/ojt/precomputed
2025-07-08 11:15:34.276 | INFO | main::90 - Runing inference script in path: /home/ojt/final_test/protein
2025-07-08 11:15:34.277 | INFO | main::91 - Runing inference with args: Namespace(config=None, data_csv='/home/ojt/Screen_result/processed_data/1a0q_final_run/input_csv_files//docking_tasks.csv', model_dir='/home/ojt/SurfDock/model_weights/docking', ckpt='best_ema_inference_epoch_model.pt', confidence_model_dir='/home/ojt/SurfDock/model_weights/posepredict', confidence_ckpt='best_model.pt', save_docking_result=True, ligand_to_pocket_center=False, keep_input_pose=False, use_noise_to_rank=False, num_cpu=None, run_name='test_ns_48_nv_10_layer_62023-06-25_07-54-08_model', project='1a0q_final_run', surface_path='/PDBBind_processed_8A_surface/', esm_embeddings_path='/home/ojt/Screen_result/processed_data/1a0q_final_run/esmbedding/esm_embeddings.pt', out_dir='/home/ojt/Screen_result/docking_result/1a0q_final_run', batch_size=40, batch_size_molecule=1, cache_path='/PDBBIND/cache_PDBBIND_pocket_8A', data_dir='/PDBBIND/PDBBind_pocket_8A/', split_path='/data/splits/timesplit_test', no_overlap_names_path='/data/splits/timesplit_test_no_rec_overlap', no_model=False, no_random=False, no_final_step_noise=False, ode=False, wandb=False, wandb_dir='/test_workdir', inference_steps=20, limit_complexes=0, num_workers=1, num_process=20, tqdm=False, save_visualisation=False, samples_per_complex=40, save_docking_result_number=1, actual_steps=None, inference_mode='Screen', head_index=0, tail_index=-1, ligandsMaxAtoms=80, random_seed=42, force_optimize=False, mdn_dist_threshold_test=3.0)
device cuda:0 is used!
2025-07-08 11:15:36.710 | INFO | main:main_function:165 - loaded model weight for score model
2025-07-08 11:15:38.677 | INFO | main:main_function:188 - t schedule:[1. 0.95 0.9 0.85 0.8 0.75 0.7 0.65 0.6 0.55 0.5 0.45 0.4 0.35
0.3 0.25 0.2 0.15 0.1 0.05]
2025-07-08 11:15:38.678 | INFO | main:main_function:189 - Loading data ...........
0it [00:00, ?it/s]
Traceback (most recent call last):
File "/home/ojt/SurfDock/inference_accelerate.py", line 470, in
main_function()
File "/home/ojt/SurfDock/inference_accelerate.py", line 445, in main_function
logger.info('Docking time used for one moleculer: {}',docking_time/ all_molecules)
ZeroDivisionError: float division by zero
[2025-07-08 11:15:39,749] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 3431) of binary: /home/ojt/miniconda3/envs/SurfDock/bin/python3.10
▒▒▒▒▒ھ(Rescoring) ▒▒▒▒...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 25731.93it/s]
CSV creation finished. Wrote 1 rows to /home/ojt/Screen_result/processed_data/1a0q_final_run/input_csv_files//score_inplace.csv
The following values were not passed to accelerate launch and had defaults used instead:
--num_machines was set to a value of 1
--mixed_precision was set to a value of 'no'
--dynamo_backend was set to a value of 'no'
To avoid this warning pass in values for each of the problematic parameters or run accelerate config.
/home/ojt/precomputed
/home/ojt/precomputed
device cuda:0 is used!
0%| | 0/1 [00:00<?, ?it/s]2025-07-08 11:15:48.427 | INFO | score_in_place_dataset.score_dataset:get_complex:171 - Processing 1a0q
2025-07-08 11:15:48.563 | INFO | datasets.process_mols:extract_receptor_structure:359 - Found 416 LM embeddings for 416 residues
2025-07-08 11:16:00.570 | INFO | main:main_function:173 - Size of test dataset:
0%| | 0/3 [00:01<?, ?it/s]
2025-07-08 11:16:02.261 | ERROR | main:main_function:183 - File loading failed for ligand /home/ojt/final_test/ligands/1a0q_ligand_for_Screen.sdf. Error: | 0/3 [00:00<?, ?it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:13<00:00, 13.88s/it]
2025-07-08 11:16:02.262 | INFO | main:main_function:187 - screen time used:
=================================================
All pipeline steps completed!
Check the results at the path below:
/home/ojt/Screen_result/docking_result/1a0q_final_run
=================================================
(SurfDock) ojt@pc:~/final_test/protein$
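For what it's worth, the crash itself (as distinct from the underlying empty-loader problem) comes from dividing by the molecule count unconditionally. A guard along these lines would turn the crash into a clear warning; this is a sketch against the line shown in the traceback, using stdlib logging, not the actual SurfDock code:

```python
import logging

def log_average_docking_time(logger, docking_time, all_molecules):
    """Log the mean docking time per molecule, but warn instead of raising
    ZeroDivisionError when the data loader produced zero molecules
    (the situation observed in this issue)."""
    if all_molecules == 0:
        logger.warning("No molecules were loaded; skipping timing summary.")
        return None
    avg = docking_time / all_molecules
    logger.info("Docking time used for one molecule: %s", avg)
    return avg
```

Even with such a guard, the real question remains why ScreenDataset yields zero molecules for the official sample data.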
