Description
First of all, thank you for the great work you are doing on dimorphite_dl! It is simple to use and configure, and easy to incorporate into custom Python workflows.
My issue concerns the order in which dimorphite_dl returns the protonated variants of a molecule. I noticed it when running dimorphite_dl from a Python environment on methotrexate:
    import dimorphite_dl

    smiles = 'CN(Cc1cnc2[nH+]c(N)nc(N)c2n1)c1ccc(C(=O)N[C@@H](CCC(=O)[O-])C(=O)[O-])cc1'
    smiles_variants = dimorphite_dl.protonate_smiles(
        smiles, ph_min=6.0, ph_max=9.0, max_variants=512
    )
    # Write one variant per line with a running index, then the input SMILES.
    with open('MT1.smi', "w") as f:
        for i, smi in enumerate(smiles_variants, 1):
            f.write(f"{smi} {i}\n")
        f.write(f"{smiles} orig\n")

I ran this twice and the variants came back in a different order each time. The set of variants was, however, the same: when I sorted them (simply using sorted(smiles_variants)), the output files were exactly the same.
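For reference, here is a minimal sketch of how I compare two runs and write the file in a stable order. It assumes the variants are plain SMILES strings in a list; the helper names `same_variants` and `write_sorted` are my own and are not part of dimorphite_dl, and the short SMILES strings are placeholders rather than real output.

```python
from collections import Counter
from pathlib import Path


def same_variants(run_a, run_b):
    """Return True if both runs contain the same SMILES strings, ignoring order."""
    return Counter(run_a) == Counter(run_b)


def write_sorted(variants, original, path):
    """Write variants in lexicographic order so repeated runs give identical files."""
    lines = [f"{smi} {i}\n" for i, smi in enumerate(sorted(variants), 1)]
    lines.append(f"{original} orig\n")
    Path(path).write_text("".join(lines))


# Placeholder strings standing in for two runs (not real dimorphite_dl output).
run_1 = ["CC[NH3+]", "CCN", "CC(=O)[O-]"]
run_2 = ["CCN", "CC(=O)[O-]", "CC[NH3+]"]

print(same_variants(run_1, run_2))
```

Sorting here is plain string ordering, so it only makes the output reproducible; it says nothing about which variant is chemically more relevant.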
I use dimorphite_dl v2.0.2.
Is this a feature or a bug? What is the reason for this behavior?
Thank you very much for your answer and for your ongoing work!
P.S.: Since the output files are rather long, I'll paste only the first 10 lines from each try here.
Try no. 1:
C[NH+](Cc1c[nH+]c2nc(N)nc(N)c2n1)c1ccc(C(=O)[N-][C@@H](CCC(=O)[O-])C(=O)[O-])cc1 1
C[NH+](Cc1c[nH+]c2[nH+]c(N)nc(N)c2n1)c1ccc(C(=O)[N-][C@@H](CCC(=O)[O-])C(=O)[O-])cc1 2
C[NH+](Cc1c[nH+]c2[nH+]c(N)[nH+]c(N)c2n1)c1ccc(C(=O)[N-][C@@H](CCC(=O)[O-])C(=O)[O-])cc1 3
CN(Cc1c[nH+]c2nc(N)nc(N)c2n1)c1ccc(C(=O)[N-][C@@H](CCC(=O)[O-])C(=O)[O-])cc1 4
CN(Cc1cnc2[nH+]c(N)nc(N)c2[nH+]1)c1ccc(C(=O)N[C@@H](CCC(=O)[O-])C(=O)[O-])cc1 5
CN(Cc1cnc2nc(N)[nH+]c(N)c2[nH+]1)c1ccc(C(=O)[N-][C@@H](CCC(=O)[O-])C(=O)[O-])cc1 6
CN(Cc1cnc2nc(N)nc(N)c2n1)c1ccc(C(=O)N[C@@H](CCC(=O)[O-])C(=O)[O-])cc1 7
CN(Cc1c[nH+]c2nc(N)[nH+]c(N)c2[nH+]1)c1ccc(C(=O)N[C@@H](CCC(=O)[O-])C(=O)[O-])cc1 8
C[NH+](Cc1c[nH+]c2nc(N)nc(N)c2[nH+]1)c1ccc(C(=O)N[C@@H](CCC(=O)[O-])C(=O)[O-])cc1 9
C[NH+](Cc1cnc2nc(N)nc(N)c2[nH+]1)c1ccc(C(=O)[N-][C@@H](CCC(=O)[O-])C(=O)[O-])cc1 10
Try no. 2:
CN(Cc1cnc2nc(N)[nH+]c(N)c2[nH+]1)c1ccc(C(=O)N[C@@H](CCC(=O)[O-])C(=O)[O-])cc1 1
C[NH+](Cc1c[nH+]c2nc(N)nc(N)c2[nH+]1)c1ccc(C(=O)N[C@@H](CCC(=O)[O-])C(=O)[O-])cc1 2
C[NH+](Cc1c[nH+]c2[nH+]c(N)nc(N)c2[nH+]1)c1ccc(C(=O)N[C@@H](CCC(=O)[O-])C(=O)[O-])cc1 3
C[NH+](Cc1c[nH+]c2[nH+]c(N)nc(N)c2n1)c1ccc(C(=O)[N-][C@@H](CCC(=O)[O-])C(=O)[O-])cc1 4
CN(Cc1cnc2[nH+]c(N)nc(N)c2[nH+]1)c1ccc(C(=O)[N-][C@@H](CCC(=O)[O-])C(=O)[O-])cc1 5
C[NH+](Cc1cnc2nc(N)nc(N)c2[nH+]1)c1ccc(C(=O)N[C@@H](CCC(=O)[O-])C(=O)[O-])cc1 6
C[NH+](Cc1cnc2nc(N)[nH+]c(N)c2n1)c1ccc(C(=O)N[C@@H](CCC(=O)[O-])C(=O)[O-])cc1 7
C[NH+](Cc1c[nH+]c2nc(N)[nH+]c(N)c2[nH+]1)c1ccc(C(=O)[N-][C@@H](CCC(=O)[O-])C(=O)[O-])cc1 8
C[NH+](Cc1c[nH+]c2nc(N)nc(N)c2n1)c1ccc(C(=O)[N-][C@@H](CCC(=O)[O-])C(=O)[O-])cc1 9
C[NH+](Cc1cnc2[nH+]c(N)nc(N)c2n1)c1ccc(C(=O)[N-][C@@H](CCC(=O)[O-])C(=O)[O-])cc1 10