Description
Notes on PubChemFingerprints.py module
Issue in SMARTS implementation:
Feature #60 is a SMARTS pattern that detects the chemical element cobalt (Co).
The entry implemented in PyBioMed's smartsPatts dict is: 60:('[CO]', 0),
The entry should be: 60:('[Co]', 0),
Issues in the ring detection algorithm features:
Application case 1:
When applied to "C=CCC1CCCC1", the features hit are:
144 >= 1 any ring size 5
145 >= 1 saturated or aromatic carbon-only ring size 5
146 >= 1 saturated or aromatic nitrogen-containing ring size 5
147 >= 1 saturated or aromatic heteroatom-containing ring size 5
Discussion on obtained fingerprint in Application case 1:
Hit on 144 is correct.
Hit on 145 is correct, but only by chance; the algorithm in func_2 is wrong (see Proposed fixes)
Hit on 146 is wrong; the algorithm in func_3 is wrong (see Proposed fixes)
Hit on 147 is wrong; the algorithm in func_4 is wrong (see Proposed fixes)
Proposed fixes:
func_2(mol,bits):
The feature description: saturated or aromatic carbon-only ring
The PyBioMed algorithm test: (saturated) OR (aromatic AND carbon-only)
The algorithm test should be: (saturated AND carbon-only) OR (aromatic AND carbon-only) # or some other equivalent
func_3(mol,bits):
The feature description: saturated or aromatic nitrogen-containing
The PyBioMed algorithm test: (saturated) OR (aromatic AND nitrogen-containing)
The algorithm test should be: (saturated AND nitrogen-containing) OR (aromatic AND nitrogen-containing) # or some other equivalent
func_4(mol,bits):
The feature description: saturated or aromatic heteroatom-containing
The PyBioMed algorithm test: (saturated) OR (aromatic AND heteroatom-containing)
The algorithm test should be: (saturated AND heteroatom-containing) OR (aromatic AND heteroatom-containing) # or some other equivalent
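The difference between the two tests can be illustrated with a minimal, RDKit-free sketch. The Ring class, buggy_test, fixed_test and ring_bits are hypothetical names made up for this illustration; they are not PyBioMed or RDKit code, and the ring properties are filled in by hand for the five-membered ring of "C=CCC1CCCC1" (saturated, carbon-only).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ring:
    """Hand-written summary of one ring (hypothetical, not a PyBioMed/RDKit type)."""
    saturated: bool
    aromatic: bool
    carbon_only: bool
    has_nitrogen: bool
    has_heteroatom: bool


def buggy_test(ring, prop):
    # Test as described in this issue: (saturated) OR (aromatic AND property)
    return ring.saturated or (ring.aromatic and prop)


def fixed_test(ring, prop):
    # Equivalent to (saturated AND property) OR (aromatic AND property)
    return (ring.saturated or ring.aromatic) and prop


def ring_bits(ring, test):
    # Feature numbers follow Application case 1 (ring size 5)
    return {
        145: test(ring, ring.carbon_only),
        146: test(ring, ring.has_nitrogen),
        147: test(ring, ring.has_heteroatom),
    }


# Five-membered ring of "C=CCC1CCCC1": saturated, carbon atoms only
cyclopentane_ring = Ring(
    saturated=True,
    aromatic=False,
    carbon_only=True,
    has_nitrogen=False,
    has_heteroatom=False,
)

buggy = ring_bits(cyclopentane_ring, buggy_test)
fixed = ring_bits(cyclopentane_ring, fixed_test)
print(buggy)  # {145: True, 146: True, 147: True}
print(fixed)  # {145: True, 146: False, 147: False}
```

With the first test, any saturated ring sets all three bits regardless of its atoms, which reproduces the hits in Application case 1; with the second test, only 145 remains set.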
PS: If anyone is interested in a corrected implementation of these functions, send me your contact information and I'll be happy to share my code with you.