-
Notifications
You must be signed in to change notification settings - Fork 172
New hinge identification method #2179
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@jamesmkrieger, this is the code from Frank for finding Hinges (PNAS ref in the main block of the PR) that I mentioned before. @stoneandbone very kindly implemented it. Does this overall implementation look okay to you? |
jamesmkrieger
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. I just have a few changes, mainly for efficiency.
I didn’t understand the bits with the bands and merging regions but I imagine they are right.
It would also be good to have some unit tests that verify the type and shape of the output as well as the numerical values and how they are affected by the parameters.
|
Thanks very much for making this |
|
Thanks @jamesmkrieger! I will work with @stoneandbone to address these. |
|
@jamesmkrieger I have addressed all your suggested fixes, and @AnthonyBogetti and I added a unit test. Please review at your convenience! :) |
|
The (random) test failures here are due to the pfam tests which should be fixed in #2189 maybe we can merge that one first. |
I agree |
prody/dynamics/analysis.py
Outdated
| 6) Return list of hinges by mode. | ||
| :param modes: GNM model or ModeSet | ||
| :type modes: GNM or Mode instance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the format that sphinx can read like in other doc strings. Also put ModeSet again in type and put a more meaningful description than the types in the bit above, linking to what was said above about selecting modes.
Double check that param is ok too. We usually put arg
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A user could also provide vectors as inputs directly
prody/dynamics/analysis.py
Outdated
| :param modes: GNM model or ModeSet | ||
| :type modes: GNM or Mode instance | ||
| :param n_modes: Number of modes to consider. Defaults to all. | ||
| :type n_modes: int or None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check if or is ok. We usually put a comma. Again remember this should be machine readable as well as human readable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe handle default of None in the bit above as well or instead e.g. defaults to all modes if left as None
prody/dynamics/analysis.py
Outdated
| :param space: spacing between hinges. Defaults to None. Higher space value will minimize the number of local hinges while retaining global hinges. | ||
| :type space: int or None | ||
| :param trim: Whether to hinges at the N- or C-terminal ends by Protein length // 20. Defaults to False. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps there could be a parameter for how much counts as terminal with options for a fraction as a float less than 1 or a number of residues as an int from 1 onwards. Dividing by 20 is huge for some proteins e.g. 1000//20=50 residues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having atoms as inputs would also allow you to handle multiple chains, which is currently lacking. At the moment there would be merging across chain boundaries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d suggest the default as 5 residues
prody/dynamics/analysis.py
Outdated
| raise TypeError("Input must be a GNM model or ModeSet instance.") | ||
| else: | ||
| if gnm._kirchhoff is None: | ||
| raise ValueError("Kirchhoff matrix not built. Build Kirchhoff matrix before calculating modes.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a structure could be provided to build the Kirchhoff internally. Providing a gnm object with the Kirchoff built is a bit counterintuitive
prody/dynamics/analysis.py
Outdated
|
|
||
| vecs = gnm[:n_modes].getEigvecs() | ||
| else: | ||
| ### Use all provided modes for Hinge detection ### |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally simplify the nested if else logic or otherwise document it with a few more comments. It looks like this is the ModeSet case. You could maybe also put this input handling in a helper function so you can return the modes array. It could be easier for readability and could also be a good output to have for diagnostics.
it could also be good to return the calculated and selected gnm modes as an option more globally as well.
also, there seems to be no handling of the case where a gnm object is provided with modes calculated. In that case they should be used directly
prody/dynamics/analysis.py
Outdated
| vecs = gnm.getArray() | ||
|
|
||
|
|
||
| def detectHinges(v, threshold): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d suggest making this a separate function rather than a nested one so people can run it with their own vectors and threshold. It would also help with readability and maintainability
prody/dynamics/analysis.py
Outdated
| for i in crx: | ||
| r = [i, i + 1] | ||
|
|
||
| if abs(v[i]) > band and abs(v[i + 1]) > band: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again simplify nested if else blocks. ChatGPT suggests this. Please check if it makes sense
band = np.sqrt(1 / len(v)) / threshold
regs = []
for i in crx:
# Start hinge region around the sign change
start = i
end = i + 1
# Expand left
j = start
while j >= 0 and abs(v[j]) < band:
start = j
j -= 1
# Expand right
j = end
while j < len(v) and abs(v[j]) < band:
end = j
j += 1
# Store inclusive region
regs.append(list(range(start, end + 1)))
prody/dynamics/analysis.py
Outdated
| """ | ||
| [HZ384] H Zhang, M Gur, I Bahar (2024) Global hinge sites of proteins as target sites for drug binding Proc Natl Acad Sci USA 121 (49), e2414333121 | ||
| Updated hinge identification based on [HZ384]. This will: | ||
| 1) If GNM model is provided without specification of n_modes, number of modes will be determined to achieve 33% cumulative variance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess a different percentage could also be used as a decimal input to the number of modes
jamesmkrieger
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to reorganise for readability and usability. We’d have two step functions outside the main function that could be helpful in their own right (see below)
- one for subsetting modes by cumulative variance that could be helpful for PCA and maybe ANM modes too
- One for calculating hinges for a single mode
please include better control of termini and multiple chains. Also handle calculation of kirchhoff inside and provided gnm with already calculated mode
the tests would also need cases for multiple chains e.g. 3kg2 that we analysed in Dutta et 2015
I haven’t had the chance to look over the tests or to test it and I don’t think I will be able to. If you and @AnthonyBogetti are happy after these changes, then I am too
|
@stoneandbone Please see and address James' most recent round of comments. |
James' changes were addressed, and due to lack of time on his end for a re-review, we have dismissed the review.
|
Miraculously, the current tests all passed. |
|
[laugh] Kwan, Ming Suet reacted to your message:
…________________________________
From: Anthony Bogetti ***@***.***>
Sent: Friday, December 19, 2025 7:25:35 PM
To: prody/ProDy ***@***.***>
Cc: stoneandbone ***@***.***>; Mention ***@***.***>
Subject: Re: [prody/ProDy] New hinge identification method (PR #2179)
CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe.
[https://avatars.githubusercontent.com/u/34397256?s=20&v=4]AnthonyBogetti left a comment (prody/ProDy#2179)<#2179 (comment)>
Miraculously, the current tests all passed.
—
Reply to this email directly, view it on GitHub<#2179 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/BGTWRPRFJSHGLTUIQOXDSRD4CRGK7AVCNFSM6AAAAACIYTTJO6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZTMNZWGI3DQNRRGA>.
You are receiving this because you were mentioned.Message ID: ***@***.***>
This e-mail message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by e-mail and destroy all copies of the original.
|
| return gnm.getArray() | ||
|
|
||
| if not isinstance(gnm, GNMBase): | ||
| raise TypeError("Input must be a GNM model or ModeSet instance.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe allow gnm=None if atoms is not None as a new gnm can be instantiated? People are unlikely to input a newly instantiated gnm object themselves
[HZ384] Update: Enhanced Hinge Identification Function
This update refines the hinge site detection algorithm based on the methodology of [HZ384] H. Zhang, M. Gur, I. Bahar (2024), PNAS 121(49): e2414333121.
Summary of changes: