Skip to content

Conversation

@mateuszanotto
Copy link

@mateuszanotto mateuszanotto commented Sep 13, 2022

Added amber file format support (.itp)

Added a progress bar while qforce is being initialized (importing modules)

(there are more commits than necessary 'cause I was still figuring it out how to do it properly)

@selimsami
Copy link
Owner

Hi Mateusz,
The changes look very exciting, but the CI seems to be failing. Could you please see what's causing that? Thanks!

@mateuszanotto
Copy link
Author

Oh, i will look into it!

@mateuszanotto
Copy link
Author

I believe it was only missing the misc.py with the logo with # on front, for the .mol2 files

Copy link
Owner

@selimsami selimsami left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Mateus,
Sorry for the delay in answering, was a bit busy with the move!
Thanks a lot for the additions.
The amber part seems to work well, well done!
I haven't managed to get the loading bar working, but I'll check this a bit more... Maybe some problem on my end.
In the meanwhile please have a look at the comments I made.
Cheers,
Selim

Comment on lines 594 to 611
def get_atom_types(self, topo, non_bonded):
atom_ids={} #dictonary containing original atom types
unique_at={} #dictonary containing new unique atom types
unique_masses={}

ascii_lowercase = list(string.ascii_lowercase)
ascii_digits = list(string.digits)
ascii_uppercase = list(string.ascii_uppercase)

for i, (lj_type, a_name) in enumerate(zip(non_bonded.lj_types, self.atom_names), start=1):
atom_ids[i]=lj_type
if self.n_atoms < 36:
if i <= 10:
unique_at[i]={atom_ids[i] : "Q{}".format(ascii_digits[i-1])}
elif i > 10:
unique_at[i]={atom_ids[i] : "Q{}".format(ascii_lowercase[i-11])}
else:
unique_at[i]={atom_ids[i] : "{}{}".format(ascii_uppercase[i-1], ascii_lowercase[i-1])}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you explain what is happening here? I didn't fully understand what you did with the ascii characters and why you needed that

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Amber, if there are two CA-CA bonds, the second CA-CA entry overwrites the parameters for the first.
Because of that, we used ascii characters to create unique atom types for each bond

@mateuszanotto
Copy link
Author

Hi Selim,
Thank you so much for the comments!
I will work on them right away
Kind regards,
Mateus

@mateuszanotto
Copy link
Author

Hi Selim,
I uploaded the requested changes:
moved the ff.write(...) to the forcefield.py, removed unnecessary comments, removed Loading Bar from this PR and changed the styling to PEP8.

@mateuszanotto
Copy link
Author

mateuszanotto commented Apr 10, 2023

Yeah, I understand why this won't work for more complex cases.
It would be helpful if we tried to figure it out, thank you.

Amber uses this functional form:
Screenshot 2023-04-10 at 12 13 01

and it is written on the output as:
Screenshot 2023-04-10 at 12 09 43

I used the RB parameters (Cn) inside QForce as the Potential barrier.
In this graph, I plotted the Etorsion on Amber form using Cn/2 vs Gromacs form using Cn.
Screenshot 2023-04-03 at 10 00 53

One thing that I did was to see how Amber prints the parameter file for propane using gaff2. It prints the dihedrals with a single term and improper dihedrals, which I may be missing. Do Fourier/RB dihedrals cover both cases?

@selimsami
Copy link
Owner

They seem to have figured out the conversion in Parmed:

https://github.com/ParmEd/ParmEd/pull/837/files

More specifically:

        phi = 0 * u.degrees
        fc0 = 1.0*c0 + 0.5*c2 + 0.3750*c4
        fc1 = 1.0*c1 + 0.75*c3 + 0.6250*c5
        fc2 = 0.5*c2 + 0.5*c4
        fc3 = 0.25*c3 + 0.3125*c5
        fc4 = 0.125*c4
        fc5 = 0.0625*c5

Perhaps this is something we can adapt?

@mateuszanotto
Copy link
Author

I adapted the conversions as in Parmed and the results are better, but still off
propaneRB

I noticed that QForce is printing the dihedrals H-C-C-C and C-C-C-H.
I added manually the H-C-C-H dihedral and the vib. freq. get close to the QM reference. I think this may be what was missing. The .itp also doesn't print it, so I suppose Gromacs don't need it. Is there a way to retrieve this dihedral inside QForce?
propaneHCCH

@selimsami
Copy link
Owner

Hi @mateuszanotto ,
Could you push the latest version of the code so I can have a look at it?
Q-Force does not put a dihedral function in the non-primary direction of the dihedral (in this case: H-C-C-C), so I don't think that's the issue. And if it was, the same issue would also exist in GROMACS.

Conversion of RB terms
@mateuszanotto
Copy link
Author

I just pushed the latest version.
The case is that Amber's topology/coordinate generator asks for this non-primary direction.
Screenshot 2023-04-13 at 14 10 27
The program was taking the HCCC dihedral param. for this entry before. Though I understand that making Q-Force put dihedral functions for non-primary dihedrals is more complicated, right?

@selimsami
Copy link
Owner

3 questions:

  1. Maybe I'm missing something but why the c = dihed.equ/2 ?
  2. In the example I shared, equ = 0 always. Though I managed to reproduce the RB potential with equ = 180. Could you try those two options please?
  3. What happens if you add manually a dihedral function with no barrier (so no potential) to H-C-C-H ?

@mateuszanotto
Copy link
Author

Hi @selimsami

Answering your questions:
First, writing a function with no barrier for the H-C-C-H works quite well, so I did all the tests using it.

  1. Amber's dihedral function uses Vn/2

Screenshot 2023-04-14 at 10 47 01

but internally it's passed as Vn/divider (the eq. 15.5 that I showed before).

The .frcmod files are being written with the divider=1 and the c = dihed.equ/2, but I can switch it so it's less confusing.
Just to be sure I calculated the freq. using c = dihed.equ and div=1. This is the result.
propaneCn1

  1. Using equ=0 changes mainly the first 2 vib. modes. I understand that this will affect if the molecule prefers the trans (phi=180) or cis (phi=0) conformation, which is not relevant for propane. I alternate them because the parameters are fitted for 1+cos(n phi)=cos(n phi-0) and 1-cos(n phi)=cos(n phi-180), is that correct?
    propanePhase0

  2. Finally, these are the vib. modes with a function with no barrier for the H-C-C-H. Here I'm using a Fourier with 6 terms converted as in Parmed, the alternating phases, c = dihed.equ, and div = 2.
    propaneHCCH0
    I did it with c = dihed.equ/2, and div = 1, and it's the same.

@selimsami
Copy link
Owner

selimsami commented Apr 14, 2023

Hmm, I'm starting to think that we have more issues than just the dihedral potential.

Here are the results I get from GROMACS with and without a dihedral function.

With dihedral function:
Screenshot 2023-04-14 at 11 03 08 AM

Without dihedral function:
Screenshot 2023-04-14 at 11 03 11 AM

The mean absolute error (MAE) is 3.5% and 8% with and without the dih. function.

You see that the dihedral function only affects the lowest frequencies in my case. For you it seems to affect everything?

Even the C-H bond stretch freq. seems to be somewhat off in your case at ~3000 cm-1.

You could perhaps turn off all dihedral functions and show what kind of a plot and MAE you get?

Could you also share the final force field files you get that is printed by qforce?

@mateuszanotto
Copy link
Author

Hi @selimsami,

These are the results I get from AMBER with and without a dihedral function.

With dihedral function (4 terms):
dihedralfunc

Without dihedral function (potential=0):
noDihe

The (MAE) is 9.2% and 6.6% with and without the dih. function. So using Fourier transforms increases the error.
Indeed it affects the whole spectrum.

I'm usually calculating the frequency analysis on 5 frames as I'm using my notebook to do it. I have it running now with 50 frames to to rule out if it's not a statistical problem, but it should take a few days.

This is the force field file printed by qforce
propane_qforce.txt

but in order to build the topology I have to write the dihedrals with wild cards:
X- 1- 2- X [...]
so X can be any atom. This works for simple molecules such as propane, but for more complex ones this is troublesome for the end user. A way to avoid this would be printing every single dihedral.

A default frcmod is like this:
frcmod.txt
so it has only one entry for each atom type interacion. If there are two C-C bonds with different parameters, one would overwrite the other.

Replaced unique_at[] with ids[]
@selimsami
Copy link
Owner

Hi Mateus, thanks for the update!
What do you mean that you're running the frequency analysis on 5 frames? What is the procedure that you use to compute the vibrational frequencies?
Typically, you would compute the numerical derivative of the forces by minimally displacing the each atom in xyz directions. GROMACS has a tool for this. Probably AMBER too? This process should take couple of seconds on a laptop, not days for sure!

@mateuszanotto
Copy link
Author

Amber has two tools to calculate the vibrational frequency.
The method I'm using is the MM/PBSA, which is used to calculate binding energy from a set number of snapshots from the MD. It calculates entropy terms using quasi-harmonic approximation or the normal mode approximation. This is where I get my frequencies from.
The other method is the same as GROMACS. It diagonalizes the mass-weighted matrix of all atoms to obtain the eigenmodes and perform a quasi-harmonic analysis at a set temperature. I did many tests before, but at that time I never managed to get reasonable frequencies (all were below 20cm-1). I will try it again now that we have the dihedrals with the conversion term.

@selimsami
Copy link
Owner

Considering the referece (QM Hessian calculation), the second options seems like a more similar comparison indeed. There are no entropic contributions there.

One advice regarding that - make sure you have really good optimized structure. use high convergence criteria and double precision if it's possible. That makes a big difference on the results.

And I would not use any of the "quasi-harmonic analysis at a set temperature", There are no temperature effects on the reference vibrational spectra

@mateuszanotto
Copy link
Author

Hello @selimsami,

I investigated further the vib. freq. methods inside amber and they call the same routine internally, which uses the hessian.
I compared the frequencies from QForce FF for Amber with AMBER FF and GAFF2 and found two main problems.

  1. The urey-bradley bond terms are not being read. I'm creating a parameter file (.frcmod) and using the topology builder (leap) to create the topology (.prmtop), which is formatted differently. Leap ignores the interactions between 1-3 bonds. Using the AMBER force field with additional Urey-Bradley terms requires creating the topology's UB sections directly. Although, Amber MD can read UB terms from CHARMM force field (.psf), which includes Urey-Bradley.

  2. I set UB terms to False on settings.ini. It shifts the frequencies slightly, but not as much as expected. The MAE goes from 9.55% to 7.65%
    download
    And using the dihedrals from gaff2 with a single term, the MAE is 4.72%
    download (2)
    So the dihedrals are still a problem.

I'll work on converting the .itp to CHARMM FF with ParmEd to check how it performs on AMBER.
If you have any other suggestions, I can also test it.

Once it's validated, I thought that I can implement support to .psf files as it would make QForce available for CHARMM and AMBER at the same time. I didn't want to waste the .frcmod implementation but AMBER seems more stiff with this format than we first thought.

@selimsami
Copy link
Owner

Hi Mateus,
Thanks for the update.
Even with the UB turned off, the frequencies seem off? Even the higher frequencies so I think dihedrals are still not (the only) issue.
I am really not sure what the issue is. If you're sure the bond/angle terms are correct, I'd look more closely to the settings for the computation of the Hessian:

  • Are you doing the energy minimization to a VERY low force threshold?
  • Are you reading the optimized coordinates for the Hessian calculation with VERY high precision? For example PDB will only have 3 decimals, which is not enough.
  • Is the Hesssian mass-weighted?
  • Other settings that can cause differences?

@mateuszanotto
Copy link
Author

I was looking at the dihedrals from Amber FF and they were all much lower than the one I was using. I just found out that I didn't convert it to kcal/mol after implementing the conversion.

These are the frequencies straight from QForce with urey = False. MAE=4.84%
download (4)

It seems this was the main source of error.
The bonds/angles terms seem a bit underestimated with urey=false. Here I used gaff2 bonds and angles terms to see how they influence the spectra.
gaff2bonds
MAE = 3.22%
gaff2angles
MAE = 2.12%
both combined, the MAE is 0.48%

dihedrals converted from kJ to kcal/mol
Set a note to turn off urey if using amber
@selimsami
Copy link
Owner

Hi Mateusz,
I think you nailed the bonds/angles without Urey, that's great! This is what I get with GROMACS:
Screenshot 2023-05-15 at 3 09 39 PM

However, as you can see the dihedrals (3 lowest frequencies) still seem to be off. Any idea what's happening there?

Is shutting off UB gonna be the final solution for AMBER? That does not seem ideal. I remember you were mentioning a PSF file format that can work with this?

MAE = 3.22%
This looks a little weird. Looking at the plot you show from Q-Force to gaff bonds, especially at higher frequencies, the frequencies only get worse yet somehow the MAE decreases? any idea what's happening there?

@mateuszanotto
Copy link
Author

Oh, it is excellent that the AMBER and GROMACS results are so close without urey. Good to know things are working the way they should.
Regarding the 3 lowest frequencies, I didn't find any other FF on amber using Fourier transforms for the HCCC/HCCH dihedrals to compare. I also realized I was removing the 2 lowest freqs to calculate the MAE. I recalculated the MAEs using sklearn.metrics.
QForce - MAE=10.07%
QF/GAFF2 bonds - MAE=10.16%
QF/GAFF2 angles - MAE=8.93%

Still removing the 2 lowest freqs, I get:
QForce - MAE=5.17%
QF/GAFF2 bonds - MAE=5.27%
QF/GAFF2 angles - MAE=3.94%

Indeed, shutting off UB is not the best solution, and the lowest freqs are still off. Amber's manual says it has support for UB terms using PSF files directly, but I've never used it.
I'll focus on converting ITP to PSF with ParmEd and seeing how it performs with AMBER.

@xiki-tempula
Copy link
Contributor

Hi, I'm interested in using Qforce for my amber simulations. I wonder what is the state of the PR? I could offer some help as well?

@selimsami
Copy link
Owner

I think it was getting close but some of the discrepancies were not corrected in the end.
The issues I believe at the moment are with the dihedral conversion and the Urey-Bradley term.
But I don't use AMBER so I don't have the time/knowledge/interest to try to fix these issues.
Feel free to have a go at it though! I think Mateusz did a good job and it was close to being ready.

@xiki-tempula
Copy link
Contributor

Ok, I had a go but it seems that if I tried to generate the parameters in this way

loadamberparams ligand_qforce.frcmod
ligand = loadmol2 ligand_qforce.mol2
saveAmberParm ligand ligand.parm7 ligand.rst7

Then I got message saying that some of the torsions are missing.

Then I checked the Gromacs topology and these torsions are missing as well. I guess in Gromacs as long as any atom is connected via a bond, it is fine. But for amber, it demand all dihedrals need to be present.

@xiki-tempula
Copy link
Contributor

xiki-tempula commented Dec 21, 2023

For example, in the ligand, Qforce defines
4 8 16 12
5 8 16 12
But amber also require
4-8-16-32
5-8-16-32
Which is not really necessary as 32 is fixed by the 16 8 12 32 improper.
image

@xiki-tempula
Copy link
Contributor

And the parmed seems to be using a different conversion factor now?
https://github.com/ParmEd/ParmEd/blob/master/parmed/topologyobjects.py#L2724

@xiki-tempula
Copy link
Contributor

Also it seems that all the improper torsions are missing from the frcmod

frcmod.write(f" 1 {fc[0]:>15.2f} {0.00:>15.2f} 1\n")


# improper dihedrals
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, this part is indented incorrectly

@mateuszanotto
Copy link
Author

Indeed I was struggling with improper torsions, but I didn't catch the mistake

frcmod.write("\nIMPROPER\n")
for dihed in terms['dihedral/improper']:
ids = dihed.atomids + 1
equ = np.degrees(dihed.equ)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amber only accept 180 as the equilibrium value for improper
Warning: Expected Improper Torsion PHASE=180 (0.000000)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants