Fix to bootstrap of free energy surfaces, affecting timing and quantitative results #535
…changed indices, taking too long.
I have a couple more changes to put in to fix the uncertainties; please don't do anything with it yet . . .

OK, I think I've got in the changes I needed to.

@mrshirts when you are ready for review go ahead and add whoever you want to review this PR 😄
self.u_kn[:, bootstrap_indices], self.N_k, initial_f_k=self.mbar.f_k
)
x_nb = x_n[bootstrap_indices]
# recompute MBAR.
This call was unnecessary: it was running MBAR more times than needed. Removing it saves approximately 2x in run time.
mrshirts left a comment
My comments on these changes, for other reviewers.
- fall[:, b] = h["f"] - h["f"][j]
- df_i = np.std(fall, axis=1)
+ fall[:, b] = h["f"] - h["f"][j]  # subtract out the reference bin
+ df_i = np.std(fall, ddof=1, axis=1)
Fixing the std definition: ddof=1 gives the sample standard deviation over the bootstrap replicates.
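To make the effect of the `ddof=1` change concrete, here is a minimal sketch on synthetic data. The array `fall` below is random noise standing in for the bootstrap free energies (rows are bins, columns are replicates); the sizes and noise scale are assumptions for illustration, not values from this PR.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bins, n_bootstraps = 4, 50

# Synthetic stand-in for bootstrap free energies: one column per replicate.
fall = rng.normal(loc=0.0, scale=0.3, size=(n_bins, n_bootstraps))

# Old behaviour: population std, divides by n_bootstraps.
std_population = np.std(fall, axis=1)

# New behaviour: sample std, divides by n_bootstraps - 1.
std_sample = np.std(fall, ddof=1, axis=1)

# The two differ by a constant factor sqrt(n / (n - 1)) in every bin.
ratio = std_sample / std_population
print(ratio)
```

The correction factor is sqrt(n / (n - 1)), so it matters most when the number of bootstrap replicates is small.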
  histogram_datas[b]["f"] - histogram_datas[b]["f"].transpose()
  )
- dfxij_vals = np.std(fall, axis=2)
+ dfxij_vals = np.std(fall, ddof=1, axis=2)
Fixing std definition
- kde = self.kde
- kde.fit(x_n, sample_weight=self.w_n)
+ kde = self.kde  # use these new weights for the KDE
+ w_n = self.w_n
I actually can't remember if this was 100% necessary to get updated weights . . .
- fall[:, b] = h["f"] - h["f"][j]
- df_i = np.std(fall, axis=1)
+ fall[:, b] = h["f"] - h["f"][j]  # subtract out the reference bin
+ df_i = np.std(fall, ddof=1, axis=1)
Fix bootstrap std definition
  if reference_point == "from-lowest":
  fmin = np.min(f_i)
  f_i = f_i - fmin
+ wheremin = np.argmin(
We need to record the location where the surface is zeroed, so that it can be used in the actual computation of the std.
  elif reference_point == "from-specified":
  fmin = -self.kde.score_samples(np.array(fes_reference).reshape(1, -1))
  f_i = f_i - fmin
+ wheremin = np.argmin(
We need to record the location where the surface is zeroed, so that it can be used in the actual computation of the std.
- fall[:, b] = -self.kdes[b].score_samples(x) - fmin
- df_i = np.std(fall, axis=1)
+ fall[:, b] = -self.kdes[b].score_samples(x)
+ fall[:, b] -= fall[wheremin, b]
Zero out at the correct location.
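The difference between subtracting a fixed `fmin` and zeroing each bootstrap replicate at `wheremin` can be sketched on synthetic data. Everything below is an illustrative assumption: the surface values, the per-replicate `shift`, and the noise scale are made up, and no KDE is fitted. The point is only that zeroing every replicate at the same location gives zero uncertainty at the reference point and removes the shared shift from the other points.

```python
import numpy as np

rng = np.random.default_rng(1)
n_points, n_bootstraps = 5, 200

# Synthetic reference surface, already zeroed at its minimum.
f_i = np.array([1.2, 0.4, 0.0, 0.7, 1.5])
wheremin = np.argmin(f_i)
fmin = 0.0  # value subtracted from the full-data surface

# Synthetic replicates: surface + a shift shared by all points + pointwise noise.
shift = rng.normal(scale=0.2, size=n_bootstraps)
noise = rng.normal(scale=0.05, size=(n_points, n_bootstraps))
raw = f_i[:, None] + shift[None, :] + noise

# Old approach: subtract the same fmin from every replicate.
fall_old = raw - fmin
df_old = np.std(fall_old, ddof=1, axis=1)

# New approach: zero each replicate at the reference location.
fall_new = raw - raw[wheremin, :]
df_new = np.std(fall_new, ddof=1, axis=1)

print(df_old)
print(df_new)
```

With the new approach the uncertainty at the reference location is exactly zero, which is what one expects for a free energy difference measured relative to that location.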
Any suggestions for who else should review this, or is there anyone who could take a look? We are looking at some free energy surface problems for OpenFF, so we want to get this through.

I guess another way to make sure we are doing this correctly is to find out what issue the OpenFF folks were having, try to reproduce it here, and make it a unit test. I don't know if that's possible, but it would be great to have.
The free energy surface code was calling MBAR after each randomization of the bootstrap indices. This does not appear to affect the results, but it slows things down by a factor of a little less than 2x (146 vs. 87 seconds for one sample run).
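As a rough illustration of the pattern described here, the sketch below draws bootstrap indices and performs exactly one solve per replicate. It does not use pymbar: `CountingSolver` is a hypothetical stand-in for an expensive solve, and `draw_bootstrap_indices` is a hypothetical helper that assumes samples are resampled with replacement within each state's block of columns so that `N_k` stays valid.

```python
import numpy as np


def draw_bootstrap_indices(N_k, rng):
    """Resample column indices with replacement within each state's block.

    Hypothetical helper; assumes columns are ordered state by state.
    """
    indices = []
    start = 0
    for n in N_k:
        indices.append(start + rng.integers(0, n, size=n))
        start += n
    return np.concatenate(indices)


class CountingSolver:
    """Stand-in for an expensive solve; only counts calls. Not pymbar API."""

    def __init__(self):
        self.calls = 0

    def solve(self, u_kn, N_k):
        self.calls += 1
        return u_kn.mean(axis=1)  # placeholder result, not a free energy


rng = np.random.default_rng(0)
N_k = np.array([5, 7, 4])
u_kn = rng.normal(size=(len(N_k), N_k.sum()))  # synthetic reduced potentials

solver = CountingSolver()
n_bootstraps = 10
for b in range(n_bootstraps):
    idx = draw_bootstrap_indices(N_k, rng)
    # One solve per replicate; a second solve on the same indices adds cost only.
    f_b = solver.solve(u_kn[:, idx], N_k)

print(solver.calls)
```

Counting calls this way is one possible basis for a regression test that the solve is not repeated within a bootstrap iteration.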