Conversation
|
Database file: |
|
Looking super good overall! The key thing i've noticed is the processed_data function in analysis. The issue with this is that as we have the decorators inside the funciton, if one model has no data, the script will fail. so i think we need to think about how to split it up so that its more robust update: this is an us problem not a you problem |
ElliottKasoar
left a comment
There was a problem hiding this comment.
Thanks for this, it's looking really nice! I've left a few mostly minor comments.
My main suggestions/questions revolve around the app callbacks. I think what do have works really well, but if possible it would be great to integrate any modifications you need into the existing helper functions we have.
Also, in terms of visualisation, I wonder how tricky it would be to view the structure as a trajectory vs distance, a bit like our live NEBs example - so if you click on a point, it shows the structure at that distance, but you can also play the trajectory as the distance changes.
| global DATABASE_INFO_SAVED | ||
| if not DATABASE_INFO_SAVED: | ||
| OUT_PATH.mkdir(parents=True, exist_ok=True) | ||
| database_info_path = OUT_PATH / "database_info.yml" |
There was a problem hiding this comment.
Can you not just check if (OUT_PATH / "database_info.yml").exists() ?
| water_energy = atoms.get_potential_energy() | ||
|
|
||
| # Iterate through strain conditions | ||
| for strain in strains: |
There was a problem hiding this comment.
It's fairly minor since it's so short, but it might be nice to use tqdm for this, just so we can see progress
For slower models, it can take a minute or more, and since it's very reasonable to run this locally, that might be concerning
| @plot_scatter( | ||
| filename=OUT_PATH / model / f"figure_{orientation}_{strain}.json", | ||
| title=f"{orientation} binding energy curve ({strain[1:5]}% strain)", | ||
| x_label="Distance / Å", | ||
| y_label="Adsorption energy / meV", | ||
| show_line=True, | ||
| ) | ||
| def plot_model_binding_energy_curve( | ||
| model, orientation, strain | ||
| ) -> dict[str, tuple[list[float], list[float]]]: | ||
| return { | ||
| "ref": ( | ||
| results["distances"], | ||
| results["ref"][orientation][strain]["energies"], | ||
| ), | ||
| model: ( | ||
| results["distances"], | ||
| results[model][orientation][strain]["energies"], | ||
| ), | ||
| } |
There was a problem hiding this comment.
I would probably move this and the other plots out of this function, probably just to a top-level function that is called here.
You may still need to wrap it in a function to parameterise the plot_scatter decorators, but it makes processed_data a bit less unwieldy
| for i in range(len(processed_data["distances"])): | ||
| deviations.append( | ||
| abs( | ||
| processed_data[model][orientation][strain]["energies"][i] | ||
| - processed_data["ref"][orientation][strain]["energies"][i] | ||
| ) | ||
| ) |
There was a problem hiding this comment.
Can this be rewritten to use mae from analysis.utils? This is absolutely fine, but we intend at some point to make changes such that RMSE etc. are computed alongside MAE in a swappable way, and so it would be simpler if we're able to swap about a single function.
| ) | ||
| ) | ||
| results[model] = np.nan_to_num( | ||
| np.mean(deviations), nan=99999, posinf=99999, neginf=99999 |
There was a problem hiding this comment.
For now, if a model fails I'd probably return None. We have plans to translate something like np.inf to the 0 score, but None would probably be most consistent, if it works ok here.
| ) | ||
| ) | ||
| results[model] = np.nan_to_num( | ||
| np.mean(deviations), nan=999, posinf=999, neginf=999 |
| good: 40.0 | ||
| bad: 1000.0 | ||
| unit: meV | ||
| weight: 1.0 | ||
| tooltip: Mean Absolute Error across all orientations, distances, and strains | ||
| level_of_theory: PBE | ||
| Binding Energies MAE: | ||
| good: 40.0 | ||
| bad: 1000.0 | ||
| unit: meV | ||
| weight: 1.0 | ||
| tooltip: Mean Absolute Error of binding energies across all orientations and strains | ||
| level_of_theory: PBE | ||
| Binding Lengths MAE: | ||
| good: 0.0 | ||
| bad: 1.0 |
There was a problem hiding this comment.
Have you thought about these good and bad thresholds (genuine question)?
1 eV seems quite large to me
| ) | ||
|
|
||
|
|
||
| def struct_from_scatter_custom(scatter_id, struct_id, structs): |
There was a problem hiding this comment.
What does the normal struct_from_scatter function not do that you need?
| ) | ||
|
|
||
|
|
||
| def plot_and_struct_from_scatter(scatter_id, plot_id, plots_list, struct_id, structs): |
There was a problem hiding this comment.
I'm not sure I understand why this is needed in addition to the other callbacks we have/you have added?
| return ( | ||
| Div("Click on a metric to view plot."), | ||
| Div("Click on a metric to view plot."), | ||
| Div("Click on a metric to view the structure."), | ||
| ) |
There was a problem hiding this comment.
Why do we want all of these displayed, and what does this do that we can't do with existing callback helpers?
Pre-review checklist for PR author
PR author must check the checkboxes below when creating the PR.
Summary
Added a new benchmark based on the adsorption energy curves of a single water molecule (at various orientations) on a sheet of graphene (under various strain conditions), which is useful for understanding nanoscale wetting. Reference calculations based on PBE functional, calculated using FHI-aims on "intermediate" settings.
Three metrics of equal weight:
Linked issue
Resolves #292
Progress
Testing
Benchmark tested on all currently-implemented models, i.e.:
with no issues. At current time of writing, mace-mp-0a performs the best on all metrics, whereas pet-mad completely fails to produce physically-reasonable adsorption energy curves.
New decorators/callbacks
Added a new "plot_from_scatter" callback, which implements almost identical functionality to "struct_from_scatter" except that it renders a Plotly Graph object instead.