Add new category containing interstitial benchmarks #337

Open
zhonganr wants to merge 3 commits into ddmms:main from zhonganr:benchmark_interstitial

zhonganr commented Feb 4, 2026

Add a new category, interstitial, to assess models' predictive performance for interstitial defect properties. Two benchmarks, FE1SIA and Relastab, are included:

  • FE1SIA evaluates the formation energy of a single self-interstitial atom (SIA) in a host lattice for distinct configurations.
  • Relastab evaluates the ability of models to correctly rank the stability of different interstitial configurations.

Related to Interstitial benchmarks #339
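
The two quantities described above can be sketched in a few lines. This is a minimal illustration, not the code in this PR: the formation energy follows the expression visible in the diff (`energy_ref_raw - (n_config / n_bulk) * energy_bulk`), while the ranking check is an assumption about how "correct ranking" might be tested. The function names, configuration labels, and all numbers are hypothetical.

```python
def formation_energy(e_defect: float, n_defect: int, e_bulk: float, n_bulk: int) -> float:
    """E_f = E_defect - (N_defect / N_bulk) * E_bulk, with energies in eV."""
    return e_defect - (n_defect / n_bulk) * e_bulk


def same_stability_order(reference: dict[str, float], predicted: dict[str, float]) -> bool:
    """Return True if both energy sets order the configurations identically."""
    ref_order = sorted(reference, key=reference.get)
    pred_order = sorted(predicted, key=predicted.get)
    return ref_order == pred_order


# Made-up numbers: a 50-atom bulk cell and a 51-atom cell containing one SIA.
e_f = formation_energy(e_defect=-99.0, n_defect=51, e_bulk=-100.0, n_bulk=50)

# Hypothetical configuration labels with illustrative formation energies (eV).
reference = {"config_a": 3.0, "config_b": 3.4, "config_c": 3.9}
predicted = {"config_a": 2.8, "config_b": 3.5, "config_c": 3.6}
ranking_ok = same_stability_order(reference, predicted)
```

An exact-order comparison is the strictest choice; a rank correlation would be a softer alternative if near-degenerate configurations are expected.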

ElliottKasoar added the "new benchmark" label (Proposals and suggestions for new benchmarks) Feb 6, 2026
calc = model.get_calculator()

data_path = download_s3_data(
    key="inputs/interstitial/FE1SIA/DB.zip",
Collaborator

Is there a reason to keep the data for the two tests together?

Ideally, we'd have inputs/interstitial/FE1SIA/FE1SIA.zip and inputs/interstitial/relative_stability/relative_stability.zip unless there's anything connecting them?

This also reduces the chance of clashes between directories on unzipping in the cache.

Also note: normally we'd want relative_stability to be consistent with the test name, so if Relastab is the name you're happy with, then actually I'd go with inputs/interstitial/Relastab/Relastab.zip

If you're happy with this, I'm happy to upload zipped versions of the two individual folders within DB.zip in this form?
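
The proposed layout amounts to one archive per benchmark, named after the test. A small sketch of that convention follows; `benchmark_data_key` is a hypothetical helper and not part of ml-peg, and the resulting strings would simply be passed as the `key=` argument seen in the diff.

```python
def benchmark_data_key(category: str, benchmark: str) -> str:
    """Build a key of the form inputs/<category>/<benchmark>/<benchmark>.zip."""
    return f"inputs/{category}/{benchmark}/{benchmark}.zip"


# One archive per test, so the two never unzip into the same cached directory.
fe1sia_key = benchmark_data_key("interstitial", "FE1SIA")
relastab_key = benchmark_data_key("interstitial", "Relastab")
```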

ref_formation_energy = energy_ref_raw - (n_config / n_bulk) * energy_bulk

# Read structure
atoms = read(poscar_path, format="vasp")
Collaborator

Can you add a default spin (multiplicity) and charge? See similar changes here: https://github.com/ddmms/ml-peg/pull/384/changes

I very recently discovered that Orb's omol model always requires both to be set, annoyingly.
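
One way such defaults could be applied is sketched below. The `info` keys `"charge"` and `"spin"` and the default values 0 and 1 are assumptions made for illustration; the linked PR is the authoritative reference for what ml-peg uses. A `SimpleNamespace` stands in for an `ase.Atoms` object here, since both expose an `info` dictionary.

```python
from types import SimpleNamespace

DEFAULT_CHARGE = 0  # assumed default: neutral cell
DEFAULT_SPIN = 1    # assumed default: spin multiplicity of 1


def set_default_charge_and_spin(atoms):
    """Fill in charge and spin in atoms.info without overwriting existing values."""
    atoms.info.setdefault("charge", DEFAULT_CHARGE)
    atoms.info.setdefault("spin", DEFAULT_SPIN)
    return atoms


# Stand-in for a structure returned by ase.io.read; charge is already set.
atoms = SimpleNamespace(info={"charge": 1})
set_default_charge_and_spin(atoms)
```

Using `setdefault` keeps any value already present on the structure, so only missing entries receive the defaults.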

Comment on lines +51 to +53
except (ValueError, IndexError):
    print("Skipping ref.poscar: distinct energy value not found in header.")
    energy_bulk = 0.0  # Fallback or break
Collaborator

If there is only a single reference file, shouldn't we be quite confident that we can read it?
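
A fail-fast alternative to the `0.0` fallback might look like the sketch below. The header layout is not shown in this PR, so the rule used here (the energy is the last whitespace-separated token) is purely an assumption, and `parse_bulk_energy` is a hypothetical name.

```python
def parse_bulk_energy(header: str) -> float:
    """Read the bulk energy from a header line, raising if it is absent."""
    tokens = header.split()
    if not tokens:
        raise ValueError("ref.poscar header is empty; cannot read bulk energy")
    try:
        # Assumed layout: the energy is the final token on the header line.
        return float(tokens[-1])
    except ValueError as err:
        raise ValueError(f"No energy found in ref.poscar header: {header!r}") from err
```

Raising here stops the benchmark immediately, whereas silently continuing with a bulk energy of zero would skew every formation energy computed from it.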

calc = model.get_calculator()

data_path = download_s3_data(
    key="inputs/interstitial/relative_stability/DB.zip",
Collaborator

See above

Comment on lines +54 to +56
except (IndexError, ValueError) as e:
    print(f"Warning: Could not extract energy from header '{header}': {e}")
    atoms.info["ref"] = None
Collaborator

Does this happen? Given we only have a few files, we probably don't need to keep files with missing references?


# Calculate
atoms.calc = calc

Collaborator

See above regarding spin/charge

Comment on lines +44 to +46
calculate_rmsd.py
DB/
DB.zip
Collaborator

Suggested change (remove these lines):
- calculate_rmsd.py
- DB/
- DB.zip

I don't think we should need any of these?

@ElliottKasoar
Collaborator

Thanks for adding this and sharing the data! It's looking great so far!

Labels

new benchmark Proposals and suggestions for new benchmarks

2 participants