Update BAWL to use reduced composition and allow for primitive structure reduction.#21
Update BAWL to use reduced composition and allow for primitive structure reduction.#21msiron-entalpic wants to merge 1 commit intomainfrom
Conversation
|
As suggested by @msiron-entalpic, we should enable this for any hasher or similarity matcher for fairness in the benchmark and enable them altogether when we want to reduce to the most primitive cell cf #22. |
|
There is small disagreement between moyo and spglib in 2% of the MP structures' space group. Is it obvious to you all what is the source of truth? |
|
The hasher worked trivially with |
|
@bkmi spacegroup determination in Materials Project uses |
|
Do you know the default parameters for moyo? |
|
btw these parameters are quite strict. I might suggest something more premissive, such as symmetry precision = 0.1 (or 0.01 like pymatgen). Maybe also the angle default from pymatgen. |
|
More experiments, unfortunately all is not well. I am using a version where I allow for a higher tolerance for matches... but this causes hanging behavior. I think it's due to the underlying moyo library. from material_hasher.hasher.bawl import BAWLHasher
from pymatgen.io.cif import CifParser
from pymatgen.transformations.standard_transformations import PerturbStructureTransformation
from pymatgen.io.ase import AseAtomsAdaptor
from copy import deepcopy
parser = CifParser("the.cif")
s = parser.get_structures()[0]
sprc = deepcopy(s).make_supercell((2,1,1))
BAWLHasher(primitive_reduction=False).get_material_hash(sprc) == BAWLHasher(primitive_reduction=False).get_material_hash(s)
# false 👍
BAWLHasher(primitive_reduction=True).get_material_hash(sprc) == BAWLHasher(primitive_reduction=True).get_material_hash(s)
# true 👍
atoms = AseAtomsAdaptor.get_atoms(deepcopy(sprc))
atoms.rattle(0.01)
rsprc = AseAtomsAdaptor.get_structure(atoms)
BAWLHasher(primitive_reduction=True).get_material_hash(rsprc) == BAWLHasher(primitive_reduction=True).get_material_hash(s)
# this hangs 👎
I brought this up in spglib/moyo#89 |
I believe it is 1e-4 for symprec and for angle_tolerance it should be the same as spglib |
Unsure about the implications of this, but let me make a fix to be able to pass on custom tolerance parameters for BAWL! |
|
I did that in my fork. Sorry I didn't pull request it. I'll do it now. |
|
Done |
|
I used relatively permissive parameters. I think we should try to figure out what's the deal with the hanging before merging this (using the permissive parameters). I think making it work with permissive parameters is important. |
Actually you're right, I was thinking of downstream effect since we use this code for labeling fingerprints for LeMat-Bulk, but currently we use Pymatgen with the less stringent symprec, so if we move to Moyo we should also use the less stringent parameters by default. |
|
It sounds like the hanging might have been caused by using degrees when I should have used radians. I'm busy for a while but will follow up later. This tool would be extremely useful for us. |
|
I fixed it. Specifically, I finished my pull request onto Thank you!! |
|
The hasher does not work reliably with moyo. I am investigating making pymatgen the default. It is not recommended to use -- p.s. it remains to be seen whether standard, primitive cells (used in spglib & moyo) are the same as niggli reduced primitive cells. pymatgen claims that standard, primitive are defined in the source |
|
I cannot comment yet on the quality of the hash, but I'm getting much more stable behavior using my new implementation that uses the reduction and primitive functions from pymatgen. |
Problem: currently for some structures, the conventional cell returns a different hash than the primitive cell. For supercell, the full composition is used, so also a different hash is returned. (pointed out by @bkmi in #20 )
Solution: Ideally a supercell, or a conventional vs. primitive cell of a material should return the same hash. Propose update to BAWL to account for both reduced composition (which we initially did in LeMat-Bulk) and to account for transformation to primitive cell using Moyopy.
To do: would welcome benchmarks that transform materials to either primitive or conventional to test whether various hash function well against that. Or move to switch all hash to consider only the primitive transformation via Moyo?
In PR it is not reduced to primitive by default, it must be turned on:
bh = BAWLHasher(primitive_reduction=True)