_match_attrs to handle matching attributes that are named differently in datasets under comparison #378
Description
_match_attrs can currently only compare a list of identically named attributes between two datasets. However, for dedrifting experiments based on preindustrial control simulations, the attribute to be matched in the experiment dataset is 'parent_variant_label', which needs to correspond to the 'variant_label' of the preindustrial control run. It would be nice to have a function for this, so that the user can match dataset dictionaries of experiments to be dedrifted with a dataset dictionary of preindustrial control runs.
Something like:
def _match_twosided_attrs(ds_a, ds_b, attrs_a, attrs_b):
    """Return the number of matched attrs between two datasets."""
    # attrs_a[i] in ds_a is compared with attrs_b[i] in ds_b, so both lists must line up
    if len(attrs_a) != len(attrs_b):
        raise ValueError('lists of attributes in each dataset must be of equal length.')
    try:
        # count the attribute pairs whose values agree
        n_match = sum(ds_a.attrs[a] == ds_b.attrs[b] for a, b in zip(attrs_a, attrs_b))
        return n_match
    except KeyError:
        raise ValueError(
            f"Cannot match datasets because at least one of the datasets does not contain all attributes [{attrs_a} and {attrs_b}]."
        )

Or, alternatively, an argument indicating that a parent is being compared with a child, which automatically maps parent_variant_label to variant_label (and similar attributes) using this existing line?

ds.attrs[f"parent_{ma}"] not in reference.attrs[ma]
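
A rough sketch of what that second option could look like, for discussion only. The function name `match_attrs_sketch` and the `parent` keyword are hypothetical and not part of the existing code, and `SimpleNamespace` objects stand in for real datasets so the snippet runs without any data. The attribute values are made up for illustration.

```python
from types import SimpleNamespace


def match_attrs_sketch(ds_a, ds_b, match_attrs, parent=False):
    """Count matching attrs between ds_a and ds_b (hypothetical sketch).

    With parent=True, each name in match_attrs is looked up as
    'parent_<name>' in ds_a and as '<name>' in ds_b.
    """
    prefix = "parent_" if parent else ""
    try:
        return sum(
            ds_a.attrs[f"{prefix}{name}"] == ds_b.attrs[name]
            for name in match_attrs
        )
    except KeyError as err:
        raise ValueError(
            "Cannot match datasets because at least one of the datasets "
            f"does not contain all attributes {match_attrs} (parent={parent})."
        ) from err


# Stand-ins for an experiment to be dedrifted and its control run (made-up values)
experiment = SimpleNamespace(
    attrs={
        "source_id": "MODEL-A",
        "parent_source_id": "MODEL-A",
        "variant_label": "r2i1p1f1",
        "parent_variant_label": "r1i1p1f1",
    }
)
control = SimpleNamespace(
    attrs={"source_id": "MODEL-A", "variant_label": "r1i1p1f1"}
)

attrs = ["source_id", "variant_label"]
n_plain = match_attrs_sketch(experiment, control, attrs)
n_parent = match_attrs_sketch(experiment, control, attrs, parent=True)
```

In this toy case the plain comparison only matches on `source_id`, while `parent=True` matches both attributes, because the experiment's `parent_variant_label` equals the control run's `variant_label`. Compared with the two-list version, the keyword keeps the call signature short but assumes every attribute of interest follows the `parent_` naming pattern.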