Summary
The current Metric class has a _empty_field_to_none model validator that automatically converts empty string fields to None during parsing. This behavior is enabled by default for consistency with fgpyo's Metric, but could be improved with a more flexible, opt-in approach.
Context
#6 (comment)
It might be nice to have "" be a ClassVar so that users could override it with their own special "empty token".
The current implementation hardcodes the empty string to None conversion, which doesn't allow for:
- Disabling the behavior when not needed
- Customizing which values map to None
- Supporting multiple input representations (e.g., both "" and "None" -> None)
- Handling other sentinel values like "NA" -> np.nan
Suggested Solution
Create a mixin class (e.g., RemapValues) that:
- Uses a dict class variable for arbitrary value mappings instead of a single "empty token"
- Generalizes to support cases like:
  - multiple string representations of null: {"": None, "None": None}
  - mixed null/NaN handling: {"": None, "NA": np.nan}
- (Maybe?) Is opt-in rather than enabled by default on the base Metric class
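A minimal, framework-agnostic sketch of the idea, to make the proposal concrete. It uses only the standard library and a plain dataclass rather than the real Metric class. The names `remap_values`, `remap`, `DemoMetric`, and `from_row` are hypothetical, and `math.nan` stands in for `np.nan`. In the actual implementation the remapping would presumably run inside a "before" model validator, replacing `_empty_field_to_none`.

```python
import math
from dataclasses import dataclass
from typing import Any, ClassVar, Mapping, Optional


class RemapValues:
    """Hypothetical mixin: remap raw string values before a record is built."""

    # Empty by default, so the behavior is opt-in: subclasses override this
    # mapping to enable remapping.
    remap_values: ClassVar[Mapping[str, Any]] = {}

    @classmethod
    def remap(cls, row: Mapping[str, Any]) -> dict:
        """Return a copy of `row` with any mapped string values replaced."""
        return {
            key: cls.remap_values.get(value, value) if isinstance(value, str) else value
            for key, value in row.items()
        }


@dataclass
class DemoMetric(RemapValues):
    """Illustrative stand-in for a Metric subclass (not the real class)."""

    # Several null spellings plus a NaN sentinel, as in the cases listed above.
    remap_values: ClassVar[Mapping[str, Any]] = {"": None, "None": None, "NA": math.nan}

    name: Optional[str]
    score: Optional[float]

    @classmethod
    def from_row(cls, row: Mapping[str, Any]) -> "DemoMetric":
        # Type coercion is deliberately left out; only remapping is shown.
        return cls(**cls.remap(row))


metric = DemoMetric.from_row({"name": "", "score": "NA"})
```

Here `metric.name` ends up as `None` and `metric.score` as NaN, while a class that does not override `remap_values` passes every value through unchanged. Keeping the mapping in a class variable also covers the original "custom empty token" request, since a single-entry dict such as `{"": None}` reproduces the current behavior.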