FeatureSpace persistence #57

@leliel12

Let's implement a persistence interface for feature spaces. The simplest
way is to divide this into two stages:

1. Reduce the FeatureSpace to a simpler structure

The easy way is to create a dictionary made of lists and plain values.

>>> fs = feets.FeatureSpace(only=["PeriodLS", "Mean"])
>>> fs
<FeatureSpace: <Mean>, <LombScargle {'lscargle_kwds': '<MANY CONFIGURATIONS>', 'fap_kwds': '<MANY CONFIGURATIONS>', 'nperiods': 3}>>

>>> fs.to_dict()
{
    "selected_features": ["Mean", "PeriodLS"],
    "required_data": ["magnitude", "time"],
    "dask_options": {},
    "extractors": [
        {
            "Mean": {}
        },
        {
            "LombScargle": {
                'lscargle_kwds': {
                    'autopower_kwds': {
                        'normalization': 'standard',
                        'nyquist_factor': 100
                    }
                },
                'fap_kwds': {
                    'method': 'baluev',
                    'samples_per_peak': 5,
                    'nyquist_factor': 5,
                    'method_kwds': None,
                    'minimum_frequency': None,
                    'maximum_frequency': None
                },
                'nperiods': 3
            }
        }
    ]
}

The implementation should be in two parts: a to_dict in
feets.core.FeatureSpace, and a to_dict for each extractor.
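The two-level to_dict described above can be sketched as follows. This is only an illustration with stand-in classes: the Extractor base class, its params attribute, and the constructor signatures are assumptions made for this example and are not the actual feets API. Only the attribute names features, required_data, dask_options and execution_plan_ are taken from the test proposed in this issue.

```python
# Hypothetical stand-ins: the real feets classes are more complex.
class Extractor:
    """Minimal extractor that remembers its constructor parameters."""

    features = ()
    required_data = ()

    def __init__(self, **params):
        self.params = params  # assumed attribute, for illustration only

    def to_dict(self):
        # Map the extractor class name to a shallow copy of its parameters.
        return {type(self).__name__: dict(self.params)}


class Mean(Extractor):
    features = ("Mean",)
    required_data = ("magnitude",)


class LombScargle(Extractor):
    features = ("PeriodLS",)
    required_data = ("magnitude", "time")


class FeatureSpace:
    """Minimal feature space holding an ordered list of extractors."""

    def __init__(self, extractors, dask_options=None):
        self.execution_plan_ = list(extractors)
        self.dask_options = dict(dask_options or {})

    @property
    def features(self):
        return sorted(f for ext in self.execution_plan_ for f in ext.features)

    @property
    def required_data(self):
        return sorted({d for ext in self.execution_plan_ for d in ext.required_data})

    def to_dict(self):
        # The space serializes its own fields and delegates to each extractor.
        return {
            "selected_features": list(self.features),
            "required_data": list(self.required_data),
            "dask_options": dict(self.dask_options),
            "extractors": [ext.to_dict() for ext in self.execution_plan_],
        }


fs = FeatureSpace([Mean(), LombScargle(nperiods=3)])
fs_dict = fs.to_dict()
```

Delegating to each extractor keeps FeatureSpace.to_dict unaware of extractor-specific parameters, so a new extractor only has to provide its own to_dict.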

A valid test for this feature would be:

import feets


def test_FeatureSpace_to_dict():
    fs = feets.FeatureSpace(only=["PeriodLS", "Mean"])
    # Build the expected dict from the public attributes of the space.
    expected = {
        "selected_features": list(fs.features),
        "required_data": list(fs.required_data),
        "dask_options": dict(fs.dask_options),
        "extractors": [ext.to_dict() for ext in fs.execution_plan_],
    }
    assert fs.to_dict() == expected

Similarly, each extractor needs its own to_dict test. Perhaps the name.

2. Implement persistence

Having the dictionary representation makes persistence in formats like
.yaml or .json straightforward.

The two functions that implement this functionality have exactly the same
signature:

FeatureSpace.to_yaml(*, stream_or_buff=None, **kwargs)
FeatureSpace.to_json(*, stream_or_buff=None, **kwargs)
  • If stream_or_buff is:
    • None, the yaml/json code is returned as a string.
    • an instance of str, it is the path to a file where the code must be
      stored.
    • a file-like object, the code must be written into it.
  • **kwargs is always passed as extra arguments to the json/yaml writer.
>>> fs.to_yaml()  # without the stream_or_buff
'dask_options: {}\nextractors:\n- Mean: {}\n- LombScargle:\n    fap_kwds:\n      maximum_frequency: null\n      method: baluev\n      method_kwds: null\n      minimum_frequency: null\n      nyquist_factor: 5\n      samples_per_peak: 5\n    lscargle_kwds:\n      autopower_kwds:\n        normalization: standard\n        nyquist_factor: 100\n    nperiods: 3\nrequired_data:\n- magnitude\n- time\nselected_features:\n- Mean\n- PeriodLS\n'

>>> fs.to_json()
'{"selected_features": ["Mean", "PeriodLS"], "required_data": ["magnitude", "time"], "dask_options": {}, "extractors": [{"Mean": {}}, {"LombScargle": {"lscargle_kwds": {"autopower_kwds": {"normalization": "standard", "nyquist_factor": 100}}, "fap_kwds": {"method": "baluev", "samples_per_peak": 5, "nyquist_factor": 5, "method_kwds": null, "minimum_frequency": null, "maximum_frequency": null}, "nperiods": 3}}]}'
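The stream_or_buff rules listed above can be sketched with the standard json module. The helper name dump_json and the sample data are hypothetical and exist only for this example. In the proposal the logic would live in FeatureSpace.to_json and operate on the result of to_dict.

```python
import io
import json
import os
import tempfile


def dump_json(data, *, stream_or_buff=None, **kwargs):
    """Serialize data as JSON, dispatching on the type of stream_or_buff."""
    if stream_or_buff is None:
        # No target given: return the JSON code as a string.
        return json.dumps(data, **kwargs)
    if isinstance(stream_or_buff, str):
        # A str is interpreted as the path of the file to write.
        with open(stream_or_buff, "w", encoding="utf-8") as fp:
            json.dump(data, fp, **kwargs)
        return None
    # Anything else is assumed to be a writable file-like object.
    json.dump(data, stream_or_buff, **kwargs)
    return None


data = {
    "selected_features": ["Mean"],
    "required_data": ["magnitude"],
    "dask_options": {},
    "extractors": [{"Mean": {}}],
}

# 1. No stream_or_buff: the JSON text is returned.
as_text = dump_json(data)

# 2. File-like object: the JSON text is written into it.
buffer = io.StringIO()
dump_json(data, stream_or_buff=buffer, indent=2)

# 3. str: treated as a path on disk.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "fs.json")
    dump_json(data, stream_or_buff=path)
    with open(path, encoding="utf-8") as fp:
        from_file = json.load(fp)
```

A to_yaml counterpart could follow the same dispatch, for instance with PyYAML's yaml.safe_dump, which also returns a string when no stream is given. That dependency is an assumption here and is not stated in this issue.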
