4.0 Release Ideas #498

@mpu-creare

The 3.x release carries a few features, kept for backward compatibility, that we should eliminate for the sake of a better overall design.

  • Interpolation should be its own node class
    • The InterpolationMixin is leading to some very tricky bugs, and it has now introduced a multi-threading error
  • Caching should be its own node class
  • The JSON format is not valid JSON; this should be fixed.

This will give a much cleaner design, and the user will have much more control over caching in general.

It comes at the cost of a more cumbersome API. For example, currently:

node = podpac.data.Rasterio(source=<path to dataset>)

gives you a node that you can interpolate to any coordinate system with caching enabled -- this is generally what you want. With the new proposal you need:

data_node = podpac.data.Rasterio(source=<path to dataset>)
cache_node = podpac.Cache(source=data_node)  # Flexibility lets you put that AFTER interpolation, if desired
node = podpac.Interpolation(source=cache_node)

So, instead, what if we made a new class method on Node that acts as a factory:

node = podpac.data.Rasterio.make(
    source=<path to dataset>,
    add_cache=True,
    add_interp=True,
    cache_interp_order="cache_first"
)
# or simply
node = podpac.data.Rasterio.make(True, True, source=<path to dataset>)

where:

class Node(...):
    ...
    @classmethod
    def make(cls, add_cache=None, add_interp=None, cache_interp_order="cache_first", factory=None, factory_kwargs=None, **kwargs):
        """
        Parameters
        ----------
        add_cache: bool, dict
                Default is None. If `True` or `dict`, will add a `Cache` node to the pipeline definition. If `dict`, the parameters in the dictionary
                will be passed on to the `Cache` node creation
        add_interp: bool, dict
                Default is None. If `True` or `dict`, will add an `Interpolation` node to the pipeline definition. If `dict`, the parameters in the dictionary
                will be passed on to the `Interpolation` node creation
        cache_interp_order: enum
                Default is "cache_first". Options are "cache_first" or "interp_first". This controls the order of nodes added to the source
                node when BOTH caching and interpolation are used.
        factory: callable
                Default is None. A function that will be called to create the node. If supplied, all other inputs are ignored and
                `make` returns: `factory(cls(**kwargs), **factory_kwargs)`.
        factory_kwargs: dict
                Keyword arguments that will be passed to the factory. Default is None.

        """

I'm not sure about the factory parts, though. It seems like you could just write a function and call it directly: factory(podpac.data.Rasterio(source=...), kw1=...)
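
For concreteness, here is a minimal sketch of how `make` might assemble the pipeline described in the docstring above. It is not podpac code: `Node`, `Cache`, `Interpolation` and `Rasterio` are toy stand-ins that only record their `source`, the `chain()` helper exists purely for inspection, and the `interpolation` parameter name in the usage lines is hypothetical.

```python
class Node:
    """Toy stand-in for a podpac node; it only remembers its source and parameters."""

    def __init__(self, source=None, **params):
        self.source = source
        self.params = params

    def chain(self):
        """Return class names from the outermost node down to the data source."""
        names = [type(self).__name__]
        if isinstance(self.source, Node):
            names.extend(self.source.chain())
        return names

    @classmethod
    def make(cls, add_cache=None, add_interp=None, cache_interp_order="cache_first",
             factory=None, factory_kwargs=None, **kwargs):
        node = cls(**kwargs)

        # A factory, if supplied, takes over completely.
        if factory is not None:
            return factory(node, **(factory_kwargs or {}))

        if cache_interp_order not in ("cache_first", "interp_first"):
            raise ValueError(f"unknown cache_interp_order: {cache_interp_order!r}")

        def enabled(option):
            # Only `True` or a dict switches a wrapper on; None/False leave it off.
            return option is True or isinstance(option, dict)

        # Wrappers are listed innermost first, i.e. in the order they are applied.
        wrappers = []
        if enabled(add_cache):
            wrappers.append((Cache, add_cache))
        if enabled(add_interp):
            wrappers.append((Interpolation, add_interp))
        if cache_interp_order == "interp_first":
            wrappers.reverse()

        for wrapper_cls, option in wrappers:
            params = option if isinstance(option, dict) else {}
            node = wrapper_cls(source=node, **params)
        return node


class Cache(Node):
    """Toy stand-in for the proposed Cache node."""


class Interpolation(Node):
    """Toy stand-in for the proposed Interpolation node."""


class Rasterio(Node):
    """Toy stand-in for podpac.data.Rasterio."""


# Usage: cache wraps the data node, interpolation wraps the cache.
default = Rasterio.make(True, True, source="dataset.tif")
print(default.chain())  # ['Interpolation', 'Cache', 'Rasterio']

# Usage: interpolate first, then cache the interpolated result.
swapped = Rasterio.make(True, {"interpolation": "bilinear"},
                        cache_interp_order="interp_first", source="dataset.tif")
print(swapped.chain())  # ['Cache', 'Interpolation', 'Rasterio']
```

With `cache_first`, the result matches the three-line manual construction shown earlier. Note that the `factory` branch does nothing more than a direct call to the function would, which is the same doubt raised above.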

Anyway, thoughts welcome!
