Description
Right now, custom nodes each apply their own logic to find upstream data, sometimes by filtering on keys. However, upstream operations can change those keys, giving them different names or a different format.
I think we could implement a standard way to find upstream data in the AbstractNode class. Every node would accept a standard configuration that tells it where to find its upstream data, and that configuration would also let us declare a node's potential "data args".
For example, if I am looking for data_1 but upstream it is keyed as my_upstream_key_1, a configuration entry can supply the mapping. Example:
...
class: MyCustomNode
upstream_data:
  filter_for_key: my_filter
  data_1_key: my_upstream_key_1
  data_2_key: my_upstream_key_2
...
Our documentation for each class can list the data the node needs from the data_object, for example:
data_1 (pd.DataFrame): dataframe of training data
data_2 (int): number of cv folds
...
In this way, whenever a node searches upstream it can always find its data, because the configuration can include an optional remapping from the documented names to the actual upstream keys.
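To make the proposal concrete, here is a minimal sketch of how the lookup might work. It rests on several assumptions that are not settled in this issue: the data_object is treated as a plain dict, the `data_args` attribute and the `resolve_upstream_data` and `execute` method names are hypothetical, and the real AbstractNode may look quite different. The semantics of filter_for_key are not defined above, so the sketch only stores it.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Mapping, Optional, Tuple


class AbstractNode(ABC):
    """Hypothetical stand-in for the real AbstractNode class."""

    # Canonical names of the data this node needs (hypothetical attribute).
    data_args: Tuple[str, ...] = ()

    def __init__(self, upstream_data: Optional[Mapping[str, str]] = None):
        config = dict(upstream_data or {})
        # Stored but not interpreted: its semantics are left open here.
        self.filter_for_key = config.pop("filter_for_key", None)
        # "data_1_key: my_upstream_key_1" becomes {"data_1": "my_upstream_key_1"}.
        self.key_map = {
            name[: -len("_key")]: upstream_key
            for name, upstream_key in config.items()
            if name.endswith("_key")
        }

    def resolve_upstream_data(self, data_object: Mapping[str, Any]) -> Dict[str, Any]:
        """Return the declared data args, applying the optional remapping."""
        resolved = {}
        for arg in self.data_args:
            # Fall back to the canonical name when no remapping is configured.
            upstream_key = self.key_map.get(arg, arg)
            if upstream_key not in data_object:
                raise KeyError(
                    f"{type(self).__name__} needs '{arg}' but upstream key "
                    f"'{upstream_key}' was not found"
                )
            resolved[arg] = data_object[upstream_key]
        return resolved

    @abstractmethod
    def execute(self, data_object: Mapping[str, Any]) -> Any:
        ...


class MyCustomNode(AbstractNode):
    data_args = ("data_1", "data_2")

    def execute(self, data_object):
        data = self.resolve_upstream_data(data_object)
        return data["data_1"], data["data_2"]


# Mirrors the upstream_data block from the example configuration.
node = MyCustomNode(
    upstream_data={
        "filter_for_key": "my_filter",
        "data_1_key": "my_upstream_key_1",
        "data_2_key": "my_upstream_key_2",
    }
)
result = node.execute({"my_upstream_key_1": [1, 2, 3], "my_upstream_key_2": 5})
```

With this shape, a custom node only declares the names it needs and never has to know what the upstream keys are called. A node created without any upstream_data configuration would look for data_1 and data_2 directly.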