[r] add poc matrix projection interface #158
Closed
immanuelazn wants to merge 5 commits into ia/lsi
Closed as we are changing direction to #167.
Add an sklearn-like interface for creating pipelines in terms of fitting, projection, and combining operations on matrices.
Current thoughts:
- I deviated from the design docs a little bit to make the inheritance make more sense. I made a default `PipelineBase`, with `Pipeline` inheriting from it. I then made a `PipelineStep` that inherits from `PipelineBase`. `Estimator` and `Transformer` both inherit from `PipelineStep`.
  - `PipelineBase` and `Pipeline` differ because a `Pipeline` should have steps. This isn't true for a single `PipelineStep`.
  - `PipelineBase` and `PipelineStep` differ to indicate that each step has a `step_name` associated with it. The split also gives transformers/predictors a shared interface for how they are printed and how they can be concatenated to create a pipeline.
- I had to change `transform()` to `project()`, since I found a generic function in base with the same name. Additionally, I found `predict()` in the stats package, and changed it to `estimate()`.
- Tests will be added in a sister PR, as there are no transformers/estimators built yet to test functionality here.
- I'm not sure which methods I should document in detail, given that we are not sure how much of this we want to expose. I documented the generics themselves, to give a meta-look at how to use the methods in both `PipelineStep` and `Pipeline`. However, it isn't clear to me whether I need to keep providing an extensive docstring for every overridden method in child classes.
- I'm not sure which classes I should be exposing to the reference either. I found that previous BPCells classes (e.g. `IterableMatrix`) aren't heavily described in the reference. I provided some information on `Pipeline`, `Estimator`, and `Transformer`, and exposed them to the reference page. I also tried to explain in the docstring how to create a `Transformer` and an `Estimator` yourself.
- I don't think I'm completely sold on using the `show()` method as an analog of the Python `__repr__()` dunder. I think it could be more useful to make it act more like what you used to display `IterableMatrix`, i.e. where we still have information on what steps are in a pipeline, but also macro information, like hyperparameters or details on what the step has been fit to. In that case, we would have a `__repr__()` analog somewhere else.
- It is probably redundant to have both `project()` and `estimate()`. What do you think about just combining them into one?
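For reference, here is a minimal sketch of the hierarchy described above, written with S4 classes. It is only an illustration, not the code in this PR: the slot layout, the method bodies, the generic signatures, and the toy `ScaleBy` transformer with its `multiplier` slot are all assumptions. Only the class names, `step_name`, `project()`, `estimate()`, and `show()` come from the description.

```r
library(methods)

# Virtual root class: shared parent of pipelines and single steps.
setClass("PipelineBase", representation("VIRTUAL"))

# A single step carries a name; a pipeline carries a list of steps.
setClass("PipelineStep", contains = "PipelineBase",
         slots = c(step_name = "character"))
setClass("Pipeline", contains = "PipelineBase",
         slots = c(steps = "list"))

setClass("Transformer", contains = "PipelineStep")
setClass("Estimator", contains = "PipelineStep")

# New generics, named so they do not clash with base::transform()
# or stats::predict(). The (object, x, ...) signature is an assumption.
setGeneric("project", function(object, x, ...) standardGeneric("project"))
setGeneric("estimate", function(object, x, ...) standardGeneric("estimate"))

# Hypothetical toy transformer: multiplies a matrix by a constant.
setClass("ScaleBy", contains = "Transformer",
         slots = c(multiplier = "numeric"))
setMethod("project", "ScaleBy", function(object, x, ...) {
  x * object@multiplier
})

# A pipeline projects by applying each step in order.
setMethod("project", "Pipeline", function(object, x, ...) {
  for (step in object@steps) {
    x <- project(step, x)
  }
  x
})

# show() used as a compact listing of steps, one line per step.
setMethod("show", "PipelineStep", function(object) {
  cat(as.character(class(object)), "(", object@step_name, ")\n", sep = "")
})
setMethod("show", "Pipeline", function(object) {
  cat("Pipeline with", length(object@steps), "step(s)\n")
  for (step in object@steps) {
    cat("  ")
    show(step)
  }
})

# Usage: build a two-step pipeline and project a small matrix through it.
pipeline <- new("Pipeline", steps = list(
  new("ScaleBy", step_name = "double", multiplier = 2),
  new("ScaleBy", step_name = "triple", multiplier = 3)
))
mat <- matrix(1:4, nrow = 2)
projected <- project(pipeline, mat)
print(pipeline)
```

In this sketch `project()` and `estimate()` share the same signature, which is part of why merging them into a single generic looks feasible.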