Build Variable Importance Function #4

@gaffney2010

Per Dan:

Build a variable importance function that uses:
the built-in importance functions in scikit-learn
Shapley-value-based importance (run time is exponential: 2^n models to fit, where n is the number of predictors/features in the model)
Perhaps we could use correlation to build a network of features so that, instead of testing all coalitions, we only test those whose members are highly correlated
The assumption would be that the contributions of independent (uncorrelated) variables are roughly additive (this seems fair)
We would still account for all possible subsets, but for uncorrelated variables we could just add up their contributions
If the Shapley value importance is fit on the training set and evaluated on the holdout set, then after calculating the Shapley values we could simply remove all variables with a negative Shapley value
This would be an alternative to forward/backward stepwise regression for variable selection
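
A rough sketch of how the Shapley ideas above might fit together, not a settled design. All names here (`shapley_values`, `correlation_blocks`, `blockwise_shapley`, `toy_holdout_score`) are hypothetical, and the toy score stands in for "fit a model on the training set using this subset of features, then score it on the holdout set". The exact version evaluates all 2^n coalitions; the block-wise version only evaluates coalitions inside each group of correlated features and relies on the additivity assumption across groups.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley values; calls value() once per coalition (2^n calls)."""
    n = len(features)
    cache = {}
    for k in range(n + 1):
        for subset in combinations(features, k):
            cache[frozenset(subset)] = value(frozenset(subset))
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Standard Shapley weight for coalitions of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (cache[s | {f}] - cache[s])
        phi[f] = total
    return phi


def correlation_blocks(features, corr, threshold=0.5):
    """Connected components of the graph linking pairs with |corr| >= threshold."""
    parent = {f: f for f in features}

    def find(f):
        while parent[f] != f:
            f = parent[f]
        return f

    for a, b in combinations(features, 2):
        if abs(corr.get((a, b), corr.get((b, a), 0.0))) >= threshold:
            parent[find(a)] = find(b)
    blocks = {}
    for f in features:
        blocks.setdefault(find(f), []).append(f)
    return list(blocks.values())


def blockwise_shapley(features, value, corr, threshold=0.5):
    """Shapley values computed per block, assuming blocks contribute additively."""
    phi = {}
    for block in correlation_blocks(features, corr, threshold):
        phi.update(shapley_values(block, value))
    return phi


# Toy stand-in for a holdout score (assumption: x1 and x2 overlap, x4 hurts).
FEATURES = ["x1", "x2", "x3", "x4"]
CORR = {("x1", "x2"): 0.9}  # hypothetical pairwise correlations; missing pairs count as 0
calls = []


def toy_holdout_score(subset):
    calls.append(subset)
    score = 0.2 * ("x1" in subset) + 0.2 * ("x2" in subset)
    score -= 0.1 * ("x1" in subset and "x2" in subset)
    return score + 0.05 * ("x3" in subset) - 0.02 * ("x4" in subset)


exact = shapley_values(FEATURES, toy_holdout_score)
n_exact = len(calls)  # 2^4 = 16 coalition evaluations
calls.clear()
approx = blockwise_shapley(FEATURES, toy_holdout_score, CORR)
n_block = len(calls)  # 2^2 + 2^1 + 2^1 = 8 coalition evaluations

# Variable selection: drop every feature with a negative Shapley value.
selected = [f for f, v in approx.items() if v >= 0]
```

In this toy case the block-wise values match the exact ones because the score really is additive across blocks; with real models the match would only be approximate, and the correlation threshold would need tuning.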

Figure out a way to evaluate variable importance when using dummy variables
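
One possible approach for dummy variables, sketched under assumptions: treat all dummy columns of one categorical variable as a single group and shuffle them together on the holdout set, so each shuffled row still carries a valid one-hot encoding. This is a grouped permutation importance, not the Shapley approach above. The function name, the `groups` mapping, and the toy data are hypothetical; `score` stands in for evaluating an already fitted model.

```python
import random


def grouped_permutation_importance(rows, score, groups, n_repeats=5, seed=0):
    """Mean drop in score when all columns of a group are shuffled together.

    rows   : list of dicts, one per holdout observation
    score  : callable(rows) -> float, higher is better (model already fitted)
    groups : dict mapping a variable name to the list of its columns
    """
    rng = random.Random(seed)
    baseline = score(rows)
    importance = {}
    for name, columns in groups.items():
        drops = []
        for _ in range(n_repeats):
            donors = rows[:]
            rng.shuffle(donors)
            # Copy the whole group of columns from one donor row at a time.
            shuffled = [
                {**row, **{c: donor[c] for c in columns}}
                for row, donor in zip(rows, donors)
            ]
            drops.append(baseline - score(shuffled))
        importance[name] = sum(drops) / n_repeats
    return importance


# Toy holdout set: "color" is one-hot encoded, "noise" is unrelated to y.
HOLDOUT = [
    {"color_red": 1, "color_blue": 0, "noise": 0.3, "y": 1},
    {"color_red": 0, "color_blue": 1, "noise": 0.8, "y": 0},
    {"color_red": 1, "color_blue": 0, "noise": 0.1, "y": 1},
    {"color_red": 0, "color_blue": 1, "noise": 0.5, "y": 0},
    {"color_red": 1, "color_blue": 0, "noise": 0.9, "y": 1},
    {"color_red": 0, "color_blue": 1, "noise": 0.2, "y": 0},
]


def toy_accuracy(rows):
    # Stand-in for a fitted model that predicts y from color_red only.
    return sum(row["color_red"] == row["y"] for row in rows) / len(rows)


GROUPS = {"color": ["color_red", "color_blue"], "noise": ["noise"]}
importance = grouped_permutation_importance(HOLDOUT, toy_accuracy, GROUPS)
```

The same grouping idea could carry over to the Shapley version by treating each categorical variable, rather than each dummy column, as one player in the coalitions.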
