This is a PyTorch implementation of the paper "Identifying the relevant dependencies of the neural network response on characteristics of the input space" (S. Wunsch, R. Friese, R. Wolf, G. Quast).
As explained in the paper, the method computes the Taylor coefficients of a Taylor expansion of the model function.
Analyzing these Taylor coefficients identifies not only first-order feature importance, but also higher-order importance, i.e. the importance of combinations of features.
This module can be applied to any differentiable PyTorch model.
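To illustrate what an averaged Taylor coefficient is, here is a minimal package-independent sketch in plain Python. The toy function `f(x0, x1) = x0*x1 + x0` stands in for a network output (it is invented for illustration; the package itself uses autograd, not finite differences). The first-order coefficient $\langle\mathrm{TC}_{0}\rangle$ averages $\partial f/\partial x_0$ over a batch, the second-order coefficient $\langle\mathrm{TC}_{0,1}\rangle$ averages the mixed derivative $\partial^2 f/\partial x_0\,\partial x_1$ (the factorial prefactors of the Taylor expansion are all 1 for these terms):

```python
import random

def f(x0, x1):
    # toy stand-in for a network output: f(x) = x0*x1 + x0
    return x0 * x1 + x0

def d_dx0(x0, x1, h=1e-5):
    # central finite difference for the first derivative w.r.t. x0
    return (f(x0 + h, x1) - f(x0 - h, x1)) / (2 * h)

def d2_dx0dx1(x0, x1, h=1e-4):
    # finite difference for the mixed second derivative w.r.t. x0 and x1
    return (f(x0 + h, x1 + h) - f(x0 + h, x1 - h)
            - f(x0 - h, x1 + h) + f(x0 - h, x1 - h)) / (4 * h * h)

random.seed(0)
batch = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(1000)]

# average the derivatives over the batch, as in <TC_...>
tc_0 = sum(d_dx0(a, b) for a, b in batch) / len(batch)       # <df/dx0> = <x1 + 1>, close to 1
tc_01 = sum(d2_dx0dx1(a, b) for a, b in batch) / len(batch)  # <d2f/(dx0 dx1)> = 1 exactly
```

The non-vanishing $\langle\mathrm{TC}_{0,1}\rangle$ reveals the interaction term `x0*x1`, which a purely first-order importance measure would miss.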
```bash
pip install tayloranalysis
```
Import tayloranalysis:
```python
import tayloranalysis as ta
```
Wrap either an already initialized PyTorch class instance or the class itself to extend it with the tayloranalysis functionality:
```python
model = ...  # your differentiable PyTorch model
model = ta.extend_model(model)
```
Compute Taylor coefficients, for example for the input tensor x_test:
```python
import torch

combinations = [(0,), (0, 1)]
x_test = torch.randn(n_batch, n_features)  # placeholder dimensions
forward_kwargs = {"x": x_test, "more_inputs": misc}  # misc: any further model inputs
tc_dict = model.get_tc(
    forward_kwargs_tctensor_key="x",
    tc_idx_list=combinations,
    reduce_func=torch.mean,
    forward_kwargs=forward_kwargs,
)
```
The output in this example is a dict containing the Taylor coefficients $\langle\mathrm{TC}_{0}\rangle$ and $\langle\mathrm{TC}_{0,1}\rangle$.
This package is designed for maximal flexibility: the reduction function (e.g. mean, median, absolute value) must be specified, while the visualization is left entirely to the user. At this point, have a look at our example.
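Since visualization is left to the user, a common first step is to rank the returned coefficients by magnitude. A minimal sketch, assuming `tc_dict` maps index tuples to scalar values (the values below are made up for illustration, not real results):

```python
# hypothetical output of get_tc; values are illustrative only
tc_dict = {(0,): 0.62, (1,): -0.05, (0, 1): 0.31}

# rank feature combinations by absolute Taylor coefficient
ranking = sorted(tc_dict.items(), key=lambda kv: abs(kv[1]), reverse=True)
for idx, value in ranking:
    label = ",".join(str(i) for i in idx)
    print(f"<TC_{{{label}}}> = {value:+.2f}")
```

From here the ranking can be fed into any plotting library, e.g. as a bar chart of absolute coefficients per feature combination.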