The current implementation of integrated gradients interpolates over the entire patch mask at once. However, this introduces dependencies between upstream and downstream nodes. Let f_k(x) denote the clean output of component k, f'_k(x) its output on the patch run, and f_k_alpha(x) its output when the outputs of previous components have themselves been interpolated with alpha. For component k's IG step we want to substitute f_k(x) with alpha * f_k(x) + (1 - alpha) * f'_k(x), but with joint interpolation we instead substitute alpha * f_k_alpha(x) + (1 - alpha) * f'_k(x) — the "clean" term is already contaminated by the upstream interpolation. Sparse Feature Circuits addresses this in section 2:
> This [IG] cannot be done in parallel for two nodes when one is downstream of another, but can be done in parallel for arbitrarily many nodes which do not depend on each other. Thus the additional cost of computing ÎE_ig over ÎE_atp scales linearly in N and the serial depth of m's computation graph.
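A toy numeric illustration of the contamination (hypothetical example, not the repo's code): with a nonlinear downstream component, blending the upstream output first changes the "clean" term that enters the downstream interpolation.

```python
# Two chained components: component 2 consumes component 1's output.
def f1(x):
    return 2.0 * x       # upstream component

def f2(h1):
    return h1 ** 2       # downstream component (nonlinear, so blending matters)

x_clean, x_patch = 1.0, 0.0
alpha = 0.5

# Clean and patch-run activations of component 1.
h1_clean, h1_patch = f1(x_clean), f1(x_patch)
# Stored patch-run activation of component 2.
h2_patch = f2(h1_patch)

# What we want for component 2's IG step: its clean term is computed
# from the *unmodified* upstream output.
h2_wanted = alpha * f2(h1_clean) + (1 - alpha) * h2_patch

# What joint interpolation over the whole mask actually does: component 2
# sees an already-interpolated upstream output, so its "clean" term is f2(h1_alpha).
h1_alpha = alpha * h1_clean + (1 - alpha) * h1_patch
h2_actual = alpha * f2(h1_alpha) + (1 - alpha) * h2_patch

print(h2_wanted, h2_actual)  # the two values differ
```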
I don't think this edge case is different, so I believe the current implementation in `prune_algos.mask_gradient` is incorrect and should be adjusted to compute scores iteratively over source-node layers. (This would change the time complexity from O(forward * N) to O(forward * n_layers * N), so maybe add it as an optional setting.)
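A minimal sketch of the layerwise variant (all names are illustrative, not the `prune_algos` API): each layer's output is interpolated toward its patch-run activation one layer at a time, with every other layer computed live on the clean input, and an IG-style score is accumulated per layer via a numeric derivative in alpha.

```python
def forward(layers, x, interp=None):
    """Run the model; optionally blend one layer's output toward a patch activation.

    interp is (layer_index, alpha, patch_activation) or None.
    """
    h = x
    acts = []
    for i, layer in enumerate(layers):
        h = layer(h)
        if interp is not None and interp[0] == i:
            _, alpha, patch_act = interp
            h = alpha * h + (1 - alpha) * patch_act
        acts.append(h)
    return h, acts

def layerwise_ig_scores(layers, x_clean, x_patch, n_steps=10, eps=1e-4):
    """IG-style attribution per layer, interpolating one layer at a time.

    Serial over layers (hence the extra n_layers factor in the cost),
    so each layer's "clean" term is never contaminated by upstream blending.
    """
    _, patch_acts = forward(layers, x_patch)
    scores = []
    for i in range(len(layers)):
        total = 0.0
        for k in range(1, n_steps + 1):
            alpha = k / n_steps
            # numeric d(output)/d(alpha) at this interpolation point
            out_hi, _ = forward(layers, x_clean, (i, alpha + eps, patch_acts[i]))
            out_lo, _ = forward(layers, x_clean, (i, alpha - eps, patch_acts[i]))
            total += (out_hi - out_lo) / (2 * eps)
        scores.append(total / n_steps)
    return scores

# Usage on the two-component toy model from above:
layers = [lambda h: 2.0 * h, lambda h: h ** 2]
scores = layerwise_ig_scores(layers, x_clean=1.0, x_patch=0.0)
```

In a real implementation the inner alpha loop would batch all layers at the same serial depth into one forward pass, matching the parallelism the Sparse Feature Circuits excerpt describes.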