Suggested update to README.md #2
An alternative approach to data attribution for models with non-identifiable parameters (e.g., LLMs) is to make a semi-supervised connection to the observed data, conditional on the output prediction, by adding a bottleneck ("exemplar") layer to the model and re-casting the prediction as a function over the training set's labels and representation space via a metric-learner approximation. I added an early work from 2021 that introduced this line of research ("Detecting Local Insights from Global Labels: Supervised & Zero-Shot Sequence Labeling via a Convolutional Decomposition", https://doi.org/10.1162/coli_a_00416).
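To make the idea concrete, here is a minimal sketch (not the cited paper's exact formulation, and all class/function names are illustrative): a classifier with a low-dimensional "exemplar" bottleneck, whose prediction is approximated by a distance-weighted vote over training exemplars in the bottleneck space.

```python
# Sketch only: an exemplar-bottleneck classifier and a simple metric-learner
# (kNN-style) approximation of its prediction over the training set.
import torch
import torch.nn as nn

class ExemplarClassifier(nn.Module):
    def __init__(self, d_in: int, d_bottleneck: int, n_classes: int):
        super().__init__()
        # Bottleneck ("exemplar") layer inserted before the classification head.
        self.encoder = nn.Sequential(nn.Linear(d_in, 64), nn.ReLU(),
                                     nn.Linear(64, d_bottleneck))
        self.head = nn.Linear(d_bottleneck, n_classes)

    def forward(self, x):
        z = self.encoder(x)          # exemplar-layer representation
        return self.head(z), z

def knn_approximation(z_query, z_train, y_train, n_classes, k=8, temperature=1.0):
    """Re-cast the prediction as a function of the training labels and the
    representation space: a distance-weighted vote over the k nearest
    training exemplars (a simple stand-in for the metric-learner)."""
    d = torch.cdist(z_query, z_train)                # pairwise distances
    dist_k, idx_k = d.topk(k, largest=False)         # k nearest training exemplars
    w = torch.softmax(-dist_k / temperature, dim=1)  # distance-based weights
    votes = torch.zeros(z_query.shape[0], n_classes)
    for c in range(n_classes):
        votes[:, c] = (w * (y_train[idx_k] == c)).sum(dim=1)
    return votes, idx_k                              # class scores + matched exemplars
```

The returned `idx_k` are the matched exemplars: the specific training points that the approximated prediction is attributed to.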
How do we know that the matched exemplars are actually relevant, or equivalently, that the approximation is faithful to the original model? One simple (but meaningful) metric is whether the predicted class of the metric-learner approximation matches that of the original model, and, where they disagree, whether the discrepancies are concentrated in low-probability regions. Remarkably, relatively simple functions over the representation space and labels achieve that property. More importantly, this then leads to methods with which we can close the loop on the connection between the data, the representation space, the predictions, and the predictive uncertainty. In other words, interpretability-by-exemplar and uncertainty-awareness become intrinsic properties of the model, as shown in subsequent works (e.g., https://arxiv.org/abs/2502.20167).
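A rough sketch of that fidelity check, reusing the `knn_approximation` helper from the sketch above (the function name, the confidence threshold, and the bucketing are illustrative choices, not a prescribed evaluation protocol):

```python
# Sketch: class agreement between the original model and its metric-learner
# approximation, split by the model's own predictive confidence. If the
# approximation is faithful, disagreements should concentrate where the
# model's predicted probability is low.
import torch

@torch.no_grad()
def fidelity_report(model, x_eval, z_train, y_train, n_classes, low_prob=0.5):
    logits, z_eval = model(x_eval)
    probs = torch.softmax(logits, dim=1)
    conf, pred_model = probs.max(dim=1)               # original model's prediction
    votes, _ = knn_approximation(z_eval, z_train, y_train, n_classes)
    pred_approx = votes.argmax(dim=1)                 # metric-learner prediction
    agree = (pred_model == pred_approx)
    return {
        "overall_agreement": agree.float().mean().item(),
        # A bucket may be empty on small eval sets, yielding NaN.
        "agreement_high_conf": agree[conf >= low_prob].float().mean().item(),
        "agreement_low_conf": agree[conf < low_prob].float().mean().item(),
    }
```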
This parallel line of work addresses the bulk of the limitations mentioned in Section 7, including efficiency, evaluation, real-world applicability, and the effective sample size (which is related to "Emphasizing Group Influence over Pointwise Influence"). On that last point, other influence-based approaches lack a direct means of controlling for predictive uncertainty (and out-of-distribution points), so they are relatively likely to be misleading (perhaps severely so) when the epistemic uncertainty is high.