@allenschmaltz
An alternative approach to data attribution for models with non-identifiable parameters (e.g., LLMs) is to make a semi-supervised connection to the observed data, conditional on the output prediction, by adding a bottleneck ("exemplar") layer to the model and re-casting the prediction as a function over the training set's labels and representation space via a metric-learner approximation. I added an early work from 2021 introducing this line of work ("Detecting Local Insights from Global Labels: Supervised & Zero-Shot Sequence Labeling via a Convolutional Decomposition", https://doi.org/10.1162/coli_a_00416).
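
To make the re-casting concrete, here is a minimal sketch of one such metric-learner approximation: the prediction for a query is scored directly over cached training-exemplar representations and their labels, so the nearest exemplars double as the attribution. The specific function (a distance-weighted K-nearest-exemplar vote) and all names (`approximate_prediction`, `k`) are illustrative assumptions, not the papers' exact formulation.

```python
import numpy as np

def approximate_prediction(query_rep, train_reps, train_labels, n_classes, k=8):
    """Re-cast the prediction as a function over the training set's labels
    and representation space: score each class by the distance-weighted
    labels of the k nearest training exemplars in the bottleneck space.
    (Illustrative sketch; the cited works define their own function.)"""
    # Distances from the query to every cached training exemplar.
    dists = np.linalg.norm(train_reps - query_rep, axis=1)
    nearest = np.argsort(dists)[:k]
    # Convert distances to weights; closer exemplars contribute more.
    weights = np.exp(-dists[nearest])
    scores = np.zeros(n_classes)
    for idx, w in zip(nearest, weights):
        scores[train_labels[idx]] += w
    probs = scores / scores.sum()
    # The nearest exemplars are the attribution: the training points
    # on which the (approximated) prediction rests.
    return probs, nearest
```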

How do we know that the matched exemplars are actually relevant, or equivalently, that the approximation is faithful to the original model? One simple (but meaningful) metric is whether the predicted class of the metric-learner approximation matches that of the original model; when they do not match, the discrepancies should be concentrated in low-probability regions. Remarkably, relatively simple functions over the representation space and labels achieve this property. More importantly, this then leads to methods with which we can close the loop on the connection between the data, the representation space, the predictions, and the predictive uncertainty. In other words, interpretability-by-exemplar and uncertainty-awareness become intrinsic properties of the model, as shown in subsequent works (e.g., https://arxiv.org/abs/2502.20167).
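
A hedged sketch of that faithfulness check, assuming the approximation's class probabilities and the original model's predictions are precomputed arrays (the threshold value and function name are illustrative):

```python
import numpy as np

def faithfulness_report(approx_probs, original_preds, low_prob_threshold=0.5):
    """approx_probs: (N, C) class probabilities from the approximation.
    original_preds: (N,) predicted classes from the original model."""
    approx_preds = approx_probs.argmax(axis=1)
    agree = approx_preds == original_preds
    # Metric 1: how often the approximation matches the original model.
    match_rate = float(agree.mean())
    # Confidence the approximation assigns to its own predicted class.
    conf = approx_probs.max(axis=1)
    disagree_conf = conf[~agree]
    # Metric 2: if the approximation is faithful, disagreements should
    # concentrate in the low-probability region, not spread uniformly.
    frac_low = (float((disagree_conf < low_prob_threshold).mean())
                if disagree_conf.size else float("nan"))
    return {"match_rate": match_rate,
            "fraction_of_disagreements_below_threshold": frac_low}
```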

This parallel line of work addresses the bulk of the limitations mentioned in Section 7, including efficiency, evaluation, real-world applicability, and the effective sample size (which relates to "Emphasizing Group Influence over Pointwise Influence"). On that last point, other influence-based approaches lack a direct means of controlling for predictive uncertainty (and out-of-distribution points), so it is relatively likely that those influence-analysis-based approaches would be misleading (perhaps severely so) when the epistemic uncertainty is high.
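
As a minimal sketch of how the exemplar-based view controls for this, an attribution can simply be withheld when the approximation is unconfident or the query is far from every cached exemplar (a simple out-of-distribution proxy). Everything here, including the threshold values, is an illustrative assumption rather than the cited methods' exact procedure:

```python
import numpy as np

def gated_exemplar_attribution(query_rep, train_reps, train_labels,
                               min_confidence=0.9, max_distance=1.0, k=8):
    """Return the supporting training exemplars only when the prediction
    is high-confidence and in-distribution; otherwise abstain, rather
    than risk a misleading attribution under high epistemic uncertainty."""
    dists = np.linalg.norm(train_reps - query_rep, axis=1)
    nearest = np.argsort(dists)[:k]
    weights = np.exp(-dists[nearest])
    labels = train_labels[nearest]
    # Distance-weighted vote over the nearest exemplars' labels.
    classes = np.unique(train_labels)
    scores = np.array([weights[labels == c].sum() for c in classes])
    probs = scores / scores.sum()
    # Abstain on low confidence, or when even the closest exemplar is
    # far from the query (out-of-distribution proxy).
    if probs.max() < min_confidence or dists[nearest[0]] > max_distance:
        return None
    return nearest  # indices of the supporting training exemplars
```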
