Description
I'm digging into compositional groundings using the GroundingInsightExporter, and I noticed that the "score" returned for a given slot grounding does not equal the "avg match" score obtained by averaging over all the positive examples. In some cases, the grounding ranked second by "score" has a higher "avg match" than the top-ranked grounding, and it is in fact sometimes the preferred grounding.
As an example, in a sentence like "X caused population growth", the top theme grounding is "wm/concept/population_demographics/" with a score of 0.88844055 but an avg match score of 0.60294354. The second best theme grounding is "wm/concept/population_demographics/population_density/population_growth" with a score of 0.86057734 (lower than the top grounding) but an avg match of 0.7405923 (higher than the top grounding).
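To make the discrepancy concrete, here is a minimal sketch that ranks the two groundings from this example by each metric. The numbers are copied from above; the dictionary layout and the key names `score` and `avg_match` are just my shorthand, not the exporter's actual output format.

```python
# Values copied from the example in this issue.
# The dict layout and key names are illustrative assumptions only.
groundings = [
    {
        "name": "wm/concept/population_demographics/",
        "score": 0.88844055,
        "avg_match": 0.60294354,
    },
    {
        "name": "wm/concept/population_demographics/population_density/population_growth",
        "score": 0.86057734,
        "avg_match": 0.7405923,
    },
]

# Pick the best grounding under each metric.
by_score = max(groundings, key=lambda g: g["score"])
by_avg_match = max(groundings, key=lambda g: g["avg_match"])

print("top by score:    ", by_score["name"])
print("top by avg match:", by_avg_match["name"])
```

The two metrics pick different winners here, which is the behaviour I'm trying to understand.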
Any idea why these scores are different, and where they are computed? I think I tracked down where "avg match" is computed, but the regular "score" is buried several layers deep in the different grounding classes. Any help is greatly appreciated!