RFC: Generalize Voting to Associative Connections #359
…association document
nielsleadholm
left a comment
| 2. **Probabilistic Vote Mapping** | ||
| - Map incoming votes to local object IDs using learned associations | ||
| - Weight votes by association confidence | ||
| - Handle uncertainty in associations gracefully |
question: Can you please clarify what this means exactly? What kind of uncertainty are you anticipating?
The “uncertainty” refers to noisy or conflicting association signals—e.g., multiple LMs proposing different object IDs with similar confidence, or spurious co-occurrences when objects briefly overlap. In the implementation (UnsupervisedAssociator in src/tbp/monty/frameworks/models/unsupervised_association.py), we address this by maintaining a decayed confidence history (AssociationData.update_confidence() / get_average_confidence()), computing spatial and temporal consistency scores, and combining all of these into a weighted strength via get_association_strength(). An association is only used once its strength clears min_association_threshold, and it keeps decaying if new evidence doesn’t arrive. Instead of locking onto the first co-occurrence, we continuously re-score the link and only forward votes when the accumulated evidence is sufficiently strong.
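To make the gating described above concrete, here is a minimal sketch. It is not the code in `unsupervised_association.py`: the class `AssociationSketch`, the decay factor, the weights, and the threshold value are all assumptions chosen for illustration, and only the method names `update_confidence()` / `get_average_confidence()` and the parameter name `min_association_threshold` are borrowed from the comment.

```python
from dataclasses import dataclass, field


@dataclass
class AssociationSketch:
    """Hypothetical stand-in for AssociationData (not the real class)."""

    decay: float = 0.9  # assumed per-update decay factor
    confidences: list = field(default_factory=list)

    def update_confidence(self, new_confidence=None):
        # Decay all past evidence, then record new evidence if any arrived.
        self.confidences = [c * self.decay for c in self.confidences]
        if new_confidence is not None:
            self.confidences.append(new_confidence)

    def get_average_confidence(self):
        if not self.confidences:
            return 0.0
        return sum(self.confidences) / len(self.confidences)


def association_strength(assoc, spatial, temporal, weights=(0.5, 0.25, 0.25)):
    # Weighted combination of confidence history and consistency scores.
    # The weights are illustrative assumptions, not values from the PR.
    w_conf, w_spatial, w_temporal = weights
    return (
        w_conf * assoc.get_average_confidence()
        + w_spatial * spatial
        + w_temporal * temporal
    )


def should_forward_vote(assoc, spatial, temporal, min_association_threshold=0.3):
    # Only forward votes once the accumulated evidence clears the threshold.
    strength = association_strength(assoc, spatial, temporal)
    return strength >= min_association_threshold
```

Under these assumed numbers, a single co-occurrence with confidence 0.5 and no spatial or temporal support stays below the threshold, while repeated evidence clears it and then falls back below it if updates keep arriving without new evidence.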
Thanks for clarifying.
| 1. **Multi-Modal Hypothesis Clustering** | ||
| - Group hypotheses from different LMs based on spatial/temporal consistency | ||
| - Use clustering to identify likely same-object hypotheses |
question: Can you please clarify what you mean by clustering here? This is not a term we've used before, so I'm wondering if the proposal is to change how votes are processed by LMs?
The term "clustering" in the RFC was a bit of an overstatement; it refers to how we group temporally related associations in the _calculate_temporal_clustering() function. Instead of traditional clustering, we analyze how densely associations occur in time. For example, if multiple LMs report the same object ID in quick succession, we consider that a stronger signal than sporadic reports. The implementation in UnsupervisedAssociator uses a simple density metric (number of associations per time unit) to score this temporal grouping, which is then factored into the overall association strength. Would you prefer we update the RFC to use "temporal grouping" instead of "clustering" to avoid confusion?
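As a rough illustration of the density metric described above (number of associations per time unit), here is a minimal sketch. The function name `temporal_density`, the sliding-window formulation, and the `window` parameter are assumptions made for this example; the actual `_calculate_temporal_clustering()` may compute the score differently.

```python
def temporal_density(timestamps, window):
    """Count associations inside the most recent `window`, per time unit.

    `timestamps` are the times at which an association was observed.
    Both the windowing and the units are illustrative assumptions.
    """
    if not timestamps or window <= 0:
        return 0.0
    latest = max(timestamps)
    # Keep only the observations that fall inside the recent window.
    recent = [t for t in timestamps if latest - t <= window]
    return len(recent) / window


# Reports in quick succession score higher than sporadic ones.
dense = temporal_density([0, 1, 2, 3], window=10)
sparse = temporal_density([0, 50, 100], window=10)
```

With these inputs the burst of four reports yields a higher density than the three widely spaced ones, which is the ordering the comment describes.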
Thanks, that makes sense. Yes I think temporal consistency is already clear enough, in which case I don't think you need this line.
| 2. **Temporal Sequence Learning**: Dynamic object and scene understanding | ||
| 3. **Language Grounding**: Associate learned words with grounded objects | ||
| 4. **Advanced Clustering**: More sophisticated hypothesis grouping algorithms | ||
| 4. **Richer Hypothesis Grouping**: Explore graph-based or probabilistic grouping atop the learned association strengths once the foundational pipeline is validated |
question: Can you please provide a sentence or two on what you mean by graph-based or probabilistic grouping here?
The idea is that if objects A and B appear at the same time, and B and C appear at the same time, it may make sense to keep a (small) possibility that A and C belong to the same 'scene' even though they never appeared together.
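One possible reading of this idea, sketched below, treats the learned association strengths as edge weights in an undirected graph and assigns A and C a damped two-hop strength through B. Everything here is hypothetical: the function names, the `damping` factor, and the choice of multiplying strengths along the path are assumptions for illustration, not part of the RFC or the implementation.

```python
from collections import defaultdict


def build_graph(pair_strengths):
    """Build a symmetric adjacency map from {(x, y): strength} pairs."""
    graph = defaultdict(dict)
    for (x, y), strength in pair_strengths.items():
        graph[x][y] = strength
        graph[y][x] = strength
    return graph


def two_hop_strength(graph, a, c, damping=0.5):
    """Direct strength if a and c co-occurred, else a damped two-hop score.

    The indirect score multiplies the strengths along a-b-c and scales the
    product by `damping` so inferred links stay weaker than observed ones.
    """
    neighbours = graph.get(a, {})
    direct = neighbours.get(c, 0.0)
    indirect = max(
        (
            s_ab * graph[b].get(c, 0.0) * damping
            for b, s_ab in neighbours.items()
            if b != c
        ),
        default=0.0,
    )
    return max(direct, indirect)


graph = build_graph({("A", "B"): 0.8, ("B", "C"): 0.5})
a_to_c = two_hop_strength(graph, "A", "C")  # inferred via B, never observed
```

Here A and C never co-occur, yet they receive a small nonzero strength through B, while unrelated objects stay at zero.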
RFC proposal for this feature explained in docs