Contributor

@nnt-git13 nnt-git13 commented Oct 6, 2025

Author: Nathan Teshome

Summary:
This pull request introduces the working version of the Brain-Score Wayback Filter, developed as part of my UROP project.

The goal of this tool is to add a dual-date slider, similar to the existing sliders in Benchmark Properties, with synchronized date inputs that let users travel through the leaderboard to any historical window, so models and scores are shown as they existed then, not just as they are today. The feature wires the UI to backend filtering on Score.end_timestamp, adds similarly robust UX behaviors (keyboard and mouse), and ensures performant, debounced grid updates in AG Grid.

Features Implemented:

  • Dual Date Slider (Wayback UI)

    • Interactive two-handle slider for start and end dates with live labels.
    • Calendar picker that supports date-time filtering by selecting dates.
    • Direct text inputs mirror the slider; changes in either stay in sync.
    • Debounced updates to avoid excessive renders/network calls while dragging.
  • Backend Time-Window Filtering

    • Leaderboard rows filter by Score.end_timestamp (leaf scores only), matching the Brain-Score data model.
    • Supports open-ended ranges (only start or only end provided).
  • AG Grid Integration

    • Efficient row filtering without full data reloads; preserves column state, sort, and selection.
    • Visual highlight of filtered state; clear-filters control restores full view.
  • Export File Support

    • Marks exported files with the timestamp range that was active in the Wayback Filter.
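
To make the time-window behavior concrete, here is a minimal Python sketch of filtering leaf scores by end_timestamp with open-ended ranges. It is not the code from this PR: `ScoreRow`, `in_window`, and `filter_scores` are hypothetical names, the bounds are assumed to be inclusive, and the handling of rows without a timestamp is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass
class ScoreRow:
    # Hypothetical stand-in for a leaf score; only end_timestamp is named in the PR.
    model: str
    benchmark: str
    end_timestamp: Optional[datetime]


def in_window(row: ScoreRow,
              start: Optional[datetime] = None,
              end: Optional[datetime] = None) -> bool:
    """Return True when row.end_timestamp falls inside [start, end].

    A bound of None means that side of the range is open.
    """
    ts = row.end_timestamp
    if ts is None:
        # Assumption: an undated score only passes when no window is set.
        return start is None and end is None
    if start is not None and ts < start:
        return False
    if end is not None and ts > end:
        return False
    return True


def filter_scores(rows: Iterable[ScoreRow],
                  start: Optional[datetime] = None,
                  end: Optional[datetime] = None) -> List[ScoreRow]:
    """Keep only the rows whose end_timestamp is inside the window."""
    return [row for row in rows if in_window(row, start, end)]
```

Passing only `end` gives the "leaderboard as of this date" view, while passing only `start` keeps everything scored on or after that date.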

Final Visualization Demo:

Brain-Score_PR_Vid.mov

Contributor

@KartikP KartikP left a comment

Hey @nnt-git13! Great job with this so far. Would it be possible to eliminate the stray whitespace and indentation changes?

Contributor

@KartikP KartikP left a comment

Great work @nnt-git13! The functionality for the most part looks good. There are still some things we should address before this feature is completely wrapped up. That said, I've left a few comments, some of which are mentioned once but apply multiple times throughout the PR. Once those are addressed, and @mike-ferguson gives a final pass afterwards, I think it's ready to merge.

Once merged, we'll continue to refine this feature with a dedicated panel for the wayback and the historical model score/rank on model card pages.

@nnt-git13 nnt-git13 requested a review from KartikP December 8, 2025 05:30
@KartikP KartikP closed this Dec 8, 2025
@KartikP KartikP reopened this Dec 8, 2025
@KartikP KartikP closed this Dec 9, 2025
@KartikP KartikP reopened this Dec 9, 2025
Contributor

KartikP commented Dec 15, 2025

Update web_tests db

@KartikP KartikP closed this Dec 15, 2025
@KartikP KartikP reopened this Dec 15, 2025
Adds start_timestamp and end_timestamp fields to materialized views:
- Added to mv_base_scores and mv_final_benchmark_context
- Added to final_agg_scores table structure
- Included in INSERT statements for leaf and parent scores
- For parent nodes, uses MIN(start_timestamp) and MAX(end_timestamp) from children
- Added to final model/benchmark context SELECT statements
- Included in score_json JSON aggregation
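
The parent-node rule above (MIN of start_timestamp, MAX of end_timestamp over the children) can be sketched in Python as follows. This only illustrates the rule and is not the SQL from the materialized views; `aggregate_windows` and its arguments are hypothetical names. Missing timestamps are skipped, mirroring how SQL MIN and MAX ignore NULLs.

```python
from datetime import datetime
from typing import Dict, List, Optional, Tuple

Window = Tuple[Optional[datetime], Optional[datetime]]


def aggregate_windows(children: Dict[str, List[str]],
                      leaf_windows: Dict[str, Window],
                      root: str) -> Dict[str, Window]:
    """Compute a (start, end) window for every node under `root`.

    Leaves take their own window; a parent takes the earliest start and
    the latest end found among its children.
    """
    result: Dict[str, Window] = {}

    def visit(node: str) -> Window:
        kids = children.get(node, [])
        if not kids:
            # Leaf benchmark: use its own timestamps (may be missing).
            result[node] = leaf_windows.get(node, (None, None))
            return result[node]
        windows = [visit(kid) for kid in kids]
        # Skip missing values, as SQL MIN/MAX skip NULLs.
        starts = [s for s, _ in windows if s is not None]
        ends = [e for _, e in windows if e is not None]
        result[node] = (min(starts) if starts else None,
                        max(ends) if ends else None)
        return result[node]

    visit(root)
    return result
```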
Contributor

KartikP commented Dec 15, 2025

Update

  • Cleaned up some of the code
  • Fixed the reset-all-filters button, which wasn't resetting the wayback slider
  • Disabled the start_timestamp slider handle
  • Fixed calendar input box overflow and sizing
  • Updated web_tests db with timestamps (copied dev --> web_tests)
  • Updated other tests that were failing due to updating leaderboard scores

Member

@mike-ferguson mike-ferguson left a comment

Great work! Really great stuff with just a few bugs left, but overall very nicely done!

Contributor

KartikP commented Dec 17, 2025

We have a built-in debounce of 100ms. This is intentional, to prevent every increment of a filter change from triggering a leaderboard update. That said, the total latency is longer than that.
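
For readers unfamiliar with the pattern, a trailing-edge debounce can be sketched like this. The real debounce lives in the frontend code; this Python version only illustrates the idea, and `Debouncer` is a hypothetical name. The 100ms default comes from the comment above.

```python
import threading
from typing import Any, Callable, Optional


class Debouncer:
    """Trailing-edge debounce: run `fn` only after `wait` seconds of quiet."""

    def __init__(self, fn: Callable[..., Any], wait: float = 0.1) -> None:
        self._fn = fn
        self._wait = wait
        self._timer: Optional[threading.Timer] = None
        self._lock = threading.Lock()

    def __call__(self, *args: Any, **kwargs: Any) -> None:
        with self._lock:
            # A newer call replaces any update that is still waiting.
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(
                self._wait, self._fn, args=args, kwargs=kwargs)
            self._timer.daemon = True
            self._timer.start()
```

Only the last call in a burst reaches `fn`, and the wait is a lower bound on latency: the work done by `fn` itself is added on top, which is why the total delay a user sees exceeds 100ms.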

Upon investigation, one of the main bottlenecks was in updateFilteredScores(), which performed benchmark depth calculations inside the model row loop to rebuild the benchmark hierarchy map. This was unnecessary; it only needs to be done once per filter update.

I have made this improvement. It might not be terribly noticeable, but all functionality is preserved.
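
The shape of that change, moving loop-invariant work out of the per-row loop, can be sketched in Python as below. This is not the actual updateFilteredScores() code: the function names, the `parent_of` mapping, and the mean-of-children aggregation are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

Scores = Dict[str, Optional[float]]


def build_depths(parent_of: Dict[str, Optional[str]]) -> Dict[str, int]:
    """Depth of every benchmark in the hierarchy (roots have depth 0)."""
    depths: Dict[str, int] = {}

    def depth(benchmark: str) -> int:
        if benchmark not in depths:
            parent = parent_of.get(benchmark)
            depths[benchmark] = 0 if parent is None else depth(parent) + 1
        return depths[benchmark]

    for benchmark in parent_of:
        depth(benchmark)
    return depths


def update_filtered_scores(model_rows: List[Scores],
                           parent_of: Dict[str, Optional[str]]) -> List[Scores]:
    # Hierarchy work happens once per filter update, not once per model row.
    depths = build_depths(parent_of)
    children: Dict[str, List[str]] = {}
    for benchmark, parent in parent_of.items():
        if parent is not None:
            children.setdefault(parent, []).append(benchmark)
    # Deepest parents first, so child aggregates exist before they are needed.
    parents = sorted(children, key=lambda b: depths[b], reverse=True)

    for scores in model_rows:
        for parent in parents:
            values = [scores[c] for c in children[parent]
                      if scores.get(c) is not None]
            # Assumption: a parent score is the mean of its available children.
            scores[parent] = sum(values) / len(values) if values else None
    return model_rows
```

With M model rows and B benchmarks, the hierarchy is now walked once instead of M times, while the per-row aggregation still costs O(B).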

Contributor

KartikP commented Dec 17, 2025

I've broken the score aggregation. Will address this.

Will build a better unit test to catch this issue.

Update: Fixed

1. Cache wayback filtering results: build Set of hidden benchmarks once, reuse for O(1) lookups instead of iterating grid nodes

2. Cache root parent lookups: build Map of benchmark -> root parent once, reuse for color recalculation instead of traversing hierarchy for each benchmark
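
Both caches follow the same build-once, look-up-many pattern. A minimal Python sketch, assuming a hypothetical `column_values` mapping (benchmark to displayed cell values) and `parent_of` mapping (benchmark to parent, None for roots); the actual code works on AG Grid data structures instead.

```python
from typing import Dict, List, Optional, Set


def build_hidden_set(column_values: Dict[str, List[str]]) -> Set[str]:
    """Benchmarks whose visible cells are all 'X', computed once per update."""
    return {
        benchmark
        for benchmark, values in column_values.items()
        if values and all(value == "X" for value in values)
    }


def build_root_map(parent_of: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Map every benchmark to its root ancestor, memoizing along the way."""
    roots: Dict[str, str] = {}

    def root(benchmark: str) -> str:
        if benchmark not in roots:
            parent = parent_of.get(benchmark)
            roots[benchmark] = benchmark if parent is None else root(parent)
        return roots[benchmark]

    for benchmark in parent_of:
        root(benchmark)
    return roots
```

After these are built, `benchmark in hidden` and `roots[benchmark]` are constant-time on average, so color recalculation no longer has to traverse the hierarchy for each benchmark.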
…all models failed

- Problem: Wayback filtering hid benchmarks whose values were all X. However, when a model property filter produced results in which every visible model had failed, the benchmark column was also hidden, which disrupted aggregation.
- Solution: Columns are now hidden only when wayback filtering is active and every value in the column is X. When wayback filtering is inactive, columns are not hidden based on X values alone.
- Updated the wayback test to sort by score instead of rank.
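
The hiding rule described in the solution reduces to a small predicate. A Python sketch, with `should_hide_column` as a hypothetical name and "X" as the marker for a failed or missing score:

```python
from typing import Sequence


def should_hide_column(values: Sequence[str], wayback_active: bool) -> bool:
    """Hide a benchmark column only for wayback-driven all-X results."""
    if not wayback_active:
        # Other filters can also leave a column full of X values;
        # that alone must not hide it, or aggregation breaks.
        return False
    return bool(values) and all(value == "X" for value in values)
```

Treating an empty column as visible is an assumption here; the PR does not say how that case is handled.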
Contributor

KartikP commented Dec 18, 2025

Preview

Screen.Recording.2025-12-18.at.6.15.29.AM.mov

Contributor

@KartikP KartikP left a comment

LGTM and ready!

@KartikP KartikP merged commit c29d598 into master Dec 18, 2025
1 check passed

4 participants