(1) A new benchmark, MV-RGBT, is collected to represent multi-modal warranting (MMW) scenarios, filling the gap between the data in current benchmarks and the imaging conditions that motivate RGBT tracking.
(2) A new problem, "when to fuse", is posed to develop reliable fusion strategies for RGBT trackers, since multi-modal information fusion may be counterproductive in MMW scenarios. To facilitate its discussion, a new solution with multiple tracking experts, MoETrack, is proposed. It achieves state-of-the-art performance on several benchmarks, including MV-RGBT, LasHeR, and VTUAV-ST.
(3) A new compositional perspective for method evaluation is provided by categorising MV-RGBT into two subsets, MV-RGBT-RGB and MV-RGBT-TIR, promoting a novel in-depth analysis and offering insightful recommendations for future developments in RGBT tracking.
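The compositional evaluation in (3) can be illustrated with a minimal sketch that reports a score per subset alongside the overall benchmark score. The input layout (sequence name, subset name, per-sequence score) and the use of a plain mean are assumptions made for illustration, not the evaluation toolkit shipped with MV-RGBT.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple

# Hypothetical record: (sequence_name, subset_name, per_sequence_score)
Result = Tuple[str, str, float]


def per_subset_scores(results: Iterable[Result]) -> Dict[str, float]:
    """Average a per-sequence score within each subset and over all sequences."""
    buckets = defaultdict(list)
    for _sequence, subset, score in results:
        buckets[subset].append(score)
        # Assumption: the full benchmark is the union of its subsets.
        buckets["MV-RGBT"].append(score)
    return {name: mean(values) for name, values in buckets.items()}


if __name__ == "__main__":
    demo = [
        ("seq_a", "MV-RGBT-RGB", 0.6),
        ("seq_b", "MV-RGBT-RGB", 0.4),
        ("seq_c", "MV-RGBT-TIR", 0.8),
    ]
    for name, value in per_subset_scores(demo).items():
        print(f"{name}: {value:.3f}")
```

Reporting the two subsets separately makes it visible when a tracker does well overall but relies mostly on one modality.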
🫵 Find our survey work at repo
The novelty of MoETrack is twofold: (1) the joint training of multiple experts, which leads to more reliable predictions from each expert, and (2) the adaptive modality switcher, which significantly improves tracking robustness, especially in multi-modal warranting scenarios.
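As a rough illustration of the switching idea, the sketch below picks, for each frame, the prediction of whichever expert reports the highest confidence. The expert names (`rgb`, `tir`, `fused`), the `confidence` field, and the arg-max rule are all assumptions made for this example; the actual switcher in MoETrack may work differently.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h)


@dataclass
class ExpertPrediction:
    box: Box
    confidence: float  # hypothetical per-expert reliability score in [0, 1]


def switch_expert(
    predictions: Dict[str, ExpertPrediction],
) -> Tuple[str, ExpertPrediction]:
    """Return the name and prediction of the most confident expert."""
    if not predictions:
        raise ValueError("at least one expert prediction is required")
    name = max(predictions, key=lambda key: predictions[key].confidence)
    return name, predictions[name]


if __name__ == "__main__":
    frame_predictions = {
        "rgb": ExpertPrediction((10.0, 12.0, 40.0, 30.0), 0.30),
        "tir": ExpertPrediction((11.0, 13.0, 41.0, 29.0), 0.80),
        "fused": ExpertPrediction((10.5, 12.5, 40.5, 29.5), 0.50),
    }
    chosen, prediction = switch_expert(frame_predictions)
    print(chosen, prediction.box)
```

In this toy frame the fused expert is not the most confident, which mirrors the "when to fuse" question: fusing both modalities is not always the best choice.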
⭐ A more detailed introduction to the dataset will be available here
This method is built on MPLT. Researchers can download our baseline repo and replace its lib directory with the provided lib.zip.
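The replacement step can be scripted. The helper below is a minimal sketch that assumes lib.zip unpacks to a top-level `lib/` folder; the function name, the `lib_backup` folder, and the paths in the usage example are hypothetical. The existing directory is moved aside rather than deleted so the baseline code can be restored.

```python
import shutil
import zipfile
from pathlib import Path


def replace_lib(repo_root: str, lib_zip: str) -> Path:
    """Swap <repo_root>/lib for the contents of lib_zip, keeping a backup."""
    repo = Path(repo_root)
    lib_dir = repo / "lib"
    if lib_dir.exists():
        backup = repo / "lib_backup"  # hypothetical backup location
        if backup.exists():
            shutil.rmtree(backup)
        shutil.move(str(lib_dir), str(backup))
    # Assumption: the archive contains a top-level "lib/" folder.
    with zipfile.ZipFile(lib_zip) as archive:
        archive.extractall(repo)
    return lib_dir


if __name__ == "__main__":
    # Hypothetical paths; adjust to where the baseline repo and lib.zip live.
    print(replace_lib("./MPLT", "./lib.zip"))
```

If the archive instead holds the files directly without a `lib/` prefix, extract into `repo / "lib"` rather than `repo`.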




