forward_sliding behavior is confusing #15

@EasternJournalist

The output shape of forward_sliding varies depending on the number of input frames T.

  • When T > 2, it performs normal tracking and returns a flow tensor of shape (B, T, 2, H, W), i.e. one flow map per input frame.
  • When T == 2, it performs optical flow inference and returns a single flow map of shape (B, 2, H, W).

I assume this design is intended to optimize pair-wise optical flow inference. However, the behavior is confusing and inconsistent from a video-tracking perspective: callers have to special-case sequences that contain only two frames, because the time axis disappears from the output.

It would be clearer to separate the two purposes into distinct functions, such as forward_sliding and forward_pair. The forward_sliding function should consistently return the same number of frames as the input sequence.
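
For illustration, here is a minimal sketch of the special case that callers currently have to write around forward_sliding. The helper name ensure_time_axis is hypothetical and not part of the library; it only relies on the two output shapes described above and works on any array type that exposes .ndim and NumPy-style indexing (torch.Tensor, numpy.ndarray). Note that it only restores the missing time axis, giving (B, 1, 2, H, W); whether a flow map for the reference frame should also be added to reach T == 2 depends on the model's convention, which I am not assuming here.

```python
def ensure_time_axis(flow):
    """Return `flow` with an explicit time axis.

    Hypothetical helper, not part of the library. Accepts either shape
    reported for forward_sliding:
      (B, T, 2, H, W)  tracking output, returned unchanged
      (B, 2, H, W)     pair-wise output, becomes (B, 1, 2, H, W)
    """
    if flow.ndim == 5:
        # Already (B, T, 2, H, W); nothing to do.
        return flow
    if flow.ndim == 4:
        # Pair-wise case: insert a singleton time axis after the batch axis.
        return flow[:, None]
    raise ValueError(f"expected a 4-D or 5-D flow, got {flow.ndim}-D")
```

With separate forward_sliding and forward_pair functions, this kind of shape check would no longer be needed on the caller's side.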
