Description
Hi @Gengzigang,
Have you tried using different backbones with PCT?
I switched the backbone to HRNet, which produces features of shape (batch_size, 72, 96, 48). This differs from the original SwinV2 backbone, which outputs features of shape (batch_size, 8, 8, 1024).
However, I noticed that the class head (link: pct_head.py#L175) only modifies the feature channels.
So, in my HRNet version of PCT (link: pct_base_classifier.py#L101), I adjusted the parameters to scale the input size from 2 to 72 * 96 * 2. This roughly matches the parameter count of the Swin backbone, which scales to 8 * 8 * 256.
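For reference, here is the arithmetic behind that comparison as a quick sketch. The variable names are only illustrative and do not come from the repository; the shapes and multipliers are the ones quoted above.

```python
# Feature-map shapes as quoted above (batch dimension omitted).
hrnet_h, hrnet_w, hrnet_c = 72, 96, 48    # HRNet output
swin_h, swin_w, swin_c = 8, 8, 1024       # SwinV2 output

# Number of spatial positions in each feature map.
hrnet_positions = hrnet_h * hrnet_w       # 72 * 96
swin_positions = swin_h * swin_w          # 8 * 8

# Sizes after the scaling described above (illustrative names).
hrnet_scaled = hrnet_positions * 2        # 72 * 96 * 2
swin_scaled = swin_positions * 256        # 8 * 8 * 256

print("HRNet scaled size:", hrnet_scaled)
print("Swin scaled size: ", swin_scaled)
print("HRNet / Swin scaled ratio:", hrnet_scaled / swin_scaled)
print("HRNet / Swin spatial positions:", hrnet_positions / swin_positions)
```

The scaled sizes come out close (13824 vs. 16384), but the HRNet feature map has 108 times as many spatial positions as the Swin one.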
Despite this, my FPS is still lower than with heatmap-based methods.
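In case it helps to compare numbers, below is a minimal sketch of how such an FPS figure could be measured. `measure_fps` and `dummy_forward` are hypothetical names, not part of the PCT code, and this is not necessarily how the numbers above were obtained. It assumes the callable runs one full forward pass and blocks until it finishes; for a GPU model, the device would need to be synchronized inside the callable for the timing to be meaningful.

```python
import time


def measure_fps(forward_fn, warmup=5, iters=50):
    """Return calls per second for `forward_fn` (hypothetical helper).

    `forward_fn` is assumed to run one full inference step and to
    return only once the work is actually finished.
    """
    # Warm-up calls are excluded from timing (caches, lazy init, etc.).
    for _ in range(warmup):
        forward_fn()

    start = time.perf_counter()
    for _ in range(iters):
        forward_fn()
    elapsed = time.perf_counter() - start

    # Guard against a zero elapsed time for extremely cheap callables.
    return iters / max(elapsed, 1e-12)


def dummy_forward():
    # Stand-in for a model forward pass, only here to make the sketch run.
    return sum(i * i for i in range(10_000))


print(f"FPS: {measure_fps(dummy_forward):.1f}")
```

Using the same warm-up and iteration counts for both the PCT and the heatmap-based model keeps the comparison like for like.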
Could you share your experience with this? I'd really appreciate your insights!
Thanks a lot!
