[Bug] Include last attention layer in feature output by dillonalaird · Pull Request #4 · apple/ml-fastvit

dillonalaird · 2023-11-30T19:21:59Z

The last out index should be 7 instead of 6, at least for the SA12 architecture. On SA12, returning index 6 as the final feature skips the last attention layer, while index 7 includes it. The timm implementation does include the final attention layer in its output. I have trained both models for segmentation on ADE20k using mmsegmentation with this configuration:

model = dict(
    type='EncoderDecoder',
    data_preprocessor=data_preprocessor,
    backbone=dict(
        type='FastViTSA12',
        pretrained=True,
    ),

    neck=dict(
        type='FPN',
        in_channels=[64, 128, 256, 512],
        out_channels=256,
        num_outs=4,
    ),

    decode_head=dict(
        type='FPNHead',
        in_channels=[256, 256, 256, 256],
        in_index=[0, 1, 2, 3],
        feature_strides=[4, 8, 16, 32],
        channels=128,
        dropout_ratio=0.1,
        num_classes=1,
        norm_cfg=norm_cfg,
        align_corners=False,
        loss_decode=dict(
            type='CrossEntropyLoss',
            use_sigmoid=False,
            loss_weight=1.0,
        ),
    ),
)

Some of the differences are:

Model	Parameters	ADE20k val mIoU
Apple FastViT SA12 FPN	8.3M	30
Timm FastViT SA12 FPN	14.6M	39

Using the final attention layer, the performance numbers and model size line up much more closely with the paper's reported numbers.
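To make the index issue concrete, here is a minimal sketch of how a list of out indices picks features from a flat module list. The layout below is an assumption for illustration only: the labels are hypothetical, not identifiers from this repository, and the sketch assumes SA12 has one extra module (labeled `pos_embed` here) sitting ahead of the final stage, which would shift that stage from index 6 to index 7.

```python
# Hypothetical flat module list for an SA12-like backbone.
# Labels are illustrative; they are not names from the repository.
network = [
    "stage0",       # index 0
    "downsample0",  # index 1
    "stage1",       # index 2
    "downsample1",  # index 3
    "stage2",       # index 4
    "downsample2",  # index 5
    "pos_embed",    # index 6: assumed extra module ahead of the last stage
    "stage3_attn",  # index 7: assumed final stage with the attention blocks
]


def collect_features(modules, out_indices):
    """Return the labels of the modules whose outputs would be collected."""
    wanted = set(out_indices)
    return [name for idx, name in enumerate(modules) if idx in wanted]


old_features = collect_features(network, [0, 2, 4, 6])
new_features = collect_features(network, [0, 2, 4, 7])

print(old_features)
print(new_features)
```

Under this assumed layout, the first call ends at the module before the attention stage, while the second ends at the attention stage itself, which is the behavior this PR is after.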

include last attention layer in feature output

ce7d33a

dillonalaird wants to merge 1 commit into apple:main from dillonalaird:main
