I’m encountering an embedding dimension mismatch when using UltraSAM.pth in my custom pipeline, even though the official example runs successfully.
Error: AssertionError: was expecting embedding dimension of 128, but got 256.
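The assertion appears to come from PyTorch's multi-head attention. As a sanity check on my reading of it (this is not my actual pipeline, and the shapes are made up), a minimal sketch like the following reproduces the same error when an attention layer built for 128-dim tokens receives 256-dim features:

import torch
import torch.nn as nn

# Attention layer sized for 128-dim tokens, which is what the checkpoint seems to use (see below).
attn = nn.MultiheadAttention(embed_dim=128, num_heads=8, batch_first=True)

# Feed it 256-dim features instead (e.g. from a backbone/neck configured for 256 channels).
features = torch.randn(1, 10, 256)
attn(features, features, features)
# -> AssertionError: was expecting embedding dimension of 128, but got 256

So my guess is that some part of my pipeline still produces 256-dim embeddings while the UltraSAM weights use 128.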
I verified the prompt-encoder embedding size stored in the checkpoint with:
python -c "import torch; print([v.shape for k,v in torch.load('UltraSAM.pth', map_location='cpu')['state_dict'].items() if 'prompt_encoder' in k and len(v.shape)==2][0])"
Output:
torch.Size([2, 128])
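To see which modules use 128 and whether anything in the checkpoint actually expects 256, I also dumped the parameter shapes grouped by top-level key prefix (this only assumes the same 'state_dict' layout as above; the prefixes are just whatever the checkpoint happens to use):

import collections
import torch

ckpt = torch.load("UltraSAM.pth", map_location="cpu")
state_dict = ckpt["state_dict"]

# Group parameter shapes by the first component of the key,
# e.g. backbone / prompt_encoder / mask_decoder (names depend on the checkpoint).
dims = collections.defaultdict(set)
for name, tensor in state_dict.items():
    prefix = name.split(".")[0]
    dims[prefix].add(tuple(tensor.shape))

for prefix, shapes in sorted(dims.items()):
    print(prefix, sorted(shapes)[:5])  # print a few representative shapes per module

This should show whether any module in the checkpoint is actually 256-wide, or whether the 256 comes entirely from my own pipeline.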
What is the correct way to configure the model backbone and decoder so that UltraSAM.pth can be used for segmentation in a custom pipeline?
Is any specific modification required in the custom pipeline or data preprocessor for UltraSAM to generate segmentation masks?
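For reference, this is roughly what my data preprocessor does at the moment. It is copied from the vanilla SAM setup (resize the longest side to 1024, ImageNet mean/std, pad to a 1024x1024 square), so if UltraSAM expects different values here, that is exactly the part I am unsure about:

import numpy as np
import torch
import torch.nn.functional as F
from segment_anything.utils.transforms import ResizeLongestSide

transform = ResizeLongestSide(target_length=1024)

def preprocess(image: np.ndarray) -> torch.Tensor:
    """image: HxWx3 uint8 RGB array."""
    resized = transform.apply_image(image)                       # longest side -> 1024
    x = torch.as_tensor(resized, dtype=torch.float32).permute(2, 0, 1)
    mean = torch.tensor([123.675, 116.28, 103.53]).view(-1, 1, 1)
    std = torch.tensor([58.395, 57.12, 57.375]).view(-1, 1, 1)
    x = (x - mean) / std                                         # ImageNet-style normalization
    h, w = x.shape[-2:]
    x = F.pad(x, (0, 1024 - w, 0, 1024 - h))                     # pad bottom/right to 1024x1024
    return x.unsqueeze(0)                                        # add batch dimension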