Hello community,
I've been trying to fine-tune a Mask2Former model for a binary segmentation task. However, every time I start training, the model collapses to predicting only null tensors (all-zero masks) after a few epochs.
I first thought I was simply getting unlucky and that RandomCrop was feeding the model mostly empty crops during training. However, even after implementing code that guarantees my object is always present in the crop, the model still outputs null tensors after a few epochs.
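For context, the crop guarantee works roughly like this (a simplified, plain-Python sketch standing in for my actual tensor code; the function name and fallback strategy are just illustrative): retry random crop windows until the binary mask contains at least one foreground pixel, and otherwise center the crop on a known foreground pixel.

```python
import random

def object_aware_crop(mask, crop_h, crop_w, max_tries=50):
    """Pick a (top, left) crop origin whose window contains foreground.

    `mask` is assumed to be an H x W list of lists of 0/1 ints.
    Retries random windows, then falls back to centering on a
    foreground pixel so the guarantee always holds.
    """
    h, w = len(mask), len(mask[0])
    for _ in range(max_tries):
        top = random.randint(0, h - crop_h)
        left = random.randint(0, w - crop_w)
        window = [row[left:left + crop_w] for row in mask[top:top + crop_h]]
        if any(any(row) for row in window):
            return top, left
    # Fallback: center the crop on a random foreground pixel,
    # clamped so the window stays inside the image.
    fg = [(y, x) for y in range(h) for x in range(w) if mask[y][x]]
    y, x = random.choice(fg)
    top = min(max(y - crop_h // 2, 0), h - crop_h)
    left = min(max(x - crop_w // 2, 0), w - crop_w)
    return top, left
```

So at least in principle, every training crop should contain the object.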
I also added low-rank adaptation (LoRA) to mitigate catastrophic forgetting, but that didn't help either.
At this point I'm convinced it has something to do with how I'm initializing my image processor, my model, or my LoraConfig, so I'd appreciate some help here.
This is the initialization of the image_processor:
from transformers import Mask2FormerImageProcessor

IMAGE_PROCESSOR = Mask2FormerImageProcessor.from_pretrained(
    "facebook/mask2former-swin-base-IN21k-ade-semantic",
    do_rescale=False,
    do_normalize=True,
    do_resize=False,
    num_labels=2,
    ignore_index=0,
)
This is my model initialization:
from transformers import Mask2FormerForUniversalSegmentation

MODEL = Mask2FormerForUniversalSegmentation.from_pretrained(
    "facebook/mask2former-swin-base-IN21k-ade-semantic",
    num_labels=2,
    ignore_mismatched_sizes=True,
)
And lastly, this is the initialization of the LoraConfig and everything related to it:
import torch
from peft import LoraConfig, get_peft_model

target_modules = ["q_proj", "k_proj", "v_proj", "out_proj", "class_predictor"]
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=target_modules,
    lora_dropout=0.3,
    bias="lora_only",
    init_lora_weights="pissa",
    use_rslora=True,
    modules_to_save=["decode_head"],
)
LORA_MODEL = get_peft_model(MODEL, config)
OPTIMIZER = torch.optim.AdamW(LORA_MODEL.parameters(), lr=1e-5)
Any insight would be appreciated since I am at my wit's end.