
Index Error when using Transfer Learning #24

@EMUNES

This is a great project that makes object detection in fastai much easier. Everything worked well until I tried transfer learning for object detection. My code to build the learner is essentially the same as the cocotiny_retina_net example in the project code:
[image: screenshot of the code that builds the learner]
I build the learner with size, bs = 512, 16 and train it for 8 rounds. After that I set learn.data = data, where the new data uses size, bs = 1024, 4. That is the only change I make for transfer learning, but when I train the model at an image size of 1024 I always get:

IndexError                                Traceback (most recent call last)
<ipython-input-30-d81c6bd29d71> in <module>
----> 1 learn.lr_find()

D:\ProgramFile\anaconda\envs\ai\lib\site-packages\fastai\train.py in lr_find(learn, start_lr, end_lr, num_it, stop_div, wd)
     39     cb = LRFinder(learn, start_lr, end_lr, num_it, stop_div)
     40     epochs = int(np.ceil(num_it/len(learn.data.train_dl))) * (num_distrib() or 1)
---> 41     learn.fit(epochs, start_lr, callbacks=[cb], wd=wd)
     42 
     43 def to_fp16(learn:Learner, loss_scale:float=None, max_noskip:int=1000, dynamic:bool=True, clip:float=None,

D:\ProgramFile\anaconda\envs\ai\lib\site-packages\fastai\basic_train.py in fit(self, epochs, lr, wd, callbacks)
    198         else: self.opt.lr,self.opt.wd = lr,wd
    199         callbacks = [cb(self) for cb in self.callback_fns + listify(defaults.extra_callback_fns)] + listify(callbacks)
--> 200         fit(epochs, self, metrics=self.metrics, callbacks=self.callbacks+callbacks)
    201 
    202     def create_opt(self, lr:Floats, wd:Floats=0.)->None:

D:\ProgramFile\anaconda\envs\ai\lib\site-packages\fastai\basic_train.py in fit(epochs, learn, callbacks, metrics)
     99             for xb,yb in progress_bar(learn.data.train_dl, parent=pbar):
    100                 xb, yb = cb_handler.on_batch_begin(xb, yb)
--> 101                 loss = loss_batch(learn.model, xb, yb, learn.loss_func, learn.opt, cb_handler)
    102                 if cb_handler.on_batch_end(loss): break
    103 

D:\ProgramFile\anaconda\envs\ai\lib\site-packages\fastai\basic_train.py in loss_batch(model, xb, yb, loss_func, opt, cb_handler)
     28 
     29     if not loss_func: return to_detach(out), to_detach(yb[0])
---> 30     loss = loss_func(out, *yb)
     31 
     32     if opt is not None:

D:\ProgramFile\anaconda\envs\ai\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

D:\ProgramFile\anaconda\envs\ai\lib\site-packages\object_detection_fastai\loss\RetinaNetFocalLoss.py in forward(self, output, bbox_tgts, clas_tgts)
     53         focal_loss = torch.tensor(0, dtype=torch.float32).to(clas_preds.device)
     54         for cp, bp, ct, bt in zip(clas_preds, bbox_preds, clas_tgts, bbox_tgts):
---> 55             bb, focal = self._one_loss(cp, bp, ct, bt)
     56 
     57             bb_loss += bb

D:\ProgramFile\anaconda\envs\ai\lib\site-packages\object_detection_fastai\loss\RetinaNetFocalLoss.py in _one_loss(self, clas_pred, bbox_pred, clas_tgt, bbox_tgt)
     32         bbox_mask = matches >= 0
     33         if bbox_mask.sum() != 0:
---> 34             bbox_pred = bbox_pred[bbox_mask]
     35             bbox_tgt = bbox_tgt[matches[bbox_mask]]
     36             bb_loss = self.reg_loss(bbox_pred, bbox_to_activ(bbox_tgt, self.anchors[bbox_mask]))

IndexError: The shape of the mask [24480] at index 0 does not match the shape of the indexed tensor [24192, 4] at index 0

My environment is win10-cuda1.02-pytorch1.5.0-torchvision0.6, and everything else works correctly. Shouldn't the tensor sizes still be consistent within each batch when I switch from size, bs = 512, 16 to size, bs = 1024, 4? How can this IndexError occur?
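
For reference, here is a rough sketch of the arithmetic behind the two numbers in the error message. It is a guess, not a confirmed diagnosis. In the traceback, the mask is derived from `matches` and is also used to index `self.anchors`, so its length (24480) is presumably the number of anchors held by the loss, while 24192 is the number of box predictions the model produced. The helper `count_anchors`, the grid sizes, and the value of 18 anchors per cell are hypothetical: they were chosen only because they reproduce both numbers, and they are not taken from my actual configuration.

```python
def count_anchors(grid_sizes, anchors_per_cell):
    """Total anchors = sum of (rows * cols) over all feature-map grids,
    multiplied by the number of anchors placed at each grid cell."""
    return sum(rows * cols for rows, cols in grid_sizes) * anchors_per_cell


# ASSUMPTION: illustrative values only, picked to match the error message.
ANCHORS_PER_CELL = 18
loss_grids = [(32, 32), (16, 16), (8, 8), (4, 4)]   # grids the anchors might be built for
model_grids = [(32, 32), (16, 16), (8, 8)]          # grids the model output might cover

n_anchors = count_anchors(loss_grids, ANCHORS_PER_CELL)
n_predictions = count_anchors(model_grids, ANCHORS_PER_CELL)

print(n_anchors)                   # 24480, the mask length in the error
print(n_predictions)               # 24192, the rows of the indexed tensor
print(n_anchors - n_predictions)   # 288, i.e. one 4x4 grid times 18 anchors
```

If this guess is right, the anchors and the model output disagree about which feature-map grids exist, so a boolean mask built from the anchors cannot index the predictions. Comparing the shapes of the model's outputs with the anchor tensor for one batch would confirm or rule this out.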
