Issue: Mask shaping issue during data loading #4

@jonas4climate

Description

I am currently working on my final-year project, which further explores machine learning applied to the notation assembly process, and I have encountered an issue when attempting to run train.py from your project. Downloading the data did not work with the default script, and I had to modify requirements.txt to include all the needed dependencies, but I did then get it to run. However, 13% of the way into loading the training data, I now get this error:

> python munglinker/train.py
BaseConvnet(
  (cnn): Sequential(
    (conv0): Conv2d(3, 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (batchnorm0): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu0): ReLU(inplace=True)
    (pooling0): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (conv1): Conv2d(8, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (batchnorm1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu1): ReLU(inplace=True)
    (pooling1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (conv2): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (batchnorm2): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu2): ReLU(inplace=True)
    (pooling2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (conv3): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (batchnorm3): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu3): ReLU(inplace=True)
    (pooling3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (conv4): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (batchnorm4): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu4): ReLU(inplace=True)
    (pooling4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (fully_connected): Linear(in_features=4096, out_features=1, bias=True)
  (output_activation): Sigmoid()
)
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1          [-1, 8, 256, 512]             224
       BatchNorm2d-2          [-1, 8, 256, 512]              16
              ReLU-3          [-1, 8, 256, 512]               0
         MaxPool2d-4          [-1, 8, 128, 256]               0
            Conv2d-5         [-1, 16, 128, 256]           1,168
       BatchNorm2d-6         [-1, 16, 128, 256]              32
              ReLU-7         [-1, 16, 128, 256]               0
         MaxPool2d-8          [-1, 16, 64, 128]               0
            Conv2d-9          [-1, 32, 64, 128]           4,640
      BatchNorm2d-10          [-1, 32, 64, 128]              64
             ReLU-11          [-1, 32, 64, 128]               0
        MaxPool2d-12           [-1, 32, 32, 64]               0
           Conv2d-13           [-1, 32, 32, 64]           9,248
      BatchNorm2d-14           [-1, 32, 32, 64]              64
             ReLU-15           [-1, 32, 32, 64]               0
        MaxPool2d-16           [-1, 32, 16, 32]               0
           Conv2d-17           [-1, 32, 16, 32]           9,248
      BatchNorm2d-18           [-1, 32, 16, 32]              64
             ReLU-19           [-1, 32, 16, 32]               0
        MaxPool2d-20            [-1, 32, 8, 16]               0
           Linear-21                    [-1, 1]           4,097
          Sigmoid-22                    [-1, 1]               0
================================================================
Total params: 28,865
Trainable params: 28,865
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 1.50
Forward/backward pass size (MB): 47.53
Params size (MB): 0.11
Estimated Total Size (MB): 49.14
----------------------------------------------------------------
Training Strategy E2EOMR_mob_split_muscima_bboxes_2020-11-23-05-33-08:
  Training for 100 epochs with a batch-size of 100.
  Optimizing with <class 'torch.optim.adam.Adam'>, starting at a Learning rate of 0.001
  Loss is computed by BCELoss()
  Early stopping after 10 epochs without improvement.
  Checkpointing every 1 epochs into models/default_model.tsd.ckpt.
  Saving best model into models/default_model.tsd.
  Validating by best validation loss.
  After early stopping, refining for max 10 epochs with patience of 2 epochs without improvement and a learning rate reduction factor of 0.2.

/Users/jonasschafer/CS/Uni/Year3/Final Project/MuNG-generation/munglinker/data_pool.py:346: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  split = yaml.load(hdl)
/Users/jonasschafer/CS/Uni/Year3/Final Project/MuNG-generation/munglinker/data_pool.py:352: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  config = yaml.load(hdl)
Loading training data...
Loading mung/image pairs from disk:  13%|█████▉                                       | 11/84 [00:14<01:38,  1.35s/it]
Traceback (most recent call last):
  File "munglinker/train.py", line 171, in <module>
    main(args)
  File "munglinker/train.py", line 137, in main
    load_test_data=False
  File "/Users/jonasschafer/CS/Uni/Year3/Final Project/MuNG-generation/munglinker/data_pool.py", line 495, in load_munglinker_data
    masks_to_bounding_boxes=train_on_bounding_boxes)
  File "/Users/jonasschafer/CS/Uni/Year3/Final Project/MuNG-generation/munglinker/data_pool.py", line 430, in __load_munglinker_data
    mungo.set_mask(image_mask)
  File "/Users/jonasschafer/CS/Uni/Year3/Final Project/MuNG-generation/venv/lib/python3.7/site-packages/muscima/cropobject.py", line 416, in set_mask
    ''.format(mask.shape, (b - t, r - l)))
ValueError: Mask shape (0, 28) does not correspond to integer shape (21, 28) of CropObject.
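
For context on the error above: the message compares the mask's shape with (b - t, r - l), the height and width implied by the object's bounding box. One way a mask can end up with zero rows is when it is cut out of an image by slicing with a bounding box that lies partly or fully outside the image, because Python and NumPy slices are silently clamped to the array bounds. Whether data_pool.py actually builds its masks this way is an assumption, and the image size, bounding box, and helper names below are hypothetical. This is a minimal sketch of the mechanism, not a confirmed diagnosis:

```python
def sliced_shape(image_shape, top, left, bottom, right):
    """Shape that image[top:bottom, left:right] would have.

    Uses range slicing, which clamps out-of-bounds indices the same way
    basic list/NumPy slicing does.
    """
    height, width = image_shape
    rows = len(range(height)[top:bottom])
    cols = len(range(width)[left:right])
    return rows, cols


def bbox_shape(top, left, bottom, right):
    """Shape implied by the bounding box itself: (b - t, r - l)."""
    return bottom - top, right - left


# Hypothetical values: a 100 x 200 image and a bounding box whose rows
# (120..141) start below the bottom edge of the image.
image_shape = (100, 200)
bbox = (120, 10, 141, 38)  # top, left, bottom, right

actual = sliced_shape(image_shape, *bbox)
expected = bbox_shape(*bbox)
print(actual, expected)  # (0, 28) (21, 28)
```

If this is what happens, comparing each object's bounding box with the dimensions of the image it is paired with, for the document being loaded when the progress bar stops, should show which annotation and image do not match.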

Here is my requirements.txt, in case the problem originates there:

torch # Renamed from pytorch
numpy>=1.13.3
muscima
omrdatasettools # prepare_dataset.py doesn't work using this
tqdm
tensorboardX
torchsummary
midiutil
pyyaml # Added
sklearn # Added
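
On the YAMLLoadWarning lines in the log: they are deprecation warnings from PyYAML about calling yaml.load() without a Loader argument, and they appear before loading starts, so they are probably unrelated to the ValueError. A minimal sketch of the usual way to avoid the warning follows. The in-memory file and its contents are hypothetical stand-ins for the split file that data_pool.py opens:

```python
import io

import yaml

# Hypothetical stand-in for the file handle opened in data_pool.py.
hdl = io.StringIO("train:\n  - doc_a\n  - doc_b\nvalid:\n  - doc_c\n")

# safe_load is shorthand for yaml.load(hdl, Loader=yaml.SafeLoader);
# it parses plain YAML data and does not trigger YAMLLoadWarning.
split = yaml.safe_load(hdl)
print(split)
```

This only works if the split and config files contain plain YAML data (mappings, lists, strings, numbers), since safe_load rejects Python-specific tags.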

Do you have any idea where the issue could originate, or whether there is a simple fix?

Thank you so much.

Labels: bug
