Issue: Mask shaping issue during data loading

I am currently working on my final-year project, which explores machine learning applied to the notation assembly process, and I have run into an issue when running the `train.py` file from your project. The data download did not work with the default script, and I had to modify the `requirements.txt` file to include all the needed dependencies, but I did then get it to run. However, 13% of the way through loading the training data, I now get this error:

```
> python munglinker/train.py
BaseConvnet(
  (cnn): Sequential(
    (conv0): Conv2d(3, 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (batchnorm0): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu0): ReLU(inplace=True)
    (pooling0): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (conv1): Conv2d(8, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (batchnorm1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu1): ReLU(inplace=True)
    (pooling1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (conv2): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (batchnorm2): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu2): ReLU(inplace=True)
    (pooling2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (conv3): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (batchnorm3): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu3): ReLU(inplace=True)
    (pooling3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (conv4): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (batchnorm4): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu4): ReLU(inplace=True)
    (pooling4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (fully_connected): Linear(in_features=4096, out_features=1, bias=True)
  (output_activation): Sigmoid()
)
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1          [-1, 8, 256, 512]             224
       BatchNorm2d-2          [-1, 8, 256, 512]              16
              ReLU-3          [-1, 8, 256, 512]               0
         MaxPool2d-4          [-1, 8, 128, 256]               0
            Conv2d-5         [-1, 16, 128, 256]           1,168
       BatchNorm2d-6         [-1, 16, 128, 256]              32
              ReLU-7         [-1, 16, 128, 256]               0
         MaxPool2d-8          [-1, 16, 64, 128]               0
            Conv2d-9          [-1, 32, 64, 128]           4,640
      BatchNorm2d-10          [-1, 32, 64, 128]              64
             ReLU-11          [-1, 32, 64, 128]               0
        MaxPool2d-12           [-1, 32, 32, 64]               0
           Conv2d-13           [-1, 32, 32, 64]           9,248
      BatchNorm2d-14           [-1, 32, 32, 64]              64
             ReLU-15           [-1, 32, 32, 64]               0
        MaxPool2d-16           [-1, 32, 16, 32]               0
           Conv2d-17           [-1, 32, 16, 32]           9,248
      BatchNorm2d-18           [-1, 32, 16, 32]              64
             ReLU-19           [-1, 32, 16, 32]               0
        MaxPool2d-20            [-1, 32, 8, 16]               0
           Linear-21                    [-1, 1]           4,097
          Sigmoid-22                    [-1, 1]               0
================================================================
Total params: 28,865
Trainable params: 28,865
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 1.50
Forward/backward pass size (MB): 47.53
Params size (MB): 0.11
Estimated Total Size (MB): 49.14
----------------------------------------------------------------
Training Strategy E2EOMR_mob_split_muscima_bboxes_2020-11-23-05-33-08:
  Training for 100 epochs with a batch-size of 100.
  Optimizing with <class 'torch.optim.adam.Adam'>, starting at a Learning rate of 0.001
  Loss is computed by BCELoss()
  Early stopping after 10 epochs without improvement.
  Checkpointing every 1 epochs into models/default_model.tsd.ckpt.
  Saving best model into models/default_model.tsd.
  Validating by best validation loss.
  After early stopping, refining for max 10 epochs with patience of 2 epochs without improvement and a learning rate reduction factor of 0.2.

/Users/jonasschafer/CS/Uni/Year3/Final Project/MuNG-generation/munglinker/data_pool.py:346: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  split = yaml.load(hdl)
/Users/jonasschafer/CS/Uni/Year3/Final Project/MuNG-generation/munglinker/data_pool.py:352: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  config = yaml.load(hdl)
Loading training data...
Loading mung/image pairs from disk:  13%|█████▉                                       | 11/84 [00:14<01:38,  1.35s/it]
Traceback (most recent call last):
  File "munglinker/train.py", line 171, in <module>
    main(args)
  File "munglinker/train.py", line 137, in main
    load_test_data=False
  File "/Users/jonasschafer/CS/Uni/Year3/Final Project/MuNG-generation/munglinker/data_pool.py", line 495, in load_munglinker_data
    masks_to_bounding_boxes=train_on_bounding_boxes)
  File "/Users/jonasschafer/CS/Uni/Year3/Final Project/MuNG-generation/munglinker/data_pool.py", line 430, in __load_munglinker_data
    mungo.set_mask(image_mask)
  File "/Users/jonasschafer/CS/Uni/Year3/Final Project/MuNG-generation/venv/lib/python3.7/site-packages/muscima/cropobject.py", line 416, in set_mask
    ''.format(mask.shape, (b - t, r - l)))
ValueError: Mask shape (0, 28) does not correspond to integer shape (21, 28) of CropObject.
```
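
While looking into this, I noticed that NumPy slicing past the end of an array returns an empty result instead of raising, which is one way a mask of shape `(0, 28)` could appear where `(21, 28)` is expected. The sketch below only illustrates that behaviour. It is an assumption about the cause, not a confirmed diagnosis: `crop_mask`, `shape_matches_bbox`, and the image size are hypothetical and not taken from `data_pool.py` or `muscima`.

```python
import numpy as np


def crop_mask(image, top, left, bottom, right):
    """Hypothetical helper: cut a bounding box out of a 2D image array."""
    return image[top:bottom, left:right]


def shape_matches_bbox(mask, top, left, bottom, right):
    """Check that a mask has exactly the height and width of its bounding box."""
    return mask.shape == (bottom - top, right - left)


# Assumed image size, for illustration only.
image = np.zeros((100, 200), dtype=np.uint8)

# Bounding box fully inside the image: the crop has the full 21 x 28 shape.
inside = crop_mask(image, 10, 20, 31, 48)

# Bounding box starting at the bottom edge (top == image height):
# the row slice is empty, so the crop has zero rows but still 28 columns.
outside = crop_mask(image, 100, 20, 121, 48)

print(inside.shape, shape_matches_bbox(inside, 10, 20, 31, 48))
print(outside.shape, shape_matches_bbox(outside, 100, 20, 121, 48))
```

If this is what happens, a check like `shape_matches_bbox` just before `mungo.set_mask(image_mask)` might help identify which object and which image are affected.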

Here is my `requirements.txt`, in case the issue originates there:
```
torch # Renamed from pytorch
numpy>=1.13.3
muscima
omrdatasettools # prepare_dataset.py doesn't work using this
tqdm
tensorboardX
torchsummary
midiutil
pyyaml # Added
sklearn # Added
```

Do you have any idea where the issue could originate, or whether there is a simple fix?

Thank you so much.
