Issue with evaluation when using a custom dataset

Hi there, and thank you so much for putting this repository together. The implementation is very interesting!

I'm working on using this with a custom dataset in COCO instances format rather than the original COCO 2017 instances dataset. I did an initial test run with the original COCO dataset and saw the validation segm AP gradually increase as expected, in as little as 500 iterations with a batch size of 2 for a quick test:

`python3 -W ignore train_net.py --config-file ./configs/coco/instance-segmentation/deit/maskformer2_deit_base_bs16_50ep.yaml --num-gpus 2 --num-machines 1 SSL.PERCENTAGE 100 SSL.TRAIN_SSL False OUTPUT_DIR ./output-teacher`

My problems arise when I begin integrating my custom dataset. I am able to successfully register my training and test sets using `register_coco_instances` from `data.datasets` > `coco.py`. I then update the configuration accordingly:

```
cfg.DATASETS.TRAIN = ("custom_train",)
cfg.DATASETS.TEST = ("custom_test",)
```

Inside the `coco_unlabel` folder, I create a symlink named `images` pointing to my training images folder and a symlink named `val2017` pointing to my validation set, as per the instructions. I point `DETECTRON2_DATASETS` to the directory that contains `coco_unlabel`, and it appears to be picked up.
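For reference, this is a minimal sketch of how the layout described above could be double-checked. It assumes only the folder names mentioned here (`coco_unlabel`, `images`, `val2017`); the helper name `check_layout` is hypothetical, and the fallback root `datasets` is the default detectron2 uses when `DETECTRON2_DATASETS` is unset.

```python
import os
from pathlib import Path


def check_layout(root, names=("images", "val2017")):
    """Return {name: (resolved_path, is_directory, file_count)} for each entry
    expected inside <root>/coco_unlabel."""
    report = {}
    base = Path(root) / "coco_unlabel"
    for name in names:
        # resolve() follows symlinks; a dangling link resolves to a path
        # that does not exist, so is_dir() is False for it.
        target = (base / name).resolve()
        is_dir = target.is_dir()
        count = sum(1 for p in target.iterdir() if p.is_file()) if is_dir else 0
        report[name] = (str(target), is_dir, count)
    return report


if __name__ == "__main__":
    root = os.environ.get("DETECTRON2_DATASETS", "datasets")
    for name, (target, ok, count) in check_layout(root).items():
        print(f"{name}: -> {target} | directory: {ok} | files: {count}")
```

A dangling symlink or an empty target directory would show up here as `directory: False` or `files: 0`.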

Up to here, everything works fine. The training job starts using:

`python3 -W ignore train_net.py --config-file ./configs/coco/instance-segmentation/deit/maskformer2_deit_base_bs16_50ep.yaml --num-gpus 2 --num-machines 1 SSL.PERCENTAGE 100 SSL.TRAIN_SSL False OUTPUT_DIR ./output-teacher`

When the training job attempts the first evaluation step (set to 500 for testing), it fails with an error saying my test set is not registered, even though the training set was picked up:

```
[03/02 22:16:41 d2.utils.events]:  eta: 2 days, 13:54:06  iter: 499  total_loss: 50.87  loss_ce: 0.1988  loss_mask: 1.255  loss_dice: 3.667  loss_ce_0: 1.005  loss_mask_0: 0.8838  loss_dice_0: 3.57  loss_ce_1: 0.1726  loss_mask_1: 1.169  loss_dice_1: 3.563  loss_ce_2: 0.1709  loss_mask_2: 1.215  loss_dice_2: 3.544  loss_ce_3: 0.1839  loss_mask_3: 1.165  loss_dice_3: 3.657  loss_ce_4: 0.1798  loss_mask_4: 1.212  loss_dice_4: 3.613  loss_ce_5: 0.2062  loss_mask_5: 1.233  loss_dice_5: 3.729  loss_ce_6: 0.2123  loss_mask_6: 1.267  loss_dice_6: 3.744  loss_ce_7: 0.2188  loss_mask_7: 1.259  loss_dice_7: 3.683  loss_ce_8: 0.1927  loss_mask_8: 1.263  loss_dice_8: 3.703    time: 0.6120  last_time: 0.6109  data_time: 0.0064  last_data_time: 0.0057   lr: 0.0001  max_mem: 10689M
Traceback (most recent call last):
  File "/home/b/.local/lib/python3.10/site-packages/detectron2/data/catalog.py", line 51, in get
    f = self[name]
  File "/usr/lib/python3.10/collections/__init__.py", line 1106, in __getitem__
    raise KeyError(key)
KeyError: 'custom_test'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/b/GuidedDistillation/train_net.py", line 470, in <module>
    launch(
  File "/home/b/.local/lib/python3.10/site-packages/detectron2/engine/launch.py", line 84, in launch
    main_func(*args)
  File "/home/b/GuidedDistillation/train_net.py", line 464, in main
    return trainer.train()
  File "/home/b/GuidedDistillation/modules/defaults.py", line 566, in train
    super().train(self.start_iter, self.max_iter)
  File "/home/b/GuidedDistillation/modules/train_loop.py", line 165, in train
    self.after_step()
  File "/home/b/GuidedDistillation/modules/train_loop.py", line 199, in after_step
    h.after_step()
  File "/home/b/.local/lib/python3.10/site-packages/detectron2/engine/hooks.py", line 556, in after_step
    self._do_eval()
  File "/home/b/.local/lib/python3.10/site-packages/detectron2/engine/hooks.py", line 529, in _do_eval
    results = self._func()
  File "/home/b/GuidedDistillation/modules/defaults.py", line 525, in test_and_save_results
    self._last_eval_results = self.test(self.cfg, self.model)
  File "/home/b/GuidedDistillation/modules/defaults.py", line 691, in test
    evaluator = cls.build_evaluator(cfg, dataset_name)
  File "/home/b/GuidedDistillation/train_net.py", line 115, in build_evaluator
    evaluator_list.append(COCOEvaluator(dataset_name, output_dir=output_folder))
  File "/home/b/.local/lib/python3.10/site-packages/detectron2/evaluation/coco_evaluation.py", line 142, in __init__
    convert_to_coco_json(dataset_name, cache_path, allow_cached=allow_cached_coco)
  File "/home/b/.local/lib/python3.10/site-packages/detectron2/data/datasets/coco.py", line 511, in convert_to_coco_json
    coco_dict = convert_to_coco_dict(dataset_name)
  File "/home/b/.local/lib/python3.10/site-packages/detectron2/data/datasets/coco.py", line 354, in convert_to_coco_dict
    dataset_dicts = DatasetCatalog.get(dataset_name)
  File "/home/b/.local/lib/python3.10/site-packages/detectron2/data/catalog.py", line 53, in get
    raise KeyError(
KeyError: "Dataset 'custom_test' is not registered!
```
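The traceback shows the lookup failing in detectron2's own `DatasetCatalog`. To narrow this down, a small check could be run just before `trainer.train()` to list which configured dataset names that catalog does not know about. This is only a sketch: `missing_datasets` and `report_from_detectron2` are hypothetical helper names, while `DatasetCatalog.list()` is the detectron2 call that returns the registered names.

```python
def missing_datasets(required, registered):
    """Return the names in `required` that are absent from `registered`,
    preserving the order of `required`."""
    known = set(registered)
    return [name for name in required if name not in known]


def report_from_detectron2(cfg):
    """Compare cfg.DATASETS.TRAIN/TEST against detectron2's DatasetCatalog.

    detectron2 is imported lazily so the helper above stays usable
    without it installed.
    """
    from detectron2.data import DatasetCatalog

    required = list(cfg.DATASETS.TRAIN) + list(cfg.DATASETS.TEST)
    return missing_datasets(required, DatasetCatalog.list())
```

If `report_from_detectron2(cfg)` returns `["custom_test"]`, the test set was registered somewhere other than the catalog the evaluator reads from. That would be consistent with the `KeyError` above, though I have not confirmed that this is the cause.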

If I register the test set with `detectron2.data.datasets` instead of `data.datasets`, the evaluation works, but the AP values are always 0 no matter how long the job runs:

```
[03/02 22:23:40 d2.utils.events]:  eta: 2 days, 11:02:34  iter: 479  total_loss: 51.48  loss_ce: 0.2197  loss_mask: 0.9924  loss_dice: 3.769  loss_ce_0: 1.136  loss_mask_0: 0.8419  loss_dice_0: 3.583  loss_ce_1: 0.2115  loss_mask_1: 1.055  loss_dice_1: 3.583  loss_ce_2: 0.1991  loss_mask_2: 1.087  loss_dice_2: 3.628  loss_ce_3: 0.2439  loss_mask_3: 1.014  loss_dice_3: 3.63  loss_ce_4: 0.2733  loss_mask_4: 0.9731  loss_dice_4: 3.611  loss_ce_5: 0.2954  loss_mask_5: 0.9499  loss_dice_5: 3.639  loss_ce_6: 0.2749  loss_mask_6: 1.042  loss_dice_6: 3.646  loss_ce_7: 0.2482  loss_mask_7: 0.9416  loss_dice_7: 3.682  loss_ce_8: 0.2521  loss_mask_8: 1.016  loss_dice_8: 3.729    time: 0.5927  last_time: 0.5788  data_time: 0.0061  last_data_time: 0.0044   lr: 0.0001  max_mem: 10690M
[03/02 22:23:52 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(800, 800), max_size=1333, sample_style='choice')]
[03/02 22:23:52 d2.data.common]: Serializing the dataset using: <class 'detectron2.data.common._TorchSerializedList'>
[03/02 22:23:52 d2.data.common]: Serializing 74 elements to byte tensors and concatenating them all ...
[03/02 22:23:52 d2.data.common]: Serialized dataset takes 0.05 MiB
[03/02 22:23:52 d2.evaluation.evaluator]: Start inference on 74 batches
[03/02 22:23:54 d2.evaluation.evaluator]: Inference done 11/74. Dataloading: 0.0010 s/iter. Inference: 0.1043 s/iter. Eval: 0.0543 s/iter. Total: 0.1596 s/iter. ETA=0:00:10
[03/02 22:23:59 d2.evaluation.evaluator]: Inference done 44/74. Dataloading: 0.0010 s/iter. Inference: 0.1034 s/iter. Eval: 0.0521 s/iter. Total: 0.1565 s/iter. ETA=0:00:04
[03/02 22:24:04 d2.evaluation.evaluator]: Total inference time: 0:00:11.024099 (0.159770 s / iter per device, on 1 devices)
[03/02 22:24:04 d2.evaluation.evaluator]: Total inference pure compute time: 0:00:07 (0.106315 s / iter per device, on 1 devices)
[03/02 22:24:04 d2.evaluation.coco_evaluation]: Preparing results for COCO format ...
[03/02 22:24:04 d2.evaluation.coco_evaluation]: Saving results to ./output-teacher/inference/coco_instances_results.json
[03/02 22:24:04 d2.evaluation.coco_evaluation]: Evaluating predictions with unofficial COCO API...
Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
[03/02 22:24:04 d2.evaluation.fast_eval_api]: Evaluate annotation type *bbox*
[03/02 22:24:04 d2.evaluation.fast_eval_api]: COCOeval_opt.evaluate() finished in 0.00 seconds.
[03/02 22:24:04 d2.evaluation.fast_eval_api]: Accumulating evaluation results...
[03/02 22:24:04 d2.evaluation.fast_eval_api]: COCOeval_opt.accumulate() finished in 0.00 seconds.
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
[03/02 22:24:04 d2.evaluation.coco_evaluation]: Evaluation results for bbox: 
|  AP   |  AP50  |  AP75  |  APs  |  APm  |  APl  |
|:-----:|:------:|:------:|:-----:|:-----:|:-----:|
| 0.000 | 0.000  | 0.000  | 0.000 | 0.000 | 0.000 |
Loading and preparing results...
DONE (t=0.06s)
creating index...
index created!
[03/02 22:24:04 d2.evaluation.fast_eval_api]: Evaluate annotation type *segm*
[03/02 22:24:04 d2.evaluation.fast_eval_api]: COCOeval_opt.evaluate() finished in 0.01 seconds.
[03/02 22:24:04 d2.evaluation.fast_eval_api]: Accumulating evaluation results...
[03/02 22:24:04 d2.evaluation.fast_eval_api]: COCOeval_opt.accumulate() finished in 0.00 seconds.
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
[03/02 22:24:04 d2.evaluation.coco_evaluation]: Evaluation results for segm: 
|  AP   |  AP50  |  AP75  |  APs  |  APm  |  APl  |
|:-----:|:------:|:------:|:-----:|:-----:|:-----:|
| 0.000 | 0.000  | 0.000  | 0.000 | 0.000 | 0.000 |
[03/02 22:24:04 d2.evaluation.testing]: copypaste: Task: bbox
[03/02 22:24:04 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[03/02 22:24:04 d2.evaluation.testing]: copypaste: 0.0000,0.0000,0.0000,0.0000,0.0000,0.0000
[03/02 22:24:04 d2.evaluation.testing]: copypaste: Task: segm
[03/02 22:24:04 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[03/02 22:24:04 d2.evaluation.testing]: copypaste: 0.0000,0.0000,0.0000,0.0000,0.0000,0.0000
```
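Since the log shows predictions being written to `./output-teacher/inference/coco_instances_results.json`, one way to investigate the all-zero AP is to compare the ids in that file against the ground-truth annotations. The sketch below assumes the standard COCO results format (a list of dicts with `image_id`, `category_id`, and `score`). The helper name `compare_ids` is hypothetical, and the annotation path in the usage comment is a placeholder.

```python
import json


def compare_ids(gt, preds):
    """Summarize how prediction ids line up with the ground truth.

    gt:    dict loaded from a COCO instances annotation file
    preds: list loaded from a COCO results file
    """
    gt_cats = {c["id"] for c in gt["categories"]}
    gt_imgs = {im["id"] for im in gt["images"]}
    pred_cats = {p["category_id"] for p in preds}
    pred_imgs = {p["image_id"] for p in preds}
    return {
        "num_predictions": len(preds),
        # category ids predicted but never declared in the ground truth
        "unknown_category_ids": sorted(pred_cats - gt_cats),
        # image ids predicted but absent from the ground truth
        "unknown_image_ids": sorted(pred_imgs - gt_imgs),
        # None when there are no predictions at all
        "max_score": max((p["score"] for p in preds), default=None),
    }


# Usage (the annotation path is a placeholder, not taken from my setup):
# with open("path/to/custom_test.json") as f:
#     gt = json.load(f)
# with open("./output-teacher/inference/coco_instances_results.json") as f:
#     preds = json.load(f)
# print(compare_ids(gt, preds))
```

A non-empty `unknown_category_ids` list would point at a category id mapping problem rather than at the model itself. Zero predictions or a very low `max_score` would suggest looking at training instead.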

Am I missing something here? I'm assuming it's related to how I register my datasets, since the original COCO dataset setup from the guide appears to work. I've made sure to update the `NUM_CLASSES` fields across the config to match the classes in my custom dataset. I've also tried the Dino/R50 bases, with no luck. Thank you!
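For completeness, this is roughly how the class count behind `NUM_CLASSES` could be double-checked against the annotation file itself. It is a sketch that assumes a standard COCO instances JSON with `categories` and `annotations` keys; the function name `category_summary` is hypothetical.

```python
from collections import Counter


def category_summary(coco):
    """Summarize the categories declared and used in a COCO instances dict."""
    cat_ids = sorted(c["id"] for c in coco["categories"])
    used = Counter(a["category_id"] for a in coco["annotations"])
    return {
        # value to compare against the NUM_CLASSES entries in the config
        "num_categories": len(cat_ids),
        "category_ids": cat_ids,
        # declared categories that no annotation uses
        "unused_categories": [c for c in cat_ids if used[c] == 0],
        # category ids used by annotations but never declared
        "undeclared_in_annotations": sorted(set(used) - set(cat_ids)),
        # instance segmentation needs a non-empty mask per annotation
        "annotations_without_segmentation": sum(
            1 for a in coco["annotations"] if not a.get("segmentation")
        ),
    }
```

Running this on both the training and test JSON files should give the same `category_ids` if the two splits were exported consistently.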



