This is the official repository for the Multi-Modal Classification challenge.
This challenge focuses on object classification using multi-modal data sources, including RGB, depth, and infrared images. You can visit the official website for more details.
In this track, we provide a dataset named MMC (Multi-Modal Object Classification), which comprises 3,000 multi-modal image pairs (2,000 for training and 1,000 for testing) across 13 classes.
Sample images by modality: Depth, Thermal-IR, RGB.
MMC
├── train_2k
│ ├──color
│ │ ├── train_0001.png
│ │ ├── train_0002.png
│ │ ├── ... ...
│ │ ├── train_4000.png
│ ├──depth
│ │ ├── train_0001.png
│ │ ├── train_0002.png
│ │ ├── ... ...
│ │ ├── train_4000.png
│ ├──infrared
│ │ ├── train_0001.png
│ │ ├── train_0002.png
│ │ ├── ... ...
│ │ ├── train_4000.png
│ │ ├── ... ...
│ ├──train_labels.txt
├── test
│ ├──color
│ │ ├── test_0001.png
│ │ ├── test_0002.png
│ │ ├── ... ...
│ │ ├── test_1000.png
│ ├──depth
│ │ ├── test_0001.png
│ │ ├── test_0002.png
│ │ ├── ... ...
│ │ ├── test_1000.png
│ ├──infrared
│ │ ├── test_0001.png
│ │ ├── test_0002.png
│ │ ├── ... ...
│ │ ├── test_1000.png
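Before training, it can help to confirm that every sample has an image in all three modality folders. The following is a minimal sketch, not part of the repository: the helper names `check_modalities` and `sample_paths` are hypothetical, and it assumes only the folder names (`color`, `depth`, `infrared`) and the `.png` extension shown in the tree above.

```python
from pathlib import Path

# Folder names taken from the directory tree above.
MODALITIES = ("color", "depth", "infrared")


def check_modalities(split_dir):
    """Compare the .png file names found in each modality folder.

    Returns (complete, missing):
      complete: sorted names present in all three folders
      missing:  dict mapping each modality to the names it lacks
    """
    split_dir = Path(split_dir)
    names = {
        m: {p.name for p in (split_dir / m).glob("*.png")}
        for m in MODALITIES
    }
    all_names = set().union(*names.values())
    complete = sorted(set.intersection(*names.values()))
    missing = {m: sorted(all_names - names[m]) for m in MODALITIES}
    return complete, missing


def sample_paths(split_dir, name):
    """Build the three per-modality paths for one file name (hypothetical helper)."""
    split_dir = Path(split_dir)
    return {m: split_dir / m / name for m in MODALITIES}
```

For example, `check_modalities("MMC/train_2k")` would return the list of usable samples plus any per-modality gaps, assuming the dataset was extracted under `MMC/`.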
This code is based on ResNet18.
- ❗Note: The validation set is not provided. You should split the training set yourself to validate during training.
- We have modified the model to accommodate this multi-modal task, but you can also build your own model.
- Change the dataset root path to your path to `train_2k`.
- Run `train.py` to train your own model:
python train.py --root path_to_train_2k \
--train_labels train_labels.txt \
--val_labels val_labels.txt \
--epochs 80 \
--eval_period 1 \
--batch 64 \
--num_classes 13 \
--output_file path_to_save_the_log_file_and_the_model
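The `--val_labels` argument expects a validation label file, which has to be carved out of the training labels because no validation set is shipped. One possible way to do this is sketched below. The function `split_labels` and the output file names are hypothetical, and the sketch assumes that `train_labels.txt` holds one sample per line with no header row; the actual label format is not described here, so each line is copied unchanged.

```python
import random
from pathlib import Path


def split_labels(labels_path, train_out, val_out, val_fraction=0.2, seed=0):
    """Randomly split a label file into training and validation files.

    Assumption: one sample per line, no header. Lines are treated as
    opaque strings and written back unchanged.
    Returns (number of training lines, number of validation lines).
    """
    lines = [ln for ln in Path(labels_path).read_text().splitlines() if ln.strip()]
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(lines)
    n_val = int(round(len(lines) * val_fraction))
    val, train = lines[:n_val], lines[n_val:]
    Path(train_out).write_text("\n".join(train) + "\n")
    Path(val_out).write_text("\n".join(val) + "\n")
    return len(train), len(val)
```

Writing the training part to a new file (for example `train_split.txt`) rather than overwriting `train_labels.txt` keeps the original labels intact; the two output files can then be passed to `--train_labels` and `--val_labels`. This split is purely random, so class proportions may differ slightly between the two files.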
Run `Inference.py` to generate the predictions file `submission.csv` for the test set:
python Inference.py --root path_to_test_2k \
--checkpoint path_to_the_best_model \
--save path_to_save_the_submissionfile \
--num_classes 13
- ❗Note on results: `submission.csv` will be generated automatically, and it is the only file you need to submit to the platform for evaluation.





