This repo was used to enter the AIcrowd Snake Species Identification Challenge; an entry based on this repo placed first in the first qualifying round of the competition. The competition aims to stimulate the development of a snake species identification application, so that bite victims and health practitioners can prioritize care for potentially harmful bites. For the qualifying round, the competition provided ~82k training images and ~18k test images covering 45 species.
The competition entry built on the core repo tools; new code was written to:
- Convert the competition data set to COCO format, for compatibility with the training code
- Call into the existing code to train both ResNeXt and Inceptionv4 networks (around 80 epochs)
- Aggregate results from the ResNeXt and Inceptionv4 models (post-hoc averaging)
- Run inference on the test data and prepare a submission for the competition
This approach yielded an F1 of 0.809 for Inceptionv4 and 0.804 for ResNeXt101. The averaged predictions achieved an F1 of 0.846, which placed first in the qualifying round of the competition.
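The post-hoc averaging step can be illustrated with a minimal sketch. It assumes each model produces one row of per-class probabilities per image; the function names `average_predictions` and `top_class` and the toy numbers are hypothetical, and the repo's own merge script may work differently.

```python
def average_predictions(probs_a, probs_b):
    """Element-wise mean of two lists of per-class probability rows."""
    if len(probs_a) != len(probs_b):
        raise ValueError("both models must score the same number of images")
    averaged = []
    for row_a, row_b in zip(probs_a, probs_b):
        if len(row_a) != len(row_b):
            raise ValueError("both models must score the same set of classes")
        averaged.append([(a + b) / 2.0 for a, b in zip(row_a, row_b)])
    return averaged


def top_class(row):
    """Index of the highest-scoring class in one probability row."""
    return max(range(len(row)), key=row.__getitem__)


# Toy example: 2 images, 3 classes (made-up numbers, not competition data).
resnext_probs = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
inception_probs = [[0.2, 0.7, 0.1], [0.1, 0.3, 0.6]]

averaged = average_predictions(resnext_probs, inception_probs)
labels = [top_class(row) for row in averaged]
```

In this toy case the two models disagree on both images, and the averaged scores settle each disagreement in favor of whichever model was more confident.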
- Follow the steps in README.md to create the required docker or conda environment.
- Download the training and test data from here.
- Unzip the training and test zipfiles into a folder called "data" in the PyTorchClassification directory, or symlink a directory called "data" to point to your data directory. When you unzip the training data, images should end up in data/train (e.g. data/train/[class]/[filename].jpg). Test data should end up in data/round1/[filename].jpg.
- Run the following commands:
# cd into the PyTorchClassification directory
python snakes/folder_to_coco.py # Creates the COCO annotation format for the dataset
python run_snakes_training.py # Trains both ResNeXt101 and Inceptionv4 architectures
python snakes/test_snakes.py # Generates predictions on the test dataset
python snakes/merge_snakes_results.py # Merges the results from the two different architectures
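
To give a sense of what the first command does, here is a minimal sketch of converting a data/train/[class]/[filename].jpg layout into a classification-style subset of the COCO annotation format (images, categories, and annotations without bounding boxes). This is an illustration under those assumptions, not the contents of snakes/folder_to_coco.py; the function name, field choices, and ID scheme are hypothetical.

```python
import os
import tempfile


def folder_to_coco(train_dir):
    """Build a COCO-style dict from a [class]/[filename] directory tree."""
    class_names = sorted(
        d for d in os.listdir(train_dir)
        if os.path.isdir(os.path.join(train_dir, d))
    )
    categories = [{"id": i, "name": name} for i, name in enumerate(class_names)]
    images, annotations = [], []
    for cat in categories:
        class_dir = os.path.join(train_dir, cat["name"])
        for fname in sorted(os.listdir(class_dir)):
            if not fname.lower().endswith((".jpg", ".jpeg", ".png")):
                continue  # skip non-image files
            image_id = len(images)
            images.append({"id": image_id, "file_name": f"{cat['name']}/{fname}"})
            # One annotation per image: its class label, no bounding box.
            annotations.append(
                {"id": image_id, "image_id": image_id, "category_id": cat["id"]}
            )
    return {"images": images, "categories": categories, "annotations": annotations}


# Demo on a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as root:
    for cls, fname in [("class_a", "1.jpg"), ("class_b", "2.jpg"), ("class_b", "3.jpg")]:
        os.makedirs(os.path.join(root, cls), exist_ok=True)
        open(os.path.join(root, cls, fname), "wb").close()
    coco = folder_to_coco(root)
```

The resulting dict can be written out with `json.dump` for use as an annotation file.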
