This project contains a ConvNet architecture implemented in PyTorch to classify the following four types of butterflies found in Costa Rica:
- Siproeta stelenes
- Morpho helenor
- Euptoieta hegesia meridania
- Biblis hyperia aganisa
To use the code, certain Python packages are required. You can install them by running the following command:

```
pip install -r requirements.txt
```

The project contains the following directories when you clone it:
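The authoritative package list lives in `requirements.txt` in the repository. As a rough sketch only (the exact packages and versions below are assumptions inferred from the code, not the actual file's contents), it will look something like:

```
torch
torchvision
split-folders
Pillow
```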
```
butterfly_classificator/
├── checkpoints/
│   └── checkpoint.pth
├── results/
│   ├── epoch_metrics.csv
│   ├── final_results.csv
│   ├── testing_matrix.csv
│   └── training_matrix.csv
├── src/
│   ├── convnet.py
│   ├── csv_writer.py
│   └── trainer.py
└── utils/
    ├── data_augmenter.py
    ├── preprocessor.py
    └── splitter.py
```
- In `results` you will find `.csv` files with the results from the last training execution. These can be used to plot the data.
- In `src` you will find the relevant files to train the ConvNet model.
- In `utils` you will find all files related to preprocessing the images.
To run any of the code you will need a version of the dataset. The dataset was constructed from photos taken from iNaturalist. Three datasets created for this project are available for download:
- Initial dataset, which contains the photos without being split.
- Split dataset, which contains the photos divided into training, testing and validation sets with an 80-10-10 proportion.
- Balanced dataset, which contains a balanced version of the dataset produced through data augmentation.
If you want to create your own split, you can use the auxiliary tools in the utils directory to:
- Resize the images.
- Split the data.
- Balance the classes.
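Balancing means augmenting the under-represented classes until every class has as many images as the largest one. As a minimal, stdlib-only sketch of that idea (the helper name below is hypothetical; the project's actual logic lives in `data_augmenter.py`):

```python
def augmentation_plan(counts: dict[str, int]) -> dict[str, int]:
    """Given per-class image counts, return how many augmented
    copies each class needs to match the largest class."""
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}

# Example: one class has 100 images, another only 60.
plan = augmentation_plan({"morpho_helenor": 100, "biblis_hyperia": 60})
```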
The images are resized to 200x200 pixels to make them easier for the model to process. To do this, you must have a `data` directory containing the images, then execute the following command:

```
python preprocessor.py
```

This will create a new directory titled `preprocessed_data`.
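For reference, a minimal sketch of what this preprocessing step amounts to, assuming Pillow is used for the resizing (the function names and resampling filter below are assumptions, not the project's actual code):

```python
from pathlib import Path

from PIL import Image

TARGET_SIZE = (200, 200)  # the 200x200 size described above

def resize_image(img: Image.Image) -> Image.Image:
    """Resize a single image to the fixed 200x200 model input size."""
    return img.convert("RGB").resize(TARGET_SIZE, Image.LANCZOS)

def preprocess(data_dir: str = "data", out_dir: str = "preprocessed_data") -> None:
    """Walk data_dir, resize every .jpg and save it under out_dir,
    preserving the per-class subdirectory layout."""
    for path in Path(data_dir).rglob("*.jpg"):
        out_path = Path(out_dir) / path.relative_to(data_dir)
        out_path.parent.mkdir(parents=True, exist_ok=True)
        resize_image(Image.open(path)).save(out_path)
```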
To split the data you must have the `preprocessed_data` directory from the previous step, then execute the following command:

```
python splitter.py
```

This will create a new directory titled `split_data` with the following splits:
- Training: 80% of the images.
- Testing: 10% of the images.
- Validation: 10% of the images.
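Conceptually, this kind of split is a seeded shuffle followed by slicing. A stdlib-only sketch of the idea (this helper is hypothetical; the project itself delegates the work to the `split-folders` package):

```python
import random

def split_files(filenames, seed=18, ratios=(0.80, 0.10, 0.10)):
    """Deterministically split filenames into train/test/val
    according to the given ratios."""
    assert abs(sum(ratios) - 1.0) < 1e-9
    files = sorted(filenames)          # fixed order before shuffling
    random.Random(seed).shuffle(files) # seeded for reproducibility
    n_train = int(len(files) * ratios[0])
    n_test = int(len(files) * ratios[1])
    return {
        "train": files[:n_train],
        "test": files[n_train:n_train + n_test],
        "val": files[n_train + n_test:],
    }
```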
If you want to modify these splits you can change the following line in the code:
```python
splitfolders.ratio(input_path, output=output_path, seed=18, ratio=(0.80, 0.10, 0.10))
```

Once you have either downloaded a dataset or created your own version, you can train the model by running the following command inside the `src` directory:
```
python trainer.py
```

This will update the `.csv` files inside the `results` directory and will output the training progress and final results, reporting the following information for training and validation:
- Loss
- Macro Accuracy
- Macro F1
- Macro Precision
- Macro Recall
- Confusion matrix
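Macro averaging computes each metric per class and then takes the unweighted mean over the classes, so every butterfly species counts equally regardless of how many images it has. A stdlib-only sketch of how these values follow from a confusion matrix (hypothetical helper, not the project's code):

```python
def macro_metrics(conf):
    """conf[i][j] = number of samples with true class i predicted as j.
    Returns macro-averaged precision, recall and F1."""
    k = len(conf)
    precisions, recalls, f1s = [], [], []
    for c in range(k):
        tp = conf[c][c]
        fp = sum(conf[r][c] for r in range(k)) - tp  # column sum minus tp
        fn = sum(conf[c]) - tp                       # row sum minus tp
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p); recalls.append(r); f1s.append(f1)
    return {
        "macro_precision": sum(precisions) / k,
        "macro_recall": sum(recalls) / k,
        "macro_f1": sum(f1s) / k,
    }
```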
To observe the results you can use the following Colab notebook, in which you can graph the metric results and each confusion matrix from the `.csv` files in the `results` directory.
