A scalable open-source framework for machine learning based image collection, annotation and classification: a case study for automatic fish species identification
This is an open-source modular framework for large-scale image storage, handling, annotation and automatic classification, using cost- and labour-efficient methodologies. The framework is based on the TensorFlow Lite Model Maker library and includes data augmentation and transfer learning techniques, applied to different convolutional neural network models.
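To illustrate what data augmentation means in this context, the minimal sketch below generates label-preserving variants of a training image with only the Python standard library. It is not the framework's own code and not the TensorFlow Lite Model Maker API: the image is assumed to be a small grayscale grid of 0-255 intensities, and all function names and the chosen transformations (flip, 90-degree rotation, brightness shift) are hypothetical choices made for illustration.

```python
import random
from typing import List

# Assumption: a grayscale image stored as rows of pixel intensities (0-255).
Image = List[List[int]]


def flip_horizontal(image: Image) -> Image:
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]


def rotate_90_clockwise(image: Image) -> Image:
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def adjust_brightness(image: Image, delta: int) -> Image:
    """Add delta to every pixel, clamping to the valid 0-255 range."""
    return [[max(0, min(255, pixel + delta)) for pixel in row] for row in image]


def augment(image: Image, rng: random.Random) -> Image:
    """Return a randomly transformed copy; the species label stays the same."""
    out = [list(row) for row in image]
    if rng.random() < 0.5:
        out = flip_horizontal(out)
    if rng.random() < 0.5:
        out = rotate_90_clockwise(out)
    # Hypothetical brightness range, chosen only for this sketch.
    return adjust_brightness(out, rng.randint(-30, 30))


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is repeatable
    fish = [[10, 20, 30], [40, 50, 60]]
    for _ in range(3):
        print(augment(fish, rng))
```

Each call to `augment` yields a slightly different view of the same fish, which lets a classifier see more variation than the raw image collection contains. In practice the augmentation would be applied to full-colour images by the training library rather than by hand-written code like this.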
We demonstrate the implementation of this framework in an example case study for automatic fish species identification from images taken through a recreational fishing smartphone application. The framework presented here is highly customisable for further advancement and for community-based image collection and annotation.
Overview of the framework and the main tools used. The framework consists of three modules and six steps. The platforms and specific tools indicated here (e.g. Google Colab and Google Cloud Platform) were used in the example application but could be replaced with other tools.
The publication presenting this work can be found here.
A free online course explaining this framework was run in October 2022, and all course materials are available at https://fishsizeproject.github.io/Course-MLforImageProcessing
The study was supported by the European Regional Development Fund (project No 01.2.2-LMT-K-718-02-0006) under a grant agreement with the Research Council of Lithuania (LMTLT). The study was also supported by the Pew Fellows Program in Marine Conservation.
