- Timeframe: roughly two weeks
- Participants: six (varying activity levels / time budgets)
- Backgrounds: students, data scientists, software developers
- Labeled and unlabeled sound data are strongly imbalanced (a lot of unlabeled data, only some labeled data)
- Developing robust AI models with supervised training requires labeled input data
- The greater Manaus region (Amazonas, Brazil) has vast biodiversity, most of which is unexplored
- Creating sound / activity clusters appears to be a challenge even with existing closed-source software (call with Flor), and there is an evident need for transparent, open-source solutions
- Create a feature extractor, either simple or encoder-decoder based (foundation model)
- Address the need for a clear and concise algorithm for classifying / clustering complex sound data for downstream labeling tasks
- Create a pipeline for data ingestion, feature extraction, clustering, labeling, and finally updating / fine-tuning a foundation model
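
As a rough illustration of the feature extraction and clustering stages only, the sketch below uses plain NumPy. It is a minimal sketch under stated assumptions, not the project's implementation: the band-energy features stand in for the "simple" feature extractor, k-means is just one possible clustering choice, and all names (`synthetic_clip`, `extract_features`, `kmeans`) and parameter values are hypothetical.

```python
import numpy as np


def synthetic_clip(freq, sr=8000, seconds=1.0, noise=0.05, rng=None):
    """Hypothetical stand-in for a recording: a noisy sine tone (mono)."""
    rng = np.random.default_rng() if rng is None else rng
    t = np.arange(int(sr * seconds)) / sr
    return np.sin(2 * np.pi * freq * t) + noise * rng.standard_normal(t.size)


def extract_features(clip, n_fft=256, hop=128, n_bands=8):
    """Summarize a clip as its mean log-energy in n_bands frequency bands."""
    # Slice the signal into overlapping frames and apply a Hann window.
    n_frames = 1 + (len(clip) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = clip[idx] * np.hanning(n_fft)
    # Power spectrum per frame, shape (n_frames, n_fft // 2 + 1).
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Pool FFT bins into coarse bands, then average log-energy over time.
    bands = np.array_split(power, n_bands, axis=1)
    band_energy = np.stack([b.sum(axis=1) for b in bands], axis=1)
    return np.log(band_energy + 1e-10).mean(axis=0)


def _assign(X, centers):
    """Index of the nearest center for every row of X."""
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(X, k, n_iter=50):
    """Plain k-means with deterministic farthest-point initialization."""
    # Start from the first sample, then repeatedly add the sample that is
    # farthest from all centers chosen so far.
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = _assign(X, centers)
        # Move each center to the mean of its members (keep it if empty).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return _assign(X, centers), centers


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three low-pitched and three high-pitched synthetic "calls".
    clips = [synthetic_clip(f, rng=rng) for f in (200, 250, 300, 3200, 3250, 3300)]
    X = np.stack([extract_features(c) for c in clips])
    labels, _ = kmeans(X, k=2)
    print(labels)
```

In a fuller pipeline, the resulting cluster assignments could be handed to a human for labeling, and `extract_features` could be swapped for embeddings from an encoder-decoder foundation model without changing the clustering step.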