95/23 Add San Diego Experiments and XC downloader #96

Sean1572 · 2025-10-17T21:11:05Z

No description provided.

* Added an experimental inference pipeline. Needed this to test the unseen burrowing owl dataset. Will help a lot with getting data formatted correctly
* Add testing code for loading in models for inference
* feat: add inference pipeline
* Block CSVs from commits
* Move visualization packages to optional dependency
* Clean code (remove comments and remove run_specific code from library)
* Clean up timm_model
* Add BirdMAE Model Training

Sean1572 · 2025-10-17T21:25:51Z

Merge only after merging the inference pipeline; see #88.

BUG NOTICE: `continue` broke inference in waveform_preprocessors.py... We need to rethink how corrupted audio files are handled in model training. Maybe replace them with empty data so it doesn't break training?
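A minimal sketch of the "replace with empty data" idea, assuming a per-file loader that raises on unreadable audio. The names `load_batch_with_placeholders`, `load_fn`, and `num_samples` are hypothetical and not taken from waveform_preprocessors.py; waveforms are plain lists of floats only to keep the example dependency-free.

```python
from typing import Callable, List, Sequence, Tuple


def load_batch_with_placeholders(
    paths: Sequence[str],
    load_fn: Callable[[str], List[float]],  # hypothetical per-file loader
    num_samples: int,
) -> Tuple[List[List[float]], List[bool]]:
    """Load every file in `paths`, substituting silence for unreadable ones.

    Returns the waveforms plus a parallel list of flags: True where real
    audio was loaded, False where a silent placeholder was used.
    """
    waveforms: List[List[float]] = []
    is_valid: List[bool] = []
    for path in paths:
        try:
            waveforms.append(load_fn(path))
            is_valid.append(True)
        except Exception as err:  # broad on purpose: decoders fail in many ways
            print(f"Could not read {path}: {err}; using a silent placeholder")
            waveforms.append([0.0] * num_samples)
            is_valid.append(False)
    return waveforms, is_valid
```

Because the output always has one entry per input path, the batch size never shrinks. The `is_valid` flags would let a training loop mask the loss for placeholder rows, or let inference report which files produced no real prediction; whether that fits the trainer used here is an open question.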

kgarwoodsdzwa · 2026-01-13T17:52:36Z

@Sean1572 I was thinking of merging this before doing the push to main, but there are merge conflicts with the dev branch, I believe mainly in the inference script, train script, trainer script, raw data extractor, etc. Maybe this could be resolved by putting all the revised versions that are needed in a Jupyter notebook in an example folder?

Sean1572 · 2026-01-14T00:08:05Z

whoot_model_training/whoot_model_training/preprocessors/waveform_preprocessors.py

+            except Exception as e:
+                print(e)
+                print("File Likely is corrupted, moving on")
+                break


@kgarwoodsdzwa I realized there is an issue with this section of the code, which is part of the reason I didn't want it pushed to main.

This was my cheap way of handling file corruption. Previously this was a `continue`, and that was fine for training (it just produced a smaller batch). In inference, however, it is a huge issue because it can misalign file names and model outputs (I passed in 15 files but only got 14 predictions...). Raising an error fixed this because I could just skip the batch. During training we have no such error handling, since the trainer doesn't allow for it.

So we need to tackle error handling as part of this repo
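The skip-the-batch behavior described above could look roughly like the sketch below. It is an illustration under assumptions, not the repo's inference code: `CorruptedFileError`, `predict_aligned`, `preprocess`, and `predict` are hypothetical names, and the model is stood in for by any callable that returns one output per input.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple


class CorruptedFileError(Exception):
    """Hypothetical error a preprocessor raises for unreadable audio."""


def predict_aligned(
    paths: Sequence[str],
    preprocess: Callable[[str], Any],
    predict: Callable[[List[Any]], List[Any]],
    batch_size: int,
) -> Tuple[Dict[str, Any], List[str]]:
    """Run inference batch by batch, keeping file names tied to outputs.

    A batch containing a corrupted file is skipped as a whole and its
    paths are recorded, so no prediction is ever paired with the wrong file.
    """
    results: Dict[str, Any] = {}
    skipped: List[str] = []
    for start in range(0, len(paths), batch_size):
        batch_paths = list(paths[start:start + batch_size])
        try:
            inputs = [preprocess(p) for p in batch_paths]
        except CorruptedFileError:
            skipped.extend(batch_paths)
            continue
        outputs = predict(inputs)
        # Guard against the "15 files in, 14 predictions out" failure mode.
        if len(outputs) != len(batch_paths):
            raise RuntimeError("prediction count does not match file count")
        results.update(zip(batch_paths, outputs))
    return results, skipped
```

Keying the results by path and asserting the counts match makes a silent misalignment impossible; the cost is that healthy files sharing a batch with a corrupted one are skipped too and would need a retry.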

Linted most of model_trainer with flake8 and some of the data downloader; realized that the data downloader needs a major code cleanup.

Sean1572 added 5 commits October 10, 2025 15:07

Adds working code for XC data and few shot model training

a840ee5

Add soundfile

add3efc

Add code for using XC api

a27c56d

Resolve train.py conflicts

0aee295

This was linked to issues Oct 17, 2025

San Diego Fewshot Dataset and Training #95

Open

add tools to download from xeno canto for other bird models and species #23

Open

Fix bugs with birdmae data loading

b5510f2

Sean1572 marked this pull request as draft October 17, 2025 21:25

Sean1572 added 6 commits October 17, 2025 16:02

Linted (round 1)

3e95f11

Remove perch model

d71134f

fix: adjust for nas data

6e6fe33

lint: finish flake8 linting

f0f414b

Clean up few-shot experiment branch

1cd04bd

Merge branch 'dev' into few_shot_experiments

53fb144

Sean1572 marked this pull request as ready for review December 12, 2025 23:40

Sean1572 added 2 commits December 17, 2025 17:08

Add inference fixes for fewshot

afca698

Add fixes to model training

a722c09

Merge branch 'dev' into few_shot_experiments

87471ee

Sean1572 commented Jan 14, 2026

Sean1572 added 4 commits January 13, 2026 16:28

Fix bug with corrupted files

d1136d2

Outdated inference script

cf750d1

Lint round 1

0111bec

Linted Data Downloader

18304e7
