CNN vocalization classifier for North American birds - classifies songs, calls & alarms using spectrograms.
Download v1.0 Models (70 MB) - 46 species, trained December 2024
Extract the archive to the trained-models/ folder.
This project trains deep learning models to classify bird vocalizations into three categories:
- Song - Territorial/mating vocalizations
- Call - Contact/communication calls
- Alarm - Warning/distress calls
Currently supports 50 common North American species based on eBird frequency data.
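As a rough illustration of the three categories above, the sketch below maps a free-text recording type (such as the type field attached to Xeno-canto recordings) to one of the three labels. The function name `vocalization_label` and the keyword rules are assumptions for illustration, not the project's actual labelling code.

```python
from typing import Optional

LABELS = ("song", "call", "alarm")


def vocalization_label(type_text: str) -> Optional[str]:
    """Map a free-text recording type to 'song', 'call' or 'alarm'.

    Returns None when the text matches no category or is ambiguous
    (for example a recording tagged as both song and call).
    """
    text = type_text.lower()
    # Check "alarm" first: "alarm call" contains both keywords.
    if "alarm" in text:
        return "alarm"
    has_song = "song" in text
    has_call = "call" in text
    if has_song and has_call:
        return None  # mixed recording, skipped in this sketch
    if has_song:
        return "song"
    if has_call:
        return "call"
    return None


for raw in ("Song", "flight call", "alarm call", "call, song", "drumming"):
    print(f"{raw!r} -> {vocalization_label(raw)}")
```

Dropping ambiguous recordings rather than guessing keeps the training labels clean, at the cost of fewer samples per class.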
- Open the notebook in Google Colab (Pro+ recommended for A100 GPU)
- Run all cells to train models
- Models are saved to data/models/
git clone https://github.com/RonnyCHL/birdcall-usa.git
cd birdcall-usa
pip install torch torchaudio librosa scikit-learn matplotlib seaborn requests
python full_pipeline.py
python full_pipeline.py --priority 1
python full_pipeline.py --species "American Robin"
python full_pipeline.py --dry-run

| Species | Scientific Name | eBird Freq |
|---|---|---|
| Mourning Dove | Zenaida macroura | 35% |
| Northern Cardinal | Cardinalis cardinalis | 34% |
| American Robin | Turdus migratorius | 33% |
| American Crow | Corvus brachyrhynchos | 32% |
| Blue Jay | Cyanocitta cristata | 28% |
| Song Sparrow | Melospiza melodia | 25% |
| Red-winged Blackbird | Agelaius phoeniceus | 25% |
| European Starling | Sturnus vulgaris | 25% |
| American Goldfinch | Spinus tristis | 24% |
| Canada Goose | Branta canadensis | 23% |
| House Finch | Haemorhous mexicanus | 23% |
| Downy Woodpecker | Dryobates pubescens | 23% |
| Mallard | Anas platyrhynchos | 22% |

Priority 2 (13 species): Red-bellied Woodpecker, House Sparrow, Turkey Vulture, Black-capped Chickadee, Tufted Titmouse, Dark-eyed Junco, White-breasted Nuthatch, Northern Flicker, Great Blue Heron, Northern Mockingbird, Carolina Wren, Red-tailed Hawk, Common Grackle

Priority 3 (24 species): Barn Swallow, Yellow-rumped Warbler, Ring-billed Gull, Gray Catbird, Common Yellowthroat, Brown-headed Cowbird, Chipping Sparrow, Tree Swallow, Eastern Bluebird, White-throated Sparrow, Killdeer, Eastern Phoebe, Cedar Waxwing, Ruby-throated Hummingbird, Baltimore Oriole, Indigo Bunting, Eastern Towhee, Brown Thrasher, Purple Finch, Pine Warbler, Carolina Chickadee, Eastern Meadowlark, Wood Thrush, Scarlet Tanager
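The `--priority`, `--species` and `--dry-run` options could select from a species list along the lines sketched below. This is a minimal sketch, not the code in full_pipeline.py or us_bird_species.py: the `Species` tuple, `build_parser` and `select_species` are hypothetical names, the tier numbers given to the last two species are assumed, and `--priority N` is assumed to select exactly tier N.

```python
import argparse
from typing import List, NamedTuple, Optional


class Species(NamedTuple):
    common_name: str
    scientific_name: str
    priority: int  # 1 = most frequently reported tier


# Illustrative subset only; tiers for the last two entries are assumed.
SPECIES = [
    Species("Mourning Dove", "Zenaida macroura", 1),
    Species("American Robin", "Turdus migratorius", 1),
    Species("Carolina Wren", "Thryothorus ludovicianus", 2),
    Species("Wood Thrush", "Hylocichla mustelina", 3),
]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Select species to train (sketch)")
    parser.add_argument("--priority", type=int, choices=[1, 2, 3],
                        help="train only species in this priority tier")
    parser.add_argument("--species", help="train a single species by common name")
    parser.add_argument("--dry-run", action="store_true",
                        help="list what would be trained without training")
    return parser


def select_species(priority: Optional[int] = None,
                   name: Optional[str] = None) -> List[Species]:
    """Filter the species list by tier and/or common name (case-insensitive)."""
    selected = SPECIES
    if priority is not None:
        selected = [s for s in selected if s.priority == priority]
    if name is not None:
        selected = [s for s in selected if s.common_name.lower() == name.lower()]
    return selected


# Equivalent to: python full_pipeline.py --priority 1 --dry-run
args = build_parser().parse_args(["--priority", "1", "--dry-run"])
for s in select_species(args.priority, args.species):
    action = "would train" if args.dry_run else "training"
    print(f"{action}: {s.common_name} ({s.scientific_name})")
```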
- Input: 128x128 mel-spectrogram
- Architecture: 4-layer CNN with batch normalization
- Output: 3 classes (song, call, alarm)
- Training: ~150 samples per class from Xeno-canto
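A network matching the description above might look like the following PyTorch sketch. Only the input size, the four convolutional layers with batch normalization and the three outputs come from this README; the class name `VocalizationCNN`, the channel widths, the 3x3 kernels, the max pooling and the global average pooling head are assumptions.

```python
import torch
from torch import nn


class VocalizationCNN(nn.Module):
    """4 conv blocks (Conv -> BatchNorm -> ReLU -> MaxPool) plus a linear head."""

    def __init__(self, num_classes: int = 3):
        super().__init__()
        channels = [1, 16, 32, 64, 128]  # assumed widths, not from the project
        blocks = []
        for in_ch, out_ch in zip(channels[:-1], channels[1:]):
            blocks += [
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(2),  # halves height and width: 128 -> 64 -> 32 -> 16 -> 8
            ]
        self.features = nn.Sequential(*blocks)
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),  # average each 8x8 feature map to one value
            nn.Flatten(),
            nn.Linear(channels[-1], num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))


model = VocalizationCNN().eval()
batch = torch.randn(2, 1, 128, 128)  # two single-channel 128x128 spectrograms
with torch.no_grad():
    logits = model(batch)
print(logits.shape)  # torch.Size([2, 3]): one score per class (song, call, alarm)
```

With roughly 150 samples per class, a small network like this is less likely to overfit than a deep pretrained backbone, which is one plausible reason for keeping the architecture shallow.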
Audio recordings are automatically downloaded from Xeno-canto, the world's largest collection of bird sounds.
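As a sketch of how such a download step could look up recordings, the snippet below builds a search URL for one species and vocalization type. It assumes the public Xeno-canto API v2 endpoint and its `type:` search tag; the current API version, authentication requirements and tag names may differ, so check the Xeno-canto API documentation before relying on it. `build_query_url` is a hypothetical helper, not the project's collector.

```python
from urllib.parse import urlencode

# Assumed endpoint (Xeno-canto API v2); newer API versions may differ.
API_URL = "https://xeno-canto.org/api/2/recordings"


def build_query_url(scientific_name: str, voc_type: str, page: int = 1) -> str:
    """Return a search URL for one species and one vocalization type."""
    query = f"{scientific_name} type:{voc_type}"
    return f"{API_URL}?{urlencode({'query': query, 'page': page})}"


url = build_query_url("Turdus migratorius", "song")
print(url)
```

The response is paginated JSON, so a real collector would request successive pages until it has gathered enough samples for each class.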
birdcall-usa/
├── data/
│ ├── raw/ # Downloaded audio files
│ ├── models/ # Trained .pt model files
│ └── spectrograms-*/ # Generated spectrograms
├── logs/ # Training logs and plots
├── notebooks/
│ └── train_colab.ipynb # Colab training notebook
├── src/
│ ├── classifiers/ # CNN model code
│ ├── collectors/ # Xeno-canto downloader
│ └── processors/ # Spectrogram generator
├── full_pipeline.py # Main training script
└── us_bird_species.py # Species list
| Scope | Species | Time (A100) |
|---|---|---|
| Priority 1 | 13 | ~1 hour |
| Priority 1+2 | 26 | ~2-3 hours |
| All | 50 | ~4-6 hours |
- emsn-vocalization - Original Dutch/European version
- BirdNET - Species identification model
MIT License - see LICENSE file.
- Audio data from Xeno-canto contributors
- Species frequency data from eBird