A machine learning model that generates Bach-style music compositions.
JSBach/
├── scripts/ # Data collection scripts
│ ├── scrape_bach_midi.py # Scraper for jsbach.net
│ └── scrape_bachcentral.py # Scraper for bachcentral.com
│
├── data/
│ ├── raw/ # Raw downloaded MIDI files
│ │ ├── jsbach_net/ # ~257 files from jsbach.net
│ │ └── bachcentral/ # Files from bachcentral.com (run scraper)
│ │
│ ├── final/ # Merged & deduplicated dataset (future)
│ └── processed/ # Piano-converted training data (future)
│
└── notebooks/ # Jupyter notebooks for exploration (future)
# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate
# Install dependencies
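pip install requests beautifulsoup4 pretty_midi numpy pandas music21
# Download from jsbach.net (run from the repository root)
cd scripts
python scrape_bach_midi.py
# Download from bachcentral.com (run from the repository root)
cd scripts
python scrape_bachcentral.py
✅ Phase 1: Data Collection (In Progress)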
- jsbach.net: 257 MIDI files downloaded
- bachcentral.com: Ready to download
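The collection step comes down to finding MIDI links on an index page and resolving them to absolute URLs. The sketch below is an illustration only, not the code in scrape_bach_midi.py or scrape_bachcentral.py, whose internals this README does not show. It uses the standard library instead of requests and beautifulsoup4 so that it runs without a network connection. The names MidiLinkParser and extract_midi_links and the example.com URL are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class MidiLinkParser(HTMLParser):
    """Collect links that point at .mid or .midi files (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Match the extension case-insensitively: some sites use .MID.
        if href and href.lower().endswith((".mid", ".midi")):
            absolute = urljoin(self.base_url, href)
            if absolute not in self.links:  # skip repeated links
                self.links.append(absolute)


def extract_midi_links(html, base_url):
    """Return absolute MIDI URLs found in an HTML string, in page order."""
    parser = MidiLinkParser(base_url)
    parser.feed(html)
    return parser.links


# Placeholder page content; a real run would fetch the index page first.
page = '<a href="suites/bwv806.mid">Suite 1</a> <a href="about.html">About</a>'
print(extract_midi_links(page, "https://example.com/midi/"))
```

Keeping link extraction separate from downloading makes the parsing logic easy to check against a saved copy of a page before any files are fetched.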
⏳ Next Steps:
- Complete bachcentral download
- Deduplicate and merge datasets
- Convert all instruments to piano
- Create training sequences
- Build and train model
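As an illustration of the "Create training sequences" step, one common approach is to slide a fixed-length window over each piece and pair every window with the note that follows it. This README does not specify the note representation, the sequence length, or the model, so everything below is an assumption: pieces are treated as plain lists of MIDI pitch numbers, and make_training_windows, seq_len, and step are hypothetical names.

```python
def make_training_windows(pitches, seq_len=32, step=1):
    """Return (input_window, next_pitch) pairs from a list of MIDI pitches.

    Assumed setup: the model predicts the next pitch from the previous
    seq_len pitches. Pieces shorter than seq_len + 1 yield no pairs.
    """
    if seq_len < 1 or step < 1:
        raise ValueError("seq_len and step must be positive")
    pairs = []
    # Stop early enough that a target note exists after each window.
    for start in range(0, len(pitches) - seq_len, step):
        window = pitches[start:start + seq_len]
        target = pitches[start + seq_len]
        pairs.append((window, target))
    return pairs


# A short ascending fragment (C4 D4 E4 F4 G4) as MIDI pitch numbers.
fragment = [60, 62, 64, 65, 67]
for window, target in make_training_windows(fragment, seq_len=3):
    print(window, "->", target)
```

A larger step reduces overlap between windows and the number of training pairs; the right values depend on the dataset size and model, which are still open at this stage.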
Current Collection (jsbach.net - 257 files):
- Goldberg Variations: 33 files
- English Suites: 48 files
- Solo Cello: 36 files
- Solo Violin: 31 files
- Organ Works: 32 files
- Well-Tempered Clavier (partial): 49 files
- Art of Fugue: 4 files
- And more...
Target: 500-600 unique Bach MIDI files for training
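To gauge progress toward that target once both sources are downloaded, unique files can be counted by content hash. This is a minimal sketch, not the project's deduplication script (which is listed as a future step). It assumes that "unique" means byte-identical content, so the same piece encoded differently by two sites would still count twice. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def unique_midi_files(root):
    """Map content hash -> first path seen for all MIDI files under root.

    Paths are visited in sorted order so the kept copy is deterministic.
    """
    seen = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in {".mid", ".midi"}:
            # setdefault keeps the first path and ignores later duplicates.
            seen.setdefault(file_digest(path), path)
    return seen


if __name__ == "__main__":
    # Assumes the layout shown in the directory tree above.
    unique = unique_midi_files("data/raw")
    print(f"{len(unique)} byte-distinct MIDI files")
```

Catching re-encoded copies of the same piece would need a comparison of musical content (for example, note sequences parsed with pretty_midi or music21) rather than raw bytes.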