CSE153 music generation assignment with two sub-tasks:
- Task 1: Unconditioned symbolic generation — LSTM models for pitch and duration trained on classical piano MIDI.
- Task 2: Conditioned symbolic generation — builds on Task 1 with arpeggio-focused filtering and constrained decoding.
- Place classical piano MIDI files under `dataset/` (the notebook reads `dataset/**/*.mid`).
- Adjust preprocessing constants in the notebook if needed (e.g., `MIN_SEQ_LEN`, `MAX_SEQ_LEN`, `TRANSPOSE_RANGE`).
- Preprocessing (Task 1 section in the notebook)
- Parse MIDI, quantize to 1/16 notes, drop short sequences, split at length 128, augment with ±3/±6 semitone transposition.
- Build the `pitch_to_id`/`duration_to_id` vocabularies; save to `Task1_preprocessed/`.
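The transposition augmentation and vocabulary-building steps above might look like the following sketch (function names are hypothetical, not the notebook's):

```python
# Hypothetical sketch of the +/-3 and +/-6 semitone transposition
# augmentation and vocab construction described above.
def augment_with_transpositions(sequences, shifts=(-6, -3, 3, 6)):
    """Return the original pitch sequences plus copies shifted by each
    interval, keeping only copies that stay in the MIDI range 0-127."""
    augmented = list(sequences)
    for seq in sequences:
        for shift in shifts:
            shifted = [p + shift for p in seq]
            if all(0 <= p <= 127 for p in shifted):
                augmented.append(shifted)
    return augmented

def build_vocab(sequences):
    """Map each distinct token to an integer id (e.g., pitch_to_id)."""
    tokens = sorted({t for seq in sequences for t in seq})
    return {t: i for i, t in enumerate(tokens)}

seqs = [[60, 64, 67], [62, 65, 69]]   # two toy C-major-ish fragments
aug = augment_with_transpositions(seqs)
vocab = build_vocab(aug)
```

The same `build_vocab` helper would be run once over pitches and once over durations to produce the two id maps.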
- EDA
- Run EDA cells for pitch-class distribution, duration distribution, entropy, length histograms, and summary stats.
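The entropy statistic in the EDA could be computed along these lines (a rough sketch, not the notebook's exact cell):

```python
import math
from collections import Counter

def pitch_class_entropy(pitches):
    """Shannon entropy (bits) of the pitch-class (mod 12) distribution.
    Uniform use of all 12 classes gives the maximum, log2(12) ~= 3.58."""
    counts = Counter(p % 12 for p in pitches)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Low values indicate a piece concentrated on a few pitch classes; values near 3.58 indicate chromatic spread.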
- Training & Generation
- Train separate pitch/duration LSTMs (embed 128, hidden 256, 2 layers, dropout 0.3, label smoothing 0.1).
- Best models (by validation perplexity) are saved to `Task1_outputs/`; generation writes `generated_music.mid`.
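The stated hyperparameters correspond to a model roughly like this (assuming PyTorch; the class name and training harness are illustrative):

```python
import torch
import torch.nn as nn

class NoteLSTM(nn.Module):
    """Sketch of the pitch/duration LSTM described above:
    embedding 128, hidden 256, 2 layers, dropout 0.3."""
    def __init__(self, vocab_size, embed=128, hidden=256, layers=2, dropout=0.3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed)
        self.lstm = nn.LSTM(embed, hidden, num_layers=layers,
                            dropout=dropout, batch_first=True)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, x, state=None):
        out, state = self.lstm(self.embed(x), state)
        return self.head(out), state

model = NoteLSTM(vocab_size=88)
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)  # label smoothing 0.1
x = torch.randint(0, 88, (4, 32))                     # (batch, seq_len)
logits, _ = model(x)
# next-token prediction: shift targets by one position
loss = criterion(logits[:, :-1].reshape(-1, 88), x[:, 1:].reshape(-1))
```

Separate instances would be trained for the pitch and duration vocabularies.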
- Baselines & Metrics
- Trigram baseline vs. LSTM: LSTM pitch perplexity ~10 vs. ~498 for the trigram, showing much stronger sequence modeling.
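For reference, an add-alpha smoothed trigram baseline and its perplexity can be sketched as follows (a hedged stand-in for the notebook's baseline, not its exact code):

```python
import math
from collections import Counter, defaultdict

def trigram_perplexity(train, test, vocab_size, alpha=1.0):
    """Train an add-alpha smoothed trigram model on `train` and return
    its perplexity exp(mean negative log-likelihood) on `test`."""
    counts = defaultdict(Counter)
    for seq in train:
        for a, b, c in zip(seq, seq[1:], seq[2:]):
            counts[(a, b)][c] += 1
    log_prob, n = 0.0, 0
    for seq in test:
        for a, b, c in zip(seq, seq[1:], seq[2:]):
            ctx = counts[(a, b)]
            p = (ctx[c] + alpha) / (sum(ctx.values()) + alpha * vocab_size)
            log_prob += math.log(p)
            n += 1
    return math.exp(-log_prob / n)
```

A nearly deterministic corpus yields perplexity close to 1, while an unseen context falls back to roughly uniform (perplexity near the vocabulary size), which is why trigrams blow up on held-out classical pitch data.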
- Preprocessing
- Keep only arpeggio-like sequences: durations in {1,2,4}, intervals ≤24 semitones, min length 32; apply the same transposition augmentation.
- Save processed data and vocab to `Task2_preprocessed/`.
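The arpeggio filter above reduces to a simple predicate; a minimal sketch (the function name is hypothetical), assuming durations are expressed in sixteenth-note multiples:

```python
def is_arpeggio_like(pitches, durations,
                     allowed_durs=frozenset({1, 2, 4}),
                     max_interval=24, min_len=32):
    """Keep a sequence only if all durations are in {1, 2, 4}, every
    consecutive pitch interval is within 24 semitones, and the sequence
    is at least 32 notes long."""
    if len(pitches) < min_len:
        return False
    if any(d not in allowed_durs for d in durations):
        return False
    return all(abs(b - a) <= max_interval
               for a, b in zip(pitches, pitches[1:]))
```

Sequences passing the predicate would then go through the same transposition augmentation as in Task 1.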
- Training & Generation
- Train pitch/duration LSTMs (same hyperparams as Task 1) with constrained decoding:
  - nucleus/top-k sampling (`top_p=0.9`, `top_k=50`), n-gram repeat blocking, and repeat-threshold control;
  - progressive relaxation of the duration set and top-p if sampling fails.
- Outputs `Task2_outputs/generated_music.mid`.
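One sampling step of the constrained decoder could look like this NumPy sketch (an illustration of the technique, not the notebook's code; returning `None` is the signal for the caller to relax constraints and retry):

```python
import numpy as np

def constrained_sample(logits, history, top_k=50, top_p=0.9, no_repeat_ngram=3):
    """Block tokens that would repeat an n-gram already in `history`,
    then sample from the top-k / nucleus-truncated distribution."""
    logits = logits.astype(float).copy()
    # n-gram repeat blocking: forbid any token that completes a seen n-gram.
    if len(history) >= no_repeat_ngram - 1:
        prefix = tuple(history[-(no_repeat_ngram - 1):])
        for i in range(len(history) - no_repeat_ngram + 1):
            if tuple(history[i:i + no_repeat_ngram - 1]) == prefix:
                logits[history[i + no_repeat_ngram - 1]] = -np.inf
    if not np.isfinite(logits).any():
        return None  # everything blocked: caller relaxes constraints
    # top-k truncation
    if top_k < len(logits):
        logits[logits < np.sort(logits)[-top_k]] = -np.inf
    # nucleus (top-p) truncation over the renormalized distribution
    probs = np.exp(logits - logits[np.isfinite(logits)].max())
    probs[~np.isfinite(logits)] = 0.0
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]
    cutoff = order[: np.searchsorted(np.cumsum(probs[order]), top_p) + 1]
    nucleus = np.zeros_like(probs)
    nucleus[cutoff] = probs[cutoff]
    nucleus /= nucleus.sum()
    return int(np.random.choice(len(nucleus), p=nucleus))
```

The progressive-relaxation loop would call this with widening `top_p` (and a widening duration set for the duration model) whenever it returns `None`.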
- Baselines & Metrics
- Against n-gram baselines: typical test perplexity ~9.6 (pitch) and ~1.2 (duration), outperforming unstructured baselines.
- Faster runs: lower `NUM_EPOCHS`, `HIDDEN_SIZE`, batch size, or subsample MIDI files.
- More diversity: raise `TEMPERATURE`, `TOP_P`, or `TOP_K`; if quality drops, tighten the n-gram/repeat constraints.
- Longer pieces: raise `MAX_GENERATION_LENGTH`, noting stability limits on very long sequences.
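The knobs in these tips could live in a single config cell. `TOP_P`/`TOP_K`/`HIDDEN_SIZE` below match values stated in this README; the other defaults are illustrative placeholders, not the notebook's actual settings:

```python
# Tuning knobs referenced in the tips above. Values marked "placeholder"
# are illustrative guesses; adjust per the guidance in this README.
CONFIG = {
    "NUM_EPOCHS": 20,              # placeholder; lower for faster runs
    "HIDDEN_SIZE": 256,            # lower for faster runs
    "BATCH_SIZE": 64,              # placeholder; lower for faster runs
    "TEMPERATURE": 1.0,            # placeholder; raise for more diversity
    "TOP_P": 0.9,                  # raise for more diversity
    "TOP_K": 50,                   # raise for more diversity
    "MAX_GENERATION_LENGTH": 512,  # placeholder; raise for longer pieces
}
```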
- Classical piano MIDI dataset (Kaggle, sourced from piano-midi.de, 19 composers).
- Related work: Melissa Jalali Monfared’s LSTM example; Sulun et al. (2022) transformer with continuous emotion conditioning.