A transformer-based model that can generate sequences of piano notes and their respective durations and velocities.
This is an example piano music generation from a model trained for 10,000 iterations with a learning rate of 3e-4, 256 embedding dimensions, 6 transformer blocks, and 6 heads in each multi-head self-attention layer. The script that converts the tokens, durations, and velocities into an audio file has its limitations: namely, each note's reverb is cut off when the next note begins. Even so, despite the lossiness of the audio file creation scheme, this example still exhibits the model's understanding of melody, chord progression, and so on.
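To make that limitation concrete, here is a minimal, self-contained sketch of such a conversion, assuming notes arrive as (MIDI pitch, duration in seconds, velocity 0-127) triples and using a decaying sine tone in place of a real piano sound. This is illustrative only, not the repository's actual conversion script:

```python
# Illustrative sketch: render (pitch, duration, velocity) triples to a WAV
# file, truncating each note at the next note's onset -- the same cut-off
# behavior described above. Assumes MIDI pitch numbers, durations in
# seconds, and velocities in 0-127; none of this mirrors the repo's script.
import numpy as np
from scipy.io import wavfile

SAMPLE_RATE = 44100

def render(notes, out_path="example.wav"):
    """notes: list of (midi_pitch, duration_sec, velocity_0_to_127) triples."""
    total = sum(dur for _, dur, _ in notes)
    audio = np.zeros(int(total * SAMPLE_RATE) + 1, dtype=np.float32)
    onset = 0.0
    for pitch, dur, vel in notes:
        freq = 440.0 * 2.0 ** ((pitch - 69) / 12.0)  # MIDI pitch -> Hz
        n = int(dur * SAMPLE_RATE)
        t = np.arange(n) / SAMPLE_RATE
        # The exponential decay stands in for reverb; its tail would extend
        # past `dur`, but rendering stops at the next onset, cutting it off.
        tone = (vel / 127.0) * np.sin(2 * np.pi * freq * t) * np.exp(-t / 0.8)
        start = int(onset * SAMPLE_RATE)
        end = min(start + n, len(audio))
        audio[start:end] += tone[: end - start].astype(np.float32)
        onset += dur
    wavfile.write(out_path, SAMPLE_RATE, audio)

render([(60, 0.5, 90), (64, 0.5, 90), (67, 1.0, 100)])  # C-E-G, tails cut off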
Before you begin, ensure you have the following installed on your system:
- Docker: Install Docker from here.
- NVIDIA Container Toolkit (for GPU support): Follow the installation guide here.
First, clone this repository to your local machine:
```
git clone https://github.com/SamuelReeder/piano.git
cd piano
```
This project is fully Dockerized, meaning all dependencies and environment setup are handled via Docker.
To build the Docker image, use the Makefile:
```
make build
```
To run the Docker container with GPU support, use the following command:
```
make run
```
This command will:
- Start the Docker container interactively.
- Mount the current directory to `/workspace` inside the container, so any changes made are reflected on your host system.
- Enable GPU support (assuming you have the NVIDIA Container Toolkit installed).
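Under the hood, `make run` presumably wraps something along the lines of `docker run --gpus all -it -v $(pwd):/workspace <image_name>` (the exact image name and flags depend on the Makefile), so an equivalent `docker run` command can be used directly if `make` is not available.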
Once inside the Docker container, you can run the following scripts:
To train a model, run the following command:
```
python train.py <model_name>
```
This will train a model with the specified name and save it to the models directory. Please update the parameters in the train.py script to customize the training.
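For reference, the hyperparameters quoted for the example above would correspond to settings along these lines (the variable names here are illustrative; check `train.py` for the actual ones):

```python
# Illustrative names only; the real variables in train.py may differ.
max_iters     = 10000  # training iterations
learning_rate = 3e-4
n_embd        = 256    # embedding dimensions
n_layer       = 6      # transformer blocks
n_head        = 6      # heads per multi-head self-attention layer
```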
To generate a musical piece and a `.wav` file of the simulated audio, run the following command:
```
python generate.py <model_name> <max_tokens> <output_file_name>
```
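For example, `python generate.py my_model 500 my_piece` would load the model named `my_model`, generate a piece of up to 500 tokens, and write the simulated audio to a `.wav` file (the model name here is hypothetical, and whether the output name should include the `.wav` extension depends on the script).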