This project tries to predict the music genre of a given music video clip.
The goal of this project is to predict the music genre from the vectorized music videos of the Music4All-Onion dataset. Simpler approaches such as k-Nearest Neighbor and a simple neural network were tried first in order to verify the correct usage of the dataset.
The main idea is to make a multi-label genre prediction with transfer learning using a ResNet50 model. To fit the input shape, the vectors are reshaped to tensors of shape (64, 64, 3).
This project is based on the paper Moscati, Marta & Deldjoo, Yashar & Schedl, Markus & Parada-Cabaleiro, Emilia & Zangerle, Eva. (2022). Music4All-Onion — A Large-Scale Multi-faceted Content-Centric Music Recommendation Dataset.
To get a local copy up and running, follow these simple steps.
-
Python Env Setup
-
Windows
-
Install Anaconda
-
Open Anaconda Prompt and type:
conda update -n base -c defaults conda
conda create --name Python3.10 python=3.10
conda activate Python3.10
conda install pandas matplotlib numpy
pip install tensorflow-datasets
python -m pip install "tensorflow<2.11"
-
If a GPU is available, it should be listed with the following command:
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
-
-
Ubuntu
-
Install Anaconda as described in its installation guide
-
Open terminal and type:
conda create --name Python3.10 python=3.10
conda activate Python3.10
conda install pandas matplotlib numpy
conda install -c conda-forge cudatoolkit=11.2 cudnn=8.1.0
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/
python3 -m pip install tensorflow tensorflow-datasets
-
If a GPU is available, it should be listed with the following command:
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
-
-
-
Clone the repo
git clone https://github.com/TristanBandat/genre-prediction-video.git
-
Download the data files:
Note: The files need to be placed in a folder called data/.
-
Create dataset
cd datasets/Music4AllOnionDC/
tfds build Music4AllOnionDC.py --data_dir [CWD]/data/
-
Version 1.0.1
INCP vectors with shape (4096,) and labels with shape (685,).
-
Version 2.0.0
ResNet vectors with shape (4096,) and labels with shape (685,).
-
Version 3.0.0
VGG19 vectors with shape (8192,) and labels with shape (685,).
-
Version 3.0.1
VGG19 vectors with shape (4096,) and labels with shape (685,). The compression was achieved by taking the mean of every 2 mean values and the maximum of every 2 max values for each data point.
-
Version 3.0.2
VGG19 vectors with shape (64, 64, 3) and labels with shape (685,). The datapoints of version 3.0.1 were reshaped to (64, 64) and then repeated 3 times to fit the ResNet50 input shape.
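The two VGG19 transformations described for versions 3.0.1 and 3.0.2 can be sketched with NumPy as follows. This is only an illustration, not the code used to build the dataset: the function names `compress_pairs` and `to_image` are hypothetical, and the sketch assumes that the first half of the 8192-value vector holds the mean features and the second half the max features, which the description above does not state.

```python
import numpy as np


def compress_pairs(vec):
    """Compress an (8192,) vector to (4096,).

    ASSUMPTION: the first half holds mean features and the second half
    max features. The real layout of the VGG19 vectors may differ.
    """
    half = vec.shape[0] // 2
    # Mean of every 2 consecutive mean values
    means = vec[:half].reshape(-1, 2).mean(axis=1)
    # Maximum of every 2 consecutive max values
    maxes = vec[half:].reshape(-1, 2).max(axis=1)
    return np.concatenate([means, maxes])


def to_image(vec):
    """Reshape a (4096,) vector to (64, 64) and repeat it 3 times as channels."""
    plane = vec.reshape(64, 64)
    return np.repeat(plane[:, :, None], 3, axis=2)


# Dummy data standing in for one VGG19 data point
vec_8192 = np.arange(8192, dtype=np.float32)
vec_4096 = compress_pairs(vec_8192)
image = to_image(vec_4096)
print(vec_4096.shape, image.shape)
```

All three channels of `image` hold the same (64, 64) plane, so no information is added. The repetition only satisfies the three-channel input that ResNet50 expects.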
-
k-Nearest Neighbor
We tried different values of k; k=3 gave the best result.
-
Test accuracy (INCP vectors): 12.97%
-
Test accuracy (ResNet vectors): 13.26%
-
Test accuracy (VGG19 vectors): 12.49%
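For illustration, a k-Nearest Neighbor prediction with k=3 on multi-hot labels could look like the sketch below. It is a minimal NumPy version on toy data, not the implementation used for the results above. The function name `knn_predict`, the Euclidean distance, and the majority vote per label are assumptions.

```python
import numpy as np


def knn_predict(train_x, train_y, query_x, k=3):
    """Predict multi-hot labels by majority vote among the k nearest neighbours."""
    # Pairwise squared Euclidean distances, shape (n_query, n_train).
    # This broadcast is only suitable for small data; it needs
    # n_query * n_train * n_features values in memory.
    dists = ((query_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=-1)
    # Indices of the k closest training points for every query
    nearest = np.argsort(dists, axis=1)[:, :k]
    # Fraction of the k neighbours that carry each label
    votes = train_y[nearest].mean(axis=1)
    # A label is predicted if more than half of the neighbours have it
    return (votes > 0.5).astype(train_y.dtype)


# Toy data: two clusters with two possible labels (the real labels have shape (685,))
train_x = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=np.float32)
train_y = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]])
query_x = np.array([[0.2, 0.2], [5.5, 5.5]], dtype=np.float32)

pred = knn_predict(train_x, train_y, query_x, k=3)
print(pred)
```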
-
-
Decision Tree
Because of the huge computational power needed for this model,
the test set was used for both training and testing.
-
Test accuracy (INCP vectors): 6.91%
-
Test accuracy (ResNet vectors): 6.52%
-
Test accuracy (VGG19 vectors): 7.95%
-
-
Simple Neural Network
Very simple NN with one big hidden layer.
-
Test accuracy (INCP vectors): 17.07%
-
Test accuracy (ResNet vectors): 16.09%
-
Test accuracy (VGG19 vectors): 14.49%
-
-
Deep Neural Network
Same as the simple NN but with 20 smaller hidden layers.
-
Test accuracy (INCP vectors): 7.35%
-
Test accuracy (ResNet vectors): 9.14%
-
Test accuracy (VGG19 vectors): 7.35%
-
LSTM
A model with 2 LSTM layers and 2 dense hidden layers.
-
Test accuracy (INCP vectors): 7.35%
-
Test accuracy (ResNet vectors): 7.35%
-
Test accuracy (VGG19 vectors): 7.35%
-
-
ResNet50 with Transfer Learning
The ResNet50 with an additional output layer to fit the output shape.
-
Test accuracy (VGG19 vectors): 10.47%
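The ResNet50 setup described above could be sketched in Keras roughly as follows. This is not the project's actual model code. The function name `build_genre_model`, the frozen backbone, the global average pooling, the sigmoid output, and the loss and optimizer are all assumptions chosen to match a multi-label prediction over 685 genres with (64, 64, 3) inputs.

```python
import tensorflow as tf

NUM_GENRES = 685  # label shape given for the dataset versions


def build_genre_model(weights="imagenet"):
    """ResNet50 backbone plus one new output layer (hypothetical sketch)."""
    base = tf.keras.applications.ResNet50(
        include_top=False,        # drop the original 1000-class ImageNet head
        weights=weights,          # pretrained weights for transfer learning
        input_shape=(64, 64, 3),  # shape of the version 3.0.2 data points
        pooling="avg",            # global average pooling to a flat feature vector
    )
    # ASSUMPTION: the backbone is frozen and only the new layer is trained
    base.trainable = False
    # Sigmoid instead of softmax, so each genre is predicted independently
    outputs = tf.keras.layers.Dense(NUM_GENRES, activation="sigmoid")(base.output)
    model = tf.keras.Model(inputs=base.input, outputs=outputs)
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["binary_accuracy"],
    )
    return model
```

Calling `build_genre_model()` downloads the ImageNet weights on first use. Passing `weights=None` builds the same architecture with random initialization, which is useful for checking shapes without a download.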
See the open issues for a list of proposed features (and known issues).
Contributions are what make the open source community such an amazing place to learn, inspire, and create.
Any contributions you make are greatly appreciated.
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Note: The project was done as part of an AI Bachelor course and may not be maintained very well!
Distributed under the GPL-3.0 License. See LICENSE for more information.
Tristan Bandat - @TBandat - tristan.bandat@gmail.com
Philipp Meingaßner - p.meingassner@gmail.com
Project Link: https://github.com/TristanBandat/genre-prediction-video