This repository includes an implementation of sentiment analysis on GAN-generated videos. The implementation takes inspiration from both Matt Siegelman's (2019) "Deep Music Visualizer" repo and Fabio Carrara's "Visual Sentiment Analysis" repo.
To run this project, first generate a video from an input audio file, then run the sentiment analysis to get the sentiment of the generated video.
- Install the requirements in `music-visualizer/requirements.txt` (with Python 3.6), and install PyTorch.
- `cd music-visualizer`
- Generate a video:

  ```
  python visualize.py --resolution 128 --duration 60 --song song_name.mp3 --pitch_sensitivity 220 --tempo_sensitivity 0.25 --mood mood
  ```

  Note that we limit the settings to a resolution of 128 and a duration of 1 minute. Pitch sensitivity and tempo sensitivity are the two parameters we experiment with. The audio file name and the mood must also be specified. For example, to generate a video for `calm.mp3` with `pitch_sensitivity = 220` and `tempo_sensitivity = 0.25`, you can run:

  ```
  python visualize.py --resolution 128 --duration 60 --song calm.mp3 --pitch_sensitivity 220 --tempo_sensitivity 0.25 --mood calm
  ```
This module saves the generated videos to the path expected by the sentiment analysis module.
Note: This part has to be run with Python 3.6 or lower, so we executed it locally instead of on Google Colab.
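Since pitch sensitivity and tempo sensitivity are the two parameters being experimented with, runs over several settings can be scripted. A minimal sketch of such a sweep, assuming the `visualize.py` flags shown above; the `build_command` helper and the grid values are illustrative, not part of the repo:

```python
# Sketch: sweep visualize.py over several parameter settings.
# import subprocess  # uncomment the subprocess.run line to actually launch runs

def build_command(song, mood, pitch_sensitivity, tempo_sensitivity):
    """Build the visualize.py command line for one parameter setting."""
    return [
        "python", "visualize.py",
        "--resolution", "128",
        "--duration", "60",
        "--song", song,
        "--pitch_sensitivity", str(pitch_sensitivity),
        "--tempo_sensitivity", str(tempo_sensitivity),
        "--mood", mood,
    ]

for pitch in (220, 260):          # illustrative values
    for tempo in (0.25, 0.5):     # illustrative values
        cmd = build_command("calm.mp3", "calm", pitch, tempo)
        print(" ".join(cmd))
        # subprocess.run(cmd, check=True)
```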
Once the videos are generated, we perform sentiment analysis on each video by running image sentiment analysis on all of its frames.
- `cd sentiment-analysis`
- Download the pretrained models:

  ```
  ./download_models.sh
  ```

- Extract the frames from the video using `frame_extract.py`. For example:

  ```
  python frame_extract.py calm 220,25
  ```

  The first positional command line argument specifies the mood (`calm`), and the second specifies the two parameter settings, separated by a comma.
- Run sentiment analysis prediction on the extracted frames using `predict.py`. For example:

  ```
  python predict.py calm_220_25.txt --model vgg19_finetuned_all --batch-size 64 > calm_220_25.csv
  ```

  The input `.txt` file generated by `frame_extract.py` should contain the paths to the extracted frames. You must specify an output CSV file name; the naming convention for both files is `{mood}_{pitch_sensitivity}_{tempo_sensitivity}`. Note that this step should be run on a GPU, which can be done by running all the commands on Google Colab. An example of how to run the whole pipeline on GPU is in `project_pipeline.ipynb`.
- Run sentiment analysis prediction of the video using `parse_predictions.py`. For example:

  ```
  python parse_predictions.py calm_220_25.csv
  ```

  The possible predictions are 'Negative', 'Neutral', and 'Positive'.
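The `{mood}_{pitch_sensitivity}_{tempo_sensitivity}` naming convention ties the steps together. A small helper sketch (hypothetical, not part of the repo) that maps the `frame_extract.py`-style arguments to the file names used by the later steps:

```python
def run_names(mood, params):
    """Map a mood and a 'pitch,tempo' parameter string (as passed to
    frame_extract.py, e.g. '220,25') to the frame-list .txt and
    prediction .csv file names expected by predict.py and
    parse_predictions.py."""
    pitch, tempo = params.split(",")
    stem = f"{mood}_{pitch}_{tempo}"
    return stem + ".txt", stem + ".csv"

# For the example above:
print(run_names("calm", "220,25"))  # -> ('calm_220_25.txt', 'calm_220_25.csv')
```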
Note: This part has to be run on a GPU, so we took the generated videos from part 1 and executed the code on Google Colab.
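The final step reduces per-frame predictions to one label for the whole video. A minimal sketch of one plausible aggregation, a majority vote over per-frame argmax labels; the actual logic inside `parse_predictions.py` may differ:

```python
from collections import Counter

LABELS = ("Negative", "Neutral", "Positive")

def video_sentiment(rows):
    """Majority vote over per-frame class probabilities.

    `rows` is an iterable of (negative, neutral, positive) probability
    triples, one per extracted frame.
    """
    votes = Counter()
    for probs in rows:
        # Each frame votes for its most probable class.
        best = max(range(len(LABELS)), key=lambda i: probs[i])
        votes[LABELS[best]] += 1
    label, _ = votes.most_common(1)[0]
    return label

# Reading the CSV produced by predict.py would look roughly like:
# import csv
# with open("calm_220_25.csv") as f:
#     rows = [tuple(map(float, r[-3:])) for r in csv.reader(f)]
# print(video_sentiment(rows))
```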