Human Emotion Audio Recognition Tool
This project was optimized for portability, so it contains all the necessary code and the raw dataset. To begin, create a virtual environment and install all the required packages within it.
Python 3 or later is required, and utilizing CUDA cores is highly recommended.
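For example, a virtual environment can be created with Python's built-in venv module (a minimal sketch assuming a Unix-like shell; the .venv name is just a suggestion):
python -m venv .venv
source .venv/bin/activate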
All dependencies are listed in the requirements.txt files. There are two of them:
- In the /src folder, if you only want to run the main code.
- In the /notebooks folder, if you want to work with the notebooks.
To install the packages after initializing the virtual environment, run the command:
pip install -r requirements.txt
Unfortunately, you'll have to install the PyTorch module manually. To do that, visit the PyTorch site, choose the configuration for your system, and install the necessary packages. For this project, only torch and torchaudio are required.
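For reference, a plain installation looks like the command below; for a specific CUDA (or CPU-only) build, use the exact command generated by the selector on the PyTorch site instead.
pip install torch torchaudio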
For data augmentation, the audiomentations package must be installed together with its “extras”, which are needed to run the preprocess.py script properly:
pip install audiomentations[extras]
Finally, you will need an account on the Weights & Biases website to track the logs during the experiments. After running any script, you’ll be prompted for your API key and shown instructions.
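If you prefer to authenticate ahead of time instead of waiting for the prompt, you can log in with the standard Weights & Biases CLI command:
wandb login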
As mentioned above, the project was optimized for portability and includes a basic dataset, so you can launch the final system script immediately.
Other experiments require augmentation to reproduce the thesis results. If augmentation is not necessary and the default dataset is sufficient, replace any augmented-features paths with split-features paths. Keep in mind, though, that preprocessing is still required; only the augmentation step, which takes a long time, can be skipped this way.
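Purely as an illustration (the real scripts may store these paths under different variable names and directories), the change amounts to something like:

```python
from pathlib import Path

# Hypothetical example of switching the feature directory in a script.
# FEATURES_DIR = Path("data/augmented-features")  # thesis experiments (augmentation required)
FEATURES_DIR = Path("data/split-features")         # default dataset, augmentation skipped
```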
The following scripts are available (example invocations follow the list):
- preprocess.py: Preprocesses the dataset and creates all the files needed to train the models.
- exp_* scripts: Each corresponds to a specific experiment (names are self-explanatory).
- final_system.py: Runs the architecture proposed in the thesis.
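Assuming the scripts live in the /src folder and the virtual environment is active, they can be launched directly, for example:
python src/preprocess.py
python src/final_system.py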
There is currently no command-line parameterization. However, the code was designed to be easily adapted via method arguments.
The only inconvenience is excluding specific audio features that you don't want the model to process. You can do that by specifying their exact names, or, fastest of all, by opening the src/audio_processing/features/extractor.py file and commenting out the unwanted features listed in the get_features_dict method.
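As a rough illustration only, a hypothetical get_features_dict could look like the sketch below; the actual feature names and extraction functions in the project may differ.

```python
# Hypothetical sketch of src/audio_processing/features/extractor.py.
# The real feature names and extraction functions in the project may differ.

def extract_mfcc(signal): ...
def extract_chroma(signal): ...
def extract_zcr(signal): ...

def get_features_dict():
    # Comment out an entry to exclude that feature from the model's input.
    return {
        "mfcc": extract_mfcc,
        "chroma": extract_chroma,
        # "zcr": extract_zcr,  # excluded: zero-crossing rate will not be processed
    }
```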