
HEART ❤️

Human Emotion Audio Recognition Tool

Instructions

This project is optimized for portability, so it contains all the necessary code and the raw dataset. To begin, create a virtual environment and install all the required packages within it.

Python 3 or later is required, and using CUDA is highly recommended.
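
For example, on Linux or macOS a virtual environment can be created and activated like this (the .venv directory name is just a common convention, not something the project mandates):

python3 -m venv .venv
source .venv/bin/activate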

Required Python modules

All dependencies are listed in the requirements.txt files. There are two of them:

  1. In the /src folder, if you only want to run the main code.
  2. In the /notebooks folder, if you want to work with the notebooks.

To install the packages after initializing the virtual environment, run:

pip install -r requirements.txt
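
For example, assuming the commands are run from the repository root, you can point pip at the file matching your use case:

pip install -r src/requirements.txt
pip install -r notebooks/requirements.txt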

Note

Unfortunately, you'll have to install the PyTorch module manually. To do that, visit the PyTorch website, choose the configuration for your system, and install the necessary packages. For this project, only torch and torchaudio are required.
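
As an illustration only (always copy the exact command from the PyTorch site, since the index URLs change between releases), a CUDA 12.1 build could be installed with:

pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu121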

For data augmentation, it is crucial to install audiomentations together with its “extras”, which are needed to run the preprocess.py script properly. To do so, run:

pip install audiomentations[extras]

Finally, you will need an account on the Weights & Biases website to track the logs during the experiments. After running any script, you'll be prompted for your API key and shown instructions.
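
If you prefer to authenticate up front, the wandb CLI (installed alongside the wandb package) can store your API key ahead of time:

wandb login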

Scripts

As mentioned above, the project is optimized for portability and includes a basic dataset, so you can launch the final system script immediately.

The other experiments require augmentation to reproduce the thesis results. If augmentation is not necessary and the default dataset is sufficient, replace any augmented-features paths with split-features.

Keep in mind that preprocessing is still required; it is only the augmentation step (which takes a long time) that can be skipped this way.
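
For illustration, the swap might look like this inside an experiment script (the variable name and surrounding code are hypothetical; only the augmented-features and split-features names come from the project, so check the actual scripts for the real paths):

# Hypothetical sketch; the real variable names in the experiment scripts differ.
features_path = "augmented-features"  # full pipeline (requires augmentation)
features_path = "split-features"      # default dataset (augmentation skipped)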

Description

  • preprocess.py: Preprocesses the dataset and creates all the files needed to train the models.

  • exp_* scripts: Each corresponds to a specific experiment (names are self-explanatory).

  • final_system.py: Runs the architecture proposed in the thesis.
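
A typical end-to-end run (assuming the scripts are launched from the src folder; adjust the paths if you run them from elsewhere) might then be:

python preprocess.py
python final_system.py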

Note

There is currently no command-line parameterization. However, the code was designed to be easily adapted via method arguments.

The only inconvenience is excluding specific audio features that you don't want the model to process. You can do that by typing their exact names, or, fastest of all, by opening the src/audio_processing/features/extractor.py file and commenting out the unwanted features listed in the get_features_dict method, as sketched below.
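
The following is a hypothetical sketch of that approach; the feature names and extractor methods shown here are purely illustrative, and the real get_features_dict lists the project's actual features:

# Hypothetical sketch of src/audio_processing/features/extractor.py;
# commenting out an entry excludes that feature from processing.
def get_features_dict(self):
    return {
        "mfcc": self.extract_mfcc,
        # "chroma": self.extract_chroma,  # commented out: excluded feature
        "zcr": self.extract_zcr,
    }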
