This deep learning project classifies audio using TensorFlow and Kapre (Keras Audio Preprocessors).
The development environment can be set up as follows:
conda create --name audio python=3.7
conda activate audio
git clone git@github.com:Patricknmaina/Deep-Audio-Classifier.git
cd Deep-Audio-Classifier/
pip install -r requirements.txt
To access the Jupyter environment, run the following command:
jupyter-notebook
audio_clean.py is used to downsample the audio and remove dead space (near-silent segments) using threshold detection on a signal envelope.
The script is run with the following command:
python scripts/audio_clean.py
To downsample the audio wav files and split them into clips of delta_time seconds, uncomment the split_wavs function call. This creates a clean directory containing the downsampled mono wav clips (see the envelope sketch below).
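For reference, here is a minimal sketch of the envelope idea. The paths and threshold are hypothetical, and the actual logic in audio_clean.py may differ:

import numpy as np
import pandas as pd
from scipy.io import wavfile

def envelope(signal, rate, threshold=0.0005):
    # Rolling-max amplitude envelope over ~100 ms windows; keep only
    # the samples where the envelope rises above the threshold.
    y = pd.Series(signal).apply(np.abs)
    y_mean = y.rolling(window=int(rate / 10), min_periods=1, center=True).max()
    return np.array(y_mean > threshold)

rate, wav = wavfile.read('wavfiles/example.wav')     # hypothetical mono input
wav = wav.astype(np.float32) / 32768.0               # assumes 16-bit PCM
mask = envelope(wav, rate)
wavfile.write('clean/example.wav', rate, wav[mask])  # dead space removed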
In the train.py file, set model_type to conv1d, conv2d, or lstm in the main function, as sketched below. The delta time and sample rate must match the values used in audio_clean.py.
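As a hedged illustration of the model_type switch, the stand-in architectures below are minimal placeholders with hypothetical input framing, not the actual models defined in this repository:

from tensorflow.keras import layers, models

def build_model(model_type, n_classes=10, input_shape=(16000, 1)):
    # Minimal stand-ins illustrating the switch; the real conv1d /
    # conv2d / lstm models in this repo are more elaborate.
    inp = layers.Input(shape=input_shape)
    if model_type == 'conv1d':
        x = layers.Conv1D(16, 9, activation='relu')(inp)
        x = layers.GlobalMaxPooling1D()(x)
    elif model_type == 'lstm':
        x = layers.Reshape((100, 160))(inp)     # hypothetical waveform framing
        x = layers.LSTM(64)(x)
    elif model_type == 'conv2d':
        x = layers.Reshape((125, 128, 1))(inp)  # hypothetical 2-D framing
        x = layers.Conv2D(16, 3, activation='relu')(x)
        x = layers.GlobalMaxPooling2D()(x)
    else:
        raise ValueError(model_type)
    out = layers.Dense(n_classes, activation='softmax')(x)
    return models.Model(inp, out)

model = build_model('conv2d')  # swap in 'conv1d' or 'lstm'
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])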
Then run the code as follows:
python scripts/train.py
The training process looks as follows:
After training the three models, the training logs are saved and visualized in notebooks/Model Plots.ipynb.
The confusion matrix and the Receiver Operating Characteristic (ROC) curve are generated in notebooks/Confusion Matrix and ROC.ipynb.
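As a rough illustration of what those notebooks compute, here is a sketch using toy labels and scores standing in for real model outputs:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, roc_curve, auc

# Toy binary labels and scores; the notebooks use real test-set predictions
y_true = np.array([0, 1, 1, 0, 1, 0, 1, 0])
y_score = np.array([0.2, 0.9, 0.7, 0.3, 0.8, 0.4, 0.6, 0.1])
y_pred = (y_score > 0.5).astype(int)

print(confusion_matrix(y_true, y_pred))
fpr, tpr, _ = roc_curve(y_true, y_score)
print('AUC =', auc(fpr, tpr))

plt.plot(fpr, tpr)
plt.xlabel('False positive rate')
plt.ylabel('True positive rate')
plt.show()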
In predict.py, set model_type to conv1d, conv2d, or lstm in the main function to evaluate the trained models on the test dataset.
Run the following command:
python scripts/predict.py
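A minimal sketch of a windowed prediction loop, with hypothetical file paths; predict.py's actual logic may differ:

import numpy as np
from scipy.io import wavfile
from tensorflow.keras.models import load_model

# Hypothetical paths; if the saved model contains Kapre layers,
# pass them via custom_objects when loading.
model = load_model('models/conv2d.h5')
rate, wav = wavfile.read('clean/example.wav')

dt = 1.0                       # must match the value used in audio_clean.py
step = int(rate * dt)
clips = [wav[i:i + step].reshape(step, 1)
         for i in range(0, wav.shape[0] - step + 1, step)]
probs = model.predict(np.array(clips, dtype=np.float32))
print('predicted class:', np.argmax(probs.mean(axis=0)))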
The prediction process is shown below:
This project was purely experimental. I will be experimenting with newer wav files (possibly a different audio corpus), as well as additional model architectures, so stay tuned!
Kapre -> for computing audio transforms from the time domain to the frequency domain
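Because Kapre implements these transforms as Keras layers, the time-to-frequency conversion runs inside the model itself (on GPU if available). A minimal sketch, assuming Kapre 0.3's STFT/Magnitude layers and the sample rate and delta time used above:

from tensorflow.keras import layers, models
from kapre.time_frequency import STFT, Magnitude, MagnitudeToDecibel

sr, dt = 16000, 1.0                          # assumed sample rate and delta time
inp = layers.Input(shape=(int(sr * dt), 1))  # raw waveform: (time, channels)
x = STFT(n_fft=512, hop_length=256)(inp)     # complex spectrogram
x = Magnitude()(x)                           # magnitude spectrogram
x = MagnitudeToDecibel()(x)                  # log scale (dB)
spectrogram_frontend = models.Model(inp, x)
spectrogram_frontend.summary()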