The aim of this project is to create an image classification model to detect yoga positions from videos.
A possible application is extracting a list of all the positions performed during a yoga class video, whether from YouTube or stored on your computer; you can also record your practice through your webcam to get insights about it and keep track of your improvements.
An asanas database was also created to back up the model with information about each position.
- Numpy
- Pandas
- os
- cv2
- requests, BeautifulSoup
- sklearn
- tensorflow, keras, pickle
The following information was obtained for each position the model can detect:
- Asana Sanskrit Name
- Asana English Name
- Difficulty level
- Pose Type
- Instructions
- Drishti (where to focus the sight)
- Cautions
- Benefits
This information was collected by web scraping the site yogapedia.com.
It is all stored in the dataframe data/asanas_df.csv.
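The scraping step can be sketched with BeautifulSoup as below. The HTML snippet and the class names in the selectors are illustrative assumptions, not the real yogapedia.com markup:

```python
from bs4 import BeautifulSoup

# Illustrative HTML standing in for a pose page; the real yogapedia
# markup differs, so the tags and class names here are assumptions.
html = """
<div class="pose">
  <h1>Tadasana</h1>
  <p class="english">Mountain Pose</p>
  <span class="level">Beginner</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Extract one record; in the project, one such record per asana
# would be appended to the dataframe saved as data/asanas_df.csv.
record = {
    "sanskrit_name": soup.find("h1").get_text(strip=True),
    "english_name": soup.find("p", class_="english").get_text(strip=True),
    "difficulty": soup.find("span", class_="level").get_text(strip=True),
}
```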
The images used to train the model were obtained starting from a Yoga Asanas classification dataset on Kaggle. The dataset was slightly modified in order to:
- delete duplicates
- delete images of poorly performed positions
- delete images containing text, or with too many people or objects in the background
- correctly categorize all images
- add new data so that roughly the same number of images was available for each position.
At the end of this process, between 40 and 50 images were available for each of the 84 positions.
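The duplicate-removal step can be sketched by hashing each image file's bytes, so that only the first copy of any exact duplicate is kept. The function below is a minimal illustration, not the exact cleaning script used:

```python
import hashlib

def find_duplicates(files):
    """files: iterable of (name, raw_bytes) pairs.
    Returns the names of files that are exact byte-for-byte duplicates
    of an earlier file, keeping the first occurrence."""
    seen = {}
    duplicates = []
    for name, data in files:
        digest = hashlib.md5(data).hexdigest()
        if digest in seen:
            duplicates.append(name)
        else:
            seen[digest] = name
    return duplicates
```

Near-duplicates (resized or re-encoded copies) would need a perceptual hash instead, since their raw bytes differ.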
The key considerations include choosing between Convolutional Neural Networks (CNN) and DenseNet architectures. Additionally, various strategies such as data augmentation, early stopping, learning rate decay, dropout, and experimenting with different parameters are discussed.
CNNs are a natural choice for image classification tasks due to their ability to automatically learn hierarchical features from image data. CNNs consist of convolutional layers that scan the input image with small filters, enabling the model to capture local patterns.
DenseNet, or Densely Connected Convolutional Networks, focuses on connecting each layer to every other layer in a feedforward fashion. This architecture facilitates feature reuse and enhances the flow of information through the network. DenseNet is beneficial when dealing with limited data or when a highly expressive model is needed.
- Data Augmentation: is employed to artificially expand the dataset, helping the model generalize better to unseen data. Techniques such as rotation, flipping, and zooming are applied to augment the training set.
- Early Stopping: prevents overfitting by monitoring the model's performance on a validation set. Training is halted when there is no improvement in the validation accuracy, preventing the model from learning noise in the training data.
- Learning Rate Decay: Learning rate decay involves systematically reducing the learning rate during training. This can help the model converge faster in the beginning and fine-tune more precisely towards the end of training.
- Dropout: is a regularization technique where random neurons are dropped during training, preventing the model from relying too heavily on specific features. This enhances generalization and robustness.
- Experimenting with Different Parameters: The following parameters are systematically varied to optimize model performance:
- Number of Epochs: The number of times the entire training dataset is passed through the neural network. This parameter is adjusted to find the right balance between underfitting and overfitting.
- Batch Size: The number of training examples utilized in one iteration. A smaller batch size may provide regularization effects and reduce memory requirements.
- Optimizer: Different optimization algorithms, such as Adam, SGD, or RMSprop, are tested to identify the one that works best for the specific image classification task.
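Two of the strategies above, learning rate decay and early stopping, can be sketched framework-free as below. In the actual project these would be Keras callbacks (e.g. `EarlyStopping`, `LearningRateScheduler`); the names and default values here are illustrative:

```python
def decayed_lr(initial_lr, decay_rate, epoch):
    """Exponential learning rate decay: the rate shrinks by a constant
    factor each epoch, allowing fast early progress and fine late tuning."""
    return initial_lr * (decay_rate ** epoch)

class EarlyStopper:
    """Stop training when validation accuracy fails to improve
    for `patience` consecutive epochs."""
    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("-inf")
        self.stale = 0

    def should_stop(self, val_accuracy):
        if val_accuracy > self.best:
            self.best = val_accuracy
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience
```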
Below are reported the accuracy results for each combination of batch size and optimizer, for both the CNN model and the DenseNet model.
In the end only 30 epochs were used, since the Early Stopping technique halted the process before the thirtieth iteration in almost every case.
The final choice was the DenseNet model with 30 epochs, a batch size of 16 and the Adam optimizer.
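The parameter sweep described above amounts to a simple grid search. The sketch below shows the selection logic only; `train_and_evaluate` is a hypothetical stand-in for an actual training run, and the batch sizes and optimizer names are examples of the kind of grid used:

```python
from itertools import product

def train_and_evaluate(batch_size, optimizer):
    """Hypothetical stand-in for a real training run; it would train the
    model with these settings and return its validation accuracy."""
    raise NotImplementedError

def pick_best_config(results):
    """results maps (batch_size, optimizer) -> validation accuracy;
    returns the configuration with the highest accuracy."""
    return max(results, key=results.get)

# The sweep itself would look like:
# results = {}
# for batch_size, optimizer in product([16, 32, 64], ["adam", "sgd", "rmsprop"]):
#     results[(batch_size, optimizer)] = train_and_evaluate(batch_size, optimizer)
# best = pick_best_config(results)
```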
I tested the model with a short yoga sequence video; some results that highlight possible next steps for improving the model are reported below.
This example shows how the model can get confused when no specific position is being performed and the body is transitioning from one position to the next.
In this case the position tadasana was guessed incorrectly on one occasion, and this may have happened because the model focused on the shape of the cloud instead of the body.
First of all, the training data should be expanded so that at least one hundred images are available for each position. It may also be useful to find a way to remove the background both from the training images and from the video screenshots the model is applied to.
After this, the different combinations of model parameters should be tested again in order to find the best ones.
https://www.canva.com/design/DAF2GWbsSOI/9EPb6kACrElXApp04keUYA/edit

