Unsupervised learnig: Clustering on handwritten digits

The goal of the assignment was to perform classification over the Semeion Handwritten Digit data set using:

Latent Class Analysis (LCA);
Mean Shift;
Normalized cut. for the mean shift and normalized cut, we assume that the images are vectors in a 256-dimensional Euclidean space. The number of clusters k varies from 5 to 15, for LCA and normalized-cut, while for mean shift vary the kernel width.

LCA models can be referred to as finite mixtures of Bernoulli distributions so we coded the EM algorithm supposing that given a Bernoulli mixture model, the goal is to maximize the likelihood function with respect to the parameters. Because it is not guaranteed that the Mean shift will converge in a high dimensional space like in our case, we need to apply Principal Component Analysis (PCA) to our dataset before actually running the mean shift algorithm. For both PCA and Mean shift, we used the functions present in scikit-learn. Also for the spectral clustering algorithm we used the function provided by sklearn.

The three models are compared by rand index and by plotting each computed cluster in grey scale.

You can find more in the "report.pdf" file!

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
clustering		clustering
analysis.R		analysis.R
readme.md		readme.md
report.pdf		report.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Unsupervised learnig: Clustering on handwritten digits

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Unsupervised learnig: Clustering on handwritten digits

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages