I am aware there are still a few inconsistencies in notation. The aim of this weekend project was to provide a straightforward example of a PCA use case, and to see how far we could get in a single weekend. Markdown has been cleaned up, but no code changes have been made since Sunday, March 5th.
UPDATE August 25th: ChirPCA INTERACTIVE NOTEBOOK with much more interesting BirdCLEF data.
INTERACTIVE DERIVATION NOTEBOOK (with better formatted markdown)
The goal of principal component analysis (PCA) is to take a set of data in a high-dimensional feature space and re-express it in a lower-dimensional space while retaining as much of the original variance as possible.
Through PCA we are creating a matrix that will project our original data onto a new set of orthogonal axes, the principal components.
For now, it is only important to know that each column in $\mathbf{U}$ is a principal component.
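As a concrete sketch of what this looks like numerically (my own illustration on synthetic Gaussian data, not the notebook's code), the matrix $\mathbf{U}$ can be obtained by eigendecomposing the covariance of centered data; with standardized features this would be the correlation matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))          # 200 samples, 5 features (synthetic)
Xc = X - X.mean(axis=0)                # center each feature

# Eigendecomposition of the sample covariance matrix
C = Xc.T @ Xc / (len(Xc) - 1)
eigvals, U = np.linalg.eigh(C)         # eigh returns eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]      # re-sort so the largest variance comes first
eigvals, U = eigvals[order], U[:, order]

# Each column of U is a principal component: unit length, mutually orthogonal
assert np.allclose(U.T @ U, np.eye(5))
```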
For the block matrix
$$\mathbf{U} = \begin{bmatrix}
\overbrace{\mathbf{U}_{a}}^{\text{PC's}} & \overbrace{\mathbf{U}_{b}}^{\text{excluded PC's}}
\end{bmatrix}
$$
We can create a projection of our data onto the retained principal components, $\mathbf{Z} = \mathbf{X}\mathbf{U}_a$.
We posit that our transformation matrix, $\mathbf{U}$, has orthonormal columns, so that $\mathbf{U}^\top \mathbf{U} = \mathbf{I}$.
Under a well-suited PCA use case, the first few principal components capture most of the variance in the data, so the components in $\mathbf{U}_b$ can be discarded with little loss of information.
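The block split and the projection can be sketched as follows (again an illustration on synthetic data; `k` is an assumed cutoff, not a value from the notebook):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
Xc = X - X.mean(axis=0)

# Principal components via eigendecomposition of the covariance matrix
eigvals, U = np.linalg.eigh(Xc.T @ Xc / (len(Xc) - 1))
order = np.argsort(eigvals)[::-1]
eigvals, U = eigvals[order], U[:, order]

k = 2                          # number of retained components (assumed)
U_a, U_b = U[:, :k], U[:, k:]  # retained vs. excluded PCs

Z = Xc @ U_a                   # low-dimensional projection, shape (200, 2)
X_hat = Z @ U_a.T              # reconstruction back in the original 5-D space
```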
PCA can be achieved through the eigendecomposition of the feature correlation matrix, and this maximization of explained variance is equivalent to minimizing the mean squared reconstruction error (MSE). The proof is as follows:
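Before walking through the proof, the equivalence can be checked numerically (a sketch under my own synthetic-data assumptions): the MSE of reconstructing from the top-$k$ eigenvectors equals the variance carried by the discarded components, so maximizing retained variance and minimizing MSE are the same problem.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
Xc = X - X.mean(axis=0)
n = len(Xc)

eigvals, U = np.linalg.eigh(Xc.T @ Xc / (n - 1))
order = np.argsort(eigvals)[::-1]
eigvals, U = eigvals[order], U[:, order]

k = 2
U_a = U[:, :k]
X_hat = Xc @ U_a @ U_a.T       # reconstruct from the top-k PCs

mse = np.mean(np.sum((Xc - X_hat) ** 2, axis=1))
# Reconstruction error == discarded eigenvalues; the (n-1)/n factor is just
# the mismatch between the covariance's n-1 denominator and the MSE's n.
assert np.isclose(mse, eigvals[k:].sum() * (n - 1) / n)
```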
Start with a vector $\mathbf{x}_i$ and its reconstruction from the retained components, $\hat{\mathbf{x}}_i = \mathbf{U}_a \mathbf{U}_a^\top \mathbf{x}_i$, with squared error $\lVert \mathbf{x}_i - \hat{\mathbf{x}}_i \rVert^2$.
Over all $n$ terms, the mean squared error is then defined as

$$\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left\lVert \mathbf{x}_i - \mathbf{U}_a \mathbf{U}_a^\top \mathbf{x}_i \right\rVert^2 = \frac{1}{n} \sum_{i=1}^{n} \mathbf{x}_i^\top \mathbf{x}_i - \frac{1}{n} \sum_{i=1}^{n} \mathbf{x}_i^\top \mathbf{U}_a \mathbf{U}_a^\top \mathbf{x}_i,$$

where the cross terms collapse because $\mathbf{U}_a^\top \mathbf{U}_a = \mathbf{I}$.
Note that the first term in the equation above does not depend on $\mathbf{U}_a$, so minimizing the MSE is equivalent to maximizing the second term, the squared norm of the projected data.
For a single vector $\mathbf{x}_i$, the quantity to be maximized is $\mathbf{x}_i^\top \mathbf{U}_a \mathbf{U}_a^\top \mathbf{x}_i$.
We know that for a scalar $a$, $a = \operatorname{tr}(a)$. Applying this, together with the cyclic property of the trace, gives $\mathbf{x}_i^\top \mathbf{U}_a \mathbf{U}_a^\top \mathbf{x}_i = \operatorname{tr}\!\left(\mathbf{U}_a^\top \mathbf{x}_i \mathbf{x}_i^\top \mathbf{U}_a\right)$. Summing over all vectors turns the objective into $\operatorname{tr}\!\left(\mathbf{U}_a^\top \mathbf{C}\, \mathbf{U}_a\right)$, where $\mathbf{C}$ is the feature correlation matrix, and this trace is maximized by taking the columns of $\mathbf{U}_a$ to be the top eigenvectors of $\mathbf{C}$.
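The scalar-to-trace step is easy to verify numerically (a quick sketch with an arbitrary orthonormal $\mathbf{U}$ built via QR, not anything from the notebook):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=4)
U = np.linalg.qr(rng.normal(size=(4, 2)))[0]   # 4x2 matrix with orthonormal columns

# A scalar equals its own trace, and the trace is cyclic, so the
# quadratic form can be rewritten with the outer product x x^T inside:
lhs = x @ U @ U.T @ x
rhs = np.trace(U.T @ np.outer(x, x) @ U)
assert np.isclose(lhs, rhs)
```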
Helpful resources:
- Boyd, Stephen, and Lieven Vandenberghe. *Convex Optimization*. Cambridge University Press, 2004.
- https://www.stat.cmu.edu/~cshalizi/uADA/12/lectures/ch18.pdf