Collaborative Filtering using SVD, CUR, and PQ Matrix Decomposition

This project demonstrates collaborative filtering for recommendation systems using three matrix decomposition techniques: Singular Value Decomposition (SVD), CUR decomposition, and PQ decomposition (matrix factorization). Each technique reduces the user-item rating matrix to a small number of latent features, which can improve the scalability and accuracy of a recommendation system.

Methods

SVD Decomposition: Computes the singular values of the rating matrix and measures the reconstruction loss as the number of retained latent factors (k) varies.
CUR Decomposition: Approximates the original matrix using a subset of its actual columns and rows, with a tunable number of latent dimensions.
PQ Matrix Factorization: Learns user and item latent vectors with gradient descent to minimize the rating prediction error.
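
For reference, the three approximations can be written compactly as below. This is a notational sketch rather than the project's own formulation: the symbols (A for the m x n rating matrix, k for the number of latent factors, \Omega for the set of observed ratings) and the use of the Frobenius norm as the loss are assumptions, since the README does not state which loss the script computes.

```latex
% SVD: keep the k largest singular values
A \approx A_k = U_k \Sigma_k V_k^{\top}, \qquad
\| A - A_k \|_F = \sqrt{\textstyle\sum_{i > k} \sigma_i^2}

% CUR: C holds selected columns of A, R holds selected rows of A
A \approx C \, U \, R

% PQ: user factors P (m x k) and item factors Q (n x k), fit on observed ratings
\min_{P, Q} \sum_{(u, i) \in \Omega} \left( r_{ui} - p_u^{\top} q_i \right)^2
```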

Prerequisites

Before running the code, ensure you have the following dependencies installed:

numpy
pandas
matplotlib
seaborn
scikit-learn

To install them, you can run:

pip install numpy pandas matplotlib seaborn scikit-learn

Dataset

The project uses the MovieLens dataset for movie ratings:

Path: Place the dataset file at data/ratings.csv.
Structure: The dataset should include userId, movieId, and rating columns.
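
A minimal sketch of how a file with these three columns could be turned into a user-item matrix is shown below. The function name `load_rating_matrix` is hypothetical, and filling unrated cells with 0 is an assumption; the actual script may handle missing ratings differently. A small in-memory CSV stands in for ratings.csv so the example runs on its own.

```python
import io

import pandas as pd


def load_rating_matrix(csv_source):
    """Read userId/movieId/rating rows and pivot them into a users x movies matrix."""
    ratings = pd.read_csv(csv_source, usecols=["userId", "movieId", "rating"])
    # Rows are users, columns are movies. Unrated cells are filled with 0,
    # which is an assumption made for this sketch.
    matrix = ratings.pivot_table(
        index="userId", columns="movieId", values="rating", aggfunc="mean"
    ).fillna(0.0)
    return matrix


# Tiny stand-in for ratings.csv (illustrative rows, not MovieLens data).
sample_csv = io.StringIO(
    "userId,movieId,rating\n"
    "1,10,4.0\n"
    "1,20,3.5\n"
    "2,10,5.0\n"
    "3,30,2.0\n"
)
rating_matrix = load_rating_matrix(sample_csv)
print(rating_matrix)
```

To use it on the real file, pass the path to ratings.csv instead of the in-memory buffer.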

How to Run

Clone this repository:

git clone https://github.com/yourusername/collaborative-filtering.git
cd collaborative-filtering

Place the dataset file (ratings.csv) in the data/ml-latest-small/ directory.
Run the script:
```
python collaborative_filtering.py
```

Outputs

1. SVD Decomposition

Displays the top 20 singular values.
Plots the reconstruction loss against the number of latent factors (k).
Logs the time taken by the SVD decomposition.
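
The steps above can be sketched with NumPy as follows. This is an illustration under assumptions, not the project's code: the helper `svd_loss_curve` is hypothetical, the loss is taken to be the Frobenius norm of the reconstruction error, and a small random matrix stands in for the rating matrix.

```python
import time

import numpy as np


def svd_loss_curve(matrix, ks):
    """Return singular values, the rank-k reconstruction loss for each k, and the SVD time."""
    start = time.perf_counter()
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    elapsed = time.perf_counter() - start

    losses = []
    for k in ks:
        # Rank-k approximation: keep only the k largest singular triplets.
        approx = (U[:, :k] * s[:k]) @ Vt[:k, :]
        # Frobenius norm of the error (an assumed definition of "loss").
        losses.append(float(np.linalg.norm(matrix - approx, "fro")))
    return s, losses, elapsed


# Toy stand-in for the user-item matrix (8 users x 6 movies).
rng = np.random.default_rng(0)
toy = rng.integers(0, 6, size=(8, 6)).astype(float)

singular_values, losses, elapsed = svd_loss_curve(toy, ks=range(1, 7))
print("Top singular values:", singular_values[:3])
print("Loss per k:", np.round(losses, 3))
print(f"SVD took {elapsed:.4f} s")
```

With this definition the loss can only shrink as k grows, and it reaches zero once k equals the smaller matrix dimension.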

2. CUR Decomposition

Computes the CUR approximation of the matrix.
Plots the reconstruction loss for varying k (latent factors).
Logs the time taken by the CUR decomposition.
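
One way to compute such an approximation is sketched below. It is one common CUR variant, chosen here as an assumption: columns and rows are sampled with probability proportional to their squared norms, and the middle matrix is built from pseudoinverses. The function name `cur_decomposition` and the toy matrix are hypothetical, and the project's script may select rows and columns differently.

```python
import numpy as np


def cur_decomposition(matrix, k, rng):
    """Approximate `matrix` as C @ U @ R using k sampled columns and k sampled rows."""
    total_energy = np.sum(matrix ** 2)
    col_probs = np.sum(matrix ** 2, axis=0) / total_energy
    row_probs = np.sum(matrix ** 2, axis=1) / total_energy

    # Sample without replacement, favouring columns/rows with larger norms.
    cols = rng.choice(matrix.shape[1], size=k, replace=False, p=col_probs)
    rows = rng.choice(matrix.shape[0], size=k, replace=False, p=row_probs)

    C = matrix[:, cols]  # actual columns of the original matrix
    R = matrix[rows, :]  # actual rows of the original matrix
    # Middle matrix that minimizes the Frobenius error for the chosen C and R.
    U = np.linalg.pinv(C) @ matrix @ np.linalg.pinv(R)
    return C, U, R


# Toy stand-in for the user-item matrix (8 users x 6 movies).
rng = np.random.default_rng(0)
toy = rng.uniform(1.0, 5.0, size=(8, 6))

cur_losses = []
for k in range(1, 7):
    C, U, R = cur_decomposition(toy, k, rng)
    cur_losses.append(float(np.linalg.norm(toy - C @ U @ R, "fro")))
print("CUR loss per k:", np.round(cur_losses, 3))
```

Because the rows and columns are sampled at random, the loss is not guaranteed to decrease at every step, unlike the truncated SVD.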

3. PQ Decomposition

Performs matrix factorization using gradient descent.
Logs the training and test mean squared errors (MSE).
Logs the time taken by the PQ decomposition.
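
A minimal sketch of this kind of factorization follows. It uses full-batch gradient descent with L2 regularization on a synthetic low-rank matrix; the names `pq_factorize` and `masked_mse`, the hyperparameters (`k`, `lr`, `reg`, `epochs`), and the 80/20 split are all assumptions made for illustration and are not taken from the project's script.

```python
import numpy as np


def masked_mse(ratings, predictions, mask):
    """Mean squared error over the entries where mask == 1."""
    return float(np.sum(mask * (ratings - predictions) ** 2) / np.sum(mask))


def pq_factorize(ratings, train_mask, k=2, lr=0.005, reg=0.02, epochs=1000, seed=0):
    """Learn P (users x k) and Q (items x k) so that P @ Q.T fits the training entries."""
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    P = 0.1 * rng.standard_normal((n_users, k))
    Q = 0.1 * rng.standard_normal((n_items, k))

    for _ in range(epochs):
        # Prediction error, restricted to the training entries.
        error = train_mask * (ratings - P @ Q.T)
        # Gradients of 0.5 * squared error + 0.5 * reg * (|P|^2 + |Q|^2).
        grad_P = -error @ Q + reg * P
        grad_Q = -error.T @ P + reg * Q
        P -= lr * grad_P
        Q -= lr * grad_Q
    return P, Q


# Synthetic rank-2 "ratings" (20 users x 15 items), not MovieLens data.
rng = np.random.default_rng(42)
ratings = rng.uniform(0.5, 1.5, size=(20, 2)) @ rng.uniform(0.5, 1.5, size=(15, 2)).T

# Random 80/20 split of the entries into training and test sets.
train_mask = (rng.random(ratings.shape) < 0.8).astype(float)
test_mask = 1.0 - train_mask

P, Q = pq_factorize(ratings, train_mask)
predictions = P @ Q.T
train_mse = masked_mse(ratings, predictions, train_mask)
test_mse = masked_mse(ratings, predictions, test_mask)
print(f"Train MSE: {train_mse:.4f}  Test MSE: {test_mse:.4f}")
```

Only the masked training entries contribute to the gradients, so the test MSE measures how well the learned factors generalize to held-out ratings.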

Results

Top 20 singular values are [9032.38102201 4265.13020478 2962.83432586 2856.37494764 2441.34461236 2269.55931732 2169.8992637 1848.47223494 1701.69413469 1528.15832014 1476.74413397 1449.77168211 1432.00119537 1413.20720491 1319.28764566 1281.82058619 1213.72797731 1203.99401803 1198.55552737 1135.24246017]

Visualization

Two key plots are generated:

Loss vs. Latent Factors (SVD):
- Visualizes data reconstruction loss as latent factors (k) increase.
Loss vs. Latent Factors (CUR):
- Visualizes CUR reconstruction loss for varying k.
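
A loss-versus-k plot of this kind could be produced with matplotlib roughly as follows. The helper `plot_loss_curve`, the axis labels, and the toy matrix are assumptions for illustration. The losses are computed from the singular values of the toy matrix, not taken from the project's results.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np


def plot_loss_curve(ks, losses, title):
    """Plot reconstruction loss against the number of latent factors."""
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.plot(list(ks), list(losses), marker="o")
    ax.set_xlabel("Number of latent factors (k)")
    ax.set_ylabel("Reconstruction loss")
    ax.set_title(title)
    ax.grid(True)
    return fig, ax


# Toy matrix; the rank-k SVD loss equals the norm of the discarded singular values.
rng = np.random.default_rng(0)
toy = rng.uniform(1.0, 5.0, size=(8, 6))
s = np.linalg.svd(toy, compute_uv=False)
ks = range(1, len(s) + 1)
losses = [float(np.sqrt(np.sum(s[k:] ** 2))) for k in ks]

fig, ax = plot_loss_curve(ks, losses, "Loss vs. Latent Factors (SVD)")
```

Calling `fig.savefig(...)` with a file name of your choice writes the figure to disk.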

chinmayjainnnn/collaborative-filtering
