Source code and documentation from the Machine Learning team of the "IdeKita" Bangkit Capstone Project.
We developed our model using datasets scraped from omdena.com, and then defined our database with the same structure so the model can run on it directly.
Our dataset is data scraped from omdena.com, published on Kaggle, in the following format:
https://www.kaggle.com/datasets/charismadeo/omdena-project-scraping-recommendation
project.csv
| Idproject | Project Title | Categories |
|---|---|---|
| 6688 | House Price Recommendation System Using Machine Learning | Machine Learning \| NLP |
| 8345 | Creating a Text Summarization Tool to Combat the Overload of Information | Data Science \| Machine Learning \| NLP |
| 5000 | Tackling Deforestation in Tanzania with AI: A Mangrove-focused Pilot Project for National Carbon Monitoring | Data Science \| Machine Learning |
| 4143 | Geo-Tagging Nigerian License Plates Using Python and Computer Vision Through Machine Learning | Computer Vision \| Geospatial Data Science \| Machine Learning |
ratings.csv
| Iduser | Idproject | Ratings | Timestamp |
|---|---|---|---|
| 9983 | 6688 | 4 | 1658841756 |
| 9983 | 8345 | 3 | 1658841762 |
| 9983 | 5000 | 2 | 1658841769 |
| 7236 | 4143 | 4 | 1658841776 |
| 7236 | 4550 | 3 | 1658841783 |
| 7236 | 5913 | 4 | 1658841790 |
| 8150 | 3389 | 3 | 1658841797 |
| 8150 | 6841 | 4 | 1658841804 |
| 8150 | 6881 | 4 | 1658841811 |
user.csv
| Iduser | pref_categories |
|---|---|
| 9983 | Machine Learning \| Deep Learning |
| 7236 | Computer Vision \| Machine Learning |
Google Colab notebook: https://colab.research.google.com/drive/1HheA3wv5tTBpXdGLyWpzIMHuaYDF4gJy?usp=sharing
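As a quick sanity check of the dataset format above, here is a minimal sketch that loads the three CSV files with pandas. The file names and the pipe-separated category format follow the tables shown here; adjust the paths to wherever your copy of the Kaggle dataset lives.

```python
import pandas as pd

# Load the three CSV files described above (adjust paths to your local copy).
projects = pd.read_csv("project.csv")   # Idproject, Project Title, Categories
ratings = pd.read_csv("ratings.csv")    # Iduser, Idproject, Ratings, Timestamp
users = pd.read_csv("user.csv")         # Iduser, pref_categories

# Categories are stored as pipe-separated strings, e.g. "Machine Learning | NLP".
projects["category_list"] = projects["Categories"].str.split("|").apply(
    lambda cats: [c.strip() for c in cats]
)
users["pref_list"] = users["pref_categories"].str.split("|").apply(
    lambda cats: [c.strip() for c in cats]
)

print(projects.head())
print(ratings.head())
print(users.head())
```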
To install and run the project locally, follow these steps:
- Clone the repository:

  ```
  git clone https://github.com/idekita/machine-learning.git
  ```

- Install the dependencies:

  ```
  pip install -r requirements.txt
  ```

- Configure the project:
  - Update the `config.py` file with the appropriate database connection details (`DB_HOST`, `DB_USER`, `DB_PASSWORD`, `DB_NAME`), Google Cloud Storage credentials (`CREDENTIALS_PATH`), and other required configurations.

- Set up the MySQL database:
  - Create a new MySQL database and import the necessary tables using the provided SQL script.

- Run the application:

  ```
  python app.py
  ```

- Access the application in your browser at http://localhost:5000.
The following configurations need to be set in the `config.py` file (an example sketch follows the list):

- `DB_HOST`: The hostname of the MySQL database server.
- `DB_USER`: The username to connect to the MySQL database.
- `DB_PASSWORD`: The password for the MySQL database user.
- `DB_NAME`: The name of the MySQL database.
- `CREDENTIALS_PATH`: The file path to the Google Cloud Storage credentials.
- `BUCKET_NAME`: The name of the Google Cloud Storage bucket.
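For reference, a `config.py` along these lines would satisfy the list above. The values shown are placeholders only; the actual file in the repository may organize its settings differently.

```python
# config.py -- example placeholder values; replace with your own settings.

# MySQL connection details
DB_HOST = "127.0.0.1"
DB_USER = "idekita_user"
DB_PASSWORD = "change-me"
DB_NAME = "idekita"

# Google Cloud Storage settings
CREDENTIALS_PATH = "credential.json"   # path to the service account key file
BUCKET_NAME = "your-gcs-bucket-name"
```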
The application provides the following API endpoints:
- `/` (GET): Converts the database into a CSV format and returns the CSV file (a rough sketch of this export-and-upload flow is shown below).
- `/recommendations` (POST): Triggers the recommendation process by fetching data from the database, generating recommendations with the machine learning model, and inserting the recommendations into the database.
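The snippet below illustrates what the `/` endpoint is described as doing: read the tables from MySQL, write them to CSV, and upload the files to Cloud Storage. It is not taken from `app.py`; the table names, the query, and the `export_table_to_gcs` helper are assumptions based on the dataset description above.

```python
import pandas as pd
from google.cloud import storage
from sqlalchemy import create_engine

from config import BUCKET_NAME, CREDENTIALS_PATH, DB_HOST, DB_NAME, DB_PASSWORD, DB_USER


def export_table_to_gcs(table_name: str) -> None:
    """Read one MySQL table, write it to a local CSV file, and upload it to GCS."""
    engine = create_engine(
        f"mysql+mysqlconnector://{DB_USER}:{DB_PASSWORD}@{DB_HOST}/{DB_NAME}"
    )
    df = pd.read_sql(f"SELECT * FROM {table_name}", engine)

    local_path = f"{table_name}.csv"
    df.to_csv(local_path, index=False)

    client = storage.Client.from_service_account_json(CREDENTIALS_PATH)
    client.bucket(BUCKET_NAME).blob(local_path).upload_from_filename(local_path)


# Hypothetical table names mirroring the CSV files shown earlier.
for table in ("project", "ratings", "user"):
    export_table_to_gcs(table)
```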
`/` endpoint:

Method: GET
Request Parameters: None
Request Body: None
Response: Text
Response Codes:
- 200: Database converted to CSV and uploaded to Cloud Storage.
Example Request:
```
curl http://localhost:5000/
```

Example Response:

```
Database converted to CSV and uploaded to Cloud Storage
```

`/recommendations` endpoint:

Method: POST
Request Parameters: None
Request Body: None
Response: Text
Response Codes:
- 200: Recommendations were inserted into the database successfully.
Example Request:
```
curl -X POST http://localhost:5000/recommendations
```

Example Response:

```
Recommendations inserted into the database successfully!
```

The application uses a machine learning model (`recommendation.h5`) to generate recommendations for users based on their preferences and ratings data. The algorithm follows these steps (a simplified sketch in Python follows the list):
- Fetch user preferences, ratings data, and project data from the MySQL database.
- Preprocess the data and map user and project IDs to indices.
- Iterate over each user:
  - Check if the user exists in the ratings data and the mapping dictionary.
  - Split the user's preferred categories.
  - Select projects that match the user's preferred categories.
  - Predict ratings for the projects the user has not yet rated, using the machine learning model.
  - Sort the projects based on predicted ratings.
- Insert the recommendations into the database.
- Return the recommendations.
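A condensed sketch of these steps is shown below. It is not the actual code in `app.py`: the model path `model/recommendation.h5` and the column names follow the dataset tables above, while the ID-to-index mapping and the model's `[user_index, project_index]` input format are assumptions about how the model was trained.

```python
import numpy as np
import pandas as pd
import tensorflow as tf


def recommend_for_user(user_id, users, ratings, projects, model, top_n=10):
    """Return the top-N project IDs for one user, following the steps above."""
    # Map user and project IDs to the dense indices the model was trained on (assumed encoding).
    user_to_index = {uid: i for i, uid in enumerate(ratings["Iduser"].unique())}
    project_to_index = {pid: i for i, pid in enumerate(ratings["Idproject"].unique())}

    if user_id not in user_to_index:
        return []  # user has no ratings yet

    # Split the user's preferred categories (pipe-separated, as in user.csv).
    prefs = users.loc[users["Iduser"] == user_id, "pref_categories"].iloc[0]
    pref_set = {c.strip() for c in prefs.split("|")}

    # Select projects that match the preferred categories and that the user has not rated yet.
    rated = set(ratings.loc[ratings["Iduser"] == user_id, "Idproject"])
    candidates = projects[
        projects["Categories"].apply(lambda cats: bool(pref_set & {c.strip() for c in cats.split("|")}))
        & ~projects["Idproject"].isin(rated)
        & projects["Idproject"].isin(project_to_index)
    ]
    if candidates.empty:
        return []

    # Predict a rating for each candidate project and sort by predicted rating.
    pairs = np.array(
        [[user_to_index[user_id], project_to_index[pid]] for pid in candidates["Idproject"]]
    )
    scores = model.predict(pairs).flatten()
    ranked = candidates.assign(score=scores).sort_values("score", ascending=False)
    return ranked["Idproject"].head(top_n).tolist()


if __name__ == "__main__":
    # For simplicity, read the CSV exports described earlier instead of querying MySQL.
    projects = pd.read_csv("project.csv")
    ratings = pd.read_csv("ratings.csv")
    users = pd.read_csv("user.csv")
    model = tf.keras.models.load_model("model/recommendation.h5")
    print(recommend_for_user(9983, users, ratings, projects, model))
```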
The project structure is as follows:

```
.
├── app.py             : The main Flask application file.
├── config.py          : Configuration file for the project.
├── credential.json    : JSON file containing Google Cloud Storage credentials.
├── model
│   └── recommendation.h5 : The machine learning model for recommendations.
├── requirements.txt   : File listing the required Python dependencies.
└── Dockerfile         : File for building a Docker image of the application.
```
To use the application, follow these steps:
- Ensure that the MySQL database is set up and running.

- Update the `config.py` file with the appropriate database connection details and the path to the Google Cloud Storage credentials file (`credential.json`).

- Install the required Python dependencies by running the following command:

  ```
  pip install -r requirements.txt
  ```

- Start the application by running the following command:

  ```
  python app.py
  ```

- Access the application in your browser at http://localhost:5000.
To trigger the recommendation process, send a POST request to the `/recommendations` endpoint of the application. Here's an example using Python's `requests` library:

```python
import requests

response = requests.post("http://localhost:5000/recommendations")

if response.status_code == 200:
    print("Recommendations inserted into the database successfully!")
else:
    print("Error: Failed to insert recommendations.")
```
To deploy the Flask application, you can use Docker. Here's an example of how to build a Docker image and run the container:
- Make sure Docker is installed on your machine.

- Create a `Dockerfile` with the following content:

  ```
  FROM python:3.9
  WORKDIR /app
  COPY . /app
  RUN pip install -r requirements.txt
  EXPOSE 5000
  CMD ["python", "app.py"]
  ```

- Build the Docker image by running the following command in the project's root directory:

  ```
  docker build -t recommendation-app .
  ```

- Run the Docker container using the image:

  ```
  docker run -p 5000:5000 recommendation-app
  ```

- Access the application in your browser at http://localhost:5000.
To deploy the application on GCP, follow these steps:
- Create a new project on GCP.

- Enable the necessary APIs:
  - Google Cloud Storage API
  - Google Cloud SQL API

- Set up the MySQL database on Google Cloud SQL:
  - Create a new Cloud SQL instance.
  - Create a database within the instance.
  - Import the necessary tables using the structure given above.

- Update the `config.py` file and obtain the credentials from a GCP service account.

- Build a Docker image of the application as explained in the previous section.

- Push the Docker image to Google Container Registry (GCR):
  - Authenticate with GCR:

    ```
    gcloud auth configure-docker
    ```

  - Tag the Docker image:

    ```
    docker tag recommendation-app gcr.io/[PROJECT_ID]/recommendation-app
    ```

  - Push the Docker image to GCR:

    ```
    docker push gcr.io/[PROJECT_ID]/recommendation-app
    ```

- Deploy the application on Google Cloud Run:
  - If you do not have the gcloud CLI yet, install the Google Cloud SDK first: https://cloud.google.com/sdk/docs/install
  - Deploy the Docker image to Cloud Run:

    ```
    gcloud run deploy recommendation-app --image gcr.io/[PROJECT_ID]/recommendation-app --platform managed
    ```

  - Follow the prompts to select the region, allow unauthenticated invocations, and choose a service name.

- Once the deployment is successful, you will receive a URL for the deployed Cloud Run service.

- Access the application in your browser using the provided URL.
Now your application is deployed on GCP, using Google Cloud Storage and Google Cloud SQL for storage and database services respectively. Users can access the application through the Cloud Run service URL.
Note: Make sure to replace [PROJECT_ID] with your actual GCP project ID throughout the steps.
If you encounter any issues while setting up or using the application, consider the following:
- Verify that the MySQL database connection details in `config.py` are correct (a quick connection check is sketched below).
- Make sure the required Python dependencies are installed by running `pip install -r requirements.txt`.
- Ensure that the `credential.json` file exists at the path configured in `CREDENTIALS_PATH`.
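As a quick way to check the first item, the snippet below attempts a short connection using the values from `config.py`. It assumes the `mysql-connector-python` package; the project's actual database driver may differ.

```python
import mysql.connector

from config import DB_HOST, DB_NAME, DB_PASSWORD, DB_USER

try:
    # Open a short-lived connection with the configured credentials.
    conn = mysql.connector.connect(
        host=DB_HOST,
        user=DB_USER,
        password=DB_PASSWORD,
        database=DB_NAME,
        connection_timeout=5,
    )
    print("MySQL connection OK")
    conn.close()
except mysql.connector.Error as err:
    print(f"MySQL connection failed: {err}")
```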