Introduction
PerfectPick is a production-grade AI recommendation system that delivers personalized smartphone recommendations from Flipkart's product catalog. It integrates classical information retrieval (BM25), modern semantic embeddings (BGE), and large language model (LLM)-based generation to provide accurate, context-aware recommendations. The system is built for scalability, leveraging a modular architecture deployed on Google Cloud Platform (GCP) with Docker, Kubernetes, Prometheus, and Grafana. The project is live at http://136.116.202.138/, accessible for testing and evaluation.
This README provides a comprehensive guide to PerfectPick’s architecture, components, deployment workflow, and monitoring setup, suitable for researchers, developers, and DevOps engineers. It covers the system from data ingestion to production, with a focus on reproducibility, scalability, and observability.
Table of Contents
- Project Overview
- Core Features
- System Architecture
- Directory Structure
- Component-Level Breakdown
- Technology Stack
- Setup and Local Development
- Deployment Workflow
- Monitoring and Observability
- Scalability and Performance
- Future Improvements
- License
Project Overview
PerfectPick is an end-to-end recommendation system that processes user queries (e.g., "best budget phone under 15K with good camera") to suggest relevant smartphones from a dataset of 3903 Flipkart products (1456 unique models). It combines sparse retrieval (BM25), dense vector search (Astra DB), and LLM-based generation (OpenAI/Groq) to achieve high relevance and low latency (<2s per query). The system uses Supabase PostgreSQL for session memory and is deployed on GCP with Kubernetes for orchestration, monitored via Prometheus and Grafana.
Objectives:
- Deliver accurate, personalized recommendations using hybrid retrieval.
- Ensure scalability for high query volumes.
- Provide a modular, maintainable codebase for research and production.
- Enable observability through comprehensive monitoring.
Current Status (October 03, 2025):
- Developed on branch `feat/production-flask-api`.
- Core functionality (ingestion, retrieval, API) complete.
- Supabase connectivity stabilized with IPv4/IPv6 fallback.
- Live deployment: http://136.116.202.138/.
Core Features
- Hybrid Retrieval: Combines BM25 (keyword-based) and vector search (semantic) with neural reranking (BAAI/bge-reranker-base).
- Session Memory: Stores user interaction history in Supabase PostgreSQL for multi-turn personalization.
- Vector Storage: Astra DB for efficient embedding storage and retrieval.
- Production API: Flask-based endpoints for recommendations and health checks.
- Cloud-Native Deployment: Dockerized application orchestrated with Kubernetes on GCP.
- Monitoring: Prometheus for metrics collection, Grafana for visualization.
- Scalability: Horizontal Pod Autoscaler (HPA) for dynamic scaling.
- Modularity: Independent modules for ingestion, retrieval, and generation.
System Architecture
PerfectPick follows a layered, microservices-inspired architecture:
- Frontend Layer: Minimal HTML/CSS interface for user queries and results.
- Backend Layer: Flask API handling requests and orchestrating retrieval/generation.
- Data Layer: Processes and validates Flipkart CSV data.
- Vector Store Layer: Astra DB for storing product embeddings.
- Session Layer: Supabase PostgreSQL for session persistence.
- Monitoring Layer: Prometheus and Grafana for system observability.
- Deployment Layer: Docker containers managed by Kubernetes on GCP.
Directory Structure
```
bhupencoD3-PerfectPick/
├── app.py
├── main.py
├── requirements.txt
├── Dockerfile
├── .env.example
├── data/
│  └── Flipkart_Mobiles_cleaned.csv
├── perfectpick/
│  ├── config.py
│  ├── data_converter.py
│  ├── data_ingestion.py
│  ├── retrieval.py
│  ├── generation.py
│  ├── recommender.py
│  ├── service.py
│  ├── session_memory.py
├── utils/
│  ├── logger.py
│  ├── custom_exception.py
│  ├── validators.py
├── templates/
│  └── index.html
├── static/
│  └── style.css
├── tests/
│  ├── test_config.py
│  ├── test_data_converter.py
│  ├── test_data_ingestion.py
│  ├── test_recommender.py
│  ├── test_service.py
├── notebooks/
│  └── exploration.ipynb
├── docs/
│  └── index.html
├── deployment/
│  ├── k8s/
│  │  ├── deployment.yml
│  │  ├── service.yml
│  │  ├── hpa.yml
│  ├── prometheus/
│  │  └── prometheus.yml
│  ├── grafana/
│  │  ├── dashboards/
│  │  ├── datasource.yml
│  ├── deploy-all.sh
│  ├── entrypoint.sh
│  └── SETUP.md
```
Component-Level Breakdown
- `config.py`
  - Purpose: Loads and validates environment variables from `.env`.
  - Features: Validates `DB_URL` and API keys; logs initialization status.
  - Dependencies: `os`, `python-dotenv`, `logging`.
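To illustrate the fail-fast pattern this describes, here is a minimal sketch; the variable list and function name are assumptions based on the description above, not the actual contents of `config.py`.

```python
# Hypothetical sketch of config.py's load-and-validate pattern; the exact
# variable list and function names are assumptions.
import logging
import os

from dotenv import load_dotenv

logger = logging.getLogger(__name__)

REQUIRED_VARS = ["DB_URL", "OPENAI_API_KEY", "GROQ_API_KEY", "ASTRA_DB_API_ENDPOINT"]

def load_config() -> dict:
    """Load .env, fail fast on missing variables, and log the outcome."""
    load_dotenv()
    missing = [name for name in REQUIRED_VARS if not os.getenv(name)]
    if missing:
        raise EnvironmentError(f"Missing required environment variables: {missing}")
    logger.info("Configuration initialized; %d variables validated", len(REQUIRED_VARS))
    return {name: os.environ[name] for name in REQUIRED_VARS}
```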
- `data_converter.py`
  - Purpose: Transforms raw product data into an embedding-ready format.
  - Features: Normalizes prices, cleans text fields.
  - Dependencies: `pandas`.
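A short sketch of what the normalization step might look like; column names follow the dataset schema listed further below, and the exact transformations in `data_converter.py` may differ.

```python
# Illustrative preprocessing sketch, not the actual data_converter.py.
import pandas as pd

def to_embedding_ready(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    # Strip currency symbols and thousands separators, coerce price to numeric.
    df["price"] = pd.to_numeric(
        df["price"].astype(str).str.replace(r"[₹,]", "", regex=True), errors="coerce"
    )
    # Normalize whitespace in the free-text fields used for embedding.
    for col in ["model", "camera", "processor"]:
        df[col] = df[col].astype(str).str.strip().str.replace(r"\s+", " ", regex=True)
    return df
```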
- `data_ingestion.py`
  - Purpose: Loads CSV data and indexes embeddings in Astra DB.
  - Features: Processes 3903 rows; uses LangChain `AstraDBVectorStore` (collection: `flipkart_recommendation`) with Hugging Face embeddings (BAAI/bge-base-en-v1.5). Each product is vectorized from a concatenated string of its key features (model, RAM, storage, camera, processor); the full product metadata is stored alongside the vector.
  - Dependencies: `pandas`, `langchain_astradb`, `sentence-transformers`.
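The indexing flow can be sketched with the public langchain-astradb and langchain-huggingface APIs; the feature-string separator below is an assumption consistent with the description above, and the actual wiring in `data_ingestion.py` may differ.

```python
# Sketch of the ingestion flow described above, using public LangChain APIs.
import os

import pandas as pd
from langchain_astradb import AstraDBVectorStore
from langchain_huggingface import HuggingFaceEmbeddings

df = pd.read_csv(os.environ["DATA_FILE_PATH"])

vstore = AstraDBVectorStore(
    embedding=HuggingFaceEmbeddings(model_name="BAAI/bge-base-en-v1.5"),
    collection_name="flipkart_recommendation",
    api_endpoint=os.environ["ASTRA_DB_API_ENDPOINT"],
    token=os.environ["ASTRA_DB_APPLICATION_TOKEN"],
    namespace=os.environ["ASTRA_DB_KEYSPACE"],
)

# Each product is vectorized from a concatenation of its key features;
# the full row is kept as metadata alongside the vector.
texts = (
    df["model"] + " | " + df["RAM"].astype(str) + " | " + df["storage"].astype(str)
    + " | " + df["camera"].astype(str) + " | " + df["processor"].astype(str)
).tolist()
vstore.add_texts(texts=texts, metadatas=df.to_dict(orient="records"))
```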
- `retrieval.py`
  - Purpose: Implements hybrid retrieval (BM25 + vector search).
  - Features: BM25 index (3917 documents) for keyword-based matches; vector search (top-20) for semantic intent (e.g., "good battery life"); neural reranking (BAAI/bge-reranker-base) re-scores the top-40 combined results, yielding a final top-5 product set. Price filtering segments the catalog (Budget: ₹0-15K, Mid-range: ₹15-30K, Premium: ₹30-70K, Flagship: ₹70K+). The initial hybrid score is computed as $\text{Score}_{\text{hybrid}}(q, d) = \alpha \cdot \text{Score}_{\text{BM25}}(q, d) + (1 - \alpha) \cdot \text{Score}_{\text{BGE}}(q, d)$, with $\alpha = 0.3$.
  - Dependencies: `rank_bm25`, `FlagEmbedding`, `langchain`.
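A worked sketch of the hybrid formula: BM25 and BGE scores live on different scales, so some per-query normalization is needed before mixing with $\alpha = 0.3$; min-max normalization is assumed here, and the project's exact normalization may differ.

```python
# Minimal sketch of the hybrid scoring formula above (alpha = 0.3).
def normalize(scores: list[float]) -> list[float]:
    """Min-max normalize a list of scores to [0, 1] (assumed normalization)."""
    lo, hi = min(scores), max(scores)
    return [0.0 if hi == lo else (s - lo) / (hi - lo) for s in scores]

def hybrid_scores(bm25: list[float], bge: list[float], alpha: float = 0.3) -> list[float]:
    """Combine per-document BM25 and BGE scores with weight alpha on BM25."""
    b, v = normalize(bm25), normalize(bge)
    return [alpha * bs + (1 - alpha) * vs for bs, vs in zip(b, v)]

# Example: a document weak on keywords but strong on semantics still ranks
# first, because semantic similarity carries 70% of the weight.
print(hybrid_scores([2.0, 8.0, 5.0], [0.9, 0.2, 0.6]))  # [0.7, 0.3, 0.55]
```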
- `generation.py`
  - Purpose: Generates natural language responses using LLMs.
  - Features: Integrates OpenAI/Groq with Retrieval-Augmented Generation (RAG). The LLM uses a system prompt to enforce a persona ("Expert Mobile Consultant") and a strict output format. Groq is prioritized for low-latency inference, with OpenAI as a high-quality fallback.
  - Dependencies: `openai`, `groq`.
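A sketch of the Groq-first, OpenAI-fallback strategy described above; the model names are illustrative assumptions, not the project's actual configuration.

```python
# Hypothetical sketch of generation.py's provider-fallback logic.
import os

from groq import Groq
from openai import OpenAI

SYSTEM_PROMPT = "You are an Expert Mobile Consultant. Answer in the required format."

def generate(context: str, query: str) -> str:
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Context:\n{context}\n\nQuery: {query}"},
    ]
    try:
        # Groq is tried first for low-latency inference.
        groq = Groq(api_key=os.environ["GROQ_API_KEY"])
        resp = groq.chat.completions.create(model="llama-3.1-8b-instant", messages=messages)
    except Exception:
        # OpenAI serves as the high-quality fallback.
        openai = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
        resp = openai.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return resp.choices[0].message.content
```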
- `recommender.py`
  - Purpose: Orchestrates retrieval and generation for recommendations.
  - Features: Follows a clear RAG workflow (sketched below): 1. retrieve session history; 2. hybrid search for candidate products; 3. rerank candidates; 4. contextualize by assembling the final RAG prompt from the current query, session memory, and the top-5 documents; 5. generate the response via LLM; 6. store the session turn for personalization.
  - Dependencies: internal modules (`retrieval`, `generation`, `session_memory`).
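A condensed sketch of that six-step flow; the function names on the internal modules are assumptions that mirror the module responsibilities, not the actual internal API.

```python
# Hypothetical orchestration sketch of recommender.py; internal function
# names are assumptions.
from perfectpick import generation, retrieval, session_memory

def recommend(query: str, session_id: str) -> str:
    history = session_memory.get(session_id)                     # 1. Retrieve session history
    candidates = retrieval.hybrid_search(query, k=40)            # 2. Hybrid search for candidates
    top5 = retrieval.rerank(query, candidates, k=5)              # 3. Rerank to top-5
    prompt = generation.build_rag_prompt(query, history, top5)   # 4. Contextualize (RAG prompt)
    answer = generation.generate(prompt)                         # 5. Generate via LLM
    session_memory.store(session_id, query, answer)              # 6. Store the turn
    return answer
```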
- `service.py`
  - Purpose: Defines Flask API endpoints (`/recommend`, `/health`, `/metrics`).
  - Features: Handles JSON requests, returns structured responses.
  - Dependencies: `flask`.
- `session_memory.py`
  - Purpose: Manages user session history in Supabase PostgreSQL.
  - Features: Uses a `psycopg2` client with IPv4/IPv6 fallback for reliability. The `session_memory` table stores the conversational context as a JSONB object, enabling multi-turn recommendations by providing the LLM with past queries and results for context-aware reasoning.
  - Dependencies: `psycopg2`, `collections.deque`, `socket`.
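A sketch of JSONB session persistence with `psycopg2`; the table layout (a unique `session_id` plus a JSONB `context` array) is an inference from the description above, not the project's actual schema.

```python
# Hypothetical session-persistence sketch; assumes a
# session_memory(session_id UNIQUE, context JSONB) table.
import os

import psycopg2
from psycopg2.extras import Json

def store_turn(session_id: str, query: str, response: str) -> None:
    conn = psycopg2.connect(os.environ["DB_URL"])
    try:
        # The context manager commits on success and rolls back on error.
        with conn, conn.cursor() as cur:
            cur.execute(
                """
                INSERT INTO session_memory (session_id, context)
                VALUES (%s, %s)
                ON CONFLICT (session_id)
                DO UPDATE SET context = session_memory.context || EXCLUDED.context
                """,
                (session_id, Json([{"query": query, "response": response}])),
            )
    finally:
        conn.close()
```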
- `logger.py`
  - Purpose: Provides structured JSON logging.
  - Features: Timestamps, log levels, JSON formatting.
  - Dependencies: `logging`, `json`.
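A minimal JSON formatter matching this description; the project's `logger.py` may structure the fields differently.

```python
# Minimal structured-JSON logging sketch.
import json
import logging

class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        # Emit each record as a single JSON object with timestamp and level.
        return json.dumps({
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logging.getLogger().addHandler(handler)
```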
- `custom_exception.py`
  - Purpose: Defines custom exceptions for consistent error handling.
  - Features: Structured error messages for debugging.
  - Dependencies: None.
- `validators.py`
  - Purpose: Validates input data and API requests.
  - Features: Checks query format, data integrity.
  - Dependencies: None.
- `app.py`
  - Purpose: Main Flask application entrypoint.
  - Features: Initializes routes, logging, and services; handles `/recommend`, `/health`, `/metrics`.
  - Dependencies: `flask`, internal modules.
- `templates/index.html`
  - Purpose: Basic HTML frontend for user interaction.
  - Features: Query input, recommendation display.
  - Dependencies: None.
- `static/style.css`
  - Purpose: Styles the frontend interface.
  - Features: Minimal, responsive design.
  - Dependencies: None.
- `data/Flipkart_Mobiles_cleaned.csv`
  - Purpose: Source dataset with 3903 products (8 columns: `model`, `price`, `RAM`, `storage`, `camera`, `battery`, `display`, `processor`).
  - Features: Cleaned, UTF-8 encoded.
- `notebooks/exploration.ipynb`
  - Purpose: Exploratory data analysis and model prototyping.
  - Features: Visualizations, embedding experiments.
  - Dependencies: `jupyter`, `pandas`, `matplotlib`.
- `docs/index.html`
  - Purpose: HTML documentation for contributors.
  - Features: API specs, module descriptions.
- `tests/`
  - Purpose: Validates module functionality and integration.
  - Files:
    - `test_config.py`: Tests environment variable loading.
    - `test_data_converter.py`: Validates data preprocessing.
    - `test_data_ingestion.py`: Ensures CSV loading and indexing.
    - `test_recommender.py`: Tests the recommendation pipeline.
    - `test_service.py`: Verifies API endpoints.
  - Dependencies: `pytest`.
- `deployment/`
  - Purpose: Configures production deployment and monitoring.
  - Files:
    - `Dockerfile`: Builds the application image (Python 3.12, non-root user).
    - `deploy-all.sh`: Automates Kubernetes deployment.
    - `entrypoint.sh`: Container startup script.
    - `k8s/deployment.yml`: Defines a 3-replica deployment.
    - `k8s/service.yml`: LoadBalancer for API access.
    - `k8s/hpa.yml`: Autoscales pods (3-10 replicas, 50% CPU); see the sketch after this list.
    - `prometheus/prometheus.yml`: Configures metrics scraping.
    - `grafana/datasource.yml`: Links Grafana to Prometheus.
    - `grafana/dashboards/`: Prebuilt dashboards for CPU, memory, latency.
    - `SETUP.md`: Deployment instructions.
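A minimal manifest consistent with how `k8s/hpa.yml` is described (3-10 replicas at 50% CPU, targeting the `perfectpick` deployment used in the scale commands later); the HPA object name is an assumption.

```yaml
# Illustrative hpa.yml sketch matching the description above.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: perfectpick-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: perfectpick
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
```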
Technology Stack
| Category | Technology | Purpose |
|---|---|---|
| API | Flask | RESTful endpoints |
| Data Storage | Supabase PostgreSQL | Session memory |
| Vector Storage | Astra DB | Product embeddings |
| Retrieval Models | BM25, BGE Embeddings (BAAI/bge-base-en-v1.5), BGE Reranker (BAAI/bge-reranker-base) | Hybrid retrieval and reranking |
| Generation LLMs | OpenAI, Groq | LLM integration for response generation |
| Containerization | Docker (BuildKit-enabled) | Application packaging |
| Orchestration | Kubernetes (GKE) | Container management |
| Cloud | Google Cloud Platform (GCE, GKE, Secret Manager) | Hosting and infrastructure |
| Monitoring | Prometheus, Grafana | Metrics collection and visualization |
| Testing | Pytest | Unit and integration tests |
| Utilities | python-dotenv, pandas, sentence-transformers | Configuration, data manipulation, embedding utilities |
Setup and Local Development
Prerequisites:
- OS: Arch Linux (or Ubuntu/Debian)
- Python: 3.12+ (use `pyenv`)
- Docker: Install via `sudo pacman -S docker`
- Git: For repository cloning
Setup Steps:
- Clone: `git clone <repo-url>; cd bhupencoD3-PerfectPick`
- Virtual Env: `python -m venv venv; source venv/bin/activate; pip install -r requirements.txt`
- Configure `.env`:
```
DB_URL=postgresql://postgres.lxbououtadfxarleksun:<password>@aws-1-ap-southeast-1.pooler.supabase.com:6543/postgres?sslmode=require
OPENAI_API_KEY=sk-proj-...
GROQ_API_KEY=gsk_...
ASTRA_DB_API_ENDPOINT=https://ad6829b0-39f3-43aa-9a13-43036ed6bed2-us-east-2.apps.astra.datastax.com
ASTRA_DB_APPLICATION_TOKEN=AstraCS:...
ASTRA_DB_KEYSPACE=default_keyspace
HF_TOKEN=hf_qtSBHF...
DATA_FILE_PATH=data/Flipkart_Mobiles_cleaned.csv
FLASK_ENV=development
FLASK_PORT=8000
```
- Run: `python app.py`
- Test:
```bash
curl -X POST http://localhost:8000/recommend -H "Content-Type: application/json" -d '{"query": "best budget phone"}'
```
Docker Setup:
- Build: `export DOCKER_BUILDKIT=1; docker build -t perfectpick:latest .`
- Run: `docker run --env-file .env -p 8000:8000 --name perfectpick perfectpick:latest`
- Compose: `docker-compose up --build` (using `docker-compose.yml`; a sketch follows)
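The repository listing does not show a `docker-compose.yml`, so the following is a hypothetical template consistent with the build and run commands above.

```yaml
# Hypothetical docker-compose.yml matching the commands above.
services:
  perfectpick:
    build: .
    env_file: .env
    ports:
      - "8000:8000"
    restart: unless-stopped
```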
Deployment Workflow
Overview: PerfectPick is deployed on GCP using Google Kubernetes Engine (GKE) for orchestration, Artifact Registry for images, Secret Manager for credentials, and Cloud Build for CI/CD. The live instance is accessible at http://136.116.202.138/.
Steps:
- Enable GCP APIs: GKE, Cloud Build, Artifact Registry, Secret Manager.
- Create a GKE cluster:
```bash
gcloud container clusters create perfectpick-cluster --zone us-central1-a --machine-type e2-medium --num-nodes 3 --enable-ip-alias --enable-autoscaling --min-nodes 1 --max-nodes 5
```
- Store secrets:
```bash
gcloud secrets create db-url --data-file=<file>
gcloud secrets create openai-key --data-file=<file>
```
- Build and push the image:
```bash
gcloud auth configure-docker
docker tag perfectpick:latest gcr.io/<project-id>/perfectpick:latest
docker push gcr.io/<project-id>/perfectpick:latest
```
- Apply the Kubernetes manifests:
```bash
kubectl apply -f deployment/k8s/
```
- Verify:
```bash
kubectl get pods
kubectl port-forward service/perfectpick-service 8080:80
```
- Access: http://136.116.202.138/
- Scale:
```bash
kubectl scale deployment perfectpick --replicas=3
kubectl apply -f deployment/k8s/hpa.yml
```
Cloud Build CI/CD (`cloudbuild.yaml`):
```yaml
steps:
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/perfectpick:$COMMIT_SHA', '.']
  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', 'gcr.io/$PROJECT_ID/perfectpick:$COMMIT_SHA']
  - name: 'gcr.io/cloud-builders/gke-deploy'
    args:
      - run
      - --filename=deployment/k8s/
      - --image=gcr.io/$PROJECT_ID/perfectpick:$COMMIT_SHA
      - --location=us-central1-a
      - --cluster=perfectpick-cluster
```
Optional Cloud SQL Migration:
- Create: `gcloud sql instances create perfectpick-db --database-version=POSTGRES_15 --tier=db-f1-micro`
- Update `DB_URL`: `postgresql://postgres:<pass>@<cloud-sql-ip>:5432/postgres`
- Migrate: `pg_dump <supabase-url> | gcloud sql import sql perfectpick-db -`
Monitoring and Observability
- Enabled via:
```bash
gcloud container clusters update perfectpick-cluster --monitoring=SYSTEM,WORKLOAD
```
- Config (`deployment/prometheus/prometheus.yml`):
```yaml
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'perfectpick'
    static_configs:
      - targets: ['perfectpick-service:8000']
    metrics_path: /metrics
```
- Deployment (`deployment/grafana/deployment.yml`):
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
        - name: grafana
          image: grafana/grafana:latest
          ports:
            - containerPort: 3000
          env:
            - name: GF_SECURITY_ADMIN_PASSWORD
              value: "admin"
---
apiVersion: v1
kind: Service
metadata:
  name: grafana-service
spec:
  type: LoadBalancer
  ports:
    - port: 3000
  selector:
    app: grafana
```
- Access: `<load-balancer-ip>:3000`
- Data Source: Prometheus (`http://prometheus-operated:9090`)
- Dashboards: CPU usage, memory, query latency, DB connection errors, RAG performance breakdown.
PerfectPick tracks key metrics for the RAG pipeline to ensure recommendation quality and efficiency:
```python
from flask import Flask, Response, jsonify
from prometheus_client import Counter, Histogram, Gauge, generate_latest

app = Flask(__name__)

# General metrics
QUERY_COUNTER = Counter('perfectpick_queries_total', 'Total queries')
LATENCY_HISTOGRAM = Histogram('perfectpick_query_duration_seconds', 'Query latency')
SESSION_HITS = Counter('perfectpick_session_memory_retrievals', 'Number of times session memory was retrieved')

# RAG- and LLM-specific metrics
RETRIEVAL_LATENCY = Histogram('perfectpick_retrieval_duration_seconds', 'Duration of hybrid retrieval stage')
RERANKING_TIME = Histogram('perfectpick_reranking_duration_seconds', 'Duration of neural reranking stage')
LLM_PROVIDER_COUNTER = Counter('perfectpick_llm_provider_used', 'Count of responses by LLM provider', ['provider'])
LLM_TOKEN_USAGE = Gauge('perfectpick_llm_tokens_used', 'Tokens used per query (input/output)', ['type', 'provider'])

@app.route('/metrics')
def metrics():
    # Expose all registered metrics in the Prometheus text format.
    return Response(generate_latest(), mimetype='text/plain')

@app.route('/recommend', methods=['POST'])
def recommend():
    # Time the full request so the histogram covers the whole pipeline.
    with LATENCY_HISTOGRAM.time():
        QUERY_COUNTER.inc()
        # ... retrieval, reranking, and generation produce `results` ...
        # RETRIEVAL_LATENCY and RERANKING_TIME are observed internally;
        # LLM_PROVIDER_COUNTER and LLM_TOKEN_USAGE are set after generation.
        return jsonify(results)
```
Critical alerts are configured to monitor both system stability and the quality of the AI service:
- System Health: CPU > 80%, P95 latency > 2s, pod restarts > 3/hr.
- Quality Degradation: DB errors > 5/min (loss of personalization), reranker latency > 100ms (RAG bottleneck), LLM API failures > 2% (critical service interruption).
- Configured via Grafana notifications (Slack/Email); an equivalent Prometheus rule is sketched below.
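Since the alerts are managed through Grafana, the following Prometheus-style alerting rule is only an illustrative equivalent for the P95 latency threshold above, using the `perfectpick_query_duration_seconds` histogram defined in the metrics code.

```yaml
# Illustrative alerting-rule sketch; the project configures alerts in Grafana.
groups:
  - name: perfectpick-alerts
    rules:
      - alert: HighQueryLatencyP95
        expr: histogram_quantile(0.95, rate(perfectpick_query_duration_seconds_bucket[5m])) > 2
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "P95 query latency above 2s"
```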
Scalability and Performance
- Horizontal Scaling: HPA scales pods (3-10) based on 50% CPU utilization.
- Performance: Retrieval <1s, total query <2s; model initialization ~3-5min (CPU).
- Optimization: Pre-downloaded BGE models; cached embeddings in Astra DB.
- Load Handling: Tested for 100 concurrent users with <5% latency increase.
Future Improvements
- CI/CD: Integrate GitHub Actions with Cloud Build for automated testing, image building, and canary deployments for new LLM/retrieval models.
- Caching: Implement Redis for both exact query caching (for popular queries) and session caching (to reduce load on Supabase).
- Multi-Modal Search: Incorporate image-based search with CLIP embeddings to allow for visual queries (e.g., "find a phone that looks like this").
- A/B Testing: Use Kubernetes Ingress/Service Mesh to A/B test different retrieval strategies (e.g., varying the $\alpha$ value in hybrid scoring, or testing new rerankers).
- Authentication: Add OAuth2 for secure API access.
- Multi-Product Support: Extend to other Flipkart categories (e.g., laptops).
- Personalization: Explore federated learning for user-specific models.
License
This project is licensed under the MIT License.