The Product Review Analyzer is an AI-driven application that helps product managers and development teams make data-driven decisions. It analyzes product reviews and user feedback, identifies pain points, extracts feature requests, and highlights positive feedback, turning raw customer input into actionable priorities.
This tool combines Google's Gemini API (using the Gemini 2.0 Flash model), Hugging Face Transformers, and natural language processing libraries to provide comprehensive analysis of user sentiment and feedback. Features include real-time data scraping, batch processing with parallel execution, a circuit breaker pattern for API resilience, and interactive visualizations.
- Upload CSV files with user feedback
- Scrape real-time feedback from Google Play Store
- Analyze GitHub repositories for issues and discussions
- Advanced sentiment analysis with:
  - Context-aware sentiment detection
  - Aspect-based sentiment analysis
  - Local processing for sentiment analysis to avoid API rate limits
  - Parallel processing for improved performance
- Categorize feedback into:
  - Pain points
  - Feature requests
  - Positive feedback
- Extract keywords from feedback
- Generate summaries with top insights
- Weekly summaries for product prioritization
- Download PDF reports with actionable priorities for product managers
- Track analysis history and view past results with full summaries
- Dynamic processing time estimation based on historical data
- Real-time progress updates via WebSockets
- Advanced Gemini API integration with:
  - Local processing for sentiment analysis to avoid API rate limits
  - Gemini API for insight extraction and summary generation
  - Intelligent batch processing for optimal performance
  - Adaptive request throttling to prevent rate limits
  - Robust JSON parsing with multiple fallback mechanisms
  - Multi-level caching system for faster responses
  - Circuit breaker pattern for graceful degradation
- Multi-language support for analyzing reviews in different languages
- MongoDB Atlas integration for scalable data storage
- Optimized batch processing with dynamic batch sizing and RAM usage optimization
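The dynamic batch sizing mentioned above could work roughly as sketched below — this is an illustration, not the project's actual implementation; the length thresholds and the use of `BATCH_SIZE_MULTIPLIER`-style scaling are assumptions:

```python
# Sketch: choose a Gemini batch size from average review length.
# Thresholds and the multiplier scaling are illustrative assumptions,
# not the project's actual values.

def dynamic_batch_size(reviews, base_size=10, multiplier=1.0):
    """Shrink batches when reviews are long so each API call stays small."""
    if not reviews:
        return base_size
    avg_len = sum(len(r) for r in reviews) / len(reviews)
    if avg_len > 1000:       # very long reviews: smallest batches
        size = base_size // 4
    elif avg_len > 500:      # medium-length reviews: half-size batches
        size = base_size // 2
    else:                    # short reviews: full batch size
        size = base_size
    return max(1, int(size * multiplier))
```

Scaling every batch through a single multiplier (as `BATCH_SIZE_MULTIPLIER` suggests) keeps one knob for tuning RAM usage across all operations.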
- FastAPI (Python)
- WebSockets for real-time progress updates
- MongoDB Atlas for database storage
- Hugging Face Transformers for sentiment analysis
- Google Gemini API for advanced text processing (using Gemini 2.0 Flash model)
- NLTK and spaCy for NLP tasks
- PyMongo for MongoDB integration
- google-play-scraper for Play Store data
- WeasyPrint for PDF generation
- Parallel processing for improved performance
- React
- Tailwind CSS
- Headless UI components
- Framer Motion for animations
- Axios for API calls
- WebSocket for real-time updates
- React Dropzone for file uploads
- React Markdown for rendering markdown content
- Recharts for data visualization
```
product-review-analyzer/
├── backend/                          # FastAPI backend
│   ├── app/
│   │   ├── api/                      # API endpoints and models
│   │   ├── auth/                     # Authentication
│   │   ├── models/                   # Data models
│   │   ├── services/                 # Business logic (scraper, analyzer)
│   │   └── utils/                    # Utilities and exceptions
│   ├── download_nltk_resources.py    # NLTK setup script
│   ├── requirements.txt              # Python dependencies
│   ├── requirements.aws.txt          # AWS-specific dependencies
│   └── main.py                       # FastAPI application entry
├── frontend/                         # React frontend
│   ├── src/
│   │   ├── components/               # React components
│   │   ├── config/                   # Configuration files
│   │   ├── hooks/                    # Custom React hooks
│   │   ├── services/                 # API services
│   │   ├── App.jsx                   # Main application component
│   │   ├── index.css                 # Tailwind CSS styles
│   │   └── main.jsx                  # Entry point
│   ├── tailwind.config.js            # Tailwind configuration
│   ├── vite.config.js                # Vite configuration
│   └── package.json                  # Node dependencies
├── docs/                             # Documentation
│   ├── api/                          # API documentation
│   ├── deployment/                   # Deployment guides
│   │   ├── aws.md                    # AWS deployment guide
│   │   └── railway.md                # Railway deployment guide
│   ├── features/                     # Feature documentation
│   └── troubleshooting/              # Troubleshooting guides
├── scripts/                          # Deployment and setup scripts
│   ├── aws/                          # AWS deployment scripts
│   │   ├── build.sh                  # AWS build script
│   │   ├── deploy_aws.sh             # AWS deployment script
│   │   ├── setup_aws.sh              # AWS setup script
│   │   ├── install_aws_deps.sh       # AWS dependencies installer
│   │   └── build_frontend_aws.sh     # AWS frontend build script
│   └── build.sh                      # Production build script
├── serve.py                          # Production server script
├── start.sh                          # Production startup script
├── package.json                      # Root package.json with scripts
└── README.md                         # Project documentation
```
- Node.js (v16+)
- Python (v3.11 recommended, avoid 3.13 due to compatibility issues)
- MongoDB Atlas account
- Google Gemini API key (for enhanced analysis)
- AWS Account (for AWS deployment)
- Install all dependencies at once:

  ```bash
  npm run install:all
  ```

- Start both backend and frontend:

  ```bash
  npm run dev
  ```

For production deployment on AWS:

- Configure your AWS credentials and EC2 instance
- Build the application for production:

  ```bash
  # Make the build script executable
  chmod +x scripts/aws/build.sh

  # Run the build script
  ./scripts/aws/build.sh
  ```

- Deploy to AWS:

  ```bash
  # Make the deployment script executable
  chmod +x scripts/aws/deploy_aws.sh

  # Deploy to AWS
  ./scripts/aws/deploy_aws.sh -h <EC2_HOST> -k <PEM_FILE>
  ```

The deployment script will:
- Package your application
- Upload it to your EC2 instance
- Install all dependencies
- Configure Nginx as a reverse proxy
- Set up a systemd service for the application
- Start the application
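The systemd service the deployment script sets up might look roughly like the unit below — the service name, user, and paths are illustrative assumptions, not the script's actual output:

```ini
# Illustrative unit file, e.g. /etc/systemd/system/review-analyzer.service
# Service name, user, and paths are assumptions, not the script's actual output.
[Unit]
Description=Product Review Analyzer
After=network.target

[Service]
User=ubuntu
WorkingDirectory=/home/ubuntu/product-review-analyzer
EnvironmentFile=/home/ubuntu/product-review-analyzer/backend/.env
ExecStart=/home/ubuntu/product-review-analyzer/venv/bin/python serve.py
Restart=always

[Install]
WantedBy=multi-user.target
```

A unit like this lets the app restart automatically on crashes and start on boot via `systemctl enable`.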
For detailed AWS deployment instructions, see the AWS Deployment Guide.
For deployment on Railway:
- Connect your GitHub repository to Railway
- Configure the environment variables in Railway dashboard
- Deploy the application
For detailed Railway deployment instructions, see the Railway Deployment Guide.
- Create a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```

- Install backend dependencies:

  ```bash
  cd backend
  pip install -r requirements.txt
  ```

- Download NLTK resources:

  ```bash
  python download_nltk_resources.py
  ```

- Set up environment variables:

  ```bash
  # Copy the example .env file
  cp .env.mongodb.example .env

  # Edit the .env file to add your MongoDB URI and Gemini API key
  # You can get a Gemini API key from https://ai.google.dev/
  ```

- Run the backend server:

  ```bash
  uvicorn main:app --reload
  ```

- Install frontend dependencies:

  ```bash
  cd frontend
  npm install
  ```

- Run the development server:

  ```bash
  npm run dev
  ```

- `POST /api/upload` - Upload and analyze a CSV file
- `GET /api/scrape` - Scrape and analyze data from Google Play Store
- `POST /api/analyze` - Analyze a list of reviews
- `POST /api/summary` - Generate a summary from analyzed reviews
- `POST /api/summary/pdf` - Generate a PDF report
- `POST /api/summary/weekly` - Generate a weekly summary
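As a quick smoke test, the analyze endpoint could be called from Python as sketched below — note that the request body shape (`{"reviews": [...]}`) is an assumption based on the endpoint description, not a documented schema:

```python
import json
import urllib.request

def build_analyze_request(reviews, base_url="http://localhost:8000"):
    """Build a POST request for /api/analyze.

    The payload shape ({"reviews": [...]}) is an assumption based on the
    endpoint description; check the interactive docs for the real schema.
    """
    body = json.dumps({"reviews": reviews}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# With the backend running, send it with:
#   with urllib.request.urlopen(build_analyze_request(["Great app!"])) as resp:
#       print(json.load(resp))
```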
- `POST /api/sentiment/analyze` - Analyze sentiment of a single text with advanced features
- `POST /api/sentiment/batch` - Analyze sentiment of multiple texts
- `POST /api/gemini/sentiment` - Analyze sentiment using Google's Gemini API
- `POST /api/gemini/batch` - Batch process multiple texts with Gemini
- `POST /api/gemini/insights` - Extract insights from reviews using Gemini
- `GET /api/gemini/status` - Get detailed Gemini API status with performance metrics
- `POST /api/history` - Record an analysis in the history
- `GET /api/history` - Get analysis history
- `GET /api/history/{analysis_id}` - Get a specific analysis by ID
- `DELETE /api/history/{analysis_id}` - Delete an analysis history record
- `POST /api/timing/record` - Record processing time for an operation
- `GET /api/timing/estimate/{operation}` - Get estimated processing time
- `GET /api/timing/history` - Get processing time history
- `POST /api/auth/register` - Register a new user
- `POST /api/auth/login` - Login and get access token
- `GET /api/auth/me` - Get current user information
- `GET /api/auth/user` - Get user information from token
- `ws://localhost:8000/ws/batch-progress` - Real-time batch processing progress updates
- `ws://localhost:8000/ws/sentiment-progress` - Real-time sentiment analysis progress updates
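A client for the progress sockets might look like the sketch below; the message shape (`{"completed": ..., "total": ...}`) is an assumption, as is the use of the third-party `websockets` package for the connection:

```python
import json

def parse_progress(message):
    """Turn a progress message into a completion percentage.

    Assumes messages look like {"completed": 3, "total": 10};
    the real payload shape may differ.
    """
    data = json.loads(message)
    total = data.get("total") or 0
    if total <= 0:
        return 0.0
    return 100.0 * data.get("completed", 0) / total

async def watch_progress(url="ws://localhost:8000/ws/batch-progress"):
    """Print progress updates until the socket closes (requires `websockets`)."""
    import websockets  # third-party: pip install websockets
    async with websockets.connect(url) as ws:
        async for message in ws:
            print(f"{parse_progress(message):.0f}% complete")

# With the backend running: asyncio.run(watch_progress())
```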
- `MONGODB_URI` - MongoDB Atlas connection string
- `SECRET_KEY` - Secret key for JWT token generation
- `ACCESS_TOKEN_EXPIRE_MINUTES` - JWT token expiration time in minutes
- `GEMINI_API_KEY` - Google Gemini API key for enhanced analysis
- `GEMINI_MODEL` - Gemini model to use (default: gemini-2.0-flash)
- `GEMINI_BATCH_SIZE` - Number of reviews to process in each Gemini API batch (default: 10)
- `GEMINI_SLOW_THRESHOLD` - Threshold in seconds to detect slow Gemini API processing (default: 5)
- `PORT` - Server port (default: 8000)
- `HOST` - Server host (default: 0.0.0.0)
- `DEBUG` - Enable debug mode (default: False)
- `ENABLE_WEBSOCKETS` - Enable WebSocket support (default: True)
- `PARALLEL_PROCESSING` - Enable parallel processing (default: True)
- `MAX_WORKERS` - Maximum number of worker threads for parallel processing (default: 4)
- `BATCH_SIZE_MULTIPLIER` - Adjust all batch sizes (default: 1.0)
- `CIRCUIT_BREAKER_TIMEOUT` - Time before resetting circuit breaker (default: 300 seconds)
- `DEVELOPMENT_MODE` - Set to `false` for production environments (default: True)
- `FRONTEND_URL` - URL of the frontend for CORS configuration (default: varies by environment)
- `LOG_LEVEL` - Logging level (default: INFO)
- `GUNICORN_WORKERS` - Number of Gunicorn workers for production (default: 4)
- `GUNICORN_TIMEOUT` - Timeout for Gunicorn workers in seconds (default: 120)
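Reading these variables with their documented defaults could look like the sketch below — the `load_settings` helper is illustrative, not the project's actual config module:

```python
import os

def _env_bool(name, default):
    """Parse a boolean environment variable ('true'/'false', case-insensitive)."""
    return os.getenv(name, str(default)).strip().lower() in ("1", "true", "yes")

def load_settings():
    """Collect backend settings from the environment, using the documented defaults.

    Illustrative helper; the real config module may differ.
    """
    return {
        "gemini_model": os.getenv("GEMINI_MODEL", "gemini-2.0-flash"),
        "gemini_batch_size": int(os.getenv("GEMINI_BATCH_SIZE", "10")),
        "port": int(os.getenv("PORT", "8000")),
        "debug": _env_bool("DEBUG", False),
        "parallel_processing": _env_bool("PARALLEL_PROCESSING", True),
        "max_workers": int(os.getenv("MAX_WORKERS", "4")),
        "batch_size_multiplier": float(os.getenv("BATCH_SIZE_MULTIPLIER", "1.0")),
        "circuit_breaker_timeout": int(os.getenv("CIRCUIT_BREAKER_TIMEOUT", "300")),
    }
```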
Once the backend is running, visit http://localhost:8000/docs for the interactive API documentation.
The application includes advanced integration with Google's Gemini API for enhanced text analysis capabilities. For detailed documentation, see Gemini API Integration Documentation.
- Intelligent Batch Processing: Dynamically adjusts batch sizes based on review length
- Multi-Level Caching System: Implements LRU caching for faster responses
- Adaptive Request Throttling: Prevents rate limit errors by controlling request rates
- Robust JSON Parsing: Multiple fallback mechanisms for handling various response formats
- Circuit Breaker Pattern: Gracefully degrades to local processing when API is unavailable
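The circuit breaker idea can be sketched in a few lines. This is a simplified illustration, not the project's implementation — the failure threshold and reset timeout mirror the `CIRCUIT_BREAKER_TIMEOUT` setting, but the details are assumptions:

```python
import time

class CircuitBreaker:
    """Simplified circuit breaker: after `threshold` consecutive failures,
    skip the remote call until `timeout` seconds pass, then probe again."""

    def __init__(self, threshold=5, timeout=300.0):
        self.threshold = threshold
        self.timeout = timeout
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow_request(self):
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at >= self.timeout:
            # Half-open: let one request through to probe the API.
            self.opened_at = None
            self.failures = self.threshold - 1
            return True
        return False

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = time.monotonic()

def call_with_fallback(breaker, remote, local):
    """Use `remote` while the circuit is closed; otherwise fall back to `local`."""
    if breaker.allow_request():
        try:
            result = remote()
            breaker.record_success()
            return result
        except Exception:
            breaker.record_failure()
    return local()
```

In this app, `remote` would be a Gemini API call and `local` the Transformers-based sentiment pipeline, so analysis keeps working during outages instead of erroring.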
The Gemini API integration includes comprehensive performance monitoring accessible through the /api/gemini/status endpoint, which provides detailed metrics on:
- API response times
- Cache efficiency
- Request throttling
- Error rates
- Circuit breaker status
For implementation details, troubleshooting tips, and best practices, refer to the complete documentation.
The CSV file should contain the following columns:
- `text` (required) - The feedback content
- `username` (optional) - The user who provided the feedback
- `timestamp` (optional) - When the feedback was provided
- `rating` (optional) - A numerical rating (1-5)
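A quick pre-upload sanity check for these columns might look like the sketch below — the validation rules are inferred from the column list above, not taken from the backend:

```python
import csv
import io

REQUIRED = {"text"}

def validate_feedback_csv(content):
    """Return a list of problems found in a feedback CSV string (empty = OK).

    Rules are inferred from the documented column list; the backend's own
    validation may differ.
    """
    problems = []
    reader = csv.DictReader(io.StringIO(content))
    missing = REQUIRED - set(reader.fieldnames or [])
    if missing:
        problems.append(f"missing required column(s): {sorted(missing)}")
        return problems
    for i, row in enumerate(reader, start=2):  # header is line 1
        if not (row.get("text") or "").strip():
            problems.append(f"line {i}: empty text")
        rating = (row.get("rating") or "").strip()
        if rating and rating not in {"1", "2", "3", "4", "5"}:
            problems.append(f"line {i}: rating must be 1-5, got {rating!r}")
    return problems
```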
Example:

```csv
text,username,timestamp,rating
"The app keeps crashing whenever I try to upload photos",user1,2023-04-29,2
"Would love to have dark mode in the next update!",user2,2023-04-30,4
"Everything works flawlessly. Great job!",user3,2023-04-30,5
```

The application uses the following MongoDB collections:
- `users` - User accounts and authentication information
- `reviews` - Analyzed reviews and feedback
- `keywords` - Extracted keywords and their frequencies
- `analysis_history` - History of analysis operations with full summaries
- `processing_times` - Processing time records for estimation
- `weekly_summaries` - Weekly summaries for product prioritization
- `batch_progress` - Batch processing progress tracking
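Recording an analysis in `analysis_history` might look like this sketch; the document fields are assumptions based on the collection descriptions, and the PyMongo calls are shown in comments rather than executed:

```python
from datetime import datetime, timezone

def build_history_record(source, review_count, summary):
    """Build an analysis_history document.

    Field names are assumptions inferred from the collection descriptions;
    the real schema may differ.
    """
    return {
        "source": source,            # e.g. "csv_upload" or "play_store"
        "review_count": review_count,
        "summary": summary,
        "created_at": datetime.now(timezone.utc),
    }

# With a client connected via MONGODB_URI, inserting would look like:
#   from pymongo import MongoClient
#   db = MongoClient(os.environ["MONGODB_URI"]).get_default_database()
#   db.analysis_history.insert_one(build_history_record("csv_upload", 42, "..."))
```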
MIT