This project analyzes 10,000 Google Play Store reviews of the LinkedIn Android app (US, English only).
I collected reviews using the google-play-scraper package, applied sentiment analysis with VADER (NLTK), and visualized user sentiment trends.
The goal is to understand user sentiment patterns, identify pain points, and highlight how sentiment aligns with star ratings and time.
- Source: Google Play Store
- App: LinkedIn (Android)
- Scope: 10,000 reviews (US, English only)
- Fields collected:
reviewId,userName,rating,date,review,developer reply - Tools:
google-play-scraper(Python package)⚠️ Raw data is excluded from GitHub for size/privacy. You can re-run01_collect_reviews.ipynbto scrape reviews yourself.
- Data Collection (
01_collect_reviews.ipynb)- Scraped 10,000 reviews (US/EN) via Google Play API
- Deduplicated and cleaned text
- Sentiment Analysis (
02_sentiment_analysis.ipynb)- Pre-processed text (removed links, special characters, etc.)
- Applied NLTK VADER sentiment analyzer
- Classified reviews into
positive,neutral, ornegative
- Visualization & Insights
- Distribution of sentiment
- Weekly sentiment trend
- Relationship between sentiment and Google Play star ratings
- ~68% positive, ~18% negative, ~14% neutral
- Positive reviews often highlight networking and job features.
- Negative reviews commonly mention login issues, bugs, and app crashes.
- Sentiment fluctuates weekly, with dips around late August and early September.
- Indicates possible impact of app updates on user satisfaction.
- Strong alignment:
- 1★ reviews → mostly negative sentiment
- 5★ reviews → strongly positive
- Confirms VADER sentiment scores are consistent with explicit ratings.
- Login & stability issues drive a large portion of negative sentiment → critical area for improvement.
- Networking and job-related features are the app’s strongest drivers of positive sentiment.
- Monitoring sentiment trends over time can help detect the impact of feature updates or bugs.
- Reviews are self-selected and may not represent all users.
- Only English/US reviews were analyzed — excludes international/local perspectives.
- Future extensions:
- Multi-language sentiment models
- Topic modeling (LDA/BERT) for deeper theme extraction
- Longer-term time series analysis with changelog/event overlay
Clone this repository and install dependencies:
git clone https://github.com/aituar17/Review_Sentiment_Analysis.git
cd Review_Sentiment_Analysis
pip install -r requirements.txt01_collect_reviews.ipynb to fetch fresh reviews.
Review_Sentiment_Analysis/
├── data/ # Dataset folder (not uploaded to GitHub due to size; see data/README.md)
├── images/ # Saved plots for README
│ └── sentiment_distribution.png
│ └── sentiment_over_time.png
│ └── sentiment_by_rating.png
├── notebooks/ # Jupyter notebooks with full scraping and analysis
│ └── 01_collect_reviews.ipynb
│ └── 02_sentiment_analysis.ipynb
├── .gitignore # Ignore large files (e.g., dataset)
├── README.md # Project documentation
└── requirements.txt # Dependencies for reproducibility


