Socio-Economic News Sentiment Analysis

Overview

This project performs sentiment analysis on news articles about socio-economic issues in India. The goal is to gauge the sentiment and popularity of specific socio-economic topics covered in those articles. The pipeline has three stages: fetching articles, preprocessing the data, and scoring sentiment.

Libraries Used

GoogleNews (pygooglenews): A Python library to fetch news articles from Google News.
PyMongo: The Python driver for connecting to MongoDB (the stored data can then be browsed in MongoDB Compass).
TextBlob: A library built on NLTK for performing sentiment analysis on text data.

Installing Libraries

To get started with the project, you'll need to install the necessary libraries. Here are the installation steps for each library:

GoogleNews: The pygooglenews library can be installed using pip:
```
pip install pygooglenews
```
Pymongo: You can install the pymongo library using pip:
```
pip install pymongo
```
TextBlob (NLTK): To install the textblob library along with the necessary NLTK corpora, run the following commands:
```
pip install textblob
python -m textblob.download_corpora
```

Manually Installing pygooglenews:

If the pip command fails, you can install the pygooglenews package and its dependencies manually with the following steps:

curl -O https://files.pythonhosted.org/packages/3f/d5/695ef6cd1da80e090534562ba354bc72876438ae91d3693d6bd2afc947df/pygooglenews-0.1.2.tar.gz
tar -xvzf pygooglenews-0.1.2.tar.gz
cd pygooglenews-0.1.2
pip install feedparser --force-reinstall
pip install beautifulsoup4 --force-reinstall
pip install dateparser --force-reinstall
pip install requests --force-reinstall
pip install . --no-deps

Note: You might need to adjust the installation steps based on your interpreter.
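
After installation, a quick check can confirm that the three libraries are importable. The snippet below is a minimal sketch using only the standard library; the helper name `missing_modules` is made up for illustration and is not part of this repository.

```python
import importlib.util

# Import names of the three libraries listed above.
REQUIRED_MODULES = ["pygooglenews", "pymongo", "textblob"]


def missing_modules(names):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All required libraries are installed.")
```

If anything is reported as missing, rerun the matching pip command with the interpreter you intend to use.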

Approach

Fetching Articles:
- Use the GoogleNews library in the collection.py script to search for news articles related to socio-economic keywords in India.
- Fetch article information, including titles, URLs, and snippets.
- Script: collection.py
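
The fetching step above might look roughly like the sketch below. It is not the code from collection.py: the function names, the record field names, and the `when="7d"` time window are assumptions made for illustration. It also assumes each feed entry exposes `title`, `link`, and `summary` keys, as feedparser entries usually do.

```python
def entries_to_records(entries):
    """Keep only the title, URL, and snippet of each feed entry."""
    records = []
    for entry in entries:
        records.append({
            "title": entry.get("title", ""),
            "url": entry.get("link", ""),
            "snippet": entry.get("summary", ""),
        })
    return records


def fetch_articles(keyword, when="7d"):
    """Search the English-language India edition of Google News for `keyword`."""
    # Imported here so entries_to_records works without the library installed.
    from pygooglenews import GoogleNews

    gn = GoogleNews(lang="en", country="IN")
    result = gn.search(keyword, when=when)
    return entries_to_records(result["entries"])
```

A call such as `fetch_articles("unemployment")` needs network access and returns a list of dictionaries that can be passed on to the next stage.
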
Data Preprocessing:
- Utilize the summarizing.py script to summarize the collected articles.
- Perform data preprocessing to clean and prepare the collected articles for analysis. This may involve:
  - Removing HTML tags and irrelevant content.
  - Tokenization: Splitting text into words or phrases.
  - Removing stopwords (common words like "and," "the," "in") to reduce noise.
  - Lemmatization or stemming to reduce words to their base form.
  - Handling missing data, if any.
- Script: summarizing.py
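
As an illustration of the cleaning steps listed above, here is a minimal standard-library sketch. It is an assumption-laden stand-in, not the project's script: the stopword set is a tiny hand-picked sample (a real run would more likely use NLTK's stopword list), and lemmatization is left out.

```python
import html
import re

# Tiny illustrative stopword set; a real run would use a fuller list.
STOPWORDS = {"a", "an", "and", "the", "in", "of", "to", "is", "on", "for"}


def preprocess(text):
    """Strip HTML, lowercase, tokenize, and drop stopwords."""
    if not text:
        # Missing or empty articles yield no tokens.
        return []
    # Replace tags with spaces, then decode entities such as &amp;.
    text = html.unescape(re.sub(r"<[^>]+>", " ", text))
    # Keep runs of ASCII letters only; digits and punctuation are dropped.
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token not in STOPWORDS]
```

For example, `preprocess("<p>The economy and the markets in India</p>")` returns `["economy", "markets", "india"]`.
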
Finding Sentiment:
- Use the preprocessing.py script to preprocess the summarized data.
- Use the sentiment_analysis.py script to perform sentiment analysis on the preprocessed articles.
- Assign each article a sentiment score and a corresponding label (positive, negative, or neutral).
- Store the sentiment information in a new database for further analysis.
- Script (Preprocessing): preprocessing.py
- Script (Sentiment Analysis): sentiment_analysis.py
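
One way to turn a TextBlob polarity score into the three labels mentioned above is sketched here. TextBlob reports polarity in the range -1.0 to 1.0; the 0.05 threshold and both function names are assumptions for illustration, not values taken from sentiment_analysis.py.

```python
def label_from_polarity(polarity, threshold=0.05):
    """Map a polarity score in [-1, 1] to positive, negative, or neutral."""
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


def analyze(text):
    """Score `text` with TextBlob and attach a label."""
    # Imported here so label_from_polarity works without textblob installed.
    from textblob import TextBlob

    polarity = TextBlob(text).sentiment.polarity
    return {"polarity": polarity, "label": label_from_polarity(polarity)}
```

Widening the threshold makes more articles count as neutral, so it is worth tuning against a few hand-checked headlines.
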
User Interaction:
- Allow the user to input a socio-economic topic and choose whether they want to see sentiment analysis on an aggregate level or individual article level.
- Script: printing.py
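
The two viewing modes could be sketched as follows. This is a hypothetical outline rather than the contents of printing.py: the record fields (`title`, `polarity`, `label`) and the rule of matching a topic by searching article titles are assumptions.

```python
from collections import Counter


def aggregate_view(records):
    """Summarize a set of scored articles as counts and a mean polarity."""
    if not records:
        return {"count": 0, "mean_polarity": None, "labels": {}}
    mean = sum(r["polarity"] for r in records) / len(records)
    labels = dict(Counter(r["label"] for r in records))
    return {"count": len(records), "mean_polarity": mean, "labels": labels}


def individual_view(records):
    """Return one formatted line per article."""
    return [f'{r["title"]}: {r["label"]} ({r["polarity"]:+.2f})' for r in records]


def show_sentiment(records, topic, level):
    """Filter by topic (case-insensitive title match) and pick a view."""
    matching = [r for r in records if topic.lower() in r["title"].lower()]
    if level == "aggregate":
        return aggregate_view(matching)
    if level == "individual":
        return individual_view(matching)
    raise ValueError("level must be 'aggregate' or 'individual'")
```

In an interactive script, `topic` and `level` would come from `input()` prompts and the returned value would be printed.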

Implementation

After data preprocessing, loop through the articles and use TextBlob to analyze the sentiment of each article.
Store the sentiment scores along with article information (title, URL, etc.) in a new database using Pymongo.
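
The loop-and-store step described above might be sketched as below. The connection URI, the database and collection names, and the document fields are all placeholders chosen for illustration; the scoring function is passed in so the loop can be exercised without TextBlob or a running MongoDB server.

```python
def build_documents(articles, score):
    """Attach a polarity score to each article's title and URL."""
    docs = []
    for article in articles:
        # Score the snippet when present, otherwise fall back to the title.
        text = article.get("snippet") or article.get("title", "")
        docs.append({
            "title": article.get("title", ""),
            "url": article.get("url", ""),
            "polarity": score(text),
        })
    return docs


def textblob_polarity(text):
    """Polarity in [-1, 1] as reported by TextBlob."""
    from textblob import TextBlob

    return TextBlob(text).sentiment.polarity


def store_documents(docs, uri="mongodb://localhost:27017",
                    db_name="news_sentiment",
                    collection_name="article_sentiment"):
    """Insert `docs` into MongoDB and return how many were written."""
    from pymongo import MongoClient

    if not docs:
        # insert_many rejects an empty list, so skip the call.
        return 0
    client = MongoClient(uri)
    try:
        result = client[db_name][collection_name].insert_many(docs)
        return len(result.inserted_ids)
    finally:
        client.close()
```

A typical run would be `store_documents(build_documents(articles, textblob_polarity))`, where `articles` is the list produced by the earlier stages.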

Additional Consideration

Implement error handling for cases where sentiment analysis may be inaccurate or data retrieval fails.
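
As one possible shape for this error handling, the sketch below retries a failing retrieval call and skips articles that cannot be scored. Both helper names are hypothetical, and catching the broad `Exception` is a simplification; a real script would catch the specific errors raised by its network and database calls.

```python
import time


def with_retries(func, attempts=3, delay=0.0):
    """Call `func` up to `attempts` times, re-raising the last error."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except Exception as exc:  # simplification: narrow this in real code
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(delay)
    raise last_error


def safe_polarity(text, score):
    """Return score(text), or None when the text is empty or scoring fails."""
    if not text or not text.strip():
        return None
    try:
        return score(text)
    except Exception:  # simplification: narrow this in real code
        return None
```

Articles that come back as `None` can be logged and left out of aggregate statistics instead of being counted as neutral.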

This project aims to provide valuable insights into the sentiment surrounding socio-economic topics in Indian news articles. By analyzing the sentiment and popularity of these topics, it can contribute to a better understanding of public perception and media coverage.

Additional Resources

For more information about the pygooglenews library, visit the pygooglenews GitHub repository.
To learn about the newspaper library for article extraction, refer to the newspaper GitHub repository.

manyasiingh/Sentiment-Analysis