Welcome to the NLP Sentiment Analysis Tutorial repository! This project is designed to help you understand and implement sentiment analysis using Python and popular NLP libraries. It's beginner-friendly and provides a structured approach to learning sentiment analysis step-by-step.
```
nlp-sentiment_analysis/
├── main.ipynb          # Main notebook with code and explanations
├── utils.py            # Utility functions for text processing and other helpers
├── requirements.txt    # List of dependencies
└── README.md           # Project documentation
```
- Step-by-Step Guide: Walkthrough of key sentiment analysis concepts.
- Preprocessing Utilities: Helper functions in `utils.py`.
- Hands-On Notebook: `main.ipynb` contains a detailed tutorial and examples.
Make sure you have Python 3.7 or higher installed. You'll also need Jupyter Notebook or JupyterLab to run `main.ipynb`.
- Clone the Repository

  Clone the repository to your local machine:

  ```bash
  git clone https://github.com/jumarubea/nlp-sentiment_analysis.git
  cd nlp-sentiment_analysis
  ```

- Set Up a Virtual Environment (Optional but Recommended)

  Create a virtual environment to isolate dependencies:

  ```bash
  python -m venv venv
  source venv/bin/activate   # On macOS/Linux
  venv\Scripts\activate      # On Windows
  ```

- Install Dependencies

  Install the required Python libraries using `pip`:

  ```bash
  pip install -r requirements.txt
  ```

- Launch Jupyter Notebook

  Start the Jupyter Notebook server and open `main.ipynb`:

  ```bash
  jupyter notebook
  ```

  Navigate to the folder and click on `main.ipynb` to get started.
We are going to use the `twitter_samples` dataset and the English stopwords list, downloading them directly from NLTK:

```python
import nltk

nltk.download('twitter_samples')
nltk.download('stopwords')
```

- Data Loading: Import text data for analysis (examples provided in the notebook).
- Text Preprocessing: Use the helper functions in `utils.py` for tasks like tokenization, stopword removal, and lemmatization (a preprocessing sketch follows this list).
- Sentiment Analysis: Apply machine learning or rule-based methods to classify sentiments (see the classifier sketch below).
- Visualization: Plot sentiment distributions and other insights using tools like Matplotlib and Seaborn (see the plotting sketch below).
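
To give a flavour of the workflow, here is a minimal preprocessing sketch. It assumes the NLTK data above has already been downloaded; the `clean_tweet` helper is illustrative and may not match the exact function names in `utils.py`, and a Porter stemmer stands in for the lemmatization step so no extra NLTK downloads are required.

```python
import re
import string

from nltk.corpus import stopwords, twitter_samples
from nltk.stem import PorterStemmer
from nltk.tokenize import TweetTokenizer

# Load the labelled tweet samples shipped with NLTK
# (requires the nltk.download() calls shown earlier).
positive_tweets = twitter_samples.strings('positive_tweets.json')
negative_tweets = twitter_samples.strings('negative_tweets.json')

def clean_tweet(tweet):
    """Hypothetical helper: strip noise, tokenize, remove stopwords, and stem."""
    tweet = re.sub(r'https?://\S+', '', tweet)   # remove URLs
    tweet = re.sub(r'#', '', tweet)              # keep hashtag text, drop the '#'
    tokenizer = TweetTokenizer(preserve_case=False, strip_handles=True, reduce_len=True)
    stop_words = set(stopwords.words('english'))
    stemmer = PorterStemmer()
    return [stemmer.stem(token) for token in tokenizer.tokenize(tweet)
            if token not in stop_words and token not in string.punctuation]

print(clean_tweet(positive_tweets[0]))
```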
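
For the classification step, one simple option (a sketch only, not necessarily the method used in the notebook) is NLTK's built-in Naive Bayes classifier trained on bag-of-words features. It reuses the hypothetical `clean_tweet` helper from the sketch above.

```python
import random

import nltk
from nltk.corpus import twitter_samples

# Build a labelled dataset from the NLTK tweet samples and shuffle it.
tweets = ([(text, 'pos') for text in twitter_samples.strings('positive_tweets.json')] +
          [(text, 'neg') for text in twitter_samples.strings('negative_tweets.json')])
random.shuffle(tweets)

def bag_of_words(tokens):
    """Convert a token list into the boolean feature dict NLTK classifiers expect."""
    return {token: True for token in tokens}

# clean_tweet is the preprocessing helper sketched above.
featuresets = [(bag_of_words(clean_tweet(text)), label) for text, label in tweets]
split = int(0.8 * len(featuresets))
train_set, test_set = featuresets[:split], featuresets[split:]

classifier = nltk.NaiveBayesClassifier.train(train_set)
print('Accuracy:', nltk.classify.accuracy(classifier, test_set))
classifier.show_most_informative_features(10)
```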
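
And a minimal plotting sketch for the visualization step, assuming Matplotlib is installed (the notebook may use Seaborn or different charts):

```python
import matplotlib.pyplot as plt
from nltk.corpus import twitter_samples

# Count how many tweets fall into each class in the sample corpus.
counts = {
    'positive': len(twitter_samples.strings('positive_tweets.json')),
    'negative': len(twitter_samples.strings('negative_tweets.json')),
}

plt.bar(list(counts.keys()), list(counts.values()), color=['tab:green', 'tab:red'])
plt.title('Class distribution in the twitter_samples corpus')
plt.ylabel('Number of tweets')
plt.show()
```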
This project uses the following Python libraries:

- `nltk` - For natural language processing.
- `numpy` - For numerical computations.
- `pandas` - For data manipulation and analysis.
- `pprint` - For pretty-printing JSON-like data (part of the Python standard library).
To install these, run:
```bash
pip install -r requirements.txt
```

Contributions are welcome! If you’d like to improve this tutorial or fix issues, feel free to fork the repository and submit a pull request.
Special thanks to the open-source community for the libraries used in this project.