An end-to-end Natural Language Processing (NLP) project that analyzes student feedback text and classifies sentiment as Positive, Neutral, or Negative using TF-IDF and Logistic Regression.
This project demonstrates the complete NLP pipeline: data preprocessing, feature extraction, supervised text classification, explainability, and deployment-ready inference.
- NLP-based sentiment classification on real student feedback
- TF-IDF vectorization for text feature extraction
- Logistic Regression with class balancing
- Explainable AI using feature coefficient analysis
- Flask-based inference web application
- Deployment-ready project structure
- Python
- pandas
- scikit-learn
- TF-IDF Vectorizer
- Logistic Regression
- Flask
- HTML / CSS
- Gunicorn (deployment-ready)
```
student-feedback-analyzer/
├── app/
│   ├── app.py
│   ├── sentiment_model.pkl
│   ├── vectorizer.pkl
│   ├── label_encoder.pkl
│   ├── templates/
│   │   └── index.html
│   └── static/
│       └── style.css
├── data/
│   └── finalDataset0.2.csv
├── model/
│   └── train.py
├── requirements.txt
└── README.md
```
```bash
git clone https://github.com/YOUR_USERNAME/student-feedback-analyzer.git
cd student-feedback-analyzer
pip install -r requirements.txt
cd app
python app.py
```

Open in browser: http://127.0.0.1:5000
- User enters student feedback text
- Text is transformed using the saved TF-IDF vectorizer
- Logistic Regression model predicts the sentiment class
- The numeric class is mapped to a label:
  - 0 → Negative
  - 1 → Neutral
  - 2 → Positive
- Result is displayed in the UI
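The inference steps above can be sketched end to end. This is a minimal, self-contained illustration: it fits a tiny toy model inline, whereas the real app loads the pickled `vectorizer.pkl` and `sentiment_model.pkl` artifacts instead.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy training data standing in for the saved artifacts (assumption:
# in the real app these objects are unpickled, not fit here).
texts = [
    "excellent teaching and clear explanations",
    "the course was okay, nothing special",
    "terrible lectures, very confusing",
]
labels = [2, 1, 0]  # 0 = Negative, 1 = Neutral, 2 = Positive

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)
model = LogisticRegression(class_weight="balanced").fit(X, labels)

LABELS = {0: "Negative", 1: "Neutral", 2: "Positive"}

def predict_sentiment(feedback: str) -> str:
    """Vectorize raw feedback, predict a class, and map it to its label."""
    pred = model.predict(vectorizer.transform([feedback]))[0]
    return LABELS[int(pred)]

print(predict_sentiment("excellent teaching and clear explanations"))
```

The Flask route in `app.py` wraps exactly this function: read the form text, call the predictor, render the label in the template.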
The model’s predictions are interpretable by analyzing TF-IDF feature coefficients. Key words contributing to each sentiment class were extracted to validate model reasoning.
This project is deployment-ready and can be hosted on platforms like Render.
Build command:

```bash
pip install -r requirements.txt
```

Start command:

```bash
gunicorn app.app:app
```
- Training and inference are fully separated
- Model artifacts are versioned for reproducibility
- Designed for clone-and-deploy usage
This sentiment analyzer uses a TF-IDF + Logistic Regression pipeline, which provides fast, interpretable, and deployment-friendly NLP inference.
- Bag-of-words models do not fully capture semantic context or negation.
- Phrases like "not good" or "very bad" may be misclassified in rare cases.
- Minority sentiment classes have limited samples in the dataset, affecting recall.
- Class-weighted Logistic Regression to address imbalance.
- Bigram features (1–2 grams) to improve handling of negation and sentiment phrases.
- Utilize all textual feedback columns by combining them into a unified input.
- Explore hybrid models combining text and structured features.
- Evaluate transformer-based models (e.g., BERT) for deeper semantic understanding.
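The first two improvements can be combined in a single scikit-learn pipeline. This is a hypothetical configuration sketch, not the committed training script; column names and data are illustrative.

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Unigrams + bigrams let the model see negation phrases such as
# "not good" as single features; class_weight="balanced" reweights
# the loss to compensate for minority sentiment classes.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
])

# Tiny illustrative fit (assumption: real training uses the dataset CSV).
texts = ["not good at all", "very good lectures", "it was fine"]
pipeline.fit(texts, [0, 2, 1])
```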
This repository includes an experimental branch that explores improving sentiment prediction by utilizing all available textual feedback fields in the dataset.
Branch Name: feature/full-text-combination
The baseline model was trained using a single high-signal feedback column to establish a clean and interpretable NLP pipeline. However, the dataset contains multiple complementary text fields (e.g., teaching, course content, lab work, extracurricular feedback), which provide additional context about student experience.
To better leverage this information, an experimental branch was created to combine all textual inputs into a unified document for model training and inference.
- Combined multiple text columns into a single `combined_text` feature
- Retained the same sentiment target (teaching) to avoid label ambiguity
- Reused the same NLP pipeline: TF-IDF vectorization with unigram + bigram features and class-weighted Logistic Regression
- Updated the Flask inference app to accept multiple feedback inputs and combine them consistently with training
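The `combined_text` construction can be sketched with pandas. The column names below are hypothetical; the real dataset (`finalDataset0.2.csv`) may use different headers.

```python
import pandas as pd

# Illustrative rows; column names are assumptions, not the real schema.
df = pd.DataFrame({
    "teaching_feedback": ["clear lectures", "confusing pace"],
    "course_feedback": ["useful material", "outdated slides"],
    "lab_feedback": ["well organized labs", "labs felt rushed"],
})

text_cols = ["teaching_feedback", "course_feedback", "lab_feedback"]

# Join all text fields into one document per student, replacing NaNs
# with empty strings, so training and inference apply the same
# concatenation order.
df["combined_text"] = (
    df[text_cols].fillna("").agg(" ".join, axis=1).str.strip()
)

print(df["combined_text"].iloc[0])
```

Applying the identical join at inference time (in the Flask form handler) is what keeps the experimental model's inputs consistent with training.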
- Improved contextual understanding of feedback
- Better handling of mixed-sentiment statements
- More realistic behavior for negative and neutral feedback cases
The baseline model remains available on the main branch for simplicity and stability, while this branch serves as a documented enhancement and experimentation path.
This branching approach reflects real-world ML development practices:
- Stable baseline maintained on main
- Experimental improvements isolated in feature branches
- Trade-offs documented rather than hidden
This project follows an iterative ML development approach, balancing deployable baselines with documented experimentation using Git branching.