NLP & CV Classification Project: Alcoholic Beverage Text Detection

1. Overview

This project simulates a real-world classification problem: given a set of images, build a classifier that detects those containing superimposed text related to alcoholic beverages. The work covers data exploration, analysis, and the construction of a classification pipeline evaluated with precision, recall, F1-score, and accuracy.

Task Objectives

- Explore the dataset and understand the inputs/outputs.
- Design and explain a classification pipeline architecture.
- Suggest and use appropriate metrics for performance assessment.
- Implement the pipeline using the provided data.
- Report results and share valuable findings.

Dataset

A set of images containing text, split into two folders: Alcohol and Non-alcohol, based on whether the text mentions alcohol.

2. How to Run

Prerequisites

Python 3.11
Install dependencies:
```
pip install -r requirements.txt
```

Running the Classifier

Place your images in a folder (e.g., Images/).
Run the main script to classify images:
```
python run.py Images
```
The script will output a results.csv file with two columns: file_name and prediction (either alcohol or nonalcohol).

Example Output

file_name,prediction
image1.jpg,alcohol
image2.jpg,nonalcohol
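
As a rough illustration of the output format above, the sketch below writes predictions as CSV with Python's standard `csv` module. The function name `write_results` and its arguments are hypothetical and are not taken from run.py.

```python
import csv
import io
from typing import Iterable, TextIO, Tuple


def write_results(predictions: Iterable[Tuple[str, str]], out: TextIO) -> None:
    """Write (file_name, prediction) pairs as CSV with a header row.

    Hypothetical helper: the real run.py may build results.csv differently.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["file_name", "prediction"])
    for file_name, label in predictions:
        writer.writerow([file_name, label])


# Usage: write to an in-memory buffer; a real run would open results.csv instead.
buffer = io.StringIO()
write_results([("image1.jpg", "alcohol"), ("image2.jpg", "nonalcohol")], buffer)
print(buffer.getvalue())
```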

3. Main Analysis

Data Exploration and Problem Identification

The dataset consists of images with superimposed text, divided into Alcohol and Non-alcohol categories.
Example: An image with the text "Don't miss limited edition nightlife bourbon event" is labeled as alcohol, while "energy drink" is labeled as nonalcohol.
The initial data exploration revealed several challenging issues that needed to be addressed:
- Inverted/flipped images with upside-down or mirrored text
- Poor text-background contrast where text color was too similar to background
- Text visibility issues with blurred or unclear text
- Word splitting/merging errors (e.g., "wat er" instead of "water")
- Character recognition errors with numbers appearing instead of letters
- Irrelevant text content that could confuse classification

Image Processing Solutions

To address image-level challenges, a comprehensive preprocessing pipeline was developed:

- Image orientation detection: For each image, both original and flipped versions were processed, with selection based on OCR confidence scores
- Color space separation: Images were converted to RGB and YUV color spaces to improve text visibility in cases of poor contrast
- Multi-variant processing: Each image was processed through multiple combinations (original/flipped × RGB/YUV color spaces) with the highest confidence result selected
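
The variant selection described above can be sketched roughly as follows. This is a minimal, dependency-free illustration, not the code in extract_text.py: images are plain nested lists of RGB tuples, "flipped" is assumed to mean a 180-degree rotation, the YUV conversion uses the common BT.601 weights, and `ocr` is a hypothetical callable returning `(text, confidence)` that stands in for whichever OCR engine the project uses.

```python
from typing import Callable, List, Tuple

Pixel = Tuple[float, float, float]
Image = List[List[Pixel]]
OcrFn = Callable[[Image], Tuple[str, float]]  # hypothetical OCR interface


def rotate_180(img: Image) -> Image:
    """Reverse row order and column order (assumed meaning of 'flipped')."""
    return [row[::-1] for row in img[::-1]]


def rgb_to_yuv(img: Image) -> Image:
    """Convert each RGB pixel to YUV using BT.601 luma weights."""
    out = []
    for row in img:
        new_row = []
        for r, g, b in row:
            y = 0.299 * r + 0.587 * g + 0.114 * b
            new_row.append((y, 0.492 * (b - y), 0.877 * (r - y)))
        out.append(new_row)
    return out


def select_best_variant(img: Image, ocr: OcrFn) -> Tuple[str, str, float]:
    """Run OCR on every orientation x color-space variant, keep the most confident."""
    results = []
    for orient_name, oriented in (("original", img), ("flipped", rotate_180(img))):
        for space_name, variant in (("rgb", oriented), ("yuv", rgb_to_yuv(oriented))):
            text, confidence = ocr(variant)
            results.append((f"{orient_name}/{space_name}", text, confidence))
    # max() keeps the first variant on ties, so the original orientation wins a draw.
    return max(results, key=lambda item: item[2])


def fake_ocr(image: Image) -> Tuple[str, float]:
    """Stand-in OCR for the demo: 'reads' text only when the top-left pixel is blue."""
    if image[0][0] == (0, 0, 255):
        return "bourbon", 0.9
    return "", 0.1


upside_down = [[(255, 0, 0)], [(0, 0, 255)]]
print(select_best_variant(upside_down, fake_ocr))
```

Passing the OCR engine in as a callable keeps the selection logic independent of any particular OCR library.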

Text Processing and Cleaning

Extracted text underwent several cleaning steps:

- Case normalization: All text converted to lowercase
- Character mapping: Numbers mapped to likely letter equivalents (e.g., '1' → 'i' or 'l')
- Punctuation and whitespace cleanup: Removed irrelevant characters and normalized spacing
- Noise reduction: Filtered out non-relevant text content
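
A minimal sketch of the first three cleaning steps is shown below. The digit-to-letter table is an assumption made for illustration (the source only gives '1' → 'i' or 'l', and 'i' is chosen here arbitrarily); the actual mapping in preprocess_text.py may differ, and noise filtering is not shown.

```python
import re

# Assumed look-alike mapping; only the '1' entry is suggested by the text above.
DIGIT_TO_LETTER = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a", "5": "s"})


def clean_text(raw: str) -> str:
    """Lowercase, map look-alike digits to letters, drop punctuation, squeeze spaces."""
    text = raw.lower()
    text = text.translate(DIGIT_TO_LETTER)
    text = re.sub(r"[^a-z\s]", " ", text)  # remaining digits and punctuation become spaces
    return re.sub(r"\s+", " ", text).strip()


print(clean_text("B0URB0N!!  Event"))
```

A fixed one-to-one table cannot resolve the 'i' versus 'l' ambiguity; fuzzy matching at the classification stage is one way to absorb the resulting misspellings.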

Classification Approach

A hybrid classification system was implemented:

- Keyword-based classifier: Used curated lists of alcohol-related and non-alcohol-related terms with fuzzy string matching
- Neural network classifier: Fine-tuned DistilBERT model for text classification
- Hybrid logic: Combined both approaches, using keyword matching for clear cases and falling back to the neural network for ambiguous cases
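
The hybrid logic above might look roughly like the sketch below. It uses `difflib.get_close_matches` from the standard library for fuzzy matching; the term lists, the 0.8 cutoff, the rule for what counts as a "clear case", and the `fallback` callable (a stand-in for the fine-tuned DistilBERT model) are all assumptions for illustration rather than the project's actual settings.

```python
from difflib import get_close_matches
from typing import Callable, List

# Illustrative term lists; the project's curated lists are not shown in this README.
ALCOHOL_TERMS = ["beer", "wine", "bourbon", "vodka", "whiskey"]
NON_ALCOHOL_TERMS = ["water", "juice", "energy", "soda", "coffee"]


def count_hits(words: List[str], terms: List[str], cutoff: float = 0.8) -> int:
    """Count words that fuzzily match at least one term in the list."""
    return sum(1 for w in words if get_close_matches(w, terms, n=1, cutoff=cutoff))


def classify(text: str, fallback: Callable[[str], str]) -> str:
    """Use keywords when only one list matches; otherwise defer to the fallback model."""
    words = text.split()
    alcohol_hits = count_hits(words, ALCOHOL_TERMS)
    other_hits = count_hits(words, NON_ALCOHOL_TERMS)
    if alcohol_hits and not other_hits:
        return "alcohol"
    if other_hits and not alcohol_hits:
        return "nonalcohol"
    # No keyword evidence, or conflicting evidence: treat as ambiguous.
    return fallback(text)


def stub_model(text: str) -> str:
    """Placeholder for the neural classifier; always predicts 'nonalcohol'."""
    return "nonalcohol"


print(classify("limited edition bourban event", stub_model))
print(classify("hello world", stub_model))
```

Because the misspelled "bourban" is close enough to "bourbon", the first call is resolved by keywords alone; the second has no keyword evidence and goes to the fallback.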

Performance and Results

The final pipeline showed significant improvements across all metrics:

- OCR quality: Substantial reduction in images returning "no_text" results
- Average confidence scores: Marked improvement in OCR confidence across all images
- Classification metrics: Strong performance on precision, recall, F1-score, and accuracy
- Robustness: Effective handling of various image quality issues and text extraction challenges
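
For reference, the four classification metrics named above can be computed from predicted and true labels as in the sketch below. It assumes `alcohol` is treated as the positive class; the function name `binary_metrics` is hypothetical, the labels in the usage example are made up to show the arithmetic, and evaluate_pics.py may compute these differently.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[str], y_pred: Sequence[str],
                   positive: str = "alcohol") -> Dict[str, float]:
    """Compute precision, recall, F1, and accuracy for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = correct / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Toy labels, not project results.
truth = ["alcohol", "alcohol", "nonalcohol", "nonalcohol"]
preds = ["alcohol", "nonalcohol", "nonalcohol", "alcohol"]
print(binary_metrics(truth, preds))
```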

The majority of development time was invested in image processing and text extraction optimization, which proved critical for achieving good classification results.

Notes

The pipeline is CPU-only and does not require a GPU.

