This project implements an Optical Character Recognition (OCR) pipeline specifically designed for recognizing handwritten Kannada text from images.
- Image Denoising: Uses OpenCV's Non-Local Means Denoising to clean input images.
- Line/Paragraph Segmentation: Employs a deep learning model (`paragraph_dl.py`) to identify lines of text within the image.
- Character Segmentation: Segments individual characters from the detected lines, using techniques likely implemented in `preprocessing/segmentation.py`.
- Character Classification: Uses a pre-trained PyTorch model (`models/classifier.py` with weights from `checkpoints/classifier.pth`) to classify segmented character images.
- Text Reconstruction: Maps classified characters to their corresponding Unicode representation using `online_mapping.csv` and reconstructs the text line by line.
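The text reconstruction step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names `class_index` and `unicode` and the helpers `load_mapping` and `reconstruct_line` are hypothetical, since the real layout of `online_mapping.csv` is not described here.

```python
import csv
import io


def load_mapping(csv_file):
    """Read a class-index -> Unicode-string table from an open CSV file.

    Assumes (hypothetically) a header row with the columns
    'class_index' and 'unicode'; adjust to the real file layout.
    """
    reader = csv.DictReader(csv_file)
    return {int(row["class_index"]): row["unicode"] for row in reader}


def reconstruct_line(class_indices, mapping, unknown="?"):
    """Join predicted class indices into one line of text.

    Indices missing from the mapping become `unknown`, so a gap in the
    table shows up in the output instead of raising an error.
    """
    return "".join(mapping.get(i, unknown) for i in class_indices)


# In-memory stand-in for online_mapping.csv with two Kannada letters.
sample_csv = io.StringIO("class_index,unicode\n0,ಕ\n1,ನ\n")
mapping = load_mapping(sample_csv)

# Index 7 is deliberately absent from the table.
line_text = reconstruct_line([0, 1, 7], mapping)
```

When reading the real file, opening it with `open(path, encoding="utf-8", newline="")` keeps the Kannada characters intact regardless of the platform's default encoding.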
The main processing logic is orchestrated in main.py:
- Load an input Kannada image (e.g., `kannada.jpg`).
- Apply image denoising.
- Use the `process_lines` function (from `paragraph_dl.py`) to detect bounding boxes grouped by lines.
- Iterate through each detected line:
  - Sort bounding boxes within the line from left to right.
  - For each bounding box:
    - Crop the region from the denoised image.
    - Segment characters within the cropped region using the `Segmenter`.
    - Classify each segmented character using the `CharacterClassifier`.
    - (Partially Implemented) Map the classification result to a Unicode character.
    - Append the character to the current line's text.
- Write the recognized text for all lines to `transliteration.txt`.
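The per-line loop above can be sketched in plain Python. The box format `(x, y, w, h)` and the helper names `sort_boxes_left_to_right` and `recognize_lines` are assumptions made for illustration; the actual output of `process_lines` may be shaped differently. The crop, segment, classify, and map steps are collapsed into a single `read_box` callable so the sketch runs without the models.

```python
def sort_boxes_left_to_right(boxes):
    """Order (x, y, w, h) boxes by their left edge (the x coordinate)."""
    return sorted(boxes, key=lambda box: box[0])


def recognize_lines(lines, read_box):
    """Turn per-line box groups into one text string per line.

    `lines` is a list of box lists, one list per detected text line.
    `read_box` is any callable returning the text for one box; in the
    real pipeline it would crop, segment, classify, and map to Unicode.
    """
    texts = []
    for boxes in lines:
        ordered = sort_boxes_left_to_right(boxes)
        texts.append("".join(read_box(box) for box in ordered))
    return texts


# Toy data: two lines, with the first line's boxes listed out of order.
lines = [
    [(120, 10, 40, 30), (20, 12, 40, 30)],
    [(15, 60, 40, 30)],
]
fake_text = {
    (20, 12, 40, 30): "ಕ",
    (120, 10, 40, 30): "ನ",
    (15, 60, 40, 30): "ಡ",
}
result = recognize_lines(lines, lambda box: fake_text[box])
```

Passing the recognizer in as a callable keeps the ordering logic testable on its own, separately from the segmentation and classification models.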
(Assumed setup - requires verification)
- Clone the repository:

      git clone <repository-url>
      cd <repository-directory>

- Create a Python environment (recommended):

      python -m venv venv
      source venv/bin/activate   # Linux/macOS
      # or
      venv\Scripts\activate      # Windows

- Install dependencies:
  - You will likely need Python 3.x.
  - Install required libraries (a `requirements.txt` file would be ideal, but based on imports, you'll need at least):

        pip install opencv-python torch pandas
        # Add any other specific dependencies required by the models or processing scripts

  - Ensure necessary CUDA libraries are set up if using a GPU with PyTorch (the `export LD_LIBRARY_PATH...` line in `main.py` suggests CUDA 12.2 might be needed).
- Download Checkpoints: Ensure the character classifier checkpoint `checkpoints/classifier.pth` is available. If not provided, you might need to train the model first.
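Before running the pipeline, it can help to confirm that the files it depends on are in place. The following is a small sketch of such a check; the helper `find_missing_files` and the `REQUIRED` list are hypothetical and not part of the repository. The demonstration uses a temporary directory so it runs anywhere.

```python
from pathlib import Path
import tempfile


def find_missing_files(root, required):
    """Return the required relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in required if not (root / rel).is_file()]


# Files the README says the pipeline reads at run time.
REQUIRED = ["checkpoints/classifier.pth", "online_mapping.csv"]

# Demonstration: a temporary directory that contains only the CSV,
# so the checkpoint should be reported as missing.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "online_mapping.csv").write_text("", encoding="utf-8")
    missing = find_missing_files(tmp, REQUIRED)
```

In the repository itself, the call would be `find_missing_files(".", REQUIRED)`, run from the root directory.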
- Place the Kannada image file you want to process in the root directory (or modify the `process_image` call in `main.py` to point to the correct path). The current default is `kannada.jpg`.
- Run the main script:

      python main.py

- The recognized text will be saved in `transliteration.txt`.
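Because the output contains Kannada characters, it is safest to write and read it with an explicit UTF-8 encoding rather than relying on the platform default. The sketch below shows one way to do that; `write_lines` and `read_lines` are hypothetical helpers, and how `main.py` actually writes the file is not confirmed here.

```python
from pathlib import Path
import tempfile


def write_lines(path, lines):
    """Write one recognized line per row, encoded as UTF-8."""
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def read_lines(path):
    """Read the recognized lines back as a list of strings."""
    return Path(path).read_text(encoding="utf-8").splitlines()


# Round trip through a temporary file standing in for transliteration.txt.
with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "transliteration.txt"
    write_lines(out_path, ["ಕನ", "ಡ"])
    recovered = read_lines(out_path)
```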
.
├── checkpoints/
│ └── classifier.pth # Pre-trained character classifier model
├── configs/ # Configuration files (if any)
├── data/ # Data files (training/testing - structure unknown)
├── logs/ # Log files (if any generated)
├── models/
│ ├── classifier.py # Character classification model definition
│ └── ... # Other model-related files
├── preprocessing/
│ ├── segmentation.py # Character segmentation logic
│ └── ... # Other preprocessing scripts
├── scripts/ # Helper scripts (e.g., training, evaluation)
├── utils/ # Utility functions
├── kannada.jpg # Example input image
├── main.py # Main script orchestrating the OCR pipeline
├── paragraph_dl.py # Line/paragraph detection script
├── online_mapping.csv # Mapping from classifier output to Unicode
├── transliteration.txt # Output file with recognized text
├── README.md # This file
├── .gitignore # Git ignore rules
└── ... # Other scripts and output images from experiments
- The mapping from the classifier output to Unicode characters in `main.py` (`unicode_character = ...`) needs to be completed.
- The handling of line/box detection output (`process_lines` result vs. `lines` variable) in `main.py` might need clarification or fixing.
- Remove debugging code (`pdb.set_trace()`) from `main.py`.
- Remove the `export LD_LIBRARY_PATH...` line from `main.py`.
- Consider creating a `requirements.txt` file.
- Organize output images (`.jpg` files) into a separate directory.
- Consider moving core Python scripts into a source directory (e.g., `src/`).