Manojbhat09/kannadaink

Kannada Handwritten OCR Pipeline

This project implements an Optical Character Recognition (OCR) pipeline for recognizing handwritten Kannada text in images.

Features

  • Image Denoising: Uses OpenCV's Non-Local Means Denoising to clean input images.
  • Line/Paragraph Segmentation: Employs a deep learning model (paragraph_dl.py) to identify lines of text within the image.
  • Character Segmentation: Segments individual characters from the detected lines using the Segmenter (the logic appears to live in preprocessing/segmentation.py).
  • Character Classification: Uses a pre-trained PyTorch model (models/classifier.py with weights from checkpoints/classifier.pth) to classify segmented character images.
  • Text Reconstruction: Maps classified characters to their corresponding Unicode representation using online_mapping.csv and reconstructs the text line by line.

Workflow

The main processing logic is orchestrated in main.py:

  1. Load an input Kannada image (e.g., kannada.jpg).
  2. Apply image denoising.
  3. Use the process_lines function (from paragraph_dl.py) to detect bounding boxes grouped by lines.
  4. Iterate through each detected line:
    • Sort bounding boxes within the line from left to right.
    • For each bounding box:
      • Crop the region from the denoised image.
      • Segment characters within the cropped region using the Segmenter.
      • Classify each segmented character using the CharacterClassifier.
      • (Partially Implemented) Map the classification result to a Unicode character.
      • Append the character to the current line's text.
  5. Write the recognized text for all lines to transliteration.txt.
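The loop in step 4 can be sketched as follows. This is a minimal illustration, not the code in main.py: the box layout (x, y, width, height), the `segment` and `classify` callables (stand-ins for the Segmenter and CharacterClassifier, whose real interfaces are not documented here), and the use of a nested list instead of an image array are all assumptions made to keep the example self-contained.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Assumed box layout: (x, y, width, height). The real output of
# process_lines may use a different format.
Box = Tuple[int, int, int, int]


def recognize_lines(
    image: Sequence[Sequence[int]],
    lines: List[List[Box]],
    segment: Callable,
    classify: Callable,
    class_to_char: Dict[int, str],
) -> List[str]:
    """Return one recognized string per detected line."""
    output = []
    for boxes in lines:
        chars = []
        # Reading order: sort the boxes of a line by their left edge.
        for x, y, w, h in sorted(boxes, key=lambda box: box[0]):
            # Crop the box region from the (denoised) image.
            crop = [row[x:x + w] for row in image[y:y + h]]
            # Segment the crop into glyphs, classify each, map to text.
            for glyph in segment(crop):
                chars.append(class_to_char.get(classify(glyph), "?"))
        output.append("".join(chars))
    return output


# Toy data: pixel values double as class ids so the stubs stay trivial.
image = [[1, 1, 0, 0],
         [1, 1, 0, 0]]
lines = [[(2, 0, 2, 2), (0, 0, 2, 2)]]  # one line, boxes out of order
result = recognize_lines(
    image,
    lines,
    segment=lambda crop: [crop],          # hypothetical: one glyph per box
    classify=lambda glyph: glyph[0][0],   # hypothetical: read the class id
    class_to_char={0: "ಕ", 1: "ಅ"},
)
print(result)  # ['ಅಕ']
```

Step 5 then amounts to joining the returned strings with newlines and writing them to transliteration.txt with UTF-8 encoding.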

Setup

(Assumed setup - requires verification)

  1. Clone the repository:
    git clone <repository-url>
    cd <repository-directory>
  2. Create a Python virtual environment (recommended):
    python -m venv venv
    source venv/bin/activate # Linux/macOS
    # or
    venv\Scripts\activate # Windows
  3. Install dependencies:
    • You will likely need Python 3.x.
    • Install the required libraries. There is no requirements.txt yet, so based on the imports you will need at least:
      pip install opencv-python torch pandas
      # Add any other specific dependencies required by the models or processing scripts
    • If you run PyTorch on a GPU, make sure the matching CUDA libraries are installed (the export LD_LIBRARY_PATH... line in main.py suggests CUDA 12.2 might be needed).
  4. Download Checkpoints: Ensure the character classifier checkpoint checkpoints/classifier.pth is available. If not provided, you might need to train the model first.
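Because this setup is unverified, a quick sanity check before the first run can help. The script below is a hypothetical helper, not part of the repository. It only tests whether the three packages named above can be found and whether the checkpoint file exists; the list of modules may be incomplete.

```python
import importlib.util
from pathlib import Path

# Import names for the packages installed above
# (the opencv-python package is imported as cv2).
REQUIRED_MODULES = ("cv2", "torch", "pandas")
CHECKPOINT = Path("checkpoints/classifier.pth")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def checkpoint_available(path=CHECKPOINT):
    """Return True if the classifier weights file exists."""
    return Path(path).is_file()


if __name__ == "__main__":
    print("Missing modules:", missing_modules() or "none")
    print("Checkpoint found:", checkpoint_available())
```

Run it from the repository root so that the relative checkpoint path resolves correctly.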

Usage

  1. Place the Kannada image file you want to process in the root directory (or modify the process_image call in main.py to point to the correct path). The current default is kannada.jpg.
  2. Run the main script:
    python main.py
  3. The recognized text will be saved in transliteration.txt.
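As an alternative to editing the process_image call by hand, the input path could be taken from the command line. The sketch below is a suggestion only: main.py does not currently accept arguments as far as this README describes, and the `--output` option is a hypothetical addition. The defaults mirror the file names used above.

```python
import argparse


def parse_args(argv=None):
    """Parse the image path and output path, defaulting to the README's names."""
    parser = argparse.ArgumentParser(description="Kannada handwritten OCR")
    parser.add_argument(
        "image",
        nargs="?",
        default="kannada.jpg",
        help="path to the input image (default: kannada.jpg)",
    )
    parser.add_argument(
        "--output",
        default="transliteration.txt",
        help="where to write the recognized text",
    )
    return parser.parse_args(argv)


# Example with an explicit argument list; a real script would call
# parse_args() with no arguments to read sys.argv.
args = parse_args(["scans/page1.jpg"])
print(args.image, args.output)  # scans/page1.jpg transliteration.txt
```

The parsed `args.image` value would then be passed to process_image in place of the hard-coded path.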

Project Structure

.
├── checkpoints/
│   └── classifier.pth      # Pre-trained character classifier model
├── configs/                # Configuration files (if any)
├── data/                   # Data files (training/testing - structure unknown)
├── logs/                   # Log files (if any generated)
├── models/
│   ├── classifier.py       # Character classification model definition
│   └── ...                 # Other model-related files
├── preprocessing/
│   ├── segmentation.py     # Character segmentation logic
│   └── ...                 # Other preprocessing scripts
├── scripts/                # Helper scripts (e.g., training, evaluation)
├── utils/                  # Utility functions
├── kannada.jpg             # Example input image
├── main.py                 # Main script orchestrating the OCR pipeline
├── paragraph_dl.py         # Line/paragraph detection script
├── online_mapping.csv      # Mapping from classifier output to Unicode
├── transliteration.txt     # Output file with recognized text
├── README.md               # This file
├── .gitignore              # Git ignore rules
└── ...                     # Other scripts and output images from experiments

Known Issues / TODOs

  • The mapping from the classifier output to Unicode characters in main.py (unicode_character = ...) needs to be completed.
  • The handling of line/box detection output (process_lines result vs. lines variable) in main.py might need clarification or fixing.
  • Remove debugging code (pdb.set_trace()) from main.py.
  • Remove the export LD_LIBRARY_PATH... line from main.py.
  • Consider creating a requirements.txt file.
  • Organize output images (.jpg files) into a separate directory.
  • Consider moving core Python scripts into a source directory (e.g., src/).
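For the first item, one possible way to complete the Unicode mapping is a plain lookup table built from online_mapping.csv. The column names `class_id` and `unicode` below are assumptions, since the actual layout of the CSV is not documented here, and the standard csv module is used instead of pandas only to keep the sketch self-contained.

```python
import csv
import io


def load_mapping(csv_file, index_col="class_id", char_col="unicode"):
    """Build a {class index: character} dict from an open CSV file.

    The column names are hypothetical; adjust them to the real header
    of online_mapping.csv.
    """
    reader = csv.DictReader(csv_file)
    return {int(row[index_col]): row[char_col] for row in reader}


def to_unicode(class_id, mapping, fallback="\ufffd"):
    """Look up a predicted class, returning a replacement character if unknown."""
    return mapping.get(class_id, fallback)


# In-memory stand-in for online_mapping.csv with made-up rows.
sample = io.StringIO("class_id,unicode\n0,ಅ\n1,ಆ\n")
mapping = load_mapping(sample)
print(to_unicode(1, mapping))  # ಆ
```

With the real file, open it with `encoding="utf-8"` and `newline=""` before passing it to `load_mapping`, so the Kannada characters are read correctly.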
