A web application that extracts metadata from book images using OCR and outputs the results in JSON format.
- Extracts metadata from book cover images and title pages
- Supports multiple image uploads (batch processing)
- Extracts the following metadata:
  - Title
  - Authors
  - ISBN (10 or 13 digits)
  - Publishers
  - Publication date
  - Edition
- Responsive web interface with drag-and-drop support
- Outputs clean, structured JSON data
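
OCR output is noisy, so an ISBN candidate is worth checking before it goes into the JSON. The sketch below is not taken from this project's code; it shows one way to pull a 10- or 13-digit ISBN out of raw text using the standard ISBN check-digit rules. The function names (`normalize_isbn`, `is_valid_isbn10`, `is_valid_isbn13`, `find_isbn`) are hypothetical.

```python
import re
from typing import Optional

# Candidate pattern: 13 digits, or 9 digits plus a digit/X, with optional
# spaces or hyphens between the characters (as printed on most covers).
_CANDIDATE = re.compile(r"(?:\d[\s-]?){12}\d|(?:\d[\s-]?){9}[\dXx]")


def normalize_isbn(raw: str) -> str:
    """Drop spaces and hyphens and uppercase a trailing 'x'."""
    return re.sub(r"[\s-]", "", raw).upper()


def is_valid_isbn10(isbn: str) -> bool:
    """ISBN-10: weights 10..1, total must be divisible by 11 ('X' counts as 10)."""
    if not re.fullmatch(r"\d{9}[\dX]", isbn):
        return False
    total = sum((10 - i) * (10 if ch == "X" else int(ch)) for i, ch in enumerate(isbn))
    return total % 11 == 0


def is_valid_isbn13(isbn: str) -> bool:
    """ISBN-13: alternating weights 1 and 3, total must be divisible by 10."""
    if not re.fullmatch(r"\d{13}", isbn):
        return False
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(isbn))
    return total % 10 == 0


def find_isbn(text: str) -> Optional[str]:
    """Return the first candidate in `text` that passes its checksum, else None."""
    for match in _CANDIDATE.finditer(text):
        candidate = normalize_isbn(match.group())
        if is_valid_isbn13(candidate) or is_valid_isbn10(candidate):
            return candidate
    return None
```

Rejecting candidates that fail the checksum maps naturally onto the `"isbn": "string | null"` field: a misread digit yields `None` rather than a wrong identifier.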
The extracted metadata follows this schema:
{
  "title": "string | null",
  "authors": ["string"],
  "isbn": "string | null",
  "publishers": ["string"],
  "publication_date": "string | null",
  "edition": "string | null",
  "filename": "string"
}
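
For readers who want to consume or produce records in this shape from Python, the schema can be mirrored with a small dataclass. This is a minimal sketch under the assumption that missing scalar fields are emitted as `null` and missing list fields as empty lists; the class name `BookMetadata` and the `SCHEMA_KEYS` constant are hypothetical and not part of the project.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional

# Key order as listed in the schema above.
SCHEMA_KEYS = (
    "title", "authors", "isbn", "publishers",
    "publication_date", "edition", "filename",
)


@dataclass
class BookMetadata:
    """One extracted record; only `filename` is always known."""
    filename: str
    title: Optional[str] = None
    authors: List[str] = field(default_factory=list)
    isbn: Optional[str] = None
    publishers: List[str] = field(default_factory=list)
    publication_date: Optional[str] = None
    edition: Optional[str] = None

    def to_json(self) -> str:
        """Serialize with keys in schema order; None becomes JSON null."""
        record = asdict(self)
        ordered = {key: record[key] for key in SCHEMA_KEYS}
        return json.dumps(ordered, indent=2, ensure_ascii=False)
```

For example, `BookMetadata(filename="cover.jpg", title="Dune").to_json()` produces an object with `"title": "Dune"`, empty `authors` and `publishers` arrays, and `null` for the remaining optional fields.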
- Install Tesseract OCR
  - Windows: Download and install from UB Mannheim
  - macOS:
    brew install tesseract
  - Linux:
    sudo apt install tesseract-ocr
- Clone the repository
  git clone <repository-url>
  cd book_metadata_extractor
- Create a virtual environment (recommended)
  python -m venv venv
  source venv/bin/activate  # On Windows: .\venv\Scripts\activate
- Install Python dependencies
  pip install -r requirements.txt
- Start the application
  python app.py
- Open your web browser and visit http://localhost:5000
- Upload book images
  - Drag and drop images or click to select files
  - Click "Process Images" to extract metadata
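
To give a rough idea of what processing a batch involves, here is a hedged sketch of an OCR loop over several files. It is not the code in `app.py`. It assumes the `pytesseract` and Pillow packages from the dependency list, and the names `ocr_with_tesseract` and `process_batch` are hypothetical. The OCR function is passed in as a parameter so the loop can be tried without Tesseract installed.

```python
import os
from typing import Callable, Dict, Iterable, List, Optional


def ocr_with_tesseract(path: str) -> str:
    """Run Tesseract on one image file and return the recognized text."""
    # Imported lazily so the batch loop below works without the OCR stack.
    from PIL import Image
    import pytesseract

    with Image.open(path) as image:
        return pytesseract.image_to_string(image)


def process_batch(
    paths: Iterable[str],
    ocr: Callable[[str], str] = ocr_with_tesseract,
) -> List[Dict[str, Optional[str]]]:
    """OCR each file; a failure is recorded per file and does not stop the batch."""
    results: List[Dict[str, Optional[str]]] = []
    for path in paths:
        try:
            text, error = ocr(path), None
        except Exception as exc:  # e.g. unreadable image or missing Tesseract
            text, error = "", str(exc)
        results.append(
            {"filename": os.path.basename(path), "text": text, "error": error}
        )
    return results
```

Keeping failures per file means one corrupt upload does not discard the results for the rest of the batch.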
Requirements:
- Python 3.7+
- Tesseract OCR
- Flask
- pytesseract
- opencv-python
- Pillow
- python-dateutil
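
Before starting the app, it can help to confirm that everything in the list above is actually available. The following standard-library sketch is one way to do that; the function name `check_environment` is hypothetical, and the import names (`cv2` for opencv-python, `PIL` for Pillow, `dateutil` for python-dateutil) are the modules those packages install.

```python
import importlib.util
import shutil
import sys
from typing import Dict

# Package name as listed above -> module name it installs.
REQUIRED_MODULES = {
    "Flask": "flask",
    "pytesseract": "pytesseract",
    "opencv-python": "cv2",
    "Pillow": "PIL",
    "python-dateutil": "dateutil",
}


def check_environment() -> Dict[str, bool]:
    """Report which requirements are satisfied in the current environment."""
    report = {
        "Python 3.7+": sys.version_info >= (3, 7),
        # pytesseract needs the `tesseract` executable on PATH
        # (or a path configured explicitly).
        "Tesseract OCR": shutil.which("tesseract") is not None,
    }
    for package, module in REQUIRED_MODULES.items():
        # find_spec locates the module without importing it.
        report[package] = importlib.util.find_spec(module) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{'OK     ' if ok else 'MISSING'} {name}")
```

Run it inside the activated virtual environment so the check reflects the packages installed from `requirements.txt`.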
This project is open source and available under the MIT License.
Contributions are welcome! Please feel free to submit a Pull Request.