This project aims to automate the extraction of images from well completion report PDFs and classify them into predefined categories using machine learning techniques.
This project automates the extraction and classification of images from well construction report PDFs using machine learning and natural language processing (NLP) techniques.
The system:
- Detects figures with captions, figures without captions, and graphs in PDFs using a YOLOv8 model.
- Extracts captions associated with detected images.
- Classifies captions with Logistic Regression-based NLP into the following classes:
  - Contour_Maps
  - Drilling_Plots
  - Geological_Map
  - Geotechnical_Order
  - Location_Map
  - Log_Motif
  - Remote_Sensing_Image
  - Seismic_Section
  - Stratigraphy_and_Casing_Plot
  - Structural_Map
  - Well_Construction_Diagram
  - Well_Schematic_Diagram
  - Others
- Organizes the output into structured directories.
- Provides a PyQt6-based GUI for interaction.
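The caption classification step can be illustrated with a minimal scikit-learn sketch. This is not the project's actual training code: the captions, the three labels used, and the hyperparameters below are hypothetical and chosen only to show how a TF-IDF vectorizer and a Logistic Regression classifier fit together in one pipeline.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training captions; the project's real training data is not shown here.
train_captions = [
    "Location map of the well showing nearby fields",
    "Index map showing well location",
    "Seismic section passing through the well",
    "Interpreted seismic line across the structure",
    "Well schematic diagram with casing shoe depths",
    "Schematic of the well after completion",
]
train_labels = [
    "Location_Map", "Location_Map",
    "Seismic_Section", "Seismic_Section",
    "Well_Schematic_Diagram", "Well_Schematic_Diagram",
]

# TF-IDF turns each caption into a sparse weighted term vector;
# Logistic Regression then learns one linear decision function per class.
caption_classifier = make_pipeline(
    TfidfVectorizer(lowercase=True, ngram_range=(1, 2)),  # assumed settings
    LogisticRegression(max_iter=1000),
)
caption_classifier.fit(train_captions, train_labels)

# Classify a new, unseen caption.
predicted = caption_classifier.predict(
    ["Figure 3: Seismic section along inline 120"]
)[0]
print(predicted)
```

With a real labelled caption set, the same pipeline would be trained on all thirteen classes listed above, with a fallback to Others for captions that do not fit any specific category.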
✅ Object Detection: Uses YOLOv8 to detect figures (labeled/unlabeled) and graphs.
✅ Caption Extraction: Extracts captions near detected images using PyMuPDF.
✅ NLP-based Classification: Classifies captions using TF-IDF + Logistic Regression.
✅ Automated Processing: Processes multiple PDFs at once.
✅ User-Friendly GUI: A PyQt6 interface for browsing PDFs and viewing results.
✅ Structured Output: Saves extracted images and captions in organized folders.
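One plausible way to pair a detected figure with its caption is to look for the nearest caption-like text block directly below the figure's bounding box. The sketch below assumes text blocks shaped like the tuples returned by PyMuPDF's `page.get_text("blocks")`, which begin with `(x0, y0, x1, y1, text, ...)` in a coordinate system where y grows downward. The function name `find_caption`, the `max_gap` threshold, and the caption pattern are assumptions for illustration, not the project's actual logic.

```python
import re

# Assumed heuristic: captions start with "Fig", "Fig.", "Figure" or "Plate" plus a number.
CAPTION_PATTERN = re.compile(r"^\s*(fig(ure)?\.?|plate)\s*\d+", re.IGNORECASE)


def find_caption(figure_box, text_blocks, max_gap=60.0):
    """Return the closest caption-like text block below figure_box, or None.

    figure_box  -- (x0, y0, x1, y1) of the detected figure in page coordinates
    text_blocks -- iterable of tuples starting with (x0, y0, x1, y1, text)
    max_gap     -- largest allowed vertical distance below the figure (assumed value)
    """
    fx0, fy0, fx1, fy1 = figure_box
    best_text, best_gap = None, max_gap
    for block in text_blocks:
        bx0, by0, bx1, by1, text = block[:5]
        gap = by0 - fy1  # distance from the figure's bottom edge to the block's top edge
        overlaps_horizontally = bx0 < fx1 and bx1 > fx0
        if 0 <= gap <= best_gap and overlaps_horizontally and CAPTION_PATTERN.match(text):
            # Collapse line breaks so multi-line captions become one string.
            best_text, best_gap = " ".join(text.split()), gap
    return best_text


# Usage with hand-written blocks in the PyMuPDF "blocks" tuple layout.
figure = (100, 100, 400, 300)
blocks = [
    (100, 50, 400, 80, "3.2 Geological setting", 0, 0),
    (100, 310, 400, 330, "Figure 4: Location map\nof the well", 1, 0),
    (100, 340, 400, 400, "The well was drilled in 1998.", 2, 0),
]
caption = find_caption(figure, blocks)
print(caption)
```

A figure for which this search returns `None` would correspond to the "figure without caption" case.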
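Installation and usage:
- Clone the repository: git clone https://github.com/mhsuhail00/ONGC-PDF-Image-Classification.git
- Enter the project directory: cd ONGC-PDF-Image-Classification
- Install the dependencies: pip install -r requirements.txt
- Run the application: python main.py
Output structure:
captured_images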
└───PDF_file_name
    ├───figure_without_label
    │   ├───page_1_object_2.png
    │   └───page_2_object_6.png
    ├───figure_with_label
    │   ├───Contour_Maps
    │   │   ├───page_1_object_1.png
    │   │   └───page_1_object_1.txt
    │   ├───Drilling_Plots
    │   ├───Geological_Map
    │   ├───Geotechnical_Order
    │   ├───Location_Map
    │   ├───Log_Motif
    │   ├───Others
    │   ├───Remote_Sensing_Image
    │   ├───Seismic_Section
    │   ├───Stratigraphy_and_Casing_Plot
    │   ├───Structural_Map
    │   ├───Well_Construction_Diagram
    │   └───Well_Schematic_Diagram
    └───graph
        ├───page_1_object_3.png
        └───page_1_object_4.png
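The directory layout above follows a simple rule: images are grouped by PDF, then by detection type, and captioned figures are further grouped by their predicted caption class. A minimal sketch of that path rule is shown below; the function name `output_path` and its parameters are hypothetical and only mirror the folder and file names visible in the tree.

```python
from pathlib import Path


def output_path(root, pdf_name, page, obj_index, detection, caption_class=None):
    """Build the image path for one detected object.

    detection     -- "figure_with_label", "figure_without_label" or "graph"
    caption_class -- predicted class for captioned figures; assumed to fall back to "Others"
    """
    folder = Path(root) / pdf_name / detection
    if detection == "figure_with_label":
        # Only captioned figures get a per-class subfolder.
        folder = folder / (caption_class or "Others")
    return folder / f"page_{page}_object_{obj_index}.png"


image_path = output_path(
    "captured_images", "PDF_file_name", 1, 1, "figure_with_label", "Contour_Maps"
)
caption_path = image_path.with_suffix(".txt")  # caption text is stored next to the image
graph_path = output_path("captured_images", "PDF_file_name", 1, 3, "graph")

print(image_path.as_posix())
print(caption_path.as_posix())
print(graph_path.as_posix())
```

Creating the folders before saving would be a matter of calling `image_path.parent.mkdir(parents=True, exist_ok=True)`.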
- Enhancing Caption Extraction – Improving accuracy for multi-line captions.
- Deep Learning-Based Classification – Exploring transformer-based NLP models.
- Extending Image Classification – Using CNNs for better image categorization.
- GUI Enhancements – Adding real-time progress tracking.
This project was developed as part of an Industrial Training at ONGC GEOPIC Centre, Dehradun, under the guidance of Mr. Sanjay Chakravorty, Dy. General Manager (Programming), ONGC.
Author: Mohammad Suhail
Institution: Zakir Husain College of Engineering & Technology, Aligarh Muslim University
You can view the detailed report of this project here:
Project Report
This project is licensed under the Apache License.
- Developer: Mohammad Suhail
- Email: mhsuhail00@gmail.com
- GitHub Profile: mhsuhail00
