A toolkit to extract, process, and upload Small Area Atlas maps published by the Bangladesh Bureau of Statistics (BBS).
BBS Map Extractor is a collection of automated tools designed to extract, process, and mass-upload maps and metadata from the Bangladesh Bureau of Statistics (BBS) Small Area Atlas. This project facilitates the high-precision conversion of PDF maps to SVG format and streamlines the batch uploading process directly to Wikimedia Commons.
- BBS pdf to svg map extractor.py: The core extraction engine. Automatically slices and extracts SVG maps from source PDF files with smart naming logic.
- Map Metadata generator.py: Scans the extracted SVGs and generates Wikimedia Commons-compatible metadata (CSV) including descriptions, dynamic categories, and licensing.
- Map Batch Upload.py: Reads the generated CSV and securely automates the batch uploading process to Wikimedia Commons using Pywikibot.
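As a rough illustration of the PDF-to-SVG step, the sketch below writes one SVG per page using PyMuPDF's `Page.get_svg_image()`. It is not the code from `BBS pdf to svg map extractor.py`: the function names `svg_name` and `extract_svgs` and the "stem plus page number" naming scheme are assumptions made for this example, and the real tool's slicing and naming logic may differ.

```python
from pathlib import Path


def svg_name(pdf_path, page_index):
    """Build an output name like 'Dhaka page 001.svg' (hypothetical naming scheme)."""
    stem = Path(pdf_path).stem
    return f"{stem} page {page_index + 1:03d}.svg"


def extract_svgs(pdf_path, out_dir):
    """Write one SVG file per PDF page and return the list of written paths."""
    import fitz  # PyMuPDF; imported lazily so svg_name() works without it installed

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            svg_text = page.get_svg_image()  # the page rendered as an SVG string
            target = out_dir / svg_name(pdf_path, page.number)
            target.write_text(svg_text, encoding="utf-8")
            written.append(target)
    return written
```

Calling `extract_svgs("atlas.pdf", "svg_out")` (both paths are placeholders) would leave files such as `atlas page 001.svg` in `svg_out`.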
- Atlas to PDF map extractor.html: A browser-based interface to intelligently extract and slice individual map pages from the master PDFs into separate PDF files.
- SVG_boundary_cropper.html: An advanced utility utilizing matrix transforms to precisely crop SVG boundaries and remove invisible whitespace.
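To show what cropping with matrix transforms involves, here is a minimal sketch in Python (the actual cropper is a browser tool, so this only illustrates the arithmetic, not its implementation). It applies an SVG `matrix(a, b, c, d, e, f)` transform to the corners of a rectangle and derives a tight `viewBox` from the result. The function names are hypothetical.

```python
def apply_matrix(m, x, y):
    """Apply an SVG matrix(a, b, c, d, e, f) transform to the point (x, y)."""
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


def transformed_bbox(m, x, y, width, height):
    """Return (min_x, min_y, width, height) of a rectangle after transform m."""
    corners = [(x, y), (x + width, y), (x, y + height), (x + width, y + height)]
    points = [apply_matrix(m, cx, cy) for cx, cy in corners]
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys))


def view_box(bbox):
    """Format a bounding box as the value of an SVG viewBox attribute."""
    return " ".join(f"{value:g}" for value in bbox)
```

All four corners are transformed, rather than only two, because a rotation or skew can move any corner to the extreme of the bounding box.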
- Backend / Automation: Python 3, Pywikibot, PyMuPDF (fitz), Pandas
- Frontend / UI: HTML5, CSS3, Vanilla JavaScript, JSZip, PDF.js, PDF-lib
- 📊 Source Data: BBS Small Area Atlas (Official)
- 📚 Source PDFs on Commons: Category:Small Area Atlas of Bangladesh
- 🗺️ Extracted SVG Maps: Category:Maps from the Small Area Atlas of Bangladesh
- Python 3.x installed on your system.
- Required Python libraries:
    pip install pywikibot PyMuPDF pandas
- A modern web browser (Chrome, Edge, Firefox, etc.) to run the HTML-based tools.
- Clone or Download this repository to your local machine.
- Extract: Run BBS pdf to svg map extractor.py (or the HTML tool) to get the map files from your downloaded PDFs.
- Generate Metadata: Run Map Metadata generator.py to automatically prepare your upload-ready CSV file.
- Upload: Execute Map Batch Upload.py to securely sync everything with Wikimedia Commons.
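The metadata step can be pictured with the following sketch, which scans a folder of SVGs and writes one CSV row per file. It is an assumption-laden illustration, not the contents of `Map Metadata generator.py`: the column names, the description wording, and the `license_tag` parameter are hypothetical, and the standard-library `csv` module is used here in place of Pandas to keep the example dependency-free. Only the category name is taken from the project's Commons category.

```python
import csv
from pathlib import Path

# Commons category used for the extracted maps
CATEGORY = "Maps from the Small Area Atlas of Bangladesh"
# Hypothetical column layout; the real generator's columns may differ
FIELDS = ["filename", "description", "categories", "license"]


def build_rows(svg_dir, license_tag):
    """Return one metadata dict per .svg file found in svg_dir, sorted by name."""
    rows = []
    for svg in sorted(Path(svg_dir).glob("*.svg")):
        title = svg.stem.replace("_", " ")
        rows.append({
            "filename": svg.name,
            "description": f"Map of {title} from the BBS Small Area Atlas",
            "categories": CATEGORY,
            "license": license_tag,  # supplied by the caller, never guessed
        })
    return rows


def write_csv(rows, csv_path):
    """Write the metadata rows to csv_path with a header line."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

A typical call would be `write_csv(build_rows("svg_out", "<license template>"), "metadata.csv")`, where the folder, the license template, and the output name are placeholders to replace with your own values.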
This project is open-source and released under the CC BY-SA 4.0 License. You are free to share and adapt the material, provided you give appropriate credit and distribute your contributions under the same license.
Developed with ❤️ by MS Sakib