Geo-Disasters: Geocoding EM-DAT climate-related disasters

This repository contains a series of Python scripts for processing and geolocating the climate-related disaster events from EM-DAT for the period 1990-2023. It relies on GAUL administrative boundaries and the GeoNames API for geographic data.

Overview

1_clean_EmDAT_nogeocode.py

This script identifies the EM-DAT events that have no GAUL ID and therefore need to be geocoded with the GeoNames API.

Inputs: EM-DAT data file (public_emdat_1990_2023.xlsx)
Outputs: Dataframe with events that need to be geocoded
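
As an illustration of this selection step, here is a minimal sketch in plain Python. It is not the repository's code. It assumes each event is a dict with a `Location` text field and an `Admin Units` field holding a JSON list of entries with `adm1_code` / `adm2_code` keys; these field names are assumptions about the EM-DAT export and should be checked against the actual file. The same logic maps directly onto a pandas boolean mask.

```python
import json


def needs_geocoding(event):
    """Return True when an event has a location text but no usable GAUL code.

    Field names ("Location", "Admin Units", "adm1_code", "adm2_code") are
    assumptions about the EM-DAT export, not confirmed by the repository.
    """
    location = (event.get("Location") or "").strip()
    if not location:
        return False  # nothing to send to the geocoder
    admin_units = event.get("Admin Units")
    if not admin_units:
        return True
    try:
        units = json.loads(admin_units)
    except (TypeError, ValueError):
        return True  # unreadable admin units are treated as missing
    if not isinstance(units, list):
        return True
    has_code = any(
        isinstance(u, dict) and (u.get("adm1_code") or u.get("adm2_code"))
        for u in units
    )
    return not has_code


# Toy records for illustration only (not real EM-DAT rows).
sample_events = [
    {"DisNo.": "A", "Location": "Sindh province", "Admin Units": None},
    {"DisNo.": "B", "Location": "Bihar",
     "Admin Units": '[{"adm1_code": 1489, "adm1_name": "Bihar"}]'},
    {"DisNo.": "C", "Location": "", "Admin Units": None},
]
to_geocode = [e["DisNo."] for e in sample_events if needs_geocoding(e)]
```

Events with neither a location text nor a GAUL code are left out here because there is nothing to geocode; whether the actual script treats them the same way is not stated above.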

2_geolocation_geonames_script.py

This script geocodes event locations using the GeoNames API. A full run can take a long time (3 to 4 days): there are around 10k names to geolocate, the API limits usage per hour and per day, and runs may be interrupted if too many requests are sent to the API. The code is designed to resume from where the operation was interrupted. The script was run in two iterations: first on the list of events identified by the first script, then on the locations that were not identified in the first iteration (around 900), after manually correcting their location names / ISO codes.

Inputs: Dataframe with locations that need to be geocoded
Outputs: Dataframe with geocoded locations: lat/lon and province names identified.
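
The resume-after-interruption behaviour can be sketched as follows. This is a hedged illustration, not the repository's implementation: it appends every result to a CSV checkpoint and skips names already present when re-run. The function names, the checkpoint layout, and the use of two-letter country codes are assumptions; the request targets the public GeoNames `searchJSON` endpoint with its `q`, `country`, `maxRows`, and `username` parameters.

```python
import csv
import json
import os
import time
import urllib.parse
import urllib.request

GEONAMES_URL = "http://api.geonames.org/searchJSON"


def geonames_search(name, iso2, username):
    """Query the GeoNames search endpoint and return the top hit, or None."""
    query = urllib.parse.urlencode(
        {"q": name, "country": iso2, "maxRows": 1, "username": username}
    )
    with urllib.request.urlopen(f"{GEONAMES_URL}?{query}", timeout=30) as resp:
        payload = json.load(resp)
    if "status" in payload:  # GeoNames reports errors (e.g. rate limits) here
        raise RuntimeError(payload["status"].get("message", "GeoNames error"))
    hits = payload.get("geonames", [])
    return hits[0] if hits else None


def geocode_resumable(locations, checkpoint_path, search, pause_s=1.0):
    """Geocode (name, iso2) pairs, appending each result to a CSV checkpoint.

    Pairs already present in the checkpoint are skipped, so the function can
    simply be re-run after an interruption.
    """
    done = set()
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, newline="", encoding="utf-8") as f:
            done = {(row["name"], row["iso2"]) for row in csv.DictReader(f)}

    fields = ["name", "iso2", "lat", "lng", "admin1"]
    is_new = not os.path.exists(checkpoint_path) or os.path.getsize(checkpoint_path) == 0
    with open(checkpoint_path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        if is_new:
            writer.writeheader()
        for name, iso2 in locations:
            if (name, iso2) in done:
                continue
            hit = search(name, iso2)  # may raise on rate limits; re-run later
            writer.writerow({
                "name": name,
                "iso2": iso2,
                "lat": hit.get("lat") if hit else "",
                "lng": hit.get("lng") if hit else "",
                "admin1": hit.get("adminName1") if hit else "",
            })
            f.flush()  # keep progress on disk if the run is interrupted
            done.add((name, iso2))
            time.sleep(pause_s)  # stay well below the hourly request limit


# Example call (needs network access and a valid GeoNames username):
# geocode_resumable(
#     [("Sindh", "PK")], "geocoded.csv",
#     search=lambda n, c: geonames_search(n, c, username="your_username"),
# )
```

Passing the search function as an argument keeps the checkpoint logic independent of the network call, which makes it easy to test with a stub. Locations with no hit are still recorded (with empty coordinates) so they can be reviewed and corrected manually before a second iteration.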

3_clean_geonames_geolocation.py

This script processes the locations geocoded with GeoNames and matches the identified latitudes and longitudes to GAUL administrative regions at both levels (ADM1 and ADM2). We assign geocoding quality flags to the geocoded regions.

Inputs: Dataframe with geocoded locations
Outputs: Dataframe with geocoded locations and the corresponding GAUL admin level and code, which enables matching with geographic data.
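
The idea of matching a geocoded point to an administrative region can be illustrated with a plain-Python point-in-polygon test. This is only a sketch of the concept: a real run would more likely use a GIS library on the GAUL geometries, and the region structure (`ring`, `adm1_name`) and the flag values below are hypothetical, not the flags used by the repository.

```python
def point_in_ring(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the ring of (lon, lat) vertices?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def match_admin_region(lon, lat, regions):
    """Return the first region whose ring contains the point, else None."""
    for region in regions:
        if point_in_ring(lon, lat, region["ring"]):
            return region
    return None


def quality_flag(geonames_admin1, region):
    """Hypothetical flag: compare the GeoNames province name with the matched region."""
    if region is None:
        return "no_match"
    if geonames_admin1 and geonames_admin1.strip().lower() == region["adm1_name"].lower():
        return "name_agrees"
    return "name_differs"


# Toy square region for illustration only (not a real GAUL boundary).
toy_regions = [
    {"adm1_name": "Toyland", "ring": [(0, 0), (10, 0), (10, 10), (0, 10)]},
]
matched = match_admin_region(5, 5, toy_regions)
flag = quality_flag("toyland", matched)
```

The ray-casting test ignores holes and multi-part polygons, which real administrative boundaries often have; a GIS library handles those cases.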

4_geolocation_identified_gaul_ID.py

In this script, we identify the geometries corresponding to the locations for which EM-DAT provides a GAUL ID. We then concatenate all identified locations (including those identified with GeoNames) and apply quality flags for consistency.

Inputs: Dataframe with geocoded locations and their corresponding GAUL admin level and code; EM-DAT data file (public_emdat_1990_2023.xlsx)
Outputs: GeoDataFrame with all identified locations and geographic data

5_national_overlay.py

In this script, we overlay the regions corresponding to each EM-DAT event within each country to obtain the total reported area of the event.

Inputs: GeoDataFrame with all identified locations and geographic data
Outputs: GeoDataFrame with the overlaid extent per event

6_filter_write_data.py

In this script, we perform additional filtering of EM-DAT events. We remove all events that do not have any impact information, as well as events with inaccurate geocoding.

Inputs: GeoDataFrame with the overlaid extent per event, EM-DAT database
Outputs: GeoDataFrame with event locations, EM-DAT event and impact information, and the time information needed for climate aggregation
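
A minimal sketch of this filtering rule, written over plain dicts for illustration. The impact column names are assumptions about the EM-DAT export, and `geocoding_flag` with its `"poor"` value is a hypothetical stand-in for whatever quality flags the earlier scripts assign.

```python
# Assumed EM-DAT impact columns; check them against the actual export.
IMPACT_COLUMNS = ("Total Deaths", "Total Affected", "Total Damage ('000 US$)")

# Hypothetical flag values marking geocoding that should be discarded.
REJECTED_FLAGS = {"poor"}


def has_impact(event):
    """True when at least one impact column holds a value."""
    return any(event.get(col) is not None for col in IMPACT_COLUMNS)


def keep_event(event):
    """Keep events with some impact information and acceptable geocoding."""
    return has_impact(event) and event.get("geocoding_flag") not in REJECTED_FLAGS


# Toy records for illustration only.
sample_events = [
    {"DisNo.": "A", "Total Deaths": 12, "geocoding_flag": "good"},
    {"DisNo.": "B", "Total Deaths": None, "Total Affected": None,
     "geocoding_flag": "good"},
    {"DisNo.": "C", "Total Affected": 500, "geocoding_flag": "poor"},
]
kept = [e["DisNo."] for e in sample_events if keep_event(e)]
```

Here a reported value of zero counts as impact information and only missing values do not; that choice is an assumption of the sketch.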

7_compare_gdis.py

In this script, we compare the geographic mismatch between EM-DAT events geocoded by GDIS and by Geo-Disasters.

Inputs: Geo-Disasters, GDIS
Outputs: comparison_df, a dataframe with the comparison results.
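
One possible way to quantify such a mismatch is the great-circle distance between the centroids that the two datasets assign to the same event. The sketch below uses the haversine formula; it is an assumed metric for illustration, and the script's actual comparison may be defined differently (for example via area overlap). The dict-of-centroids inputs are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def compare_centroids(geo_disasters, gdis):
    """Distance between centroids for event ids present in both datasets.

    Both arguments map an event id to a (lat, lon) centroid.
    """
    rows = []
    for event_id in sorted(geo_disasters.keys() & gdis.keys()):
        lat1, lon1 = geo_disasters[event_id]
        lat2, lon2 = gdis[event_id]
        rows.append({
            "event_id": event_id,
            "distance_km": haversine_km(lat1, lon1, lat2, lon2),
        })
    return rows


# Toy centroids for illustration only.
comparison = compare_centroids(
    {"E1": (0.0, 0.0), "E2": (10.0, 10.0)},
    {"E1": (0.0, 1.0), "E3": (5.0, 5.0)},
)
```

Only events present in both datasets are compared; events found in just one of them would need to be reported separately.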

db_descriptions.R

In this R script, we generate the figures used in the publication.

Inputs: Geo-Disasters (subnational, national overlay), GDIS, EM-DAT, comparison_df
Outputs: publication figures

How to Run the Scripts

Ensure that the paths to the required input files (EM-DAT data, GAUL maps, ...) are set correctly in src/utils/paths.py and src/utils/paths.R. Then run the scripts in the order indicated by the numeric prefix in their names.
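
The ordering rule can be sketched with a small helper that sorts the numbered Python scripts by their numeric prefix and runs them one after another. This helper is not part of the repository; the script directory is passed in as an argument because its location is an assumption, and the R script is intentionally left out since it has no numeric prefix.

```python
import re
import subprocess
import sys
from pathlib import Path


def ordered_scripts(names):
    """Return the numbered Python scripts sorted by their numeric prefix."""
    numbered = []
    for name in names:
        match = re.match(r"(\d+)_.*\.py$", name)
        if match:
            numbered.append((int(match.group(1)), name))
    return [name for _, name in sorted(numbered)]


def run_pipeline(script_dir):
    """Run each numbered script in order, stopping at the first failure."""
    script_dir = Path(script_dir)
    for name in ordered_scripts(p.name for p in script_dir.iterdir()):
        print(f"Running {name}")
        subprocess.run([sys.executable, str(script_dir / name)], check=True)


# Example call (the directory name is a placeholder):
# run_pipeline("path/to/scripts")
```

In practice the geocoding script takes days and is followed by manual corrections before its second iteration, so running the steps one at a time is likely more realistic than a single unattended run.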

Notes

  • The output of these scripts supports geographic analysis and visualization of climate-related disasters.
  • The scripts include error handling for mismatched or missing locations, but manual corrections are still necessary in many cases.
  • The GeoNames API script requires a valid GeoNames username and should be run twice: once for locations without GAUL IDs and once for manually corrected locations.
  • We do not provide the GAUL maps; we recommend downloading them from the Google Earth Engine Data Catalog.
  • We do not provide EM-DAT; it can be freely accessed for academic purposes at https://www.emdat.be/

Installation

  1. Clone the repository:
    git clone https://github.com/khalilT/geocode_disasters.git
    cd geocode_disasters
  2. Install dependencies:
    pip install -r requirements.txt

Session Info

  • Python Version: 3.8.19 | packaged by conda-forge | (default, Mar 20 2024, 12:47:35) [GCC 12.3.0]
  • Platform: Linux-6.8.0-57-generic-x86_64-with-glibc2.10
  • OS: Linux
  • Architecture: 64bit
  • Processor: x86_64
  • Generated On: 2025-05-15 10:06:19
  • R Version: R version 4.4.2 (2024-10-31) -- "Pile of Leaves"
