GeoLink is an R package that provides easy access to various geospatial datasets, allowing seamless integration with World Bank surveys and other spatial data.
- Features
- Installation
- Usage
- Data Visualization Examples
- Basic Usage Examples
- Documentation
- Contributing
- License
Download and process:
- CHIRPS rainfall data
- Night Time Light (NTL) data
- Population data (WorldPop)
- Elevation data
- Building data (WorldPop)
- CMIP6 climate model data
- Cropland data
- WorldClim climate data
- Terraclimate data
- Land Use Land Cover data
- Points of Interest (OpenStreetMap)
- Electrification access data (HREA)
- OpenCellID data
- Pollution data
- Vegetation Index data
# Install devtools if you haven't already
install.packages("devtools")
# Install GeoLink
devtools::install_github("SSA-Statistical-Team-Projects/GeoLink")
# Load the package
library(GeoLink)

Most functions require either:
- A shapefile (shp_dt or shp_fn), or
- A household survey dataset (survey_dt or survey_fn) with coordinates
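As a rough illustration of what these two input types can look like in memory, the sketch below builds a toy polygon layer and a toy survey table with sf. The object names (`toy_shp`, `toy_survey`), the column names (`hhid`, `lon`, `lat`) and the coordinates are assumptions made for this example, not GeoLink requirements; the help page of each function remains the reference for the arguments it expects.

```r
library(sf)

# Toy polygon layer standing in for a shapefile read with sf::st_read()
square <- st_polygon(list(rbind(
  c(7.0, 5.0), c(8.0, 5.0), c(8.0, 6.0), c(7.0, 6.0), c(7.0, 5.0)
)))
toy_shp <- st_sf(
  ADM1_EN = "Example",
  geometry = st_sfc(square, crs = 4326)
)

# Toy household survey with coordinate columns (hypothetical names)
toy_survey <- data.frame(
  hhid = 1:3,
  lon = c(7.2, 7.5, 7.8),
  lat = c(5.2, 5.5, 5.8)
)

# Convert the survey to point geometries in WGS84 (EPSG:4326)
toy_survey_sf <- st_as_sf(toy_survey, coords = c("lon", "lat"), crs = 4326)

# Check that every survey point falls inside the polygon layer
inside <- lengths(st_within(toy_survey_sf, toy_shp)) > 0
print(inside)
```

Checking that survey points fall inside the polygon layer before calling a download function is a cheap way to catch swapped latitude and longitude columns.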
library(ggplot2)
library(sf)
library(viridis)
# Get rainfall data
rainfall_map <- geolink_chirps(
time_unit = "month",
start_date = "2020-01-01",
end_date = "2020-01-31",
shp_dt = shp_dt[shp_dt$ADM1_EN == "Abia",],
grid_size = 1000
)
# Create beautiful rainfall map
ggplot(rainfall_map) +
geom_sf(aes(fill = rainfall_month1)) +
scale_fill_viridis_c(
name = "Rainfall (mm)",
option = "plasma"
) +
theme_minimal() +
labs(
title = "January 2020 Rainfall Distribution",
subtitle = "Abia State, Nigeria",
caption = "Data source: CHIRPS"
) +
theme(
plot.title = element_text(size = 16, face = "bold"),
legend.position = "right"
)

library(leaflet)
# Get population data
pop_data <- geolink_population(
start_year = 2018,
end_year = 2018,
iso_code = "NGA",
constrained = 'N',
UN_adjst = 'N',
shp_dt = shp_dt[shp_dt$ADM1_EN == "Kano",],
grid_size = 1000,
extract_fun = "mean"
)
# Create interactive population map
leaflet(pop_data) %>%
addProviderTiles(providers$CartoDB.DarkMatter) %>%
addPolygons(
fillColor = ~colorQuantile("YlOrRd", population_2018)(population_2018),
fillOpacity = 0.7,
weight = 1,
color = "#666",
popup = ~paste(
"<strong>Area:</strong>",
"<br>Population:", round(population_2018),
"<br>Density:", round(population_2018/st_area(geometry))
)
) %>%
addLegend(
"bottomright",
title = "Population Density",
pal = colorQuantile("YlOrRd", pop_data$population_2018),
values = ~population_2018
)

# Combine elevation and cropland data
elevation_data <- geolink_elevation(
iso_code = "NGA",
shp_dt = shp_dt[shp_dt$ADM1_EN == "Plateau",],
grid_size = 1000
)
cropland_data <- geolink_cropland(
shp_dt = shp_dt[shp_dt$ADM1_EN == "Plateau",],
grid_size = 1000
)
# Create combined visualization
ggplot(elevation_data) +
geom_sf(aes(fill = NGA_elv_msk)) +
geom_sf(data = cropland_data, aes(alpha = cropland), fill = "darkgreen") +
scale_fill_gradient2(
low = "darkgreen",
mid = "yellowgreen",
high = "brown",
midpoint = median(elevation_data$NGA_elv_msk),
name = "Elevation (m)"
) +
scale_alpha_continuous(name = "Cropland Density") +
theme_minimal() +
labs(
title = "Elevation Profile with Cropland Overlay",
subtitle = "Plateau State, Nigeria"
)

df <- geolink_chirps(
time_unit = "month",
start_date = "2020-01-01",
end_date = "2020-03-01",
shp_dt = shp_dt[shp_dt$ADM1_PCODE == "NG001",],
grid_size = 1000
)

df <- geolink_population(
start_year = 2018,
end_year = 2019,
iso_code = "NGA",
shp_dt = shp_dt[shp_dt$ADM1_EN == "Abia",],
grid_size = 1000
)

For detailed documentation of each function, use R's help system:
?geolink_chirps
?geolink_ntl
?geolink_population

When working with geospatial data, memory management is crucial, especially on laptops:
- Large raster operations: Many GeoLink functions process large raster files that can consume significant RAM
- Geographic chunking: Consider processing large areas by dividing them into smaller regions
- Grid size adjustment: Increase the grid_size parameter to reduce resolution and memory requirements
- Temporary file cleanup: Use raster::removeTmpFiles(h=0) periodically to clean up temp files
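The geographic chunking idea can be sketched as follows, assuming an area of interest stored as an sf geometry. The helper `process_chunk()` is a hypothetical placeholder, not a GeoLink function; in practice its body would wrap a call such as geolink_chirps() on one chunk at a time.

```r
library(sf)

# Toy area of interest: a 2 x 2 degree square in WGS84
aoi <- st_sfc(
  st_polygon(list(rbind(c(6, 4), c(8, 4), c(8, 6), c(6, 6), c(6, 4)))),
  crs = 4326
)

# Split the bounding box into a 2 x 2 grid of chunks
chunks <- st_make_grid(aoi, n = c(2, 2))

# Hypothetical per-chunk worker; replace the body with a real GeoLink call
process_chunk <- function(chunk_geom, id) {
  out <- st_sf(
    chunk_id = id,
    geometry = st_sfc(chunk_geom, crs = st_crs(chunks))
  )
  gc()  # release unused memory before the next chunk is processed
  out
}

# Process one chunk at a time, then bind the results together
results <- lapply(seq_along(chunks), function(i) process_chunk(chunks[[i]], i))
combined <- do.call(rbind, results)
print(nrow(combined))
```

Processing chunks sequentially keeps only one chunk's rasters in memory at a time, at the cost of more download requests.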
To maximize available memory for GeoLink on your laptop:
# Increase R's memory limit (Windows only; memory.limit() is a stub with no effect in R >= 4.2.0)
memory.limit(size=10000) # Size is in MB, so this requests roughly 10GB if available
# Clear R environment before large operations
rm(list = ls())
gc()
# Configure raster package to use disk instead of RAM
raster::rasterOptions(maxmemory = 1e9) # 1GB RAM limit
raster::rasterOptions(tmpdir = "C:/R_temp") # Custom temp directory
# Monitor memory usage
mem_used <- pryr::mem_used() # Requires 'pryr' package
print(paste0("Current memory usage: ", round(mem_used/1e6, 2), " MB"))
# Close other applications and disable memory-intensive R features
options(device.ask.default = FALSE) # Disable interactive graphics prompts

For extremely large geospatial operations:
- Use the terra package instead of raster for more efficient memory usage
- Consider parallel processing with the parallel or future packages
- Pre-filter your area of interest to the smallest necessary extent
- Export intermediate results to disk using sf::write_sf() or terra::writeRaster()
- Consider cloud-based computation for nationwide or continental analyses
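A minimal sketch of the pre-filtering and export steps with terra is shown below. The raster is synthetic, and the extent values and file name are assumptions chosen for illustration; none of this reflects how GeoLink stores its downloads internally.

```r
library(terra)

# Synthetic raster standing in for a downloaded dataset (values are random)
r <- rast(nrows = 100, ncols = 100, xmin = 0, xmax = 10, ymin = 0, ymax = 10,
          crs = "EPSG:4326")
values(r) <- runif(ncell(r))
n_cells_full <- ncell(r)

# Pre-filter: crop to the smallest extent that covers the area of interest
aoi_extent <- ext(2, 4, 3, 5)  # xmin, xmax, ymin, ymax
r_small <- crop(r, aoi_extent)

# Export the intermediate result so it can be dropped from memory
out_file <- file.path(tempdir(), "aoi_subset.tif")
writeRaster(r_small, out_file, overwrite = TRUE)

# Free the in-memory objects and reload the subset from disk when needed
rm(r, r_small)
gc()
r_reloaded <- rast(out_file)
print(dim(r_reloaded))
```

Writing the cropped subset to disk means a later session can resume from the small file instead of repeating the download and crop.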
GeoLink implements robust error handling to gracefully manage common issues when working with geospatial data:
# Invalid file format detection
df <- geolink_population(survey_fn = "invalid.txt")
# Error: "Unsupported file format. Please provide .dta file"
# Missing required parameters
df <- geolink_chirps()
# Error: "start_date and end_date must be specified"
# Invalid parameter values
df <- geolink_ntl(time_unit = "weekly")
# Error: "Time unit should either be month or annual"

# Automatic CRS detection and reprojection
df <- geolink_chirps(shp_dt = invalid_crs_data)
# Message: "Reprojecting from XXXX to EPSG:4326"
# All GeoLink functions use ensure_crs_4326() internally to handle CRS issues

# Proper resource allocation and deallocation
on.exit(raster::removeTmpFiles(h=0))

# Intelligent splitting of large areas into manageable chunks
# For OpenStreetMap data:
if (bbox_area > area_threshold) {
message("Large area detected. Splitting into quadrants...")
# Process quadrants separately and combine results
}

If you encounter errors, try these common solutions:
- Memory errors: Increase grid size or use resampling
- Connection timeouts: Check internet connection
- Missing data: Verify date ranges are valid for the selected dataset
- Invalid geometries: Clean your shapefile using sf::st_make_valid()
- Projection errors: Ensure input data has defined coordinate systems
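The last two checks in the list above can be combined into a small pre-flight step. The helper below, `prepare_layer()`, is a hypothetical function written for this example and is not GeoLink's internal ensure_crs_4326(); it also assumes that a layer with no CRS is already in WGS84, which needs to be verified for real data.

```r
library(sf)

# Hypothetical helper: NOT GeoLink's internal ensure_crs_4326()
prepare_layer <- function(x, assumed_crs = 4326) {
  # Assign a CRS only if none is defined (assumption: data are in assumed_crs)
  if (is.na(st_crs(x))) {
    st_crs(x) <- assumed_crs
  }
  # Reproject to WGS84 when the layer uses a different CRS
  if (st_crs(x) != st_crs(4326)) {
    x <- st_transform(x, 4326)
  }
  # Repair geometries that are invalid or whose validity cannot be determined
  if (!isTRUE(all(st_is_valid(x)))) {
    x <- st_make_valid(x)
  }
  x
}

# Toy layer in UTM zone 32N (EPSG:32632)
utm_square <- st_sfc(
  st_polygon(list(rbind(
    c(300000, 600000), c(310000, 600000), c(310000, 610000),
    c(300000, 610000), c(300000, 600000)
  ))),
  crs = 32632
)
clean <- prepare_layer(st_sf(id = 1, geometry = utm_square))
print(st_crs(clean)$epsg)
```

Running a check like this before any download makes projection and geometry problems surface early, rather than after a long raster extraction.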
GeoLink offers straightforward access to a wide range of geospatial
variables without requiring expertise in the data or the tools needed to
access them. Benchmarked against packages such as BlackMarbleR, chirps,
climateR, and cmip6, GeoLink shows mixed results in terms of speed: it is
considerably faster than chirps for CHIRPS rainfall but slower than the
alternatives for nighttime lights, TerraClimate, and CMIP6. The timings,
measured with microbenchmark over 10 runs each, are reported in seconds
in the table below.
However, other packages involve additional steps and require technical
expertise, which can lengthen the time needed to achieve the desired
results. For example, BlackMarbleR necessitates a subscription to a
website and produces raster data instead of vector data (such as a
shapefile). The cmip6 package failed to obtain rasters in 7 out of 10
attempts, whereas GeoLink successfully collected data in all
instances. Additionally, climateR provides daily raster data that
needs to be summarized into monthly or yearly figures.
| Variable | Package | min | lq | mean | median | uq | max | neval |
|---|---|---|---|---|---|---|---|---|
| Nighttime lights | BlackMarbleR | 7.77 | 8.05 | 8.28 | 8.33 | 8.55 | 8.67 | 10 |
| Nighttime lights | GeoLink | 30.22 | 34.54 | 37.68 | 36.50 | 40.51 | 47.96 | 10 |
| CHIRPS | chirps | 138.99 | 139.53 | 148.70 | 144.09 | 154.15 | 174.31 | 10 |
| CHIRPS | GeoLink | 6.36 | 7.01 | 7.39 | 7.28 | 7.65 | 9.20 | 10 |
| TerraClimate | climateR | 1.20 | 1.30 | 1.31 | 1.32 | 1.34 | 1.35 | 10 |
| TerraClimate | GeoLink | 5.40 | 6.42 | 10.69 | 8.31 | 11.44 | 30.15 | 10 |
| CMIP6 | cmip6 | 11.06 | 11.43 | 16.91 | 17.79 | 20.69 | 24.18 | 10 |
| CMIP6 | GeoLink | 335.36 | 347.13 | 356.53 | 352.17 | 365.97 | 380.32 | 10 |
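
The column layout of the table (min, lq, mean, median, uq, max, neval) matches what summary() returns for a microbenchmark object. The sketch below shows how such a comparison can be set up; the two functions being timed are toy stand-ins invented for this example, not the actual package calls behind the figures above.

```r
library(microbenchmark)

# Two toy implementations of the same task (stand-ins for two packages)
sum_loop <- function(x) {
  total <- 0
  for (v in x) total <- total + v
  total
}
sum_builtin <- function(x) sum(x)

x <- runif(1e5)

# Run each expression 10 times, matching neval = 10 in the table
bench <- microbenchmark(
  loop = sum_loop(x),
  builtin = sum_builtin(x),
  times = 10
)

# Summarise in seconds: min, lq, mean, median, uq, max, neval
bench_summary <- summary(bench, unit = "s")
print(bench_summary)
```

For functions that download data, timings depend heavily on network conditions and server load, so repeated runs on different days are likely to vary.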
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE.md file for details.
Project Link: https://github.com/SSA-Statistical-Team-Projects/GeoLink
Made with ❤️ by [The World Bank Group & The University of Southampton]


