A reproducible R workflow for processing, analyzing, and visualizing population distribution data across Kenya's counties — integrating raster population grids with administrative shapefiles to generate zonal statistics and publication-ready maps.
- Overview
- Workflow
- Prerequisites
- Installation & Setup
- Usage
- Project Structure
- Key Outputs
- Data Sources
- Contributing
- License
- Author
This project provides a complete end-to-end geospatial analysis pipeline in R for exploring population patterns at the county and sub-county levels in Kenya. The workflow combines:
- Administrative boundaries (counties and sub-counties) as vector shapefiles
- Population raster data (e.g., WorldPop 100m grids) for spatial distribution
- Health facility locations to support accessibility and coverage analysis
The pipeline handles CRS harmonization, raster clipping, zonal statistics computation, and produces clean choropleth maps suitable for reports and dashboards.
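The CRS harmonization step can be sketched on synthetic layers, assuming `sf` and `terra` are installed. The polygon, the raster, and the choice of EPSG:32737 (UTM zone 37S) are illustrative stand-ins, not the project's actual data or settings.

```r
library(sf)
library(terra)

set.seed(1)

# Synthetic stand-in for a county boundary: a half-degree square near Nairobi,
# moved into a projected CRS (EPSG:32737 is an assumption for illustration).
county_ll <- st_sfc(
  st_polygon(list(rbind(
    c(36.6, -1.5), c(37.1, -1.5), c(37.1, -1.0), c(36.6, -1.0), c(36.6, -1.5)
  ))),
  crs = 4326
)
county <- st_transform(county_ll, 32737)

# Synthetic stand-in for a population grid in geographic coordinates.
pop <- rast(xmin = 36.5, xmax = 37.2, ymin = -1.6, ymax = -0.9,
            resolution = 0.01, crs = "EPSG:4326")
values(pop) <- runif(ncell(pop), min = 0, max = 50)

# Compare the two layers before any spatial operation.
same_before <- st_crs(county) == st_crs(crs(pop))

# Reproject the raster to the vector layer's CRS. Bilinear resampling is a
# placeholder choice here; it does not preserve cell totals.
pop_utm <- project(pop, st_crs(county)$wkt, method = "bilinear")

same_after <- st_crs(county) == st_crs(crs(pop_utm))
```

Because resampling a count raster changes its totals, an alternative worth considering is to transform the boundary layer to the raster's CRS instead, which leaves the cell values untouched.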
The analysis follows a structured 9-step pipeline:
| Step | Stage | Description |
|---|---|---|
| 1 | Import Data | Load county/sub-county boundaries, population raster, and health facility points |
| 2 | Data Quick Checks | Inspect structure with View(), summary(), names(), and st_crs() |
| 3 | Filter / Subset County | Isolate target county from the national boundaries layer |
| 4 | Check CRS | Verify coordinate reference systems across all layers |
| 5 | Reproject Population Raster | Align raster CRS to match the shapefile's projection |
| 6 | Clip Population to County | Crop the raster to the county's extent, then mask cells outside the boundary |
| 7 | Zonal Statistics | Compute sum and mean population per administrative unit |
| 8 | Merge Results to Shapefile | Join statistical outputs back to the spatial layer |
| 9 | Create Population Map | Render final choropleth map with tmap or ggplot2 |
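Steps 6 to 8 of the table can be sketched as follows, using synthetic data so the numbers are easy to check by hand. The layer names, the `name` join column, and the EPSG:32737 coordinates are assumptions for illustration; the project's own scripts may differ.

```r
library(sf)
library(terra)
library(dplyr)

# Two synthetic "sub-counties": adjacent 1 km squares in EPSG:32737.
make_square <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 9850000), c(x0 + 1000, 9850000), c(x0 + 1000, 9851000),
    c(x0, 9851000), c(x0, 9850000)
  )))
}
subcounties <- st_sf(
  name = c("West", "East"),
  geometry = st_sfc(make_square(250000), make_square(251000), crs = 32737)
)

# Synthetic population grid, larger than the polygons:
# 250 m cells, each holding 10 people.
pop <- rast(xmin = 249000, xmax = 253000, ymin = 9849000, ymax = 9852000,
            resolution = 250, crs = "EPSG:32737")
values(pop) <- 10

zones <- vect(subcounties)

# Step 6: crop to the polygons' extent, then set cells outside them to NA.
pop_clip <- mask(crop(pop, zones), zones)

# Step 7: total and mean population per polygon.
# terra::extract is called with its namespace because tidyr also exports extract().
stats <- data.frame(
  name = subcounties$name,
  pop_sum = terra::extract(pop_clip, zones, fun = sum, na.rm = TRUE, ID = FALSE)[[1]],
  pop_mean = terra::extract(pop_clip, zones, fun = mean, na.rm = TRUE, ID = FALSE)[[1]]
)

# Step 8: join the statistics back onto the spatial layer.
subcounties_pop <- left_join(subcounties, stats, by = "name")
```

Each square covers sixteen 250 m cells of 10 people, so `pop_sum` should come out as 160 and `pop_mean` as 10 for both units. With real data, cells are assigned to a polygon by their centre by default, so totals along shared borders are approximate.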
R >= 4.0.0
# Spatial data handling
install.packages(c("sf", "terra", "raster"))
# Data manipulation
install.packages(c("dplyr", "tidyr"))
# Visualization
install.packages(c("ggplot2", "tmap", "viridis"))
# Utilities
install.packages(c("here", "readr"))
- Clone the repository
git clone https://github.com/sylvesteronyango/kenya-population-viz.git
cd kenya-population-viz
- Open the R project
# In RStudio, open kenya-population-viz.Rproj
# or set working directory manually:
setwd("path/to/kenya-population-viz")
- Install dependencies
source("R/00_install_packages.R")
- Add data files: see Data Sources for download instructions, then place files in data/raw/.
Run scripts sequentially, or execute the master script:
# Run the full pipeline
source("R/run_all.R")
Or step by step:
source("R/01_import_data.R")
source("R/02_data_checks.R")
source("R/03_filter_county.R")
source("R/04_check_crs.R")
source("R/05_reproject_raster.R")
source("R/06_clip_population.R")
source("R/07_zonal_statistics.R")
source("R/08_merge_results.R")
source("R/09_create_map.R")
To analyze a specific county, set the target in R/config.R:
TARGET_COUNTY <- "Nairobi" # Change to any Kenya county name
kenya-population-viz/
│
├── data/
│ ├── raw/ # Source data (not tracked by git)
│ │ ├── shapefiles/ # County & sub-county boundaries (.shp)
│ │ ├── raster/ # WorldPop population raster (.tif)
│ │ └── health_facilities/ # Health facility points (.csv / .shp)
│ └── processed/ # Intermediate outputs
│
├── R/
│ ├── config.R # Global settings & county selector
│ ├── 00_install_packages.R
│ ├── 01_import_data.R
│ ├── 02_data_checks.R
│ ├── 03_filter_county.R
│ ├── 04_check_crs.R
│ ├── 05_reproject_raster.R
│ ├── 06_clip_population.R
│ ├── 07_zonal_statistics.R
│ ├── 08_merge_results.R
│ ├── 09_create_map.R
│ └── run_all.R # Master execution script
│
├── outputs/
│ ├── maps/ # Exported map images (.png, .pdf)
│ └── tables/ # Zonal statistics CSV exports
│
├── docs/
│ └── workflow.png # Pipeline diagram
│
├── kenya-population-viz.Rproj
├── .gitignore
└── README.md
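The tree describes R/config.R as holding global settings and the county selector. A minimal sketch of what such a file and the county filter might look like is given here; only `TARGET_COUNTY` comes from this README, while the directory variables, the `filter_county()` helper, and the `county` attribute name are hypothetical.

```r
library(sf)

# Only TARGET_COUNTY appears in the README; the other settings are hypothetical.
TARGET_COUNTY <- "Nairobi"
RAW_DIR <- file.path("data", "raw")
MAPS_DIR <- file.path("outputs", "maps")

# Hypothetical helper: keep the feature(s) whose name matches the target county.
# `name_col` is an assumption; boundary files name this attribute differently.
filter_county <- function(boundaries, county, name_col = "county") {
  matched <- boundaries[tolower(boundaries[[name_col]]) == tolower(county), ]
  if (nrow(matched) == 0) {
    stop("No feature found for county: ", county)
  }
  matched
}

# Toy boundaries layer; points stand in for county polygons.
toy <- st_sf(
  county = c("Nairobi", "Mombasa"),
  geometry = st_sfc(st_point(c(36.82, -1.29)), st_point(c(39.66, -4.04)),
                    crs = 4326)
)

selected <- filter_county(toy, TARGET_COUNTY)
```

Failing loudly on an unmatched name is a design choice in this sketch: a misspelled county would otherwise produce an empty layer that only causes errors several steps later.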
- Population choropleth map — County or sub-county level map showing population distribution
- Zonal statistics table — CSV with total and mean population per administrative unit
- Clipped raster — Population raster masked to the target county boundary
- Reprojected layers — CRS-harmonized spatial files ready for further analysis
Sample map output is saved to outputs/maps/population_map.png.
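A choropleth of this kind can be sketched with `ggplot2` and `sf`. The polygons and `pop_sum` values below are synthetic, and the file is written to a temporary directory rather than outputs/maps/ so the example does not touch the repository.

```r
library(sf)
library(ggplot2)

# Synthetic administrative units with made-up population totals.
make_square <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}
units <- st_sf(
  name = c("A", "B", "C"),
  pop_sum = c(1200, 5400, 800),
  geometry = st_sfc(make_square(0), make_square(1), make_square(2))
)

# Fill each polygon by its population total using a viridis colour scale.
p <- ggplot(units) +
  geom_sf(aes(fill = pop_sum), colour = "white") +
  scale_fill_viridis_c(name = "Population") +
  labs(title = "Population by administrative unit (synthetic data)") +
  theme_minimal()

# In the project this path would be outputs/maps/population_map.png.
out_file <- file.path(tempdir(), "population_map.png")
ggsave(out_file, plot = p, width = 6, height = 4, dpi = 300)
```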
| Dataset | Source | Format |
|---|---|---|
| Kenya County Boundaries | KNBS / GADM | .shp |
| Sub-County Boundaries | Kenya Open Data / KNBS | .shp |
| Population Grid (100m) | WorldPop | .tif |
| Health Facilities | Kenya MoH / KHFL | .csv / .shp |
Note: Raw data files are excluded from version control via .gitignore. Download them from the sources above and place them in data/raw/.
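When the health facility data arrives as a .csv, it has to be converted to a point layer before it can be combined with the boundaries. A sketch of that conversion follows, assuming columns named `longitude` and `latitude` in WGS 84; the column names and the sample rows are invented for illustration.

```r
library(sf)

# Invented sample content; real column names depend on the downloaded file.
csv_text <- "facility_name,longitude,latitude
Facility A,36.82,-1.29
Facility B,36.90,-1.25
Facility C,,-1.30"
facilities_raw <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Drop rows with missing coordinates; st_as_sf() stops on NA coordinates
# by default.
complete <- !is.na(facilities_raw$longitude) & !is.na(facilities_raw$latitude)

# Build a point layer, declaring the coordinates as WGS 84 (EPSG:4326).
facilities <- st_as_sf(
  facilities_raw[complete, ],
  coords = c("longitude", "latitude"),
  crs = 4326
)
```

The resulting layer can then be passed through `st_transform()` to match the CRS of the boundaries before any overlay.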
Contributions, bug reports, and feature suggestions are welcome!
- Fork the repository
- Create a feature branch: git checkout -b feature/your-feature
- Commit your changes: git commit -m "Add: your feature description"
- Push to the branch: git push origin feature/your-feature
- Open a Pull Request
Please ensure your code follows the existing style and includes comments for any new geospatial processing steps.
This project is licensed under the MIT License — see the LICENSE file for details.
Sylvester Onyango, Geospatial Data Analyst (@sylvesteronyango)
Built with passion for spatial data and open science. If this project helped you, consider giving it a ⭐ on GitHub.