This repository contains code to download and clean all Data Zone level data for the Northern Ireland 2021 Census.

The Python code:
- Finds the available variables from the NISRA Table Builder
- Scrapes the tables for each variable using Beautiful Soup
- Creates new variable names based on the sequential ordering of the variables and the table identification code
- Creates a metadata lookup table providing the link between the new names and the original names
- Exports the Data Zone data as both CSV and Parquet files
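
The renaming and metadata lookup steps could look roughly like the sketch below. It is an illustration only: the function name `build_names`, the table code `TBL01`, and the `<table code>_<position>` name format are assumptions, not the scheme the repository actually uses.

```python
def build_names(table_code, original_names):
    """Return (new_names, lookup_rows) for the variables of one table.

    Hypothetical naming scheme: table code plus a zero-padded position,
    e.g. "TBL01_0001". The real repository may format names differently.
    """
    new_names = []
    lookup_rows = []
    for position, original in enumerate(original_names, start=1):
        new_name = f"{table_code}_{position:04d}"
        new_names.append(new_name)
        # One lookup row per variable links the new name to the original label.
        lookup_rows.append(
            {
                "table_code": table_code,
                "new_name": new_name,
                "original_name": original,
            }
        )
    return new_names, lookup_rows


# Example with made-up variable labels.
names, lookup = build_names("TBL01", ["All usual residents", "Aged 0-14 years"])
print(names)
```

Keeping the lookup as one row per variable means it can be written out as a single table and joined back to the data on `new_name`.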
The created CSV files are available in the folder "/output_data/csv" and the Parquet files in the folder "/output_data/parquet".