Create a virtual environment and install dependencies:
pip install -r requirements.txt

This project follows the HRDAG principled data processing framework. Each processing step has its own directory with src/, input/, output/, hand/, and frozen/ subdirectories to ensure reproducibility and data lineage.
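As an illustration of that per-step layout, here is a minimal sketch that scaffolds the five subdirectories for one step. The task name `convert` is borrowed from the make targets; the `PROJECT_ROOT` variable and the script itself are assumptions for illustration and are not part of this repository.

```shell
#!/bin/sh
# Sketch only: create the per-step directory layout described above.
# PROJECT_ROOT is a hypothetical variable; it defaults to the current directory.
# The task name defaults to "convert" and can be passed as the first argument.
set -eu

project_root="${PROJECT_ROOT:-.}"
task="${1:-convert}"

# One directory per step, each with the same five subdirectories.
for sub in src input output hand frozen; do
    mkdir -p "$project_root/$task/$sub"
done

# Show the resulting subdirectories.
ls "$project_root/$task"
```

Because `mkdir -p` is idempotent, running the sketch against an existing step directory leaves its contents untouched.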
Run the complete pipeline:
make all

Or run individual steps:
make convert # Excel to Parquet conversion
make dedupe # Deduplication
make wrangle # Add facility/county data
make kq1 # Citizenship analysis by AOR
make kq2 # Accelerated deportation analysis