91 changes: 91 additions & 0 deletions README.md
@@ -11,6 +11,97 @@ This project, undertaken in collaboration with Dr. Miles Corak, examines the pro
* **Analyze Poverty Trends:** Conduct in-depth, cross-national analyses of poverty trends, drawing on the harmonized LIS microdata.
* **Evaluate Policy Impacts:** Assess the impact of various social and economic policies on poverty, using our standardized dataset to draw meaningful comparisons.

## Repository Structure

```
├── DART/ # DART validation and methodological notes
│ ├── MIMA/ # MIMA-related CSV data and visualizations
│ └── Methodological_Notes.md # Documentation of methodology
├── LISSY/ # Core LIS analysis and validation
│ ├── Official_pr_Analysis/ # Main poverty rate analysis pipeline (see below)
│ ├── DART_Validation/ # Validation against DART tables
│ ├── MBM_validation/ # Market Basket Measure validation
│ ├── MIMA5/ # MIMA-5 specific analysis
│ └── Tutorial/ # Tutorial materials
├── METIS-LIS/ # METIS-LIS integration and codebooks
├── analysis/ # Data availability analysis
├── docs/ # Project documentation
├── present-Nov21/ # Presentation materials (2000, 2008, 2018 base years)
├── scripts/ # Utility scripts (e.g., HTML to MD conversion)
├── xlsxConverted/ # Converted Excel files (CSV, JSON, MD formats)
├── xlsxFiles/ # Original Excel source files
├── compute_mima.py # Core MIMA computation script
├── convert_excel.py # Excel file conversion utility
└── USAGE_GUIDE.md # Usage guide for the project
```

## Official_pr_Analysis Folder

The `LISSY/Official_pr_Analysis/` folder contains the **official poverty rate analysis pipeline**: the core analytical workflow for computing and validating poverty rates across multiple countries using Luxembourg Income Study (LIS) microdata.

### Purpose

This folder implements the **MIMA (Median Income Moving Average)** methodology for calculating poverty rates and validates these calculations against official government benchmarks from multiple countries.

### Structure

```
Official_pr_Analysis/
├── benchmarks/ # Official poverty rate benchmarks by country
│ ├── ca/ # Canada (Market Basket Measure benchmarks)
│ ├── de/ # Germany (Armutsgefährdungsquoten data)
│ ├── eu/ # European Union (EU-SILC methodology)
│ ├── uk/ # United Kingdom (HBAI statistics)
│ └── us/ # United States (Census Bureau poverty rates)
├── lissy_data/ # LIS microdata outputs and processing
│ ├── _SCRIPTS/ # Job parsing scripts (parse_lissy_job.py)
│ ├── ca/, de/, uk/, us/ # Country-specific LIS job outputs
│ ├── mima_algorithm_explanation.md # Detailed MIMA algorithm documentation
│ └── parse_lissy_job_guide.md # Guide for parsing LIS job logs
├── results/ # Analysis outputs and visualizations
│ ├── ca/ # Canada results (CSV files and plots)
│ └── us/ # United States results
└── scripts/ # Analysis and visualization scripts
    ├── run_analysis.py # Main runner for optimization and visualization
    ├── run_multi_benchmark_analysis.py # Multi-benchmark comparison runner
    ├── parse_lis_output.py # Parser for LIS output files
    ├── calculate_and_plot_mima_from_csv.py # MIMA calculation from CSV
    ├── plot_mima_difference.py # MIMA difference visualization
    ├── plot_npoor_analysis.py # Number of poor analysis plots
    ├── plot_specific_rates.py # Specific rate plotting utilities
    ├── plot_us_vs_benchmarks.py # US vs benchmark comparison plots
    └── single_file_optimize.py # Single-file optimization script
```

### Workflow

1. **Data Extraction**: R scripts run on the LIS LISSY platform to extract household-level data with income variables (`dhi`, `mhi`), household weights (`hpopwgt`), and demographic variables.

2. **Job Parsing**: The `parse_lissy_job.py` script automatically parses LIS job outputs, extracting CSV data from log files and saving them with dynamic filenames.

3. **MIMA Calculation**: The algorithm computes the Median Income Moving Average using a configurable window size and calculates poverty lines as a fraction (α) of MIMA.

4. **Optimization**: Scripts sweep combinations of α (the poverty-line fraction) and w (the window size) to find the settings that best match official government poverty rate benchmarks.

5. **Visualization**: Multiple visualization scripts generate comparative plots showing calculated poverty rates against official benchmarks.
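
The optimization step can be pictured as a plain grid search. The sketch below is illustrative only and is not taken from `run_analysis.py`: the function name `sweep`, the `rate_fn` callable, and the choice of mean absolute error as the matching criterion are all assumptions, since this README does not state which error metric the scripts use.

```python
from itertools import product


def sweep(rate_fn, benchmark, alphas, windows):
    """Grid-search (alpha, w) for the closest match to benchmark rates.

    rate_fn(alpha, w) is a hypothetical callable returning {year: poverty_rate}.
    benchmark is {year: official_rate}. The error metric (mean absolute error
    over the years present in both) is an assumption made for this sketch.
    """
    best = None
    for alpha, w in product(alphas, windows):
        rates = rate_fn(alpha, w)
        common = [year for year in benchmark if year in rates]
        if not common:
            continue  # no overlapping years, nothing to compare
        mae = sum(abs(rates[y] - benchmark[y]) for y in common) / len(common)
        if best is None or mae < best[2]:
            best = (alpha, w, mae)
    return best


# Toy stand-in for the real poverty-rate computation (made-up numbers).
def toy_rate_fn(alpha, w):
    return {2019: 0.2 * alpha, 2020: 0.2 * alpha + 0.01 * w}


toy_benchmark = {2019: 0.10, 2020: 0.12}
best_alpha, best_w, best_err = sweep(
    toy_rate_fn, toy_benchmark, alphas=[0.4, 0.5, 0.6], windows=[1, 2, 3]
)
print(best_alpha, best_w, best_err)
```

In practice `rate_fn` would wrap the MIMA poverty-rate calculation for one country, and a different criterion (for example root mean squared error) could be substituted by changing a single line.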

### Key Algorithm (MIMA)

The MIMA algorithm (documented in `mima_algorithm_explanation.md`) calculates poverty rates using:

- **Equivalized Income**: Household income divided by the square root of household size
- **Moving Average Median**: Trailing average of annual median incomes over w years
- **Poverty Line**: α × MIMA (where α typically ranges from 0.4 to 0.65)
- **Poverty Rate**: Proportion of population below the poverty line, weighted by person-level population weights
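
The four components above can be sketched in a few lines of NumPy. This is a minimal illustration and not the code in `compute_mima.py`. It makes several assumptions: the trailing window includes the current year, the annual median is a person-weighted median, and person-level weights are formed as household weight times household size. The helper names and toy numbers are hypothetical.

```python
import numpy as np


def equivalize(income, hh_size):
    """Household income divided by the square root of household size."""
    return np.asarray(income, dtype=float) / np.sqrt(np.asarray(hh_size, dtype=float))


def weighted_median(values, weights):
    """Smallest value at which cumulative weight reaches half the total."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    cum = np.cumsum(weights[order])
    idx = np.searchsorted(cum, 0.5 * cum[-1])
    return values[order][idx]


def mima(medians_by_year, year, w):
    """Trailing mean of annual medians over the w years ending at `year`.

    Assumption: the window includes `year` itself.
    """
    window = [medians_by_year[y] for y in range(year - w + 1, year + 1)]
    return sum(window) / w


def poverty_rate(eq_income, person_weights, line):
    """Weighted share of persons with equivalized income below the line."""
    eq_income = np.asarray(eq_income, dtype=float)
    person_weights = np.asarray(person_weights, dtype=float)
    return person_weights[eq_income < line].sum() / person_weights.sum()


# Toy example with made-up numbers: three households in 2020.
hh_income = [12000, 40000, 90000]
hh_size = np.array([1, 4, 9])
hh_weight = np.array([100, 50, 20])   # stands in for hpopwgt
person_w = hh_weight * hh_size        # assumed person-level weighting

annual_medians = {2018: 30000.0, 2019: 32000.0, 2020: 34000.0}  # hypothetical
alpha, w = 0.5, 3
line = alpha * mima(annual_medians, 2020, w)
rate = poverty_rate(equivalize(hh_income, hh_size), person_w, line)
print(f"poverty line: {line:.0f}, poverty rate: {rate:.3f}")
```

Whether incomes are deflated to constant prices before averaging is not stated in this README, so the sketch leaves the annual medians in whatever units they are supplied.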

### Countries Analyzed

- **Canada (ca)**: Validated against Market Basket Measure (MBM) data
- **United States (us)**: Validated against Census Bureau historical poverty rates
- **United Kingdom (uk)**: Validated against Households Below Average Income (HBAI) statistics
- **Germany (de)**: Validated against Armutsgefährdungsquoten (at-risk-of-poverty rate) data
- **European Union (eu)**: EU-SILC methodology reference

## Methodology

TBD