EEbrami · Copilot · Dec 2, 2025 · Dec 2, 2025
diff --git a/README.md b/README.md
@@ -11,6 +11,97 @@ This project, undertaken in collaboration with Dr. Miles Corak, examines the pro
 * **Analyze Poverty Trends:** Conduct in-depth, cross-national analyses of poverty trends, drawing on the harmonized LIS microdata.
 * **Evaluate Policy Impacts:** Assess the impact of various social and economic policies on poverty, using our standardized dataset to draw meaningful comparisons.
 
+## Repository Structure
+
+```
+├── DART/                          # DART validation and methodological notes
+│   ├── MIMA/                      # MIMA-related CSV data and visualizations
+│   └── Methodological_Notes.md   # Documentation of methodology
+├── LISSY/                         # Core LIS analysis and validation
+│   ├── Official_pr_Analysis/      # Main poverty rate analysis pipeline (see below)
+│   ├── DART_Validation/           # Validation against DART tables
+│   ├── MBM_validation/            # Market Basket Measure validation
+│   ├── MIMA5/                     # MIMA-5 specific analysis
+│   └── Tutorial/                  # Tutorial materials
+├── METIS-LIS/                     # METIS-LIS integration and codebooks
+├── analysis/                      # Data availability analysis
+├── docs/                          # Project documentation
+├── present-Nov21/                 # Presentation materials (2000, 2008, 2018 base years)
+├── scripts/                       # Utility scripts (e.g., HTML to MD conversion)
+├── xlsxConverted/                 # Converted Excel files (CSV, JSON, MD formats)
+├── xlsxFiles/                     # Original Excel source files
+├── compute_mima.py                # Core MIMA computation script
+├── convert_excel.py               # Excel file conversion utility
+└── USAGE_GUIDE.md                 # Usage guide for the project
+```
+
+## Official_pr_Analysis Folder
+
+The `LISSY/Official_pr_Analysis/` folder contains the **official poverty rate analysis pipeline**, the core analytical workflow for computing and validating poverty rates across multiple countries using Luxembourg Income Study (LIS) microdata.
+
+### Purpose
+
+This folder implements the **MIMA (Median Income Moving Average)** methodology for calculating poverty rates and validates these calculations against official government benchmarks from multiple countries.
+
+### Structure
+
+```
+Official_pr_Analysis/
+├── benchmarks/                    # Official poverty rate benchmarks by country
+│   ├── ca/                        # Canada (Market Basket Measure benchmarks)
+│   ├── de/                        # Germany (Armutsgefährdungsquoten data)
+│   ├── eu/                        # European Union (EU-SILC methodology)
+│   ├── uk/                        # United Kingdom (HBAI statistics)
+│   └── us/                        # United States (Census Bureau poverty rates)
+├── lissy_data/                    # LIS microdata outputs and processing
+│   ├── _SCRIPTS/                  # Job parsing scripts (parse_lissy_job.py)
+│   ├── ca/, de/, uk/, us/         # Country-specific LIS job outputs
+│   ├── mima_algorithm_explanation.md   # Detailed MIMA algorithm documentation
+│   └── parse_lissy_job_guide.md        # Guide for parsing LIS job logs
+├── results/                       # Analysis outputs and visualizations
+│   ├── ca/                        # Canada results (CSV files and plots)
+│   └── us/                        # United States results
+└── scripts/                       # Analysis and visualization scripts
+    ├── run_analysis.py            # Main runner for optimization and visualization
+    ├── run_multi_benchmark_analysis.py  # Multi-benchmark comparison runner
+    ├── parse_lis_output.py        # Parser for LIS output files
+    ├── calculate_and_plot_mima_from_csv.py  # MIMA calculation from CSV
+    ├── plot_mima_difference.py    # MIMA difference visualization
+    ├── plot_npoor_analysis.py     # Number of poor analysis plots
+    ├── plot_specific_rates.py     # Specific rate plotting utilities
+    ├── plot_us_vs_benchmarks.py   # US vs benchmark comparison plots
+    └── single_file_optimize.py    # Single-file optimization script
+```
+
+### Workflow
+
+1. **Data Extraction**: R scripts run on the LIS LISSY platform to extract household-level data with income variables (`dhi`, `mhi`), household weights (`hpopwgt`), and demographic variables.
+
+2. **Job Parsing**: The `parse_lissy_job.py` script parses LIS job outputs, extracts the CSV data embedded in the log files, and saves each extract under a dynamically generated filename.
+
+3. **MIMA Calculation**: The algorithm computes the Median Income Moving Average using a configurable window size and calculates poverty lines as a fraction (α) of MIMA.
+
+4. **Optimization**: Scripts sweep combinations of α (the poverty-line fraction) and w (the window size in years) to find the settings that best match official government poverty rate benchmarks.
+
+5. **Visualization**: Multiple visualization scripts generate comparative plots showing calculated poverty rates against official benchmarks.
+
+### Key Algorithm (MIMA)
+
+The MIMA algorithm (documented in `mima_algorithm_explanation.md`) calculates poverty rates using:
+
+- **Equivalized Income**: Household income divided by the square root of household size
+- **Median Income Moving Average (MIMA)**: Trailing average of annual median incomes over w years
+- **Poverty Line**: α × MIMA (where α typically ranges from 0.4 to 0.65)
+- **Poverty Rate**: Proportion of population below the poverty line, weighted by person-level population weights
+
+### Countries Analyzed
+
+- **Canada (ca)**: Validated against Market Basket Measure (MBM) data
+- **United States (us)**: Validated against Census Bureau historical poverty rates
+- **United Kingdom (uk)**: Validated against Households Below Average Income (HBAI) statistics
+- **Germany (de)**: Validated against Armutsgefährdungsquoten (at-risk-of-poverty rate) data
+- **European Union (eu)**: EU-SILC methodology reference
+
 ## Methodology
 
 TBD
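
Note on the "Key Algorithm (MIMA)" bullets added in this diff: they can be read as the minimal Python sketch below. It is an illustration under stated assumptions, not the code in `compute_mima.py`. The function names, the weighted-median rule, the person weight taken as household weight times household size, the skipping of years missing from the window, and the mean-absolute-error objective are all assumptions, and the data are made up.

```python
import math


def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half the total weight."""
    pairs = sorted(zip(values, weights))
    half = 0.5 * sum(weight for _, weight in pairs)
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    raise ValueError("empty input")


def prepare_year(income, hh_size, hh_weight):
    """Equivalize by sqrt(household size) and build person-level weights."""
    eq_income = [y / math.sqrt(n) for y, n in zip(income, hh_size)]
    # Assumption: person weight = household weight x household size.
    person_weight = [wgt * n for wgt, n in zip(hh_weight, hh_size)]
    return eq_income, person_weight


def mima(medians, year, w):
    """Trailing mean of the annual medians over the w years ending at `year`."""
    # Assumption: years missing from `medians` are skipped, not interpolated.
    window = [medians[y] for y in range(year - w + 1, year + 1) if y in medians]
    if not window:
        raise ValueError(f"no medians available for the window ending {year}")
    return sum(window) / len(window)


def poverty_rate(eq_income, person_weight, line):
    """Weighted share of persons with equivalized income below `line`."""
    poor = sum(wgt for y, wgt in zip(eq_income, person_weight) if y < line)
    return poor / sum(person_weight)


def sweep(data, benchmarks, alphas, windows):
    """Grid-search (alpha, w) by mean absolute error against benchmarks."""
    medians = {y: weighted_median(eq, pw) for y, (eq, pw) in data.items()}
    best = None
    for alpha in alphas:
        for w in windows:
            errors = [
                abs(poverty_rate(*data[y], alpha * mima(medians, y, w)) - target)
                for y, target in benchmarks.items()
            ]
            score = sum(errors) / len(errors)
            if best is None or score < best[2]:
                best = (alpha, w, score)
    return best


# Toy data (made up for illustration): four households per year.
data = {
    2000: prepare_year([10000, 20000, 40000, 90000], [1, 4, 4, 9], [1, 1, 1, 1]),
    2001: prepare_year([12000, 24000, 48000, 108000], [1, 4, 4, 9], [1, 1, 1, 1]),
}
medians = {y: weighted_median(eq, pw) for y, (eq, pw) in data.items()}
line_2001 = 0.6 * mima(medians, 2001, w=2)   # alpha = 0.6, two-year window
rate_2001 = poverty_rate(*data[2001], line_2001)
```

In the toy data the 2001 poverty line is 0.6 times the two-year average of the annual medians, and the rate is the person-weighted share of households whose equivalized income falls below it. `sweep` would be called with a dictionary of benchmark rates by year, for example `sweep(data, {2001: 0.28}, [0.5, 0.6], [1, 2])`; a real run would use the benchmark files under `benchmarks/` and whatever loss the analysis scripts actually define.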