Reproducibility Package for the Human Capital Index Plus
The Human Capital Index Plus (HCI+) converts health, education, and labor market indicators into a common metric of expected lifetime productivity, measured as the expected contribution to log earnings. The index combines three components:
- Health Component - Captures productivity effects of surviving to working age and achieving adequate early-life physical growth
- Education Component - Aggregates human capital accumulated during formal schooling
- On-the-Job Learning Component - Measures human capital accumulation after age 18 from work experience
- `01_raw_data` contains the raw data for each indicator.
- `02_programs` contains the main replication code for the project. Most of the code is written in Stata. The entire project can be run from a single file, `run_HCIPlus.do`. Paths can be set automatically by running `profile.do`, which lets the user set the path by locating a `.here` file in the root directory of the project on their local machine.
- `03_output` contains the final output files in either .csv or Stata .dta format. The final dataset is `03_output/hci_plus/hci_plus_index.dta`. The dataset is also available in .xlsx format.
- Clone the repository to your local machine.
- Run `run_HCIPlus.do` to generate the data. This file runs all of the code needed to generate the data. The replicator should expect the code to run for around 5-10 minutes.
- There should be no need to change the working directory. The code should provide a prompt to change the working directory if necessary.
We write lifetime human capital as the sum of three components, using a log-form representation consistent with Mincer-style interpretations:

$$\text{HCI+} = \text{Health} + \text{Education} + \text{On-the-Job Learning}$$
Values for males and females combined (_mf) are shown in the formulas below. The data also contains separate values for males (_m) and females (_f).
We use adult survival (ages 15–60) and stunting as health measures. Both are measures of underlying latent health. Because both capture latent health and adult productivity, the two are each given 50% weight as in Kraay (2019).
Rationale: Following Kraay (2019) and Weil (2007), the earnings penalty per centimeter of height loss is approximately 3.4%. Mortality and stunting are associated with height deficits of 19.2 cm and 10.2 cm, respectively, implying log earnings penalties of 0.65 and 0.35.
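The arithmetic behind these two penalties can be checked directly. The short Python sketch below multiplies the per-centimeter penalty by each height deficit. The variable and function names are illustrative and are not taken from the replication code.

```python
# Illustrative check of the health penalties quoted above.
# All names here are hypothetical; they do not come from the Stata do files.
PENALTY_PER_CM = 0.034  # log earnings penalty per cm of height loss

HEIGHT_DEFICIT_CM = {"mortality": 19.2, "stunting": 10.2}


def log_earnings_penalty(deficit_cm, penalty_per_cm=PENALTY_PER_CM):
    """Return the log earnings penalty implied by a height deficit in cm."""
    return penalty_per_cm * deficit_cm


penalties = {name: log_earnings_penalty(cm) for name, cm in HEIGHT_DEFICIT_CM.items()}

for name, value in penalties.items():
    print(f"{name}: {value:.2f}")  # mortality: 0.65, stunting: 0.35
```

The unrounded products are 0.6528 and 0.3468, which round to the 0.65 and 0.35 used in the text.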
This can be written more compactly by factoring out HLO/625:
Note: For tertiary returns, we assume zero additional return after a country reaches 50% tertiary completion.
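A minimal sketch of how the education inputs could fit together. It rests on three assumptions that the replication code may handle differently: the standard definition LAYS = EYS × HLO/625, the same HLO/625 adjustment applied to pre-primary years, and the tertiary cap applied by truncating the completion share at 0.5. The function names are hypothetical, and the results are in log units rather than index points.

```python
# Hedged sketch of the education terms; names and structure are assumptions.
HLO_BENCHMARK = 625.0            # HLO score treated as full learning
RETURN_PER_LAYS = 0.12           # return per learning-adjusted year
RETURN_PER_TERTIARY_YEAR = 0.16  # return per year of tertiary education
TERTIARY_YEARS = 4
TERTIARY_CAP = 0.50              # no additional return above 50% completion


def learning_adjusted_years(eys, hlo):
    """Scale expected years of schooling by learning relative to the benchmark."""
    return eys * hlo / HLO_BENCHMARK


def schooling_log_return(eys_pp, eys_sa, hlo):
    """Return to pre-primary plus primary/secondary years, factoring out HLO/625."""
    return RETURN_PER_LAYS * (hlo / HLO_BENCHMARK) * (eys_pp + eys_sa)


def tertiary_log_return(completion_share):
    """Return to tertiary education, with the completion share capped at 0.5."""
    capped_share = min(completion_share, TERTIARY_CAP)
    return RETURN_PER_TERTIARY_YEAR * TERTIARY_YEARS * capped_share
```

Under these assumptions, a country at 80% tertiary completion receives the same tertiary term as one at 50%, which is how the cap in the note would operate.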
Employment indicators are split for youth (18-24) and working age (25-64) groups. In both age groups, the following are considered: labor force participation rates, unemployment rates, and the share in wage employment.
Depreciation rate (loss of skills when not working):
Define effective labor force participation and non-participation for youth (18-24):
Returns during youth are:
Define effective labor force participation and non-participation for the working-age group (25-64):
Returns during working age are:
The working-age component is:
Definition of T:
To avoid using two alternative statistics on adult mortality, the country-specific value for
which is obtained by regressing life expectancy on the adult survival rate data in a Poisson regression (pseudo R² = 0.989).
| Component | Return / Parameter | Source / Notes |
|---|---|---|
| Health | 3.4% earnings penalty per cm height loss | Kraay (2019); Weil (2007) |
| Health | 19.2 cm height loss (mortality), 10.2 cm (stunting) | Kraay (2019); Weil (2007) |
| Education | 12% per learning-adjusted year (pre-primary + LAYS) | World Bank analysis based on Schoellman (2012), Jedwab et al. (2023), and Gethin's (2025) meta-analysis |
| Education | 16% per tertiary year (×4 years) | Psacharopoulos & Patrinos (2018) |
| On-the-job learning (youth) | 3.9% per year (wage); 2.0% (non-wage) | World Bank analysis using I2D2 & GLD collections |
| On-the-job learning (adults) | 3.0% per year (wage); 1.8% (non-wage) | World Bank analysis using I2D2 & GLD collections |
| On-the-job learning | 1.25% annual depreciation if not working | Dinerstein et al. (2022) |
The HCI+ scale is closely related to the scale of the previously published World Bank HCI.
The original HCI compared the level of human capital of a country to the benchmark of an ideal society with perfect health and full schooling to age 18. Because human capital accumulates, each component was multiplied over time:
Converting to logarithmic form:
This transformation means that contributions of health, education, and on-the-job learning are additive. This is the scale used in the HCI+.
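For reference, the multiplicative form and its logarithm can be written out as follows. The component labels follow the published HCI in Kraay (2019) and may differ from the notation used elsewhere in this package.

$$\text{HCI} = \text{Survival} \times \text{School} \times \text{Health}$$

$$\ln \text{HCI} = \ln \text{Survival} + \ln \text{School} + \ln \text{Health}$$

Because the logarithm of a product is the sum of the logarithms, each component enters the log-scale index as a separate additive term.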
Converting HCI+ to HCI scale: An interested user can easily convert the HCI+ into units of the original HCI (0-1 scale). Suppose a country has an HCI+ score of 200. The ideal society has a score of 325. The comparable score on the HCI 0-1 scale is:
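The conversion can be sketched in Python under one explicit assumption: that HCI+ points are log units multiplied by 100, so that the gap to the ideal score is divided by 100 and exponentiated. That scaling is an assumption made for this illustration, and the function name is hypothetical.

```python
import math

IDEAL_SCORE = 325.0  # HCI+ score of the ideal society, as stated in the text


def hci_plus_to_hci(hci_plus, ideal=IDEAL_SCORE, points_per_log_unit=100.0):
    """Convert an HCI+ score to the 0-1 HCI scale.

    ASSUMPTION: HCI+ points are 100 x log units, so the log gap to the
    ideal society is (hci_plus - ideal) / 100.
    """
    return math.exp((hci_plus - ideal) / points_per_log_unit)


print(round(hci_plus_to_hci(200.0), 2))  # 0.29 under the stated assumption
```

Under this assumption a country scoring 200 sits at roughly 0.29 on the 0-1 scale, and a country at the ideal score of 325 maps to exactly 1.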
The key dataset is `03_output/hci_plus/hci_plus_index.dta`. This dataset contains the Human Capital Index Plus (HCI+) for each country. The HCI+ is the sum of the component scores for health, education, and on-the-job learning.
Only total values are shown in the table below. However, values for females/males are also available in the data. The naming convention for these columns is as follows:
- `_mf` refers to the total (male + female)
- `_m` refers to the value for males
- `_f` refers to the value for females
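As an illustration of the naming convention, the helper below derives the male and female column names from a total (`_mf`) column name. It assumes the sex suffix is the only part of the name that changes; the helper itself is hypothetical and not part of the replication code.

```python
# Hypothetical helper: map a total (_mf) column name to its sex-specific variants.
SEX_SUFFIXES = {"total": "mf", "male": "m", "female": "f"}


def sex_variants(total_column):
    """Return the total, male, and female column names for a `_mf_` column."""
    if "_mf_" not in total_column:
        raise ValueError(f"expected a column containing '_mf_': {total_column}")
    return {
        label: total_column.replace("_mf_", f"_{suffix}_", 1)
        for label, suffix in SEX_SUFFIXES.items()
    }


print(sex_variants("hcip_mf_2025"))
# {'total': 'hcip_mf_2025', 'male': 'hcip_m_2025', 'female': 'hcip_f_2025'}
```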
Health inputs:
| Column Name | Description | Min | Max |
|---|---|---|---|
| surv_15to60_mf_2025 | Adult survival rate (ages 15-60) | 0 | 1 |
| nostu_mf_2025 | Probability of not being stunted (under 5) | 0 | 1 |
Education inputs:
| Column Name | Description | Min | Max |
|---|---|---|---|
| eys_pp_mf_fill_2025 | Expected years of pre-primary schooling (EYS) | 0 | 3 |
| eys_sa_mf_fill_2025 | Expected years of schooling (primary + secondary) | 0 | 12 |
| hlo_mf_fill_2025 | Harmonized learning outcomes (HLO) | 200 | 625 |
| lays_sa_mf_fill_2025 | Learning-adjusted years of schooling (LAYS) | 0 | 12 |
| ter_ya_mf_fill_2025 | Fraction completing tertiary education | 0 | 0.5 |
Youth (18-24):
| Column Name | Description | Min | Max |
|---|---|---|---|
| lfp_ya_mf_fill_2025 | Labor force participation rate (youth) | 0 | 1 |
| emp_ya_mf_fill_2025 | Employment rate (youth) | 0 | 1 |
| shr_wemp_ya_mf_fill_2025 | Share in wage employment (youth) | 0 | 1 |
Working Age (25-64):
| Column Name | Description | Min | Max |
|---|---|---|---|
| lfp_wa_mf_fill_2025 | Labor force participation rate (working age) | 0 | 1 |
| emp_wa_mf_fill_2025 | Employment rate (working age) | 0 | 1 |
| shr_wemp_wa_mf_fill_2025 | Share in wage employment (working age) | 0 | 1 |
Component scores:
| Column Name | Description |
|---|---|
| hci_health_mf_2025 | Health component score |
| hci_education_mf_2025 | Education component score |
| hci_otj_mf_2025 | On-the-job learning component score |
| hcip_mf_2025 | Human Capital Index Plus (HCI+) total score |
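Since the total score is described as the sum of the three component scores, a simple consistency check can be run on any row of the dataset. The sketch below works on a plain dictionary; the function name is hypothetical and the example values are placeholders, not real country scores.

```python
# Minimal consistency check: the total should equal the sum of the components.
COMPONENTS = ("health", "education", "otj")


def components_sum_to_total(row, sex="mf", year=2025, tol=1e-6):
    """Return True if the three component scores add up to the HCI+ total."""
    parts = sum(row[f"hci_{name}_{sex}_{year}"] for name in COMPONENTS)
    total = row[f"hcip_{sex}_{year}"]
    return abs(parts - total) <= tol


# Placeholder values for illustration only; these are NOT real country scores.
example_row = {
    "hci_health_mf_2025": 40.0,
    "hci_education_mf_2025": 120.0,
    "hci_otj_mf_2025": 30.0,
    "hcip_mf_2025": 190.0,
}

print(components_sum_to_total(example_row))  # True
```

Rows could be obtained, for example, with `pandas.read_stata(...).to_dict("records")`. A looser `tol` may be appropriate if the scores are stored in single precision.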
Identifiers:
| Column Name | Description |
|---|---|
| wbcountryname | Country name |
| wbcode | Country code (ISO3) |
| year | Year of the data |
| wbregion | World Bank region |
| wbincomegroup | World Bank income group |
The health component captures the productivity effects of surviving to working age and achieving adequate early-life physical growth. Two proxies are used:
- Adult survival rate (ages 15–60): Sourced from UN Population Division life tables
- Share of children under age 5 who are not stunted: From WHO–UNICEF–World Bank Joint Malnutrition Estimates
Both indicators rely on long-run evidence linking adult height and productivity. Following Kraay (2019) and Weil (2007), the earnings penalty per centimeter of height loss is approximately 3.4 percent. Mortality and stunting are associated with height deficits of 19.2 cm and 10.2 cm, respectively, implying log earnings penalties of 0.65 and 0.35. These are weighted equally in the HCI+ scoring formula.
The education component aggregates human capital accumulated during formal schooling. Data inputs include:
- Expected years of schooling (EYS): UNESCO UIS
- Harmonized learning outcomes (HLO): World Bank EduAnalytics
- Tertiary completion: UIS and household surveys
Returns to pre-primary and LAYS are combined into a single estimate based on World Bank analysis indicating returns of 9–12 percent per effective learning-adjusted year. The HCI+ uses 12% per LAYS based on:
- Schoellman (2012)
- Islam–Jedwab–Romer (2023)
- Gethin's (2025) meta-analysis of high-quality IV estimates
Tertiary returns follow Psacharopoulos & Patrinos (2018), using a 16 percent return per year for a four-year degree.
The HCI+ introduces a component measuring human capital accumulation after age 18 from work experience. Inputs include:
- Labor force participation rates
- Employment rates
- Share of workers in formal wage employment
Data sources: ILOSTAT and the I2D2/GLD survey database.
Returns to experience differ by job type:
- 3.9% per year in formal wage jobs (youth 18-24)
- 2.0% per year in informal/non-wage work (youth 18-24)
- 3.0% per year in formal wage jobs (adults 25-64)
- 1.8% per year in informal/non-wage work (adults 25-64)
Skills depreciation: A depreciation rate of 1.25% per year is applied to periods out of work, following Dinerstein, Megalokonomou, and Yannelis (2022).
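One way to read these parameters together is as an expected annual return for a given age group. The sketch below is an assumption about how the pieces could combine, not the formula used in the replication code: people in work earn the wage or non-wage return according to the wage-employment share, and people out of work lose skills at the depreciation rate. The input `share_working` (the share of the age group in work) and the function name are hypothetical.

```python
# Hedged sketch: expected annual on-the-job learning return for one age group.
# The combination rule is an ASSUMPTION; only the parameter values come from the text.
RETURNS = {
    "youth": {"wage": 0.039, "non_wage": 0.020},  # ages 18-24
    "adult": {"wage": 0.030, "non_wage": 0.018},  # ages 25-64
}
DEPRECIATION = 0.0125  # annual skill loss when not working


def expected_annual_return(group, share_working, share_wage):
    """Average annual return across working and non-working members of a group."""
    rates = RETURNS[group]
    return_if_working = (
        share_wage * rates["wage"] + (1.0 - share_wage) * rates["non_wage"]
    )
    return share_working * return_if_working - (1.0 - share_working) * DEPRECIATION


print(f"{expected_annual_return('youth', 1.0, 1.0):.4f}")  # 0.0390
print(f"{expected_annual_return('adult', 0.5, 0.5):.4f}")  # 0.0057
```

In the second example, half the group works and earns an average return of 2.4%, while the other half depreciates at 1.25%, giving a net expected return of about 0.6% per year under these assumptions.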
Like any summary measure of complex phenomena, the HCI+ provides a simplified representation of human capital that cannot capture every dimension of health, learning, and work experience.
The mathematical structure of the HCI+ assumes that its components are largely independent of one another. In practice, these independence assumptions do not hold perfectly: for example, children who are stunted are more likely to complete fewer years of schooling and to score lower on learning assessments. The HCI+ does not explicitly account for these correlations because reliable, cross-country data on how human capital components interact are rarely collected systematically.
The additive structure of the HCI+ allows the index to be decomposed by pillar (health, education, on-the-job learning) and by individual components within each pillar. While the report presents separate male and female HCI+ scores, these are computed independently rather than decomposed from a single aggregate measure.
The HCI+ uses the share of the population in wage employment as a proxy for job quality. Other measures of job quality (firm size, job type) were not used due to data limitations.
The HCI+ does not include a direct measure of adult morbidity. However, the index captures the effects of poor health on productivity through:
- Adult survival component (mortality reflects underlying health conditions)
- Labor force participation rates (illness/disability reduces participation)
- Unemployment rates (chronic conditions affect employability)
The HCI+ relies on survey estimates from international organizations (UNESCO, WHO, ILO, UN Population Division) to ensure that data are properly harmonized and that country coverage is complete. Where survey data are unavailable or outdated, modelled data are sometimes used to impute values.
This project is licensed under the MIT License together with the World Bank IGO Rider. The Rider is purely procedural: it reserves all privileges and immunities enjoyed by the World Bank, without adding restrictions to the MIT permissions. Please review both files before using, distributing or contributing.
- All data are publicly available.
- Some data cannot be made publicly available.
- No data can be made publicly available.
All input datasets required to reproduce the HCI+ outputs are listed below with structured metadata: filename(s), source, URL, access date (when recorded in do files), suggested short citation, license/access status, access instructions or subset details used by scripts, and whether the file is included in the repository.
| Filename(s) | Source | URL | Access date | Citation (short) | License / access | Access instructions / subset used | Included in repo? |
|---|---|---|---|---|---|---|---|
| hci_health_nutrition/WPP2024_MORT_F04_1_LIFE_TABLE_SURVIVORS_BOTH_SEXES.xlsx<br>hci_health_nutrition/WPP2024_MORT_F04_2_LIFE_TABLE_SURVIVORS_MALE.xlsx<br>hci_health_nutrition/WPP2024_MORT_F04_3_LIFE_TABLE_SURVIVORS_FEMALE.xlsx<br>hci_health_nutrition/WPP2024_GEN_F01_DEMOGRAPHIC_INDICATORS_COMPACT.xlsx | UN Population Division (World Population Prospects 2024) | https://population.un.org/wpp/ (mortality & demographic downloads) | Oct 15, 2025 | United Nations, Department of Economic and Social Affairs, Population Division (2024). World Population Prospects 2024. Data files. | Public (UNPD terms) | Imported sheets: "Estimates"; used to compute surv_15to60. | Yes |
| hci_health_nutrition/P_Data_Extract_From_Health_Nutrition_and_Population_Statistics.xlsx | World Bank Databank (Health, Nutrition and Population Statistics; UNICEF–WHO–WB JME) | https://databank.worldbank.org/source/health-nutrition-and-population-statistics# | Jun 15, 2025 | World Bank. Health, Nutrition and Population Statistics (UNICEF–WHO–World Bank JME). Data extract Jun 2025. | Public (World Bank Data Terms) | Final outputs use 2009–2024. | Yes |
| hci_on_job_learning/lfp_survey_v2.dta<br>hci_on_job_learning/lfp_modeled_v2.dta<br>hci_on_job_learning/unemp_survey_v2.dta<br>hci_on_job_learning/unemp_modeled_v2.dta<br>hci_on_job_learning/emp_structure_survey_v2.dta<br>hci_on_job_learning/emp_structure_age_modeled_v2.dta<br>hci_on_job_learning/workingage_pop.dta<br>hci_on_job_learning/pop_data_modeled.dta | ILOSTAT (International Labour Organization) | https://ilostat.ilo.org/ | Dec 2025 | ILO (ILOSTAT) database. | Public (ILOSTAT terms) | Do files read the provided .dta survey and modeled files. Reproduce by downloading LFP/unemployment/employment-structure indicators | Yes |
| hci_education/eys_data_for_HCI_Team_Dec_11_2025_v2.dta<br>hci_education/hlo_combined_2025_2020.dta<br>hci_education/rawfull_tertiary.dta | World Bank, Education Analytics team (internal dataset) | internal (contact Education Analytics team) | Dec 2025 | World Bank, Education Analytics Team (2025). EYS/HLO/Tertiary datasets for HCI. | Internal (request access from Education Analytics team) | Do files load these .dta files directly and select many EYS/HLO variables (see 02_programs/hci_education/1.eys.do, 1.hlo.do, 1.tertiary.do). Tertiary logic uses hierarchy TCR > TPR > GGR > GER (UIS indicators). | Yes (internal) |
| 01_raw_data/misc/loggdp_wdi.dta | World Bank (World Development Indicators) | https://databank.worldbank.org/source/world-development-indicators | July 2025 | World Bank. World Development Indicators (GDP per capita). | Public (World Bank Data Terms) | Merged by wbcode and year in multiple do files to provide log GDP per capita for years up to 2024. | Yes |
| 01_raw_data/misc/masterdata.dta | World Bank (project master country metadata) | internal (repo) | July 2025 | Project master metadata file (World Bank). | Internal (project) | Used as canonical country-year panel for merges (wbcode, year); included in repo. | Yes |
| hci_education/rawfull_tertiary.dta (UIS indicators referenced: GGR.6T7, GER.5T8) | World Bank GMD survey data; UNESCO Institute for Statistics (UIS) | https://databrowser.uis.unesco.org/ | Dec 2025 | World Bank GMD survey data; UNESCO Institute for Statistics. UIS Data. | Public (UIS terms) | | Yes |
| hci_health_nutrition/LHCI_component_data_reviewed.dta<br>hci_on_job_learning/LHCI_component_data_reviewed.dta<br>LHCI_component_data_reviewed.dta | World Bank internal (country-team reviewed updates) | internal (repo) | July 2025 | World Bank internal, country-team reviewed updates. | Internal (project) | Internal review files included in repo; contact authors for details on review notes and provenance. | Yes |
| metadata/WB_CLASS_FY26.xlsx | World Bank | internal (repo) | July 2025 | World Bank classification file. | Internal | Used for membership/classification metadata. | Yes |