Description
Background
The purpose of this fork is to use the taxdata repository to create a statistically matched file of tax units for use in taxcalc simulations. This file combines information from the Current Population Survey (CPS) and the IRS Statistics of Income Public Use File (PUF). When this fork was created in 2023, the most recent update to the taxdata repository built the file from the 2011 PUF. In 2023, CBPP purchased the 2015 vintage of the PUF so that modeling would rest on more recent taxpayer information and track reality more closely.
Outline
- Fork the taxdata repository (completed by creating this repository)
- Install taxdata and its dependencies
  - As of this writing, the installation directions are in the README at the root of this fork.
- Code/logic updates
  - Note any code/logic that references year-specific PUF information
  - Update that code to reflect the year 2015
  - If columns, aggregate RECIDs, or other values differ between the 2011 and 2015 files, update those too
- Supplemental data updates
  - Identify data files that are used to process or validate the PUF
    - If any of these are year-specific, enumerate them in a list (in a new issue)
  - Update these data files
- Update 2015 PUF testing targets
  - Currently, some tests check that the processed PUF reproduces certain aggregate outputs for the 2011 vintage. We need to find a source for the equivalent 2015 targets.
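As a starting point for the code/logic updates in the outline, the two audit steps (finding year-specific references and checking for column differences) could be scripted roughly as below. This is only a sketch under stated assumptions: the function names `find_year_references`, `read_header`, and `compare_columns` are hypothetical helpers, not part of taxdata, and the search assumes year-specific logic shows up as the literal `2011` in `.py` files, which will miss indirect references and flag unrelated ones.

```python
import csv
import re
from pathlib import Path

# Assumption: year-specific code mentions the literal year as a standalone token.
YEAR_PATTERN = re.compile(r"\b2011\b")


def find_year_references(root, pattern=YEAR_PATTERN, suffixes=(".py",)):
    """Return (path, line_number, stripped_line) for each line matching the pattern."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path), number, line.strip()))
    return hits


def read_header(csv_path):
    """Read only the first row of a CSV file, assumed to hold the column names."""
    with open(csv_path, newline="") as handle:
        return next(csv.reader(handle))


def compare_columns(old_header, new_header):
    """Report column names present in one header but not the other."""
    old, new = set(old_header), set(new_header)
    return {
        "only_in_old": sorted(old - new),
        "only_in_new": sorted(new - old),
    }
```

Running `find_year_references(".")` from the repository root gives a list of candidate lines to review by hand, and `compare_columns(read_header(old_file), read_header(new_file))` gives a first view of column differences, where `old_file` and `new_file` stand for the local paths of the 2011 and 2015 PUF files. Both results are leads for manual review rather than a complete inventory.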
Next steps
It will probably be neater to split these into separate issues rather than tracking everything in one large issue. I will keep this issue as the main reference point for all updates and create smaller issues for discrete tasks.