[WIP]: normalize_shocks and normalize_levels allow user to impose true propositions on sims#1094
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1094 +/- ##
==========================================
- Coverage 73.65% 73.64% -0.01%
==========================================
Files 69 69
Lines 10579 10595 +16
==========================================
+ Hits 7792 7803 +11
- Misses 2787 2792 +5
Continue to review full report at Codecov.
@Mv77, could you review this? And, if you have the right permissions, merge it? It passes all tests and should not affect any existing results, since if the normalize booleans are not set then all it does is divide some things by 1.0. PS. I'm interested to see how much improvement there is from this versus from the Harmenberg thing. And, I can't see any reason they couldn't be combined, to achieve even more improvement. What I really want to do, though, is to get the "reshuffling" technology implemented. @wdu9, you might be interested too. Thanks for your earlier input -- you put me on the right track.
@llorracc I am starting to take a look now. Will get to the code soon, but I have a couple of conceptual questions.
This is exactly right. It goes back to whether we are thinking of our discretizations as defining an exact model that is being solved on its own terms, or whether we are thinking of them as approximations of a "true" model in which the shock is, say, really continuously lognormally distributed. My preference is strongly for the latter, because it is the more general formulation. If two solutions to a model differ because one used a 5 point and the other used a 7 point approximation, then whichever of them is closer to what you get when you have an infinity-point approximation is the one that is defined as being "closer" to the truth. I'd rather do shuffling than dividing, because shuffling has the virtue that the simulated outcomes match numerically identically the computations that went into the calculation of the expectations, and it is deeply attractive to have identical calculations going into the solution and the simulation phases. But implementing shuffling would require considerably more work, and it is possible that dividing by the mean gets 95 percent of the benefits -- that is something I want to figure out.
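The "shuffling vs. dividing" distinction described above can be sketched in a few lines of numpy. Everything here is illustrative: the 5-point equiprobable atoms and the population size are made up for the example and are not HARK's actual discretization.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 5-point equiprobable discretization of a mean-one shock
atoms = np.array([0.80, 0.92, 1.00, 1.08, 1.20])
n_agents = 10_000  # chosen divisible by the number of atoms

# (a) "Dividing": draw iid from the atoms, then rescale by the sample mean,
# which forces the cross-sectional mean to be exactly 1
draws = rng.choice(atoms, size=n_agents)
divided = draws / draws.mean()

# (b) "Shuffling": give each atom to exactly n_agents/5 agents and permute,
# so the simulated cross-section reproduces the discretization identically
shuffled = rng.permutation(np.repeat(atoms, n_agents // len(atoms)))
```

Both cross-sections end up with mean one, but only the shuffled one contains exactly the same values, with exactly the same frequencies, that entered the expectations in the solution step.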
No, PermGroFac is a separate object from PermShk. Throughout, we have always insisted that E[PermShk]=1, and have handled either life cycle patterns or aggregate growth using PermGroFac. So, it's not a problem to impose E[PermShk]=1.
This may be a very good catch: I have not examined the simulation code carefully enough to determine whether, as I had assumed, the draw of the permanent shock occurs before the calculation of b or m. If it does, then we're fine. That was the case in my original Mathematica code, so I had assumed it is the case in our HARK simulations, but if not then this step may need to be moved to some earlier point. (Though, if that is the case, maybe some renaming will be in order, since I think of "transitions" as defining how you get from t-1 to t, and if by the time you get to "transitions" some of that has already been done then our nomenclature may not be ideal).
Fully agree, but you are imposing it here: HARK/ConsumptionSaving/ConsIndShockModel.py, line 1809, at commit 8f0057e.
Here is the relevant code: HARK/ConsumptionSaving/ConsIndShockModel.py, lines 1795 to 1816, at commit 8f0057e. Notice
Oh, yes, you're right about that. When I did that my thought was "the right way to handle this is to have an aggregate PLvl variable that tracks the aggregate movements and an idiosyncratic pLvl whose mean should always be 1 but I have a sneaking suspicion we have not done it that way ... even though it's being done that way in the particular case I'm working with right now (Harmenberg-Aggregation)."
Pull request overview
This work-in-progress PR implements shock and level normalization during simulations to improve efficiency. The approach imposes known population properties (e.g., that mean permanent and transitory shocks equal 1) on stochastic draws.
Changes:
- Added two new boolean parameters, normalize_shocks and normalize_levels, to control normalization behavior
- Modified the transition() method in PerfForesightConsumerType to normalize permanent income levels when enabled
- Modified the get_shocks() method in IndShockConsumerType to normalize permanent and transitory shocks when enabled
    if not hasattr(self, "normalize_levels"):
        self.normalize_levels = False

    if self.normalize_levels == True:
Using '== True' is redundant in Python. The idiomatic way to check boolean values is to use 'if self.normalize_levels:' instead of 'if self.normalize_levels == True:'.
Suggested change:
-   if self.normalize_levels == True:
+   if self.normalize_levels:
    PermShkMeanNow = np.mean(IncShks[0])
    TranShkMeanNow = np.mean(IncShks[1])

    PermShkNow[these] = (
-       IncShks[0, :] * PermGroFacNow
+       (IncShks[0, :] * PermGroFacNow
+        / PermShkMeanNow)  # Divide by 1.0 if normalize_shocks=False
    )  # permanent "shock" includes expected growth
-   TranShkNow[these] = IncShks[1, :]
+   TranShkNow[these] = IncShks[1, :] / TranShkMeanNow
Potential division by zero issue. If the sampled shocks have a mean of exactly zero (or very close to zero), dividing by PermShkMeanNow or TranShkMeanNow will cause division by zero or numerical instability. This should be guarded against, perhaps by only normalizing when the mean is above a certain threshold.
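One way to guard against this is sketched below; the helper function and the tolerance value are illustrative choices for the example, not part of HARK.

```python
import numpy as np

def safe_normalize(shocks, tol=1e-8):
    """Divide shocks by their sample mean, but only when that mean is
    safely away from zero; otherwise return them unchanged.
    The tolerance is an illustrative choice, not a HARK constant."""
    mean = np.mean(shocks)
    if np.abs(mean) < tol:
        return shocks  # skip normalization rather than divide by ~0
    return shocks / mean

normalized = safe_normalize(np.array([0.5, 1.5]))  # mean is 1, so unchanged
```

For the income shocks in this PR the sample mean should never be near zero (the shocks are positive with population mean one), so in practice the guard mainly protects against degenerate or misconfigured distributions.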
    # In the limit, it cannot hurt to impose "true" restrictions,
    # like the fact that the mean value of the shocks should be one
    PermShkMeanNow, TranShkMeanNow = 1.0, 1.0  # Dividing by 1 changes nothing
    if self.normalize_shocks == True:
Using '== True' is redundant in Python. The idiomatic way to check boolean values is to use 'if self.normalize_shocks:' instead of 'if self.normalize_shocks == True:'.
    PermShkMeanNow = np.mean(IncShkDstnNow.X[0][EventDraws])
    TranShkMeanNow = np.mean(IncShkDstnNow.X[1][EventDraws])

    PermShkNow[these] = (
-       IncShkDstnNow.X[0][EventDraws] * PermGroFacNow
+       (IncShkDstnNow.X[0][EventDraws] * PermGroFacNow
+        / PermShkMeanNow)  # Divide by 1.0 if normalize_shocks=False
    )  # permanent "shock" includes expected growth
-   TranShkNow[these] = IncShkDstnNow.X[1][EventDraws]
+   TranShkNow[these] = IncShkDstnNow.X[1][EventDraws] / TranShkMeanNow
Potential division by zero issue. If the sampled shocks have a mean of exactly zero (or very close to zero), dividing by PermShkMeanNow or TranShkMeanNow will cause division by zero or numerical instability. This should be guarded against, perhaps by only normalizing when the mean is above a certain threshold.
    PermShkNow[these] = (
-       IncShkDstnNow.X[0][EventDraws] * PermGroFacNow
+       (IncShkDstnNow.X[0][EventDraws] * PermGroFacNow
+        / PermShkMeanNow)  # Divide by 1.0 if normalize_shocks=False
    )  # permanent "shock" includes expected growth
The normalization logic may be mathematically incorrect. When normalize_shocks is True, the permanent shock is computed as (IncShkDstnNow.X[0][EventDraws] * PermGroFacNow / PermShkMeanNow). However, PermShkMeanNow is calculated from the raw shock values (IncShkDstnNow.X[0][EventDraws]), not from the values after multiplying by PermGroFacNow. This means the normalization is forcing the mean of the raw shocks to be 1, but the actual permanent shocks (which include PermGroFacNow) will have a mean of PermGroFacNow, not 1. If the intent is to normalize the final permanent shocks to have mean PermGroFacNow, then the calculation should be: PermShkMeanNow = np.mean(IncShkDstnNow.X[0][EventDraws] * PermGroFacNow).
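The distinction is easy to verify numerically. The growth factor and lognormal calibration below are illustrative, not HARK defaults. Note that a mean of exactly PermGroFacNow for the combined "shock" may in fact be the intended behavior, since elsewhere in the thread it is stated that PermGroFac is the object meant to carry growth while E[PermShk] = 1.

```python
import numpy as np

rng = np.random.default_rng(1)
PermGroFacNow = 1.02  # hypothetical growth factor
raw = rng.lognormal(mean=-0.005, sigma=0.1, size=1_000)

# As written in the PR: normalize the *raw* shocks by their sample mean,
# then apply growth. The raw shocks now have mean exactly 1 ...
normalized_raw = raw / raw.mean()

# ... so the combined "shock" has mean exactly PermGroFacNow, not 1.
perm_shk = normalized_raw * PermGroFacNow
```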
    PermShkNow[these] = (
-       IncShks[0, :] * PermGroFacNow
+       (IncShks[0, :] * PermGroFacNow
+        / PermShkMeanNow)  # Divide by 1.0 if normalize_shocks=False
    )  # permanent "shock" includes expected growth
The normalization logic may be mathematically incorrect. When normalize_shocks is True, the permanent shock is computed as (IncShks[0, :] * PermGroFacNow / PermShkMeanNow). However, PermShkMeanNow is calculated from the raw shock values (IncShks[0]), not from the values after multiplying by PermGroFacNow. This means the normalization is forcing the mean of the raw shocks to be 1, but the actual permanent shocks (which include PermGroFacNow) will have a mean of PermGroFacNow, not 1. If the intent is to normalize the final permanent shocks to have mean PermGroFacNow, then the calculation should be: PermShkMeanNow = np.mean(IncShks[0] * PermGroFacNow).
    "normalize_shocks": False,  # In sims, normalize mean of collection of shocks to population mean
    "normalize_levels": False,  # In sims, normalize mean of a level variable (like permanent income) to the population mean
The new parameters normalize_shocks and normalize_levels are added to the init_idiosyncratic_shocks dictionary but are not documented in the class docstring. The IndShockConsumerType class docstring (lines 2061-2072) should be updated to document these new parameters, explaining what they do and when they should be used.
    if not hasattr(self, "normalize_shocks"):
        self.normalize_shocks = False

    if not hasattr(self, "normalize_levels"):
        self.normalize_levels = False

    if self.normalize_levels == True:
        pLvlNowMean = np.mean(pLvlNow)

    pLvlNow = pLvlNow / pLvlNowMean  # Divide by 1.0 if normalize_levels=False
The normalize_levels parameter is used in the PerfForesightConsumerType.transition() method but is only added to init_idiosyncratic_shocks, not to init_perfect_foresight. This means that PerfForesightConsumerType agents won't have this parameter by default, and the hasattr check will always set it to False. If normalize_levels is intended to be used by PerfForesightConsumerType, it should also be added to init_perfect_foresight with a default value of False.
    if not hasattr(self, "normalize_levels"):
        self.normalize_levels = False
Inconsistent spacing in comment. There's trailing whitespace after the inline comment, which should be removed for consistency with Python style guidelines.
Suggested change (trailing whitespace removed):
    if not hasattr(self, "normalize_levels"):
        self.normalize_levels = False
    if not hasattr(self, "normalize_shocks"):
        self.normalize_shocks = False

    if not hasattr(self, "normalize_levels"):
        self.normalize_levels = False

    if self.normalize_levels == True:
Using hasattr checks in the transition method is problematic. These checks will only be executed once per simulation step, and if the attributes don't exist initially, they'll be set to False dynamically. This approach is fragile and unconventional. Instead, the parameters should be initialized properly in the agent's initialization or retrieved from the parameter dictionary. Consider using getattr with a default value instead, such as: self.normalize_shocks = getattr(self, 'normalize_shocks', False)
Suggested change:
-   if not hasattr(self, "normalize_shocks"):
-       self.normalize_shocks = False
-   if not hasattr(self, "normalize_levels"):
-       self.normalize_levels = False
-   if self.normalize_levels == True:
+   # Use getattr with a default instead of dynamically creating attributes here
+   if getattr(self, "normalize_levels", False):
This implements a simple version of an old idea from CDC's Mathematica code: simulation efficiency can be substantially improved by imposing on the stochastic draws facts that we know are true in the population, such as that the average values of the permanent and transitory shocks are 1 in each period.
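A toy Monte Carlo illustrates the efficiency gain from this idea. The lognormal calibration and sample sizes below are illustrative, not HARK defaults.

```python
import numpy as np

rng = np.random.default_rng(42)
n_agents, n_reps = 500, 2_000

# Compare the cross-sectional mean of raw vs. mean-normalized shocks
# across many independent simulations
raw_means, norm_means = [], []
for _ in range(n_reps):
    shk = rng.lognormal(mean=-0.02, sigma=0.2, size=n_agents)
    raw_means.append(shk.mean())
    norm_means.append((shk / shk.mean()).mean())  # exactly 1 by construction

# Raw sample means wobble around 1 from simulation to simulation;
# normalized means are exactly 1, so that source of noise is eliminated.
raw_spread, norm_spread = np.std(raw_means), np.std(norm_means)
```

Normalization removes one source of cross-simulation noise entirely, which is why dividing by the sample mean may capture much of the benefit of the more elaborate shuffling approach discussed earlier in the thread.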