
[WIP]: normalize_shocks and normalize_levels allow user to impose true propositions on sims#1094

Open
llorracc wants to merge 9 commits into main from Normalize-To-Truths

Conversation

@llorracc (Collaborator) commented Dec 23, 2021

This implements a simple version of an old idea from CDC Mathematica code: Simulation efficiency can be substantially improved by imposing on the stochastic draws facts that we know are true in the population, like that the average values of permanent and transitory shocks are 1 in each period.
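The idea can be sketched in a few lines of NumPy (the variable names below are illustrative, not HARK's API): in any finite sample the draws only approximately satisfy the population restriction, but dividing by the sample mean imposes it exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw lognormal "permanent shocks" whose population mean is 1
sigma = 0.1
shocks = rng.lognormal(mean=-sigma**2 / 2, sigma=sigma, size=10_000)

# In any finite sample the mean is only approximately 1 ...
print(shocks.mean())  # close to, but not exactly, 1

# ... but dividing by the sample mean imposes the true restriction exactly
normalized = shocks / shocks.mean()
print(normalized.mean())  # 1.0 up to floating-point error
```

This is the whole trick: sampling noise in the cross-sectional mean of the shocks is removed by construction, rather than averaged away with a larger population.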

@llorracc llorracc marked this pull request as draft December 23, 2021 18:58
@llorracc llorracc requested a review from Mv77 December 23, 2021 18:58
@llorracc llorracc changed the title normalize_shocks and normalize_levels allow user to impose true propositions on sims [WIP]: normalize_shocks and normalize_levels allow user to impose true propositions on sims Dec 23, 2021
@llorracc llorracc marked this pull request as ready for review January 4, 2022 17:15
@codecov-commenter commented Jan 4, 2022

Codecov Report

Merging #1094 (8f0057e) into master (4c002f1) will decrease coverage by 0.00%.
The diff coverage is 72.22%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1094      +/-   ##
==========================================
- Coverage   73.65%   73.64%   -0.01%     
==========================================
  Files          69       69              
  Lines       10579    10595      +16     
==========================================
+ Hits         7792     7803      +11     
- Misses       2787     2792       +5     
Impacted Files                               Coverage Δ
HARK/ConsumptionSaving/ConsIndShockModel.py  85.55% <72.22%> (-0.32%) ⬇️


Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4c002f1...8f0057e.

@llorracc (Collaborator, Author) commented Jan 5, 2022

@Mv77, could you review this? And, if you have the right permissions, merge it?

It passes all tests and should not affect any existing results, since if the normalize booleans are not set then all it does is divide some things by 1.0.

PS. I'm interested to see how much improvement there is from this versus from the Harmenberg thing. And, I can't see any reason they couldn't be combined, to achieve even more improvement. What I really want to do, though, is to get the "reshuffling" technology implemented.

@wdu9, you might be interested too. Thanks for your earlier input -- you put me on the right track.

@Mv77 (Collaborator) commented Jan 8, 2022

@llorracc I am starting to take a look now. Will get to the code soon, but I have a couple of conceptual questions.

  • On dividing shocks by their means: I believe that doing things like TranShkNow[these] = IncShks[1, :] / TranShkMeanNow introduces the possibility of (normalized) transitory shocks taking on values not included in the discrete approximation TranShkDstn.X. How do you feel about that? The agent is drawing shocks that were not allowed in the expectations he formed. I believe this points back to the debate on whether these discretized solutions should be seen as approximations of the continuous problem. I wanted to hear your thoughts on that, since I believe this fact complicates things you have been thinking about (like shuffling).

  • On normalizing levels:
    a) There should be some guardrails ensuring that pLvlNow = pLvlNow / pLvlNowMean happens only for models in which it is in fact true that the average pLvl should be 1 every period. This excludes e.g. life-cycle models or models with aggregate growth, right?
    b) pLvlNow = pLvlNow / pLvlNowMean alters the joint distribution of (m, p). Should we worry about that? Consider an agent who ends period 0 with p = 1, a = 1. At the start of period 1 he draws PermShk = 0.5, so he has approximately p_1 = 0.5, m_1 = 2. But then we apply the pLvl normalization, which can shift his p_1 up or down while leaving m_1 fixed. This is picky; the normalization should have a very small effect for large populations. Nevertheless, my point is that p affects m through PermShk before it is normalized, and I wonder whether normalizing p after setting m, without taking this link into account, can have perverse consequences.
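The first point above can be seen with a toy two-point discretization (the support values and draw counts below are made up for illustration): once a finite sample's mean deviates from 1, dividing by it moves every draw off the discrete grid the agent's expectations were computed on.

```python
import numpy as np

# A two-point equiprobable approximation with population mean 1
support = np.array([0.8, 1.2])

# A finite sample will rarely split exactly 50/50; suppose 498 low, 502 high
draws = np.array([0.8] * 498 + [1.2] * 502)
print(draws.mean())  # ~1.0008: close to, but not exactly, 1

# Every raw draw lies on the approximation's support, but after
# normalization none do: values like 0.7993... and 1.1990... appear
normalized = draws / draws.mean()
print(np.isin(normalized, support).any())  # False
```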

@llorracc (Collaborator, Author) commented Jan 9, 2022

  • On dividing shocks by their means: I believe that doing things like TranShkNow[these] = IncShks[1, :] / TranShkMeanNow introduces the possibility of (normalized) transitory shocks taking on values not included in the discrete approximation TranShkDstn.X. How do you feel about that? The agent is drawing shocks that were not allowed in the expectations he formed. I believe this points back to the debate on whether these discretized solutions should be seen as approximations of the continuous problem. I wanted to hear your thoughts on that, since I believe this fact complicates things you have been thinking about (like shuffling).

This is exactly right. It goes back to whether we are thinking of our discretizations as defining an exact model that is being solved on its own terms, or whether we are thinking of them as approximations of a "true" model in which the shock is, say, really continuously lognormally distributed. My preference is strongly for the latter, because it is the more general formulation. If two solutions to a model differ because one used a 5 point and the other used a 7 point approximation, then whichever of them is closer to what you get when you have an infinity-point approximation is the one that is defined as being "closer" to the truth. I'd rather do shuffling than dividing, because shuffling has the virtue that the simulated outcomes match numerically identically the computations that went into the calculation of the expectations, and it is deeply attractive to have identical calculations going into the solution and the simulation phases. But implementing shuffling would require considerably more work, and it is possible that dividing by the mean gets 95 percent of the benefits -- that is something I want to figure out.
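A minimal sketch of the reshuffling alternative described above, assuming a discrete approximation with known probabilities (the helper name and numbers are illustrative, not HARK code): each support point is replicated in exact proportion to its probability and the assignment is then randomly permuted, so the simulated cross-section reproduces the discretization's moments exactly while only ever using values that entered the expectations.

```python
import numpy as np

rng = np.random.default_rng(2)

def reshuffled_draws(support, probs, n):
    """Assign each of n agents a support point in (near-)exact
    population proportions, then shuffle the assignment."""
    counts = np.round(np.asarray(probs) * n).astype(int)
    counts[-1] = n - counts[:-1].sum()  # absorb rounding error in last point
    draws = np.repeat(support, counts)
    rng.shuffle(draws)
    return draws

support = np.array([0.7, 1.0, 1.3])  # illustrative 3-point approximation
probs = np.array([0.25, 0.5, 0.25])  # population mean is exactly 1

draws = reshuffled_draws(support, probs, 10_000)
print(draws.mean())  # 1.0 up to floating-point error, using only grid values
```

Unlike dividing by the sample mean, every simulated value here lies on the original grid, which is the "identical calculations in solution and simulation" property described above.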

  • On normalizing levels:
    a) There should be some guardrails ensuring that pLvlNow = pLvlNow / pLvlNowMean happens only for models in which it is in fact true that the average pLvl should be 1 every period. This excludes e.g. life-cycle models or models with aggregate growth, right?

No, PermGroFac is a separate object from PermShk. Throughout, we have always insisted that E[PermShk]=1, and have handled either life cycle patterns or aggregate growth using PermGroFac. So, it's not a problem to impose E[PermShk]=1.

  • b) pLvlNow = pLvlNow / pLvlNowMean alters the joint distribution of (m, p). Should we worry about that? Consider an agent who ends period 0 with p = 1, a = 1. At the start of period 1 he draws PermShk = 0.5, so he has approximately p_1 = 0.5, m_1 = 2. But then we apply the pLvl normalization, which can shift his p_1 up or down while leaving m_1 fixed. This is picky; the normalization should have a very small effect for large populations. Nevertheless, my point is that p affects m through PermShk before it is normalized, and I wonder whether normalizing p after setting m, without taking this link into account, can have perverse consequences.

This may be a very good catch: I have not examined the simulation code carefully enough to determine whether, as I had assumed, the draw of the permanent shock occurs before the calculation of b or m. If it does, then we're fine. That was the case in my original Mathematica code, so I had assumed it is the case in our HARK simulations, but if not then this step may need to be moved to some earlier point. (Though, if that is the case, maybe some renaming will be in order, since I think of "transitions" as defining how you get from t-1 to t, and if by the time you get to "transitions" some of that has already been done then our nomenclature may not be ideal).

@Mv77 (Collaborator) commented Jan 9, 2022

No, PermGroFac is a separate object from PermShk. Throughout, we have always insisted that E[PermShk]=1, and have handled either life cycle patterns or aggregate growth using PermGroFac. So, it's not a problem to impose E[PermShk]=1.

Fully agree, but you are imposing E[pLvl] = 1 in some places:

pLvlNow = pLvlNow / pLvlNowMean # Divide by 1.0 if normalize_levels=False

@Mv77 (Collaborator) commented Jan 9, 2022

This may be a very good catch: I have not examined the simulation code carefully enough to determine whether, as I had assumed, the draw of the permanent shock occurs before the calculation of b or m. If it does, then we're fine. That was the case in my original Mathematica code, so I had assumed it is the case in our HARK simulations, but if not then this step may need to be moved to some earlier point. (Though, if that is the case, maybe some renaming will be in order, since I think of "transitions" as defining how you get from t-1 to t, and if by the time you get to "transitions" some of that has already been done then our nomenclature may not be ideal).

Here is the relevant code

# Calculate new states: normalized market resources and permanent income level
pLvlNow = pLvlPrev * self.shocks['PermShk']  # Updated permanent income level
# Asymptotically it can't hurt to impose true restrictions
# (at least if the GICRaw holds)
pLvlNowMean = 1.0
if not hasattr(self, "normalize_shocks"):
    self.normalize_shocks = False
if not hasattr(self, "normalize_levels"):
    self.normalize_levels = False
if self.normalize_levels == True:
    pLvlNowMean = np.mean(pLvlNow)
pLvlNow = pLvlNow / pLvlNowMean  # Divide by 1.0 if normalize_levels=False
# Updated aggregate permanent productivity level
PlvlAggNow = self.state_prev['PlvlAgg'] * self.PermShkAggNow
# "Effective" interest factor on normalized assets
ReffNow = RfreeNow / self.shocks['PermShk']
bNrmNow = ReffNow * aNrmPrev  # Bank balances before labor income
mNrmNow = bNrmNow + self.shocks['TranShk']  # Market resources after income

Notice pLvlNow is set in line 1796, using PermShk. Then pLvlNow is normalized in line 1809. But then ReffNow, which is used to compute bNrmNow, is set using PermShk in line 1814.

@llorracc (Collaborator, Author) commented Jan 9, 2022

Oh, yes, you're right about that. When I did that my thought was "the right way to handle this is to have an aggregate PLvl variable that tracks the aggregate movements and an idiosyncratic pLvl whose mean should always be 1 but I have a sneaking suspicion we have not done it that way ... even though it's being done that way in the particular case I'm working with right now (Harmenberg-Aggregation)."

Copilot AI (Contributor) left a comment

Pull request overview

This work-in-progress PR implements shock and level normalization during simulations to improve efficiency. The approach imposes known population properties (e.g., that mean permanent and transitory shocks equal 1) on stochastic draws.

Changes:

  • Added two new boolean parameters normalize_shocks and normalize_levels to control normalization behavior
  • Modified transition() method in PerfForesightConsumerType to normalize permanent income levels when enabled
  • Modified get_shocks() method in IndShockConsumerType to normalize permanent and transitory shocks when enabled


if not hasattr(self, "normalize_levels"):
    self.normalize_levels = False

if self.normalize_levels == True:
Copilot AI commented Jan 28, 2026

Using '== True' is redundant in Python. The idiomatic way to check boolean values is to use 'if self.normalize_levels:' instead of 'if self.normalize_levels == True:'.

Suggested change
- if self.normalize_levels == True:
+ if self.normalize_levels:

Copilot uses AI. Check for mistakes.
Comment on lines +2215 to +2222

+ PermShkMeanNow = np.mean(IncShks[0])
+ TranShkMeanNow = np.mean(IncShks[1])

  PermShkNow[these] = (
-     IncShks[0, :] * PermGroFacNow
+     (IncShks[0, :] * PermGroFacNow
+      / PermShkMeanNow)  # Divide by 1.0 if normalize_shocks=False
  )  # permanent "shock" includes expected growth
- TranShkNow[these] = IncShks[1, :]
+ TranShkNow[these] = IncShks[1, :] / TranShkMeanNow
Copilot AI commented Jan 28, 2026

Potential division by zero issue. If the sampled shocks have a mean of exactly zero (or very close to zero), dividing by PermShkMeanNow or TranShkMeanNow will cause division by zero or numerical instability. This should be guarded against, perhaps by only normalizing when the mean is above a certain threshold.

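A guarded variant along the lines this comment suggests might look like the following sketch (the helper name and tolerance are assumptions, not part of the PR):

```python
import numpy as np

def safe_normalize(shocks, tol=1e-8):
    """Divide draws by their sample mean, but skip the
    normalization when the mean is too close to zero."""
    m = np.mean(shocks)
    if abs(m) < tol:
        return shocks  # leave draws untouched rather than divide by ~0
    return shocks / m

print(safe_normalize(np.array([0.9, 1.1])).mean())  # 1.0
print(safe_normalize(np.array([-1.0, 1.0])))        # unchanged; mean is ~0
```

In practice income shocks are strictly positive, so the guard should never trigger for this PR's use case; it only protects against degenerate inputs.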
# In the limit, it cannot hurt to impose "true" restrictions,
# like the fact that the mean value of the shocks should be one
PermShkMeanNow, TranShkMeanNow = 1.0, 1.0  # Dividing by 1 changes nothing
if self.normalize_shocks == True:
Copilot AI commented Jan 28, 2026

Using '== True' is redundant in Python. The idiomatic way to check boolean values is to use 'if self.normalize_shocks:' instead of 'if self.normalize_shocks == True:'.

Comment on lines +2239 to +2246

+ PermShkMeanNow = np.mean(IncShkDstnNow.X[0][EventDraws])
+ TranShkMeanNow = np.mean(IncShkDstnNow.X[1][EventDraws])

  PermShkNow[these] = (
-     IncShkDstnNow.X[0][EventDraws] * PermGroFacNow
+     (IncShkDstnNow.X[0][EventDraws] * PermGroFacNow
+      / PermShkMeanNow)  # Divide by 1.0 if normalize_shocks=False
  )  # permanent "shock" includes expected growth
- TranShkNow[these] = IncShkDstnNow.X[1][EventDraws]
+ TranShkNow[these] = IncShkDstnNow.X[1][EventDraws] / TranShkMeanNow
Copilot AI commented Jan 28, 2026

Potential division by zero issue. If the sampled shocks have a mean of exactly zero (or very close to zero), dividing by PermShkMeanNow or TranShkMeanNow will cause division by zero or numerical instability. This should be guarded against, perhaps by only normalizing when the mean is above a certain threshold.

Comment on lines 2242 to 2245

  PermShkNow[these] = (
-     IncShkDstnNow.X[0][EventDraws] * PermGroFacNow
+     (IncShkDstnNow.X[0][EventDraws] * PermGroFacNow
+      / PermShkMeanNow)  # Divide by 1.0 if normalize_shocks=False
  )  # permanent "shock" includes expected growth
Copilot AI commented Jan 28, 2026

The normalization logic may be mathematically incorrect. When normalize_shocks is True, the permanent shock is computed as (IncShkDstnNow.X[0][EventDraws] * PermGroFacNow / PermShkMeanNow). However, PermShkMeanNow is calculated from the raw shock values (IncShkDstnNow.X[0][EventDraws]), not from the values after multiplying by PermGroFacNow. This means the normalization is forcing the mean of the raw shocks to be 1, but the actual permanent shocks (which include PermGroFacNow) will have a mean of PermGroFacNow, not 1. If the intent is to normalize the final permanent shocks to have mean PermGroFacNow, then the calculation should be: PermShkMeanNow = np.mean(IncShkDstnNow.X[0][EventDraws] * PermGroFacNow).

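The arithmetic behind this observation, in a standalone toy example (the values below are made up): dividing by the mean of the raw draws leaves the scaled shocks with mean PermGroFacNow, while dividing by the mean of the scaled draws forces the mean to 1.

```python
import numpy as np

raw = np.array([0.95, 1.08, 0.97])  # illustrative raw shock draws
PermGroFacNow = 1.02                # growth factor folded into the "shock"

# Normalizing by the mean of the *raw* draws, as in the PR ...
shocks = raw * PermGroFacNow / raw.mean()
print(shocks.mean())  # ~PermGroFacNow (1.02), not 1

# ... versus normalizing by the mean of the *scaled* draws
shocks2 = raw * PermGroFacNow / (raw * PermGroFacNow).mean()
print(shocks2.mean())  # ~1.0
```

Which of the two targets is correct depends on the convention: under HARK's E[PermShk] = 1 convention with growth carried by PermGroFac, normalizing the raw draws to mean 1 before scaling is arguably the intended behavior.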
Comment on lines 2218 to 2221

  PermShkNow[these] = (
-     IncShks[0, :] * PermGroFacNow
+     (IncShks[0, :] * PermGroFacNow
+      / PermShkMeanNow)  # Divide by 1.0 if normalize_shocks=False
  )  # permanent "shock" includes expected growth
Copilot AI commented Jan 28, 2026

The normalization logic may be mathematically incorrect. When normalize_shocks is True, the permanent shock is computed as (IncShks[0, :] * PermGroFacNow / PermShkMeanNow). However, PermShkMeanNow is calculated from the raw shock values (IncShks[0]), not from the values after multiplying by PermGroFacNow. This means the normalization is forcing the mean of the raw shocks to be 1, but the actual permanent shocks (which include PermGroFacNow) will have a mean of PermGroFacNow, not 1. If the intent is to normalize the final permanent shocks to have mean PermGroFacNow, then the calculation should be: PermShkMeanNow = np.mean(IncShks[0] * PermGroFacNow).

Comment on lines +2054 to +2055
"normalize_shocks": False, # In sims, normalize mean of collection of shocks to population mean
"normalize_levels": False, # In sims, normalize mean of a level variable (like permanent income) to the population mean
Copilot AI commented Jan 28, 2026

The new parameters normalize_shocks and normalize_levels are added to the init_idiosyncratic_shocks dictionary but are not documented in the class docstring. The IndShockConsumerType class docstring (lines 2061-2072) should be updated to document these new parameters, explaining what they do and when they should be used.

Comment on lines +1800 to +1809

if not hasattr(self, "normalize_shocks"):
    self.normalize_shocks = False

if not hasattr(self, "normalize_levels"):
    self.normalize_levels = False

if self.normalize_levels == True:
    pLvlNowMean = np.mean(pLvlNow)

pLvlNow = pLvlNow / pLvlNowMean  # Divide by 1.0 if normalize_levels=False
Copilot AI commented Jan 28, 2026

The normalize_levels parameter is used in the PerfForesightConsumerType.transition() method but is only added to init_idiosyncratic_shocks, not to init_perfect_foresight. This means that PerfForesightConsumerType agents won't have this parameter by default, and the hasattr check will always set it to False. If normalize_levels is intended to be used by PerfForesightConsumerType, it should also be added to init_perfect_foresight with a default value of False.

Comment on lines +1802 to +1805

if not hasattr(self, "normalize_levels"):
    self.normalize_levels = False
Copilot AI commented Jan 28, 2026

Inconsistent spacing in comment. There's trailing whitespace after the inline comment, which should be removed for consistency with Python style guidelines.

Suggested change (removes the trailing whitespace; the lines are otherwise identical)
- if not hasattr(self, "normalize_levels"):
-     self.normalize_levels = False
+ if not hasattr(self, "normalize_levels"):
+     self.normalize_levels = False

Comment on lines +1800 to +1806

if not hasattr(self, "normalize_shocks"):
    self.normalize_shocks = False

if not hasattr(self, "normalize_levels"):
    self.normalize_levels = False

if self.normalize_levels == True:
Copilot AI commented Jan 28, 2026

Using hasattr checks in the transition method is problematic. These checks will only be executed once per simulation step, and if the attributes don't exist initially, they'll be set to False dynamically. This approach is fragile and unconventional. Instead, the parameters should be initialized properly in the agent's initialization or retrieved from the parameter dictionary. Consider using getattr with a default value instead, such as: self.normalize_shocks = getattr(self, 'normalize_shocks', False)

Suggested change
- if not hasattr(self, "normalize_shocks"):
-     self.normalize_shocks = False
- if not hasattr(self, "normalize_levels"):
-     self.normalize_levels = False
- if self.normalize_levels == True:
+ # Use getattr with a default instead of dynamically creating attributes here
+ if getattr(self, "normalize_levels", False):

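In isolation, the getattr pattern the reviewer suggests behaves like this (the Agent class below is a bare stand-in, not HARK's AgentType):

```python
class Agent:
    pass

a = Agent()

# getattr with a default reads the flag without mutating the instance,
# unlike the hasattr-then-assign pattern in the PR
print(getattr(a, "normalize_levels", False))  # False: attribute absent

a.normalize_levels = True
print(getattr(a, "normalize_levels", False))  # True: attribute now set
```

The behavioral difference is small but real: with getattr, an agent that never sets the flag also never gains the attribute as a side effect of simulating.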

Labels

None yet

Projects

Status: Stale

Development

Successfully merging this pull request may close these issues.

4 participants