4 changes: 1 addition & 3 deletions docs/conf.py
@@ -127,8 +127,6 @@
# a list of builtin themes.
html_theme = "pydata_sphinx_theme"

html_theme_options = {
"github_url": "https://github.com/OpenSourceEconomics/respy",
}
html_theme_options = {"github_url": "https://github.com/OpenSourceEconomics/respy"}

html_css_files = ["css/custom.css"]
267 changes: 149 additions & 118 deletions docs/reference_guides/state_space.rst
@@ -3,156 +3,187 @@ The State Space

.. currentmodule:: respy.state_space

The implementation of the state space in respy allows the user to solve and analyze a
wide range of models in an efficient way by storing the essential information about the
model and by acting as a precise and simple interface between different components of
the model. First and foremost a state space contains a register of all possible states of
the universe that a particular model allows for. Once a model contains a rich set of
features it is vital not to repeat information so as to keep the analysis efficient.
respy defines full states implicitly as combinations of coarser states to avoid any
duplication. Furthermore we group states in a way such that all members of one group are
treated symmetrically throughout the model solution. The state space moreover contains a
set of methods that facilitate the required communication between different state space
groups. This guide contains an explanation of the most important components of the state
space.

The implementation of the state space in respy serves several purposes.
Member

I think this paragraph should be split into two parts.

The first part defines what is meant by the state space which is exactly what you do in the beginning. If you feel you delve too much into the economics, you could draft references to the explanations written by Rafael and Benedikt.

The other part defines the problem we need to solve. Talk about memory demands. Reference the curse of dimensionality. Maybe a back-of-the-envelope calculation: 52 million states for KW97. A float64 variable costs roughly 400 MB and an int8 variable 52 MB. What is a float? 10 wages and nonpecs, 1 expected value function, temporarily 5 continuation values, 4 of 16 covariates -> 20 * 400 MB = 8 GB. What is uint8? 6 state dimensions, 12 covariates -> 18 * 52 MB = roughly 1 GB.

Then, I could also add some history information. The evolution of the state space: storing all states in tabular format and having a matrix to find child states -> having core states in tabular format and adding dense information if necessary combined with period specific indexer.
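
The back-of-the-envelope calculation suggested in this comment can be reproduced in a few lines. This is only a sketch: the state count and the number of arrays are the figures quoted in the comment, and all variable names are illustrative rather than part of respy.

```python
# Figures quoted in the review comment; names are illustrative, not respy API.
n_states = 52_000_000  # approximate number of states in KW97

bytes_float64 = 8  # size of one float64 value
bytes_uint8 = 1  # size of one uint8 / int8 value

# Memory for one variable stored for every state, in megabytes (10**6 bytes).
mb_per_float_variable = n_states * bytes_float64 / 1e6
mb_per_uint8_variable = n_states * bytes_uint8 / 1e6

# 10 wages and nonpecs + 1 expected value function
# + 5 temporary continuation values + 4 float covariates = 20 float arrays.
n_float_variables = 10 + 1 + 5 + 4
# 6 state dimensions + 12 integer covariates = 18 uint8 arrays.
n_uint8_variables = 6 + 12

gb_float = n_float_variables * mb_per_float_variable / 1e3
gb_uint8 = n_uint8_variables * mb_per_uint8_variable / 1e3

print(f"float64: {mb_per_float_variable:.0f} MB each, {gb_float:.2f} GB in total")
print(f"uint8:   {mb_per_uint8_variable:.0f} MB each, {gb_uint8:.2f} GB in total")
```

The exact numbers come out slightly above the rounded ones in the comment (416 MB per float64 array instead of 400 MB), which does not change the order of magnitude.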

First and foremost the state space contains a register of all possible states of
the universe that a particular model allows for.
In high-dimensional models the number of states grows substantially, which constitutes
a considerable constraint for creating realistic models of economic dynamics.
To relax this constraint as much as possible it is crucial to avoid any
duplication of calculations or information.
To this end we designed a state space that contains a range of different objects
that define representations and groupings of individual model states that allow us
to use as few resources as possible during the model solution.
Once an attribute of a state can be expressed as a combination of
the attributes of two sub-states, the number of calculations required to
map all states to their attributes is reduced drastically.
The distinction between core and dense states within respy is defined such that
this property is exploited in a straightforward way.
Bundling states with a similar representation and a symmetric treatment in the
model solution avoids duplicate calculations and allows us to write simpler
and more efficient code.
This consideration is the basis for the fine-grained division of states
in respy which is defined in the ``period_choice_cores``.
The range of methods contained in our state space facilitates clear communication
between different groups and representations of the state space.
This guide illustrates the most important components of the state space
by stressing their role in the representation of states, the division of states and
the communication between states.


Representation
--------------
We divide state space dimensions into core and dense dimensions.
A core dimension is a deterministic function of past choices, time
and initial conditions.
Adding another core dimension changes the structure of the state space
in a complicated way.
A dense dimension is not such a function and thus either changes the state space
in a more predictable way or changes the state space in such a complicated way that
we find it easier to have a slightly larger state space than required rather than
performing complicated calculations to get a precise one.
A prime example of a dense variable in our model is an observable characteristic
that does not change in the model.
The addition of exogenous processes as dense variables makes the
distinction a bit less explicit but the general logic still applies.
A full state of the model is a combination of its dense
and core states.
All attributes that are only functions of either dense dimensions or
core dimensions can be calculated separately and can be combined
thereafter.
This property dramatically reduces the number of calculations required
to obtain several pieces of information.
The main conceptual objects underlying the representation of states
are the core state space, the dense grid and the dense state space.

.. _core_state_space:

The Core State Space
--------------------

- The Core State Space:
The core state space is the set of all core states. It is the image of the function
underlying the core dimensions invoked at all feasible combinations of past choices,
time and initial conditions.
The core state space is explicitly obtained in the module and is represented by
a pd.DataFrame. As such it is the basis for most calculations and every state in the
solution is partly represented by a pointer to a row in the core state space.

.. _dense_grid:

Dense Grid
----------

The dense grid is a list that contains all states of dense variables. A dense variable
is not a deterministic function of past choices and time. Adding another dense variable
essentially copies the full state space. The separation makes use of this property. By
expressing a state as a combination of a dense and a core state we avoid several
duplications.
.. _dense_grid:
Member

I think that some of the information below should be moved to the attribute section in the docstring of the state space.


- Dense Grid:
The dense grid is a list that contains all states of dense variables.
The main difference to the core state space is that we do not apply logic to obtain
all feasible states of dense dimensions.
Instead we just use the full cartesian product of dense dimensions. As we already
mentioned in the last section, this is either because the marginal change
of the state space due to the addition of the variable is so simple that we do not
have to do any more than duplicate the existing state space or because
the marginal change is so complicated that we are fine with having a slightly bigger
state space than required.
The dense grid is also explicitly obtained and represented by a list of tuples.
Each state in the solution is partly represented by a pointer to a position in the
dense grid.

.. _dense_state_space:

Dense State Space
-----------------

The dense state space is the full state space that contains all admissible states. respy
does not store the full dense state space explicitly. The concept is nevertheless
important since the model solution essentially loops through each dense state.

- Dense State Space:
The dense state space contains all full states that the model allows for.
In respy the dense state space is essentially the cartesian product of the core
state space and the entries in the dense grid.
We do not, however, store the full dense state space explicitly.
Rather, we create the dense state space separately for each
``dense_period_choice_core``.
Each dense state is represented by a combination of the dense vector
and the subset of the core dataframe that corresponds to the particular
``dense_period_choice_core``. We only require the full information at two
particular points in the creation of the state space and the solution
of the model. Since it takes a long time to create this object and since it
would consume a lot of memory to keep it in working memory at all times,
the objects are saved to disk after they are created and only loaded whenever
they are required.
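
As a toy illustration of why the split into core and dense states pays off, the sketch below builds a tiny core state space and dense grid and compares storing them separately with storing their cartesian product. The dimension names (``type``, ``black``), the feasibility rule and the sizes are made up for the example. respy keeps the core states in a DataFrame, whereas plain tuples are used here to keep the example free of dependencies.

```python
import itertools

# Hypothetical core states: (period, experience) pairs that are feasible,
# i.e. experience cannot exceed the number of elapsed periods.
n_periods = 4
core_states = [
    (period, exp) for period in range(n_periods) for exp in range(period + 1)
]

# Hypothetical dense dimensions: the dense grid is their full cartesian product.
dense_dimensions = {"type": [0, 1, 2], "black": [0, 1]}
dense_grid = list(itertools.product(*dense_dimensions.values()))

# A full (dense) state is a pair of pointers: a row in the core state space and
# a position in the dense grid. Materialising all pairs shows the difference.
dense_state_space = [
    (core_index, dense_index)
    for core_index in range(len(core_states))
    for dense_index in range(len(dense_grid))
]

stored_separately = len(core_states) + len(dense_grid)
stored_jointly = len(dense_state_space)
print(stored_separately, stored_jointly)
```

Attributes that depend only on core dimensions need to be computed once per core state, and attributes that depend only on dense dimensions once per entry in the dense grid, rather than once per full state.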

Division
Member

Partly attribute description or moved up where the separation by choice sets is explained.

--------
The essential dimension along which our model is solved is time. That implies
that the minimal division of the dense state space that we require to solve our model
is along time.
During the solution and analysis different states however are treated differently. In
particular covariates are calculated differently and choices or other conditions are
different.
We thus want a division of the state space in each period that is as symmetric as
possible in its treatment during the solution while not being too complicated to
compile and manage.
The dense state space is separated into ``dense_period_choice_cores`` in respy and several
objects and indices are created along the way.

.. _period_choice_cores:

Period Choice Cores
-------------------

The interface of respy allows for flexible choice sets. The period choice core maps
period and choice set to a set of core states.
- Period Choice Cores:
The interface of respy allows for flexible choice sets.
The period choice core maps period and choice set to a set of core states.
It mainly serves as the basis for the ``dense_period_choice_cores``.


.. _dense_period_choice_cores:

Dense Period Choice Cores
-------------------------

The dense period choice core maps period, choice set, and dense index to a set of core
states. This is the separation of states that the model solution loops over. All of the
state space operations work symmetrically on all states within one period. The same
mapping is applied to generate the full set of covariates, they are all called at the
same time during the backward induction procedure (reordering would not matter) and they
all use the same rule to obtain continuation values. That is the reason why they are
stored together and why the model solution loops over periods while all
``dense_period_choice_cores`` within a period perform their operations in parallel.
- Dense Period Choice Cores:
The dense period choice core maps period, choice set, and dense index to a set
of core states. This is the separation of states that the model solution
loops over. All of the state space operations work symmetrically on all states
within one period. The same mapping is applied to generate the full set of
covariates, they are all called at the same time during the backward induction
procedure (reordering would not matter) and they all use the same rule to obtain
continuation values. That is the reason why they are stored together and why the
model solution loops over periods while all ``dense_period_choice_cores`` within
a period perform their operations in parallel.


.. _state_space_location_indices:

State Space Location Indices
============================

To create the state space and to store information efficiently we build simple indices
of the objects introduced above. In general we call location indices indices if the
defining mapping is injective and keys if the defining mapping is not injective.


.. _core_indices:

Core Indices
------------

Core indices are row indices for states in the :ref:`core state space
<core_state_space>`. They are continued over different periods and choice sets in the
core.


.. _core_key:

Core Key
--------

A ``core_key`` is an index for a set of states in the core state space which are in the
same period and share the same choice set.


.. _dense_vector:

Dense Vector
------------

A dense vector is a combination of values in the dense dimensions.


.. _dense_index:

Dense Index
-----------

A dense index is a position in the dense grid.
- State Space Location Indices:
To store and manage information efficiently we build simple indices
of the state space objects introduced above.
We only need certain information at certain
points in the solution process. Thus we define location indices such that we
can easily access all the information that we need throughout the solution.
In general these location indices are integers that point to a position
within an object. We call location indices "indices" if the defining
mapping is injective in the dense state space and "keys" if the defining mapping is
not injective in the dense state space. It follows that location indices that
point to a row in the ``core_state_space`` are referred to as indices while location
indices that refer to a group of states, such as the enumeration of the
``dense_period_choice_cores``, are referred to as keys.
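
The distinction between indices and keys can be sketched with a few lines of plain Python. The states, the grouping by period and choice set, and all names below are hypothetical and only meant to show the difference between an injective and a non-injective location index. They are not respy's data structures.

```python
from collections import defaultdict

# Hypothetical core states described by (period, choice_set); the choice set is
# a tuple of booleans flagging which choices are available.
core_states = [
    (0, (True, True, False)),
    (0, (True, True, False)),
    (0, (True, True, True)),
    (1, (True, True, True)),
    (1, (True, True, True)),
]

# An index is injective: every state has its own row number.
core_indices = list(range(len(core_states)))

# A key is not injective: all states in the same period with the same choice
# set share one key. Keys are numbered in order of first appearance.
group_to_key = {}
key_to_indices = defaultdict(list)
for index, group in zip(core_indices, core_states):
    key = group_to_key.setdefault(group, len(group_to_key))
    key_to_indices[key].append(index)

print(dict(key_to_indices))
```

Storing results per key lets a whole group of states be processed with one lookup, while the indices inside each group still identify individual rows.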


.. _dense_key:

Dense Key
---------

A ``dense_key`` is an index for a set of states in the dense state space which are in
the same period, share the same choice set, and the same dense vector.

.. _complex:

Complex
--------------------
A complex key is the basis for `core_key` and `dense_key`. It is a tuple of a period and
a tuple for the choice set which contains booleans for whether a choice is available.
The complex key for a dense key also contains the dense vector in the last position.
Communication
-------------

.. _state_space_methods:
Member

The following are docstrings.



State Space Methods
===================

Several methods facilitate communication between different groups in the state space.
They are briefly introduced in turn:

To solve a discrete dynamic choice model, dense period choice cores in different
periods need to communicate with each other.
This section briefly summarizes how the three main state space methods create an
intertemporal link between state space groups and thereby facilitate an efficient model
solution.

.. _collect_child_indices:

Collect Child Indices
---------------------

This function assigns each state a function that maps choices into child states.

- Collect Child Indices:
This function assigns each state a function that maps choices into child states.
(Requires more information)

.. _get_continuation_values:

Get Continuation Values
-----------------------
- Get Continuation Values:
This method uses collect child indices to assign each state a function
that maps choices into continuation values.
It is the api that a dense period choice core uses to get information about the
continuation values.
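
A minimal sketch of the idea, assuming that child indices are stored per state with one entry per choice and that next period's expected value functions are stored in a flat list. The function name mirrors the heading above, but the signature and the data layout are assumptions made for illustration and not respy's implementation.

```python
# Toy example with two choices. Expected value functions of the next period,
# indexed by the position of the child state (hypothetical values).
expected_value_functions_next = [1.5, 2.0, 3.25]

# For each state in the current period, the position of the child state
# reached under choice 0 and under choice 1 (hypothetical layout).
child_indices = [
    [0, 1],
    [1, 2],
]


def get_continuation_values(child_indices, expected_value_functions_next):
    """Map every (state, choice) pair to the value of its child state."""
    return [
        [expected_value_functions_next[child] for child in children]
        for children in child_indices
    ]


continuation_values = get_continuation_values(
    child_indices, expected_value_functions_next
)
print(continuation_values)
```

Each row of the result belongs to one state and each column to one choice, so the lookup is a pure indexing operation once the child indices are known.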

.. _set_attribute_from_keys:

This method uses collect child indices to assign each state a function
that maps choices into continuation values.
- Set Attribute from keys:
This method stores information at a certain position of a state space object.
It is the API that a dense period choice core uses to store information about the
expected value functions.
4 changes: 1 addition & 3 deletions respy/interface.py
@@ -66,9 +66,7 @@
},
]

ROBINSON_CRUSOE_CONSTRAINTS = [
{"loc": "shocks_sdcorr", "type": "sdcorr"},
]
ROBINSON_CRUSOE_CONSTRAINTS = [{"loc": "shocks_sdcorr", "type": "sdcorr"}]


def get_example_model(model, with_data=True):
2 changes: 1 addition & 1 deletion respy/interpolate.py
@@ -11,7 +11,7 @@


def kw_94_interpolation(
state_space, period_draws_emax_risk, period, optim_paras, options,
state_space, period_draws_emax_risk, period, optim_paras, options
):
r"""Calculate the approximate solution proposed by [1]_.

26 changes: 24 additions & 2 deletions respy/solve.py
@@ -63,7 +63,25 @@ def solve(params, options, state_space):

@parallelize_across_dense_dimensions
def _create_choice_rewards(complex_, choice_set, optim_paras, options):
"""Create wage and non-pecuniary reward for each state and choice."""
"""Create wage and non-pecuniary reward for each state and choice.

In particular the function retrieves dense period choice cores
with all covariates (they have already been calculated in the
construction of the state space) from disk.
Thereafter the function obtains rewards for choices for each
state based on the precalculated covariates.

Returns
----------
wages : np.array
Member

Suggested change
wages : np.array
wages : numpy.ndarray

Array with dimensions n_states x n_choices.
Contains all wages for a particular state choice
combination within a dense period choice core.
nonpecs : np.array
Member

Suggested change
nonpecs : np.array
nonpecs : numpy.ndarray

This is the correct type.

Member

And you need to write out numpy and pandas

Array with dimensions n_states x n_choices.
Contains all nonpecs for a particular state choice
combination within a dense period choice core.
"""
n_choices = sum(choice_set)
choices = select_valid_choices(optim_paras["choices"], choice_set)

@@ -142,7 +160,7 @@ def _solve_with_backward_induction(state_space, optim_paras, options):

elif any_interpolated:
period_expected_value_functions = kw_94_interpolation(
state_space, period_draws_emax_risk, period, optim_paras, options,
state_space, period_draws_emax_risk, period, optim_paras, options
)

else:
@@ -171,6 +189,10 @@ def _full_solution(
In contrast to approximate solution, the Monte Carlo integration is done for each
state and not only a subset of states.

Returns
----------
period_expected_value_functions: np.array
Member

Suggested change
period_expected_value_functions: np.array
period_expected_value_functions: numpy.ndarray

Array containing expected value function for each state.
"""
period_expected_value_functions = calculate_expected_value_functions(
wages,