2 changes: 1 addition & 1 deletion .github/workflows/hark.yml
@@ -21,7 +21,7 @@ jobs:
       max-parallel: 5
       matrix:
         os: [ubuntu-latest, macos-latest, windows-latest]
-        python-version: [3.8, 3.9, "3.10"]
+        python-version: ["3.10"]
Comment on lines 23 to +24
Copilot AI Jan 28, 2026

CI now only tests Python 3.10, but pyproject.toml declares requires-python = ">=3.8" and classifiers include 3.8/3.9. This change can mask compatibility regressions (and this PR introduces list[...] type hints that will break on 3.8). Either restore 3.8/3.9 in the matrix or update the project's supported Python versions consistently (metadata + docs + code).

Copilot uses AI. Check for mistakes.

     steps:
       - uses: actions/checkout@v3
39 changes: 39 additions & 0 deletions HARK/abstract/tests/consindshk.yml
@@ -0,0 +1,39 @@
---
states: !StateSpace
  variables:
    - !State
      name: m
      short_name: money
      long_name: market resources
      latex_repr: \mNrm
Contributor

I know how you're imagining using the LaTeX representation, but my preference would be to leave it out of this PR unless you have a demonstration of how it works ready.
Four different ways to name something seems like a lot for a quick demo.

Member Author

For now these are filler; I want to throw any non-required keys into an attributes dictionary.
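
A minimal sketch of what "throw any non-required keys into an attributes dictionary" could look like. The `Variable` class here is a simplified stand-in and `from_mapping` is a hypothetical helper, not the PR's implementation; it assumes the loader hands over a plain mapping of keys to values.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Variable:
    # Simplified stand-in for the PR's Variable: only `name` is required.
    name: str
    attrs: dict = field(default_factory=dict)

    @classmethod
    def from_mapping(cls, mapping):
        """Build a Variable, moving every unrecognized key into `attrs`."""
        known = {f.name for f in fields(cls)} - {"attrs"}
        kwargs = {k: v for k, v in mapping.items() if k in known}
        extras = {
            k: v for k, v in mapping.items() if k not in known and k != "attrs"
        }
        # An explicit `attrs` entry in the mapping overrides collected extras.
        attrs = {**extras, **mapping.get("attrs", {})}
        return cls(attrs=attrs, **kwargs)


m = Variable.from_mapping(
    {"name": "m", "short_name": "money", "latex_repr": r"\mNrm"}
)
print(m.name, m.attrs)
```

With this shape, the config can list any number of descriptive keys without the class having to declare a field for each.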

    - !State
      name: &name stigma
      short_name: &short_name risky share
      long_name: &long_name risky share of portfolio
      latex_repr: &latex_repr \stigma

actions: !ActionSpace
  variables:
    - !Action
      name: c
      short_name: consumption
      long_name: consumption
Contributor

Are these fields optional? Can one spill over to the others as a default?
Basically, how can we make these config files lighter weight?

Member Author

These are optional; only name is required.
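
One way the "spill over" default asked about above might work, sketched under assumptions: `Labels` is a hypothetical helper class, and the fallback order (name, then short_name, then long_name) is my guess rather than anything the PR specifies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Labels:
    # Hypothetical helper, not part of the PR.
    name: str
    short_name: Optional[str] = None
    long_name: Optional[str] = None
    latex_repr: Optional[str] = None

    def __post_init__(self):
        # Each missing label falls back to the next shorter one.
        if self.short_name is None:
            self.short_name = self.name
        if self.long_name is None:
            self.long_name = self.short_name
        if self.latex_repr is None:
            self.latex_repr = self.name


c = Labels("c", short_name="consumption")
print(c)
```

Under this scheme a config entry needs only `name`, and the other keys appear only when they differ from the default.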

      latex_repr: \cNrm
    - !Action
      name: *name
      short_name: *short_name
      long_name: *long_name
      latex_repr: *latex_repr

post_states: !PostStateSpace
  variables:
    - !PostState
      name: a
Contributor

Looks like you've repeated the post_states block twice in this file?

Not sure how @mnwhite feels, but maybe we don't need to draw a firm distinction between states and post states like this.

Or the labels could be inside the variable, not part of the document structure.

Compare:

var_type_1:
   variables:
       - !VarTypeClass1
           details

var_type_2:
   variables:
       - !VarTypeClass2
           details

with

variables:
    - !VarTypeClass1
       details
    - !VarTypeClass2   
       details

The latter carries the same information in fewer lines.
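
If the flat layout were adopted, the per-type groupings could still be recovered after loading. A rough sketch, with `State` and `PostState` as bare stand-ins for the PR's classes and `group_by_kind` as a hypothetical helper:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class State:
    # Bare stand-in for the PR's State class.
    name: str


@dataclass
class PostState:
    # Bare stand-in for the PR's PostState class.
    name: str


def group_by_kind(variables):
    """Group a flat list of variables by class name, keeping input order."""
    groups = defaultdict(dict)
    for var in variables:
        groups[type(var).__name__][var.name] = var
    return dict(groups)


flat = [State("m"), State("stigma"), PostState("a"), PostState("stigma")]
spaces = group_by_kind(flat)
print({kind: list(members) for kind, members in spaces.items()})
```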

      short_name: assets
      long_name: savings
      latex_repr: \aNrm
    - !PostState
      name: *name
      short_name: *short_name
      long_name: *long_name
      latex_repr: *latex_repr
52 changes: 52 additions & 0 deletions HARK/abstract/tests/consindshk_full.yml
@@ -0,0 +1,52 @@
---
states: !StateSpace
  variables:
    - !State
      name: m
      short_name: money
      long_name: market resources
      latex_repr: \mNrm
    - !State
      name: &name stigma
Contributor

I don't understand what the ampersands are doing here.
Maybe document that?

Member Author

The ampersands define YAML anchors, and the asterisks are aliases that refer back to them. I use them because one of the states is also a control and a post-state. I will document this more as I make more progress.
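
For readers unfamiliar with the syntax: in YAML, `&label` attaches an anchor to a value and `*label` is an alias that reuses it. A small illustration using PyYAML's `yaml.safe_load` on an untagged document (the keys mirror the test files, but the document itself is made up for this example):

```python
import yaml

doc = """
states:
  - name: &name stigma
    short_name: &short_name risky share
actions:
  - name: *name
    short_name: *short_name
"""

# The aliases are resolved by the parser, so the loaded data holds plain values.
data = yaml.safe_load(doc)
print(data["actions"][0])
```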

      short_name: &short_name risky share
      long_name: &long_name risky share of portfolio
      latex_repr: &latex_repr \stigma

actions: !ActionSpace
  variables:
    - !Action
      name: c
      short_name: consumption
      long_name: consumption
      latex_repr: \cNrm
    - !Action
      name: *name
      short_name: *short_name
      long_name: *long_name
      latex_repr: *latex_repr

post_states: !PostStateSpace
  variables:
    - !PostState
      name: a
      short_name: assets
      long_name: savings
      latex_repr: \aNrm
    - !PostState
      name: *name
      short_name: *short_name
      long_name: *long_name
      latex_repr: *latex_repr

parameters: !Parameters
  variables:
    - !Parameter
      name: DiscFac
      short_name: discount factor
      long_name: discount factor
      latex_repr: \beta
    - !Parameter
      name: CRRA
      short_name: risk aversion
      long_name: coefficient of relative risk aversion
      latex_repr: \rho
19 changes: 19 additions & 0 deletions HARK/abstract/tests/test_variables.py
@@ -0,0 +1,19 @@
import unittest

import numpy as np
Copilot AI Jan 28, 2026

Import of 'np' is not used.

Suggested change
-import numpy as np

import yaml

import HARK.abstract.variables
Copilot AI Jan 28, 2026

Import of 'HARK' is not used.


Contributor

A test case showing how this data structure could be used in practice would go a long way.


class test_pyyaml(unittest.TestCase):
    def setUp(self):
        self.path = "HARK/abstract/tests/"

Comment on lines +9 to +12
Copilot AI Jan 28, 2026

The test data path is hard-coded as a relative string ("HARK/abstract/tests/"), which depends on the test runner's current working directory. Consider deriving paths from __file__ (e.g., via pathlib.Path) so the tests are robust when run from other working directories.
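
A sketch of the `pathlib` approach described above. `data_file` is a hypothetical helper; inside `test_variables.py` it would presumably be called with `__file__` as the first argument, which is passed explicitly here so the snippet runs anywhere.

```python
from pathlib import Path


def data_file(test_module_file, filename):
    """Return the path of `filename` located next to the given test module."""
    return Path(test_module_file).parent / filename


# In the real test this would be data_file(__file__, "consindshk.yml").
example = data_file("HARK/abstract/tests/test_variables.py", "consindshk.yml")
print(example)
```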

    def test_partial(self):
        with open(self.path + "consindshk.yml") as f:
            data = yaml.safe_load(f)

    def test_full(self):
        with open(self.path + "consindshk_full.yml") as f:
            data = yaml.safe_load(f)
Comment on lines +13 to +19
Copilot AI Jan 28, 2026

These tests only load YAML but don't assert anything about the parsed result (e.g., that the returned objects are StateSpace/ActionSpace, that variables are present, etc.). Since this PR introduces YAML-backed variable abstractions, adding a few assertions would make the tests actually validate behavior rather than just “no exception.”

Comment on lines +15 to +19
Copilot AI Jan 28, 2026

Variable data is not used.

Suggested change
-            data = yaml.safe_load(f)
-    def test_full(self):
-        with open(self.path + "consindshk_full.yml") as f:
-            data = yaml.safe_load(f)
+            data = yaml.safe_load(f)
+            self.assertIsNotNone(data)
+    def test_full(self):
+        with open(self.path + "consindshk_full.yml") as f:
+            data = yaml.safe_load(f)
+            self.assertIsNotNone(data)

277 changes: 277 additions & 0 deletions HARK/abstract/variables.py
@@ -0,0 +1,277 @@
from dataclasses import dataclass, field
from typing import Mapping, Optional, Union
from warnings import warn

import numpy as np
import xarray as xr
from yaml import SafeLoader, YAMLObject

rng = np.random.default_rng()


@dataclass
class Variable(YAMLObject):
Contributor

Is there a way to initialize Variables without parsing them from a YAML file?
I.e., a pure python way to create variables?

I'm a little wary of tying the model objects too tightly to the serial format because it can make it tricky to interoperate with other python libraries.
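
One possible reading of this request, sketched with a plain dataclass that has no `YAMLObject` base. `Variable` here is a stand-in and `variable_from_mapping` is a hypothetical adapter; the idea is that the model object is built in pure Python, and any serial format only has to produce a mapping.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class Variable:
    # Plain dataclass stand-in: nothing here depends on YAML.
    name: str
    attrs: dict = field(default_factory=dict)


def variable_from_mapping(mapping):
    """Adapter that a YAML, JSON, or TOML loader could call after parsing."""
    return Variable(**mapping)


# Direct construction in Python, no file involved.
m = Variable("m", attrs={"long_name": "market resources"})

# Round trip through a plain dict, as a loader would do.
same = variable_from_mapping(asdict(m))
print(m == same)
```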

    """
    Abstract class for representing variables. Variables are the building blocks
    of models. They can be parameters, states, actions, shocks, or auxiliaries.
    """

    name: str  # The name of the variable, required
    attrs: dict = field(default_factory=dict, kw_only=True)
    short_name: str = field(default=None, kw_only=True)
    long_name: str = field(default=None, kw_only=True)
    latex_repr: str = field(default=None, kw_only=True)
    yaml_tag: str = field(default="!Variable", kw_only=False)
    yaml_loader = SafeLoader

Comment on lines +19 to +26
Copilot AI Jan 28, 2026

yaml_tag is declared using dataclasses.field(...) in several YAMLObject subclasses (e.g., Variable, VariableSpace, Auxiliary). Using field() makes the class attribute a dataclasses.Field rather than the tag string, which can prevent YAMLObject from registering constructors correctly and also unnecessarily adds yaml_tag to the dataclass init/instances. Prefer yaml_tag: ClassVar[str] = "!Tag" (and similarly for yaml_loader) so PyYAML sees a plain class attribute and dataclasses doesn't treat it as a field.
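
The dataclass side of this point can be checked without PyYAML. The two classes below are illustrative stand-ins (not from the PR): one declares the tag with `field()`, the other with `ClassVar`. The snippet does not exercise `YAMLObject` registration itself; it only shows which attributes dataclasses treat as instance fields.

```python
from dataclasses import dataclass, field, fields
from typing import ClassVar


@dataclass
class WithField:
    name: str
    # Declared via field(): becomes an __init__ parameter and instance field.
    yaml_tag: str = field(default="!WithField")


@dataclass
class WithClassVar:
    name: str
    # Declared via ClassVar: stays a plain class attribute.
    yaml_tag: ClassVar[str] = "!WithClassVar"


field_names = [f.name for f in fields(WithField)]
classvar_names = [f.name for f in fields(WithClassVar)]
print(field_names, classvar_names)

# With field(), a caller can accidentally override the tag per instance.
overridden = WithField("x", "!Oops")
print(overridden.yaml_tag)
```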

    def __post_init__(self):
        for key in ["long_name", "short_name", "latex_repr"]:
            self.attrs.setdefault(key, None)
Comment on lines +28 to +29
Copilot AI Jan 28, 2026

Variable.__post_init__ currently only sets default keys in attrs, but it never copies short_name/long_name/latex_repr field values into attrs. When loading from YAML (which sets these as top-level fields), self.attrs stays empty and downstream code (e.g., State.assign_values passes self.attrs) loses this metadata. Consider syncing non-None field values into attrs (or removing the duplicate fields and storing metadata only in attrs).

Suggested change
-        for key in ["long_name", "short_name", "latex_repr"]:
-            self.attrs.setdefault(key, None)
+        # Synchronize metadata between top-level fields and attrs.
+        for key in ["long_name", "short_name", "latex_repr"]:
+            field_value = getattr(self, key)
+            # If the dataclass field is set, it is the source of truth.
+            if field_value is not None:
+                self.attrs[key] = field_value
+            else:
+                # Otherwise, if attrs already has a non-None value, mirror it back.
+                if key in self.attrs and self.attrs[key] is not None:
+                    setattr(self, key, self.attrs[key])
+                else:
+                    # Ensure the key exists in attrs for downstream consumers.
+                    self.attrs.setdefault(key, None)

        self.name = self.name.strip()
        if not self.name:
            raise ValueError("Empty variable name")

    @classmethod
    def from_yaml(cls, loader, node):
        fields = loader.construct_mapping(node, deep=True)
        return cls(**fields)

    def __repr__(self):
        """
        String representation of the variable.

        Returns:
            str: The string representation of the variable.
        """
        return f"{self.__class__.__name__}({self.name})"


@dataclass
class VariableSpace(YAMLObject):
    """
    Abstract class for representing a collection of variables.
    """

    variables: list[Variable]
    yaml_tag: str = field(default="!VariableSpace", kw_only=True)
Comment on lines +55 to +56
Copilot AI Jan 28, 2026

This module uses PEP 585 built-in generics like list[Variable] / list[str], which are not valid on Python 3.8 without from __future__ import annotations. The project declares requires-python = ">=3.8" in pyproject.toml, so this will break on supported versions unless you either (a) add the future import, (b) switch to typing.List[...], or (c) bump the minimum Python version everywhere (metadata + CI).
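
Option (b) from the comment above, sketched with stand-in classes. Note that the `kw_only` option of `dataclass` and `field`, used throughout this module, was itself added in Python 3.10, so changing the annotations alone would probably not be enough to make the module import on 3.8 or 3.9.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Variable:
    # Bare stand-in for the PR's Variable class.
    name: str


@dataclass
class VariableSpace:
    # typing.List is accepted on Python 3.8, unlike the built-in generic
    # list[Variable], which fails there when the annotation is evaluated.
    variables: List[Variable]


space = VariableSpace([Variable("m"), Variable("a")])
names = [v.name for v in space.variables]
print(names)
```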

    yaml_loader = SafeLoader

    def __post_init__(self):
        """
        Save the variables in a dictionary for easy access.
        """
        self.variables = {var.name: var for var in self.variables}

    @classmethod
    def from_yaml(cls, loader, node):
        fields = loader.construct_mapping(node, deep=True)
        return cls(**fields)


@dataclass(kw_only=True)
class Parameter(Variable):
    """
    A `Parameter` is a variable that has a fixed value.
    """

    value: Union[int, float] = 0
    yaml_tag: str = "!Parameter"
    yaml_loader = SafeLoader

    def __repr__(self):
        """
        String representation of the parameter.

        Returns:
            str: The string representation of the parameter.
        """
        return f"{self.__class__.__name__}({self.name}, {self.value})"


@dataclass
class Parameters(VariableSpace):
    """
    A `Parameters` is a collection of parameters.
    """

    yaml_tag: str = "!Parameters"


@dataclass
class Auxiliary(Variable):
Contributor

I agree with Chris that 'auxiliary' is more like a macro.
I wouldn't use 'auxiliary' here, though it makes sense to have this sort of object.

    """
    Class for representing auxiliaries. Auxiliaries are abstract variables that
    have an array structure but are not states, actions, or shocks. They may
    include information like domain, measure (discrete or continuous), etc.
    """

    array: Union[list, np.ndarray, xr.DataArray] = None
    domain: Union[list, tuple] = field(default=None, kw_only=True)
    is_discrete: bool = field(default=False, kw_only=True)
    yaml_tag: str = field(default="!Auxiliary", kw_only=True)


@dataclass
class AuxiliarySpace(VariableSpace):
    """
    A `AuxiliarySpace` is a collection of auxiliary variables.
    """

    yaml_tag: str = "!AuxiliarySpace"


@dataclass(kw_only=True)
class State(Auxiliary):
    """
    Class for representing a state variable.
    """

    yaml_tag: str = "!State"

    def assign_values(self, values):
        return make_state_array(values, self.name, self.attrs)

    def discretize(self, min, max, N, method):
        # linear for now
        self.assign_values(np.linspace(min, max, N))

Comment on lines +131 to +137
Copilot AI Jan 28, 2026

State.assign_values returns a new Dataset but does not store it on the instance; State.discretize calls self.assign_values(...) and ignores the return value. As written, discretize has no effect on the State. Either have assign_values mutate self.array/self.domain etc., or have discretize return the created Dataset and update callers accordingly.
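
A sketch of the first option (store the result on the instance and also return it). This is a simplified stand-in, not the PR's class: a plain list replaces the xarray Dataset, the `grid` attribute is a name I made up, and the grid is computed by hand instead of with `np.linspace`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class State:
    # Simplified stand-in: a plain list replaces the xarray Dataset.
    name: str
    grid: Optional[List[float]] = None

    def assign_values(self, values):
        """Store the values on the instance and return them."""
        self.grid = list(values)
        return self.grid

    def discretize(self, lo, hi, n):
        """Build a linear grid of n points from lo to hi, inclusive."""
        if n < 2:
            raise ValueError("n must be at least 2")
        step = (hi - lo) / (n - 1)
        return self.assign_values(lo + i * step for i in range(n))


m = State("m")
grid = m.discretize(0.0, 1.0, 5)
print(m.grid)
```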


@dataclass(kw_only=True)
class StateSpace(AuxiliarySpace):
    states: Mapping[str, State] = field(init=False)
    yaml_tag: str = "!StateSpace"

    def __post_init__(self):
        super().__post_init__()
        self.states = self.variables


@dataclass(kw_only=True)
class PostState(State):
    yaml_tag: str = "!PostState"


@dataclass(kw_only=True)
class PostStateSpace(StateSpace):
    post_states: Mapping[str, State] = field(init=False)
    yaml_tag: str = "!PostStateSpace"

    def __post_init__(self):
        super().__post_init__()
        self.post_states = self.variables


@dataclass(kw_only=True)
class Action(Auxiliary):
    """
    Class for representing actions. Actions are variables that are chosen by the agent.
    Can also be called a choice, control, decision, or a policy.

    Args:
        Variable (_type_): _description_
    """
Comment on lines +164 to +172
Copilot AI Jan 28, 2026

The docstring includes placeholder text (Args: Variable (_type_): _description_) which is misleading/inaccurate for a public class. Please either remove the placeholder section or replace it with the actual arguments/meaning.


    is_optimal: bool = True
    yaml_tag: str = "!Action"

    def discretize(self, *args, **kwargs):
        warn("Actions cannot be discretized.")


@dataclass(kw_only=True)
class ActionSpace(AuxiliarySpace):
    actions: Mapping[str, State] = field(init=False)
    yaml_tag: str = "!ActionSpace"

    def __post_init__(self):
        super().__post_init__()
        self.actions = self.variables
Comment on lines +182 to +188
Copilot AI Jan 28, 2026

ActionSpace.actions is annotated as Mapping[str, State], but it is populated from self.variables which will contain Action instances. This should be Mapping[str, Action] (or a common base type) to avoid type/usage errors.



@dataclass(kw_only=True)
class Shock(Variable):
    """
    Class for representing shocks. Shocks are variables that are not
    chosen by the agent.
    Can also be called a random variable, or a state variable.

    Args:
        Variable (_type_): _description_
    """
Comment on lines +191 to +200
Copilot AI Jan 28, 2026

The Shock class docstring also contains placeholder boilerplate (Args: Variable (_type_): _description_), which is misleading for users. Please remove or replace it with accurate documentation.


    yaml_tag: str = "!Shock"


@dataclass(kw_only=True)
class ShockSpace(VariableSpace):
    shocks: list[Shock]
    yaml_tag: str = "!ShockSpace"

    def __post_init__(self):
        super().__post_init__()
        self.shocks = self.variables

Comment on lines +206 to +213
Copilot AI Jan 28, 2026

ShockSpace.shocks is declared as list[Shock], but VariableSpace.__post_init__ converts variables into a dict and ShockSpace.__post_init__ assigns that dict to self.shocks. This mismatch will break expectations for iteration/order and static typing. Consider making shocks a Mapping[str, Shock] (consistent with the assigned value) or avoid converting to a dict.


def make_state_array(
Contributor

Is this method duplicated?

    values: np.ndarray,
    name: Optional[str] = None,
    attrs: Optional[dict] = None,
) -> xr.Dataset:
    """
    Function to create a state with given values, name and attrs.

    Parameters:
        values (np.ndarray): The values for the state.
        name (str, optional): The name of the state. Defaults to 'state'.
        attrs (dict, optional): The attrs for the state. Defaults to None.

    Returns:
        State: An xarray DataArray representing the state.
    """
    # Use a default name only when no name is provided
    name = name or f"state{rng.integers(0, 100)}"
    attrs = attrs or {}

    return xr.Dataset(
        {
            name: xr.DataArray(
                values,
                name=name,
                dims=(name,),
                attrs=attrs,
            )
        }
    )


def make_states_array(
    values: Union[np.ndarray, list],
    names: Optional[list[str]] = None,
    attrs: Optional[list[dict]] = None,
) -> xr.Dataset:
    """
    Function to create states with given values, names and attrs.

    Parameters:
        values (Union[np.ndarray, States]): The values for the states.
        names (list[str], optional): The names of the states. Defaults to None.
        attrs (list[dict], optional): The attrs for the states. Defaults to None.

    Returns:
        States: An xarray Dataset representing the states.
    """
    if isinstance(values, list):
        values_len = len(values)
    elif isinstance(values, np.ndarray):
        values_len = values.shape[0]

    # Use default names and attrs only when they are not provided
    names = names or [f"state{rng.integers(0, 100)}" for _ in range(values_len)]
    attrs = attrs or [{}] * values_len
Comment on lines +263 to +270
Copilot AI Jan 28, 2026

make_states_array leaves values_len undefined when values is neither list nor np.ndarray, which will raise an UnboundLocalError later. Add an explicit else that raises a clear TypeError/ValueError, or normalize inputs up front.
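
A sketch of the explicit-`else` fix. `count_states` is a hypothetical helper; to stay free of third-party imports it detects array-likes through a `shape` attribute instead of `isinstance(values, np.ndarray)`, which is a deliberate simplification.

```python
def count_states(values):
    """Return the number of states, rejecting unsupported containers."""
    if isinstance(values, (list, tuple)):
        return len(values)
    # Array-likes such as np.ndarray expose a non-empty `shape` tuple.
    shape = getattr(values, "shape", None)
    if shape:
        return shape[0]
    raise TypeError(
        f"values must be a list or array, got {type(values).__name__}"
    )


print(count_states([[1, 2], [3]]))
```

Failing early with a `TypeError` names the offending input, whereas the unbound `values_len` would only surface later as an `UnboundLocalError`.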

Copilot AI Jan 28, 2026

attrs = attrs or [{}] * values_len reuses the same dict instance for every state, so mutating one state's attrs would affect all others. Use a list comprehension to create distinct dicts per state.

Suggested change
-    attrs = attrs or [{}] * values_len
+    attrs = attrs or [{} for _ in range(values_len)]
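
The aliasing behind this suggestion can be reproduced in a few lines of plain Python (the variable names are illustrative only):

```python
# List multiplication repeats a reference to one and the same dict.
shared = [{}] * 3
shared[0]["long_name"] = "market resources"

# A comprehension builds a fresh dict on every iteration.
distinct = [{} for _ in range(3)]
distinct[0]["long_name"] = "market resources"

print(shared[1], distinct[1])
```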


    states = [
        make_state_array(value, name, attr)
        for value, name, attr in zip(values, names, attrs)
    ]

    return xr.merge([states])
Copilot AI Jan 28, 2026

return xr.merge([states]) passes a nested list (states is already a list of Datasets), which will error or produce an unexpected merge. It should merge the list directly (and likely states should be the list of datasets).

Suggested change
-    return xr.merge([states])
+    return xr.merge(states)
