[WIP] Abstract Variables/Symbols #1308
Conversation
Codecov Report: patch coverage has no change; project coverage changed by -0.40%.

Additional details and impacted files:

```
@@            Coverage Diff             @@
##           master    #1308      +/-   ##
==========================================
- Coverage   72.55%   72.16%   -0.40%
==========================================
  Files          78       79       +1
  Lines       13009    13080      +71
==========================================
  Hits         9439     9439
- Misses       3570     3641      +71
```

☔ View full report in Codecov by Sentry.
Having trouble setting up tests, but this works locally.

@ingydotnet here I am starting the work of building the abstract objects.
Test failures may be due to Python versions (3.8, 3.9, etc.). A lot of these more advanced language features are relatively recent. |
```yaml
name: m
short_name: money
long_name: market resources
latex_repr: \mNrm
```
I know how you're imagining using the LaTeX representation, but my preference would be to not include it in the PR unless you have some demonstration of how it works ready. Four different ways to name something for a quick demo seems like a lot.
For now these are filler; I want to throw any non-required keys into an attributes dictionary.
```yaml
- !Action
  name: c
  short_name: consumption
  long_name: consumption
```
Are these fields optional? Can one spill over to the others as a default?
Basically, how can we make these config files lighter weight?
These are optional; the only required field is name.
```yaml
  long_name: market resources
  latex_repr: \mNrm
- !State
  name: &name stigma
```
I don't understand what the ampersands are doing here.
Maybe document that?
The ampersands define YAML anchors, which can be referenced elsewhere via `*` aliases; one of the states is also a control and a post-state. I will document this more as I make more progress.
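To make the anchor/alias mechanics concrete, here is a minimal sketch (the field layout is illustrative, not the PR's actual fixture): `&` defines an anchor on a node and `*` reuses it elsewhere in the same document.

```python
import yaml

# Illustrative snippet, not the PR's fixture: &m defines an anchor on the
# scalar "stigma", and *m reuses that same node later in the document.
doc = """
states:
  - name: &m stigma
controls:
  - name: *m
"""
data = yaml.safe_load(doc)

# Both entries resolve to the same value after loading.
assert data["states"][0]["name"] == "stigma"
assert data["controls"][0]["name"] == "stigma"
```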
```yaml
post_states: !PostStateSpace
  variables:
    - !PostState
      name: a
```
Looks like you've repeated the post_states block twice in this file?
Not sure how @mnwhite feels, but maybe we don't need to draw a firm distinction between states and post states like this.
Or the labels could be inside the variable, not part of the document structure.
Compare:

```yaml
var_type_1:
  variables:
    - !VarTypeClass1
      details
var_type_2:
  variables:
    - !VarTypeClass2
      details
```

with

```yaml
variables:
  - !VarTypeClass1
    details
  - !VarTypeClass2
    details
```

The latter carries the same information in fewer lines.
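The flat layout should work with PyYAML's tag dispatch, since each `!Tag` determines the Python class independently of where the node sits in the document. A sketch under made-up class names (not the PR's actual classes):

```python
from dataclasses import dataclass

import yaml

# Sketch with illustrative classes: yaml_tag lets differently-tagged
# entries share one flat `variables` list yet load as distinct types.
@dataclass
class State(yaml.YAMLObject):
    yaml_loader = yaml.SafeLoader  # allow safe_load to construct this tag
    yaml_tag = "!State"
    name: str = ""

@dataclass
class Action(yaml.YAMLObject):
    yaml_loader = yaml.SafeLoader
    yaml_tag = "!Action"
    name: str = ""

doc = """
variables:
  - !State
    name: m
  - !Action
    name: c
"""
data = yaml.safe_load(doc)
assert isinstance(data["variables"][0], State)
assert isinstance(data["variables"][1], Action)
assert data["variables"][1].name == "c"
```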
```python
self.shocks = self.variables
```

```python
def make_state_array(
```
Is this method duplicated?
```python
import yaml

import HARK.abstract.variables
```
A test case showing how this data structure could be used in practice would go a long way.
I think this is a cool demonstration of how PyYAML can be leveraged to make model configuration files in YAML without a custom parser. The tricky part, as we know, is function definitions.

Yep, this might have to be a feature for the future; just getting some initial work done on it.

Yes! I was just talking to one of the creators of YAML, who was telling me about this: https://github.com/yaml/yamlscript

Mind blown. One more thing: I'd be keen to see how you would initialize a distribution for a shock directly from YAML.
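One possible answer, sketched with a made-up `!Normal` tag (none of this is in the PR): register a custom constructor that turns the tagged mapping directly into a sampler.

```python
import numpy as np
import yaml

# Hypothetical !Normal tag: the constructor reads the mapping's fields
# and returns a closure that draws from the corresponding distribution.
def normal_constructor(loader, node):
    params = loader.construct_mapping(node)
    rng = np.random.default_rng(params.get("seed"))
    return lambda n: rng.normal(params["mu"], params["sigma"], size=n)

yaml.SafeLoader.add_constructor("!Normal", normal_constructor)

doc = """
shocks:
  income: !Normal {mu: 1.0, sigma: 0.1, seed: 0}
"""
data = yaml.safe_load(doc)
draws = data["shocks"]["income"](5)
assert draws.shape == (5,)
```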
```python
@dataclass
class Auxiliary(Variable):
```
I agree with Chris that 'auxiliary' is more like a macro.
I wouldn't use 'auxiliary' here, though it makes sense to have this sort of object.
```python
@dataclass
class Variable(YAMLObject):
```
Is there a way to initialize Variables without parsing them from a YAML file? I.e., a pure-Python way to create variables?
I'm a little wary of tying the model objects too tightly to the serialization format, because it can make it tricky to interoperate with other Python libraries.
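Since the PR's classes are dataclasses, they can presumably also be constructed directly in Python. A sketch with a simplified stand-in for `Variable` (field names follow the YAML fixtures, not necessarily the actual class definition):

```python
from dataclasses import dataclass, field
from typing import Optional

# Simplified stand-in, not the PR's actual class definition.
@dataclass
class Variable:
    name: str
    short_name: Optional[str] = None
    long_name: Optional[str] = None
    latex_repr: Optional[str] = None
    attrs: dict = field(default_factory=dict)

# Pure-Python construction, no YAML or parser involved.
m = Variable(name="m", long_name="market resources")
assert m.name == "m"
assert m.long_name == "market resources"
assert m.attrs == {}
```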
@MridulS thank you for fixing the checks! Might you be able to help me with the failing test? I'm not sure why.
Co-authored-by: Mridul Seth <mail@mriduls.com>
Pull request overview
Introduces a new “abstract variables/symbols” layer with PyYAML-backed serialization to describe model variables/spaces via YAML fixtures.
Changes:
- Add HARK/abstract/variables.py defining Variable/State/Action/Shock and corresponding "space" containers, plus xarray helpers.
- Add YAML fixtures and a basic test module that loads them via yaml.safe_load.
- Add pyyaml dependency and narrow CI to only run on Python 3.10.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 19 comments.
| File | Description |
|---|---|
| requirements/base.txt | Adds PyYAML dependency for YAML tag loading/parsing. |
| HARK/abstract/variables.py | Implements variable/space dataclasses, PyYAML tag classes, and xarray dataset builders. |
| HARK/abstract/tests/test_variables.py | Adds tests intended to validate YAML loading of tagged objects. |
| HARK/abstract/tests/consindshk.yml | Adds a YAML fixture for states/actions/post-states. |
| HARK/abstract/tests/consindshk_full.yml | Adds a YAML fixture including parameters in addition to spaces. |
| .github/workflows/hark.yml | Reduces the test matrix to only Python 3.10. |
```python
for key in ["long_name", "short_name", "latex_repr"]:
    self.attrs.setdefault(key, None)
```
Variable.__post_init__ currently only sets default keys in attrs, but it never copies short_name/long_name/latex_repr field values into attrs. When loading from YAML (which sets these as top-level fields), self.attrs stays empty and downstream code (e.g., State.assign_values passes self.attrs) loses this metadata. Consider syncing non-None field values into attrs (or removing the duplicate fields and storing metadata only in attrs).
Suggested change:

```python
# Synchronize metadata between top-level fields and attrs.
for key in ["long_name", "short_name", "latex_repr"]:
    field_value = getattr(self, key)
    # If the dataclass field is set, it is the source of truth.
    if field_value is not None:
        self.attrs[key] = field_value
    # Otherwise, if attrs already has a non-None value, mirror it back.
    elif key in self.attrs and self.attrs[key] is not None:
        setattr(self, key, self.attrs[key])
    else:
        # Ensure the key exists in attrs for downstream consumers.
        self.attrs.setdefault(key, None)
```
```python
def assign_values(self, values):
    return make_state_array(values, self.name, self.attrs)

def discretize(self, min, max, N, method):
    # linear for now
    self.assign_values(np.linspace(min, max, N))
```
State.assign_values returns a new Dataset but does not store it on the instance; State.discretize calls self.assign_values(...) and ignores the return value. As written, discretize has no effect on the State. Either have assign_values mutate self.array/self.domain etc., or have discretize return the created Dataset and update callers accordingly.
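One way to resolve this, sketched on a simplified class (not the PR's actual code): have `assign_values` store the result on the instance and still return it, so `discretize` becomes effective. The parameter names `vmin`/`vmax` are renamed here to avoid shadowing the built-ins `min`/`max`.

```python
import numpy as np

# Simplified sketch: assign_values mutates self and returns the result,
# so discretize both updates the State and hands back the grid.
class State:
    def __init__(self, name):
        self.name = name
        self.array = None

    def assign_values(self, values):
        self.array = np.asarray(values)
        return self.array

    def discretize(self, vmin, vmax, N, method="linear"):
        # linear for now; `method` is kept for future grid types
        return self.assign_values(np.linspace(vmin, vmax, N))

s = State("m")
grid = s.discretize(0.0, 1.0, 5)
assert s.array is not None and s.array.shape == (5,)
assert grid[0] == 0.0 and grid[-1] == 1.0
```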
```python
if isinstance(values, list):
    values_len = len(values)
elif isinstance(values, np.ndarray):
    values_len = values.shape[0]

# Use default names and attrs only when they are not provided
names = names or [f"state{rng.integers(0, 100)}" for _ in range(values_len)]
attrs = attrs or [{}] * values_len
```
make_states_array leaves values_len undefined when values is neither list nor np.ndarray, which will raise an UnboundLocalError later. Add an explicit else that raises a clear TypeError/ValueError, or normalize inputs up front.
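A sketch of the suggested guard (the helper name is illustrative): normalize the length computation up front and raise a clear error for unsupported types.

```python
import numpy as np

# Illustrative helper: fail loudly on unsupported input types instead of
# leaving values_len undefined and hitting UnboundLocalError later.
def values_length(values):
    if isinstance(values, list):
        return len(values)
    elif isinstance(values, np.ndarray):
        return values.shape[0]
    else:
        raise TypeError(
            f"values must be a list or np.ndarray, got {type(values).__name__}"
        )

assert values_length([1, 2, 3]) == 3
assert values_length(np.zeros((4, 2))) == 4
```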
```python
    for value, name, attr in zip(values, names, attrs)
]

return xr.merge([states])
```
return xr.merge([states]) passes a nested list (states is already a list of Datasets), which will error or produce an unexpected merge. It should merge the list directly (and likely states should be the list of datasets).
Suggested change:

```python
return xr.merge(states)
```
```python
variables: list[Variable]
yaml_tag: str = field(default="!VariableSpace", kw_only=True)
```
This module uses PEP 585 built-in generics like list[Variable] / list[str], which are not valid on Python 3.8 without from __future__ import annotations. The project declares requires-python = ">=3.8" in pyproject.toml, so this will break on supported versions unless you either (a) add the future import, (b) switch to typing.List[...], or (c) bump the minimum Python version everywhere (metadata + CI).
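A sketch of options (a) and (b) on simplified classes: the future import makes all annotations lazy strings, so `list[Variable]` parses on 3.8/3.9, while `typing.List` works even without it. Note also that `field(kw_only=True)` is itself a Python 3.10 addition, so option (c) may be needed regardless.

```python
from __future__ import annotations  # option (a): annotations become lazy strings

from dataclasses import dataclass
from typing import List  # option (b): typing.List is valid back to 3.8

@dataclass
class Variable:
    name: str

@dataclass
class VariableSpace:
    variables: List[Variable]  # option (b): no extra machinery needed
    yaml_tag: str = "!VariableSpace"

@dataclass
class StateSpace:
    variables: list[Variable]  # option (a): needs the future import on 3.8/3.9
    yaml_tag: str = "!StateSpace"

vs = VariableSpace(variables=[Variable("m")])
ss = StateSpace(variables=[Variable("a")])
assert vs.yaml_tag == "!VariableSpace"
assert ss.variables[0].name == "a"
```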
```python
        data = yaml.safe_load(f)

    def test_full(self):
        with open(self.path + "consindshk_full.yml") as f:
            data = yaml.safe_load(f)
```

Variable data is not used.

Suggested change:

```python
        data = yaml.safe_load(f)
        self.assertIsNotNone(data)

    def test_full(self):
        with open(self.path + "consindshk_full.yml") as f:
            data = yaml.safe_load(f)
            self.assertIsNotNone(data)
```
```python
import unittest

import numpy as np
```

Import of 'np' is not used.

Suggested change: remove the `import numpy as np` line.
```python
import numpy as np
import yaml

import HARK.abstract.variables
```
Import of 'HARK' is not used.
```text
numba>=0.56
numpy>=1.23
pandas>=1.5
pyyaml
```
The newly added dependency pyyaml is unpinned, so each install may fetch a different upstream package version, which increases supply-chain attack risk if the PyPI project is ever compromised. Because this code runs as part of your application, a malicious release of pyyaml could execute arbitrary code with your app’s privileges and access to secrets. Pin pyyaml to a specific, vetted version (or manage it via a lockfile/constraints file) and update it deliberately after review rather than tracking the moving latest release.
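A sketch of the pinned alternative for requirements/base.txt (the version numbers here are illustrative; pick whatever release you actually vet):

```text
numba>=0.56
numpy>=1.23
pandas>=1.5
pyyaml>=6.0,<7.0  # or an exact pin (e.g. pyyaml==6.0.1) via a constraints/lock file
```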