Skip to content

Incorporate distributions into inference#601

Open
TimothyWillard wants to merge 8 commits intodevfrom
incorporate-params-into-inference
Open

Incorporate distributions into inference#601
TimothyWillard wants to merge 8 commits intodevfrom
incorporate-params-into-inference

Conversation

@TimothyWillard
Copy link
Contributor

@TimothyWillard TimothyWillard commented Aug 5, 2025

Describe your changes.

This pull request:

  • Fixed bug in the call signature to check_parameter_positivity in build_step_source_arg.
  • Fixed typos in call to distribution_from_confuse_config in gempyor.inference_parameters.
  • Added DistributionABC.support abstract property to give the lower/upper bounds of a distribution.
  • Incorporated support abstract property into the InferenceParameters class and modified the 'simple_usa_statelevel' example to demonstrate.
  • Added gempyor.NPI.helpers.SpatialGroups to consistently represent modifier subpopulations and converted get_spatial_groups into the from_subpopulations class method for ease of creation from modifier configuration.

The remaining TODOs are:

Does this pull request make any user interface changes? If so please describe.

The user interface changes are allowing the usage of distributions other than 'truncnorm' in the value for modifiers.

Those are reflected in updates to the documentation in ...

What does your pull request address? Tag relevant issues.

This pull request addresses #598.

@TimothyWillard TimothyWillard added gempyor Concerns the Python core. high priority High priority. inference Concerns the parameter inference framework. next release Marks a PR as a target to include in the next release. labels Aug 5, 2025
@TimothyWillard TimothyWillard self-assigned this Aug 5, 2025
Copy link
Contributor

@pearsonca pearsonca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is probably fine, but I'd like to discuss repeating the pattern we've been using with the abstract class delegating to an internal implementing method (i.e. sample and _sample_impl)

seems like there are potentially natural opportunities with pydantic approaches

@TimothyWillard TimothyWillard force-pushed the incorporate-params-into-inference branch from 03864c4 to c1ddfc8 Compare August 6, 2025 15:15
@emprzy
Copy link
Collaborator

emprzy commented Aug 7, 2025

I like the added support to distributions. Will definitely keep this in mind as I'm working on #595 . Need to more thoroughly review in a bit

@TimothyWillard TimothyWillard force-pushed the incorporate-params-into-inference branch from c1ddfc8 to 979150d Compare August 18, 2025 13:04
Copy link
Contributor

@pearsonca pearsonca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor items to consider - but nothing must-have

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm generally fine with changing these, let's just make sure to check any connected documentation to ensure it still makes sense with the changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No documentation updates are required to accommodate the example change. The 'simple_usa_statelevel' example is only mentioned in the HPC submission guide which just uses the config as an EMCEE inference example but does not discuss the details of the config contents.

(C flepiMoP/venv) ~/D/G/H/f/documentation (Light refactor of `InferenceParameters` class, wq, ae, model-output-api*) > rg -F 'simple_usa_statelevel'
gitbook/how-to-run/advanced-run-guides/running-on-a-hpc-with-slurm.md
99:$ export PROJECT_PATH="$FLEPI_PATH/examples/simple_usa_statelevel/"
115:    simple_usa_statelevel.yml
163:    simple_usa_statelevel.yml > simple_usa_statelevel_estimation.log 2>&1 & disown
166:In short, this command will submit 6 test jobs that will vary simulations and measure time and memory. The number of simulations will be used to project the required resources. The test jobs will range from 1/5 to 1/10 of the target job size. This command will take a bit to run because it needs to wait on these test jobs to finish running before it can do the analysis, so you can check on the progress by checking the output of the `simple_usa_statelevel_estimation.log` file.
186:    simple_usa_statelevel.yml

Comment on lines +44 to +46
def __call__(self) -> float | int:
"""A shortcut for `self.sample(size=1)`."""
return self.sample(size=1).item()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: is this relocation is a linting based thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This relocation was manual, I'm used to putting dunders near the top of the class. Happy to undo to clean up the diff if preferred.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with the relocation - also agree with this loc for dunders. Was asking more to see if this is something we should be adding to our style guide.

Comment on lines +159 to +167
@property
def _lower(self) -> float:
"""The lower bound of the fixed distribution's support."""
return self.value

@property
def _upper(self) -> float:
"""The upper bound of the fixed distribution's support."""
return self.value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for later: is there way we can use pydantic aliases for these kind of things?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(applies to several potential options)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cursory look at Alias indicates this almost entirely about serialization / deserialization, not object access.

we could potentially make lower / upper the fields, and then "value" (in this example) the alias.

but definitely this is a tabled item for future refinement - maybe incorporate in #603 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took another look at our options, I think we're best off waiting until pydantic/pydantic#11685. I'll add a note to #603.

@TimothyWillard TimothyWillard mentioned this pull request Aug 21, 2025
Fix `TypeError` caused by providing the subpopulation populations,
`list[int]`, to `check_parameter_positivity` by providing the
subpopulation names, `list[str]`. This would cause an error when trying
to construct an error message when it tried to join the subpopulation
names together. The exception was: `sequence item 0: expected str
instance, numpy.int64 found`.
Fixed typo in the name of the distribution creation helper in two places
in `inference_parameter.py`. Also corrected call to use
`DistributionABC.sample` method because a specific size was required.
Added a `support` property to the `DistributionABC` class which returns
a tuple of two numbers representing the upper and lower bounds
respectively. This property pulls said bounds from either the
`_lower/upper_bound` attributes if defined by the implementation or
`_lower/upper` properties if overridden by the implementation. Added to
the current examples to demonstrate usage. Minor modifications to the
commond distributions test was required.
Incorporated bounds provided by `DistributionABC.support` into the
`InferenceParameters` class for lower/upper bounds used in proposal
rejection. Also converted lower/upper bounds representation from list of
floats to numpy arrays of floats for performance reasons.
Used the 'simple_usa_statelevel' example to demonstrate how inference on
modifiers now supports more than 'truncnorm' distributions. Changed the
'truncnorm' usage in the seir modifiers section to use a 'gamma'
distribution with the same mean and variance.
@TimothyWillard TimothyWillard force-pushed the incorporate-params-into-inference branch from 979150d to 87030fd Compare August 22, 2025 16:06
boolean vector of True if the parameter is bigger than the upper bound and False if not
"""
return np.array((proposal > self.ubs))
lower_bounds, upper_bounds = zip(*[dist.support for dist in self.pdists])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how often is check_in_bounds called? i'm not sure about the elimination of ubs, lbs as properties of this object - might still be useful to create and cache those with initialization / parsing, especially if this method is called many times post initialization.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(still think they should go as arguments to the add_... method, but maybe do this extract => append there)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Frequently. I'll make the adjustment.

"value",
] = proposal[p_idx]
for p_idx in range(len(self)):
df = snpi_df_mod if self.ptypes[p_idx] == "seir_modifiers" else hnpi_df_mod
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

love it - i assume this logic is working fine as pointer-/reference-like updating operation, rather than copy-on-assign to a subsequently ignored df?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aside, i'm not clocking a proposal injection test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add

Added unit tests and changed the API of
`gempyor.NPI.helpers.get_spatial_groups`. The API change moves away from
requiring a confuse configview to use the function, which pushes the
confuse wrangling to the caller. Improves unit testing and general
purpose usage and makes the function more stable as we move away from
confuse. Also simplified/improved the logic and enhanced the
documentation and type hints.
@TimothyWillard TimothyWillard force-pushed the incorporate-params-into-inference branch from 87030fd to aef7714 Compare August 25, 2025 19:03
@TimothyWillard TimothyWillard force-pushed the incorporate-params-into-inference branch from aef7714 to c327f2e Compare August 25, 2025 20:09
For ease of constructing and manipulating modifier spatial groups
converted the `SpatialGroups` typed dict to a frozen data class. This
allows for the continued typing support as well as adding support for
iterating over a `SpatialGroups` class in a consistent way. Also
migrated `get_spatial_groups` into the
`SpatialGroups.from_subpopulations` class method. Added corresponding
documentation and unit tests.
* Address suggestions from `pylint` to make the code more readable as
	well as add documentation/type hints.
* Added initial, highly targeted unit tests. Including for the
	`inject_proposal` method.
* Consolidated `hit_*` methods into the `check_in_bound` method.
* Made the `{l/u}bs` attributes private and extract them from the
	distribution object on behalf of the user in `add_single_parameter`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

gempyor Concerns the Python core. high priority High priority. inference Concerns the parameter inference framework. next release Marks a PR as a target to include in the next release.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature request]: Incorporate gempyor.distributions Into gempyor.inference_parameter.InferenceParameters

3 participants