Numeric methods #61

Open
michaeldickens wants to merge 99 commits into rethinkpriorities:main from michaeldickens:numeric-methods
@michaeldickens
Contributor

Changes

This PR adds support for representing distributions numerically. The PR includes documentation on exactly how it works (see doc/source/numeric_distributions.rst), so I will just summarize the changes here.

When people do cost-effectiveness analyses or other sorts of Fermi estimates, if they incorporate uncertainty at all, they almost always use Monte Carlo sampling. Numerically representing probability distributions as histograms is usually much more accurate at a given level of speed (or, equivalently, much faster at a given level of accuracy), so it kind of bugs me that people hardly ever use numeric methods. But as far as I know, there aren't any good tools for numeric methods, and it's a bit tricky to implement from scratch: not difficult in the grand scheme of things, but more involved than writing a Monte Carlo simulation, which you can do in only a few lines of code. So I decided to extend Squigglepy to support numeric methods.
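To make the accuracy claim concrete, here is a minimal sketch that estimates the mean of a log-normal distribution two ways: by Monte Carlo sampling, and by a histogram whose bin masses come from the normal CDF. It uses only the Python standard library. The binning scheme (equal-width bins in z-space, midpoint as the bin value) and the names `monte_carlo_mean` and `histogram_mean` are assumptions made for illustration; they are not necessarily what this PR implements.

```python
import math
import random
from statistics import NormalDist

MU, SIGMA = 0.0, 1.0
TRUE_MEAN = math.exp(MU + SIGMA**2 / 2)  # closed-form mean of a log-normal


def monte_carlo_mean(n, rng):
    """Estimate the mean by averaging n random samples."""
    return sum(rng.lognormvariate(MU, SIGMA) for _ in range(n)) / n


def histogram_mean(num_bins, z_max=8.0):
    """Estimate the mean from a histogram over z in [-z_max, z_max].

    Each bin carries the exact probability mass of its z-interval and is
    represented by the log-normal value at the interval's midpoint.
    """
    std_normal = NormalDist()
    width = 2 * z_max / num_bins
    total = 0.0
    for i in range(num_bins):
        z_lo = -z_max + i * width
        z_hi = z_lo + width
        mass = std_normal.cdf(z_hi) - std_normal.cdf(z_lo)
        value = math.exp(MU + SIGMA * (z_lo + z_hi) / 2)
        total += mass * value
    return total


rng = random.Random(0)
# Average the Monte Carlo error over 20 runs of 1000 samples each.
mc_errors = [abs(monte_carlo_mean(1000, rng) - TRUE_MEAN) for _ in range(20)]
mc_error = sum(mc_errors) / len(mc_errors)
hist_error = abs(histogram_mean(1000) - TRUE_MEAN)

print(f"Monte Carlo (1000 samples), mean abs error: {mc_error:.2e}")
print(f"Histogram   (1000 bins),    abs error:      {hist_error:.2e}")
```

Monte Carlo error shrinks only as the inverse square root of the sample count, whereas the midpoint histogram's error shrinks roughly with the square of the bin width, which is why the histogram comes out far ahead at the same size in this toy case.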

I hope to reproduce this PR in Squiggle since Squiggle is actively supported and more widely used, but I did it in Squigglepy first because I thought it would be simpler (especially because Python has good numeric libraries).

An overview of supported features:

  1. Numeric representation of any of these distribution types: normal, log-normal, uniform, constant, chi-square, exponential, PERT, beta, gamma, Pareto
  2. Support for the special types MixtureDistribution and ComplexDistribution
  3. Mathematical operations over numeric distributions: addition, subtraction, multiplication, division, negation, reciprocal, exp, log, power
  4. Statistical operations over numeric distributions: mean, standard deviation, cdf, ppf, drawing a random sample, clipping, conditioning on the random variable satisfying a condition, and computing the probability that the random variable satisfies a condition
  5. Distributions with non-infinitesimal probability mass at zero (e.g., for representing interventions that have some chance of no effect)
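As a rough illustration of what a mathematical operation over numeric distributions can look like, the sketch below multiplies two independent histograms by forming all pairwise bin products and merging them back down to a fixed number of bins, and also shows one way to attach a point mass at zero. Everything here (`Hist`, `rebin`, `lognormal_hist`, `with_zero_mass`, and the merging strategy) is hypothetical and written for this example only; it is not the Squigglepy API or the algorithm used in this PR, which is described in doc/source/numeric_distributions.rst.

```python
import math
from dataclasses import dataclass
from statistics import NormalDist


@dataclass
class Hist:
    """Hypothetical histogram: one representative value and mass per bin."""
    values: list
    masses: list

    def mean(self):
        return sum(v * m for v, m in zip(self.values, self.masses))

    def __mul__(self, other):
        # Product of independent variables: every pair of bins contributes
        # a bin with value v1 * v2 and mass m1 * m2.
        pairs = sorted(
            (v1 * v2, m1 * m2)
            for v1, m1 in zip(self.values, self.masses)
            for v2, m2 in zip(other.values, other.masses)
        )
        return rebin(pairs, len(self.values))


def rebin(pairs, num_bins):
    """Merge sorted (value, mass) pairs into at most num_bins bins.

    Each merged bin keeps the total mass and the mass-weighted mean value
    of its group, so the overall mean is preserved.
    """
    chunk = math.ceil(len(pairs) / num_bins)
    values, masses = [], []
    for start in range(0, len(pairs), chunk):
        group = pairs[start:start + chunk]
        mass = sum(m for _, m in group)
        if mass > 0:  # skip groups from far tails with no representable mass
            values.append(sum(v * m for v, m in group) / mass)
            masses.append(mass)
    return Hist(values, masses)


def lognormal_hist(mu, sigma, num_bins=200, z_max=8.0):
    """Histogram of a log-normal using equal-width bins in z-space."""
    std_normal = NormalDist()
    width = 2 * z_max / num_bins
    values, masses = [], []
    for i in range(num_bins):
        z_lo = -z_max + i * width
        z_hi = z_lo + width
        values.append(math.exp(mu + sigma * (z_lo + z_hi) / 2))
        masses.append(std_normal.cdf(z_hi) - std_normal.cdf(z_lo))
    return Hist(values, masses)


def with_zero_mass(hist, p_zero):
    """Put probability p_zero on exactly 0 and scale the rest down."""
    scaled = [m * (1 - p_zero) for m in hist.masses]
    return Hist([0.0] + hist.values, [p_zero] + scaled)


a = lognormal_hist(0.0, 0.5)
b = lognormal_hist(1.0, 0.3)
product = a * b
risky = with_zero_mass(a, 0.3)  # 30% chance of no effect

print(product.mean(), a.mean() * b.mean())
print(risky.mean(), 0.7 * a.mean())
```

Merging by mass-weighted mean is a deliberate choice in this sketch: it keeps the mean of the product equal to the product of the means (up to floating-point error) no matter how coarsely the bins are merged, at the cost of losing some detail in the shape of the distribution.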

QA

  1. There are a lot of possible edge cases. I wrote a lot of tests, which catch many edge cases that will most likely never happen in practice (e.g., what if you construct a two-sided distribution where only 0.00000000001% of the probability mass is below 0? what if a distribution has a standard deviation of 1 quintillion?), but there could well be broken edge cases that I didn't find.
  2. mypy is raising a bunch of errors because the code sometimes dynamically dispatches based on a variable's type, and mypy doesn't like that. It looks like mypy also has failures in other files, so I don't know if valid static typing is a requirement for Squigglepy. If it is, I can go back and try to fix the type errors.

@michaeldickens mentioned this pull request on Jan 1, 2024:
from_distribution where clipped dist has incorrect pos and neg EV contribution
@michaeldickens
Contributor Author

Oh, I should also mention: the function sq.numeric works basically as a drop-in replacement for sq.sample. I tested this PR on two of Laura Duffy's cost-effectiveness models (https://github.com/rethinkpriorities/risk_model) and it was pretty easy to port over. I just had to find-and-replace sq.sample -> sq.numeric, plus a couple of other changes.
