Implement exact multinomial math in server.logprob() etc.

The current implementation of data conditioning in the server uses a "with-replacement approximation" of the likelihood function, i.e. it uses `pow()` instead of `gamma()` or `factorial()`. This design decision was based on an early non-vectorized implementation, where calling `gamma()` was very slow. The server is now vectorized, so it should be cheap to implement exact likelihood computations.

## How?

For math details, see Stephen Tu's [excellent writeup](https://people.eecs.berkeley.edu/~stephentu/writeups/dirichlet-conjugate-prior.pdf) or [Wikipedia](https://en.wikipedia.org/wiki/Dirichlet-multinomial_distribution).
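
For quick reference, here is a sketch of the two quantities being compared. The notation is an assumption made for this summary and is not taken from `serving.py`: $\alpha_k$ are the Dirichlet concentration parameters for category $k$, $n_k$ are the observed counts, $A = \sum_k \alpha_k$, and $N = \sum_k n_k$. The second line is one reading of what the "with-replacement approximation" amounts to, namely treating every draw as independent with fixed probabilities $\alpha_k / A$.

```latex
% Exact Dirichlet-multinomial probability of one ordered sequence with counts n
P_{\text{exact}}(n \mid \alpha)
  = \frac{\Gamma(A)}{\Gamma(A + N)}
    \prod_{k} \frac{\Gamma(\alpha_k + n_k)}{\Gamma(\alpha_k)}

% With-replacement approximation (assumed form): independent draws
P_{\text{approx}}(n \mid \alpha)
  = \prod_{k} \left( \frac{\alpha_k}{A} \right)^{n_k}
```

The two agree whenever $N = 1$, since $\Gamma(\alpha_k + 1) / \Gamma(\alpha_k) = \alpha_k$ and $\Gamma(A) / \Gamma(A + 1) = 1 / A$. To score unordered counts rather than an ordered sequence, both expressions are multiplied by the same multinomial coefficient $N! / \prod_k n_k!$.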

## Where?

This approximation is pervasive in `serving.py`; search for "with-replacement" to find each occurrence:
- In [`TreeCatServer.sample()`](https://github.com/posterior/treecat/blob/53a7df9179b2086e3f86697e8b906fa5375fb5a5/treecat/serving.py#L199)
- In [`TreeCatServer.logprob()`](https://github.com/posterior/treecat/blob/53a7df9179b2086e3f86697e8b906fa5375fb5a5/treecat/serving.py#L265)
- In [`TreeCatServer.marginals()`](https://github.com/posterior/treecat/blob/53a7df9179b2086e3f86697e8b906fa5375fb5a5/treecat/serving.py#L314)

## Performance Impact

Results from `treecat.profile serve` show that the majority of time is spent on `np.dot()` in propagation. Therefore, there should be negligible cost in switching from `np.power` to `scipy.special.gammaln`.
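
To make the proposed switch concrete, here is a minimal sketch of both likelihoods written as vectorized NumPy code. It is not the code in `serving.py`: the function names, the `alpha` and `counts` arguments, and the assumption that the approximation reduces to raising normalized probabilities to the power of the counts are all hypothetical and chosen for illustration. Both functions return the log-probability of one ordered sequence of draws and broadcast over leading batch dimensions.

```python
import numpy as np
from scipy.special import gammaln


def logprob_with_replacement(alpha, counts):
    """Approximate log-likelihood: every draw is treated as independent.

    Hypothetical helper; equivalent to log(prod((alpha / sum(alpha)) ** counts)).
    """
    alpha = np.asarray(alpha, dtype=np.float64)
    counts = np.asarray(counts, dtype=np.float64)
    log_probs = np.log(alpha) - np.log(alpha.sum(axis=-1, keepdims=True))
    return (counts * log_probs).sum(axis=-1)


def logprob_exact(alpha, counts):
    """Exact Dirichlet-multinomial log-likelihood of an ordered sequence.

    Hypothetical helper; uses gammaln so that large counts do not overflow.
    """
    alpha = np.asarray(alpha, dtype=np.float64)
    counts = np.asarray(counts, dtype=np.float64)
    total_alpha = alpha.sum(axis=-1)
    total_count = counts.sum(axis=-1)
    return (
        gammaln(total_alpha)
        - gammaln(total_alpha + total_count)
        + (gammaln(alpha + counts) - gammaln(alpha)).sum(axis=-1)
    )


# Illustrative inputs: a uniform prior over 3 categories and a batch of 2 rows.
alpha = np.array([1.0, 1.0, 1.0])
counts = np.array([[1, 0, 0], [2, 0, 0]])
exact = logprob_exact(alpha, counts)
approx = logprob_with_replacement(alpha, counts)
```

For the first row (a single observation) the two values coincide. For the second row the exact probability is 1/6 while the approximation gives 1/9, which shows the kind of discrepancy this issue aims to remove. Both functions are elementwise array operations followed by a reduction, so their cost should scale the same way with batch size.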
