Description
Historically VarInfo comprises metadata + accumulators. When you do DynamicPPL.evaluate!!(model, vi), if model's leaf context is DefaultContext, then this:
- reads values from vi.metadata;
- accumulates outputs in vi.accs.
Now, in the last couple of months there has been a push towards using OnlyAccsVarInfo — a varinfo whose metadata is empty — where possible. 'Where possible', here, means when the model's leaf context is InitContext. That is possible because InitContext provides a full specification of how to generate the values. For example:
- InitContext{R,<:InitFromPrior} generates values by sampling from the distribution on the rhs of tilde
- InitContext{R,<:InitFromParams} generates values by reading from a dictionary, NamedTuple, or other container
(here R <: Random.AbstractRNG)
This not only has performance benefits (since there is no more metadata to modify), but also makes the flow of data cleaner:
- values are generated from model.context;
- outputs are written into vi.accs.
Instead of having a single struct, vi, whose state can be modified in every possible way by tilde_assume!! calls, this means that there is a clear input (model.context) and output (vi.accs). This is true whenever a model has InitContext as its leaf context.
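To make the "clear input, clear output" idea concrete, here is a minimal toy sketch that does not use DynamicPPL at all. Every name in it (ToyStrategy, ToyFromPrior, ToyFromParams, generate, toy_evaluate) is hypothetical and exists only for illustration; the real InitContext strategies are more general. The point is only that the strategy fully determines where values come from, and the sole output is the accumulated log-density.

```julia
using Random

# Hypothetical stand-ins for initialisation strategies; NOT DynamicPPL types.
abstract type ToyStrategy end
struct ToyFromPrior <: ToyStrategy end
struct ToyFromParams{P<:NamedTuple} <: ToyStrategy
    params::P
end

# Each strategy fully specifies how to produce the value of variable `name`.
# `sampler` plays the role of the distribution on the rhs of tilde.
generate(rng::AbstractRNG, ::ToyFromPrior, name::Symbol, sampler) = sampler(rng)
generate(::AbstractRNG, s::ToyFromParams, name::Symbol, sampler) = s.params[name]

# Log-density of Normal(mu, 1), written out to avoid depending on Distributions.jl.
logpdf_normal(x, mu) = -0.5 * (x - mu)^2 - 0.5 * log(2π)

# Toy model, written by hand: x ~ Normal(0, 1); y ~ Normal(x, 1).
# Input: the strategy (plus an rng). Output: the accumulated log-density.
function toy_evaluate(rng::AbstractRNG, strategy::ToyStrategy)
    logp = 0.0
    x = generate(rng, strategy, :x, r -> randn(r))
    logp += logpdf_normal(x, 0.0)
    y = generate(rng, strategy, :y, r -> x + randn(r))
    logp += logpdf_normal(y, x)
    return logp
end
```

Calling toy_evaluate(rng, ToyFromPrior()) draws fresh values, while toy_evaluate(rng, ToyFromParams((x = 0.0, y = 0.0))) replays fixed ones; in neither case is there any mutable store of values for the model body to modify.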
This suggests that DefaultContext should be removed, and replaced with InitFromMetadata, or perhaps just InitFromParams(::Metadata). This is completely equivalent in terms of how much data is being carried around: in effect, the metadata simply moves from the vi argument of DynamicPPL.evaluate!!(model, vi) into the context field of the model argument.
(Note that here I call it Metadata, but this could equally well be VarNamedTuple in the near future.)
As I've argued elsewhere, calling this DefaultContext is quite misleading and hides quite a lot of complexity. It hides, for example, the fact that you need to generate a Metadata before you can evaluate a model with DefaultContext. If it were called InitFromParams(::Metadata) instead, it would be immediately obvious that you need to have a Metadata before you can use it.
As a followup to this, I think that when instantiating a model, the default context field should be InitContext(InitFromPrior()). That means that when you create a model you can immediately start working with it without having to generate a set of parameters first, or without having to set a new leaf context.
Finally, note that removing DefaultContext would mean that our only meaningful leaf context would now be InitContext. InitContext essentially wraps an rng, plus a strategy. If we unwrapped this and removed the context field from DynamicPPL.Model, then in the near-ish future, we could change the signature of evaluate!! to instead be
DynamicPPL.evaluate!!(model, vi) # now
DynamicPPL.evaluate!!(rng, model, strategy, accs) # future
(Note, BTW, that this is actually the signature of DynamicPPL.init!!.) In other words, we can get rid of the VarInfo struct entirely, and only use accumulators. Furthermore, each argument to evaluate!! will have a clear and separate meaning, as opposed to the current situation, where the semantics of evaluate!! are determined by an ugly combination of model and vi.
Note that the removal of metadata would also open the pathway to threaded assume statements. The only thing that makes threaded assume impossible is the fact that writing into metadata is not thread-safe. However, reading from metadata is thread-safe.
On a technical level, this proposal also needs a way to bootstrap itself. In other words, if we want to use InitFromParams(::Metadata), then we have to have a way of generating a Metadata (by evaluating a model). Currently, this is done by using InitContext with a full VarInfo, but if we get rid of vi.metadata then this won't be possible. I propose instead that there needs to be an accumulator that will generate Metadata.
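As a rough illustration of the bootstrapping step, the sketch below uses a hypothetical accumulator (ValueRecorder, with a hypothetical record!! function) that collects every sampled value during one evaluation, so that the collected container can then be fed back in as parameters. This is a toy with a plain Dict standing in for Metadata, not a proposal for the actual DynamicPPL implementation.

```julia
using Random

# Hypothetical accumulator that records every value produced during evaluation.
# A Dict stands in for Metadata here; this is an assumption for illustration.
struct ValueRecorder
    values::Dict{Symbol,Float64}
end
ValueRecorder() = ValueRecorder(Dict{Symbol,Float64}())

# Called once per assume statement; returns the accumulator, `!!`-style.
function record!!(acc::ValueRecorder, name::Symbol, value::Float64)
    acc.values[name] = value
    return acc
end

# Toy model evaluated by sampling from the prior: x ~ Normal(0, 1); y ~ Normal(x, 1).
# The only thing it writes to is the accumulator.
function sample_and_record(rng::AbstractRNG, acc::ValueRecorder)
    x = randn(rng)
    acc = record!!(acc, :x, x)
    y = x + randn(rng)
    acc = record!!(acc, :y, y)
    return acc
end

# The same toy model, but with values read from a container instead of sampled.
function replay(params::Dict{Symbol,Float64})
    x = params[:x]
    y = params[:y]
    return (x = x, y = y)
end

# Bootstrap: one sampling pass produces the container, later passes read from it.
acc = sample_and_record(MersenneTwister(42), ValueRecorder())
replayed = replay(acc.values)
```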
Accumulators can be written in a thread-safe manner, or at least they have an interface that can be implemented thread-safely. As long as such an accumulator is properly implemented, we will therefore have complete thread safety for assume and observe statements.
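One way such a thread-safe interface could look is sketched below, under the assumption that each task works on its own accumulator and the partial results are merged afterwards. The names ToyLogpAcc, add_term, merge_accs and threaded_accumulate are hypothetical and are not part of DynamicPPL.

```julia
# Hypothetical log-density accumulator. It is immutable, so tasks cannot race on it.
struct ToyLogpAcc
    logp::Float64
end

# Add one log-density term, returning a new accumulator.
add_term(acc::ToyLogpAcc, x::Float64) = ToyLogpAcc(acc.logp + x)

# Merge the results of two independently updated accumulators.
merge_accs(a::ToyLogpAcc, b::ToyLogpAcc) = ToyLogpAcc(a.logp + b.logp)

# Each chunk of terms is reduced in its own task, starting from a fresh
# accumulator; the partial results are then merged on the calling task.
function threaded_accumulate(terms::Vector{Float64}; nchunks::Int = 4)
    chunk_len = max(1, cld(length(terms), nchunks))
    chunks = Iterators.partition(terms, chunk_len)
    tasks = map(chunks) do chunk
        Threads.@spawn foldl(add_term, chunk; init = ToyLogpAcc(0.0))
    end
    return foldl(merge_accs, map(fetch, tasks); init = ToyLogpAcc(0.0))
end
```

Because no task ever writes to state shared with another task, no locking is needed; the cost is that the accumulator type has to define how two partial results are merged.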