-
Notifications
You must be signed in to change notification settings - Fork 39
Description
As far as I can see, the only reason to do linking is because samplers need it. Samplers moreover interface with a vector of floats, not a VarInfo. So really the only place when we need linked variables is when we call getindex_internal or something like vi[:]. If I'm mistaken about this, and there's a different need, all of the below may be moot, so do say so.
This lead me to wonder why we bother storing linked variables at all, rather than converting at getindex/setindex time. Currently our philosophy seems to be that samplers give and take vectors of linked floats, and when interfacing with a sampler we typically call vi[:] and unflatten(vi, vec_of_floats) to read/write those vectors of floats from/to the internal storage of the VarInfo, where the variables are linked if link has been called on vi at some point in its history. Then whenever we need the unlinked versions of variables, such as when calling logpdf, we convert the internal, linked storage to the unlinked one.
An alternative approach would be to say that VarInfo only ever stores unlinked variables, and rather than calling link on vi once at the beginning of sampling and using getindex/setindex, if you would want linked variables you would need to call getindex_linked and setindex_linked or unflatten_linked, and the linking would happen on the fly in that call.
Needs linked Needs unlinked
┌──────────┐ ┌───────────┐ ┌─────────────┐
│ │ │ │ │ │
│ Sampler │ ◄─────► │ VarInfo │ ◄─────► │ logpdf call │
│ │ │ │ │ │
└──────────┘ ▲ └───────────┘ ▲ └─────────────┘
│ │
│ │
Alternative: Current:
(Un)link here (Un)link here
I'm not sure if the alternative is better, but I do see some benefits to it:
- VarInfos would have less state that they carry. Currently when you call
getindex_internalorunflattenyou need to know/remember whether this VarInfo has been linked at some point in its history (or check for it) to know what you'll get. - Sampler interface code would be more explicit, at the call-site, as to whether it wants linked variables or not. This would make the interface code a bit easier to understand.
- VarInfo would be simpler. We could drop the
istransflags, and never worry about things like having some variables linked and others not. Note that we would still need to carry the linking transformations in the VarInfo, to know how to transform when someone calls e.g.getindex_linked.
This could also help us move towards something like SimpleVarInfo or the @parameters struct of TuringLang/Turing.jl#2492, where the storage in VarInfo would more closely match how the model sees the variables. I haven't thought about this carefully, but I could imagine this leading to some performance gains and maybe code simplification.
This is currently more an idea to be discussed than a proposal to implement. Any thoughts? Downsides to linking at getindex/setindex time?