I've investigated the efficiency of the current repo, especially with the `tutorials/examples/train_hypergrid_simple.py` example, and I have already pushed some changes in #376.
There are still some inefficiencies compared to the lightest implementation of HyperGrid (with TB & DB), as follows:
1. When we use a learnable PB, we call PF and PB independently, i.e., we require two NN forward passes for each step. While we can share parameters between PF and PB, we cannot call them in one forward pass in the current implementation.
2. `DiscreteEnv` enforces that its state class `DiscreteStates` possesses forward & backward masks and updates them during `Env._step`. However, this can be inefficient because having those masks makes `States.__getitem__` (e.g., slicing) slower, since it requires three tensor slicings (`states.tensor` and the two masks), but in many cases we don't need the sliced masks.
3. `torch.distributions.Categorical` is inherently slower than `probs.multinomial(1)`: its initialization takes some time, and sampling from it also seems slower than multinomial.
Item 1 needs to be handled together with the related issue #315, since we may need to store backward probabilities in the `Trajectories` object (currently it can only store the forward probabilities).
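As a rough illustration of what storing backward probabilities might look like, here is a minimal, hypothetical sketch (plain Python, not the actual torchgfn `Trajectories` API; all names are illustrative) of a container holding forward log-probabilities alongside an optional backward set:

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical sketch (not the torchgfn API): a trajectories container that
# can record backward log-probabilities next to the forward ones, so that a
# shared PF/PB forward pass could populate both in a single step.
@dataclass
class Trajectories:
    states: List[int] = field(default_factory=list)
    log_pfs: List[float] = field(default_factory=list)  # forward log-probs
    log_pbs: Optional[List[float]] = None               # backward log-probs, optional

    def add_step(self, state: int, log_pf: float, log_pb: Optional[float] = None) -> None:
        self.states.append(state)
        self.log_pfs.append(log_pf)
        if log_pb is not None:
            if self.log_pbs is None:
                self.log_pbs = []
            self.log_pbs.append(log_pb)
```

The `log_pbs` field defaults to `None` so trajectories sampled without a learnable PB pay no extra storage cost.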
Item 2 can be resolved by making those masks optional and allowing the use of `@property` for masking instead, just as we do for GraphEnvs.
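A minimal sketch of the optional-mask idea, using plain Python stand-ins (the class, the toy 1-D grid, and the two-action rule are all illustrative, not the actual torchgfn implementation): masks become a lazily computed `@property`, so slicing touches only the state tensor:

```python
# Hypothetical sketch: instead of storing mask tensors that Env._step must
# keep in sync and __getitem__ must slice, compute masks on demand.
class DiscreteStates:
    def __init__(self, tensor, height):
        self.tensor = tensor  # toy 1-D grid positions, standing in for a torch tensor
        self.height = height

    @property
    def forward_masks(self):
        # Toy rule: "increment" (action 0) is allowed below the grid edge,
        # "exit" (action 1) is always allowed. Computed only when requested.
        return [[s < self.height - 1, True] for s in self.tensor]

    def __getitem__(self, index):
        # One slice instead of three (tensor plus two mask tensors).
        return DiscreteStates(self.tensor[index], self.height)
```

Callers that never inspect masks after slicing (the common case noted above) then pay for one slicing operation instead of three.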
Item 3 can be resolved by defining a custom distribution without all the sanity checks in `Categorical`, though it may be better to keep them for reliability.
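To illustrate the trade-off, here is a hypothetical minimal categorical sampler in plain Python (a torch-free stand-in, not a drop-in replacement for `torch.distributions.Categorical`) that skips validation entirely, which is exactly the kind of speed-for-safety exchange described above:

```python
import bisect
import random
from itertools import accumulate

# Hypothetical sketch of a validation-free categorical sampler.
class FastCategorical:
    def __init__(self, probs):
        # No normalization, NaN, or non-negativity checks: the caller is
        # trusted to pass valid probabilities (this is where the speed comes
        # from, and also why Categorical's sanity checks aid reliability).
        self.cum = list(accumulate(probs))

    def sample(self):
        # Inverse-CDF sampling, the same effect as probs.multinomial(1).
        return bisect.bisect(self.cum, random.random() * self.cum[-1])
```

Passing a malformed `probs` (negative entries, all zeros) here fails silently or crashes, whereas `Categorical` would raise a clear error, which is the reliability argument for keeping the checks.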