
Some sources of inefficiency. #377

@hyeok9855


I've investigated the efficiency of the current repo, especially with the tutorials/examples/train_hypergrid_simple.py example, and have already pushed some changes in #376.

There are still some inefficiencies compared to the lightest standalone implementation of HyperGrid (with TB & DB), as follows:

  1. When we use a learnable PB, PF and PB are called independently, i.e., we need two NN forward passes per step. Even though PF and PB could share parameters, the current implementation cannot evaluate both in a single forward pass.
  2. DiscreteEnv enforces that its states (DiscreteStates) carry forward & backward masks and update them during Env._step. This can be inefficient: storing the masks makes States.__getitem__ (e.g., slicing) slower, since it requires three tensor slicings (states.tensor plus the two masks), while in many cases the sliced masks are never used.
  3. torch.distributions.Categorical is inherently slower than probs.multinomial(1): its initialization takes some time, and sampling from it also appears slower than multinomial.

Point 1 needs to be handled together with the related issue #315, since we may need to store backward probabilities in the Trajectories object (currently it can only store the forward probabilities).
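
To illustrate the single-forward-pass idea: a module with a shared trunk and two heads can produce PF and PB logits together. This is only a sketch under assumed shapes; the class name `SharedPFPB` and the convention that PB has one fewer output (no exit action) are assumptions for illustration, not the repo's API.

```python
# Illustrative sketch only (not the torchgfn API): a shared trunk with two
# heads yields both PF and PB logits in a single forward pass per step.
import torch
import torch.nn as nn


class SharedPFPB(nn.Module):  # hypothetical name
    def __init__(self, state_dim: int, hidden_dim: int, n_actions: int):
        super().__init__()
        # The trunk is evaluated once per batch of states, instead of once
        # for PF and once again for PB.
        self.trunk = nn.Sequential(
            nn.Linear(state_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
        )
        self.pf_head = nn.Linear(hidden_dim, n_actions)      # forward logits (incl. exit)
        self.pb_head = nn.Linear(hidden_dim, n_actions - 1)  # backward logits (no exit)

    def forward(self, states: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        h = self.trunk(states)
        return self.pf_head(h), self.pb_head(h)
```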

Point 2 can be resolved by making those masks optional and allowing a @property that computes them on demand instead, just as we do for GraphEnvs.
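
A minimal sketch of what on-demand masking could look like for HyperGrid, assuming states are integer coordinates in [0, grid_size - 1]^ndim and that the exit action is always valid; the class name and structure are illustrative, not the repo's implementation.

```python
# Illustrative sketch, not the torchgfn implementation: masks are derived
# from the state tensor on demand, so __getitem__ slices only one tensor.
import torch


class LazyDiscreteStates:  # hypothetical name
    def __init__(self, tensor: torch.Tensor, grid_size: int):
        self.tensor = tensor  # shape: (batch, ndim), integer coordinates
        self.grid_size = grid_size

    def __getitem__(self, index) -> "LazyDiscreteStates":
        # One slicing operation instead of three (tensor + two stored masks).
        return LazyDiscreteStates(self.tensor[index], self.grid_size)

    @property
    def forward_masks(self) -> torch.Tensor:
        # Incrementing a coordinate is valid while it stays below grid_size - 1;
        # the exit action (last column) is assumed always valid here.
        increment_ok = self.tensor < self.grid_size - 1
        exit_ok = torch.ones(
            (*self.tensor.shape[:-1], 1), dtype=torch.bool, device=self.tensor.device
        )
        return torch.cat([increment_ok, exit_ok], dim=-1)

    @property
    def backward_masks(self) -> torch.Tensor:
        # Decrementing a coordinate is valid while it is positive.
        return self.tensor > 0
```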

Point 3 can be resolved by defining a custom distribution without all the sanity checks in Categorical, though it may be better to keep those checks for reliability.
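
A quick way to check the claim (absolute timings are machine- and version-dependent):

```python
# Rough micro-benchmark; Categorical pays for construction (normalization,
# optional argument validation) on top of sampling, while multinomial samples
# one action per row directly from the probability tensor.
import time

import torch

probs = torch.rand(4096, 10)
probs = probs / probs.sum(dim=-1, keepdim=True)  # normalized rows


def timeit(fn, n=1000):
    t0 = time.perf_counter()
    for _ in range(n):
        fn()
    return time.perf_counter() - t0


t_cat = timeit(lambda: torch.distributions.Categorical(probs=probs).sample())
t_mult = timeit(lambda: probs.multinomial(num_samples=1).squeeze(-1))

print(f"Categorical: {t_cat:.3f}s  multinomial: {t_mult:.3f}s")
```

Note that passing validate_args=False to Categorical already disables its argument checks, so a custom distribution would mainly save the remaining construction overhead (e.g., the probs/logits normalization).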
