At the beginning of Section 4, "AN ARCHITECTURE FOR VERY DEEP VAES", the paper says:
"This VAE consists only of convolutions, nonlinearities, and Gaussian stochastic layers."
But the code also uses a Discretized Mixture of Logistics (DML); see the "DmolNet" class in vae_helpers.py.
Aside from quantitative model performance (e.g. FID scores), how would using only Gaussian stochastic layers, rather than a DML, change the quality of reconstructions and samples?
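
To make the comparison concrete, here is a minimal sketch of the two kinds of output likelihood being contrasted, for a single pixel channel. It is not taken from vae_helpers.py: the function names are hypothetical, and the discretization (pixels rescaled to [-1, 1], 256 bins, edge bins absorbing the tails) is assumed to follow the PixelCNN++ convention. The coupling between RGB channels that a full DML may use is left out.

```python
import math


def sigmoid(x):
    # Numerically stable logistic CDF, written via tanh to avoid overflow.
    return 0.5 * (1.0 + math.tanh(0.5 * x))


def discretized_logistic_pmf(k, mu, scale, bins=256):
    """Probability mass of integer pixel value k in [0, bins - 1].

    Assumption: pixels are rescaled to [-1, 1] and each bin has
    half-width 1 / (bins - 1); the first and last bins absorb the tails.
    """
    x = 2.0 * k / (bins - 1) - 1.0
    half = 1.0 / (bins - 1)
    upper = 1.0 if k == bins - 1 else sigmoid((x + half - mu) / scale)
    lower = 0.0 if k == 0 else sigmoid((x - half - mu) / scale)
    return upper - lower


def dml_pmf(k, weights, mus, scales, bins=256):
    """Mixture of discretized logistics; weights are assumed to sum to 1."""
    return sum(
        w * discretized_logistic_pmf(k, m, s, bins)
        for w, m, s in zip(weights, mus, scales)
    )


def gaussian_log_density(x, mu, sigma):
    """Continuous Gaussian log-density, i.e. the alternative output model."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (x - mu) ** 2 / (
        2.0 * sigma ** 2
    )


# Hypothetical two-component mixture for one pixel channel.
weights, mus, scales = [0.3, 0.7], [-0.5, 0.4], [0.05, 0.1]
total_mass = sum(dml_pmf(k, weights, mus, scales) for k in range(256))
```

In this sketch the DML assigns a proper probability mass to each of the 256 pixel values and can be multimodal, while the Gaussian gives a continuous, unimodal density over real values.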