Hi, I was wondering why you used a VAE to get the embedding instead of just taking the output of a middle layer of a CNN.