I have been reviewing your Transformer implementation (located at /MASSA/Multimodal_pretrain/src_v0/model.py), and I noticed that the encoder and decoder appear to lack both positional encoding and residual connections. Is this omission intentional? I would also like to ask whether the absence of these components leaves the model incomplete or hurts its overall performance.
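
For reference, here is roughly what I mean by the two components. This is only a minimal sketch in plain Python, assuming the sinusoidal encoding from the original Transformer design. The function names (`sinusoidal_positional_encoding`, `add_positional_encoding`, `residual`) are hypothetical and not taken from model.py, and the layer normalization that usually follows each residual connection is left out for brevity.

```python
import math


def sinusoidal_positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position encodings.

    Hypothetical helper for illustration; not part of model.py.
    """
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            # Dimensions 2k and 2k+1 share one frequency: 1 / 10000^(2k / d_model).
            angle = pos / (10000 ** ((i - i % 2) / d_model))
            # Even dimensions use sine, odd dimensions use cosine.
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def add_positional_encoding(embeddings):
    """Add position information to a seq_len x d_model list of embeddings."""
    seq_len, d_model = len(embeddings), len(embeddings[0])
    pe = sinusoidal_positional_encoding(seq_len, d_model)
    return [
        [x + p for x, p in zip(emb_row, pe_row)]
        for emb_row, pe_row in zip(embeddings, pe)
    ]


def residual(sublayer, x):
    """Residual connection: return x + sublayer(x), element-wise."""
    y = sublayer(x)
    return [[a + b for a, b in zip(x_row, y_row)] for x_row, y_row in zip(x, y)]
```

Without something like `add_positional_encoding`, self-attention has no way to distinguish token order, and without the `residual` wrapper around each sublayer, deeper stacks are generally harder to train.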