You use a for-loop over the attention heads (so the runtime grows with the number of heads) and another for-loop over the graphs for graph attention (so the runtime also grows with the number of graphs). This will be very slow. Is there another way to handle this?
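To make the question concrete, here is a minimal sketch of what I mean by removing both loops. It assumes a GAT-style attention layer and that all graphs are padded to the same number of nodes `N`; I have not checked it against the code in this repo. The names `gat_loop`, `gat_batched`, `W`, `a_src` and `a_dst` are hypothetical, and NumPy is used only for illustration (the same `einsum` and broadcasting pattern is available in PyTorch).

```python
import numpy as np

def gat_loop(h, adj, W, a_src, a_dst):
    """Reference version with explicit loops over graphs and heads.

    h: (B, N, F) node features, adj: (B, N, N) adjacency (nonzero = edge),
    W: (H, F, D) per-head projections, a_src / a_dst: (H, D) attention vectors.
    """
    B, N, _ = h.shape
    H, _, D = W.shape
    out = np.zeros((B, H, N, D))
    for b in range(B):              # loop over graphs
        for k in range(H):          # loop over heads
            z = h[b] @ W[k]                                        # (N, D)
            e = (z @ a_src[k])[:, None] + (z @ a_dst[k])[None, :]  # (N, N)
            e = np.where(e > 0, e, 0.2 * e)                        # LeakyReLU
            e = np.where(adj[b] > 0, e, -1e9)                      # mask non-edges
            e = e - e.max(axis=1, keepdims=True)                   # stable softmax
            alpha = np.exp(e)
            alpha = alpha / alpha.sum(axis=1, keepdims=True)
            out[b, k] = alpha @ z
    return out

def gat_batched(h, adj, W, a_src, a_dst):
    """Same computation, with graphs and heads handled as batch dimensions."""
    z = np.einsum("bnf,hfd->bhnd", h, W)            # (B, H, N, D)
    s = np.einsum("bhnd,hd->bhn", z, a_src)         # source scores
    t = np.einsum("bhnd,hd->bhn", z, a_dst)         # target scores
    e = s[..., :, None] + t[..., None, :]           # (B, H, N, N)
    e = np.where(e > 0, e, 0.2 * e)                 # LeakyReLU
    e = np.where(adj[:, None] > 0, e, -1e9)         # broadcast mask over heads
    e = e - e.max(axis=-1, keepdims=True)           # stable softmax
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=-1, keepdims=True)
    return alpha @ z                                # (B, H, N, D)
```

Both functions should return the same `(B, H, N, D)` array; the batched one replaces the `B x H` Python iterations with a few tensor operations. Graphs with different numbers of nodes would need padding plus a mask, which this sketch only covers through `adj`.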