Hi, since the RNN is used to perform a multi-label classification task, shouldn't a Sigmoid layer be used instead of a Softmax layer at the end to calculate the probabilities? As I understand it, Softmax is the right fit for a multi-class classification problem, where each example has exactly one label, rather than for a multi-label one. Please let me know your thoughts!
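
To illustrate the difference I have in mind, here is a minimal sketch in plain Python. The logits are made-up values for three hypothetical labels, not outputs of the actual model, and the 0.5 threshold is just an assumed default.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability; outputs sum to 1.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(logits):
    # Each output is an independent probability in (0, 1).
    return [1.0 / (1.0 + math.exp(-x)) for x in logits]

# Hypothetical final-layer scores for three labels (assumed values).
logits = [2.0, 1.5, -1.0]

soft = softmax(logits)
sig = sigmoid(logits)

# Multi-class: pick the single most likely class from the Softmax output.
predicted_class = max(range(len(soft)), key=lambda i: soft[i])

# Multi-label: keep every label whose Sigmoid probability passes the threshold.
predicted_labels = [i for i, p in enumerate(sig) if p >= 0.5]

print("softmax:", soft, "-> class", predicted_class)
print("sigmoid:", sig, "-> labels", predicted_labels)
```

With Softmax the probabilities compete with each other because they must sum to 1, so two labels that are both present cannot both get a high probability. With Sigmoid each label is scored on its own, which seems closer to what a multi-label task needs.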