Speech Emotion Recognition (SER) Papers
- 2021_Deep learning approaches for speech emotion recognition: state of the art and research challenges
- 2023_Transformers in Speech Processing: A Survey
- 2023_Multimodal Deep Learning (CV+NLP)
- 2017_End-to-End Multimodal Emotion Recognition using Deep Neural Networks (A+V) [Code: tzirakis/Multimodal-Emotion-Recognition (github.com)]
- 2023_Transformer-based Self-supervised Multimodal Representation Learning for Wearable Emotion Recognition [arXiv:2303.17611]
- 2022_A study of transformer-based end-to-end speech recognition system for Kazakh language
- 2019_Attention Based Fully Convolutional Network for Speech Emotion Recognition
- 2019_Emotion Recognition from Speech
- 2020_MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION
- 2020_Speech Emotion Recognition with deep learning
- 2020_SELF-SUPERVISED LEARNING WITH CROSS-MODAL TRANSFORMERS FOR EMOTION RECOGNITION
- 2021_Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings
- 2022_GM-TCNet: Gated Multi-scale Temporal Convolutional Network using Emotion Causality for Speech Emotion Recognition
- 2022_LIGHT-SERNET: A LIGHTWEIGHT FULLY CONVOLUTIONAL NEURAL NETWORK FOR SPEECH EMOTION RECOGNITION
- ☆2023_TEMPORAL MODELING MATTERS: A NOVEL TEMPORAL EMOTIONAL MODELING APPROACH FOR SPEECH EMOTION RECOGNITION (SER SOTA)
- 2023_Speech Emotion Recognition Based on Two-Stream Deep Learning Model Using Korean Audio Information
- 2020_Context-Dependent Domain Adversarial Neural Network for Multimodal Emotion Recognition (SER SOTA; no code available)
- 2021_Transformer-based Transfer Learning and Multi-task Learning for Improving Speech Emotion Recognition Performance (Korean-language paper) [Transformer Encoder, WA 70.6%]
- 2021_EMOPIA: A MULTI-MODAL POP PIANO DATASET FOR EMOTION RECOGNITION AND EMOTION-BASED MUSIC GENERATION [LSTM]
- 2021_Multi-Modal Fusion Emotion Recognition Method of Speech Expression Based on Deep Learning
- 2021_Multimodal End-to-End Sparse Model for Emotion Recognition (T+A+V)
- 2021_MULTIMODAL EMOTION RECOGNITION WITH HIGH-LEVEL SPEECH AND TEXT FEATURES
- ☆2022_COGMEN: COntextualized GNN based Multimodal Emotion recognition [F1-score 84.5%]