LABNet: A Lightweight Attentive Beamforming Network for Ad-hoc Multichannel Microphone Invariant Real-Time Speech Enhancement
This is the official implementation of LABNet.
This work has been extended and submitted to IEEE Trans. Audio, Speech, Lang. Process.(TASLPRO), under the title “A Collaborative Neural Speech Enhancement Framework for Distributed Microphone Arrays”(under review).
Multichannel speech enhancement (SE) aims to restore clean speech from noisy measurements by leveraging spatiotemporal signal features. In ad-hoc array conditions, microphone invariance (MI) requires systems to handle different microphone numbers and array geometries. From a practical perspective, multichannel recordings inevitably increase the computational burden for edge-device applications, highlighting the necessity of lightweight and efficient deployments. In this work, we propose a lightweight attentive beamforming network (LABNet) to integrate MI in a low-complexity real-time SE system. We design a three-stage framework for efficient intra-channel modeling and inter-channel interaction. A cross-channel attention module is developed to aggregate features from each channel selectively. Experimental results demonstrate our LABNet achieves impressive performance with ultra-light resource overhead while maintaining the MI, indicating great potential for ad-hoc array processing.
python train.pypython test.pyIf you have any questions, please feel free to contact us at: hyyan@mail.ustc.edu.cn or cqjiang@mail.ustc.edu.cn