Describe the feature
Add support for I2S microphone input and speaker output, for devices like ReSpeaker XVF3800.
- This would enable a Voice Command Channel in Mimiclaw, including:
  - I2S audio backend, for recording and playback
  - ASR path, to convert voice commands to messages and send them to the agent
  - TTS path, to take the agent's replies and convert them to speech
- The XMOS DSP already performs audio front-end processing such as acoustic echo cancellation (AEC), beamforming, noise suppression, and automatic gain control (AGC)
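
To make the proposed split concrete, here is a minimal sketch of how the three pieces listed above could fit together. Every name in it (`audio_backend_t`, `voice_hooks_t`, `voice_channel_turn`, the loopback backend, and the toy ASR/TTS/agent functions) is hypothetical and only for illustration. None of it is existing Mimiclaw or ESP-IDF API, and the toy codecs just map one character to one sample so the flow can be exercised without hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VC_MAX_SAMPLES 256
#define VC_MAX_TEXT    64

/* Hypothetical I2S audio backend: 16-bit PCM in, 16-bit PCM out. */
typedef struct {
    size_t (*record)(void *ctx, int16_t *buf, size_t max_samples);
    size_t (*play)(void *ctx, const int16_t *buf, size_t n);
    void *ctx;
} audio_backend_t;

/* Hypothetical ASR, agent, and TTS hooks; each returns 0 on "nothing". */
typedef struct {
    size_t (*asr)(const int16_t *pcm, size_t n, char *text, size_t cap);
    size_t (*agent)(const char *msg, char *reply, size_t cap);
    size_t (*tts)(const char *text, int16_t *pcm, size_t max_samples);
} voice_hooks_t;

/* One turn: record -> ASR -> agent -> TTS -> play. Returns samples played. */
size_t voice_channel_turn(const audio_backend_t *be, const voice_hooks_t *hk)
{
    assert(be && be->record && be->play && hk && hk->asr && hk->agent && hk->tts);

    int16_t pcm[VC_MAX_SAMPLES];
    char msg[VC_MAX_TEXT];
    char reply[VC_MAX_TEXT];

    size_t n = be->record(be->ctx, pcm, VC_MAX_SAMPLES);
    if (n == 0) return 0;                                   /* nothing captured */
    if (hk->asr(pcm, n, msg, sizeof msg) == 0) return 0;    /* nothing recognized */
    if (hk->agent(msg, reply, sizeof reply) == 0) return 0; /* no reply */
    n = hk->tts(reply, pcm, VC_MAX_SAMPLES);
    return be->play(be->ctx, pcm, n);
}

/* Stand-in backend: "records" a canned buffer and stores what was played. */
typedef struct {
    int16_t in[VC_MAX_SAMPLES];
    size_t in_len;
    int16_t out[VC_MAX_SAMPLES];
    size_t out_len;
} loopback_t;

static size_t loopback_record(void *ctx, int16_t *buf, size_t max_samples)
{
    loopback_t *lb = ctx;
    size_t n = lb->in_len < max_samples ? lb->in_len : max_samples;
    memcpy(buf, lb->in, n * sizeof *buf);
    return n;
}

static size_t loopback_play(void *ctx, const int16_t *buf, size_t n)
{
    loopback_t *lb = ctx;
    if (n > VC_MAX_SAMPLES) n = VC_MAX_SAMPLES;
    memcpy(lb->out, buf, n * sizeof *buf);
    lb->out_len = n;
    return n;
}

/* Toy ASR: treats each sample as one character (not real recognition). */
static size_t toy_asr(const int16_t *pcm, size_t n, char *text, size_t cap)
{
    if (cap == 0) return 0;
    size_t len = n < cap - 1 ? n : cap - 1;
    for (size_t i = 0; i < len; i++) text[i] = (char)pcm[i];
    text[len] = '\0';
    return len;
}

/* Toy TTS: emits one sample per character (not real synthesis). */
static size_t toy_tts(const char *text, int16_t *pcm, size_t max_samples)
{
    size_t len = strlen(text);
    if (len > max_samples) len = max_samples;
    for (size_t i = 0; i < len; i++) pcm[i] = (int16_t)text[i];
    return len;
}

/* Toy agent: echoes the message back as the reply. */
static size_t echo_agent(const char *msg, char *reply, size_t cap)
{
    if (cap == 0) return 0;
    size_t len = strlen(msg);
    if (len > cap - 1) len = cap - 1;
    memcpy(reply, msg, len);
    reply[len] = '\0';
    return len;
}
```

On real hardware, `record` and `play` would presumably wrap the ESP-IDF I2S driver (for example `i2s_channel_read` and `i2s_channel_write` in ESP-IDF v5), and the hooks would call the actual ASR and TTS services. Keeping the backend behind function pointers is one way to let the same channel logic run against other I2S devices or a test double.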
Motivation
I’m currently integrating Mimiclaw into the ReSpeaker XVF3800 microphone array. It is an ESP32-S3-based board with a 4-mic array and an XMOS DSP. This hardware is widely used in smart speaker and voice assistant scenarios (e.g. Home Assistant). It provides built-in audio front-end processing (AEC, beamforming, noise suppression, AGC), so Mimiclaw can receive a clean speech stream directly, which makes it a good fit for embedded voice assistants.
I have already implemented the I2S audio backend and the ASR path, so I can now send voice commands to the agent. My next step is to implement the TTS playback path, so that agent responses can be played through the speaker.
If useful, I’d be happy to contribute code or help test this feature on XVF3800 hardware.
Alternatives considered
No response