Popular repositories
- delayed-streams-modeling Public
Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework.
Repositories
Showing 10 of 23 repositories
- moshi Public
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
- sphn Public
python bindings for symphonia/opus - read various audio formats from python and write opus files
- ARC-Encoder Public
- casa Public
A vision-language model with an improved cross-attention mechanism for scalable streaming inference
- tts_longeval Public
- delayed-streams-modeling Public
Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework.