kyutai

moshi Public

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 9.8k 915

pocket-tts Public

A TTS that fits in your CPU (and pocket)

Python 3.6k 401

delayed-streams-modeling Public

Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework.

Python 2.9k 300

hibiki Public

Hibiki is a model for streaming speech translation (also known as simultaneous translation). Unlike offline translation, where one waits for the end of the source utterance to start translating, H…

Rust 1.4k 111

unmute Public

Make text LLMs listen and speak

Python 1.2k 215

moshi-finetune Public

Python 411 61
