Zac Zuo

Kimi-Audio - The universal open source model for audio AI

Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation.

Zac Zuo

Hi everyone!


Check out Kimi-Audio from Moonshot AI, an open-source project aiming for a universal audio foundation model.


This single 7B model is designed to handle many different audio tasks, such as ASR, audio Q&A, generation, sound classification, and even full speech-to-speech conversations. It shows strong performance across various benchmarks.


Importantly, they've released the model weights (Base and Instruct versions), the code, and a full evaluation toolkit called Kimi-Audio-Evalkit. That's the interesting part: it's all openly available for the community to use and build upon.