Voxtral Transcribe 2 by Mistral - Real-time speech-to-text with speaker diarization
Voxtral Transcribe 2 delivers ultra-fast, highly accurate speech-to-text with real-time transcription and speaker diarization. Built for live apps, voice agents, and meetings, it supports 13 languages, word-level timestamps, and privacy-first deployment, all at industry-leading speed and cost.



Replies
ZeroHuman.
@byalexai Congrats on the launch! How do you balance openness, such as open weights or edge deployment, with ensuring quality, safety, and consistency for enterprise customers?
minimalist phone: creating folders
That moment when Mistral thinks faster than I speak. :D
It was definitely time for Mistral to launch something SOTA! Awesome.
at $0.003/min, this basically kills the margin for a lot of transcription wrappers. curious if the "diarization" actually handles people talking over each other (cross-talk), or if it still gets confused?
Real-time transcription with speaker diarization is a game-changer for meetings and interviews. Does it support multiple languages and export to editable formats like Word or Google Docs?
Awesome! The speed is really great.
speaker diarization is always tricky. how does it perform with overlapping speech? and what's the latency for real-time use?
Speaker diarization is the feature that separates good transcription from great transcription! How many speakers can it reliably distinguish? And does the real-time aspect work well for live meetings or is there noticeable latency? Mistral's been shipping quality models consistently.