Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
The native multimodal model from the Qwen3 series is here. My main focus has been on native voice capabilities, and this model is very impressive.
According to the official benchmarks, its performance in ASR, audio understanding, and voice conversation is on par with Google's Gemini 2.5 Pro. It also supports 119 languages.
You can experience the model's capabilities right now on Qwen Chat by enabling the voice (or video) mode.
Whether I’m using it on my phone or tablet, the app adapts perfectly to different screen sizes—no awkward formatting issues at all.