Universal-Streaming delivers all the streaming speech-to-text voice agents need in one robust API: ultra-fast immutable transcripts, higher accuracy, built-in endpointing, and transparent pricing at $0.15/hour with unlimited concurrency.
Six months ago, we launched Universal-2 to tackle last-mile accuracy in speech recognition—today, we’re excited to introduce Universal-Streaming, our purpose-built, real-time speech-to-text model designed specifically for voice agents.
Universal-Streaming isn’t just fast. It’s built for what real-time apps need:
⚡ ~300ms immutable transcripts with no partial/final tradeoff
🧠 Intelligent endpointing that smooths out awkward pauses and interruptions
🔒 Accurate on the tokens that matter—emails, codes, names
🌎 Unlimited concurrency at just $0.15/hr with no surprise fees
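To make the "no partial/final tradeoff" point concrete, here is a minimal sketch of how client code might differ between a partial/final stream and an immutable one. The class names, method names, and message shapes are hypothetical illustrations, not the Universal-Streaming API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PartialFinalView:
    """Hypothetical client for a partial/final stream.

    Partials can be revised by later messages, so the client keeps
    overwriting its pending text until a final message arrives.
    """
    finalized: List[str] = field(default_factory=list)
    pending: str = ""

    def on_message(self, text: str, is_final: bool) -> None:
        if is_final:
            self.finalized.append(text)
            self.pending = ""
        else:
            # The previous partial is discarded and replaced.
            self.pending = text

    def text(self) -> str:
        parts = self.finalized + ([self.pending] if self.pending else [])
        return " ".join(parts)


@dataclass
class ImmutableView:
    """Hypothetical client for an immutable stream.

    Words never change once emitted, so the client only appends and
    downstream logic can act on each word as soon as it arrives.
    """
    words: List[str] = field(default_factory=list)

    def on_words(self, new_words: List[str]) -> None:
        self.words.extend(new_words)

    def text(self) -> str:
        return " ".join(self.words)


if __name__ == "__main__":
    mutable = PartialFinalView()
    mutable.on_message("my email", is_final=False)
    mutable.on_message("my email is", is_final=False)
    mutable.on_message("my email is ann at example dot com", is_final=True)
    print(mutable.text())

    immutable = ImmutableView()
    immutable.on_words(["my", "email"])
    immutable.on_words(["is"])
    print(immutable.text())
```

In the append-only case a voice agent can start acting on early words without waiting for a final message, which is the practical benefit the bullet above describes.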
And it’s not just about the benchmarks—developers building real-world voice agents are already seeing the difference: more natural interactions, higher task completion, and easier scaling from 5 to 50,000+ concurrent users.
Whether you're shipping AI assistants, real-time transcription tools, or something entirely new, Universal-Streaming gives you the power to build voice products your users will actually love to talk to.
We can’t wait to see what you build—try it out and tell us what you think!
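As a rough illustration of the $0.15/hour rate quoted above, the arithmetic below estimates spend at different levels of concurrency. It assumes billing scales linearly with total streamed hours and that no other fees apply; the function name and the one-hour session length are illustrative assumptions, not published billing rules.

```python
RATE_USD_PER_HOUR = 0.15  # rate quoted in the announcement


def estimated_cost_usd(concurrent_sessions: int,
                       hours_per_session: float,
                       rate_usd_per_hour: float = RATE_USD_PER_HOUR) -> float:
    """Estimate cost assuming a flat per-hour rate on total streamed hours."""
    if concurrent_sessions < 0 or hours_per_session < 0:
        raise ValueError("sessions and hours must be non-negative")
    total_hours = concurrent_sessions * hours_per_session
    return round(total_hours * rate_usd_per_hour, 2)


if __name__ == "__main__":
    # The two concurrency levels mentioned in the post, one hour each.
    for sessions in (5, 50_000):
        cost = estimated_cost_usd(sessions, 1.0)
        print(f"{sessions} sessions x 1 h: ${cost:,.2f}")
```

Under these assumptions, concurrency does not change the per-hour price; only the total number of streamed hours does.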
Replies
LanguaTalk
@devon_malloy congrats on the launch. Couple of Qs: 1) which languages does it support? 2) do you have benchmark data by language vs 11labs Scribe and Deepgram Nova-3? Couldn't find answers to either.
AssemblyAI
@alex_redfern The Universal-Streaming model is currently available in English. As for the benchmark data, we have a robust breakdown of the comparison with Nova-3. A few results worth noting are the 44% improvement in P50 latency, the 66% reduction in cost, and 21% better WER for alphanumerics. I will pass on the feedback for a Scribe benchmark. Thanks!
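For readers unfamiliar with the metric in the reply above, WER (word error rate) is conventionally the word-level edit distance between a reference and a hypothesis transcript, divided by the reference length. The sketch below is a generic implementation, not AssemblyAI's benchmark code, and it assumes "21% better" means a relative reduction in WER, which the reply does not specify. The sample numbers are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer is lower than baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    ref = "my code is A 1 B 2"
    hyp = "my code is A 1 B too"
    print(f"WER: {word_error_rate(ref, hyp):.3f}")
    # Made-up WER values chosen to illustrate a 21% relative reduction.
    print(f"Relative reduction: {relative_wer_reduction(0.100, 0.079):.0%}")
```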
Pokecut
🚀 AssemblyAI is a total game-changer for anyone building AI products with voice data! Their speech-to-text accuracy is seriously impressive, and the extra features like speaker detection, sentiment analysis, and chapter detection make it super powerful. 🗣️✨
I especially love the PII redaction—privacy is a huge concern, and AssemblyAI handles it seamlessly. Knowing top companies like Fireflies.ai and Glean trust this platform gives me a lot of confidence, too! 💪
If you’re looking to unlock the true potential of voice data for your product, AssemblyAI is a must-try. Excited to see how it keeps evolving! 🔥
AssemblyAI is a powerful platform for building AI products with voice data! By leveraging its industry-leading Speech AI models, you can integrate accurate speech-to-text, speaker detection, sentiment analysis, chapter detection, PII redaction, and more. I’m excited to see how it helps companies like Fireflies.ai, Glean, and Loop unlock the power of voice data and create top-tier products and experiences!
Hey, AssemblyAI Team,
I’ve been working on a fully offline, on-device streaming ASR engine for iOS.
Would love to get feedback or benchmark ideas!
If anyone's curious to try it on their device, I’m happy to add you to TestFlight — just DM me.
https://github.com/make1986/ios-streaming-asr-offline
I love your price point! This is great for so many types of people, and the productivity you get for the cost is amazing. I'd love to see how this resonates with your audience.