Octave TTS - Describe any AI voice and prompt its emotional delivery
The first LLM for text-to-speech. While other TTS just “reads” words, Octave grasps their meaning. Create any AI voice with a descriptive prompt, guide its emotional delivery (angrier! more sarcasm!), and bring your stories to life with human-like expression.



Replies
Hume AI
Hey Product Hunt! I’m Alan Cowen, CEO and Chief Scientist at Hume AI.
We're launching Octave, the first of a new generation of text-to-speech models. Traditional TTS models focus on the mechanical process of turning letters into sounds. Octave isn't a traditional TTS model, but a voice-enabled LLM, trained on 1000x more language. As a result, it understands the cognitive and emotional aspects of human speech. It reads your script like a human actor, delivering realistic emotions, sarcasm, pace, word emphasis, and more.
And unlike any other TTS system, it can take explicit instructions to generate any voice you describe and modify its emotional tone and speaking style.
Octave is made possible by Hume's research. We're leading the space in voice-enabled LLMs, and we run large-scale psychology studies to help fine-tune our models to generate the right voices at the right time, drawing on a decade of research at the intersection of emotion science and AI.
We’re launching both a platform for creators and an API for developers. We're also launching the Expressive TTS Arena (arena.hume.ai)—a new public benchmark for evaluating emotion-rich, long-form speech generation with instructions.
Ready to try it?
Try Octave: hume.ai
Join our Discord: https://link.hume.ai/discord
Follow our updates: x.com/hume_ai
I’ll be here all day to answer questions and discuss how this technology evolved from our emotion research. Thank you for checking out Octave!
Hume AI
@masump Thank you, Masum!
Lancepilot
Can you describe how Octave's ability to understand the cognitive and emotional aspects of human speech improves its text-to-speech output?
Hume AI
@odeth_negapatan1 When humans speak, they’re actually using a lot of intelligence to predict word emphases, emotional intonations, pacing, and other speaking styles. That’s what separates human speech from past TTS models. Octave brings that intelligence to text to speech for the first time. It implicitly predicts when things are novel, funny, sarcastic, resentful, etc., and adapts its voice accordingly to deliver that text just as a human would.
Spiritory
What sets Octave apart from traditional TTS models, and how does its training on 1000x more language enhance its performance? Congratulations!
Hume AI
@andy_wong4 Hey Andy! Octave has the intelligence of a cutting-edge language model, whereas traditional TTS models are trained on a lot less data and don’t understand the contextual relationship between words and vocal sounds. This makes Octave better at predicting how a sentence should sound, as if a skilled actor were reading it. Try giving it really expressive text with no prompt and see what voice it generates, or give it acting instructions like “speak in an angry tone” or “whisper” (with appropriate text)!
DiffSense
🥰 Oh this is so cool! When I use 11labs or OpenAI's voice synths, I usually have to record many takes and then remix snippets to get the right tonality and feel. 11labs, please buy this company 🙏
Hume AI
@sentry_co This is such a common frustration – and exactly the pain point we wanted to solve with Octave ;)
Hume AI
@sentry_co Thanks for the support! ✨
DiffSense
I asked the CEO of 11labs about this problem the other day in an 11labs AMA on the PH forums. He forgot to reply 😅. So I guess they also know about this pain point.
Now it’s time to use a more human-like product and see how it replies to my questions on all facets of life, from professional to personal.
Hume AI
@ajay27324 Thanks Ajay! Enabling AI to engage in richer, more human-like speech, communication, and understanding of our expressions is one of our main goals. Excited to hear how you use Octave!
Hume AI
@ajay27324 Awesome! Let us know what you think. Thanks Ajay :)
I trained a TTS system myself last year and I am 100% amazed by how good Octave sounds! :)
Hume AI
@jpc Thanks Jakub! Means a lot coming from you.
Ovren
Congrats on the launch, Alan and the Hume AI team! Octave’s emotional TTS is a game-changer for content creation. Excited to see where this goes! 🎉
Hume AI
@mikita_aliaksandrovich Hey Mikita! Thanks so much. What kinds of new content do you foresee Octave enabling?
Hume AI
@mikita_aliaksandrovich Thank you!
Love the emotional control feature. Huge leap forward for voice tech!
Congratulations on launching Octave TTS! It's impressive how it goes beyond standard text-to-speech by understanding context and emotion.
How do you ensure the emotional delivery aligns with user intent, and what measures are in place for continuous improvement in voice accuracy?
Hume AI
@shea2 Thanks for the kind words, Shea! We ensure contextually sensitive emotional delivery and expressiveness by heavily weighting feedback, evaluations, and ratings from real humans in our data collection and training process. All of our products are built on methods and approaches from emotion science, in particular the science of how we express ourselves. We plan to continue improving the accuracy and nuance of Octave with further iteration on this approach!
yoooo this is really sick!! i think this is going to have a big impact on independent storytellers and videographers
Hume AI
@catapultingcupcakes Thanks, Elle! We are also really excited to see how this enables new workflows and possibilities for storytellers.
Hume AI
@catapultingcupcakes Thank you! We appreciate you taking the time to check us out.