Fish Audio S1 is the most expressive and emotionally rich TTS model—creating lifelike voices that capture emotion, rhythm, and nuance. Clone any voice in 10 seconds, preserving accent, tone, and speaking habits with unmatched realism.
Replies
I tested at least 30 models for cloning voice...BUT this looks so legit, especially for emotion control. Looking forward to using it🚀🚀
Fish Audio
@jason_shen3 Thanks so much Jason! We hope you use fish audio for some cool stuff!
@zhizdev A random thought: can I clone someone else's voice using my own text instead of strictly following the text you provide? That would let me clone a voice when all I have is recorded audio files of that person.
Fish Audio
@zhizdev @jason_shen3 The text on the landing page voice clone demo is just a guide in case it's hard to come up with something to say, but you can essentially say anything according to your needs. We don't put a limit on voice clone slots right now, so you can test out a few different samples with different pace/emotion/tone and see the results for yourself :) It really depends on what you're trying to do with it.
Claap
Honestly this is impressive, the results are good! Can't wait to see in what directions this could be used.
Fish Audio
@seantiffonnet Thanks so much Sean! We really appreciate it!
Great voice product - congrats!!
Fish Audio
@yichen_guo1 Thanks so much! This means a lot!
Agnes AI
Few audio models produce high-quality voices with emotion, and I do hope Fish Audio could be the one. Will you also provide an end-to-end agent product for prosumers, or just API services?
Fish Audio
@cruise_chen Thanks so much for the support! Our immediate roadmap is to make voice better!
Fish Audio
@cruise_chen Our product for prosumers now includes a full story studio to make speech generation workflows a lot easier, e.g. audiobooks, video narration, etc. And we're adding a bunch of prosumer-focused features in the coming 2 months too :)
Propulseapp
Congrats
Fish Audio
@jules_camille_dore Thanks, appreciate it!
Triforce Todos
This is amazing! How do you handle different accents with just 10 seconds of audio?
Fish Audio
@abod_rehman Thanks so much for your support! Our model has been trained across multiple languages so it is able to capture accents as well as speaking behavior!
All-in-1 Coliving Guide by Elysian House
Congrats Helena and team on the launch! Incredible work behind the scenes and so thrilled to see this come to life! How can I help get more startups building with your TTS? And are conversational agents on the roadmap next?
Fish Audio
@michelle_fang2 thank you for being there for me all the time behind the scenes ♥️ Lemme cook harder and we'll make some noise together, haha! And yes, conversational agents are on the roadmap, and we're currently training a new model for them!
Ramp
insane.
Fish Audio
@hellyeah insanity
Good voice tech is usually extremely expensive. Nice to see a high-quality option that's actually affordable.
Fish Audio
@max_chang3 Thanks so much for your support Max!
Fish Audio
@max_chang3 thank you MAXIMILIAN
HabitGo
🎧 Just gave Fish Audio a spin, and wow, the emotional depth is next level. I’ve played around with other TTS tools before, but they often fall flat when it comes to tone and expressiveness. This one? Feels like it gets the soul of the voice. 😮‍💨
I’m especially impressed by the 10-second voice cloning — tested it with a friend’s audio snippet, and the result was uncanny.
Curious: how does it compare with commercial models like ElevenLabs in multilingual scenarios? Have you stress-tested accents or emotion transfer in other languages?
Massive props to the team behind So-VITS-SVC and Bert-VITS2 — open-source + this level of polish is rare. 🔥
Fish Audio
@kui_jason Thanks for giving us a spin! We support multiple languages by default, and they can be mixed within a sentence. We are working hard to get emotion right in more languages!
@zhizdev how do you mix multiple languages in a sentence? i've been trying to do that and burning through my credits. use case: travel videos in english narration but correctly saying French/Italian/Chinese attraction names. that's the only thing stopping me from getting the annual subscription!
Fish Audio
@zhizdev @owen_t Oh hmm that's a great point you're raising, right now it'd be hard cuz depending on the voice model you use it tries to adapt to the speaking patterns of that original accent, which is why S1 captures the voice traits so well. We're working on a newer model with more options to satisfy different use cases including the one you mentioned. Pls stay tuned!