Fish Audio S2 - Real Expressive AI Voices
We've open-sourced Fish Audio S2, a new generation of expressive TTS that lets you direct voices with natural language. Add cues like [whisper] or [laughing nervously], generate multi-speaker dialogue in one pass, and create scary-real voices across 80+ languages.


Replies
Fish Audio
Hi our beloved PH!
[excited] [slightly nervous]
Today we’re launching Fish Audio S2, our new text-to-speech model.
[long pause]
Hear Fish S2 Read This!
This is a big step beyond S1, redefining expressive voice AI. Write emotion cues anywhere in the text and hear the speech flow exactly how [emphasis] YOU direct it.
And, [inhale] we’re open-sourcing all of it.
GitHub: https://github.com/fishaudio/fish-speech/
HuggingFace: https://huggingface.co/fishaudio/s2-pro/
Shout out to SGLang for powering our stack.
There’s much more to S2.
Try it yourself now: https://fish.audio/s2/
As always, we want to give back to the community. For the launch, we’re offering free generation credits and an exclusive 50% OFF promo code: PH-FishS2
Go build weird things with it :)
We’d love to hear what you make.
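For readers wondering how the inline cue markup is structured, here is a minimal sketch that separates bracketed cues from the text to be spoken. It is only an illustration, not Fish Audio's own parser: the helper name `split_cues` and the regular expression are hypothetical, and the model itself takes the cues directly in the input text.

```python
import re

# Hypothetical helper: NOT Fish Audio's parser, only an illustration of how
# bracketed cues such as [whisper] can sit inline with the spoken text.
CUE_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def split_cues(script: str) -> list[tuple[str, str]]:
    """Split a script into ("cue", ...) and ("text", ...) segments, in order."""
    segments = []
    pos = 0
    for match in CUE_PATTERN.finditer(script):
        # Text between the previous cue (or the start) and this cue.
        spoken = script[pos:match.start()].strip()
        if spoken:
            segments.append(("text", spoken))
        segments.append(("cue", match.group(1).strip()))
        pos = match.end()
    # Any text left after the last cue.
    tail = script[pos:].strip()
    if tail:
        segments.append(("text", tail))
    return segments


if __name__ == "__main__":
    for kind, value in split_cues("And, [inhale] we're open-sourcing all of it."):
        print(f"{kind:>4}: {value}")
```

A script like this can be handy for checking which cues a draft contains before sending it off for generation.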
Fish Audio
@hehe6z incredibly proud of this one, amazing job team!
Fish Audio
@rissa_cao teamwork 👾
Vozo AI — Video localization
@hehe6z Congrats on the launch! Curious how you see Fish Audio compared with ElevenLabs — what do you think are the biggest advantages or differences today?
@hehe6z Best AI tool I've ever found for my work.
Calling Clones
Can I use this in a Raspberry Pi voice assistant that I have at home?
What about voice cloning for use in phone calls?
ElevenLabs is not that good... (or I don't know how to set it up)
Fish Audio
@javierfandos Hi Javi, this is a great point - yes, you absolutely can! For example, Home Assistant has direct Fish Audio support; you can check out the deets here: https://www.home-assistant.io/integrations/fish_audio/ Voice cloning is also one of the flagship features our users love because of the extreme realism :)
Calling Clones
I'm launching something soon! I need to find something! Will take a look! Thanks!
Fish Audio
@javierfandos that's awesome looking forward to your launch!!
Calling Clones
@hehe6z WOW! Just cloned my voice. It's actually better than ElevenLabs!
Big Fish Audio fans for a long time, and we've watched the team always go above and beyond. Let's gooooo S2! Congrats on this launch.
Fish Audio
@kellyann3644 Thank you Kelly for the long time support. We appreciate you so much <3
Klariqo AI Voice Assistants
Oh my, this is mind-blowing. Does it support streaming when self-hosted?
Fish Audio
@ansh_deb Oh hey Ansh good to see you again!! Yes it surely does!
Klariqo AI Voice Assistants
@hehe6z That's amazing! Would love to give it a try soon!
Fish Audio
@ansh_deb let me know if we can support with anything!
How does Fish Audio maintain consistent emotional prosody and rhythmic nuance across long-form content, and what specific architectural improvements over So-VITS-SVC allow for such high-fidelity cloning from only 10 seconds of source audio?
Fish Audio
@mordrag Great question, Denis! S2 moves beyond systems like So-VITS-SVC: it generates speech with a large speech-language model that operates on discrete audio tokens, which lets it maintain those traits over long passages. Because S2 is heavily pretrained on large-scale speech data, the reference clip mainly anchors speaker identity and style, so it can clone voices extremely well from just 15 seconds of sample audio.
Cue
This is a big unlock for anyone building voice-driven products. Directing voices with natural language cues like [whisper] or [laughing nervously] instead of fiddling with sliders is so much more intuitive. Love that it's open source too. What languages are you seeing the most community demand for?
Fish Audio
@dparrelli Besides English, a lot of Spanish, Chinese, and Japanese! Thank you for your support, David!
HakkoAI
exactly what we need, gonna try it now
Fish Audio
Excited to see the new version coming! Will it support any new languages?
Fish Audio
@vladimir_osipov Thank you Vladimir! Yeah the language support has expanded significantly compared to S1. S2 Pro supports 80+ languages.
Tier 1: Japanese (ja), English (en), Chinese (zh)
Tier 2: Korean (ko), Spanish (es), Portuguese (pt), Arabic (ar), Russian (ru), French (fr), German (de)
Other supported languages: sv, it, tr, no, nl, cy, eu, ca, da, gl, ta, hu, fi, pl, et, hi, la, ur, th, vi, jw, bn, yo, sl, cs, sw, nn, he, ms, uk, id, kk, bg, lv, my, tl, sk, ne, fa, af, el, bo, hr, ro, sn, mi, yi, am, be, km, is, az, sd, br, sq, ps, mn, ht, ml, sr, sa, te, ka, bs, pa, lt, kn, si, hy, mr, as, gu, fo, and more.
Adjust Page Brightness - Smart Control
This is gold, mate! Keep making more products like this.
Fish Audio
@kshitij_mishra4 thanks man!!
Flow GPT
Good job!
Fish Audio
@lifan_wang Thanks for your support Lifan! Hope you have fun trying it out, let us know your thoughts!