Expressive Text-to-Speech and Voice Cloning

Fish Audio S1 - Expressive Voice Cloning and Text-to-Speech

Product Hunt

7mo ago

Fish Audio S1 is the most expressive and emotionally rich TTS model—creating lifelike voices that capture emotion, rhythm, and nuance. Clone any voice in 10 seconds, preserving accent, tone, and speaking habits with unmatched realism.

Replies

I tried a lot of options, but Fish Audio is the best, with very realistic voices, and the improvement continues. Thank you, Fish Audio, and keep innovating.

6mo ago

Fish Audio

Maker

@kenny_sacsara_aspajo Thank you so much for your kind words! :)

We're very glad to hear that you like the voices and that you notice the continuous improvement. We'll keep innovating so that every voice feels even more real and expressive.

6mo ago

I just hope the new Fish Audio can read text for up to 10 minutes without the voice speeding up.

6mo ago

Fish Audio

Maker

@egavas_79 It can read text for up to 10 minutes without speeding up! If it gets longer, to around 25+ minutes, some distortion starts to happen, though. We're actively working to improve this for the next-generation model!

6mo ago

Really great realism and completely UNCENSORED (unlike ElevenLabs or the other big "fishes" in the pond). I really recommend it! I use it for ASMR role plays and it's so much more realistic than the competition!

6mo ago

Fish Audio

Maker

@mastersystem1111 thank you!! and love the "fish" pun hahah

6mo ago

Congrats

6mo ago

UI Bakery

Love seeing such expressive TTS tech becoming accessible. Fantastic launch!

6mo ago

Fish Audio

Maker

@vladimir_lugovsky thank you so much Vladimir, we're working hard every day to make it even more accessible!

6mo ago

InsForge

Impressed by how expressive the voices are; the emotion sits in the pauses and timing. I dropped in a 10-second sample and it sounded surprisingly human, with little quirks that made it feel like a real person.

6mo ago

Fish Audio

Maker

@hanghuang thanks so much for your support! we really appreciate it!

6mo ago

I’m still learning how to use the website, so I’m not sure if you already have something like this in mind, but one feature I’d love to see is the ability to record my own voice and use that as a reference. What I mean is, I’d like the model to capture the cadence, tone, and emotional range of my voice, while still keeping the generated voice intact unless I choose to fully replace it. For example, if I wanted to add more emotion to a line, like sadness, excitement, or frustration, it would be great if the AI could analyze my sample and then mirror that same energy or inflection. Right now, some of the voices, even though they sound great, don’t always capture those subtle nuances or the emotional texture I’m aiming for.

It would also be helpful to have an option to upload a short voice sample, maybe a few seconds long, without having to go through complicated prompts or on-screen steps. That would make it much easier for people who can’t see what’s on the screen or find it hard to navigate the interface visually. Ideally, the system could take that single clean sample and let me adjust the tone, like making it slightly higher, softer, or deeper, while maintaining the original emotional feel. Maybe there could also be a toggle or slider for mood, so if I wanted to sound calmer or more intimate, I could easily tweak that. It would also be good to get a female and a male version of the voices directly on the main interface. And it’d be amazing if the system eventually allowed for accent customization too, like being able to choose between English, Scottish, or other regional accents while still reflecting my own speaking rhythm.

Basically, what I’m hoping for is a way to be a more active participant in the process, using my voice not just as text input but as an emotional guide for how I want the final result to sound.

6mo ago

Fish Audio

Maker

@timecrestlore Thank you for the really amazing input. I hear you on the accent customization part; that would be a really nice feature to have and we're looking into it.

As for using a short sample of your voice as a reference, having that be the emotional guide, and tweaking on top of it: that's actually the core functionality we currently provide, if I understood you correctly. You simply make a voice here: https://fish.audio/app/voice-cloning/ and then you can start using it in the playground: https://fish.audio/app/text-to-speech/

6mo ago

HabitGo

The voice quality is insanely realistic. Can’t believe it only needs 10 seconds of input!🔥

6mo ago

Capalyze

So excited to see a more affordable and powerful solution in the field of TTS, which is both imaginative and practical. Congratulations on the launch!

6mo ago

Fish Audio

Maker

@nextgennerd thank you so much Stan! Would love to have you test it and let us know what you think!

6mo ago
