Why we chose OpenAI TTS for NoiseCut's voices (and what's next)

by Luciano Cukar

When your core feature is a daily audio briefing, voice quality IS the product. Here's how I thought about it.


Why OpenAI TTS

I tested several providers. OpenAI won for English: natural pacing, good intonation, and smooth transitions between topics in a briefing. A typical 5-7 minute Signal costs ~$0.03–0.05 to generate. At a few hundred users, that's fine. At 50K users (I can dream), that math changes fast.
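To make "that math changes fast" concrete, here is a rough back-of-the-envelope sketch. The per-Signal cost range comes from the figures above. Everything else is an assumption for illustration: one freshly generated Signal per user per day, a 30-day month, 300 as a stand-in for "a few hundred users", and no caching or sharing of audio between users. The function name is made up for this example.

```python
# Per-Signal generation cost range quoted above (USD).
COST_LOW, COST_HIGH = 0.03, 0.05


def monthly_tts_cost(users, signals_per_user_per_day=1, days=30):
    """Return (low, high) estimated monthly TTS spend in USD.

    Assumption: every user gets a freshly generated Signal each day,
    with no caching or sharing of audio between users.
    """
    signals = users * signals_per_user_per_day * days
    return signals * COST_LOW, signals * COST_HIGH


# 300 is an illustrative stand-in for "a few hundred users".
for users in (300, 50_000):
    low, high = monthly_tts_cost(users)
    print(f"{users:>6} users: ${low:,.0f} to ${high:,.0f} per month")
```

Under those assumptions, 300 users comes to roughly $270 to $450 a month, while 50K users comes to roughly $45,000 to $75,000 a month.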

The tradeoff: OpenAI is weaker in other languages, there are only a handful of voice options, and there's no customization of tone or personality.


What changes at scale

For non-English languages, ElevenLabs is probably the leading candidate. Their multilingual models are significantly better than OpenAI's for Spanish, French, German, and Portuguese. But they're also much more expensive.

For cost optimization, the open-source TTS space is moving incredibly fast. Kokoro already delivers quality close to commercial models at a fraction of the cost. Fish Speech is leading multilingual benchmarks. And Chatterbox by Resemble AI offers production-grade voice cloning under the MIT license.

If/when NoiseCut scales, self-hosted open-source is the way to go.
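Put together, the plan in this section is basically a routing rule. Here's a minimal sketch of how it could look. This is not how NoiseCut works today: the function name, the backend labels, and the threshold are all hypothetical, and the 50K default is just the dream number from earlier used as a placeholder.

```python
def pick_tts_backend(language, active_users, self_host_threshold=50_000):
    """Return a backend label for one briefing.

    Hypothetical routing rule: the labels and the threshold are
    placeholders for illustration, not real configuration.
    """
    # At scale, per-request API pricing dominates, so move to self-hosted models.
    if active_users >= self_host_threshold:
        return "self-hosted-open-source"
    # Below that, English stays on OpenAI.
    if language.lower().startswith("en"):
        return "openai"
    # Other languages go to the multilingual candidate.
    return "elevenlabs"


print(pick_tts_backend("en-US", 300))
print(pick_tts_backend("pt-BR", 300))
print(pick_tts_backend("en-US", 60_000))
```

A real version would probably also need a quality check per language before moving anyone to a self-hosted model, since voice quality matters more here than cost.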


Nobody builds a daily habit around a voice they don't enjoy. We'd rather pay more per user now and keep people coming back than optimize costs and lose them.

Curious to hear from anyone working with TTS at scale — what's been your experience? And if you tried NoiseCut, how does the voice feel to you?
