Kokori TTS app for macOS: Transform text to speech with a powerful local API server & desktop app. High-quality voices, speed control, and seamless menubar integration.
Kokori came out of burning through ElevenLabs credits while developing projects locally. I did not want to pay every time I tested that TTS generation worked in my app before moving over to a paid, production-ready solution.
Kokori runs a local API server that accepts text, voice and speed parameters and responds with an audio file. It also offers a desktop application UI to manually generate TTS audio files, view detailed logs, browse previously generated audio files and manage settings.
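For anyone wondering what calling the local server from app code might look like, here is a minimal sketch. The description above only says the server takes text, voice and speed and returns an audio file, so the URL, port, path, JSON field names and the voice name below are all assumptions made for illustration, not Kokori's documented API. Check the app's settings for the real values.

```python
import json
import urllib.request

# Hypothetical endpoint: the port and path are placeholders, not taken from Kokori's docs.
KOKORI_URL = "http://localhost:8080/tts"


def build_tts_request(text, voice, speed=1.0, url=KOKORI_URL):
    """Build a POST request carrying the text, voice and speed parameters.

    The JSON field names ("text", "voice", "speed") are assumed for this sketch.
    """
    if not text:
        raise ValueError("text must not be empty")
    if speed <= 0:
        raise ValueError("speed must be positive")
    payload = json.dumps({"text": text, "voice": voice, "speed": speed}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(text, voice, speed, out_path, url=KOKORI_URL):
    """Send the request to the local server and save the returned audio bytes."""
    request = build_tts_request(text, voice, speed, url)
    with urllib.request.urlopen(request, timeout=30) as response:
        audio = response.read()
    with open(out_path, "wb") as audio_file:
        audio_file.write(audio)
    return out_path
```

With the desktop app running, something like `synthesize_to_file("Hello from dev", "example_voice", 1.0, "hello.audio")` would write the response to disk. Keeping the URL as a parameter makes it easy to point the same integration code at a paid provider later.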
Replies
TourGuideJS
@jack_callow Lovely product! What specific pain point made you build Kokori instead of relying on macOS’s built-in voices?
That is a cool idea! Are you planning to add more voices (German, for example)?
Congrats on the launch!
TourGuideJS
Hi @ksenia_sh
The underlying TTS model does not support German dialect/accent voices. In the future, if the model improves or new versions are released, I can update the app.
@jack_callow I see, thanks for the info!
Every time I stub out a TTS call during dev, I either mock it or burn real credits. Kokori running a local API server with the same text/voice/speed params means I can iterate on the integration logic without touching ElevenLabs until I'm ready for prod.
TourGuideJS
Hi @piroune_balachandran
That's the exact problem Kokori aims to solve. Hope you find it useful!
ConnectMachine
What is the underlying technology or API being used for converting from text to speech?
TourGuideJS
Hi @syed_shayanur_rahman
Under the hood it's using an open source model with a similar name.
I'll leave it ambiguous just to avoid clones of the app, but a search on Hugging Face should find it.