Walkie - Free local AI voice dictation tool

by Adam Perlis
Voice control your entire Mac. Walkie transcribes your speech instantly, strips the filler, and formats your text, then goes further. Open apps, launch URLs, and control your workflow without touching the keyboard. Fast Mode handles it all in the cloud. Local Mode keeps everything on-device. Your voice, your rules.

Adam Perlis
I got fed up with Wispr Flow charging for some basic services and decided to build a better version I could offer for free. Now we offer everything Wispr Flow does for free and only charge for the parts where we are truly innovating or that cost us money.
Razib Ahmed

@adam_perlis I'm sorry to say, your app doesn't support the Bangla language at all. Bengali is my mother tongue; it has almost 300,000,000 speakers and is one of the top 5-6 languages on earth. I tried the free version and could not type a single sentence.

What did I do next? I changed the language and worked in English. Well, I have to say that your level of accuracy is great. I don't have any complaint about it. As you can see, I have typed this using your app Walkie and am presenting it here. But the main problem is that in the free version, it takes 30 to 40 seconds to transcribe only 100 words.

This is frustrating, and there are plenty of other options available. So I suggest that you take care of this matter if you really want to get a decent number of users.

My OS: Windows 11 and 16GB RAM

Adam Perlis

@razib_ahmed4 thanks for the thorough review and feedback. While I can't say when we will be able to support Bengali in a way that is usable, what I can share is that we give options for nearly all the top voice transcription models. I suggest you download several, try them out, and see what works best for you.

Adam Perlis

@razib_ahmed4 regarding your performance issue: we are releasing a patch right now that should resolve it. Keep us posted if you notice any changes.

Razib Ahmed

@adam_perlis Then I tried the paid or fast version and at first I typed some words in English language. The speed is fair enough, I have no complaint about it and I think that the paid version works quite well as far as speed is concerned. However, in your description you have pointed out that you wanted to offer a free alternative but I think that you need some catching up to do.

And in the fast or paid version, the accuracy level is quite good, great.

Then I tried my mother tongue, Bengali. The level of accuracy is not good. It made too many errors. The speed for Bengali is fine, but as I stated earlier, it is unworkable because of the errors.

Overall, as a new product, it is good and I wish you all the success. Thanks a lot for coming up with a good app for speech-to-text or speech recognition. I am hopeful that in the coming months we will see a lot of developments from you.

swati paliwal

@adam_perlis What's the biggest frustration you had with Wispr Flow's paid basics that inspired this free alternative, and how did fixing it change your own workflow?

Prateek kumar

Much-needed Wispr Flow alternative. I like the idea of trigger phrases, as it prevents dictating the same thing over and over.

Hopefully it will be able to handle accent diversity.

Adam Perlis

@prateek_kumar28 I have found that if you try it a few times with the same word and just keep correcting it, or add the correction manually in the Walkie settings, it usually nails it.
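
Editor's sketch: the manual-correction idea in the reply above can be pictured as a small lookup table applied to the transcript after transcription. This is only an illustration under stated assumptions; the function name `apply_corrections` and the dictionary format are hypothetical, and nothing here describes how Walkie actually stores or applies corrections.

```python
import re


def apply_corrections(text: str, corrections: dict) -> str:
    """Replace each misheard word with the user's preferred spelling.

    `corrections` maps a misheard form to its replacement (hypothetical format).
    Matching is case-insensitive and limited to whole words.
    """
    for wrong, right in corrections.items():
        pattern = re.compile(rf"\b{re.escape(wrong)}\b", flags=re.IGNORECASE)
        # A function replacement avoids treating backslashes in `right` as escapes.
        text = pattern.sub(lambda _match, r=right: r, text)
    return text


if __name__ == "__main__":
    user_corrections = {"waki": "Walkie"}
    print(apply_corrections("I typed this with waki today", user_corrections))
```

A table like this only fixes words the recognizer gets wrong in the same way each time; accent-dependent errors that vary from one attempt to the next would need support from the model itself.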

Rustam Khasanov

Hi Adam, congrats on your launch!

Wispr is charging for first-class STT quality, good UX, and transcription speed. These things cost money. Could you please share your plans on how to offer this for free?

I'm facing the same tasks in my product and I am open to discussing this!


Adam Perlis

@rustam_khasanov while I am not able to share our trade secrets with a competitor, haha, I will say that we did lots of research to figure this out.

Rustam Khasanov

@adam_perlis anyway wish you luck and once again congrats!

We both are building essential products:)

Daniel L

@adam_perlis Would be useful for us lay people to know tho :)

blink twice if you're selling our data 😅

Pallavi

Does it have multi-language support, such as Hindi?

Adam Perlis

@pal_gai Yes any language supported here: https://whisper-api.com/docs/languages/

Elena Nimchenko

Love this mission. It’s not just about AI hype — it’s about practical creation and real output. Excited to dive in and connect with fellow creators!

Yana Kazantseva

Love this! feels like Gen AI builders really need a space that’s more focused than traditional networks. The “build together” angle and emphasis on actually shipping things is especially strong.

Excited to see communities forming around people who are not just exploring AI, but actively creating with it.

We also launched today on Product Hunt — building Ogoron, an AI system that automatically generates and maintains test coverage as products evolve. Different layer, but same mindset of helping builders move faster and with more confidence :)

Good luck with the launch!

Fraser Wiseman

the dual-mode approach is genuinely smart. most speech-to-text tools make you pick a lane: either cloud quality or local privacy. having both in one place covers different use cases without switching apps. i've used a few Whisper-based tools and local accuracy is better than people expect now.

curious what the "formatting" in Fast Mode actually does in practice. is it punctuation and paragraph cleanup, or does it restructure the content more meaningfully? that feels like the bit that separates quick voice notes from something you could actually send without editing.

Adam Perlis

@fraser_svg both modes have some level of formatting. On the local side you get basic formatting. But in Fast Mode you get a bit more thoughtfulness. For example, let's say you said "Let's meet for dinner at 7pm or actually 8pm." It would autocorrect to "Let's meet for dinner at 8pm." There are many scenarios where it's just more intelligent about how it formats. It's also context-aware, so it knows the app you're in and adjusts accordingly, for example in an email or in Slack.
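
Editor's sketch: the "7pm or actually 8pm" example above can be mimicked with a toy rule that drops the word before a correction phrase and keeps the word after it. This is a deliberately naive illustration, not Walkie's method; the marker phrases and the function name `collapse_self_corrections` are assumptions, and a real product would more plausibly rely on a language model than on a regular expression.

```python
import re

# Hypothetical correction phrases; only single-word corrections are handled.
_SELF_CORRECTION = re.compile(
    r"(?P<old>\S+)\s+(?:or actually|no wait|i mean)\s+(?P<new>\S+)",
    flags=re.IGNORECASE,
)


def collapse_self_corrections(text: str) -> str:
    """Rewrite '<old> <marker> <new>' as '<new>', leaving other text untouched."""
    return _SELF_CORRECTION.sub(lambda m: m.group("new"), text)


if __name__ == "__main__":
    spoken = "Let's meet for dinner at 7pm or actually 8pm."
    print(collapse_self_corrections(spoken))
```

The rule breaks down as soon as the correction spans more than one word ("at the office or actually at my place"), which is one reason this kind of cleanup is usually left to a model that understands the sentence.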

Fraser Wiseman

@adam_perlis ahh ok so it's not just cleaning up punctuation, it's actually catching the "wait no i meant 8pm" stuff before it sends. That's cool. Does it pick up on tone too? Like keeping slack casual but making an email more put together?

Adam Perlis

@fraser_svg yup you can define that in the settings.

André J

Any recommendations for keeping my voice clean? All this talking is making it scrangly 😅 Maybe an adjacent business opportunity? :D

Adam Perlis

@conduit_design what do you mean by clean?

André J

@adam_perlis Voice gets broken when you talk a lot. Vibe code side effect 😬

Adam Perlis

@conduit_design Thanks for sharing, but I am still not following your feedback. Do you mind elaborating more? What exactly breaks about the voice when you talk a lot? If you can provide a concrete example, we can troubleshoot the issue. It could be a number of things, including your machine specs, having a certain feature toggled on, running too many programs at the same time, etc.

André J

@adam_perlis No, this is a human thing, not a machine thing. A human can speak about 10k words before their voice gets brittle; our vocal cords are vibrating at very high frequencies, etc. A very active day of coding often involves writing 10k words to an AI. That's the tension. You can often see it in vibe code videos of devs who talk to their AI: their voice is gone. Especially with people who don't talk a lot normally. They speak far fewer than 10k words a day, maybe no more than 1k. But there are things you can do, like a 10 min rest every 50 min. Also maybe analysing brittleness with AI, so you get feedback when it's time to stop talking. And also stats: "You've spoken 10k words today", etc.
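
Editor's sketch: the stats and rest-reminder idea above could be prototyped as a counter that is updated after every dictation. The class name `VoiceUsageTracker`, its fields, and the alert strings are hypothetical; the 10k-word and 50-minute thresholds are simply the figures quoted in the comment, not validated health guidance, and this is not an existing Walkie feature.

```python
from dataclasses import dataclass


@dataclass
class VoiceUsageTracker:
    """Track words spoken today and minutes of talking since the last rest."""

    daily_word_limit: int = 10_000        # figure quoted in the comment
    talk_minutes_before_rest: float = 50  # "10 min rest every 50 min"
    words_today: int = 0
    talk_minutes_since_rest: float = 0.0

    def record(self, transcript: str, minutes_spoken: float) -> list:
        """Add one dictation and return any alerts that now apply."""
        self.words_today += len(transcript.split())
        self.talk_minutes_since_rest += minutes_spoken
        alerts = []
        if self.talk_minutes_since_rest >= self.talk_minutes_before_rest:
            alerts.append("rest")
        if self.words_today >= self.daily_word_limit:
            alerts.append("daily_limit")
        return alerts

    def rest(self) -> None:
        """Call after a break to restart the talk timer."""
        self.talk_minutes_since_rest = 0.0


if __name__ == "__main__":
    tracker = VoiceUsageTracker()
    print(tracker.record("open the editor and run the tests", minutes_spoken=0.2))
    print(tracker.words_today)
```

Counting words in the transcript is a cheap proxy; detecting actual vocal strain, as the comment suggests, would require analysing the audio itself.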

Kiyoshi Nagahama

The dual-mode approach is smart. I'm building a Mac-native video editor that uses cloud transcription for accuracy, but I'd love to offer a local option for users who don't want to send audio off-device. How does Local Mode handle longer recordings — say 60+ minutes of a lecture with technical terminology? And how's the Japanese accuracy in Local Mode?

Adam Perlis

@cyberseeds I have not tried anything that long, but if you want to test it and send me the results, I would love that! TBH it's likely a better idea to use a different tool for that purpose, one that records first and transcribes later. Seems like an edge case, but if there are more people who want this we could probably figure it out.

Dmytro Krutytskyi

Nice one, team. It's great to see a solid, free alternative in this space. Good luck with the launch!
