
Thoth - Private, local AI transcription for your Mac

by Matthieu V
Your Mac has more power than the Apollo 11 guidance computer. Why send private audio to the cloud? 🚀 As a Laser Physicist, I built Thoth to reclaim that power. It's a 100% native macOS transcriber for absolute privacy.

Why Thoth:
• On-Device: Whisper & LLMs run locally. No cloud, no data harvesting.
• Core Audio: Record Zoom/Teams directly without clunky drivers.
• SwiftUI: No Electron. Just pure speed.

Stop renting servers. Use your hardware. 🛡️


Replies

Matthieu V
Hi Product Hunt! 👋 I'm Matthieu, a Laser Physicist by day and indie dev by night.

I built Thoth because I was tired of the current AI landscape. We're being told we need the cloud for everything, yet we have these incredible M-series chips sitting on our desks. Sending a confidential business meeting to a third-party server just to get a summary felt like a massive step backward for privacy.

With Thoth, I wanted to prove that "Native + Local" is the ultimate setup:

Core Audio: This was the biggest challenge. I wanted users to be able to record both sides of their meetings (system audio and microphone) without installing annoying virtual audio drivers that break with every macOS update, and without a sketchy bot joining the call.

On-Device LLMs: Whether it's Whisper for transcription or Qwen/Llama for summaries, it all runs on your Neural Engine. You can optionally bring your own API key from major AI providers to translate, improve, or summarize your transcripts.

The "No-Electron" Manifesto: I spent months on the SwiftUI implementation to make sure it feels like a real Mac app. Fast, lightweight, and respectful of your RAM.

I'm here all day to chat about local AI, the struggles of Core Audio, or how to maintain a "math-first" approach to app development. Can't wait to hear your feedback! 🚀
Pratik Raj

The Core Audio approach for recording both sides of a call without virtual drivers is the part that actually matters. BlackHole breaks on every other macOS update and having a bot join your meeting just to record it feels wrong from a privacy standpoint.

Curious how the summarisation quality holds up with local Qwen/Llama vs cloud — that's usually where users end up reaching for the API key option anyway. Are most of your users running this on M-series chips, or are you seeing people try it on Intel Macs too?

Matthieu V

@pratikraj Thanks for your comment!

I wanted to use Apple's own SDKs for this from the start. A third-party audio driver felt like the wrong foundation for an app built around minimizing dependencies.
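For readers curious what driverless system-audio capture looks like with Apple's own SDKs: the sketch below uses the Core Audio process-tap API available since macOS 14.2. Matthieu mentions a newer macOS 26 API, so treat this as an illustration of the general approach rather than Thoth's actual code; the configuration choices (private tap, unmuted behavior) are my assumptions.

```swift
import CoreAudio
import Foundation

// Illustrative sketch of a driverless system-audio tap using the Core Audio
// process-tap API (macOS 14.2+). Hypothetical, not Thoth's implementation.
func makeSystemAudioTap() throws -> AudioObjectID {
    // A "global" tap with an empty exclusion list captures all system audio.
    let description = CATapDescription(stereoGlobalTapButExcludeProcesses: [])
    description.isPrivate = true        // tap not visible to other processes
    description.muteBehavior = .unmuted // keep audio audible while recording

    var tapID = AudioObjectID(kAudioObjectUnknown)
    let status = AudioHardwareCreateProcessTap(description, &tapID)
    guard status == noErr else {
        throw NSError(domain: NSOSStatusErrorDomain, code: Int(status))
    }
    // The tap can then back an aggregate device and feed an audio engine
    // alongside the microphone input.
    return tapID
}
```

The appeal over a virtual driver like BlackHole is that nothing is installed system-wide, so there is nothing for a macOS update to break.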

On summarization: I actually tested this morning with a real technical meeting recording (technical jargon, multiple speakers, dense acronyms). Gemma 4B finished in under a minute but hallucinated acronym expansions, inventing full names that sounded plausible but were wrong. Fine for a general standup, not great when the content is technical and the stakes are higher. Gemma 12B was slower but more accurate. Claude via BYOK handled the technical terminology correctly and produced the cleanest structured output.

Local models are solid for structured extraction on general meetings, but they start hallucinating on dense, domain-specific content. That's usually when people reach for an API key. Feeding them custom keywords (which Thoth supports) helps, but it doesn't work miracles.
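As a sketch of how custom keywords can anchor a small local model, one common technique is injecting a glossary into the summarization prompt so the model doesn't invent acronym expansions. The function name and prompt wording below are hypothetical, not Thoth's actual prompt:

```swift
import Foundation

// Hypothetical prompt builder: injects user-supplied keywords as a glossary
// so a local model treats them as fixed terms instead of guessing expansions.
func summaryPrompt(transcript: String, keywords: [String]) -> String {
    let glossary = keywords.isEmpty ? "" :
        "Known domain terms (use as-is, do not invent expansions): "
        + keywords.joined(separator: ", ") + "\n"
    return """
    Summarize the meeting transcript below.
    \(glossary)Transcript:
    \(transcript)
    Output: a concise summary followed by action items.
    """
}
```

This steers the model but, as noted above, it can't fully stop a 4B-parameter model from hallucinating on dense technical content.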

Thoth requires Apple Silicon, though not exactly by choice: the system audio API I use was introduced in macOS 26, which dropped Intel support entirely, so Intel Macs were gated out at the OS level before I had to make the call.

Base and Small Whisper run fine on an M1 (I'm on an M2 myself); local LLMs need at least 8 GB of unified memory.
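A minimal sketch of how an app might gate local LLM features on that 8 GB figure, using the installed physical memory as a proxy for unified memory. The function and policy are illustrative assumptions, not Thoth's actual code:

```swift
import Foundation

// Hypothetical gate: enable local LLM summaries only when the machine has
// enough unified memory. The 8 GB threshold comes from the comment above.
func canRunLocalLLM(minimumGB: UInt64 = 8) -> Bool {
    let installedBytes = ProcessInfo.processInfo.physicalMemory
    return installedBytes >= minimumGB * 1_073_741_824 // GiB in bytes
}
```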

Early days on user feedback, since I just launched today on PH and a few weeks ago on the App Store, hence the self-testing this morning!