Gemma offers flexibility: 4B for balanced performance, 12B for
maximum quality. Both run entirely on-device with excellent
instruction-following for structured summaries and action items.

Ministral 3B (1.9 GB) is the fastest on-device LLM in Thoth.
It runs smoothly even on base M1 machines, which suits users who
prioritize speed. Its open license permits fully local deployment.

Llama 3.2 3B balances speed and quality. At 2.3 GB, it is fast
enough for real-time summaries while keeping output quality high.
It is optimized for Apple Silicon and CoreML.

Whisper runs entirely on-device using Apple's CoreML and Neural Engine.
It is open source, works offline, and delivers excellent accuracy
across 99 languages without sending data to cloud servers,
which fits Thoth's privacy-first philosophy.