
Ollama
The easiest way to run large language models locally
5.0 • 26 reviews • 1.8K followers
Run Llama 2 and other models on macOS, with Windows and Linux coming soon. Customize and create your own.
This is the 4th launch from Ollama.
Ollama v0.19
Launched this week
Ollama v0.19 rebuilds Apple Silicon inference on top of MLX, bringing much faster local performance for coding and agent workflows. It also adds NVFP4 support and smarter cache reuse, snapshots, and eviction for more responsive sessions.
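
For readers new to Ollama, here is a minimal sketch of what a local session looks like against Ollama's HTTP API (served on localhost:11434 by default). The model name and the keep_alive value below are illustrative choices, not part of the release notes; keep_alive controls how long a loaded model stays resident after a request, which is the setting most directly related to the cache-reuse and eviction behavior this release improves.

```python
import json
import requests

# Ollama serves an HTTP API on localhost:11434 by default.
OLLAMA_URL = "http://localhost:11434/api/generate"

def generate(prompt: str, model: str = "llama2") -> str:
    """Stream a completion from a locally running Ollama model."""
    resp = requests.post(
        OLLAMA_URL,
        json={
            "model": model,
            "prompt": prompt,
            # How long the loaded model stays cached after this request;
            # illustrative value, tune it to your session patterns.
            "keep_alive": "5m",
        },
        stream=True,
        timeout=120,
    )
    resp.raise_for_status()

    # The API streams newline-delimited JSON chunks; each carries a
    # "response" fragment, and the final chunk sets "done": true.
    chunks = []
    for line in resp.iter_lines():
        if line:
            data = json.loads(line)
            chunks.append(data.get("response", ""))
            if data.get("done"):
                break
    return "".join(chunks)

if __name__ == "__main__":
    print(generate("Why is the sky blue?"))
```

From the terminal, the same round trip is just `ollama run llama2`; the API sketch is more useful when wiring Ollama into the coding and agent workflows the release notes mention.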
This is huge for local-first AI workflows. Curious how much real-world speedup people are seeing on M-series chips.
Will have to try this out, as a previous version totally bogged down my 16GB Mac mini.