Looking beyond Unsloth’s open-source fine-tuning? Try
Ollama for running LLMs locally;
Evoke for simple cloud APIs;
Thunder Compute to self-host with ultra-cheap GPUs;
Mystic AI to deploy on your cloud or theirs; and
Banana.dev for serverless GPUs and fast inference.