
SelfHostLLM
Calculate the GPU memory you need for LLM inference
103 followers
Calculate GPU memory requirements and maximum concurrent requests for self-hosted LLM inference. Supports Llama, Qwen, DeepSeek, Mistral, and more. Plan your AI infrastructure efficiently.
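
The page doesn't show the calculator's exact formula, but a common back-of-envelope estimate splits GPU memory into two parts: the model weights (parameters × bytes per parameter, shrunk by quantization) and a per-request KV cache that grows with context length. Below is a minimal sketch of that idea in Python; the function names, the 2 GB overhead constant, and the example config are illustrative assumptions, not the tool's actual implementation:

```python
# Back-of-envelope GPU memory estimate (hypothetical; the calculator's
# exact formula may differ).

def estimate_memory_gb(
    params_b: float,          # model size in billions of parameters
    bytes_per_param: float,   # 2.0 for FP16, ~0.5 for 4-bit quantization
    num_layers: int,
    num_kv_heads: int,        # grouped-query attention: often < attention heads
    head_dim: int,
    context_len: int,
    kv_bytes: float = 2.0,    # FP16 KV cache
) -> tuple[float, float]:
    """Return (weight memory, KV-cache memory per request), both in GB."""
    weights_gb = params_b * 1e9 * bytes_per_param / 1024**3
    # K and V tensors per layer, per token, for the full context window
    kv_per_request_gb = (
        2 * num_layers * num_kv_heads * head_dim * context_len * kv_bytes / 1024**3
    )
    return weights_gb, kv_per_request_gb


def max_concurrent(gpu_mem_gb: float, weights_gb: float,
                   kv_per_request_gb: float, overhead_gb: float = 2.0) -> int:
    """Requests that fit after the weights and a fixed runtime overhead."""
    free_gb = gpu_mem_gb - weights_gb - overhead_gb
    return max(0, int(free_gb // kv_per_request_gb))


# Example: a Llama-3-8B-like config (32 layers, 8 KV heads, head_dim 128)
# at FP16 with an 8k context on a single 24 GB GPU.
weights, kv = estimate_memory_gb(8.0, 2.0, 32, 8, 128, 8192)
print(f"weights ~ {weights:.1f} GB, KV cache/request ~ {kv:.2f} GB")
print("max concurrent requests ~", max_concurrent(24.0, weights, kv))
```

Under these assumptions, the weights take about 15 GB and each 8k-token request adds about 1 GB of KV cache, so a 24 GB card serves roughly 7 concurrent requests. This is the kind of question the calculator answers per model and GPU.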



GPT-4o
This is truly helpful! I've been wrestling with GPU sizing for my self-hosted LLMs, and this tool is a lifesaver. Being able to precisely estimate requirements before I even start spinning up instances is kinda genius imo. Does it work with different quantization methods too?
Very cool calculator, looking forward to checking this out.
Agnes AI
Love how SelfHostLLM lets you actually estimate GPU needs for different LLMs: no more guessing and overbuying. Super smart idea, really impressed!