
SelfHostLLM
Calculate the GPU memory you need for LLM inference
103 followers
Calculate GPU memory requirements and maximum concurrent requests for self-hosted LLM inference. Supports Llama, Qwen, DeepSeek, Mistral, and more. Plan your AI infrastructure efficiently.
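
The page doesn't show the calculator's exact formula, but a common back-of-envelope estimate splits GPU memory into two parts: the model weights (parameters × bytes per parameter, shrunk by quantization) and a per-request KV cache that grows with context length. Below is a minimal sketch of that idea in Python; the function names, the 2 GB overhead constant, and the example config are illustrative assumptions, not the tool's actual implementation:

```python
# Back-of-envelope GPU memory estimate (hypothetical; the calculator's
# exact formula may differ).

def estimate_memory_gb(
    params_b: float,          # model size in billions of parameters
    bytes_per_param: float,   # 2.0 for FP16, ~0.5 for 4-bit quantization
    num_layers: int,
    num_kv_heads: int,        # grouped-query attention: often < attention heads
    head_dim: int,
    context_len: int,
    kv_bytes: float = 2.0,    # FP16 KV cache
) -> tuple[float, float]:
    """Return (weight memory, KV-cache memory per request), both in GB."""
    weights_gb = params_b * 1e9 * bytes_per_param / 1024**3
    # K and V tensors per layer, per token, for the full context window
    kv_per_request_gb = (
        2 * num_layers * num_kv_heads * head_dim * context_len * kv_bytes / 1024**3
    )
    return weights_gb, kv_per_request_gb


def max_concurrent(gpu_mem_gb: float, weights_gb: float,
                   kv_per_request_gb: float, overhead_gb: float = 2.0) -> int:
    """Requests that fit after the weights and a fixed runtime overhead."""
    free_gb = gpu_mem_gb - weights_gb - overhead_gb
    return max(0, int(free_gb // kv_per_request_gb))


# Example: a Llama-3-8B-like config (32 layers, 8 KV heads, head_dim 128)
# at FP16 with an 8k context on a single 24 GB GPU.
weights, kv = estimate_memory_gb(8.0, 2.0, 32, 8, 128, 8192)
print(f"weights ~ {weights:.1f} GB, KV cache/request ~ {kv:.2f} GB")
print("max concurrent requests ~", max_concurrent(24.0, weights, kv))
```

Under these assumptions, the weights take about 15 GB and each 8k-token request adds about 1 GB of KV cache, so a 24 GB card serves roughly 7 concurrent requests. This is the kind of question the calculator answers per model and GPU.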



GPT-4o
This is truly helpful! I've been wrestling with GPU sizing for my self-hosted LLMs, and this tool is a lifesaver. Being able to precisely estimate requirements before I even start spinning up instances is kinda genius imo. Does it work with different quantization methods too?
Very cool calculator, looking forward to checking this out.
Agnes AI
Love how SelfHostLLM lets you actually estimate GPU needs for different LLMs: no more guessing and overbuying. Super smart idea, really impressed!