Guides
In-depth, data-backed guides on running AI models locally — VRAM requirements, quantization, hardware choice and more.
How Much VRAM Do You Need to Run an LLM Locally?
How much VRAM and RAM you need to run AI models locally — by model size and GGUF quantization, with exact figures and per-GPU guidance.
GGUF Quantization Explained (Q4_K_M, Q5, Q8, IQ)
What GGUF quantization levels mean, how they trade memory for quality, and which one to choose for running LLMs locally.
Best LLM for Your VRAM (8, 12, 16, 24 GB)
Which local LLM to run for your amount of VRAM — the biggest models that fit at a balanced quantization, with examples for each GPU memory size.
Best GPU for Running Local LLMs
How to choose a GPU for running AI models locally: why VRAM is king, and which cards run which model sizes.
How to Run an LLM Locally (Beginner's Guide)
A beginner-friendly guide to running AI models on your own computer — choosing a tool, picking a model that fits, and example commands.