Which AI models run on a AMD Radeon RX 7900 XT?

With 20 GB of VRAM, here are the popular models you can run locally (4,096-token context, ~32.0 GB system RAM assumed), ranked by popularity.

VRAM
20 GB
Vendor
AMD
Fits in VRAM
34 models
Assumed RAM
32.0 GB

The AMD Radeon RX 7900 XT comes with 20 GB of VRAM. Among the popular GGUF models we track, it can run 34 of them entirely in VRAM — including Llama-3.2-1B-Instruct-Q8_0-GGUF, Qwen3-4B-GGUF, gpt-oss-20b-GGUF.

Larger models such as Qwen3-Coder-Next-GGUF still run on a AMD Radeon RX 7900 XT but require offloading part of the model to system RAM, which lowers speed. Models that exceed both VRAM and RAM are not listed.

New to this? Read: How much VRAM do you need?

ModelSize Quant.Quality MemorySpeed~ Verdict
hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF 1.24B Q8_0 Excellent 2.57 GB 325.1 t/s Fits in VRAM
Qwen/Qwen3-4B-GGUF 4.02B Q8_0 Excellent 5.75 GB 100.3 t/s Fits in VRAM
unsloth/gpt-oss-20b-GGUF 20.91B F16 Very good 13.83 GB 31.1 t/s Fits in VRAM
janhq/Jan-v3.5-4B-gguf 4.41B GGUF Excellent 10.04 GB 48.6 t/s Fits in VRAM
bartowski/gemma-2-2b-it-GGUF 2.61B F32 Excellent 11.33 GB 41.0 t/s Fits in VRAM
MaziyarPanahi/Qwen3-0.6B-GGUF 0.75B GGUF Excellent 2.62 GB 284.6 t/s Fits in VRAM
MaziyarPanahi/Qwen3-14B-GGUF 14.77B Q6_K Excellent 13.94 GB 35.4 t/s Fits in VRAM
MaziyarPanahi/Qwen3-8B-GGUF 8.19B GGUF Excellent 17.44 GB 26.2 t/s Fits in VRAM
MaziyarPanahi/Qwen3-32B-GGUF 32.76B Q3_K_L Good 19.7 GB 24.8 t/s Fits in VRAM
MaziyarPanahi/Qwen3-1.7B-GGUF 2.03B GGUF Excellent 5.28 GB 105.5 t/s Fits in VRAM
MaziyarPanahi/Qwen3-30B-A3B-GGUF 30.53B Q3_K_L Fair 18.27 GB 27.0 t/s Fits in VRAM
bartowski/Meta-Llama-3.1-8B-Instruct-GGUF 8.03B Q8_0 Excellent 10.12 GB 50.3 t/s Fits in VRAM
Qwen/Qwen2.5-1.5B-Instruct-GGUF 1.78B GGUF Excellent 4.76 GB 120.6 t/s Fits in VRAM
unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF 30.53B Q4_K_XL Good 19.92 GB 24.3 t/s Fits in VRAM
MaziyarPanahi/Phi-3.5-mini-instruct-GGUF 3.82B Q8_0 Excellent 5.53 GB 105.8 t/s Fits in VRAM
Qwen/Qwen2.5-3B-Instruct-GGUF 3.4B GGUF Excellent 8.02 GB 63.2 t/s Fits in VRAM
bartowski/Llama-3.2-3B-Instruct-GGUF 3.21B F16 Excellent 7.66 GB 66.8 t/s Fits in VRAM
Qwen/Qwen2.5-0.5B-Instruct-GGUF 0.63B GGUF Excellent 2.36 GB 339.1 t/s Fits in VRAM
MaziyarPanahi/Qwen3-4B-Instruct-2507-GGUF 4.02B GGUF Excellent 9.27 GB 53.3 t/s Fits in VRAM
LiquidAI/LFM2.5-8B-A1B-GGUF 8.47B BF16 Excellent 17.99 GB 25.3 t/s Fits in VRAM
MaziyarPanahi/Mistral-7B-Instruct-v0.3-GGUF 7.25B GGUF Excellent 15.6 GB 29.6 t/s Fits in VRAM
MaziyarPanahi/gemma-3-4b-it-GGUF 3.88B GGUF Excellent 8.98 GB 55.3 t/s Fits in VRAM
MaziyarPanahi/Meta-Llama-3-8B-Instruct-GGUF 8.03B GGUF Excellent 17.13 GB 26.7 t/s Fits in VRAM
MaziyarPanahi/Qwen2.5-7B-Instruct-GGUF 7.62B GGUF Excellent 16.32 GB 28.2 t/s Fits in VRAM
MaziyarPanahi/Phi-4-mini-instruct-GGUF 3.84B GGUF Excellent 8.9 GB 55.9 t/s Fits in VRAM
MaziyarPanahi/Yi-Coder-1.5B-Chat-GGUF 1.48B GGUF Excellent 4.14 GB 145.4 t/s Fits in VRAM
MaziyarPanahi/DeepSeek-R1-0528-Qwen3-8B-GGUF 8.19B GGUF Excellent 17.44 GB 26.2 t/s Fits in VRAM
MaziyarPanahi/Mistral-Nemo-Instruct-2407-GGUF 12.25B Q8_0 Excellent 14.62 GB 33.0 t/s Fits in VRAM
MaziyarPanahi/Llama-3-8B-Instruct-32k-v0.1-GGUF 8.03B GGUF Excellent 17.13 GB 26.7 t/s Fits in VRAM
MaziyarPanahi/gemma-3-1b-it-GGUF 1.0B GGUF Excellent 3.15 GB 214.0 t/s Fits in VRAM
MaziyarPanahi/Meta-Llama-3.1-70B-Instruct-GGUF 70.55B IQ1_S Very low 19.14 GB 28.0 t/s Fits in VRAM
TheBloke/Mistral-7B-Instruct-v0.2-GGUF 7.24B Q8_0 Excellent 9.27 GB 55.8 t/s Fits in VRAM
MaziyarPanahi/Yi-Coder-9B-Chat-GGUF 8.83B GGUF Excellent 18.68 GB 24.3 t/s Fits in VRAM
MaziyarPanahi/gemma-3-12b-it-GGUF 11.77B Q8_0 Excellent 14.11 GB 34.3 t/s Fits in VRAM
unsloth/Qwen3-Coder-Next-GGUF 79.67B Q4_1 Very good 51.73 GB 1.1 t/s Offload
Qwen/Qwen2.5-Coder-32B-Instruct-GGUF 32.76B Q5_K_M Excellent 46.89 GB 1.2 t/s Offload
MaziyarPanahi/Mixtral-8x22B-v0.1-GGUF 140.62B Q2_K Low 50.2 GB 1.0 t/s Offload
MaziyarPanahi/Llama-3.3-70B-Instruct-GGUF 70.55B Q5_K_M Very good 51.37 GB 1.1 t/s Offload

"Fits in VRAM" = fast, fully on GPU. "Offload" = part on system RAM, slower. Speed is a rough estimate.

Frequently asked questions

How much VRAM does the AMD Radeon RX 7900 XT have?

The AMD Radeon RX 7900 XT has 20 GB of VRAM, which determines how large a model it can run entirely on the GPU.

What is the best LLM to run on a AMD Radeon RX 7900 XT?

Among popular models, hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF runs well on a AMD Radeon RX 7900 XT using the Q8_0 quantization (about 2.57 GB). Larger models trade speed for capability via RAM offloading.

Can a AMD Radeon RX 7900 XT run a 7–8B model?

Yes. A 7–8B model like Qwen3-8B-GGUF fits entirely in the 20 GB of a AMD Radeon RX 7900 XT (GGUF).

Can a AMD Radeon RX 7900 XT run a 13–14B model?

Yes. A 13–14B model like Qwen3-14B-GGUF fits entirely in the 20 GB of a AMD Radeon RX 7900 XT (Q6_K).

Can a AMD Radeon RX 7900 XT run a 70B model?

Yes. A 70B model like Meta-Llama-3.1-70B-Instruct-GGUF fits entirely in the 20 GB of a AMD Radeon RX 7900 XT (IQ1_S).

Another graphics card