
Estimates based on browser APIs. Actual specs may vary.

S · 0 · Runs great
A · 0 · Runs well
B · 0 · Decent
C · 0 · Tight fit
D · 0 · Barely runs
F · 22 · Too heavy

Llama 3.1 8B

Llama 3.1 Community

Meta · 8B · Llama 3.1 Community

Meta's versatile 8B — great quality/speed ratio

F
4.6 GB · 128K ctx · 1y ago
Q2_K · Q3_K_M · Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Released 2024-07 · Architecture Dense · Memory —
chat · code · reasoning

DeepSeek R1

MIT

DeepSeek · 671B · active 37B · MIT

Massive MoE reasoning model — 37B active

F
344.2 GB · 64K ctx · 1y ago
Q2_K · Q3_K_M · Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Released 2025-01 · Architecture MoE · active 37B · Memory —
reasoning

DeepSeek V3.2

MIT

DeepSeek · 685B · active 37B · MIT

State-of-the-art MoE — 37B active params

F
351.4 GB · 128K ctx · 6mo ago
Q2_K · Q3_K_M · Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Released 2025-12 · Architecture MoE · active 37B · Memory —
chat · code · reasoning

GPT-OSS 120B

Apache 2.0

OpenAI · 117B · active 5.1B · Apache 2.0

OpenAI's flagship open-weight MoE — 52.6% SWE-bench

F
60.4 GB · 128K ctx · 10mo ago
Q2_K · Q3_K_M · Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Released 2025-08 · Architecture MoE · active 5.1B · Memory —
chat · reasoning · code

DeepSeek R1 Distill 32B

MIT

DeepSeek · 32B · MIT

R1 reasoning distilled into Qwen 32B — sweet spot

F
16.9 GB · 64K ctx · 1y ago
Q2_K · Q3_K_M · Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Released 2025-01 · Architecture Dense · Memory —
reasoning

GPT-OSS 20B

Apache 2.0

OpenAI · 21B · active 3.6B · Apache 2.0

OpenAI's open-weight MoE with configurable reasoning

F
11.3 GB · 128K ctx · 10mo ago
Q2_K · Q3_K_M · Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Released 2025-08 · Architecture MoE · active 3.6B · Memory —
chat · reasoning · code

Llama 3.3 70B

Llama 3.3 Community

Meta · 70B · Llama 3.3 Community

Best open model at 70B class

F
36.4 GB · 128K ctx · 1y ago
Q2_K · Q3_K_M · Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Released 2024-12 · Architecture Dense · Memory —
chat · reasoning · code

Gemma 3 27B

Gemma

Google · 27B · Gemma

Google's flagship Gemma 3 model

F
14.3 GB · 128K ctx · 1y ago
Q2_K · Q3_K_M · Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Released 2025-03 · Architecture Dense · Memory —
chat · vision · reasoning

Qwen 2.5 Coder 32B

Apache 2.0

Alibaba · 32B · Apache 2.0

Best open-source coding model at release

F
16.9 GB · 128K ctx · 1y ago
Q2_K · Q3_K_M · Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Released 2024-11 · Architecture Dense · Memory —
code

Qwen 3 32B

Apache 2.0

Alibaba · 32B · Apache 2.0

Qwen 3 flagship dense model

F
16.9 GB · 128K ctx · 1y ago
Q2_K · Q3_K_M · Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Released 2025-04 · Architecture Dense · Memory —
chat · code · reasoning

Mistral Small 3.1 24B

Apache 2.0

Mistral AI · 24B · Apache 2.0

Multimodal Mistral with vision support

F
12.8 GB · 128K ctx · 1y ago
Q2_K · Q3_K_M · Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Released 2025-03 · Architecture Dense · Memory —
chat · vision · code

Llama 4 Scout 17B

Llama 4 Community

Meta · 109B · active 17B · Llama 4 Community

MoE with 16 experts, 17B active params

F
56.3 GB · 128K ctx · 1y ago
Q2_K · Q3_K_M · Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Released 2025-04 · Architecture MoE · active 17B · Memory —
chat · vision · reasoning

Gemma 3 4B

Gemma

Google · 4B · Gemma

Google's compact multimodal model

F
3.0 GB · 128K ctx · 1y ago
Q2_K · Q3_K_M · Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Released 2025-03 · Architecture Dense · Memory —
chat · vision

Llama 3.2 1B

Llama 3.2 Community

Meta · 1B · Llama 3.2 Community

Meta's smallest Llama for edge devices

F
1.0 GB · 128K ctx · 1y ago
Q2_K · Q3_K_M · Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Released 2024-09 · Architecture Dense · Memory —
chat · edge

Phi-4 14B

MIT

Microsoft · 14B · MIT

Microsoft's reasoning-focused model

F
7.7 GB · 16K ctx · 1y ago
Q2_K · Q3_K_M · Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Released 2024-12 · Architecture Dense · Memory —
reasoning · code

Devstral 2 123B

MRL

Mistral AI · 123B · MRL

Dense 123B coding model — 72.2% SWE-bench Verified

F
63.5 GB · 256K ctx · 6mo ago
Q2_K · Q3_K_M · Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Released 2025-12 · Architecture Dense · Memory —
code

Qwen 3.5 9B

Apache 2.0

Alibaba · 9B · Apache 2.0

Multimodal Qwen 3.5 mid-size

F
5.1 GB · 32K ctx · 4mo ago
Q2_K · Q3_K_M · Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Released 2026-02 · Architecture Dense · Memory —
chat · vision

Kimi K2

Kimi

Moonshot AI · 1T · active 32B · Kimi

1T-param MoE with 384 experts — 32B active, strong agentic coding

F
512.7 GB · 128K ctx · 11mo ago
Q2_K · Q3_K_M · Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Released 2025-07 · Architecture MoE · active 32B · Memory —
chat · reasoning · code

Phi-4 Mini 3.8B

MIT

Microsoft · 3.8B · MIT

Microsoft's compact reasoning model

F
2.8 GB · 16K ctx · 1y ago
Q2_K · Q3_K_M · Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Released 2025-02 · Architecture Dense · Memory —
chat · code · reasoning

Qwen 3 0.6B

Apache 2.0

Alibaba · 600M · Apache 2.0

Ultra-light Qwen 3 model for constrained devices

F
0.8 GB · 32K ctx · 1y ago
Q2_K · Q3_K_M · Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Released 2025-04 · Architecture Dense · Memory —
chat · edge

Qwen 3 1.7B

Apache 2.0

Alibaba · 1.7B · Apache 2.0

Compact Qwen 3 for mobile and edge

F
1.5 GB · 32K ctx · 1y ago
Q2_K · Q3_K_M · Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Released 2025-04 · Architecture Dense · Memory —
chat · edge

Qwen 3.5 0.8B

Apache 2.0

Alibaba · 800M · Apache 2.0

Ultra-tiny model for embedded and edge

F
0.9 GB · 32K ctx · 4mo ago
Q2_K · Q3_K_M · Q4_K_M · Q5_K_M · Q6_K · Q8_0 · F16
Released 2026-02 · Architecture Dense · Memory —
chat · edge

How do I know which AI model I can run locally?

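CanIRunAi detects your GPU, VRAM, RAM, and CPU cores directly in your browser, then matches that hardware against 20+ popular AI models. Models that fit in your VRAM receive a grade of S, A, or B; models that need more memory than your VRAM provides receive C, D, or F. The detected values are estimates from browser APIs, so treat the grades as a guide.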
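To make the grading idea concrete, here is a minimal sketch in Python. It is not CanIRunAi's actual algorithm: the function name `grade_model` and every cut-off in `thresholds` are hypothetical, chosen only so that models fitting in VRAM land on S, A, or B and larger ones fall to C, D, or F, as described above.

```python
def grade_model(model_gb: float, vram_gb: float) -> str:
    """Return a letter grade from S (runs great) to F (too heavy).

    Hypothetical rule: grade by the ratio of model size to available VRAM.
    """
    if vram_gb <= 0:
        # Assumption: with no usable VRAM figure, nothing can be graded as fitting.
        return "F"
    ratio = model_gb / vram_gb
    # Hypothetical cut-offs; a ratio of 1.0 or less means the model fits in VRAM.
    thresholds = [(0.5, "S"), (0.75, "A"), (1.0, "B"), (1.5, "C"), (2.0, "D")]
    for limit, grade in thresholds:
        if ratio <= limit:
            return grade
    return "F"


# Sizes taken from the model list above, graded against a 24 GB card.
for name, size_gb in [("Llama 3.1 8B", 4.6), ("Qwen 3 32B", 16.9), ("Llama 3.3 70B", 36.4)]:
    print(name, grade_model(size_gb, 24.0))
```

A real grader would likely also weigh memory bandwidth, context length, and system RAM, none of which this sketch models.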

What is the minimum GPU needed to run AI models locally?

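You need at least 8 GB of VRAM to run 7-8B parameter models (like Llama 3.1 8B) at Q4 quantization. Larger models need far more: the Llama 3.3 70B entry above is listed at 36.4 GB, and at that size even a 24 GB card cannot hold it entirely in VRAM, so treat 24 GB as a practical floor for the 70B class rather than a comfortable fit. CPU-only inference is possible but significantly slower.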

What is quantization and why does it matter?

Quantization stores model weights at lower precision (for example 8-bit or 4-bit instead of 16-bit), cutting VRAM usage by roughly 2-4x with minimal quality loss. Q4_K_M is a widely used quantization level because it offers a good balance of size and quality.
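The 2-4x figure follows from simple arithmetic: weight memory is roughly parameter count times bits per weight, divided by 8 to get bytes. The sketch below is an illustration under that assumption; `weight_memory_gb` is a hypothetical helper, it uses nominal bit widths, and it ignores the KV cache and runtime overhead.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# An 8B-parameter model at nominal 16-bit, 8-bit, and 4-bit precision.
for bits in (16, 8, 4):
    print(f"8B model at {bits}-bit: {weight_memory_gb(8, bits):.1f} GB")
# 8B model at 16-bit: 16.0 GB
# 8B model at 8-bit: 8.0 GB
# 8B model at 4-bit: 4.0 GB
```

Real quantized files run somewhat larger than the nominal figure because formats such as Q4_K_M also store scaling factors and keep some tensors at higher precision.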

Can I run AI models without a GPU?

Yes, using CPU inference with llama.cpp or Ollama. However, expect speeds 5-10x slower than on a GPU. For best results, use smaller models (1-4B parameters) and make sure your system RAM is at least 2x the model size.
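The 2x rule of thumb can be written as a one-line check. This is a sketch only: `enough_ram_for_cpu` and its `factor` parameter are hypothetical names, and the sizes are the ones listed for the models above.

```python
def enough_ram_for_cpu(model_gb: float, system_ram_gb: float, factor: float = 2.0) -> bool:
    """Apply the rule of thumb: system RAM should be at least `factor` times the model size."""
    return system_ram_gb >= factor * model_gb


# Gemma 3 4B is listed at 3.0 GB, so it needs about 6 GB of RAM under this rule.
print(enough_ram_for_cpu(3.0, 8.0))   # True
# Llama 3.1 8B is listed at 4.6 GB, so 8 GB of RAM falls short of the 9.2 GB target.
print(enough_ram_for_cpu(4.6, 8.0))   # False
```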

Which is the best open-source AI model for coding?

As of 2025, DeepSeek R1 Distill 32B and Qwen 2.5 Coder 32B are top choices for coding tasks. For lighter hardware, Phi-4 Mini 3.8B offers good coding ability with minimal resource requirements.