What LLMs Can Run on Your Mac?

Chip

Unified Memory

Bandwidth: 307 GB/s

Available for Models: ~21 GB
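The hardware figures above support two quick back-of-the-envelope checks. The following is a minimal sketch, not the page's own method: it assumes a dense model whose weights are read once per generated token, so memory bandwidth divided by model size gives a rough ceiling on decode speed, and it treats a model as loadable when its size is at most the memory available for models. The function names are hypothetical.

```python
# Assumed inputs, taken from the header of this listing.
BANDWIDTH_GB_S = 307.0   # memory bandwidth
AVAILABLE_GB = 21.0      # memory available for models (listed as ~21 GB)


def bandwidth_bound_tok_s(model_gb: float, bandwidth_gb_s: float = BANDWIDTH_GB_S) -> float:
    """Rule-of-thumb decode ceiling: one full pass over the weights per token."""
    if model_gb <= 0:
        raise ValueError("model_gb must be positive")
    return bandwidth_gb_s / model_gb


def fits_in_memory(model_gb: float, available_gb: float = AVAILABLE_GB) -> bool:
    """True when the model's footprint does not exceed the available memory."""
    return model_gb <= available_gb


if __name__ == "__main__":
    # Sizes chosen from cards in this listing.
    for size_gb in (1.5, 9.0, 18.3):
        print(size_gb, round(bandwidth_bound_tok_s(size_gb)), fits_in_memory(size_gb))
```

For a 9.0 GB model the rule gives roughly 34 tok/s, while the listing shows ~21 tok/s for an entry of that size. Entries whose listed speed is far above this rule (such as the A3B mixture-of-experts models and the very small test models) evidently rest on a different basis, so treat the result only as a rough cross-check.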

Model Compatibility

Showing 867 of 867 models

General · Alibaba · 2026-02-28

Q8_0 · Excellent

1.5 GB · 6% of RAM · ~185 tok/s · 0.87B params

General · Alibaba · 2026-02-28

Q8_0 · Excellent

3.0 GB · 13% of RAM · ~71 tok/s · 2.27B params

Qwen3.5 0.8B Base

General · Alibaba · 2026-02-28

Q8_0 · Excellent

1.5 GB · 6% of RAM · ~185 tok/s · 0.87B params

Qwen3.5 2B Base

General · Alibaba · 2026-02-28

Q8_0 · Excellent

3.0 GB · 13% of RAM · ~71 tok/s · 2.27B params

LFM2.5 1.2B Thinking

General · Liquid AI · 2026-01-20

Q8_0 · Excellent

1.8 GB · 8% of RAM · ~137 tok/s · 1.17B params

LFM2.5 1.2B Instruct

Chat · Liquid AI · 2026-01-06

Q8_0 · Excellent

1.8 GB · 8% of RAM · ~137 tok/s · 1.17B params

General · Liquid AI · 2026-01-05

Q8_0 · Excellent

2.3 GB · 10% of RAM · ~101 tok/s · 1.6B params

LFM2.5 1.2B Base

General · Liquid AI · 2026-01-05

Q8_0 · Excellent

1.8 GB · 8% of RAM · ~137 tok/s · 1.17B params

General · Liquid AI · 2025-12-25

Q8_0 · Excellent

3.4 GB · 14% of RAM · ~63 tok/s · 2.57B params

General · Liquid AI · 2026-01-04

Q8_0 · Excellent

1.8 GB · 8% of RAM · ~137 tok/s · 1.17B params

LFM2 2.6B Transcript

General · Liquid AI · 2026-01-05

Q8_0 · Excellent

3.4 GB · 14% of RAM · ~63 tok/s · 2.57B params

General · Alibaba · 2026-02-27

Q8_0 · Excellent

5.7 GB · 24% of RAM · ~35 tok/s · 4.66B params

General · Google · 2026-03-02

Q8_0 · Excellent

6.2 GB · 26% of RAM · ~31 tok/s · 5.12B params

Qwen3.5 4B Base

General · Alibaba · 2026-02-27

Q8_0 · Excellent

5.7 GB · 24% of RAM · ~35 tok/s · 4.66B params

General · Liquid AI · 2025-10-07

Q8_0 · Excellent

9.8 GB · 41% of RAM · ~107 tok/s · 8.34B params

LFM2 ColBERT 350M

General · Liquid AI · 2025-10-28

Q8_0 · Excellent

0.9 GB · 4% of RAM · ~459 tok/s · 0.35B params

General · Liquid AI · 2025-09-22

Q8_0 · Excellent

3.4 GB · 14% of RAM · ~63 tok/s · 2.57B params

LFM2 350M PII Extract JP

General · Liquid AI · 2025-09-30

Q8_0 · Excellent

0.9 GB · 4% of RAM · ~459 tok/s · 0.35B params

granite 4.0 h tiny

General · ibm-granite · 2025-09-16

Q8_0 · Excellent

8.2 GB · 34% of RAM · ~167 tok/s · 6.94B params

granite 4.0 h micro

General · ibm-granite · 2025-09-16

Q8_0 · Excellent

4.1 GB · 17% of RAM · ~50 tok/s · 3.19B params

General · Liquid AI · 2025-08-12

Q8_0 · Excellent

1.0 GB · 4% of RAM · ~357 tok/s · 0.45B params

General · Liquid AI · 2025-10-22

Q8_0 · Excellent

3.8 GB · 16% of RAM · ~54 tok/s · 3B params

General · Liquid AI · 2025-08-12

Q8_0 · Excellent

2.3 GB · 9% of RAM · ~102 tok/s · 1.58B params

LFM2 1.2B Extract

General · Liquid AI · 2025-08-22

Q8_0 · Excellent

1.8 GB · 8% of RAM · ~137 tok/s · 1.17B params

LFM2 350M Extract

General · Liquid AI · 2025-09-03

Q8_0 · Excellent

0.9 GB · 4% of RAM · ~459 tok/s · 0.35B params

LFM2 350M ENJP MT

General · Liquid AI · 2025-09-03

Q8_0 · Excellent

0.9 GB · 4% of RAM · ~459 tok/s · 0.35B params

General · Liquid AI · 2025-09-03

Q8_0 · Excellent

1.8 GB · 8% of RAM · ~137 tok/s · 1.17B params

General · Liquid AI · 2025-08-25

Q8_0 · Excellent

0.9 GB · 4% of RAM · ~459 tok/s · 0.35B params

General · Liquid AI · 2025-09-03

Q8_0 · Excellent

1.8 GB · 8% of RAM · ~137 tok/s · 1.17B params

General · Liquid AI · 2025-07-10

Q8_0 · Excellent

1.8 GB · 8% of RAM · ~137 tok/s · 1.17B params

General · Liquid AI · 2025-07-10

Q8_0 · Excellent

0.9 GB · 4% of RAM · ~459 tok/s · 0.35B params

General · Liquid AI · 2025-07-10

Q8_0 · Excellent

1.3 GB · 6% of RAM · ~217 tok/s · 0.74B params

EXAONE 4.0 1.2B

General · lgai-exaone · 2025-07-11

Q8_0 · Excellent

1.9 GB · 8% of RAM · ~126 tok/s · 1.28B params

General · Alibaba · 2025-04-27

Q8_0 · Excellent

1.3 GB · 6% of RAM · ~214 tok/s · 0.75B params

General · Alibaba · 2025-04-27

Q8_0 · Excellent

2.8 GB · 12% of RAM · ~79 tok/s · 2.03B params

General · huggingfacetb · 2025-07-08

Q8_0 · Excellent

3.9 GB · 16% of RAM · ~52 tok/s · 3.08B params

General · Alibaba

Q8_0 · Excellent

2.2 GB · 9% of RAM · ~104 tok/s · 1.54B params

DeepSeek R1 Distill Qwen 1.5B

Reasoning · DeepSeek

Q8_0 · Excellent

2.5 GB · 10% of RAM · ~90 tok/s · 1.78B params

Qwen3 Coder 30B A3B Instruct gptq 8bit

Coding · btbtyler09

Q8_0 · Excellent

10.9 GB · 45% of RAM · ~158 tok/s · 9.3B params

DeepSeek R1 0528 Qwen3 8B

Reasoning · lmstudio-community

mlx-8bit · Excellent

1.9 GB · 8% of RAM · ~132 tok/s · 1.28B params

PaddleOCR VL 1.5

General · paddlepaddle

Q8_0 · Excellent

1.6 GB · 7% of RAM · ~168 tok/s · 0.96B params

gemma 3 12b it quantized W4A16

General · abhishekchohan

Q8_0 · Excellent

3.7 GB · 15% of RAM · ~56 tok/s · 2.86B params

tiny random BambaForCausalLM

General · hmellor

Q8_0 · Excellent

0.5 GB · 2% of RAM · ~5,360 tok/s · 0.03B params

General · Alibaba

Q8_0 · Excellent

2.2 GB · 9% of RAM · ~104 tok/s · 1.54B params

VideoLLaMA3 2B Image HF

General · lkhl

Q8_0 · Excellent

2.7 GB · 11% of RAM · ~82 tok/s · 1.96B params

General · hmellor

Q8_0 · Excellent

1.9 GB · 8% of RAM · ~130 tok/s · 1.24B params

General · titanml

Q8_0 · Excellent

0.8 GB · 3% of RAM · ~2,265 tok/s · 0.25B params

General · pfnet

Q8_0 · Excellent

1.9 GB · 8% of RAM · ~125 tok/s · 1.29B params

Qwen3 4B Thinking 2507

General · lmstudio-community

mlx-8bit · Excellent

1.2 GB · 5% of RAM · ~268 tok/s · 0.63B params

Qwen3 Coder 30B A3B Instruct AWQ 4bit

Coding · cyankiwi

Q8_0 · Excellent

6.4 GB · 27% of RAM · ~277 tok/s · 5.31B params

granite 4.0 h tiny AWQ 4bit

General · cyankiwi

Q8_0 · Excellent

2.7 GB · 11% of RAM · ~579 tok/s · 2B params

Qwen3 VL 2B Thinking

General · Alibaba

Q8_0 · Excellent

2.9 GB · 12% of RAM · ~75 tok/s · 2.13B params

Phi 4 mini reasoning

Reasoning · lmstudio-community

mlx-8bit · Excellent

1.1 GB · 5% of RAM · ~281 tok/s · 0.6B params

olmOCR 2 7B 1025 INT4

General · winninghealth

Q8_0 · Excellent

2.5 GB · 10% of RAM · ~91 tok/s · 1.77B params

Llama 3.2 1B Aegis SFT DPO

General · ahczhg

Q8_0 · Excellent

1.9 GB · 8% of RAM · ~130 tok/s · 1.24B params

CyberXP_Agent_Llama_3.2_1B

General · abaryan

Q8_0 · Excellent

1.9 GB · 8% of RAM · ~130 tok/s · 1.24B params

tiny random llama4

General · optimum-intel-internal-testing

Q8_0 · Excellent

0.5 GB · 2% of RAM · ~223,726 tok/s · 0.01B params

llama 3.2 1b code instruct

Coding · shahriarferdoush

Q8_0 · Excellent

1.9 GB · 8% of RAM · ~130 tok/s · 1.24B params

granite 4.0 h 1b

General · ibm-granite

Q8_0 · Excellent

2.1 GB · 9% of RAM · ~110 tok/s · 1.46B params

typhoon ocr1.5 2b

Reasoning · typhoon-ai

Q8_0 · Excellent

2.9 GB · 12% of RAM · ~75 tok/s · 2.13B params

General · quanttrio

Q8_0 · Excellent

3.0 GB · 13% of RAM · ~71 tok/s · 2.27B params

Orpo Llama 3.2 1B 15k

General · adamlucek

Q8_0 · Excellent

1.9 GB · 8% of RAM · ~130 tok/s · 1.24B params

General · lmstudio-community

mlx-8bit · Excellent

0.9 GB · 4% of RAM · ~512 tok/s · 0.33B params

LFM2 1.2B MLX bf16

General · lmstudio-community

mlx-8bit · Excellent

1.7 GB · 7% of RAM · ~144 tok/s · 1.17B params

Qwen3.5 4B PARO

General · z-lab

Q8_0 · Excellent

2.1 GB · 9% of RAM · ~110 tok/s · 1.46B params

gemma 3 tiny random

General · yujiepan

Q8_0 · Excellent

0.5 GB · 2% of RAM · ~16,081 tok/s · 0.01B params

Qwen3.5 35B A3B quantized.w4a16

General · apolo13x

Q8_0 · Excellent

7.6 GB · 32% of RAM · ~316 tok/s · 6.38B params

tiny random Idefics3ForConditionalGeneration

General · optimum-intel-internal-testing

Q8_0 · Excellent

0.5 GB · 2% of RAM · ~8,040 tok/s · 0.02B params

Qwen3 VL 4B Thinking AWQ 4bit

General · cyankiwi

Q8_0 · Excellent

2.5 GB · 10% of RAM · ~91 tok/s · 1.76B params

Huihui Qwen3.5 2B abliterated

General · huihui-ai

Q8_0 · Excellent

3.0 GB · 13% of RAM · ~71 tok/s · 2.27B params

General · paddlepaddle

Q8_0 · Excellent

1.6 GB · 7% of RAM · ~168 tok/s · 0.96B params

Qwen3 VL 2B RRG SFT

General · dmusingu

Q8_0 · Excellent

2.9 GB · 12% of RAM · ~75 tok/s · 2.13B params

Qwen3.5 2B AWQ 4bit

General · cyankiwi

Q8_0 · Excellent

3.1 GB · 13% of RAM · ~69 tok/s · 2.32B params

General · rednote-hilab

Q8_0 · Excellent

3.9 GB · 16% of RAM · ~53 tok/s · 3.04B params

granite 4.0 micro base

General · ibm-granite

Q8_0 · Excellent

4.3 GB · 18% of RAM · ~47 tok/s · 3.4B params

General · kristaller486

Q8_0 · Excellent

3.9 GB · 16% of RAM · ~53 tok/s · 3.04B params

General · Alibaba

mlx-8bit · Excellent

1.1 GB · 5% of RAM · ~296 tok/s · 0.57B params

General · rednote-hilab

Q8_0 · Excellent

3.9 GB · 16% of RAM · ~53 tok/s · 3.04B params

Qwen3.5 9B PARO

General · z-lab

Q8_0 · Excellent

4.3 GB · 18% of RAM · ~47 tok/s · 3.44B params

HTML Pruner Phi 3.8B

General · zstanjj

Q8_0 · Excellent

4.8 GB · 20% of RAM · ~42 tok/s · 3.82B params

General · prism-ml

mlx-8bit · Excellent

0.9 GB · 4% of RAM · ~444 tok/s · 0.38B params

General · nanonets

Q8_0 · Excellent

4.7 GB · 20% of RAM · ~43 tok/s · 3.75B params

Qwen3.5 4B Claude 4.6 Opus Reasoning Distilled

Reasoning · jackrong

Q8_0 · Excellent

5.7 GB · 24% of RAM · ~35 tok/s · 4.66B params

pixtral 12b quantized.w4a16

General · redhatai

Q8_0 · Excellent

4.1 GB · 17% of RAM · ~50 tok/s · 3.23B params

General · stanfordaimi

Q8_0 · Excellent

4.7 GB · 20% of RAM · ~43 tok/s · 3.75B params

NV Reason CXR 3B

Reasoning · nvidia

Q8_0 · Excellent

4.7 GB · 20% of RAM · ~43 tok/s · 3.75B params

LFM2.5 Audio 1.5B

General · Liquid AI · 2025-12-18

Q8_0 · Excellent

2.1 GB · 9% of RAM · ~109 tok/s · 1.47B params

General · Alibaba

Q8_0 · Excellent

1.0 GB · 4% of RAM · ~328 tok/s · 0.49B params

General · Alibaba

Q8_0 · Excellent

1.3 GB · 6% of RAM · ~214 tok/s · 0.75B params

Qwen3 4B Thinking 2507

General · Alibaba

Q8_0 · Excellent

5.0 GB · 21% of RAM · ~40 tok/s · 4.02B params

Qwen2.5 1.5B quantized.w8a8

General · redhatai

Q8_0 · Excellent

2.5 GB · 10% of RAM · ~90 tok/s · 1.78B params

General · farbodtavakkoli

Q8_0 · Excellent

1.6 GB · 7% of RAM · ~161 tok/s · 1B params

DeepSeek R1 Distill Qwen 7B

Reasoning · DeepSeek · 2025-01-20

Q8_0 · Excellent

9.0 GB · 38% of RAM · ~21 tok/s · 7.62B params

OTel LLM 270M IT

General · farbodtavakkoli

Q8_0 · Excellent

0.8 GB · 3% of RAM · ~596 tok/s · 0.27B params

General · Google · 2026-03-02

Q8_0 · Excellent

9.4 GB · 39% of RAM · ~20 tok/s · 8B params

General · nanbeige

Q8_0 · Excellent

4.9 GB · 20% of RAM · ~41 tok/s · 3.93B params

General · lovedheart

Q8_0 · Excellent

5.7 GB · 24% of RAM · ~35 tok/s · 4.66B params

Qwen3 1.7B Base

General · Alibaba

Q8_0 · Excellent

2.4 GB · 10% of RAM · ~93 tok/s · 1.72B params

General · farbodtavakkoli

Q8_0 · Excellent

5.3 GB · 22% of RAM · ~37 tok/s · 4.3B params

NVIDIA Nemotron 3 Nano 30B A3B AWQ

General · stelterlab

Q8_0 · Excellent

6.1 GB · 26% of RAM · ~32 tok/s · 5.05B params

gemma 3 27b it GPTQ 4b 128g

General · ista-daslab

Q8_0 · Excellent

6.3 GB · 26% of RAM · ~31 tok/s · 5.23B params

Qwen3Guard Gen 0.6B

General · Alibaba

Q8_0 · Excellent

1.3 GB · 6% of RAM · ~214 tok/s · 0.75B params

llava interleave qwen 0.5b hf

General · llava-hf

Q8_0 · Excellent

1.5 GB · 6% of RAM · ~187 tok/s · 0.86B params

Qwen3.5 4B AWQ 4bit

General · cyankiwi

Q8_0 · Excellent

5.8 GB · 24% of RAM · ~34 tok/s · 4.77B params

InternVL3 1B hf

General · opengvlab

Q8_0 · Excellent

1.5 GB · 6% of RAM · ~171 tok/s · 0.94B params

qwen base invoicev1.01 1.5B

General · laap-ai

Q8_0 · Excellent

2.2 GB · 9% of RAM · ~104 tok/s · 1.54B params

General · warshanks

Q8_0 · Excellent

1.9 GB · 8% of RAM · ~128 tok/s · 1.26B params

General · farbodtavakkoli

Q8_0 · Excellent

5.2 GB · 22% of RAM · ~38 tok/s · 4.25B params

Qwen3 VL 4B Thinking

General · Alibaba

Q8_0 · Excellent

5.5 GB · 23% of RAM · ~36 tok/s · 4.44B params

General · aidc-ai

Q8_0 · Excellent

3.4 GB · 14% of RAM · ~63 tok/s · 2.57B params

General · janhq

Q8_0 · Excellent

5.0 GB · 21% of RAM · ~40 tok/s · 4.02B params

General · tiger-lab

Q8_0 · Excellent

5.1 GB · 21% of RAM · ~39 tok/s · 4.15B params

Qwen3 4B DFlash b16

General · z-lab

Q8_0 · Excellent

1.1 GB · 5% of RAM · ~298 tok/s · 0.54B params

pii extractor gemma 3 270m it

General · jakobhuss

Q8_0 · Excellent

0.8 GB · 3% of RAM · ~596 tok/s · 0.27B params

SmolLM3 3B Base

General · huggingfacetb

Q8_0 · Excellent

3.9 GB · 16% of RAM · ~52 tok/s · 3.08B params

Isaac 0.2 2B Preview

General · perceptronai

Q8_0 · Excellent

3.4 GB · 14% of RAM · ~63 tok/s · 2.57B params

General · datalab-to

Q8_0 · Excellent

6.4 GB · 27% of RAM · ~30 tok/s · 5.3B params

General · perceptronai

Q8_0 · Excellent

3.4 GB · 14% of RAM · ~63 tok/s · 2.57B params

General · zju-ai4h

Q8_0 · Excellent

5.9 GB · 25% of RAM · ~33 tok/s · 4.83B params

General · Alibaba

Q8_0 · Excellent

1.2 GB · 5% of RAM · ~259 tok/s · 0.62B params

gemma 3 12b it quantized.w4a16

General · redhatai

Q8_0 · Excellent

4.8 GB · 20% of RAM · ~42 tok/s · 3.86B params

granite 4.0 350m

General · ibm-granite

Q8_0 · Excellent

0.9 GB · 4% of RAM · ~459 tok/s · 0.35B params

General · lmstudio-community

mlx-8bit · Excellent

3.0 GB · 12% of RAM · ~73 tok/s · 2.31B params

General · tristepin

Q8_0 · Excellent

5.7 GB · 24% of RAM · ~35 tok/s · 4.66B params

General · lmstudio-community

mlx-8bit · Excellent

6.0 GB · 25% of RAM · ~33 tok/s · 5.12B params

General · salesforce

Q8_0 · Excellent

2.2 GB · 9% of RAM · ~104 tok/s · 1.54B params

General · Alibaba

Q8_0 · Excellent

2.8 GB · 12% of RAM · ~79 tok/s · 2.03B params

General · lmstudio-community

mlx-8bit · Excellent

1.9 GB · 8% of RAM · ~132 tok/s · 1.28B params

General · lmstudio-community

mlx-8bit · Excellent

1.0 GB · 4% of RAM · ~352 tok/s · 0.48B params

General · idea-research

Q8_0 · Excellent

5.0 GB · 21% of RAM · ~40 tok/s · 4.07B params

Qwen3.5 4B AWQ BF16 INT4

General · cyankiwi

Q8_0 · Excellent

5.8 GB · 24% of RAM · ~34 tok/s · 4.74B params

Qwen3 1.7B MLX bf16

General · lmstudio-community

mlx-8bit · Excellent

2.3 GB · 10% of RAM · ~98 tok/s · 1.72B params

General · Alibaba

Q8_0 · Excellent

2.6 GB · 11% of RAM · ~87 tok/s · 1.84B params

Llama 3.1 Nemotron Nano 4B v1.1

General · nvidia

Q8_0 · Excellent

5.5 GB · 23% of RAM · ~36 tok/s · 4.51B params

General · primeintellect

Q8_0 · Excellent

1.3 GB · 6% of RAM · ~214 tok/s · 0.75B params

General · quanttrio

Q8_0 · Excellent

5.7 GB · 24% of RAM · ~35 tok/s · 4.66B params

Heron NVILA Lite 1B hf

General · turing-motors

Q8_0 · Excellent

1.5 GB · 6% of RAM · ~177 tok/s · 0.91B params

tiny random minicpm v 4_5

General · optimum-intel-internal-testing

Q8_0 · Excellent

0.5 GB · 2% of RAM · ~8,040 tok/s · 0.02B params

Huihui Qwen3.5 4B abliterated

General · huihui-ai

Q8_0 · Excellent

5.6 GB · 23% of RAM · ~35 tok/s · 4.54B params

InternVL3_5 1B HF

General · opengvlab

Q8_0 · Excellent

1.7 GB · 7% of RAM · ~152 tok/s · 1.06B params

vllm translategemma 4b it

General · infomaniak-ai

Q8_0 · Excellent

6.0 GB · 25% of RAM · ~32 tok/s · 4.97B params

InternVL3_5 2B HF

General · opengvlab

Q8_0 · Excellent

3.1 GB · 13% of RAM · ~68 tok/s · 2.35B params

InternVL3 2B hf

General · opengvlab

Q8_0 · Excellent

2.8 GB · 12% of RAM · ~77 tok/s · 2.09B params

GLM 4.6V Flash AWQ 8bit

General · cyankiwi

Q8_0 · Excellent

5.4 GB · 23% of RAM · ~36 tok/s · 4.43B params

Huihui Qwen3.5 4B Claude 4.6 Opus abliterated

General · huihui-ai

Q8_0 · Excellent

5.7 GB · 24% of RAM · ~35 tok/s · 4.66B params

General · hcompany

Q8_0 · Excellent

5.5 GB · 23% of RAM · ~36 tok/s · 4.44B params

InternVL2_5 2B MPO hf

General · opengvlab

Q8_0 · Excellent

3.0 GB · 12% of RAM · ~73 tok/s · 2.21B params

NuExtract 2.0 2B

General · numind

Q8_0 · Excellent

3.0 GB · 12% of RAM · ~73 tok/s · 2.21B params

granite 4.0 3b vision

General · ibm-granite

Q8_0 · Excellent

5.0 GB · 21% of RAM · ~40 tok/s · 4B params

General · thisisiron

Q8_0 · Excellent

1.8 GB · 7% of RAM · ~142 tok/s · 1.13B params

Gemma 4 26B A4B JANG_4M CRACK

General · dealignai

Q8_0 · Excellent

5.8 GB · 24% of RAM · ~34 tok/s · 4.72B params

Qwen2.5 VL 3B Instruct

Chat · Alibaba · 2025-01-26

Q8_0 · Excellent

4.7 GB · 20% of RAM · ~43 tok/s · 3.75B params

LightOnOCR 2 1B

General · lightonai

Q8_0 · Excellent

1.6 GB · 7% of RAM · ~159 tok/s · 1.01B params

Qwen2.5 Coder 1.5B Instruct

Coding · Alibaba · 2024-09-18

Q8_0 · Excellent

2.2 GB · 9% of RAM · ~104 tok/s · 1.54B params

General · Alibaba

Q8_0 · Excellent

3.9 GB · 16% of RAM · ~52 tok/s · 3.09B params

Qwen2.5 Coder 0.5B Instruct

Coding · Alibaba

Q8_0 · Excellent

1.0 GB · 4% of RAM · ~328 tok/s · 0.49B params

Qwen2.5 Coder 3B Instruct

Coding · Alibaba

Q8_0 · Excellent

3.9 GB · 16% of RAM · ~52 tok/s · 3.09B params

QVikhr 3 1.7B Instruction noreasoning

Reasoning · vikhrmodels

Q8_0 · Excellent

2.4 GB · 10% of RAM · ~93 tok/s · 1.72B params

Qwen2.5 Coder 14B Instruct

Coding · lmstudio-community

mlx-8bit · Excellent

3.0 GB · 12% of RAM · ~73 tok/s · 2.31B params

MinerU2.5 2509 1.2B

General · opendatalab

Q8_0 · Excellent

1.8 GB · 7% of RAM · ~139 tok/s · 1.16B params

Qwen2.5 Coder 3B

Coding · Alibaba

Q8_0 · Excellent

3.9 GB · 16% of RAM · ~52 tok/s · 3.09B params

Falcon H1 0.5B Base

General · TII

Q8_0 · Excellent

1.1 GB · 5% of RAM · ~309 tok/s · 0.52B params

Qwen2.5 Coder 0.5B

Coding · Alibaba

Q8_0 · Excellent

1.0 GB · 4% of RAM · ~328 tok/s · 0.49B params

Phi 4 reasoning plus

Reasoning · lmstudio-community

mlx-8bit · Excellent

2.9 GB · 12% of RAM · ~74 tok/s · 2.29B params

General · salesforce

Q8_0 · Excellent

3.9 GB · 16% of RAM · ~52 tok/s · 3.09B params

LightOnOCR 2 1B base

General · lightonai

Q8_0 · Excellent

1.6 GB · 7% of RAM · ~159 tok/s · 1.01B params

LFM2 Audio 1.5B

General · Liquid AI · 2025-08-28

Q8_0 · Excellent

2.1 GB · 9% of RAM · ~109 tok/s · 1.47B params

Qwen3 VL 2B Instruct

Chat · Alibaba

Q8_0 · Excellent

2.9 GB · 12% of RAM · ~75 tok/s · 2.13B params

Llama 3.2 1B Instruct FP8 dynamic

Chat · redhatai

Q8_0 · Excellent

2.2 GB · 9% of RAM · ~107 tok/s · 1.5B params

General · huggingfacetb

Q8_0 · Excellent

0.6 GB · 3% of RAM · ~1,237 tok/s · 0.13B params

Llama 3.2 1B Instruct FP8

Chat · redhatai

Q8_0 · Excellent

2.2 GB · 9% of RAM · ~107 tok/s · 1.5B params

General · Alibaba

Q8_0 · Excellent

5.0 GB · 21% of RAM · ~40 tok/s · 4.02B params

Qwen3 VL 4B Instruct AWQ 4bit

Chat · cyankiwi

Q8_0 · Excellent

2.5 GB · 10% of RAM · ~91 tok/s · 1.76B params

tiny random Gemma2ForCausalLM

General · hmellor

Q8_0 · Excellent

0.5 GB · 2% of RAM · ~16,081 tok/s · 0.01B params

Qwen3 VL 4B Instruct

Chat · lmstudio-community

mlx-8bit · Excellent

1.6 GB · 7% of RAM · ~162 tok/s · 1.04B params

LFM2.5 1.2B Instruct

Chat · lmstudio-community

mlx-8bit · Excellent

0.9 GB · 4% of RAM · ~512 tok/s · 0.33B params

DeepSeek R1 0528 Qwen3 8B

Reasoning · DeepSeek

Q8_0 · Excellent

9.6 GB · 40% of RAM · ~20 tok/s · 8.19B params

Qwen3 30B A3B Instruct 2507 AWQ

Chat · stelterlab

Q8_0 · Excellent

5.6 GB · 24% of RAM · ~319 tok/s · 4.61B params

General · huggingfacetb

Q8_0 · Excellent

2.4 GB · 10% of RAM · ~94 tok/s · 1.71B params

Qwen3 VL 8B Instruct AWQ 4bit

Chat · cyankiwi

Q8_0 · Excellent

3.7 GB · 16% of RAM · ~55 tok/s · 2.91B params

General · huggingfacetb

Q8_0 · Excellent

0.9 GB · 4% of RAM · ~447 tok/s · 0.36B params

starvector 1b im2svg

General · starvector

Q8_0 · Excellent

2.1 GB · 9% of RAM · ~112 tok/s · 1.43B params

General · efficient-large-model

Q8_0 · Excellent

3.4 GB · 14% of RAM · ~62 tok/s · 2.61B params

Qwen3 4B SafeRL

General · Alibaba

Q8_0 · Excellent

5.4 GB · 23% of RAM · ~36 tok/s · 4.41B params

Qwen3 4B Instruct 2507

Chat · lmstudio-community

mlx-8bit · Excellent

1.2 GB · 5% of RAM · ~268 tok/s · 0.63B params

General · yannqi

Q8_0 · Excellent

5.9 GB · 24% of RAM · ~33 tok/s · 4.82B params

Qwen2.5 Coder 32B Instruct

Coding · lmstudio-community

mlx-8bit · Excellent

6.0 GB · 25% of RAM · ~33 tok/s · 5.12B params

General · Alibaba

Q8_0 · Excellent

5.4 GB · 23% of RAM · ~36 tok/s · 4.41B params

Qwen3 VL 2B Instruct FP8

Chat · Alibaba

Q8_0 · Excellent

3.2 GB · 13% of RAM · ~66 tok/s · 2.44B params

General · nvidia

Q8_0 · Excellent

5.8 GB · 24% of RAM · ~34 tok/s · 4.72B params

Qwen3 30B A3B Instruct 2507 AWQ 4bit

Chat · cyankiwi

Q8_0 · Excellent

6.4 GB · 27% of RAM · ~277 tok/s · 5.31B params

General · bytedance-seed

Q8_0 · Excellent

11.0 GB · 46% of RAM · ~215 tok/s · 9.37B params

Vikhr Llama 3.2 1B Instruct

Chat · vikhrmodels

Q8_0 · Excellent

1.9 GB · 8% of RAM · ~130 tok/s · 1.24B params

Reasoning · stevenhh2000

Q8_0 · Excellent

9.7 GB · 41% of RAM · ~19 tok/s · 8.29B params

Cosmos Reason2 2B W4A16 Edge2

Embedding · embedl

Q8_0 · Excellent

2.9 GB · 12% of RAM · ~75 tok/s · 2.14B params

Qwen3 VL 30B A3B Instruct AWQ 4bit

Chat · cyankiwi

Q8_0 · Excellent

7.0 GB · 29% of RAM · ~345 tok/s · 5.85B params

InternVL3_5 4B HF

General · opengvlab

Q8_0 · Excellent

5.8 GB · 24% of RAM · ~34 tok/s · 4.73B params

General · intel

Q8_0 · Excellent

3.6 GB · 15% of RAM · ~57 tok/s · 2.82B params

General · Alibaba · 2026-02-27

Q8_0 · Excellent

11.3 GB · 47% of RAM · ~17 tok/s · 9.65B params

General · DeepSeek

Q8_0 · Excellent

4.2 GB · 18% of RAM · ~48 tok/s · 3.34B params

General · DeepSeek

Q8_0 · Excellent

4.3 GB · 18% of RAM · ~47 tok/s · 3.39B params

Phi 4 multimodal instruct

Chat · Microsoft · 2025-02-24

Q8_0 · Excellent

6.7 GB · 28% of RAM · ~29 tok/s · 5.57B params

Qwen3 Coder Next AWQ 4bit

Coding · cyankiwi

Q8_0 · Excellent

16.6 GB · 69% of RAM · ~162 tok/s · 14.44B params

GigaChat3 10B A1.8B

Chat · ai-sage

Q8_0 · Excellent

13.3 GB · 55% of RAM · ~216 tok/s · 11.48B params

Qwen3 Coder Next AWQ 4bit

Coding · bullpoint

Q8_0 · Excellent

16.6 GB · 69% of RAM · ~162 tok/s · 14.44B params

Qwen3.5 9B Claude 4.6 Opus Reasoning Distilled

Reasoning · jackrong

Q8_0 · Excellent

11.3 GB · 47% of RAM · ~17 tok/s · 9.65B params

Qwen3.5 9B Base

General · Alibaba · 2026-02-26

Q8_0 · Excellent

11.3 GB · 47% of RAM · ~17 tok/s · 9.65B params

Coding · BigCode

Q8_0 · Excellent

3.9 GB · 16% of RAM · ~53 tok/s · 3.03B params

Llama 3.2 3B Instruct FP8

Chat · redhatai

Q8_0 · Excellent

4.5 GB · 19% of RAM · ~45 tok/s · 3.61B params

Qwen3.5 9B Claude 4.6 Opus Reasoning Distilled v2

Reasoning · jackrong

Q8_0 · Excellent

11.3 GB · 47% of RAM · ~17 tok/s · 9.65B params

Qwen2.5 VL 3B Instruct abliterated

Chat · huihui-ai

Q8_0 · Excellent

4.7 GB · 20% of RAM · ~43 tok/s · 3.75B params

Qwen3.5 9B gemini 3.1 opus 4.6 reasoning

Reasoning · momix-44

Q8_0 · Excellent

11.0 GB · 46% of RAM · ~17 tok/s · 9.41B params

Qwen2.5 1.5B Instruct

Chat · Alibaba

Q8_0 · Excellent

2.2 GB · 9% of RAM · ~104 tok/s · 1.54B params

Qwen3 4B Instruct 2507

Chat · Alibaba

Q8_0 · Excellent

5.0 GB · 21% of RAM · ~40 tok/s · 4.02B params

Qwen2.5 0.5B Instruct

Chat · Alibaba

Q8_0 · Excellent

1.0 GB · 4% of RAM · ~328 tok/s · 0.49B params

Qwen2 1.5B Instruct

Chat · Alibaba

Q8_0 · Excellent

2.2 GB · 9% of RAM · ~104 tok/s · 1.54B params

Qwen3 VL 4B Instruct

Chat · Alibaba

Q8_0 · Excellent

5.5 GB · 23% of RAM · ~36 tok/s · 4.44B params

Qwen3 4B Instruct 2507 FP8

Chat · Alibaba

Q8_0 · Excellent

5.4 GB · 23% of RAM · ~36 tok/s · 4.41B params

Phi 3.5 mini instruct

Chat · Microsoft · 2024-08-16

Q8_0 · Excellent

4.8 GB · 20% of RAM · ~42 tok/s · 3.82B params

Qwen2 0.5B Instruct

Chat · Alibaba

Q8_0 · Excellent

1.0 GB · 4% of RAM · ~328 tok/s · 0.49B params

Qwen3 4B Instruct 2507 GPTQ Int4

Chat · junhowie

Q8_0 · Excellent

5.0 GB · 21% of RAM · ~40 tok/s · 4.02B params

Qwen3 Next 80B A3B Thinking AWQ 4bit

General · cyankiwi

Q8_0 · Excellent

16.9 GB · 71% of RAM · ~159 tok/s · 14.74B params

Qwen2.5 0.5B Instruct

Chat · gensyn

Q8_0 · Excellent

1.0 GB · 4% of RAM · ~328 tok/s · 0.49B params

Qwen1.5 0.5B Chat

Chat · Alibaba

Q8_0 · Excellent

1.2 GB · 5% of RAM · ~259 tok/s · 0.62B params

Nemotron H 4B Instruct 128K

Chat · nvidia

Q8_0 · Excellent

5.5 GB · 23% of RAM · ~36 tok/s · 4.49B params

Qwen1.5 1.8B Chat

Chat · Alibaba

Q8_0 · Excellent

2.6 GB · 11% of RAM · ~87 tok/s · 1.84B params

EXAONE 3.5 2.4B Instruct

Chat · lgai-exaone

Q8_0 · Excellent

3.2 GB · 13% of RAM · ~67 tok/s · 2.41B params

Chat · jetlm

Q8_0 · Excellent

2.8 GB · 12% of RAM · ~79 tok/s · 2.03B params

General · Liquid AI · 2026-02-24

Q5_K_M · Tight

18.3 GB · 76% of RAM · ~108 tok/s · 23.84B params

Try Q4_K_M (15.9 GB, ~127 tok/s)

Gemma 4 31B JANG_4M CRACK

General · dealignai

Q8_0 · Excellent

7.7 GB · 32% of RAM · ~25 tok/s · 6.43B params

Qwen3 VL 4B Instruct FP8

Chat · Alibaba

Q8_0 · Excellent

5.9 GB · 25% of RAM · ~33 tok/s · 4.83B params

tiny_starcoder_py

Coding · BigCode

Q8_0 · Excellent

0.7 GB · 3% of RAM · ~1,005 tok/s · 0.16B params

Qwen2.5 VL 3B Instruct quantized.w8a8

Chat · redhatai

Q8_0 · Excellent

5.0 GB · 21% of RAM · ~40 tok/s · 4.07B params

Qwen2 VL 2B Instruct AWQ

Chat · Alibaba

Q8_0 · Excellent

3.2 GB · 13% of RAM · ~66 tok/s · 2.44B params

Mistral Small 3.1 24B Instruct 2503 GPTQ 4b 128g

Chat · ista-daslab

Q8_0 · Excellent

5.8 GB · 24% of RAM · ~34 tok/s · 4.73B params

Qari OCR v0.3 VL 2B Instruct

Chat · namaa-space

Q8_0 · Excellent

3.0 GB · 12% of RAM · ~73 tok/s · 2.21B params

Qwen2.5 VL 3B Instruct FP8 dynamic

Chat · redhatai

Q8_0 · Excellent

5.0 GB · 21% of RAM · ~40 tok/s · 4.07B params

Huihui Qwen3 VL 4B Instruct abliterated

Chat · huihui-ai

Q8_0 · Excellent

5.5 GB · 23% of RAM · ~36 tok/s · 4.44B params

Qwen3 VL 4B Instruct Unredacted MAX

Chat · prithivmlmods

Q8_0 · Excellent

5.5 GB · 23% of RAM · ~36 tok/s · 4.44B params

Qwen2 VL OCR2 2B Instruct

Chat · prithivmlmods

Q8_0 · Excellent

3.0 GB · 12% of RAM · ~73 tok/s · 2.21B params

Qwen2.5 3B Instruct

Chat · Alibaba

Q8_0 · Excellent

3.9 GB · 16% of RAM · ~52 tok/s · 3.09B params

General · vikhyatk

Q8_0 · Excellent

2.7 GB · 11% of RAM · ~83 tok/s · 1.93B params

General · Meta · 2024-09-18

Q8_0 · Excellent

1.9 GB · 8% of RAM · ~130 tok/s · 1.24B params

General · opengvlab

Q8_0 · Excellent

3.0 GB · 12% of RAM · ~73 tok/s · 2.21B params

Florence 2 large

General · Microsoft

Q8_0 · Excellent

1.4 GB · 6% of RAM · ~206 tok/s · 0.78B params

h2ovl mississippi 800m

General · h2oai

Q8_0 · Excellent

1.4 GB · 6% of RAM · ~194 tok/s · 0.83B params

h2ovl mississippi 2b

General · h2oai

Q8_0 · Excellent

2.9 GB · 12% of RAM · ~75 tok/s · 2.15B params

Qwen2.5 Math 1.5B

General · Alibaba

Q8_0 · Excellent

2.2 GB · 9% of RAM · ~104 tok/s · 1.54B params

General · ibm-research

Q8_0 · Excellent

4.3 GB · 18% of RAM · ~199 tok/s · 3.37B params

t5gemma s s prefixlm

General · Google

Q8_0 · Excellent

0.8 GB · 4% of RAM · ~519 tok/s · 0.31B params

General · opengvlab

Q8_0 · Excellent

1.5 GB · 6% of RAM · ~171 tok/s · 0.94B params

llava onevision qwen2 0.5b ov hf

General · llava-hf

Q8_0 · Excellent

1.5 GB · 6% of RAM · ~181 tok/s · 0.89B params

General · allenai

Q8_0 · Excellent

2.2 GB · 9% of RAM · ~109 tok/s · 1.48B params

kanana 1.5 v 3b instruct

Chat · kakaocorp

Q8_0 · Excellent

4.6 GB · 19% of RAM · ~44 tok/s · 3.67B params

General · state-spaces

Q8_0 · Excellent

0.6 GB · 3% of RAM · ~1,237 tok/s · 0.13B params

General · bigscience

Q8_0 · Excellent

1.1 GB · 5% of RAM · ~287 tok/s · 0.56B params

paligemma 3b mix 224

General · Google

Q8_0 · Excellent

3.8 GB · 16% of RAM · ~55 tok/s · 2.92B params

General · Google

Q8_0 · Excellent

3.3 GB · 14% of RAM · ~64 tok/s · 2.51B params

General · vikp

Q8_0 · Excellent

0.8 GB · 4% of RAM · ~519 tok/s · 0.31B params

General · opengvlab

Q8_0 · Excellent

1.5 GB · 6% of RAM · ~171 tok/s · 0.94B params

General · Microsoft

Q8_0 · Excellent

1.3 GB · 6% of RAM · ~217 tok/s · 0.74B params

General · generalanalysis

Q8_0 · Excellent

1.2 GB · 5% of RAM · ~268 tok/s · 0.6B params

tiny random internvl2

General · optimum-intel-internal-testing

Q8_0 · Excellent

0.5 GB · 2% of RAM · ~8,040 tok/s · 0.02B params

gemma 1.1 2b it

General · Google

Q8_0 · Excellent

3.3 GB · 14% of RAM · ~64 tok/s · 2.51B params

Qwen3 8B speculator.eagle3

General · redhatai

Q8_0 · Excellent

1.6 GB · 7% of RAM · ~158 tok/s · 1.02B params

t5gemma 2 1b 1b

General · Google

Q8_0 · Excellent

2.9 GB · 12% of RAM · ~76 tok/s · 2.12B params

General · Google

Q8_0 · Excellent

0.8 GB · 3% of RAM · ~596 tok/s · 0.27B params

Minnow Math 1.5B

General · kitefishai

Q8_0 · Excellent

2.3 GB · 10% of RAM · ~99 tok/s · 1.63B params

General · opengvlab

Q8_0 · Excellent

3.0 GB · 12% of RAM · ~73 tok/s · 2.21B params

Llama Guard 3 1B

General · Meta

Q8_0 · Excellent

2.2 GB · 9% of RAM · ~107 tok/s · 1.5B params

General · bigscience

Q8_0 · Excellent

2.4 GB · 10% of RAM · ~93 tok/s · 1.72B params

InternVL3_5 GPT OSS 20B A4B Preview

General · opengvlab

Q8_0 · Excellent

0.9 GB · 4% of RAM · ~412 tok/s · 0.39B params

Qwen3.5 9B NVFP4

General · axionml

Q8_0 · Excellent

7.9 GB · 33% of RAM · ~24 tok/s · 6.63B params

General · aidc-ai

Q8_0 · Excellent

1.9 GB · 8% of RAM · ~127 tok/s · 1.27B params

Florence 2 large

General · florence-community

Q8_0 · Excellent

1.4 GB · 6% of RAM · ~206 tok/s · 0.78B params

stablelm 3b 4e1t

General · Stability AI

Q8_0 · Excellent

3.6 GB · 15% of RAM · ~57 tok/s · 2.8B params

SmolVLM Instruct

Chat · huggingfacetb

Q8_0 · Excellent

3.0 GB · 13% of RAM · ~71 tok/s · 2.25B params

Qwen3.5 122B A10B heretic int4 AutoRound

General · happypatrick

Q6_K · Excellent

16.5 GB · 69% of RAM · ~143 tok/s · 18.54B params

gemma 2 2b jpn it

General · Google

Q8_0 · Excellent

3.4 GB · 14% of RAM · ~62 tok/s · 2.61B params

General · stepfun-ai

Q8_0 · Excellent

1.1 GB · 5% of RAM · ~287 tok/s · 0.56B params

falcon mamba tiny dev

General · TII

Q8_0 · Excellent

0.5 GB · 2% of RAM · ~16,081 tok/s · 0.01B params

General · opengvlab

Q8_0 · Excellent

1.5 GB · 6% of RAM · ~171 tok/s · 0.94B params

t5gemma 2 270m 270m

General · Google

Q8_0 · Excellent

1.4 GB · 6% of RAM · ~204 tok/s · 0.79B params

gemma 3 27b it quantized.w4a16

General · redhatai

Q8_0 · Excellent

7.9 GB · 33% of RAM · ~24 tok/s · 6.64B params

paligemma 3b mix 224

General · fal

Q8_0 · Excellent

3.8 GB · 16% of RAM · ~55 tok/s · 2.92B params

moondream 2b 2025 04 14 4bit

General · moondream

Q8_0 · Excellent

2.0 GB · 8% of RAM · ~123 tok/s · 1.31B params

General · ahmed-masry

Q8_0 · Excellent

3.8 GB · 16% of RAM · ~55 tok/s · 2.92B params

Perception LM 1B

General · facebook

Q8_0 · Excellent

2.2 GB · 9% of RAM · ~105 tok/s · 1.53B params

General · zai-org

Q8_0 · Excellent

2.8 GB · 12% of RAM · ~78 tok/s · 2.07B params

Vintern 1B v3_5

General · 5cd-ai

Q8_0 · Excellent

1.5 GB · 6% of RAM · ~171 tok/s · 0.94B params

General · Meta · 2024-09-18

Q8_0 · Excellent

4.1 GB · 17% of RAM · ~50 tok/s · 3.21B params

General · bigscience

Q8_0 · Excellent

1.1 GB · 5% of RAM · ~287 tok/s · 0.56B params

pythia 70m deduped

General · eleutherai

Q8_0 · Excellent

0.6 GB · 3% of RAM · ~1,608 tok/s · 0.1B params

SmolLM2 135M Instruct

Chat · huggingfacetb

Q8_0 · Excellent

0.6 GB · 3% of RAM · ~1,237 tok/s · 0.13B params

General · eleutherai

Q8_0 · Excellent

0.7 GB · 3% of RAM · ~1,072 tok/s · 0.15B params

General · salesforce

Q8_0 · Excellent

4.7 GB · 19% of RAM · ~43 tok/s · 3.74B params

NVIDIA Nemotron Nano 9B v2

General · nvidia · 2025-08-12

Q8_0 · Excellent

10.4 GB · 43% of RAM · ~18 tok/s · 8.89B params

General · Google · 2024-07-16

Q8_0 · Excellent

3.4 GB · 14% of RAM · ~62 tok/s · 2.61B params

SmolVLM 256M Instruct

Chat · huggingfacetb

Q8_0 · Excellent

0.8 GB · 3% of RAM · ~618 tok/s · 0.26B params

DeepSeek V2 Lite

General · DeepSeek

Q8_0 · Excellent

18.0 GB · 75% of RAM · ~74 tok/s · 15.71B params

japanese gpt neox small

General · rinna

Q8_0 · Excellent

0.7 GB · 3% of RAM · ~804 tok/s · 0.2B params

Qwen1.5 MoE A2.7B

General · Alibaba

Q8_0 · Excellent

16.5 GB · 69% of RAM · ~99 tok/s · 14.32B params

General · eleutherai

Q8_0 · Excellent

1.1 GB · 4% of RAM · ~315 tok/s · 0.51B params

General · huggingfacetb

Q8_0 · Excellent

0.6 GB · 3% of RAM · ~1,237 tok/s · 0.13B params

General · peft-internal-testing

Q8_0 · Excellent

0.6 GB · 3% of RAM · ~1,237 tok/s · 0.13B params

SmolVLM2 256M Video Instruct

Chat · huggingfacetb

Q8_0 · Excellent

0.8 GB · 3% of RAM · ~618 tok/s · 0.26B params

General · eleutherai

Q8_0 · Excellent

2.2 GB · 9% of RAM · ~106 tok/s · 1.52B params

General · ai-sweden-models

Q8_0 · Excellent

0.7 GB · 3% of RAM · ~846 tok/s · 0.19B params

General · eleutherai

Q8_0 · Excellent

0.5 GB · 2% of RAM · ~16,081 tok/s · 0.01B params

pythia 160m deduped

General · eleutherai

Q8_0 · Excellent

0.7 GB · 3% of RAM · ~766 tok/s · 0.21B params

General · eleutherai

Q8_0 · Excellent

1.7 GB · 7% of RAM · ~149 tok/s · 1.08B params

Qwen2.5 Coder 7B

Coding · Alibaba

Q8_0 · Excellent

9.0 GB · 38% of RAM · ~21 tok/s · 7.62B params

General · Microsoft

Q8_0 · Excellent

2.1 GB · 9% of RAM · ~113 tok/s · 1.42B params

DeepSeek Coder V2 Lite Instruct FP8

Coding · redhatai

Q8_0 · Excellent

18.0 GB · 75% of RAM · ~74 tok/s · 15.71B params

General · huggingfacetb

Q8_0 · Excellent

2.4 GB · 10% of RAM · ~94 tok/s · 1.71B params

General · jackfram

Q8_0 · Excellent

0.7 GB · 3% of RAM · ~1,005 tok/s · 0.16B params

General · ibm-research

Q8_0 · Excellent

4.4 GB · 18% of RAM · ~46 tok/s · 3.51B params

paligemma2 3b pt 448

General · Google

Q8_0 · Excellent

3.9 GB · 16% of RAM · ~53 tok/s · 3.03B params

SmolVLM 500M Instruct

Chat · huggingfacetb

Q8_0 · Excellent

1.1 GB · 4% of RAM · ~315 tok/s · 0.51B params

tiny aya global

General · coherelabs

Q8_0 · Excellent

4.2 GB · 18% of RAM · ~48 tok/s · 3.35B params

DanTagGen delta rev2

General · kblueleaf

Q8_0 · Excellent

0.9 GB · 4% of RAM · ~412 tok/s · 0.39B params

General · eleutherai

Q8_0 · Excellent

2.0 GB · 8% of RAM · ~117 tok/s · 1.37B params

General · eleutherai

Q8_0 · Excellent

0.5 GB · 2% of RAM · ~5,360 tok/s · 0.03B params

General · liautoad

Q8_0 · Excellent

4.8 GB · 20% of RAM · ~42 tok/s · 3.84B params

h2o danube3 500m chat

Chat · h2oai

Q8_0 · Excellent

1.1 GB · 4% of RAM · ~315 tok/s · 0.51B params

paligemma2 3b ft docci 448

General · Google

Q8_0 · Excellent

3.9 GB · 16% of RAM · ~53 tok/s · 3.03B params

paligemma2 3b mix 224

General · Google

Q8_0 · Excellent

3.9 GB · 16% of RAM · ~53 tok/s · 3.03B params

tinyllama oneshot w8w8 test static shape change

General · nm-testing

Q8_0 · Excellent

1.7 GB · 7% of RAM · ~146 tok/s · 1.1B params

deepseek vl2 tiny

General · isotr0py

Q8_0 · Excellent

4.3 GB · 18% of RAM · ~48 tok/s · 3.37B params

General · kblueleaf

Q8_0 · Excellent

1.1 GB · 4% of RAM · ~315 tok/s · 0.51B params

General · allenai

Q8_0 · Excellent

1.8 GB · 8% of RAM · ~136 tok/s · 1.18B params

pythia 410m deduped

General · eleutherai

Q8_0 · Excellent

1.1 GB · 4% of RAM · ~315 tok/s · 0.51B params

InternVL2_5 4B MPO

General · opengvlab

Q8_0 · Excellent

4.6 GB · 19% of RAM · ~43 tok/s · 3.71B params

General · opengvlab

Q8_0 · Excellent

4.6 GB · 19% of RAM · ~43 tok/s · 3.71B params

Mono InternVL 2B

General · opengvlab

Q8_0 · Excellent

4.0 GB · 17% of RAM · ~52 tok/s · 3.11B params

Vintern 3B R beta

General · 5cd-ai

Q8_0 · Excellent

4.6 GB · 19% of RAM · ~43 tok/s · 3.71B params

Crow 9B Opus 4.6 Distill Heretic_Qwen3.5 NVFP4

General · rhoninseiei

Q8_0 · Excellent

8.6 GB · 36% of RAM · ~22 tok/s · 7.3B params

Qwen2.5 Coder 7B Instruct

Coding · Alibaba · 2024-09-17

Q8_0 · Excellent

9.0 GB · 38% of RAM · ~21 tok/s · 7.62B params

General · Microsoft

Q8_0 · Excellent

3.6 GB · 15% of RAM · ~58 tok/s · 2.78B params

General · Alibaba

Q8_0 · Excellent

9.0 GB · 38% of RAM · ~21 tok/s · 7.62B params

General · Google · 2026-05-23

Q8_0 · Excellent

13.8 GB · 58% of RAM · ~13 tok/s · 11.96B params

Florence 2 base

General · Microsoft

Q8_0 · Excellent

0.8 GB · 3% of RAM · ~699 tok/s · 0.23B params

DeepSeek Coder V2 Lite Instruct

Coding · DeepSeek · 2024-06-14

Q8_0 · Excellent

18.0 GB · 75% of RAM · ~67 tok/s · 15.71B params

General · eleutherai

Q8_0 · Excellent

3.5 GB · 15% of RAM · ~59 tok/s · 2.72B params

CodeLlama 7b Instruct hf

Coding · codellama

Q8_0 · Excellent

8.0 GB · 33% of RAM · ~24 tok/s · 6.74B params

General · Google

Q8_0 · Excellent

5.3 GB · 22% of RAM · ~37 tok/s · 4.3B params

General · eleutherai

Q8_0 · Excellent

3.7 GB · 16% of RAM · ~55 tok/s · 2.91B params

Coding · defog

Q8_0 · Excellent

8.0 GB · 33% of RAM · ~24 tok/s · 6.74B params

General · aidc-ai

Q8_0 · Excellent

5.7 GB · 24% of RAM · ~35 tok/s · 4.62B params

General · Alibaba

Q8_0 · Excellent

9.0 GB · 38% of RAM · ~21 tok/s · 7.62B params

deepseek coder 6.7b instruct

Coding · DeepSeek

Q8_0 · Excellent

8.0 GB · 33% of RAM · ~24 tok/s · 6.74B params

CodeLlama 7b hf

Coding · codellama

Q8_0 · Excellent

8.0 GB · 33% of RAM · ~24 tok/s · 6.74B params

General · Microsoft

Q8_0 · Excellent

0.7 GB · 3% of RAM · ~893 tok/s · 0.18B params

deepseek coder 6.7b base

Coding · DeepSeek

Q8_0 · Excellent

8.0 GB · 33% of RAM · ~24 tok/s · 6.74B params

Florence 2 base ft

General · Microsoft

Q8_0 · Excellent

0.8 GB · 3% of RAM · ~699 tok/s · 0.23B params

Florence 2 large ft

General · Microsoft

Q8_0 · Excellent

1.4 GB · 6% of RAM · ~209 tok/s · 0.77B params

General · baidu

Q8_0 · Excellent

5.8 GB · 24% of RAM · ~34 tok/s · 4.74B params

Ovis1.6 Llama3.2 3B

General · aidc-ai

Q8_0 · Excellent

5.1 GB · 21% of RAM · ~39 tok/s · 4.14B params

General · arnir0

Q8_0 · Excellent

0.5 GB · 2% of RAM · ~16,081 tok/s · 0.01B params

Florence 2 SD3 Captioner

General · gokaygokay

Q8_0 · Excellent

0.8 GB · 3% of RAM · ~596 tok/s · 0.27B params

Florence 2 Flux

General · gokaygokay

Q8_0 · Excellent

0.8 GB · 3% of RAM · ~596 tok/s · 0.27B params

Florence 2 base

General · florence-community

Q8_0 · Excellent

0.8 GB · 3% of RAM · ~699 tok/s · 0.23B params

General · yifeihu

Q8_0 · Excellent

1.4 GB · 6% of RAM · ~196 tok/s · 0.82B params

InternVL3_5 4B MPO

General · opengvlab

Q8_0 · Excellent

5.8 GB · 24% of RAM · ~34 tok/s · 4.73B params

General · opengvlab

Q8_0 · Excellent

5.1 GB · 21% of RAM · ~39 tok/s · 4.15B params

Qwen3.5 9B NVFP4

General · apolo13x

Q8_0 · Excellent

8.9 GB · 37% of RAM · ~21 tok/s · 7.54B params

General · Alibaba · 2025-04-27

Q8_0 · Excellent

9.6 GB · 40% of RAM · ~20 tok/s · 8.19B params

Mistral 7B v0.1

General · Mistral AI

Q8_0 · Excellent

8.6 GB · 36% of RAM · ~22 tok/s · 7.24B params

General · ggml-org

Q8_0 · Excellent

0.5 GB · 2% of RAM · ~8,425 tok/s · 0.04B params

olmOCR 2 7B 1025

General · allenai

Q8_0 · Excellent

9.7 GB · 41% of RAM · ~19 tok/s · 8.29B params

Meta Llama 3.1 8B FP8

General · redhatai

Q8_0 · Excellent

9.5 GB · 39% of RAM · ~20 tok/s · 8.03B params

General · reducto

Q8_0 · Excellent

9.7 GB · 41% of RAM · ~19 tok/s · 8.29B params

Qwen3 30B A3B NVFP4

General · nvidia

Q8_0 · Excellent

17.9 GB · 75% of RAM · ~94 tok/s · 15.58B params

olmOCR 2 7B 1025 FP8

General · allenai

Q8_0 · Excellent

9.7 GB · 41% of RAM · ~19 tok/s · 8.29B params

Llama 3.1 Nemotron Nano 8B v1

General · nvidia

Q8_0 · Excellent

9.5 GB · 39% of RAM · ~20 tok/s · 8.03B params

General · allenai

Q8_0 · Excellent

8.6 GB · 36% of RAM · ~22 tok/s · 7.3B params

General · bytedance-seed

Q8_0 · Excellent

9.7 GB · 41% of RAM · ~19 tok/s · 8.29B params

Apriel 5B Instruct

Chat · servicenow-ai

Q8_0 · Excellent

5.9 GB · 25% of RAM · ~33 tok/s · 4.83B params

T5_Paraphrase_Paws

General · vamsi

Q8_0 · Excellent

0.7 GB · 3% of RAM · ~731 tok/s · 0.22B params

Hermes 3 Llama 3.1 8B

General · NousResearch

Q8_0 · Excellent

9.5 GB · 39% of RAM · ~20 tok/s · 8.03B params

General · erwanf

Q8_0 · Excellent

0.5 GB · 2% of RAM · ~4,020 tok/s · 0.04B params

NuExtract 2.0 8B

General · numind

Q8_0 · Excellent

9.7 GB · 41% of RAM · ~19 tok/s · 8.29B params

General · xlangai

Q8_0 · Excellent

9.7 GB · 41% of RAM · ~19 tok/s · 8.29B params

MiMo VL 7B RL 2508

General · xiaomimimo

Q8_0 · Excellent

9.8 GB · 41% of RAM · ~19 tok/s · 8.31B params

General · salesforce

Q8_0 · Excellent

8.6 GB · 36% of RAM · ~22 tok/s · 7.24B params

prometheus 7b v2.0

General · prometheus-eval

Q8_0 · Excellent

8.6 GB · 36% of RAM · ~22 tok/s · 7.24B params

General · haochenwang

Q8_0 · Excellent

9.7 GB · 41% of RAM · ~19 tok/s · 8.29B params

Flux Prompt Enhance

General · gokaygokay

Q8_0 · Excellent

0.7 GB · 3% of RAM · ~731 tok/s · 0.22B params

NeuralMonarch 7B

General · mlabonne

Q8_0 · Excellent

8.6 GB · 36% of RAM · ~22 tok/s · 7.24B params

General · parasail-ai

Q8_0 · Excellent

8.6 GB · 36% of RAM · ~22 tok/s · 7.24B params

shisa gamma 7b v1

General · augmxnt

Q8_0 · Excellent

8.6 GB · 36% of RAM · ~22 tok/s · 7.24B params

llama2.c stories15M

General · xenova

Q8_0 · Excellent

0.5 GB · 2% of RAM · ~8,040 tok/s · 0.02B params

General · octomed

Q8_0 · Excellent

9.7 GB · 41% of RAM · ~19 tok/s · 8.29B params

General · typhoon-ai

Q8_0 · Excellent

9.7 GB · 41% of RAM · ~19 tok/s · 8.29B params

General · singh8898

Q8_0 · Excellent

9.7 GB · 41% of RAM · ~19 tok/s · 8.29B params

olmOCR 7B 0825 FP8

General · allenai

Q8_0 · Excellent

9.7 GB · 41% of RAM · ~19 tok/s · 8.29B params

Llama 3.2 1B Instruct

Chat · Meta

Q8_0 · Excellent

1.9 GB · 8% of RAM · ~130 tok/s · 1.24B params

Phi tiny MoE instruct

Chat · Microsoft

Q8_0 · Excellent

4.7 GB · 20% of RAM · ~254 tok/s · 3.76B params

llava v1.6 mistral 7b hf

General · llava-hf

Q8_0 · Excellent

8.9 GB · 37% of RAM · ~21 tok/s · 7.57B params

NVIDIA Nemotron Nano 9B v2 Japanese

General · nvidia

Q8_0 · Excellent

10.4 GB · 43% of RAM · ~18 tok/s · 8.89B params

General · lmstudio-community

mlx-4bit · Great

15.2 GB · 63% of RAM · ~118 tok/s · 23.84B params

gemma 3n E4B it MLX bf16

General · lmstudio-community

mlx-8bit · Excellent

8.9 GB · 37% of RAM · ~22 tok/s · 7.85B params

InternVL3_5 1B Instruct

Chat · opengvlab

Q8_0 · Excellent

1.7 GB · 7% of RAM · ~152 tok/s · 1.06B params

SWE agent LM 7B

General · swe-bench

Q8_0 · Excellent

9.0 GB · 38% of RAM · ~21 tok/s · 7.62B params

General · datalab-to

Q8_0 · Excellent

10.3 GB · 43% of RAM · ~18 tok/s · 8.77B params

General · HuggingFace · 2023-10-26

Q8_0 · Excellent

8.6 GB · 36% of RAM · ~22 tok/s · 7.24B params

Phi mini MoE instruct

Chat · Microsoft

Q8_0 · Excellent

9.0 GB · 38% of RAM · ~125 tok/s · 7.65B params

Qwen2.5 Math 1.5B Instruct

Chat · Alibaba

Q8_0 · Excellent

2.2 GB · 9% of RAM · ~104 tok/s · 1.54B params

General · Alibaba

Q8_0 · Excellent

9.1 GB · 38% of RAM · ~21 tok/s · 7.72B params

NVIDIA Nemotron Nano 9B v2 Base

General · nvidia

Q8_0 · Excellent

10.4 GB · 43% of RAM · ~18 tok/s · 8.89B params

NVIDIA Nemotron Nano 9B v2 FP8

General · nvidia

Q8_0 · Excellent

10.4 GB · 43% of RAM · ~18 tok/s · 8.89B params

Zamba2 1.2B instruct

Chat · zyphra

Q8_0 · Excellent

1.9 GB · 8% of RAM · ~132 tok/s · 1.22B params

General · Alibaba

Q8_0 · Excellent

9.1 GB · 38% of RAM · ~21 tok/s · 7.72B params

Qwen3 VL 8B Thinking

General · Alibaba

Q8_0 · Excellent

10.3 GB · 43% of RAM · ~18 tok/s · 8.77B params

blip2 flan t5 xl

General · salesforce

Q8_0 · Excellent

4.9 GB · 20% of RAM · ~41 tok/s · 3.94B params

llama joycaption beta one hf llava

General · fancyfeast

Q8_0 · Excellent

10.0 GB · 41% of RAM · ~19 tok/s · 8.48B params

OLMoE 1B 7B 0125 Instruct

Chat · allenai

Q8_0 · Excellent

8.2 GB · 34% of RAM · ~138 tok/s · 6.92B params

General · allenai

Q8_0 · Excellent

9.2 GB · 38% of RAM · ~21 tok/s · 7.76B params

General · virtue-ai-hub

Q8_0 · Excellent

9.0 GB · 38% of RAM · ~21 tok/s · 7.62B params

OLMo 2 0425 1B Instruct

Chat · allenai

Q8_0 · Excellent

2.2 GB · 9% of RAM · ~109 tok/s · 1.48B params

Abliterated Llama 3.2 1B Instruct

Chat · cazzz307

Q8_0 · Excellent

1.9 GB · 8% of RAM · ~130 tok/s · 1.24B params

Coding · BigCode · 2024-02-20

Q8_0 · Excellent

8.5 GB · 35% of RAM · ~22 tok/s · 7.17B params

llava v1.6 mistral 7b

General · liuhaotian

Q8_0 · Excellent

8.9 GB · 37% of RAM · ~21 tok/s · 7.57B params

llava med v1.5 mistral 7b

General · Microsoft

Q8_0 · Excellent

8.9 GB · 37% of RAM · ~21 tok/s · 7.57B params

General · inclusionai

Q8_0 · Excellent

10.3 GB · 43% of RAM · ~18 tok/s · 8.77B params

General · llava-hf

Q8_0 · Excellent

8.9 GB · 37% of RAM · ~21 tok/s · 7.57B params

Qwen3 VL 8B Thinking FP8

General · Alibaba

Q8_0 · Excellent

10.3 GB · 43% of RAM · ~18 tok/s · 8.77B params

deepseek vl 1.3b chat

Chat · DeepSeek

Q8_0 · Excellent

2.7 GB · 11% of RAM · ~81 tok/s · 1.98B params

General · mbzuai

Q8_0 · Excellent

10.3 GB · 43% of RAM · ~18 tok/s · 8.77B params

General · thelamapi

Q8_0 · Excellent

10.3 GB · 43% of RAM · ~18 tok/s · 8.77B params

Llama 3.2 3B Instruct

Chat · Meta

Q8_0 · Excellent

4.1 GB · 17% of RAM · ~50 tok/s · 3.21B params

Qwen2.5 VL 7B Instruct

Chat · Alibaba · 2025-01-26

Q8_0 · Excellent

9.7 GB · 41% of RAM · ~19 tok/s · 8.29B params

General · Alibaba

Q8_0 · Excellent

9.6 GB · 40% of RAM · ~20 tok/s · 8.19B params

Phi 3 mini 4k instruct gptq 4bit

Chat · kaitchup

Q8_0 · Excellent

4.8 GB · 20% of RAM · ~42 tok/s · 3.82B params

DeepSeek V2 Lite Chat

Chat · DeepSeek

Q8_0 · Excellent

18.0 GB · 75% of RAM · ~74 tok/s · 15.71B params

Phi 3 mini 4k instruct

Chat · Microsoft

Q8_0 · Excellent

4.8 GB · 20% of RAM · ~42 tok/s · 3.82B params

DeepSeek R1 Distill Qwen 14B

Reasoning · DeepSeek

Q8_0 · Excellent

17.0 GB · 71% of RAM · ~11 tok/s · 14.77B params

Qwen3 14B NVFP4

General · nvidia

Q8_0 · Excellent

9.6 GB · 40% of RAM · ~20 tok/s · 8.16B params

EXAONE Deep 7.8B

General · lgai-exaone

Q8_0 · Excellent

9.2 GB · 38% of RAM · ~21 tok/s · 7.82B params

General · mbzuai

Q8_0 · Excellent

9.7 GB · 41% of RAM · ~19 tok/s · 8.29B params

General · Alibaba

Q8_0 · Excellent

9.6 GB · 40% of RAM · ~20 tok/s · 8.19B params

Qwen3Guard Gen 8B

General · Alibaba

Q8_0 · Excellent

9.6 GB · 40% of RAM · ~20 tok/s · 8.19B params

llava onevision qwen2 7b ov

General · lmms-lab

Q8_0 · Excellent

9.5 GB · 39% of RAM · ~20 tok/s · 8.03B params

xflux_text_encoders

Coding · xlabs-ai

Q8_0 · Excellent

5.8 GB · 24% of RAM · ~34 tok/s · 4.76B params

General · xiaomimimo

Q8_0 · Excellent

9.2 GB · 38% of RAM · ~21 tok/s · 7.83B params

Qwen3 30B A3B Instruct 2507 FP4

Chat · nvfp4

Q8_0 · Excellent

17.9 GB · 75% of RAM · ~94 tok/s · 15.58B params

General · inclusionai

Q8_0 · Excellent

18.6 GB · 78% of RAM · ~124 tok/s · 16.26B params

InternVL3 8B hf

General · opengvlab

Q8_0 · Excellent

9.4 GB · 39% of RAM · ~20 tok/s · 7.94B params

Kunoichi DPO v2 7B

General · sanjiwatsuki

Q8_0 · Excellent

8.6 GB · 36% of RAM · ~22 tok/s · 7.24B params

olmOCR 7B 0225 preview

General · allenai

Q8_0 · Excellent

9.7 GB · 41% of RAM · ~19 tok/s · 8.29B params

SmolLM 135M Instruct

Chat · huggingfacetb

Q8_0 · Excellent

0.6 GB · 3% of RAM · ~1,237 tok/s · 0.13B params

MiniCPM V 2_6 int4

General · openbmb

Q8_0 · Excellent

9.8 GB · 41% of RAM · ~19 tok/s · 8.32B params

General · nytopop

Q8_0 · Excellent

9.6 GB · 40% of RAM · ~20 tok/s · 8.19B params

General · nvidia

Q8_0 · Excellent

9.6 GB · 40% of RAM · ~20 tok/s · 8.19B params

General · Alibaba

Q8_0 · Excellent

9.7 GB · 41% of RAM · ~19 tok/s · 8.29B params

General · inclusionai

Q8_0 · Excellent

18.6 GB · 78% of RAM · ~124 tok/s · 16.26B params

Qwen3 VL 32B Instruct AWQ 4bit

Chat · cyankiwi

Q8_0 · Excellent

8.3 GB · 35% of RAM · ~23 tok/s · 7.03B params

Gliese Qwen3.5 9B Abliterated Caption

General · prithivmlmods

Q8_0 · Excellent

11.0 GB · 46% of RAM · ~17 tok/s · 9.41B params

Qwen3.5 9B Claude 4.6 HighIQ THINKING HERETIC UNCENSORED

General · davidau

Q8_0 · Excellent

11.0 GB · 46% of RAM · ~17 tok/s · 9.41B params

General · bytedance-seed

Q8_0 · Excellent

9.7 GB · 41% of RAM · ~19 tok/s · 8.29B params

General · zju-ai4h

Q8_0 · Excellent

9.5 GB · 39% of RAM · ~20 tok/s · 8.04B params

stablelm 2 1_6b chat

Chat · Stability AI · 2024-04-08

Q8_0 · Excellent

2.3 GB · 10% of RAM · ~98 tok/s · 1.64B params

General · inclusionai · 2025-02-28

Q8_0 · Excellent

19.2 GB · 80% of RAM · ~69 tok/s · 16.8B params

General · openai

Q6_K · Excellent

19.1 GB · 80% of RAM · ~58 tok/s · 21.51B params

General · allenai

Q8_0 · Excellent

10.2 GB · 42% of RAM · ~19 tok/s · 8.66B params

General · quanttrio

Q8_0 · Excellent

11.3 GB · 47% of RAM · ~17 tok/s · 9.65B params

Qwen3.5 9B AWQ 4bit

General · cyankiwi

Q8_0 · Excellent

11.5 GB · 48% of RAM · ~16 tok/s · 9.88B params

General · lovedheart

Q8_0 · Excellent

11.3 GB · 47% of RAM · ~17 tok/s · 9.65B params

General · huggingfacem4

Q8_0 · Excellent

9.9 GB · 41% of RAM · ~19 tok/s · 8.4B params

Dream v0 Instruct 7B

Chat · dream-org

Q8_0 · Excellent

9.0 GB · 38% of RAM · ~21 tok/s · 7.62B params

General · openbmb

Q8_0 · Excellent

10.2 GB · 43% of RAM · ~18 tok/s · 8.7B params

Qwen2.5 7B Instruct 1M

Chat · Alibaba

Q8_0 · Excellent

9.0 GB · 38% of RAM · ~21 tok/s · 7.62B params

Huihui Qwen3.5 9B abliterated

General · huihui-ai

Q8_0 · Excellent

11.3 GB · 47% of RAM · ~17 tok/s · 9.65B params

General · open-bee

Q8_0 · Excellent

10.2 GB · 42% of RAM · ~19 tok/s · 8.68B params

gpt_bigcode santacoder

Coding · BigCode

Q8_0 · Excellent

1.7 GB · 7% of RAM · ~144 tok/s · 1.12B params

Moonlight 16B A3B

General · moonshotai

Q8_0 · Excellent

18.3 GB · 76% of RAM · ~139 tok/s · 15.96B params

InternVL3_5 4B Instruct

Chat · opengvlab

Q8_0 · Excellent

5.8 GB · 24% of RAM · ~34 tok/s · 4.73B params

General · zai-org

Q8_0 · Excellent

12.0 GB · 50% of RAM · ~16 tok/s · 10.29B params

granite 3b code base 2k

Coding · ibm-granite

Q8_0 · Excellent

4.4 GB · 18% of RAM · ~46 tok/s · 3.48B params

InternVL3_5 8B HF

General · opengvlab

Q8_0 · Excellent

10.0 GB · 42% of RAM · ~19 tok/s · 8.53B params

Llama 4 Scout 17B 16E Instruct quantized.w4a16

Chat · redhatai

Q6_K · Excellent

17.5 GB · 73% of RAM · ~98 tok/s · 19.6B params

Qwopus3.5 9B v3

General · jackrong

Q8_0 · Excellent

11.3 GB · 47% of RAM · ~17 tok/s · 9.65B params

Qwen3 14B NVFP4

General · redhatai

Q8_0 · Excellent

10.5 GB · 44% of RAM · ~18 tok/s · 8.99B params

Huihui Qwen3.5 9B Claude 4.6 Opus abliterated

General · huihui-ai

Q8_0 · Excellent

11.3 GB · 47% of RAM · ~17 tok/s · 9.65B params

Qwen3.5 9B AWQ BF16 INT8

General · cyankiwi

Q8_0 · Excellent

11.5 GB · 48% of RAM · ~16 tok/s · 9.83B params

General · allenai

Q8_0 · Excellent

10.2 GB · 42% of RAM · ~19 tok/s · 8.68B params

nomic embed text v1.5

Embedding · Nomic · 2024-02-10

Q8_0 · Excellent

0.7 GB · 3% of RAM · ~1,149 tok/s · 0.14B params

TinyLlama 1.1B Chat v1.0

Chat · Community · 2023-12-30

Q8_0 · Excellent

1.7 GB · 7% of RAM · ~146 tok/s · 1.1B params

Mistral 7B Instruct v0.2

Chat · Mistral AI

Q8_0 · Excellent

8.6 GB · 36% of RAM · ~22 tok/s · 7.24B params

General · Meta

Q8_0 · Excellent

8.0 GB · 33% of RAM · ~24 tok/s · 6.74B params

phi 3 mini 4k instruct

Chat · Microsoft · 2024-04-22

Q8_0 · Excellent

4.8 GB · 20% of RAM · ~42 tok/s · 3.82B params

granite 3.3 8b instruct

Chat · ibm-granite

Q8_0 · Excellent

9.6 GB · 40% of RAM · ~20 tok/s · 8.17B params

saiga_llama3_8b

General · ilyagusev

Q8_0 · Excellent

9.5 GB · 39% of RAM · ~20 tok/s · 8.03B params

Llama 3.1 8B Instruct FP8

Chat · nvidia

Q8_0 · Excellent

9.5 GB · 39% of RAM · ~20 tok/s · 8.03B params

Meta Llama 3.1 8B Instruct FP8

Chat · redhatai

Q8_0 · Excellent

9.5 GB · 39% of RAM · ~20 tok/s · 8.03B params

General · stepfun-ai

Q8_0 · Excellent

11.8 GB · 49% of RAM · ~16 tok/s · 10.17B params

Hermes 2 Pro Llama 3 8B

General · NousResearch

Q8_0 · Excellent

9.5 GB · 39% of RAM · ~20 tok/s · 8.03B params

Meta Llama 3.1 8B Instruct

Chat · NousResearch

Q8_0 · Excellent

9.5 GB · 39% of RAM · ~20 tok/s · 8.03B params

Qwen3.5 122B A10B AWQ 4bit

General · cyankiwi

Q5_K_M · Tight

18.6 GB · 77% of RAM · ~128 tok/s · 24.27B params

Try Q4_K_M (16.2 GB, ~151 tok/s)

General · NousResearch

Q8_0 · Excellent

8.0 GB · 33% of RAM · ~24 tok/s · 6.74B params

Mistral 7B Instruct v0.3 GPTQ

Chat · thesven

Q8_0 · Excellent

8.6 GB · 36% of RAM · ~22 tok/s · 7.25B params

Olmo 3 7B Instruct SFT

Chat · allenai

Q8_0 · Excellent

8.6 GB · 36% of RAM · ~22 tok/s · 7.3B params

Llama 3 Patronus Lynx 8B Instruct v1.1

Chat · patronusai

Q8_0 · Excellent

9.5 GB · 39% of RAM · ~20 tok/s · 8.03B params

Nemotron H 8B Base 8K

General · nvidia

Q8_0 · Excellent

9.5 GB · 40% of RAM · ~20 tok/s · 8.1B params

Meta Llama 3.1 8B Instruct quantized.w4a16

Chat · redhatai

Q8_0 · Excellent

9.5 GB · 39% of RAM · ~20 tok/s · 8.03B params

Llammas base p1 GPT 4o human error mix paragraph GEC

General · tartunlp

Q8_0 · Excellent

8.0 GB · 33% of RAM · ~24 tok/s · 6.74B params

Llama 3 8B Instruct Gradient 1048k

Chat · gradientai

Q8_0 · Excellent

9.5 GB · 39% of RAM · ~20 tok/s · 8.03B params

gemma 3 27b it int4 awq

General · gaunernst

Q8_0 · Excellent

7.1 GB · 30% of RAM · ~27 tok/s · 5.93B params

Meta Llama 3.1 8B Instruct FP8 dynamic

Chat · redhatai

Q8_0 · Excellent

9.5 GB · 39% of RAM · ~20 tok/s · 8.03B params

Falcon3 7B Instruct

Chat · TII · 2024-11-29

Q8_0 · Excellent

8.8 GB · 37% of RAM · ~22 tok/s · 7.46B params

Qwen2.5 7B Instruct

Chat · Alibaba · 2024-09-16

Q8_0 · Excellent

9.0 GB · 38% of RAM · ~21 tok/s · 7.62B params

bge large en v1.5

Embedding · BAAI · 2023-09-12

Q8_0 · Excellent

0.9 GB · 4% of RAM · ~473 tok/s · 0.34B params

Qwen3 VL 8B Instruct

Chat · Alibaba

Q8_0 · Excellent

10.3 GB · 43% of RAM · ~18 tok/s · 8.77B params

llava 1.5 7b hf

General · llava-hf

Q8_0 · Excellent

8.4 GB · 35% of RAM · ~23 tok/s · 7.06B params

GLM 4.1V 9B Thinking

General · zai-org

Q8_0 · Excellent

12.0 GB · 50% of RAM · ~16 tok/s · 10.29B params

Qwen3 VL 8B Instruct FP8

Chat · Alibaba

Q8_0 · Excellent

10.3 GB · 43% of RAM · ~18 tok/s · 8.77B params

Qwen2 7B Instruct

Chat · Alibaba

Q8_0 · Excellent

9.0 GB · 38% of RAM · ~21 tok/s · 7.62B params

General · TII

Q8_0 · Excellent

8.6 GB · 36% of RAM · ~22 tok/s · 7.22B params

XCurOS 1.2 8B VLBF16 Instruct

Chat · xcuros

Q8_0 · Excellent

10.3 GB · 43% of RAM · ~18 tok/s · 8.77B params

XCurOS 0.1 8B Instruct

Chat · xcuros

Q8_0 · Excellent

9.0 GB · 38% of RAM · ~21 tok/s · 7.62B params

Chat · Alibaba

Q8_0 · Excellent

9.1 GB · 38% of RAM · ~21 tok/s · 7.72B params

GLM 4.1V 9B Thinking AWQ

General · dengcao

Q8_0 · Excellent

12.1 GB · 50% of RAM · ~16 tok/s · 10.36B params

General · facebook

Q8_0 · Excellent

8.4 GB · 35% of RAM · ~23 tok/s · 7.04B params

General · omni-research

Q8_0 · Excellent

8.4 GB · 35% of RAM · ~23 tok/s · 7.06B params

General · allenai

Q8_0 · Excellent

8.6 GB · 36% of RAM · ~22 tok/s · 7.25B params

General · adept

Q8_0 · Excellent

11.0 GB · 46% of RAM · ~17 tok/s · 9.41B params

MiniCPM Llama3 V 2_5

General · openbmb

Q8_0 · Excellent

10.0 GB · 42% of RAM · ~19 tok/s · 8.54B params

Cosmos Reason2 8B

Reasoning · nvidia

Q8_0 · Excellent

10.3 GB · 43% of RAM · ~18 tok/s · 8.77B params

llava v1.6 vicuna 7b

General · liuhaotian

Q8_0 · Excellent

8.4 GB · 35% of RAM · ~23 tok/s · 7.06B params

Chat · BAAI

Q8_0 · Excellent

10.3 GB · 43% of RAM · ~18 tok/s · 8.76B params

Mantis 8B siglip llama3

General · tiger-lab

Q8_0 · Excellent

10.0 GB · 41% of RAM · ~19 tok/s · 8.48B params

internlm2_5 7b chat

Chat · internlm

Q8_0 · Excellent

9.1 GB · 38% of RAM · ~21 tok/s · 7.74B params

llava llama 3 8b v1_1 transformers

General · xtuner

Q8_0 · Excellent

9.8 GB · 41% of RAM · ~19 tok/s · 8.36B params

Qwen3 30B A3B NVFP4

General · redhatai

Q8_0 · Excellent

20.0 GB · 83% of RAM · ~84 tok/s · 17.45B params

llava v1.6 vicuna 7b hf

General · llava-hf

Q8_0 · Excellent

8.4 GB · 35% of RAM · ~23 tok/s · 7.06B params

General · royokong

Q8_0 · Excellent

9.8 GB · 41% of RAM · ~19 tok/s · 8.36B params

llama3 llava next 8b hf

General · llava-hf

Q8_0 · Excellent

9.8 GB · 41% of RAM · ~19 tok/s · 8.36B params

Qwen3.5 27B Claude Opus 4.6 High Reasoning NVFP4

Reasoning · harleywang

Q6_K · Excellent

17.1 GB · 71% of RAM · ~11 tok/s · 19.14B params

Huihui Qwen3 VL 8B Instruct abliterated

Chat · huihui-ai

Q8_0 · Excellent

10.3 GB · 43% of RAM · ~18 tok/s · 8.77B params

Qwen2.5 VL 7B Instruct FP4

Chat · asi992h

Q8_0 · Excellent

9.9 GB · 41% of RAM · ~19 tok/s · 8.4B params

Mistral 7B Instruct v0.3

Chat · Mistral AI · 2024-05-22

Q8_0 · Excellent

8.6 GB · 36% of RAM · ~22 tok/s · 7.25B params

Qwen2 VL 7B Instruct

Chat · Alibaba

Q8_0 · Excellent

9.7 GB · 41% of RAM · ~19 tok/s · 8.29B params

General · huggyllama

Q8_0 · Excellent

8.0 GB · 33% of RAM · ~24 tok/s · 6.74B params

Chat · essentialai

Q8_0 · Excellent

9.8 GB · 41% of RAM · ~19 tok/s · 8.31B params

General · cckevinn

Q8_0 · Excellent

11.3 GB · 47% of RAM · ~17 tok/s · 9.66B params

Qwen2.5 Math 7B

General · Alibaba

Q8_0 · Excellent

9.0 GB · 38% of RAM · ~21 tok/s · 7.62B params

General · eleutherai

Q8_0 · Excellent

8.3 GB · 35% of RAM · ~23 tok/s · 6.99B params

General · llm360

Q8_0 · Excellent

8.0 GB · 33% of RAM · ~24 tok/s · 6.74B params

Turkish Gemma 9b T1

General · ytu-ce-cosmos

Q8_0 · Excellent

10.8 GB · 45% of RAM · ~17 tok/s · 9.24B params

SDAR 8B Chat b32

Chat · jetlm

Q8_0 · Excellent

9.6 GB · 40% of RAM · ~20 tok/s · 8.19B params

glm 4 9b chat hf

Chat · zai-org

Q8_0 · Excellent

11.0 GB · 46% of RAM · ~17 tok/s · 9.4B params

General · salesforce

Q8_0 · Excellent

9.1 GB · 38% of RAM · ~21 tok/s · 7.75B params

Video LLaVA 7B hf

General · languagebind

Q8_0 · Excellent

8.7 GB · 36% of RAM · ~22 tok/s · 7.37B params

openvla 7b finetuned libero spatial

General · openvla

Q8_0 · Excellent

8.9 GB · 37% of RAM · ~21 tok/s · 7.54B params

Qwen3.5 9B Claude 4.6 HighIQ INSTRUCT HERETIC UNCENSORED MLX mxfp8

Chat · thecluster

mlx-8bit · Excellent

10.5 GB · 44% of RAM · ~18 tok/s · 9.41B params

openvla 7b finetuned libero 10

General · openvla

Q8_0 · Excellent

8.9 GB · 37% of RAM · ~21 tok/s · 7.54B params

CodeLlama 7b Instruct hf

Coding · Meta · 2024-03-13

Q8_0 · Excellent

8.0 GB · 33% of RAM · ~24 tok/s · 6.74B params

Meta Llama 3 8B

General · Meta

Q8_0 · Excellent

9.5 GB · 39% of RAM · ~20 tok/s · 8.03B params

General · opengvlab

Q8_0 · Excellent

9.5 GB · 40% of RAM · ~20 tok/s · 8.08B params

General · kmhf

Q8_0 · Excellent

9.2 GB · 38% of RAM · ~21 tok/s · 7.78B params

Moonlight 16B A3B Instruct

Chat · moonshotai

Q8_0 · Excellent

18.3 GB · 76% of RAM · ~139 tok/s · 15.96B params

Llama Guard 3 8B

General · Meta

Q8_0 · Excellent

9.5 GB · 39% of RAM · ~20 tok/s · 8.03B params

salamandra 7b instruct

Chat · bsc-lt

Q8_0 · Excellent

9.2 GB · 38% of RAM · ~21 tok/s · 7.77B params

llava onevision qwen2 7b ov hf

General · llava-hf

Q8_0 · Excellent

9.5 GB · 39% of RAM · ~20 tok/s · 8.03B params

General · opengvlab

Q8_0 · Excellent

9.5 GB · 40% of RAM · ~20 tok/s · 8.08B params

General · nvidia

Q8_0 · Excellent

9.5 GB · 40% of RAM · ~20 tok/s · 8.07B params

gemma 2 9b it AWQ

General · solidrust

Q8_0 · Excellent

11.8 GB · 49% of RAM · ~16 tok/s · 10.16B params

Phi 4 reasoning vision 15B

Reasoning · Microsoft

Q8_0Excellent

17.4 GB72% of RAM~11 tok/s15.12B params

gemma 3 12b it qat q4_0 unquantized

General · lightricks

Q8_0Excellent

14.1 GB59% of RAM~13 tok/s12.19B params

Molmo 7B D 0924

General · allenai

Q8_0Excellent

9.4 GB39% of RAM~20 tok/s8.02B params

LLaVA OneVision 1.5 8B Instruct

Chat · lmms-lab

Q8_0Excellent

10.0 GB42% of RAM~19 tok/s8.53B params

Skywork VL Reward 7B

General · skywork

Q8_0Excellent

9.7 GB41% of RAM~19 tok/s8.29B params

gemma 3 12b it FP8 dynamic

General · redhatai

Q8_0Excellent

14.1 GB59% of RAM~13 tok/s12.19B params

General · Meta · 2024-07-14

Q8_0Excellent

9.5 GB39% of RAM~20 tok/s8.03B params

Qwen2.5 Coder 14B Instruct

Coding · Alibaba · 2024-11-06

Q8_0Excellent

17.0 GB71% of RAM~11 tok/s14.77B params

HyperCLOVAX SEED Omni 8B

General · naver-hyperclovax

Q8_0Excellent

12.5 GB52% of RAM~15 tok/s10.74B params

General · mistral-experimental

Q8_0Excellent

14.6 GB61% of RAM~13 tok/s12.68B params

Chat · thudm · 2024-06-04

Q8_0Excellent

11.0 GB46% of RAM~17 tok/s9.4B params

General · coherelabs

Q8_0Excellent

10.1 GB42% of RAM~19 tok/s8.63B params

Llama 2 7b chat hf

Chat · NousResearch

Q8_0Excellent

8.0 GB33% of RAM~24 tok/s6.74B params

Chat · 01.ai · 2023-11-22

Q8_0Excellent

7.3 GB30% of RAM~27 tok/s6.06B params

t5gemma 2 4b 4b

General · Google

Q8_0Excellent

10.4 GB43% of RAM~18 tok/s8.85B params

Llama 3.2 11B Vision Instruct abliterated

Chat · huihui-ai

Q8_0Excellent

12.4 GB52% of RAM~15 tok/s10.67B params

NVIDIA Nemotron Nano 12B v2 VL FP8

General · nvidia

Q8_0Excellent

15.2 GB63% of RAM~12 tok/s13.18B params

NVIDIA Nemotron Nano 12B v2 VL BF16

General · nvidia

Q8_0Excellent

15.2 GB63% of RAM~12 tok/s13.18B params

Mistral NeMo Minitron 8B Instruct

Chat · nvidia

Q8_0Excellent

9.9 GB41% of RAM~19 tok/s8.41B params

falcon mamba 7b instruct

Chat · TII

Q8_0Excellent

8.6 GB36% of RAM~22 tok/s7.27B params

Qwen2.5 Coder 14B

Coding · Alibaba

Q8_0Excellent

17.0 GB71% of RAM~11 tok/s14.77B params

General · opengvlab

Q8_0Excellent

10.7 GB45% of RAM~18 tok/s9.14B params

moondream3 preview

General · moondream

Q8_0Excellent

10.8 GB45% of RAM~17 tok/s9.27B params

deepseek vl 7b chat

Chat · DeepSeek

Q8_0Excellent

8.7 GB36% of RAM~22 tok/s7.34B params

General · Google · 2024-06-24

Q8_0Excellent

10.8 GB45% of RAM~17 tok/s9.24B params

Qwen2.5 Math 7B Instruct

Chat · Alibaba

Q8_0Excellent

9.0 GB38% of RAM~21 tok/s7.62B params

Meta Llama 3 8B Instruct

Chat · Meta

Q8_0Excellent

9.5 GB39% of RAM~20 tok/s8.03B params

Qwen3 Coder 30B A3B Instruct

Coding · lmstudio-community

mlx-4bitTight

19.3 GB80% of RAM~92 tok/s30.53B params

mistral nemo instruct 2407 awq

Chat · casperhansen

Q8_0Excellent

14.2 GB59% of RAM~13 tok/s12.25B params

falcon 7b instruct

Chat · TII · 2023-04-25

Q8_0Excellent

8.6 GB36% of RAM~22 tok/s7.22B params

SauerkrautLM Nemo 12b Instruct

Chat · vagosolutions

Q8_0Excellent

14.2 GB59% of RAM~13 tok/s12.25B params

Llama 3.2 11B Vision

General · Meta

Q8_0Excellent

12.4 GB52% of RAM~15 tok/s10.64B params

InternVL3 8B Instruct

Chat · opengvlab

Q8_0Excellent

9.4 GB39% of RAM~20 tok/s7.94B params

Llama 3.1 8B Instruct

Chat · Meta · 2024-07-18

Q8_0Excellent

9.5 GB39% of RAM~20 tok/s8.03B params

Qwen3.5 35B A3B

General · Alibaba · 2026-02-24

Q3_K_MTight

20.1 GB84% of RAM~117 tok/s35.95B params

Try Q2_K (16.2 GB, ~152 tok/s)

HyperCLOVAX SEED Think 14B GPTQ

General · k-compression

Q8_0Excellent

17.0 GB71% of RAM~11 tok/s14.75B params

HyperCLOVAX SEED Think 14B

General · naver-hyperclovax

Q8_0Excellent

17.0 GB71% of RAM~11 tok/s14.75B params

Qwen3.5 27B Claude 4.6 Opus Reasoning Distilled NVFP4

Reasoning · mconcat

Q6_KExcellent

19.7 GB82% of RAM~10 tok/s22.15B params

instructblip vicuna 7b

Chat · salesforce

Q8_0Excellent

9.3 GB39% of RAM~20 tok/s7.91B params

lavida llada v1.0 instruct hf transformers

Chat · konstantinoskk

Q8_0Excellent

9.9 GB41% of RAM~19 tok/s8.43B params

NVIDIA Nemotron 3 Nano 30B A3B NVFP4

General · nvidia

Q6_KExcellent

16.3 GB68% of RAM~12 tok/s18.24B params

General · Alibaba

Q8_0Excellent

17.0 GB71% of RAM~11 tok/s14.77B params

Ovis2.6 30B A3B

General · aidc-ai

Q3_K_MTight

17.6 GB73% of RAM~102 tok/s31.38B params

Try Q2_K (14.2 GB, ~133 tok/s)

Qwen3 30B A3B Instruct 2507

Chat · lmstudio-community

mlx-4bitTight

19.3 GB80% of RAM~92 tok/s30.53B params

General · Alibaba

Q8_0Excellent

16.3 GB68% of RAM~11 tok/s14.17B params

General · Alibaba

Q8_0Excellent

17.0 GB71% of RAM~11 tok/s14.77B params

Qwen3 30B A3B Thinking 2507

General · Alibaba

Q4_K_MTight

20.2 GB84% of RAM~87 tok/s30.53B params

Try Q3_K_M (17.2 GB, ~105 tok/s)

sarvam 30b uncensored

General · aoxo

Q3_K_MTight

18.0 GB75% of RAM~116 tok/s32.15B params

Try Q2_K (14.5 GB, ~150 tok/s)

General · sarvamai

Q3_K_MTight

18.0 GB75% of RAM~116 tok/s32.15B params

Try Q2_K (14.5 GB, ~150 tok/s)

typhoon2.5 qwen3 30b a3b

General · typhoon-ai

Q4_K_MTight

20.2 GB84% of RAM~87 tok/s30.53B params

Try Q3_K_M (17.2 GB, ~105 tok/s)

ERNIE 4.5 21B A3B

General · lmstudio-community

mlx-4bitExcellent

13.9 GB58% of RAM~14 tok/s21.83B params

General · Google

Q8_0Excellent

14.1 GB59% of RAM~13 tok/s12.19B params

InternVL3_5 38B AWQ 4bit

General · cyankiwi

Q8_0Excellent

14.0 GB58% of RAM~13 tok/s12.06B params

CodeLlama 13b Instruct hf

Coding · Meta · 2024-03-13

Q8_0Excellent

15.0 GB63% of RAM~12 tok/s13.02B params

Qwen3 Coder 30B A3B Instruct

Coding · Alibaba

Q4_K_MTight

20.2 GB84% of RAM~87 tok/s30.53B params

Try Q3_K_M (17.2 GB, ~105 tok/s)

Qwen3 Coder 30B A3B Instruct FP8

Coding · Alibaba

Q4_K_MTight

20.2 GB84% of RAM~87 tok/s30.53B params

Try Q3_K_M (17.2 GB, ~105 tok/s)

Llama 3.2 11B Vision Instruct

Chat · Meta · 2024-09-18

Q8_0Excellent

12.4 GB52% of RAM~15 tok/s10.67B params

Qwen3 VL 30B A3B Thinking

General · Alibaba

Q4_K_MTight

20.6 GB86% of RAM~118 tok/s31.07B params

Try Q3_K_M (17.4 GB, ~142 tok/s)

General · Alibaba

Q8_0Excellent

17.0 GB71% of RAM~11 tok/s14.77B params

General · eleutherai

Q8_0Excellent

13.9 GB58% of RAM~13 tok/s12B params

Qwen3 Coder 30B A3B Instruct AWQ

Coding · quanttrio

Q4_K_MTight

20.2 GB84% of RAM~87 tok/s30.53B params

Try Q3_K_M (17.2 GB, ~105 tok/s)

bu 30b a3b preview

General · browser-use

Q4_K_MTight

20.6 GB86% of RAM~118 tok/s31.07B params

Try Q3_K_M (17.4 GB, ~142 tok/s)

Bielik 11B v3.0 Instruct

Chat · speakleash

Q8_0Excellent

13.0 GB54% of RAM~14 tok/s11.17B params

Qwen3 30B A3B GPTQ Int4

General · Alibaba

Q4_K_MTight

20.2 GB84% of RAM~87 tok/s30.53B params

Try Q3_K_M (17.2 GB, ~105 tok/s)

llava v1.6 vicuna 13b

General · liuhaotian

Q8_0Excellent

15.4 GB64% of RAM~12 tok/s13.35B params

Qwen3 30B A3B Base

General · Alibaba

Q4_K_MTight

20.2 GB84% of RAM~87 tok/s30.53B params

Try Q3_K_M (17.2 GB, ~105 tok/s)

SOLAR 10.7B Instruct v1.0

Chat · Upstage · 2023-12-12

Q8_0Excellent

12.5 GB52% of RAM~15 tok/s10.73B params

Qwen3 30B A3B AWQ

General · quixiai

Q4_K_MTight

20.2 GB84% of RAM~87 tok/s30.53B params

Try Q3_K_M (17.2 GB, ~105 tok/s)

Qwen3 32B NVFP4

General · redhatai

Q6_KExcellent

17.0 GB71% of RAM~11 tok/s19.11B params

Qwen3 30B A3B.w8a8

General · nytopop

Q4_K_MTight

20.2 GB84% of RAM~87 tok/s30.55B params

Try Q3_K_M (17.2 GB, ~105 tok/s)

GLM 4.6V AWQ 4bit

General · cyankiwi

Q6_KExcellent

17.4 GB72% of RAM~11 tok/s19.49B params

gemma 3 12b it int4 awq

General · gaunernst

Q8_0Excellent

14.1 GB59% of RAM~13 tok/s12.19B params

llava 1.5 13b hf

General · llava-hf

Q8_0Excellent

15.4 GB64% of RAM~12 tok/s13.35B params

InternVL3 14B hf

General · opengvlab

Q8_0Excellent

17.4 GB72% of RAM~11 tok/s15.12B params

Qwen3 30B A3B Instruct 2507 FP8

Chat · Alibaba

Q4_K_MTight

20.2 GB84% of RAM~87 tok/s30.53B params

Try Q3_K_M (17.2 GB, ~105 tok/s)

HarmBench Llama 2 13b cls

General · cais

Q8_0Excellent

15.0 GB63% of RAM~12 tok/s13.02B params

Qwen3 30B A3B Instruct 2507 GPTQ Int4

Chat · junhowie

Q4_K_MTight

20.2 GB84% of RAM~87 tok/s30.53B params

Try Q3_K_M (17.2 GB, ~105 tok/s)

General · llm-jp

Q8_0Excellent

15.8 GB66% of RAM~12 tok/s13.71B params

gemma 4 31B it NVFP4

General · redhatai

Q6_KExcellent

17.7 GB74% of RAM~11 tok/s19.87B params

Qwen3 VL 30B A3B Instruct AWQ

Chat · quanttrio

Q4_K_MTight

20.6 GB86% of RAM~118 tok/s31.07B params

Try Q3_K_M (17.4 GB, ~142 tok/s)

gemma 4 26B A4B it

General · Google · 2026-03-11

Q5_K_MTight

20.3 GB85% of RAM~9 tok/s26.54B params

Try Q4_K_M (17.6 GB, ~11 tok/s)

Qwen3 14B Instruct

Chat · openpipe

Q8_0Excellent

17.0 GB71% of RAM~11 tok/s14.77B params

Kimi VL A3B Thinking

General · moonshotai

Q8_0Excellent

18.8 GB78% of RAM~10 tok/s16.41B params

Kimi VL A3B Thinking 2506

General · moonshotai

Q8_0Excellent

18.8 GB78% of RAM~10 tok/s16.41B params

Qwen3.5 27B Claude 4.6 Opus Reasoning Distilled FP8 Dynamic

Reasoning · mconcat

Q4_K_MTight

18.2 GB76% of RAM~11 tok/s27.36B params

Try Q3_K_M (15.4 GB, ~13 tok/s)

Qwen3.5 27B Claude Opus 4.6 High Reasoning

Reasoning · harleywang

Q4_K_MTight

18.2 GB76% of RAM~11 tok/s27.36B params

Try Q3_K_M (15.4 GB, ~13 tok/s)

General · Alibaba · 2026-02-24

Q4_K_MTight

18.4 GB77% of RAM~10 tok/s27.78B params

Try Q3_K_M (15.7 GB, ~13 tok/s)

Qwen2.5 14B Instruct AWQ

Chat · Alibaba

Q8_0Excellent

17.0 GB71% of RAM~11 tok/s14.77B params

Qwen3.5 35B A3B AWQ 4bit

General · cyankiwi

Q2_KGreat

16.6 GB69% of RAM~155 tok/s36.98B params

Qwen3.5 27B Claude 4.6 Opus Reasoning Distilled

Reasoning · jackrong

Q4_K_MTight

18.4 GB77% of RAM~10 tok/s27.78B params

Try Q3_K_M (15.7 GB, ~13 tok/s)

Gemma 4 31B IT NVFP4

General · nvidia

Q6_KExcellent

18.6 GB77% of RAM~10 tok/s20.87B params

Qwen3.5 27B NVFP4

General · apolo13x

Q8_0Excellent

19.1 GB80% of RAM~10 tok/s16.71B params

Qwen3.5 27B Claude 4.6 Opus Reasoning Distilled v2

Reasoning · jackrong

Q4_K_MTight

18.4 GB77% of RAM~10 tok/s27.78B params

Try Q3_K_M (15.7 GB, ~13 tok/s)

Qwen3.5 35B A3B AWQ 8bit

General · cyankiwi

Q2_KGreat

16.6 GB69% of RAM~155 tok/s36.98B params

Qwen3.5 27B Claude 4.6 Opus Reasoning Distilled GPTQ int4

Reasoning · codgician

Q4_K_MTight

18.4 GB77% of RAM~10 tok/s27.78B params

Try Q3_K_M (15.7 GB, ~13 tok/s)

Qwen3.5 27B Claude 4.6 Opus Reasoning Distilled v2 AWQ

Reasoning · quanttrio

Q4_K_MTight

18.4 GB77% of RAM~10 tok/s27.78B params

Try Q3_K_M (15.7 GB, ~13 tok/s)

Qwen3.5 27B Claude 4.6 Opus Reasoning Distilled GPTQ int4

Reasoning · oxzoid

Q4_K_MTight

18.4 GB77% of RAM~10 tok/s27.78B params

Try Q3_K_M (15.7 GB, ~13 tok/s)

Qwen3.5 27B heretic v3 NVFP4

General · deaquay

Q8_0Excellent

19.1 GB80% of RAM~10 tok/s16.71B params

Qwen3.5 35B A3B FP8

General · Alibaba

Q3_K_MTight

20.1 GB84% of RAM~123 tok/s35.95B params

Try Q2_K (16.2 GB, ~159 tok/s)

InternVL3_5 GPT OSS 20B A4B Preview HF

General · opengvlab

Q6_KExcellent

18.9 GB79% of RAM~10 tok/s21.23B params

Qwen3.5 35B A3B AWQ

General · quanttrio

Q3_K_MTight

20.1 GB84% of RAM~123 tok/s35.95B params

Try Q2_K (16.2 GB, ~159 tok/s)

Huihui Qwen3.5 35B A3B Claude 4.6 Opus abliterated

General · huihui-ai

Q3_K_MTight

20.1 GB84% of RAM~123 tok/s35.95B params

Try Q2_K (16.2 GB, ~159 tok/s)

Huihui Qwen3.5 35B A3B abliterated

General · huihui-ai

Q3_K_MTight

20.1 GB84% of RAM~123 tok/s35.95B params

Try Q2_K (16.2 GB, ~159 tok/s)

Qwen3.5 35B A3B Base

General · Alibaba

Q3_K_MTight

20.1 GB84% of RAM~123 tok/s35.95B params

Try Q2_K (16.2 GB, ~159 tok/s)

Qwen3.5 27B NVFP4

General · axionml

Q8_0Excellent

19.6 GB82% of RAM~9 tok/s17.13B params

Qwen3.5 27B heretic

Coding · coder3101

Q4_K_MTight

18.2 GB76% of RAM~11 tok/s27.36B params

Try Q3_K_M (15.4 GB, ~13 tok/s)

MiniMax M2.1 AWQ 4bit

General · cyankiwi

Q3_K_MTight

20.6 GB86% of RAM~120 tok/s36.81B params

Try Q2_K (16.5 GB, ~156 tok/s)

llm jp 3.1 13b instruct4

Chat · llm-jp

Q8_0Excellent

15.8 GB66% of RAM~12 tok/s13.71B params

Kimi VL A3B Instruct

Chat · moonshotai

Q8_0Excellent

18.8 GB78% of RAM~10 tok/s16.41B params

Qwen3 32B NVFP4

General · nvidia

Q8_0Excellent

19.6 GB82% of RAM~9 tok/s17.16B params

Dolphin Mistral 24B Venice Edition

General · dphn

Q5_K_MTight

18.1 GB75% of RAM~11 tok/s23.57B params

Try Q4_K_M (15.7 GB, ~12 tok/s)

General · Google · 2026-03-11

Q3_K_MTight

18.3 GB76% of RAM~11 tok/s32.68B params

Try Q2_K (14.7 GB, ~14 tok/s)

Qwen3.5 35B A3B Claude 4.6 Opus Reasoning Distilled GPTQ int4

Reasoning · codgician

Q3_K_MTight

20.1 GB84% of RAM~123 tok/s35.95B params

Try Q2_K (16.2 GB, ~159 tok/s)

Qwen3.5 35B A3B Claude 4.6 Opus Reasoning Distilled

Reasoning · jackrong

Q3_K_MTight

20.1 GB84% of RAM~123 tok/s35.95B params

Try Q2_K (16.2 GB, ~159 tok/s)

deepseek vl2 small

General · DeepSeek

Q8_0Excellent

18.5 GB77% of RAM~10 tok/s16.15B params

NVIDIA Nemotron 3 Nano 30B A3B BF16

General · nvidia · 2025-12-04

Q3_K_MTight

17.7 GB74% of RAM~11 tok/s31.58B params

Try Q2_K (14.3 GB, ~14 tok/s)

DeepSeek R1 Distill Qwen 32B

Reasoning · DeepSeek · 2025-01-20

Q3_K_MTight

18.4 GB77% of RAM~11 tok/s32.76B params

Try Q2_K (14.8 GB, ~14 tok/s)

Mistral Small 24B Instruct 2501 AWQ

Chat · stelterlab

Q5_K_MTight

18.1 GB75% of RAM~11 tok/s23.57B params

Try Q4_K_M (15.7 GB, ~12 tok/s)

GLM 4.7 Flash REAP 23B A3B

General · cerebras

Q6_KTight

20.4 GB85% of RAM~9 tok/s23B params

Try Q5_K_M (17.6 GB, ~11 tok/s)

General · m-ric

Q5_K_MTight

19.4 GB81% of RAM~10 tok/s25.31B params

Try Q4_K_M (16.8 GB, ~12 tok/s)

General · rhymes-ai

Q5_K_MTight

19.4 GB81% of RAM~10 tok/s25.31B params

Try Q4_K_M (16.8 GB, ~12 tok/s)

Mistral Small 3.2 24B Instruct hf AWQ

Chat · gghfez

Q5_K_MTight

18.1 GB75% of RAM~11 tok/s23.57B params

Try Q4_K_M (15.7 GB, ~12 tok/s)

t5gemma 9b 9b ul2

General · Google

Q6_KExcellent

18.1 GB75% of RAM~10 tok/s20.33B params

gemma 4 31B it heretic

Coding · coder3101

Q3_K_MTight

17.6 GB73% of RAM~11 tok/s31.27B params

Try Q2_K (14.1 GB, ~15 tok/s)

General · lmstudio-community

mlx-4bitTight

18.9 GB79% of RAM~10 tok/s29.94B params

OpenReasoning Nemotron 32B

Reasoning · nvidia

Q3_K_MTight

18.4 GB77% of RAM~11 tok/s32.76B params

Try Q2_K (14.8 GB, ~14 tok/s)

gemma 4 26B A4B it AWQ 4bit

General · cyankiwi

Q5_K_MTight

20.3 GB85% of RAM~9 tok/s26.55B params

Try Q4_K_M (17.6 GB, ~11 tok/s)

gemma 3 27b it FP8 dynamic

General · redhatai

Q4_K_MTight

18.2 GB76% of RAM~11 tok/s27.44B params

Try Q3_K_M (15.5 GB, ~13 tok/s)

gemma 4 26B A4B

General · Google

Q5_K_MTight

20.3 GB85% of RAM~9 tok/s26.54B params

Try Q4_K_M (17.6 GB, ~11 tok/s)

Qwen3.5 27B Claude 4.6 OS Auto Variable Thinking

General · davidau

Q4_K_MTight

18.2 GB76% of RAM~11 tok/s27.36B params

Try Q3_K_M (15.4 GB, ~13 tok/s)

Qwopus3.5 27B v3

General · jackrong

Q4_K_MTight

18.2 GB76% of RAM~11 tok/s27.36B params

Try Q3_K_M (15.4 GB, ~13 tok/s)

Qwen3.5 27B Deckard PKD Heretic Uncensored Thinking

General · davidau

Q4_K_MTight

18.2 GB76% of RAM~11 tok/s27.36B params

Try Q3_K_M (15.4 GB, ~13 tok/s)

Qwen3.5 27B earica

General · voidful

Q4_K_MTight

18.2 GB76% of RAM~11 tok/s27.36B params

Try Q3_K_M (15.4 GB, ~13 tok/s)

Qwen3.5 27B earica hardness

General · voidful

Q4_K_MTight

18.2 GB76% of RAM~11 tok/s27.36B params

Try Q3_K_M (15.4 GB, ~13 tok/s)

gemma 4 26B A4B it AWQ 8bit

General · cyankiwi

Q5_K_MTight

20.3 GB85% of RAM~9 tok/s26.55B params

Try Q4_K_M (17.6 GB, ~11 tok/s)

Qwen3.5 27B ultra uncensored heretic v1

General · llmfan46

Q4_K_MTight

18.2 GB76% of RAM~11 tok/s27.36B params

Try Q3_K_M (15.4 GB, ~13 tok/s)

Qwen3.5 27B FP8

General · Alibaba

Q4_K_MTight

18.4 GB77% of RAM~10 tok/s27.78B params

Try Q3_K_M (15.7 GB, ~13 tok/s)

Qwen3.5 27B AWQ

General · quanttrio

Q4_K_MTight

18.4 GB77% of RAM~10 tok/s27.78B params

Try Q3_K_M (15.7 GB, ~13 tok/s)

Huihui Qwen3.5 27B abliterated

General · huihui-ai

Q4_K_MTight

18.4 GB77% of RAM~10 tok/s27.78B params

Try Q3_K_M (15.7 GB, ~13 tok/s)

Huihui Qwen3.5 27B Claude 4.6 Opus abliterated

General · huihui-ai

Q4_K_MTight

18.4 GB77% of RAM~10 tok/s27.78B params

Try Q3_K_M (15.7 GB, ~13 tok/s)

Qwen3.5 27B AWQ 4bit

General · cyankiwi

Q4_K_MTight

18.9 GB79% of RAM~10 tok/s28.55B params

Try Q3_K_M (16.1 GB, ~12 tok/s)

Qwen3.5 27B AWQ BF16 INT8

General · cyankiwi

Q4_K_MTight

18.8 GB78% of RAM~10 tok/s28.38B params

Try Q3_K_M (16.0 GB, ~12 tok/s)

Qwen3.5 27B AWQ BF16 INT4

General · cyankiwi

Q4_K_MTight

18.8 GB78% of RAM~10 tok/s28.38B params

Try Q3_K_M (16.0 GB, ~12 tok/s)

General · lgai-exaone · 2025-07-11

Q3_K_MTight

18.0 GB75% of RAM~11 tok/s32B params

Try Q2_K (14.4 GB, ~14 tok/s)

NVIDIA Nemotron 3 Nano 30B A3B

General · lmstudio-community

mlx-4bitTight

19.9 GB83% of RAM~10 tok/s31.58B params

vllm translategemma 27b it

General · infomaniak-ai

Q4_K_MTight

19.1 GB80% of RAM~10 tok/s28.84B params

Try Q3_K_M (16.2 GB, ~12 tok/s)

NVIDIA Nemotron 3 Nano 30B A3B FP8

General · nvidia

Q3_K_MTight

17.7 GB74% of RAM~11 tok/s31.58B params

Try Q2_K (14.3 GB, ~14 tok/s)

Qwen2.5 Coder 32B Instruct

Coding · Alibaba · 2024-11-06

Q3_K_MTight

18.4 GB77% of RAM~11 tok/s32.76B params

Try Q2_K (14.8 GB, ~14 tok/s)

General · zai-org

Q3_K_MTight

17.5 GB73% of RAM~11 tok/s31.22B params

Try Q2_K (14.1 GB, ~15 tok/s)

Nemotron Cascade 2 30B A3B

General · nvidia

Q3_K_MTight

17.7 GB74% of RAM~11 tok/s31.58B params

Try Q2_K (14.3 GB, ~14 tok/s)

NVIDIA Nemotron 3 Nano 30B A3B Base BF16

General · nvidia

Q3_K_MTight

17.7 GB74% of RAM~11 tok/s31.58B params

Try Q2_K (14.3 GB, ~14 tok/s)

GLM 4.7 Flash AWQ

General · quanttrio

Q3_K_MTight

17.5 GB73% of RAM~11 tok/s31.22B params

Try Q2_K (14.1 GB, ~15 tok/s)

ERNIE 4.5 VL 28B A3B PT

General · baidu

Q4_K_MTight

19.5 GB81% of RAM~10 tok/s29.4B params

Try Q3_K_M (16.5 GB, ~12 tok/s)

Qwen2.5 Coder 32B

Coding · Alibaba

Q3_K_MTight

18.4 GB77% of RAM~11 tok/s32.76B params

Try Q2_K (14.8 GB, ~14 tok/s)

ERNIE 4.5 VL 28B A3B Thinking

General · baidu

Q4_K_MTight

19.6 GB82% of RAM~10 tok/s29.66B params

Try Q3_K_M (16.7 GB, ~12 tok/s)

gemma 4 31B it heretic v2

General · momix-44

Q3_K_MTight

17.6 GB73% of RAM~11 tok/s31.27B params

Try Q2_K (14.1 GB, ~15 tok/s)

gemma 4 31B it AWQ

General · quanttrio

Q3_K_MTight

17.6 GB73% of RAM~11 tok/s31.27B params

Try Q2_K (14.1 GB, ~15 tok/s)

gemma 4 31B it FP8 block

General · redhatai

Q3_K_MTight

17.6 GB73% of RAM~11 tok/s31.27B params

Try Q2_K (14.1 GB, ~15 tok/s)

General · Google · 2025-03-01

Q4_K_MTight

18.2 GB76% of RAM~11 tok/s27.43B params

Try Q3_K_M (15.5 GB, ~13 tok/s)

GLM 4.7 Flash AWQ 4bit

General · cyankiwi

Q3_K_MTight

18.0 GB75% of RAM~11 tok/s32.14B params

Try Q2_K (14.5 GB, ~14 tok/s)

gemma 4 31B it AWQ 4bit

General · cyankiwi

Q3_K_MTight

18.1 GB75% of RAM~11 tok/s32.19B params

Try Q2_K (14.5 GB, ~14 tok/s)

EXAONE 4.0 32B FP8

General · lgai-exaone

Q3_K_MTight

18.0 GB75% of RAM~11 tok/s32.01B params

Try Q2_K (14.4 GB, ~14 tok/s)

EXAONE 4.0.1 32B

General · lgai-exaone

Q3_K_MTight

18.0 GB75% of RAM~11 tok/s32B params

Try Q2_K (14.4 GB, ~14 tok/s)

gemma 4 31B it AWQ 8bit

General · cyankiwi

Q3_K_MTight

18.1 GB75% of RAM~11 tok/s32.19B params

Try Q2_K (14.5 GB, ~14 tok/s)

General · Alibaba

Q3_K_MTight

18.4 GB77% of RAM~11 tok/s32.76B params

Try Q2_K (14.8 GB, ~14 tok/s)

Baichuan M2 32B

General · baichuan-inc

Q3_K_MTight

18.4 GB77% of RAM~11 tok/s32.76B params

Try Q2_K (14.8 GB, ~14 tok/s)

General · Google

Q3_K_MTight

18.3 GB76% of RAM~11 tok/s32.68B params

Try Q2_K (14.7 GB, ~14 tok/s)

General · Google · 2024-06-24

Q4_K_MTight

18.1 GB75% of RAM~11 tok/s27.23B params

Try Q3_K_M (15.4 GB, ~13 tok/s)

Qwen3 VL 32B Thinking

General · Alibaba

Q3_K_MTight

18.7 GB78% of RAM~11 tok/s33.36B params

Try Q2_K (15.0 GB, ~14 tok/s)

HyperCLOVAX SEED Think 32B

General · naver-hyperclovax

Q3_K_MTight

18.7 GB78% of RAM~11 tok/s33.31B params

Try Q2_K (15.0 GB, ~14 tok/s)

Olmo 3 1125 32B

General · allenai

Q3_K_MTight

18.1 GB75% of RAM~11 tok/s32.23B params

Try Q2_K (14.5 GB, ~14 tok/s)

karakuri vl 32b thinking 2507 exp

General · karakuri-ai

Q3_K_MTight

18.7 GB78% of RAM~11 tok/s33.45B params

Try Q2_K (15.1 GB, ~14 tok/s)

General · Alibaba

Q3_K_MTight

18.4 GB77% of RAM~11 tok/s32.76B params

Try Q2_K (14.8 GB, ~14 tok/s)

General · Alibaba

Q3_K_MTight

18.4 GB77% of RAM~11 tok/s32.76B params

Try Q2_K (14.8 GB, ~14 tok/s)

Qwen3 32B FP8 dynamic

General · redhatai

Q3_K_MTight

18.4 GB77% of RAM~11 tok/s32.77B params

Try Q2_K (14.8 GB, ~14 tok/s)

CodeLlama 34b Instruct hf

Coding · codellama

Q3_K_MTight

18.9 GB79% of RAM~10 tok/s33.74B params

Try Q2_K (15.2 GB, ~14 tok/s)

xLAM 2 32b fc r

General · salesforce

Q3_K_MTight

18.4 GB77% of RAM~11 tok/s32.76B params

Try Q2_K (14.8 GB, ~14 tok/s)

Qwen3 VL 32B Instruct

Chat · Alibaba

Q3_K_MTight

18.7 GB78% of RAM~11 tok/s33.36B params

Try Q2_K (15.0 GB, ~14 tok/s)

Qwen3 VL 32B Instruct AWQ

Chat · quanttrio

Q3_K_MTight

18.7 GB78% of RAM~11 tok/s33.36B params

Try Q2_K (15.0 GB, ~14 tok/s)

Qwen2.5 VL 32B Instruct

Chat · Alibaba

Q3_K_MTight

18.7 GB78% of RAM~11 tok/s33.45B params

Try Q2_K (15.1 GB, ~14 tok/s)

karakuri vl 32b instruct 2507

Chat · karakuri-ai

Q3_K_MTight

18.7 GB78% of RAM~11 tok/s33.45B params

Try Q2_K (15.1 GB, ~14 tok/s)

Qwen2.5 32B Instruct AWQ

Chat · Alibaba

Q3_K_MTight

18.4 GB77% of RAM~11 tok/s32.76B params

Try Q2_K (14.8 GB, ~14 tok/s)

OLMo 2 0325 32B Instruct

Chat · allenai · 2025-03-12

Q3_K_MTight

18.1 GB75% of RAM~11 tok/s32.23B params

Try Q2_K (14.5 GB, ~14 tok/s)

Qwen3.5 40B Claude 4.6 Opus Deckard Heretic Uncensored Thinking

General · davidau

Q2_KTight

17.7 GB74% of RAM~12 tok/s39.53B params

dolphin 2.9.1 yi 1.5 34b

General · dphn

Q3_K_MTight

19.3 GB80% of RAM~10 tok/s34.39B params

Try Q2_K (15.5 GB, ~13 tok/s)

InternVL3_5 30B A3B

General · opengvlab

Q4_K_MTight

20.4 GB85% of RAM~9 tok/s30.85B params

Try Q3_K_M (17.3 GB, ~11 tok/s)

CodeLlama 34b Instruct hf

Coding · Meta · 2024-03-14

Q3_K_MTight

18.9 GB79% of RAM~10 tok/s33.74B params

Try Q2_K (15.2 GB, ~14 tok/s)

General · liuhaotian

Q3_K_MTight

19.5 GB81% of RAM~10 tok/s34.75B params

Try Q2_K (15.6 GB, ~13 tok/s)

llava v1.6 34b hf

General · llava-hf

Q3_K_MTight

19.5 GB81% of RAM~10 tok/s34.75B params

Try Q2_K (15.6 GB, ~13 tok/s)

General · opengvlab

Q2_KTight

17.2 GB72% of RAM~12 tok/s38.39B params

Skywork R1V 38B

Reasoning · skywork

Q2_KTight

17.2 GB72% of RAM~12 tok/s38.39B params

General · moonshotai · 2026-01-01

Won't Fit

461.6 GB1,923% of RAM1058.59B params

Needs 462+ GB — try a Mac with more RAM
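
The "% of RAM" figure on each card appears to be the model's memory footprint divided by total unified memory. The total is not stated in this part of the listing; 24 GB is an assumption inferred from the cards themselves (for example, 9.6 GB is shown as 40% and 461.6 GB as 1,923%). A minimal Python sketch of that arithmetic, using a hypothetical function name rather than anything taken from the page:

```python
# Assumed total unified memory in GB. This value is inferred from the
# percentages on the cards; the listing does not state it here.
ASSUMED_TOTAL_RAM_GB = 24.0


def percent_of_ram(model_gb, total_ram_gb=ASSUMED_TOTAL_RAM_GB):
    """Return a model's size as a whole-number percentage of total RAM."""
    return round(model_gb / total_ram_gb * 100)


# Sizes taken from cards in the listing.
for size_gb in (9.6, 18.3, 461.6):
    print(f"{size_gb} GB -> {percent_of_ram(size_gb)}% of RAM")
```

Card sizes are displayed rounded to one decimal place, so recomputing from the displayed size can differ from the card by one point (9.5 GB appears as both 39% and 40% in the listing).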

General · openai

Won't Fit

52.9 GB221% of RAM120.41B params

Needs 53+ GB — try a Mac with more RAM

Reasoning · DeepSeek · 2025-01-20

Won't Fit

298.6 GB1,244% of RAM684.53B params

Needs 299+ GB — try a Mac with more RAM

General · zai-org

Won't Fit

328.9 GB1,370% of RAM753.91B params

Needs 329+ GB — try a Mac with more RAM

NVIDIA Nemotron 3 Super 120B A12B NVFP4

General · nvidia

Won't Fit

29.8 GB124% of RAM67.23B params

Needs 30+ GB — try a Mac with more RAM

General · DeepSeek · 2025-12-01

Won't Fit

299.0 GB1,246% of RAM685.4B params

Needs 299+ GB — try a Mac with more RAM

Qwen3 Coder Next FP8

Coding · Alibaba

Won't Fit

35.2 GB147% of RAM79.68B params

Needs 35+ GB — try a Mac with more RAM

NVIDIA Nemotron 3 Super 120B A12B FP8

General · nvidia

Won't Fit

54.3 GB226% of RAM123.61B params

Needs 54+ GB — try a Mac with more RAM

Llama 3.1 70B Instruct

Chat · Meta · 2024-07-16

Won't Fit

31.2 GB130% of RAM70.55B params

Needs 31+ GB — try a Mac with more RAM

Qwen3.5 397B A17B

General · Alibaba · 2026-02-16

Won't Fit

176.2 GB734% of RAM403.4B params

Needs 176+ GB — try a Mac with more RAM

Qwen3.5 397B A17B FP8

General · Alibaba

Won't Fit

176.2 GB734% of RAM403.42B params

Needs 176+ GB — try a Mac with more RAM

Qwen3.5 122B A10B

General · Alibaba · 2026-02-24

Won't Fit

55.0 GB229% of RAM125.09B params

Needs 55+ GB — try a Mac with more RAM

Qwen3.5 122B A10B FP8

General · Alibaba

Won't Fit

55.0 GB229% of RAM125.09B params

Needs 55+ GB — try a Mac with more RAM

DeepSeek R1 0528

Reasoning · DeepSeek

Won't Fit

298.6 GB1,244% of RAM684.53B params

Needs 299+ GB — try a Mac with more RAM

Qwen3 Coder Next

Coding · Alibaba · 2026-01-30

Won't Fit

35.2 GB147% of RAM79.67B params

Needs 35+ GB — try a Mac with more RAM

Qwen2.5 72B Instruct

Chat · Alibaba · 2024-09-16

Won't Fit

32.2 GB134% of RAM72.71B params

Needs 32+ GB — try a Mac with more RAM

L3.3 GeneticLemonade Final v2 70B

General · zerofata

Won't Fit

31.2 GB130% of RAM70.55B params

Needs 31+ GB — try a Mac with more RAM

General · DeepSeek · 2024-12-25

Won't Fit

298.6 GB1,244% of RAM684.53B params

Needs 299+ GB — try a Mac with more RAM

General · minimaxai · 2026-02-12

Won't Fit

100.1 GB417% of RAM228.7B params

Needs 100+ GB — try a Mac with more RAM

Qwen3 235B A22B

General · Alibaba · 2025-04-27

Won't Fit

102.9 GB429% of RAM235.09B params

Needs 103+ GB — try a Mac with more RAM

DeepSeek R1 0528 NVFP4 v2

Reasoning · nvidia

Won't Fit

171.9 GB716% of RAM393.63B params

Needs 172+ GB — try a Mac with more RAM

Qwen3 Next 80B A3B Instruct

Chat · Alibaba

Won't Fit

35.9 GB150% of RAM81.32B params

Needs 36+ GB — try a Mac with more RAM

Llama 3.3 70B Instruct AWQ

Chat · kosbu

Won't Fit

31.2 GB130% of RAM70.55B params

Needs 31+ GB — try a Mac with more RAM

DeepSeek V3 0324

General · DeepSeek

Won't Fit

298.6 GB1,244% of RAM684.53B params

Needs 299+ GB — try a Mac with more RAM

Qwen3 Coder Next 8bit

Coding · nexveridian

Won't Fit

35.2 GB147% of RAM79.67B params

Needs 35+ GB — try a Mac with more RAM

Mixtral 8x7B Instruct v0.1

Chat · Mistral AI · 2023-12-10

Won't Fit

20.8 GB87% of RAM46.7B params

Needs 21+ GB — try a Mac with more RAM

llama 3.3 70b instruct awq

Chat · casperhansen

Won't Fit

31.2 GB130% of RAM70.55B params

Needs 31+ GB — try a Mac with more RAM

Qwen3 VL 235B A22B Instruct

Chat · Alibaba

Won't Fit

103.1 GB430% of RAM235.67B params

Needs 103+ GB — try a Mac with more RAM

Qwen2.5 72B Instruct abliterated

Chat · huihui-ai

Won't Fit

32.2 GB134% of RAM72.71B params

Needs 32+ GB — try a Mac with more RAM

Llama 3.3 70B Instruct

Chat · Meta · 2024-11-26

Won't Fit

31.2 GB130% of RAM70.55B params

Needs 31+ GB — try a Mac with more RAM

General · zai-org

Won't Fit

48.6 GB203% of RAM110.47B params

Needs 49+ GB — try a Mac with more RAM

General · zai-org · 2026-02-11

Won't Fit

328.8 GB1,370% of RAM753.86B params

Needs 329+ GB — try a Mac with more RAM

Qwen3 VL 235B A22B Thinking

General · Alibaba

Won't Fit

103.1 GB430% of RAM235.67B params

Needs 103+ GB — try a Mac with more RAM

Qwen3.5 122B A10B NVFP4

General · sehyo

Won't Fit

31.5 GB131% of RAM71.22B params

Needs 32+ GB — try a Mac with more RAM

Step 3.5 Flash FP8

General · stepfun-ai

Won't Fit

87.3 GB364% of RAM199.4B params

Needs 87+ GB — try a Mac with more RAM

NVIDIA Nemotron 3 Super 120B A12B BF16

General · nvidia

Won't Fit

54.3 GB226% of RAM123.61B params

Needs 54+ GB — try a Mac with more RAM

General · Meta

Won't Fit

177.3 GB739% of RAM405.85B params

Needs 177+ GB — try a Mac with more RAM

Qwen3 235B A22B Instruct 2507 FP8

Chat · Alibaba

Won't Fit

102.9 GB429% of RAM235.11B params

Needs 103+ GB — try a Mac with more RAM

Qwen3 Next 80B A3B Instruct FP8

Chat · Alibaba

Won't Fit

35.9 GB150% of RAM81.33B params

Needs 36+ GB — try a Mac with more RAM

Qwen3.5 122B A10B NVFP4

General · txn545

Won't Fit

28.5 GB119% of RAM64.35B params

Needs 29+ GB — try a Mac with more RAM

Meta Llama 3 70B

General · Meta

Won't Fit

31.2 GB130% of RAM70.55B params

Needs 31+ GB — try a Mac with more RAM

Llama 3_3 Nemotron Super 49B v1_5

General · nvidia

Won't Fit

22.2 GB93% of RAM49.87B params

Needs 22+ GB — try a Mac with more RAM

Llama 3.1 405B Instruct

Chat · Meta · 2024-07-16

Won't Fit

177.3 GB739% of RAM405.85B params

Needs 177+ GB — try a Mac with more RAM

XORTRON.CriminalComputing.LARGE.2026.3

General · darkc0de

Won't Fit

53.9 GB225% of RAM122.61B params

Needs 54+ GB — try a Mac with more RAM

jais adapted 70b chat 4bit bnb

Chat · inceptionai

Won't Fit

31.7 GB132% of RAM71.64B params

Needs 32+ GB — try a Mac with more RAM

General · inclusionai

Won't Fit

45.3 GB189% of RAM102.89B params

Needs 45+ GB — try a Mac with more RAM

General · zai-org

Won't Fit

156.6 GB652% of RAM358.34B params

Needs 157+ GB — try a Mac with more RAM

sarvam 105b uncensored

General · aoxo

Won't Fit

24.8 GB103% of RAM55.73B params

Needs 25+ GB — try a Mac with more RAM

Qwen3 VL 235B A22B Instruct FP8

Chat · Alibaba

Won't Fit

103.1 GB430% of RAM235.68B params

Needs 103+ GB — try a Mac with more RAM

General · nvidia

Won't Fit

190.1 GB792% of RAM435.24B params

Needs 190+ GB — try a Mac with more RAM

General · stepfun-ai

Won't Fit

140.3 GB585% of RAM320.97B params

Needs 140+ GB — try a Mac with more RAM

General · stepfun-ai

Won't Fit

87.3 GB364% of RAM199.38B params

Needs 87+ GB — try a Mac with more RAM

deepseek coder v2 instruct awq

Coding · casperhansen

Won't Fit

103.2 GB430% of RAM235.74B params

Needs 103+ GB — try a Mac with more RAM

gpt oss 120b heretic

General · kldzj

Won't Fit

51.4 GB214% of RAM116.83B params

Needs 51+ GB — try a Mac with more RAM

Meta Llama 3.3 70B Instruct AWQ INT4

Chat · ibnzterrell

Won't Fit

31.2 GB130% of RAM70.55B params

Needs 31+ GB — try a Mac with more RAM

General · Meta

Won't Fit

31.2 GB130% of RAM70.55B params

Needs 31+ GB — try a Mac with more RAM

Kimi K2 Instruct

Chat · moonshotai · 2025-07-11

Won't Fit

447.6 GB1,865% of RAM1026.47B params

Needs 448+ GB — try a Mac with more RAM

Qwen3 Coder 480B A35B Instruct

Coding · Alibaba · 2025-07-22

Won't Fit

209.6 GB873% of RAM480.15B params

Needs 210+ GB — try a Mac with more RAM

Qwen2.5 VL 72B Instruct

Chat · Alibaba

Won't Fit

32.5 GB135% of RAM73.41B params

Needs 32+ GB — try a Mac with more RAM

Kimi K2 Instruct 0905

Chat · moonshotai

Won't Fit

447.6 GB1,865% of RAM1026.47B params

Needs 448+ GB — try a Mac with more RAM

Qwen2 72B Instruct

Chat · Alibaba

Won't Fit

32.2 GB134% of RAM72.71B params

Needs 32+ GB — try a Mac with more RAM

General · lmstudio-community

Won't Fit

141.3 GB589% of RAM228.69B params

Needs 141+ GB — try a Mac with more RAM

General · minimaxai

Won't Fit

100.1 GB417% of RAM228.7B params

Needs 100+ GB — try a Mac with more RAM

Llama 4 Maverick 17B 128E Instruct FP8

Chat · Meta

Won't Fit

175.4 GB731% of RAM401.65B params

Needs 175+ GB — try a Mac with more RAM

General · zai-org

Won't Fit

156.6 GB652% of RAM358.34B params

Needs 157+ GB — try a Mac with more RAM

Meta Llama 3 70B Instruct

Chat · Meta

Won't Fit

31.2 GB130% of RAM70.55B params

Needs 31+ GB — try a Mac with more RAM

Qwen3.5 397B A17B

General · lmstudio-community

Won't Fit

69.4 GB289% of RAM111.93B params

Needs 69+ GB — try a Mac with more RAM

General · internlm

Won't Fit

105.3 GB439% of RAM240.71B params

Needs 105+ GB — try a Mac with more RAM

General · lmstudio-community

Won't Fit

72.4 GB302% of RAM116.83B params

Needs 72+ GB — try a Mac with more RAM

Qwen3.5 122B A10B AWQ

General · quanttrio

Won't Fit

55.0 GB229% of RAM125.09B params

Needs 55+ GB — try a Mac with more RAM

Kimi Linear 48B A3B Instruct

Chat · moonshotai

Won't Fit

21.9 GB91% of RAM49.12B params

Needs 22+ GB — try a Mac with more RAM

DeepSeek V3 0324 NVFP4

General · nvidia

Won't Fit

173.3 GB722% of RAM396.77B params

Needs 173+ GB — try a Mac with more RAM

DeepSeek V2.5 1210 FP8

General · redhatai

Won't Fit

103.2 GB430% of RAM235.74B params

Needs 103+ GB — try a Mac with more RAM

General · xiaomimimo · 2025-12-16

Won't Fit

135.4 GB564% of RAM309.79B params

Needs 135+ GB — try a Mac with more RAM

Kimi K2 Thinking

General · moonshotai

Won't Fit

461.3 GB1,922% of RAM1058.12B params

Needs 461+ GB — try a Mac with more RAM

General · Alibaba

Won't Fit

32.2 GB134% of RAM72.71B params

Needs 32+ GB — try a Mac with more RAM

MiniMax M2.5 AWQ

General · quanttrio

Won't Fit

100.1 GB417% of RAM228.69B params

Needs 100+ GB — try a Mac with more RAM

General · salesforce

Won't Fit

20.8 GB87% of RAM46.7B params

Needs 21+ GB — try a Mac with more RAM

Qwen3 235B A22B FP8

General · Alibaba

Won't Fit

102.9 GB429% of RAM235.11B params

Needs 103+ GB — try a Mac with more RAM

Seed OSS 36B Instruct

Chat · lmstudio-community

Won't Fit

22.8 GB95% of RAM36.15B params

Needs 23+ GB — try a Mac with more RAM

LongCat Flash Chat

Chat · meituan-longcat

Won't Fit

245.2 GB1,022% of RAM561.86B params

Needs 245+ GB — try a Mac with more RAM

command a vision 07 2025

General · coherelabs

Won't Fit

49.2 GB205% of RAM111.87B params

Needs 49+ GB — try a Mac with more RAM

General · zai-org

Won't Fit

47.4 GB198% of RAM107.71B params

Needs 47+ GB — try a Mac with more RAM

Llama 3_3 Nemotron Super 49B v1_5 FP8

General · nvidia

Won't Fit

22.2 GB93% of RAM49.87B params

Needs 22+ GB — try a Mac with more RAM

DeepSeek V3.2 NVFP4

General · nvidia

Won't Fit

172.3 GB718% of RAM394.5B params

Needs 172+ GB — try a Mac with more RAM

Llama 4 Scout 17B 16E Instruct FP8 dynamic

Chat · redhatai

Won't Fit

47.8 GB199% of RAM108.66B params

Needs 48+ GB — try a Mac with more RAM

Qwen2 VL 72B Instruct

Chat · Alibaba

Won't Fit

32.5 GB135% of RAM73.41B params

Needs 32+ GB — try a Mac with more RAM

General · minimaxai

Won't Fit

100.1 GB417% of RAM228.7B params

Needs 100+ GB — try a Mac with more RAM

Llama 3_3 Nemotron Super 49B v1

General · nvidia

Won't Fit

22.2 GB93% of RAM49.87B params

Needs 22+ GB — try a Mac with more RAM

Mixtral 8x22B Instruct v0.1

Chat · Mistral AI · 2024-04-16

Won't Fit

61.7 GB257% of RAM140.63B params

Needs 62+ GB — try a Mac with more RAM

Meta Llama 3.1 70B Instruct FP8

Chat · redhatai

Won't Fit

31.2 GB130% of RAM70.55B params

Needs 31+ GB — try a Mac with more RAM

Qwen3 Next 80B A3B Instruct

Chat · lmstudio-community

Won't Fit

49.5 GB206% of RAM79.67B params

Needs 50+ GB — try a Mac with more RAM

General · Alibaba

Won't Fit

32.2 GB134% of RAM72.71B params

Needs 32+ GB — try a Mac with more RAM

Llama 3.3 70B Instruct FP8 dynamic

Chat · redhatai

Won't Fit

31.2 GB130% of RAM70.56B params

Needs 31+ GB — try a Mac with more RAM

K EXAONE 236B A23B

General · lgai-exaone

Won't Fit

103.8 GB432% of RAM237.1B params

Needs 104+ GB — try a Mac with more RAM

MiniMax M2.5 NVFP4

General · nvidia

Won't Fit

51.2 GB213% of RAM116.35B params

Needs 51+ GB — try a Mac with more RAM

Llama 4 Maverick 17B 128E Instruct FP8

Chat · redhatai

Won't Fit

175.4 GB731% of RAM401.65B params

Needs 175+ GB — try a Mac with more RAM

General · inclusionai

Won't Fit

441.5 GB1,839% of RAM1012.47B params

Needs 441+ GB — try a Mac with more RAM

Llama 4 Maverick 17B 128E Instruct

Chat · Meta · 2025-04-01

Won't Fit

175.4 GB731% of RAM401.58B params

Needs 175+ GB — try a Mac with more RAM

Llama 4 Scout 17B 16E Instruct

Chat · redhatai

Won't Fit

47.8 GB199% of RAM108.64B params

Needs 48+ GB — try a Mac with more RAM

General · rednote-hilab · 2025-05-14

Won't Fit

62.7 GB261% of RAM142.77B params

Needs 63+ GB — try a Mac with more RAM

Nous Hermes 2 Mixtral 8x7B DPO

General · NousResearch · 2024-01-11

Won't Fit

20.8 GB87% of RAM46.7B params

Needs 21+ GB — try a Mac with more RAM

Llama 3.2 90B Vision Instruct

Chat · Meta

Won't Fit

39.1 GB163% of RAM88.59B params

Needs 39+ GB — try a Mac with more RAM

General · bigscience · 2022-05-19

Won't Fit

77.3 GB322% of RAM176.25B params

Needs 77+ GB — try a Mac with more RAM

General · allenai

Won't Fit

32.4 GB135% of RAM73.31B params

Needs 32+ GB — try a Mac with more RAM

Qwen3.5 397B A17B MXFP4

General · amd

Won't Fit

97.3 GB405% of RAM222.2B params

Needs 97+ GB — try a Mac with more RAM

General · gadflyii

Won't Fit

27.3 GB114% of RAM61.52B params

Needs 27+ GB — try a Mac with more RAM

falcon 180B chat

Chat · TII · 2023-09-04

Won't Fit

78.7 GB328% of RAM179.52B params

Needs 79+ GB — try a Mac with more RAM

ERNIE 4.5 300B A47B Paddle

General · baidu · 2025-06-28

Won't Fit

131.4 GB547% of RAM300.47B params

Needs 131+ GB — try a Mac with more RAM

Model not in the list? Paste a Hugging Face URL or ID for an instant fit check.

Frequently Asked Questions

How much RAM do I need to run LLMs on a Mac?

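It depends on the model size and quantization. A 7B parameter model at Q4 quantization needs about 5 GB of RAM, while a 70B model at Q4 needs 40+ GB. Apple Silicon Macs use unified memory, so the GPU draws on the same RAM pool as the CPU and no separate VRAM is required. macOS and other apps still need a share of that pool, which is why the calculator above shows an "Available for Models" figure somewhat below total RAM.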
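
The 5 GB and 40+ GB figures above can be roughly reproduced with a back-of-the-envelope formula: parameter count times bits per weight, plus a fixed allowance for context and runtime buffers. The sketch below is only an illustration; the function name, the 4.5 bits-per-weight default (4-bit values plus per-block scale data) and the 1 GB overhead are assumptions, and the calculator on this page may use different constants.

```python
def estimate_model_gb(params_billions: float,
                      bits_per_weight: float = 4.5,   # assumed: ~Q4 incl. scale metadata
                      overhead_gb: float = 1.0) -> float:  # assumed: KV cache + buffers
    """Rough memory estimate in GB (1 GB = 1e9 bytes) for a quantized model."""
    weights_gb = params_billions * bits_per_weight / 8  # bits -> bytes
    return weights_gb + overhead_gb


for size in (7, 70):
    print(f"{size}B at ~Q4: {estimate_model_gb(size):.1f} GB")
# 7B at ~Q4: 4.9 GB
# 70B at ~Q4: 40.4 GB
```

Longer context windows grow the overhead term, so treat the result as a lower bound rather than a guarantee.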

Can I run a 70B model on a MacBook Air?

Not comfortably. A 70B model at Q4 quantization needs about 40 GB of RAM. The MacBook Air maxes out at 24-32 GB depending on the generation. You'd need a Mac Studio or MacBook Pro with 48+ GB for a 70B model to run well.

What's the fastest LLM I can run on my Mac?

Speed depends on your chip's memory bandwidth and the model size. Smaller models (3-7B) run fastest — expect 40-70+ tokens per second on M2 Pro or better. Use the calculator above to see estimated speeds for your specific Mac.
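
A common rule of thumb explains why bandwidth and model size dominate: generating each token reads roughly all of the model's weights once, so decode speed is bounded by memory bandwidth divided by model size. The sketch below applies that rule; the function name and the 0.7 efficiency factor are assumptions for illustration, and this is not necessarily the formula the calculator uses.

```python
def estimate_tokens_per_second(bandwidth_gb_s: float,
                               model_gb: float,
                               efficiency: float = 0.7) -> float:
    """Bandwidth-bound estimate of decode speed.

    efficiency is an assumed fraction of peak bandwidth that is
    actually achieved; real values vary by chip and runtime.
    """
    if model_gb <= 0:
        raise ValueError("model_gb must be positive")
    return bandwidth_gb_s * efficiency / model_gb


# Example: a chip with 307 GB/s of bandwidth and two model sizes.
for model_gb in (5.0, 40.0):
    speed = estimate_tokens_per_second(307, model_gb)
    print(f"{model_gb:.0f} GB model: ~{speed:.0f} tok/s")
```

The estimate scales inversely with model size: a model eight times larger runs about eight times slower on the same chip.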

What does quantization mean for model quality?

Quantization stores model weights at lower precision to use less memory. Q8 (8-bit) is nearly lossless. Q4 (4-bit) cuts memory by ~75% compared with 16-bit weights, with minor quality loss; it's the sweet spot for most users. Q2 (2-bit) saves the most memory but noticeably degrades output quality.
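
The savings follow directly from the bit widths. As a minimal sketch, assuming a 16-bit baseline and nominal bit counts (real quantized files also store per-block scale data, so actual savings are slightly smaller):

```python
def savings_vs_16bit(bits: float) -> float:
    """Fraction of memory saved relative to 16-bit weights."""
    return 1 - bits / 16


for name, bits in (("Q8", 8), ("Q4", 4), ("Q2", 2)):
    print(f"{name}: {savings_vs_16bit(bits):.1%} smaller than 16-bit")
# Q8: 50.0% smaller than 16-bit
# Q4: 75.0% smaller than 16-bit
# Q2: 87.5% smaller than 16-bit
```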

How is Apple Silicon different from NVIDIA for LLMs?

Apple Silicon uses unified memory — CPU and GPU share the same RAM pool. A Mac with 32 GB can load a 28 GB model directly. On NVIDIA systems, you're limited by GPU VRAM (typically 8-24 GB on consumer cards), even if the PC has 64 GB of system RAM.

Does ToolPiper use GPU or CPU for inference?

ToolPiper uses Metal GPU acceleration via llama.cpp for LLM inference on Apple Silicon. The GPU and CPU share unified memory, so there's no data transfer overhead. The Neural Engine (ANE) is used for specific tasks like super-resolution and pose detection.

Can I run multiple models at the same time?

Yes, if you have enough RAM. ToolPiper manages model loading and can keep multiple models in memory simultaneously. When memory gets tight, it automatically evicts the least recently used model to make room for a new one.
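
Least-recently-used eviction can be sketched in a few lines. The class below is a hypothetical illustration of the general technique, not ToolPiper's actual implementation; the class name, method names, and model names are all invented for the example.

```python
from collections import OrderedDict


class ModelCache:
    """Toy LRU cache that keeps loaded models within a memory budget."""

    def __init__(self, budget_gb: float):
        self.budget_gb = budget_gb
        self._loaded = OrderedDict()  # name -> size in GB, least recently used first

    def used_gb(self) -> float:
        return sum(self._loaded.values())

    def use(self, name: str, size_gb: float) -> list:
        """Mark a model as used, loading it if needed. Returns evicted names."""
        if size_gb > self.budget_gb:
            raise MemoryError(f"{name} does not fit in {self.budget_gb} GB")
        if name in self._loaded:
            self._loaded.move_to_end(name)  # now the most recently used
            return []
        evicted = []
        # Drop least recently used models until the new one fits.
        while self.used_gb() + size_gb > self.budget_gb:
            old_name, _ = self._loaded.popitem(last=False)
            evicted.append(old_name)
        self._loaded[name] = size_gb
        return evicted


cache = ModelCache(budget_gb=21.0)
cache.use("small", 1.5)
cache.use("medium", 9.8)
cache.use("small", 1.5)            # touch "small" so "medium" becomes the oldest
print(cache.use("large", 12.0))    # ['medium']
```

Touching a model on every use is what keeps frequently used models resident while idle ones are dropped first.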

What's the difference between GGUF and other formats?

GGUF is the standard format for running quantized models with llama.cpp (and ToolPiper). It supports all quantization levels and runs on CPU and GPU. MLX is Apple's machine-learning framework, with its own model format optimized for Apple Silicon. AWQ and GPTQ are quantization formats aimed mainly at NVIDIA GPUs and don't run natively on Mac.

Model database updated: 2026-06-03 · 867 models