LLM Performance Benchmarks
Track and compare inference performance, reported as throughput (tokens per second) and time to first token (TTFT), across different LLM models and hardware configurations. A sketch of how these metrics can be measured follows the table.
| Date Reported | LLM Model | CPU | Stack | GPU | RAM | Throughput | TTFT | Source |
|---|---|---|---|---|---|---|---|---|
| 2/17/2025 | Deepseek-r1:7b | Threadripper Pro 3975WX | Windows 11 | RTX 3080 | N/A | 58.42 t/s | 0 ms | |
| 2/17/2025 | Deepseek-r1:14b | Threadripper Pro 3975WX | Windows 11 | RTX 3080 | N/A | 18.61 t/s | 0 ms | |
| 2/17/2025 | DeepSeek 14b | Ryzen 9 7850X3D | Debian 12 | 4x RTX 3090 (Dell OEM) @ PCIe 4.0 x4 | N/A | 65.51 t/s | 0 ms | |
| 2/17/2025 | DeepSeek 14b | Xeon W-2245 (8-core) | Ubuntu 24.04 LTS | 1x RTX 3090 + 1x RTX 4000 | N/A | 55.41 t/s | 0 ms | |
| 2/17/2025 | DeepSeek 14b | AMD Ryzen 5 7600X | Windows 11 (via Ollama) | MSI Radeon RX 6950 XT 16GB | N/A | 41.3 t/s | 0 ms | |
| 2/17/2025 | DeepSeek 14b | Intel Core i3-12100 | Windows 10 LTSC x64 | NVIDIA GeForce RTX 3060 | N/A | 29.79 t/s | 0 ms | |
| 2/17/2025 | Unsloth Phi 4 Q4 | MacBook Pro M4 Max (128GB RAM) | MLX-community | N/A | N/A | 51.5 t/s | 0 ms | |
| 2/17/2025 | Phi 4 fp16 | MacBook Pro M4 Max (128GB RAM) | MLX-community | N/A | N/A | 15.5 t/s | 0 ms | |
| 2/17/2025 | Mistral FP4 | MacBook Pro M4 Max (128GB RAM) | MLX-community | N/A | N/A | 34.5 t/s | 0 ms | |
| 2/17/2025 | Mistral FP6 | MacBook Pro M4 Max (128GB RAM) | MLX-community | N/A | N/A | 24.5 t/s | 0 ms | |
| 2/17/2025 | Mistral FP16 | MacBook Pro M4 Max (128GB RAM) | MLX-community | N/A | N/A | 7.5 t/s | 0 ms | |
| 2/17/2025 | mistral-small-24b-instruct-2501-Q8_0.gguf | Apple M2 MacBook (64GB) | llama.cpp | N/A | N/A | 13 t/s | 0 ms | |
| 2/17/2025 | mistral-small-24b-instruct-2501 | M3 (36GB) | LocalLLaMA | N/A | N/A | 18 t/s | 0 ms | |
| 2/17/2025 | Meta-Llama 3.1-70B-Instruct.IQ1_M | N/A | Not specified | 7900 XTX | N/A | 24.37 t/s | 660 ms | |
| 2/17/2025 | Mixtral 8x22B | M3 Max | 4-bit quantization | N/A | N/A | 4.5 t/s | 0 ms | |
| 2/10/2025 | DeepSeek-R1:671b (1.58-bit) | Threadripper 7960 | llama.cpp with NVMe RAID offload | RTX 3090 | N/A | 2 t/s | 0 ms | |
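
For context on how numbers like these can be produced, below is a minimal measurement sketch in Python. It assumes a local OpenAI-compatible streaming endpoint (llama.cpp's llama-server and Ollama both expose one); the URL, model name, and prompt are placeholder assumptions, not values taken from the table.

```python
"""Minimal sketch: measure TTFT and decode throughput against a local
OpenAI-compatible streaming endpoint. URL, model, and prompt are placeholders."""
import json
import time

import requests

URL = "http://localhost:8080/v1/chat/completions"  # placeholder endpoint
MODEL = "deepseek-r1:14b"                           # placeholder model name

payload = {
    "model": MODEL,
    "messages": [{"role": "user", "content": "Explain KV caching in one paragraph."}],
    "stream": True,
    "max_tokens": 256,
}

start = time.perf_counter()
first_token_at = None
chunks = 0  # streamed content chunks; roughly one token each for most servers

with requests.post(URL, json=payload, stream=True, timeout=300) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line.startswith(b"data: "):
            continue  # skip blank keep-alive lines
        data = line[len(b"data: "):]
        if data == b"[DONE]":
            break
        choices = json.loads(data).get("choices") or []
        if not choices:
            continue
        piece = choices[0].get("delta", {}).get("content") or ""
        if piece:
            if first_token_at is None:
                first_token_at = time.perf_counter()  # first visible token
            chunks += 1

if first_token_at is None:
    raise SystemExit("No tokens received; check the endpoint and model name.")

decode_time = max(time.perf_counter() - first_token_at, 1e-9)
print(f"TTFT: {(first_token_at - start) * 1000:.0f} ms")
print(f"Throughput: {chunks / decode_time:.2f} t/s (decode phase, chunk-counted)")
```

Counting streamed chunks only approximates the token count; for exact throughput, prefer the token statistics the serving stack itself reports (for example, llama.cpp's llama-bench tool), which is presumably how most figures in this table were obtained.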