LLM Performance Benchmarks
Track and compare inference performance, reported as throughput (tokens per second) and time to first token (TTFT), across different LLM models and hardware configurations. A sketch of how these metrics can be measured follows the table.
| Date Reported | LLM Model | CPU | Stack | GPU | RAM | Throughput | TTFT | Source |
|---|---|---|---|---|---|---|---|---|
| 2/17/2025 | Deepseek-r1:7b | Threadripper Pro 3975WX | Windows 11 | RTX 3080 | N/A | 58.42 t/s | 0 ms | |
| 2/17/2025 | Deepseek-r1:14b | Threadripper Pro 3975WX | Windows 11 | RTX 3080 | N/A | 18.61 t/s | 0 ms | |
| 2/17/2025 | DeepSeek 14b | Ryzen 9 7850X3D | Debian 12 | 4x RTX 3090 (Dell OEM) @ PCIe 4.0 x4 | N/A | 65.51 t/s | 0 ms | |
| 2/17/2025 | DeepSeek 14b | Xeon W-2245 (8-core) | Ubuntu 24.04 LTS | 1x RTX 3090 + 1x RTX 4000 | N/A | 55.41 t/s | 0 ms | |
| 2/17/2025 | DeepSeek 14b | AMD Ryzen 5 7600X | Windows 11 (via Ollama) | MSI Radeon RX 6950 XT 16GB | N/A | 41.3 t/s | 0 ms | |
| 2/17/2025 | DeepSeek 14b | Intel Core i3-12100 | Windows 10 LTSC x64 | NVIDIA GeForce RTX 3060 | N/A | 29.79 t/s | 0 ms | |
| 2/17/2025 | Unsloth Phi 4 Q4 | MacBook Pro M4 Max (128GB RAM) | MLX-community | N/A | N/A | 51.5 t/s | 0 ms | |
| 2/17/2025 | Phi 4 fp16 | MacBook Pro M4 Max (128GB RAM) | MLX-community | N/A | N/A | 15.5 t/s | 0 ms | |
| 2/17/2025 | Mistral FP4 | MacBook Pro M4 Max (128GB RAM) | MLX-community | N/A | N/A | 34.5 t/s | 0 ms | |
| 2/17/2025 | Mistral FP6 | MacBook Pro M4 Max (128GB RAM) | MLX-community | N/A | N/A | 24.5 t/s | 0 ms | |
| 2/17/2025 | Mistral FP16 | MacBook Pro M4 Max (128GB RAM) | MLX-community | N/A | N/A | 7.5 t/s | 0 ms | |
| 2/17/2025 | mistral-small-24b-instruct-2501-Q8_0.gguf | Apple M2 MacBook (64GB) | llama.cpp | N/A | N/A | 13 t/s | 0 ms | |
| 2/17/2025 | mistral-small-24b-instruct-2501 | M3 (36GB) | LocalLLaMA | N/A | N/A | 18 t/s | 0 ms | |
| 2/17/2025 | Meta-Llama 3.1-70B-Instruct.IQ1_M | N/A | Not specified | 7900 XTX | N/A | 24.37 t/s | 660 ms | |
| 2/17/2025 | Mixtral 8x22B | M3 Max | 4-bit quantization | N/A | N/A | 4.5 t/s | 0 ms | |
| 2/10/2025 | DeepSeek-R1:671b (1.58-bit) | Threadripper 7960 | llama.cpp with NVMe RAID offload | RTX 3090 | N/A | 2 t/s | 0 ms | |
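
For context on how numbers like these can be produced, below is a minimal measurement sketch in Python. It assumes a local OpenAI-compatible streaming endpoint (llama.cpp's llama-server and Ollama both expose one); the URL, model name, and prompt are placeholder assumptions, not values taken from the table.

```python
"""Minimal sketch: measure TTFT and decode throughput against a local
OpenAI-compatible streaming endpoint. URL, model, and prompt are placeholders."""
import json
import time

import requests

URL = "http://localhost:8080/v1/chat/completions"  # placeholder endpoint
MODEL = "deepseek-r1:14b"                           # placeholder model name

payload = {
    "model": MODEL,
    "messages": [{"role": "user", "content": "Explain KV caching in one paragraph."}],
    "stream": True,
    "max_tokens": 256,
}

start = time.perf_counter()
first_token_at = None
chunks = 0  # streamed content chunks; roughly one token each for most servers

with requests.post(URL, json=payload, stream=True, timeout=300) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line.startswith(b"data: "):
            continue  # skip blank keep-alive lines
        data = line[len(b"data: "):]
        if data == b"[DONE]":
            break
        choices = json.loads(data).get("choices") or []
        if not choices:
            continue
        piece = choices[0].get("delta", {}).get("content") or ""
        if piece:
            if first_token_at is None:
                first_token_at = time.perf_counter()  # first visible token
            chunks += 1

if first_token_at is None:
    raise SystemExit("No tokens received; check the endpoint and model name.")

decode_time = max(time.perf_counter() - first_token_at, 1e-9)
print(f"TTFT: {(first_token_at - start) * 1000:.0f} ms")
print(f"Throughput: {chunks / decode_time:.2f} t/s (decode phase, chunk-counted)")
```

Counting streamed chunks only approximates the token count; for exact throughput, prefer the token statistics the serving stack itself reports (for example, llama.cpp's llama-bench tool), which is presumably how most figures in this table were obtained.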