llmpm
v0.1
RANKINGS
Open LLM Leaderboard — 4,637 models ranked
Showing 1–20 of 4,637
| # | Model | Avg | BBH | GPQA | IFEval | MATH | MMLU-Pro | MuSR |
|---|-------|-----|-----|------|--------|------|----------|------|
| 1 | kimi-k2.5 | 77.3% | — | 87.6% | — | — | — | — |
| 2 | kimi-k2-thinking-0905 | 73.4% | — | 84.5% | — | — | — | — |
| 3 | qwen3.5-397b-a17b | 70.2% | — | 88.4% | — | — | — | — |
| 4 | glm-4.7 | 70.0% | — | 85.7% | — | — | — | — |
| 5 | longcat-flash-thinking-2601 | 66.4% | — | 80.5% | — | — | — | — |
| 6 | mimo-v2-flash | 66.3% | — | 83.7% | — | — | — | — |
| 7 | deepseek-reasoner | 65.0% | — | 82.4% | — | — | — | — |
| 8 | minimax-m2.1 | 62.6% | — | 81.0% | — | — | — | — |
| 9 | glm-4.6 | 61.0% | — | 81.0% | — | — | — | — |
| 10 | deepseek-v3.2-exp | 59.4% | — | 79.9% | — | — | — | — |
| 11 | glm-4.7-flash | 56.6% | — | 75.2% | — | — | — | — |
| 12 | minimax-m2 | 56.4% | — | 78.0% | — | — | — | — |
| 13 | MaziyarPanahi/calme-3.2-instruct-78b (Qwen2ForCausalLM · bfloat16) | 52.1% | 62.6% | 20.4% | 80.6% | 40.3% | 70.0% | 38.5% |
| 14 | MaziyarPanahi/calme-3.1-instruct-78b (Qwen2ForCausalLM · bfloat16) | 51.3% | 62.4% | 19.5% | 81.4% | 39.3% | 68.7% | 36.5% |
| 15 | dfurman/CalmeRys-78B-Orpo-v0.1 (Qwen2ForCausalLM · bfloat16) | 51.2% | 61.9% | 20.0% | 81.6% | 40.6% | 66.8% | 36.4% |
| 16 | MaziyarPanahi/calme-2.4-rys-78b (Qwen2ForCausalLM · bfloat16) | 50.8% | 62.2% | 20.4% | 80.1% | 40.7% | 66.7% | 34.6% |
| 17 | huihui-ai/Qwen2.5-72B-Instruct-abliterated (Qwen2ForCausalLM · bfloat16) | 48.1% | 60.5% | 19.4% | 85.9% | 60.1% | 50.4% | 12.3% |
| 18 | Qwen/Qwen2.5-72B-Instruct (Qwen2ForCausalLM · bfloat16) | 48.0% | 61.9% | 16.7% | 86.4% | 59.8% | 51.4% | 11.7% |
| 19 | MaziyarPanahi/calme-2.1-qwen2.5-72b (Qwen2ForCausalLM · bfloat16) | 47.9% | 61.7% | 15.1% | 86.6% | 59.1% | 51.3% | 13.3% |
| 20 | newsbang/Homer-v1.0-Qwen2.5-72B (Qwen2ForCausalLM · bfloat16) | 47.5% | 62.3% | 22.1% | 76.3% | 49.0% | 57.2% | 17.9% |