Model Horizon
© 2026 Model Horizon

MATH Leaderboard

Mathematics Problem Solving

Competition-level mathematics problems spanning algebra, geometry, number theory, and calculus. Tests multi-step mathematical reasoning.

Models Tested: 18
Highest Score: 98%
Average: 89.8%
Spread: 27.9%
#    Model               Provider    Score
1    GPT-5.2             OpenAI      98%
2    Claude Sonnet 4.6   Anthropic   97.8%
3    Claude Opus 4.6     Anthropic   97.6%
4    DeepSeek R1         DeepSeek    97.3%
5    Gemini 3.1 Pro      Google      96.8%
6    GPT-5.3 Codex       OpenAI      96%
7    Gemini 3 Pro        Google      95%
8    o4-mini             OpenAI      93.4%
9    Claude Opus 4.5     Anthropic   92%
10   Grok 4              xAI         91.7%
11   o3                  OpenAI      91.6%
12   Gemini 2.5 Pro      Google      90.2%
13   Gemini 3 Flash      Google      90%
14   Claude Sonnet 4.5   Anthropic   87%
15   Gemini 2.5 Flash    Google      82.1%
16   Llama 4 Maverick    Meta        75.8%
17   GPT-4.1             OpenAI      73.8%
18   Llama 4 Scout       Meta        70.1%
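
The summary figures (models tested, highest score, average, spread) can be cross-checked against the table rows. The sketch below is illustrative only: it assumes the average is an unweighted mean of the listed scores and that the spread is the highest score minus the lowest, neither of which the page states explicitly. The names `SCORES` and `summarize` are hypothetical and not part of the site.

```python
from statistics import mean

# Scores copied from the leaderboard table, in percent.
SCORES = {
    "GPT-5.2": 98.0,
    "Claude Sonnet 4.6": 97.8,
    "Claude Opus 4.6": 97.6,
    "DeepSeek R1": 97.3,
    "Gemini 3.1 Pro": 96.8,
    "GPT-5.3 Codex": 96.0,
    "Gemini 3 Pro": 95.0,
    "o4-mini": 93.4,
    "Claude Opus 4.5": 92.0,
    "Grok 4": 91.7,
    "o3": 91.6,
    "Gemini 2.5 Pro": 90.2,
    "Gemini 3 Flash": 90.0,
    "Claude Sonnet 4.5": 87.0,
    "Gemini 2.5 Flash": 82.1,
    "Llama 4 Maverick": 75.8,
    "GPT-4.1": 73.8,
    "Llama 4 Scout": 70.1,
}


def summarize(scores):
    """Return the headline statistics for a model -> score mapping.

    Assumption: average = unweighted mean, spread = max - min,
    both rounded to one decimal place as on the page.
    """
    values = list(scores.values())
    return {
        "models_tested": len(values),
        "highest": max(values),
        "average": round(mean(values), 1),
        "spread": round(max(values) - min(values), 1),
    }


summary = summarize(SCORES)
print(summary)
```

Under these assumptions the computed values agree with the figures shown above the table: 18 models, a highest score of 98%, an average of 89.8%, and a spread of 27.9 percentage points.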