Chatbot Arena Elo Rating
A crowdsourced ranking in which users compare model outputs head-to-head and vote for the better response. A higher Elo score indicates stronger overall performance across diverse tasks.
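As a rough intuition for how head-to-head votes translate into the scores below, here is a minimal sketch of a classic sequential Elo update. The K-factor of 32 and the 400-point scale are illustrative defaults, not values from this leaderboard; Chatbot Arena's published scores are computed with a statistical fit (a Bradley-Terry-style model) rather than this simple sequential rule.

```python
def elo_update(r_a, r_b, score_a, k=32):
    """One Elo update after a single head-to-head comparison.

    score_a is 1.0 if model A wins, 0.0 if model B wins, 0.5 for a tie.
    Returns the pair of updated ratings (new_a, new_b).
    """
    # Expected win probability for A given the current rating gap.
    expected_a = 1 / (1 + 10 ** ((r_b - r_a) / 400))
    # Rating points gained by A are exactly the points lost by B.
    delta = k * (score_a - expected_a)
    return r_a + delta, r_b - delta

# Example: a 1500-rated model loses one vote to a 1470-rated model.
new_a, new_b = elo_update(1500, 1470, 0.0)
```

Because the update is zero-sum, a single upset loss moves the favorite down by the same amount it moves the underdog up, and upsets move ratings more than expected results do.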
| # | Model | Organization | Score |
|---|---|---|---|
| 1 | Gemini 3.1 Pro | Google | 1,500 |
| 2 | Claude Opus 4.6 | Anthropic | 1,496 |
| 3 | Gemini 3 Pro | Google | 1,486 |
| 4 | GPT-5.3 Codex | OpenAI | 1,480 |
| 5 | Claude Sonnet 4.6 | Anthropic | 1,470 |
| 6 | Gemini 3 Flash | Google | 1,470 |
| 7 | Claude Opus 4.5 | Anthropic | 1,467 |
| 8 | Grok 4.1 | xAI | 1,465 |
| 9 | GPT-5.1 | OpenAI | 1,458 |
| 10 | Claude Sonnet 4.5 | Anthropic | 1,450 |
| 11 | Gemini 2.5 Pro | Google | 1,450 |
| 12 | GPT-5.2 | OpenAI | 1,438 |
| 13 | o3 | OpenAI | 1,433 |
| 14 | DeepSeek V3.2 | DeepSeek | 1,419 |
| 15 | DeepSeek R1 | DeepSeek | 1,418 |
| 16 | Mistral Large 3 | Mistral | 1,414 |
| 17 | GPT-4.1 | OpenAI | 1,413 |
| 18 | Gemini 2.5 Flash | Google | 1,410 |
| 19 | Grok 4 | xAI | 1,410 |
| 20 | Claude Haiku 4.5 | Anthropic | 1,404 |
| 21 | o4-mini | OpenAI | 1,380 |
| 22 | Llama 4 Maverick | Meta | 1,365 |
| 23 | Llama 4 Scout | Meta | 1,330 |
| 24 | GPT-4.1 mini | OpenAI | 1,280 |
| 25 | Mistral Small 3.2 | Mistral | 1,220 |
| 26 | GPT-4.1 nano | OpenAI | 1,190 |