Model Leaderboard

Compare key performance metrics for LLM APIs.

Updated at: 6/1/2025, 10:03:56 AM

| Model | Author | Provider | Rank | Price | Latency | Throughput | Percent |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Gemini 1.5 Flash | google | Google AI Studio | #80 | $0.08 | 0.30s | 152.8 T/s | 99.95% |
| GPT-4o-mini | openai | Azure | #9 | $0.15 | 1.04s | 217.8 T/s | 99.86% |
| Mistral Nemo | mistralai | Mistral | - | $0.02 | 0.26s | 137.3 T/s | 99.97% |
| Gemini 1.5 Flash 8B | google | Google AI Studio | #97 | $0.04 | 0.22s | 206.5 T/s | 97.32% |
| Llama 3.3 70B Instruct | meta-llama | Cerebras | #51 | $0.07 | 0.20s | 2446.5 T/s | 99.95% |
| Gemini 2.0 Flash | google | Google Vertex | #18 | $0.10 | 0.50s | 165.3 T/s | 99.96% |
| Gemini 2.0 Flash Lite | google | Google Vertex | #15 | $0.08 | 0.43s | 164.1 T/s | 99.99% |
| DeepSeek V3 0324 | deepseek | SambaNova | #6 | $0.31 | 2.70s | 122.6 T/s | 99.53% |
| MythoMax 13B | gryphe | Together | - | $0.07 | 0.42s | 140.1 T/s | 99.98% |
| Llama 3.1 70B Instruct | meta-llama | Fireworks | - | $0.10 | 0.39s | 96.3 T/s | 99.53% |
| Llama 3.1 8B Instruct | meta-llama | Cerebras | #125 | $0.02 | 0.30s | 3163.2 T/s | 99.93% |
| Llama 3.2 3B Instruct | meta-llama | SambaNova | #141 | $0.01 | 0.33s | 1867.0 T/s | 99.99% |
| Mistral Small 3 | mistralai | Mistral | #89 | $0.06 | 0.38s | 142.4 T/s | 99.99% |
| Gemma 3 27B | google | Parasail | #18 | $0.10 | 1.11s | 71.0 T/s | 99.87% |
| Llama 4 Maverick | meta-llama | Groq | #33 | $0.16 | 0.55s | 1134.6 T/s | 99.99% |
| GPT-4.1 Nano | openai | OpenAI | #45 | $0.10 | 0.44s | 106.2 T/s | 99.47% |
| GPT-4.1 | openai | OpenAI | #6 | $2.06 | 0.84s | 57.3 T/s | 99.52% |
| Gemini 2.5 Pro Preview | google | Google AI Studio | #1 | $1.33 | 2.42s | 240.9 T/s | 99.72% |
| Claude Sonnet 4 | anthropic | Google Vertex (Europe) | - | $3.12 | 1.57s | 82.9 T/s | 98.55% |
| Mixtral 8x7B Instruct | mistralai | DeepInfra | #132 | $0.08 | 0.56s | 121.2 T/s | 99.98% |
| Mistral Tiny | mistralai | Mistral | - | $0.25 | 0.29s | 130.7 T/s | 100.00% |
| Claude 3 Haiku | anthropic | Google Vertex | #100 | $0.26 | 1.75s | 183.0 T/s | 99.97% |
| WizardLM-2 8x22B | microsoft | Parasail | - | $0.50 | 1.11s | 75.6 T/s | 99.93% |
| GPT-4o | openai | Azure | - | $2.58 | 2.88s | 116.2 T/s | 97.16% |
| Hermes 2 Pro - Llama-3 8B | nousresearch | Lambda | - | $0.03 | 0.29s | 155.4 T/s | 99.98% |
| GPT-4o-mini (2024-07-18) | openai | OpenAI | #52 | $0.15 | 0.44s | 70.1 T/s | 99.92% |
| Llama 3 8B Lunaris | sao10k | NovitaAI | - | $0.02 | 0.81s | 86.0 T/s | 100.00% |
| Hermes 3 405B Instruct | nousresearch | Lambda | - | $0.71 | 1.06s | 34.3 T/s | 99.76% |
| Hermes 3 70B Instruct | nousresearch | Lambda | - | $0.12 | 0.50s | 48.6 T/s | 99.99% |
| Qwen2.5 72B Instruct | qwen | Together | #67 | $0.12 | 0.80s | 100.5 T/s | 99.97% |
| Rocinante 12B | thedrummer | Infermatic | - | $0.25 | 0.40s | 76.1 T/s | 99.89% |
| Ministral 8B | mistralai | Mistral | #108 | $0.10 | 0.24s | 126.1 T/s | 100.00% |
| Claude 3.5 Sonnet | anthropic | Google Vertex | - | $3.12 | 1.45s | 66.2 T/s | 99.75% |
| UnslopNemo 12B | thedrummer | Infermatic | - | $0.45 | 0.52s | 94.9 T/s | 99.92% |
| Qwen2.5 Coder 32B Instruct | qwen | Together | #88 | $0.06 | 0.82s | 68.2 T/s | 99.96% |
| GPT-4o (2024-11-20) | openai | OpenAI | - | $2.58 | 0.46s | 68.7 T/s | 99.83% |
| DeepSeek V3 | deepseek | Fireworks | #22 | $0.39 | 0.86s | 56.9 T/s | 99.63% |
| MiniMax-01 | minimax | Minimax | - | $0.21 | 1.69s | 27.5 T/s | 99.85% |
| R1 | deepseek | SambaNova | #9 | $0.47 | 4.62s | 112.2 T/s | 99.68% |
| R1 Distill Llama 70B | deepseek | Cerebras | - | $0.10 | 0.45s | 2415.5 T/s | 99.90% |
| LFM 3B | liquid | Liquid | - | $0.02 | 0.98s | 20.1 T/s | 99.79% |
| LFM 7B | liquid | Lambda | - | $0.01 | 0.42s | 108.7 T/s | 99.99% |
| Skyfall 36B V2 | thedrummer | Parasail | - | $0.51 | 0.90s | 40.9 T/s | 99.98% |
| Gemma 3 4B | google | DeepInfra | #63 | $0.02 | 0.33s | 81.2 T/s | 93.28% |
| Llama 4 Scout | meta-llama | Cerebras | #51 | $0.08 | 0.26s | 2369.5 T/s | 98.98% |
| Grok 3 Beta | x-ai | xAI Fast | #6 | $3.12 | 0.71s | 59.1 T/s | 99.85% |
| Grok 3 Mini Beta | x-ai | xAI Fast | - | $0.30 | 0.35s | 184.7 T/s | 99.96% |
| GPT-4.1 Mini | openai | OpenAI | #15 | $0.41 | 0.69s | 62.9 T/s | 99.52% |
| o4 Mini | openai | OpenAI | #6 | $1.14 | 5.43s | 165.0 T/s | 99.53% |
| Qwen3 235B A22B | qwen | Fireworks | #17 | $0.14 | 0.83s | 81.0 T/s | 96.41% |
| Qwen3 14B | qwen | Nebius AI Studio | - | $0.07 | 5.30s | 329.8 T/s | 98.43% |
| Gemini 2.5 Flash Preview 05-20 (thinking) | google | Vertex Thinking | #3 | $0.18 | 1.76s | 140.1 T/s | 99.19% |
| R1 0528 | deepseek | Baseten | #9 | $0.52 | 0.40s | 132.0 T/s | 99.59% |
| GPT-3.5 Turbo | openai | OpenAI | #123 | $0.51 | 0.33s | 52.7 T/s | 99.74% |
| ReMM SLERP 13B | undi95 | Mancer (private) | - | $0.81 | 0.72s | 41.6 T/s | 99.99% |
| Mistral Large | mistralai | Mistral | #55 | $2.05 | 0.46s | 45.3 T/s | 99.26% |
| Gemini 1.5 Pro | google | Google AI Studio | #49 | $1.29 | 0.59s | 75.3 T/s | 99.92% |
| Llama 3 70B Instruct | meta-llama | Groq | #76 | $0.30 | 0.21s | 406.4 T/s | 99.98% |
| Llama 3 8B Instruct | meta-llama | Groq | #119 | $0.03 | 0.42s | 3705.9 T/s | 99.66% |
| Mistral 7B Instruct | mistralai | Together | #158 | $0.03 | 0.37s | 208.5 T/s | 99.97% |
| Gemma 2 9B | google | Groq | #97 | $0.20 | 0.41s | 867.0 T/s | 14.62% |
| Llama 3.1 405B Instruct | meta-llama | SambaNova | - | $0.81 | 2.46s | 103.4 T/s | 100.00% |
| ChatGPT-4o | openai | OpenAI | #2 | $5.12 | 0.48s | 93.4 T/s | 99.69% |
| Llama 3.1 Euryale 70B v2.2 | sao10k | DeepInfra | - | $0.71 | 0.42s | 38.6 T/s | 99.98% |
| Command R (08-2024) | cohere | Cohere | #98 | $0.15 | 0.28s | 48.0 T/s | 99.82% |
| Lumimaid v0.2 8B | neversleep | Mancer (private) | - | $0.21 | 0.78s | 57.2 T/s | 98.94% |
| Llama 3.2 1B Instruct | meta-llama | SambaNova | #167 | $0.01 | 0.26s | 7692.3 T/s | 99.75% |
| Qwen2.5 7B Instruct | qwen | Together | - | $0.04 | 0.40s | 188.9 T/s | 100.00% |
| Ministral 3B | mistralai | Mistral | - | $0.04 | 0.18s | 231.8 T/s | 99.99% |
| Claude 3.5 Haiku | anthropic | Google Vertex | - | $0.83 | 2.03s | 76.0 T/s | 98.87% |
| Gemini 2.0 Flash Experimental (free) | google | Google AI Studio | #15 | $0.00 | 0.45s | 167.3 T/s | 37.01% |
| Grok 2 Vision 1212 | x-ai | xAI | - | $2.08 | 0.86s | 77.4 T/s | 99.94% |
| Llama 3.3 Euryale 70B | sao10k | Infermatic | - | $0.71 | 0.62s | 49.3 T/s | 99.97% |
| Phi 4 | microsoft | Nebius AI Studio | #100 | $0.07 | 0.20s | 119.6 T/s | 99.23% |
| Codestral 2501 | mistralai | Mistral | - | $0.31 | 0.27s | 172.7 T/s | 99.71% |
| o3 Mini | openai | OpenAI | #26 | $1.14 | 7.67s | 470.8 T/s | 99.35% |
| Qwen-Turbo | qwen | Alibaba | - | $0.05 | 0.52s | 107.9 T/s | 99.88% |
| Claude 3.7 Sonnet (thinking) | anthropic | Anthropic | #14 | $3.12 | 1.68s | 54.9 T/s | 96.62% |
| QwQ 32B | qwen | Groq | #142 | $0.15 | 0.43s | 571.4 T/s | 99.49% |
| Gemma 3 12B | google | Cloudflare | #26 | $0.05 | 0.30s | 75.8 T/s | 99.15% |
| Mistral Small 3.1 24B | mistralai | Mistral | #63 | $0.05 | 0.24s | 78.5 T/s | 99.85% |
| Llama 3.3 Nemotron Super 49B v1 | nvidia | Nebius AI Studio | #33 | $0.13 | 1.39s | 45.4 T/s | 91.92% |
| o3 | openai | OpenAI | #1 | $10.32 | 5.55s | 220.8 T/s | 99.73% |
| o4 Mini High | openai | OpenAI | - | $1.14 | 5.43s | 1030.6 T/s | 98.55% |
| Gemini 2.5 Flash Preview 04-17 (thinking) | google | AI Studio Thinking | #6 | $0.18 | 1.33s | 182.6 T/s | 99.18% |
| GLM Z1 32B (free) | thudm | Chutes | - | $0.00 | 1.77s | 50.4 T/s | 94.57% |
| MAI DS R1 (free) | microsoft | Chutes | - | $0.00 | 1.28s | 71.2 T/s | 99.75% |
| DeepSeek R1T Chimera (free) | tngtech | Chutes | - | $0.00 | 1.84s | 62.4 T/s | 99.74% |
| Qwen3 32B | qwen | Cerebras | #24 | $0.10 | 0.93s | 1806.5 T/s | 98.99% |
| Qwen3 30B A3B | qwen | Parasail | #49 | $0.08 | 0.69s | 143.9 T/s | 99.61% |
| DeepSeek Prover V2 (free) | deepseek | Chutes | - | $0.00 | 1.47s | 62.8 T/s | 99.64% |
| Mistral Medium 3 | mistralai | Mistral | #13 | $0.42 | 0.76s | 82.7 T/s | 99.92% |
| Claude Opus 4 | anthropic | Google Vertex | - | $15.60 | 2.05s | 46.5 T/s | 97.79% |
| GPT-3.5 Turbo 16k | openai | OpenAI | #123 | $0.51 | 0.44s | 132.5 T/s | 99.95% |
| Dolphin 2.9.2 Mixtral 8x22B 🐬 | cognitivecomputations | NovitaAI | - | $0.91 | 1.93s | 13.6 T/s | 99.94% |
| Claude 3.5 Sonnet (2024-06-20) | anthropic | Google Vertex | #29 | $3.12 | 2.31s | 113.1 T/s | 98.48% |
| GPT-4o (2024-08-06) | openai | Azure | #38 | $2.58 | 0.92s | 148.7 T/s | 99.30% |
| Pixtral 12B | mistralai | Mistral | - | $0.10 | 0.49s | 74.8 T/s | 99.97% |
| Llama 3.1 Nemotron 70B Instruct | nvidia | Together | #68 | $0.12 | 0.61s | 80.1 T/s | 99.91% |
| Claude 3.5 Haiku (2024-10-22) | anthropic | Google Vertex | #52 | $0.83 | 2.24s | 59.4 T/s | 99.71% |
| Mistral Large 2411 | mistralai | Mistral | #67 | $2.05 | 0.55s | 48.6 T/s | 99.94% |
| Nova Pro 1.0 | amazon | Amazon Bedrock | #76 | $0.83 | 0.60s | 110.8 T/s | 96.00% |
| Nova Micro 1.0 | amazon | Amazon Bedrock | #108 | $0.04 | 0.27s | 275.5 T/s | 91.84% |
| Nova Lite 1.0 | amazon | Amazon Bedrock | #95 | $0.06 | 0.47s | 149.1 T/s | 96.00% |
| Grok 2 1212 | x-ai | xAI | - | $2.08 | 0.29s | 78.8 T/s | 99.78% |
| Sonar | perplexity | Perplexity | - | $1.01 | 2.07s | 100.4 T/s | 99.95% |
| Qwen2.5 VL 72B Instruct (free) | qwen | Chutes | - | $0.00 | 2.25s | 68.7 T/s | 92.85% |
| R1 Distill Llama 8B | deepseek | NovitaAI | - | $0.04 | 1.48s | 41.8 T/s | 99.84% |
| DeepSeek R1 Zero (free) | deepseek | Chutes | - | $0.00 | 1.09s | 72.7 T/s | 99.71% |
| Sonar Pro | perplexity | Perplexity | - | $3.12 | 2.38s | 61.8 T/s | 99.98% |
| Anubis Pro 105B V1 | thedrummer | Parasail | - | $0.81 | 1.09s | 26.5 T/s | 79.41% |
| GPT-4o-mini Search Preview | openai | OpenAI | - | $0.15 | 1.94s | 212.1 T/s | 98.87% |
| DeepSeek V3 Base (free) | deepseek | Chutes | - | $0.00 | 1.21s | 73.8 T/s | 99.05% |
| Llama 3.1 Nemotron Ultra 253B v1 (free) | nvidia | Chutes | - | $0.00 | 1.66s | 28.2 T/s | 99.35% |
| GLM 4 32B (free) | thudm | Chutes | - | $0.00 | 6.23s | 51.4 T/s | 95.06% |
| Qwen3 8B | qwen | NovitaAI | - | $0.04 | 0.62s | 54.0 T/s | 95.81% |
| DeepHermes 3 Mistral 24B Preview (free) | nousresearch | Chutes | - | $0.00 | 1.02s | 224.3 T/s | 98.73% |
| Llama 3.3 8B Instruct (free) | meta-llama | Meta | - | $0.00 | 0.48s | 236.7 T/s | 95.44% |
| Devstral Small (free) | mistralai | Chutes | - | $0.00 | 1.14s | 101.2 T/s | 97.52% |
| Deepseek R1 0528 Qwen3 8B | deepseek | NovitaAI | - | $0.06 | 0.82s | 63.8 T/s | 98.88% |
| R1 Distill Qwen 7B | deepseek | GMICloud | - | $0.10 | 16.47s | 144.8 T/s | 98.56% |
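
To compare entries like those listed above, one option is to filter on the limits you care about and then sort by a single metric. The following is a minimal sketch, not part of the leaderboard itself: the `Entry` class, its field names, the `shortlist` helper, and the threshold values are all hypothetical, and the five sample rows are copied from the listing. The source does not label the percentage column, so it is stored here simply as `percent`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One leaderboard row (hypothetical field names)."""
    model: str
    provider: str
    price_usd: float      # dollar figure from the listing (unit basis not stated in the source)
    latency_s: float      # seconds
    tokens_per_s: float   # the "T/s" figure
    percent: float        # unlabeled percentage column in the source


# A handful of rows copied from the listing above.
ENTRIES = [
    Entry("Llama 3.1 8B Instruct", "Cerebras", 0.02, 0.30, 3163.2, 99.93),
    Entry("Mistral Nemo", "Mistral", 0.02, 0.26, 137.3, 99.97),
    Entry("GPT-4.1", "OpenAI", 2.06, 0.84, 57.3, 99.52),
    Entry("Gemma 2 9B", "Groq", 0.20, 0.41, 867.0, 14.62),
    Entry("o3", "OpenAI", 10.32, 5.55, 220.8, 99.73),
]


def shortlist(entries, max_price, max_latency, min_percent):
    """Keep rows within the given limits, fastest throughput first."""
    keep = [
        e for e in entries
        if e.price_usd <= max_price
        and e.latency_s <= max_latency
        and e.percent >= min_percent
    ]
    return sorted(keep, key=lambda e: e.tokens_per_s, reverse=True)


# Arbitrary example thresholds: at most $0.25, at most 0.5 s, at least 99%.
picks = shortlist(ENTRIES, max_price=0.25, max_latency=0.5, min_percent=99.0)
for e in picks:
    print(f"{e.model} ({e.provider}): {e.tokens_per_s} T/s at ${e.price_usd:.2f}")
```

Filtering before sorting keeps a fast but unreliable row (Gemma 2 9B on Groq, at 14.62%) from topping the shortlist on throughput alone.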