Model Leaderboard
Compare key performance metrics for LLM APIs: price, latency, throughput, and uptime, alongside each model's leaderboard rank and serving provider.
Updated at: 6/22/2025, 5:03:33 AM
Model | Author | Provider | Rank | Price | Latency | Throughput | Uptime
---|---|---|---|---|---|---|---
Gemini 1.5 Flash | google | Google AI Studio | #83 | $0.08 | 0.94s | 157.2T/s | 99.97%
GPT-4o-mini | openai | Azure | #12 | $0.15 | 1.08s | 176.7T/s | 99.90%
Mistral Nemo | mistralai | Parasail | - | $0.01 | 0.39s | 145.0T/s | 100.00%
Gemini 1.5 Flash 8B | google | Google AI Studio | #102 | $0.04 | 0.21s | 185.6T/s | 94.66%
Llama 3.3 70B Instruct | meta-llama | Cerebras | #57 | $0.05 | 0.18s | 3783.0T/s | 99.91%
Gemini 2.0 Flash | google | Google AI Studio | #25 | $0.10 | 0.51s | 176.1T/s | 99.99%
Gemini 2.0 Flash Lite | google | Google Vertex | #15 | $0.08 | 0.42s | 172.0T/s | 93.95%
Gemma 3 27B | google | kluster.ai | #26 | $0.10 | 1.25s | 64.6T/s | 99.01%
DeepSeek V3 0324 | deepseek | SambaNova | #12 | $0.31 | 1.90s | 182.3T/s | 99.41%
Llama 4 Maverick | meta-llama | Groq | #43 | $0.15 | 0.23s | 1259.3T/s | 99.82%
Claude Sonnet 4 | anthropic | Google Vertex (Europe) | #12 | $3.12 | 2.00s | 67.6T/s | 99.69%
Gemini 2.5 Flash Lite Preview 06-17 | google | Google AI Studio | #12 | $0.10 | 0.33s | 289.7T/s | 99.62%
MythoMax 13B | gryphe | NovitaAI | - | $0.07 | 1.17s | 85.1T/s | 99.99%
Llama 3.1 70B Instruct | meta-llama | Together | - | $0.10 | 0.93s | 140.1T/s | 99.82%
Qwen2.5 72B Instruct | qwen | Fireworks | #72 | $0.12 | 3.82s | 39.6T/s | 80.57%
Llama 3.2 3B Instruct | meta-llama | SambaNova | #144 | $0.01 | 0.34s | 3117.6T/s | 99.99%
Qwen2.5 7B Instruct | qwen | Together | - | $0.04 | 0.32s | 156.9T/s | 99.99%
GPT-4.1 Mini | openai | OpenAI | #21 | $0.41 | 0.48s | 69.3T/s | 99.83%
GPT-4.1 | openai | OpenAI | #6 | $2.06 | 0.49s | 75.3T/s | 99.79%
Gemini 2.5 Pro Preview 05-06 | google | Google Vertex | #2 | $1.33 | 4.50s | 82.8T/s | 99.55%
Mixtral 8x7B Instruct | mistralai | DeepInfra | #135 | $0.08 | 0.36s | 119.3T/s | 99.81%
Mistral Tiny | mistralai | Mistral | - | $0.25 | 0.29s | 164.4T/s | 100.00%
WizardLM-2 8x22B | microsoft | Parasail | - | $0.48 | 1.18s | 56.2T/s | 99.99%
GPT-4o | openai | Azure | - | $2.58 | 1.95s | 143.2T/s | 99.87%
Hermes 2 Pro - Llama-3 8B | nousresearch | Lambda | - | $0.03 | 0.29s | 152.8T/s | 99.97%
GPT-4o-mini (2024-07-18) | openai | OpenAI | #57 | $0.15 | 0.36s | 81.5T/s | 99.98%
Llama 3.1 8B Instruct | meta-llama | Cerebras | #128 | $0.02 | 0.14s | 4852.9T/s | 99.99%
Llama 3 8B Lunaris | sao10k | NovitaAI | - | $0.02 | 1.11s | 68.6T/s | 99.99%
Hermes 3 405B Instruct | nousresearch | Lambda | - | $0.71 | 1.16s | 34.5T/s | 99.78%
Hermes 3 70B Instruct | nousresearch | Lambda | - | $0.12 | 0.43s | 49.2T/s | 99.99%
Rocinante 12B | thedrummer | Infermatic | - | $0.25 | 0.42s | 64.9T/s | 99.90%
Ministral 8B | mistralai | Mistral | #111 | $0.10 | 0.25s | 149.3T/s | 99.87%
Claude 3.5 Sonnet | anthropic | Google Vertex | - | $3.12 | 1.27s | 58.8T/s | 99.78%
Claude 3.5 Haiku | anthropic | Anthropic | - | $0.83 | 1.33s | 70.1T/s | 98.48%
UnslopNemo 12B | thedrummer | Infermatic | - | $0.45 | 0.54s | 93.9T/s | 99.89%
GPT-4o (2024-11-20) | openai | OpenAI | - | $2.58 | 0.41s | 92.4T/s | 99.85%
DeepSeek V3 | deepseek | Fireworks | #30 | $0.39 | 1.03s | 76.8T/s | 99.95%
MiniMax-01 | minimax | Minimax | - | $0.21 | 1.63s | 27.3T/s | 98.37%
R1 | deepseek | DeepInfra Turbo | #12 | $0.47 | 0.41s | 147.4T/s | 99.93%
R1 Distill Llama 70B | deepseek | Cerebras | - | $0.10 | 0.20s | 2829.8T/s | 99.99%
LFM 3B | liquid | Liquid | - | $0.02 | 1.04s | 18.2T/s | 99.84%
LFM 7B | liquid | Lambda | - | $0.01 | 0.44s | 117.0T/s | 99.99%
Mistral Small 3 | mistralai | Mistral | #92 | $0.05 | 0.31s | 82.6T/s | 99.96%
Gemma 3 4B | google | DeepInfra | #66 | $0.02 | 0.27s | 106.0T/s | 99.87%
Mistral Small 3.1 24B | mistralai | Mistral | #69 | $0.05 | 0.23s | 101.3T/s | 99.98%
Llama 4 Scout | meta-llama | Cerebras | #51 | $0.08 | 0.54s | 2000.0T/s | 99.98%
Grok 3 Beta | x-ai | xAI | #8 | $3.12 | 0.54s | 63.3T/s | 99.69%
Grok 3 Mini Beta | x-ai | xAI Fast | #26 | $0.30 | 0.32s | 189.1T/s | 99.93%
GPT-4.1 Nano | openai | OpenAI | #50 | $0.10 | 0.25s | 244.0T/s | 99.68%
Qwen3 235B A22B | qwen | Fireworks | #22 | $0.13 | 0.62s | 77.7T/s | 97.98%
Qwen3 32B | qwen | Cerebras | #24 | $0.10 | 0.64s | 721.1T/s | 99.88%
Gemini 2.5 Flash Preview 05-20 (thinking) | google | AI Studio Thinking | #6 | $0.18 | 1.68s | 131.9T/s | 99.52%
R1 0528 | deepseek | Baseten | #12 | $0.52 | 0.36s | 132.3T/s | 99.51%
Gemini 2.5 Pro Preview 06-05 | google | Google Vertex | #1 | $1.33 | 2.64s | 93.0T/s | 99.67%
Gemini 2.5 Pro | google | Google Vertex | #1 | $1.33 | 2.20s | 85.7T/s | 99.78%
Gemini 2.5 Flash | google | Google AI Studio | #6 | $0.32 | 0.52s | 114.5T/s | 99.89%
GPT-3.5 Turbo | openai | OpenAI | #125 | $0.51 | 0.34s | 172.7T/s | 99.58%
ReMM SLERP 13B | undi95 | Mancer (private) | - | $0.81 | 0.73s | 43.6T/s | 99.98%
Claude 3 Haiku | anthropic | Google Vertex | #103 | $0.26 | 1.22s | 168.7T/s | 99.53%
Gemini 1.5 Pro | google | Google Vertex | #54 | $1.29 | 1.31s | 66.0T/s | 99.93%
Llama 3 70B Instruct | meta-llama | Groq | #79 | $0.30 | 0.20s | 427.6T/s | 99.99%
Llama 3 8B Instruct | meta-llama | Groq | #122 | $0.03 | 0.39s | 5269.7T/s | 99.77%
Mistral 7B Instruct | mistralai | Together | #161 | $0.03 | 0.53s | 219.1T/s | 99.95%
Gemma 2 9B | google | Groq | #101 | $0.20 | 0.39s | 1018.9T/s | 98.14%
Llama 3.1 405B Instruct | meta-llama | SambaNova | - | $0.81 | 1.79s | 93.1T/s | 99.76%
ChatGPT-4o | openai | OpenAI | #2 | $5.12 | 0.44s | 99.9T/s | 99.54%
Llama 3.1 Euryale 70B v2.2 | sao10k | DeepInfra | - | $0.71 | 0.32s | 39.7T/s | 99.98%
Lumimaid v0.2 8B | neversleep | Mancer (private) | - | $0.21 | 0.77s | 66.0T/s | 99.98%
Ministral 3B | mistralai | Mistral | - | $0.04 | 0.19s | 243.1T/s | 99.91%
Qwen2.5 Coder 32B Instruct | qwen | Together | #89 | $0.06 | 0.23s | 342.0T/s | 99.86%
Gemini 2.0 Flash Experimental (free) | google | Google Vertex | #15 | $0.00 | 1.03s | 173.2T/s | 45.09%
Grok 2 Vision 1212 | x-ai | xAI | - | $2.08 | 0.97s | 76.6T/s | 99.91%
Llama 3.3 Euryale 70B | sao10k | Infermatic | - | $0.71 | 0.52s | 45.9T/s | 99.93%
Phi 4 | microsoft | Nebius AI Studio | #103 | $0.07 | 0.10s | 125.5T/s | 98.20%
Codestral 2501 | mistralai | Mistral | - | $0.31 | 0.30s | 293.4T/s | 99.89%
R1 Distill Qwen 32B | deepseek | DeepInfra | - | $0.12 | 0.50s | 46.9T/s | 99.04%
o3 Mini | openai | OpenAI | #32 | $1.14 | 7.95s | 361.0T/s | 99.23%
Qwen2.5 VL 72B Instruct | qwen | Hyperbolic | - | $0.26 | 1.60s | 36.5T/s | 98.18%
Qwen-Turbo | qwen | Alibaba | - | $0.05 | 0.65s | 110.1T/s | 99.71%
Claude 3.7 Sonnet (thinking) | anthropic | Anthropic | #21 | $3.12 | 1.65s | 57.4T/s | 98.92%
GPT-4.5 (Preview) | openai | OpenAI | #4 | $76.20 | 1.48s | 11.5T/s | 95.43%
QwQ 32B | qwen | Groq | #145 | $0.15 | 1.05s | 524.0T/s | 99.92%
Skyfall 36B V2 | thedrummer | Parasail | - | $0.51 | 0.65s | 53.5T/s | 99.88%
Gemma 3 12B | google | Cloudflare | #33 | $0.05 | 0.55s | 71.7T/s | 87.92%
o4 Mini | openai | OpenAI | #12 | $1.14 | 5.18s | 99.7T/s | 63.03%
MAI DS R1 (free) | microsoft | Chutes | - | $0.00 | 1.12s | 67.3T/s | 97.72%
DeepSeek R1T Chimera (free) | tngtech | Chutes | - | $0.00 | 2.46s | 32.0T/s | 70.51%
Qwen3 30B A3B | qwen | Fireworks | #49 | $0.08 | 0.84s | 140.9T/s | 99.72%
Claude Opus 4 | anthropic | Anthropic | - | $15.60 | 2.70s | 32.5T/s | 96.95%
Valkyrie 49B V1 | thedrummer | Parasail | - | $0.51 | 0.67s | 44.9T/s | 99.89%
Deepseek R1 0528 Qwen3 8B | deepseek | Parasail | - | $0.05 | 0.62s | 99.9T/s | 99.18%
Grok 3 Mini | x-ai | xAI Fast | #26 | $0.30 | 0.40s | 193.9T/s | 96.64%
GPT-3.5 Turbo 16k | openai | OpenAI | #125 | $0.51 | 0.43s | 173.4T/s | 99.80%
GPT-4 Turbo (older v1106) | openai | OpenAI | #57 | $10.24 | 1.40s | 9.7T/s | 95.25%
Mistral Large | mistralai | Mistral | #60 | $2.05 | 0.29s | 90.0T/s | 99.30%
GPT-4 Turbo | openai | OpenAI | - | $10.24 | 0.81s | 47.7T/s | 98.67%
Llama 3.1 Sonar 70B Online | perplexity | Perplexity | - | $1.01 | 1.66s | 87.0T/s | 93.18%
GPT-4o (2024-08-06) | openai | Azure | #44 | $2.58 | 0.82s | 152.3T/s | 99.17%
Command R (08-2024) | cohere | Cohere | #103 | $0.15 | 0.83s | 82.2T/s | 99.96%
Pixtral 12B | mistralai | Hyperbolic | - | $0.10 | 1.55s | 85.9T/s | 99.91%
Claude 3.5 Haiku (2024-10-22) | anthropic | Google Vertex | #57 | $0.83 | 2.28s | 78.6T/s | 95.97%
Mistral Large 2411 | mistralai | Mistral | #72 | $2.05 | 0.54s | 38.2T/s | 99.80%
Sonar | perplexity | Perplexity | - | $1.01 | 1.88s | 113.1T/s | 99.86%
Anubis Pro 105B V1 | thedrummer | Parasail | - | $0.81 | 1.05s | 26.9T/s | 98.21%
DeepSeek V3 Base (free) | deepseek | Chutes | - | $0.00 | 1.28s | 81.9T/s | 99.71%
Gemini 2.5 Flash Preview 04-17 (thinking) | google | AI Studio Thinking | #8 | $0.18 | 1.39s | 169.4T/s | 97.04%
Qwen3 14B | qwen | Nebius AI Studio | - | $0.06 | 0.63s | 90.2T/s | 98.59%
Mistral Medium 3 | mistralai | Mistral | #18 | $0.42 | 0.53s | 59.0T/s | 95.82%
Devstral Small (free) | mistralai | Chutes | - | $0.00 | 1.50s | 82.6T/s | 94.71%
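The page does not state how the latency and throughput columns are collected, but the sketch below shows one common way to measure them yourself: latency as time to first streamed token and throughput as tokens generated per second, against any OpenAI-compatible streaming endpoint. The `base_url`, `api_key`, and model name are placeholders, and streamed chunks are used as a rough proxy for tokens; this is an illustrative assumption, not this leaderboard's methodology.

```python
# Minimal sketch: measure time-to-first-token ("Latency") and tokens/s
# ("Throughput") for one streamed completion. Assumes an OpenAI-compatible
# endpoint; base_url, api_key, and model are placeholders.
import time
from openai import OpenAI

client = OpenAI(base_url="https://example-provider.com/v1", api_key="YOUR_KEY")

def measure(model: str, prompt: str) -> tuple[float, float]:
    start = time.perf_counter()
    first = None   # timestamp of the first content chunk
    chunks = 0     # streamed content chunks, a rough proxy for tokens
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            if first is None:
                first = time.perf_counter()
            chunks += 1
    end = time.perf_counter()
    latency = (first or end) - start                        # ~ "Latency" column (s)
    throughput = chunks / max(end - (first or end), 1e-9)   # ~ "Throughput" column (T/s)
    return latency, throughput

print(measure("vendor/model-name", "Write one sentence about latency."))
```

Published leaderboards typically average many such samples and rely on provider-reported token counts rather than chunk counts, so a single-shot measurement like this will be noisier than the figures in the table.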