Model Leaderboard
Compare key performance metrics for LLM APIs.
Updated at: 12/15/2025, 1:02:07 PM
| Model | Provider | Rank | Price | Latency | Throughput | Uptime |
|---|---|---|---|---|---|---|
| GPT-4o-mini (openai) | Azure | #14 | $0.15 | 1.32s | 520.8 T/s | 99.99% |
| Gemini 2.0 Flash (google) | Google AI Studio | #31 | $0.10 | 0.51s | 171.1 T/s | 99.79% |
| Gemini 2.0 Flash Lite (google) | Google AI Studio | - | $0.08 | 0.45s | 284.4 T/s | 99.95% |
| Gemini 2.5 Flash (google) | Google Vertex | #7 | $0.32 | 0.93s | 97.3 T/s | 99.92% |
| Gemini 2.5 Flash Lite (google) | Google AI Studio | - | $0.10 | 0.52s | 86.3 T/s | 99.93% |
| gpt-oss-120b (openai) | Cerebras | - | $0.04 | 0.25s | 2129.7 T/s | 99.43% |
| Mistral Nemo (mistralai) | Mistral | - | $0.02 | 0.33s | 196.8 T/s | 99.99% |
| Llama 3.1 70B Instruct (meta-llama) | Hyperbolic | - | $0.40 | 0.67s | 73.2 T/s | 99.99% |
| Llama 3.1 8B Instruct (meta-llama) | Cerebras | - | $0.02 | 0.21s | 2800.0 T/s | 99.98% |
| Llama 3.3 70B Instruct (meta-llama) | Cerebras | #66 | $0.10 | 0.37s | 2533.3 T/s | 99.97% |
| Gemma 3 4B (google) | DeepInfra | - | $0.02 | 0.48s | 125.0 T/s | 99.98% |
| DeepSeek V3 0324 (deepseek) | SambaNova | #14 | $0.21 | 0.55s | 288.5 T/s | 99.87% |
| Llama 4 Scout (meta-llama) | Groq | - | $0.08 | 0.13s | 1000.0 T/s | 99.99% |
| GPT-4.1 Nano (openai) | Azure | - | $0.10 | 0.60s | 220.4 T/s | 99.99% |
| GPT-4.1 Mini (openai) | Azure | - | $0.41 | 0.49s | 89.5 T/s | 99.99% |
| Qwen3 32B (qwen) | Cerebras | - | $0.08 | 1.26s | 618.7 T/s | 99.98% |
| Gemini 2.5 Pro (google) | Google Vertex (Global) | #1 | $1.33 | 2.62s | 92.7 T/s | 99.85% |
| DeepSeek R1T2 Chimera (free) (tngtech) | Chutes | - | $0.00 | 2.42s | 28.2 T/s | 99.98% |
| Qwen3 235B A22B Instruct 2507 (qwen) | Cerebras | - | $0.07 | 0.40s | 1117.6 T/s | 99.64% |
| gpt-oss-20b (openai) | Groq | - | $0.03 | 0.24s | 1478.5 T/s | 99.82% |
| GPT-5 Mini (openai) | OpenAI | - | $0.27 | 6.30s | 54.6 T/s | 99.89% |
| DeepSeek V3.1 (deepseek) | SambaNova | - | $0.16 | 5.16s | 174.1 T/s | 99.96% |
| Grok Code Fast 1 (x-ai) | xAI | - | $0.21 | 0.74s | 90.8 T/s | 72.00% |
| Kimi K2 0905 (moonshotai) | Groq | - | $0.41 | 0.32s | 320.0 T/s | 99.98% |
| Grok 4 Fast (x-ai) | xAI | - | $0.20 | 3.63s | 103.0 T/s | 99.80% |
| Gemini 2.5 Flash Lite Preview 09-2025 (google) | Google AI Studio | - | $0.10 | 0.37s | 123.6 T/s | 99.79% |
| Gemini 2.5 Flash Preview 09-2025 (google) | Google AI Studio | - | $0.32 | 0.77s | 140.5 T/s | 99.65% |
| Claude Sonnet 4.5 (anthropic) | Amazon Bedrock | - | $3.12 | 2.96s | 74.8 T/s | 99.94% |
| GLM 4.6 (z-ai) | Cerebras | - | $0.41 | 0.58s | 285.2 T/s | 99.78% |
| Nemotron Nano 12B 2 VL (free) (nvidia) | NVIDIA | - | $0.00 | 3.32s | 65.1 T/s | - |
| KAT-Coder-Pro V1 (free) (kwaipilot) | StreamLake | - | $0.00 | 0.80s | 54.4 T/s | 99.91% |
| Grok 4.1 Fast (x-ai) | xAI | - | $0.20 | 4.25s | 69.2 T/s | 99.16% |
| DeepSeek V3.2 (deepseek) | SiliconFlow | - | $0.24 | 4.48s | 58.4 T/s | 98.60% |
| Devstral 2 2512 (free) (mistralai) | Mistral | - | $0.00 | 9.72s | 66.9 T/s | 88.50% |
| GPT-4o (openai) | Azure | - | $2.58 | 1.36s | 258.2 T/s | 99.99% |
| Ministral 3B (mistralai) | Mistral | - | $0.04 | 0.33s | 313.9 T/s | - |
| DeepSeek V3 (deepseek-ai) | Chutes | - | $0.31 | 1.31s | 60.7 T/s | 99.99% |
| Mistral Small 3 (mistralai) | Together | - | $0.03 | 0.12s | 111.5 T/s | 99.75% |
| Gemma 3 27B (google) | Nebius Token Factory | - | $0.04 | 0.16s | 68.7 T/s | 99.15% |
| Gemma 3 12B (google) | Crusoe | - | $0.03 | 0.45s | 138.9 T/s | 99.43% |
| Llama 4 Maverick (meta-llama) | SambaNova | - | $0.15 | 1.62s | 661.8 T/s | 99.86% |
| GPT-4.1 (openai) | OpenAI | - | $2.06 | 0.56s | 57.5 T/s | 99.98% |
| Gemma 3n 4B (google) | Together | - | $0.02 | 0.17s | 33.6 T/s | 99.89% |
| Claude Sonnet 4 (anthropic) | Amazon Bedrock | - | $3.12 | 1.27s | 81.8 T/s | 99.97% |
| Grok 3 Mini (x-ai) | xAI Fast | - | $0.30 | 0.97s | 97.1 T/s | 99.87% |
| Mistral Small 3.2 24B (mistralai) | Mistral | - | $0.06 | 0.26s | 119.2 T/s | 99.81% |
| GLM 4 32B (z-ai) | Z.AI | - | $0.10 | 1.09s | 3000.0 T/s | 16.00% |
| GPT-5 Nano (openai) | Azure | - | $0.05 | 3.86s | 106.2 T/s | 99.93% |
| GPT-5 (openai) | Azure | - | $1.33 | 5.97s | 64.8 T/s | 99.89% |
| Qwen3 Next 80B A3B Instruct (qwen) | Google Vertex | - | $0.10 | 0.29s | 277.9 T/s | 99.98% |
| Claude Haiku 4.5 (anthropic) | Google Vertex | - | $1.04 | 0.83s | 138.1 T/s | 99.93% |
| MiniMax M2 (minimax) | Google Vertex | - | $0.21 | 0.42s | 154.9 T/s | 99.90% |
| GPT-5.1 (openai) | OpenAI | - | $1.33 | 2.99s | 44.8 T/s | 99.90% |
| Gemini 3 Pro Preview (google) | Google Vertex | - | $2.10 | 4.47s | 77.7 T/s | 99.33% |
| Claude Opus 4.5 (anthropic) | Amazon Bedrock | - | $5.20 | 2.27s | 77.0 T/s | 99.51% |
| GPT-5.2 (openai) | OpenAI | - | $1.86 | 3.41s | 40.3 T/s | 99.97% |
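To compare models programmatically, the table rows can be treated as simple records and sorted by whichever metric matters. A minimal sketch, using a handful of rows copied from the table above (column meanings — price, latency, throughput in tokens/second, uptime — are inferred from the data format):

```python
# A few rows transcribed from the leaderboard above.
# Fields: (model, price_usd, latency_s, throughput_tok_s, uptime_pct)
rows = [
    ("GPT-4o-mini",           0.15, 1.32,  520.8, 99.99),
    ("Llama 3.1 8B Instruct", 0.02, 0.21, 2800.0, 99.98),
    ("Gemini 2.5 Pro",        1.33, 2.62,   92.7, 99.85),
    ("gpt-oss-120b",          0.04, 0.25, 2129.7, 99.43),
]

# Sort by price ascending, breaking ties with higher throughput first.
cheapest = sorted(rows, key=lambda r: (r[1], -r[3]))

for model, price, latency, tps, uptime in cheapest:
    print(f"{model}: ${price:.2f}, {tps:.1f} tok/s, {uptime:.2f}% uptime")
```

Swapping the sort key (e.g. `lambda r: r[2]` for latency, or `lambda r: -r[3]` for raw throughput) reorders the same records by a different metric.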