Model Leaderboard
Compare key performance metrics for LLM APIs.
Updated at: 10/25/2025, 5:01:58 PM
| Model | Provider | Rank | Price | Latency | Throughput | Uptime |
|---|---|---|---|---|---|---|
| Gemini 2.0 Flash (google) | Google AI Studio | #31 | $0.10 | 0.48s | 168.9T/s | 99.99% |
| Gemma 3 12B (google) | Crusoe | - | $0.03 | 0.42s | 171.0T/s | 99.96% |
| Gemini 2.5 Flash (google) | Google Vertex | #7 | $0.32 | 0.84s | 103.1T/s | 99.98% |
| Gemini 2.5 Flash Lite (google) | Google Vertex | - | $0.10 | 0.37s | 100.0T/s | 99.91% |
| Grok Code Fast 1 (x-ai) | xAI | - | $0.21 | 1.06s | 99.7T/s | 100.00% |
| GPT-4o-mini (openai) | Azure | #14 | $0.15 | 0.69s | 269.0T/s | 99.96% |
| Mistral Nemo (mistralai) | Mistral | - | $0.02 | 0.33s | 169.5T/s | 99.99% |
| Llama 3.3 70B Instruct (meta-llama) | Cerebras | #66 | $0.13 | 0.22s | 2600.0T/s | 99.89% |
| Gemini 2.0 Flash Lite (google) | Google AI Studio | - | $0.08 | 0.41s | 363.2T/s | 99.99% |
| Gemma 3 27B (google) | Chutes | - | $0.09 | 1.35s | 67.8T/s | 99.60% |
| Gemma 3 4B (google) | Chutes | - | $0.02 | 1.19s | 60.4T/s | 99.84% |
| DeepSeek V3 0324 (deepseek) | SambaNova | #14 | $0.25 | 0.48s | 325.8T/s | 99.96% |
| Llama 4 Maverick (meta-llama) | Groq | - | $0.15 | 0.24s | 687.4T/s | 99.97% |
| GPT-4.1 Mini (openai) | Azure | - | $0.41 | 0.64s | 70.6T/s | 99.97% |
| Gemini 2.5 Pro (google) | Google Vertex (Global) | #1 | $1.33 | 2.54s | 88.5T/s | 99.78% |
| Gemini 2.5 Flash Lite Preview 06-17 (google) | Google AI Studio | - | $0.10 | 1.04s | 14.4T/s | 100.00% |
| Mistral Small 3.2 24B (mistralai) | Chutes | - | $0.06 | 0.61s | 217.9T/s | 99.41% |
| Qwen3 235B A22B Instruct 2507 (qwen) | Cerebras | - | $0.08 | 2.18s | 2116.5T/s | 99.77% |
| gpt-oss-20b (openai) | Groq | - | $0.03 | 0.12s | 4692.3T/s | 99.60% |
| gpt-oss-120b (openai) | Cerebras | - | $0.04 | 0.33s | 2898.5T/s | 98.80% |
| DeepSeek V3.1 (deepseek) | SambaNova | - | $0.28 | 2.79s | 199.2T/s | 99.56% |
| Grok 4 Fast (x-ai) | xAI | - | $0.20 | 2.13s | 135.7T/s | 100.00% |
| Gemini 2.5 Flash Lite Preview 09-2025 (google) | Google AI Studio | - | $0.10 | 0.57s | 167.6T/s | 99.73% |
| Gemini 2.5 Flash Preview 09-2025 (google) | Google AI Studio | - | $0.32 | 3.58s | 174.5T/s | 99.58% |
| DeepSeek V3.2 Exp (deepseek) | DeepSeek | - | $0.27 | 1.55s | 27.1T/s | 99.71% |
| Claude Sonnet 4.5 (anthropic) | Anthropic | - | $3.12 | 2.60s | 61.8T/s | 99.92% |
| GLM 4.6 (z-ai) | Z.AI | - | $0.51 | 0.67s | 91.5T/s | 99.84% |
| Mistral Tiny (mistralai) | Mistral | - | $0.25 | 0.22s | 300.0T/s | - |
| Llama 3.1 8B Instruct (meta-llama) | Cerebras | - | $0.02 | 0.32s | 2269.2T/s | 99.97% |
| DeepSeek V3 (deepseek) | Chutes | - | $0.31 | 1.35s | 81.6T/s | 99.96% |
| Mistral Small 3 (mistralai) | Mistral | - | $0.05 | 0.24s | 203.0T/s | 99.98% |
| Claude 3.7 Sonnet (anthropic) | Google Vertex (Global) | - | $3.12 | 0.38s | 57.3T/s | 99.89% |
| Llama 4 Scout (meta-llama) | Cerebras | - | $0.08 | 0.30s | 1802.4T/s | 99.97% |
| GPT-4.1 Nano (openai) | Azure | - | $0.10 | 0.76s | 136.5T/s | 99.99% |
| GPT-4.1 (openai) | Azure | - | $2.06 | 0.71s | 137.9T/s | 99.97% |
| Qwen3 32B (qwen) | Groq | - | $0.05 | 0.26s | 640.6T/s | 99.94% |
| Gemma 3n 4B (google) | Together | - | $0.02 | 0.34s | 45.7T/s | 99.99% |
| Claude Sonnet 4 (anthropic) | Amazon Bedrock | - | $3.12 | 1.78s | 102.7T/s | 99.94% |
| Grok 3 Mini (x-ai) | xAI Fast | - | $0.30 | 0.42s | 92.6T/s | 99.96% |
| DeepSeek R1T2 Chimera (free) (tngtech) | Chutes | - | $0.00 | 3.13s | 17.7T/s | 99.88% |
| Qwen3 Coder 480B A35B (qwen) | Cerebras | - | $0.23 | 0.35s | 2614.1T/s | 99.12% |
| GPT-5 Mini (openai) | Azure | - | $0.27 | 7.10s | 79.6T/s | 99.55% |
| Qwen3 Next 80B A3B Instruct (qwen) | Hyperbolic | - | $0.11 | 0.49s | 433.0T/s | 99.53% |
| Qwen3 VL 235B A22B Instruct (qwen) | Chutes | - | $0.31 | 1.95s | 72.4T/s | 95.05% |
| Claude Haiku 4.5 (anthropic) | Anthropic | - | $1.04 | 1.37s | 164.6T/s | 99.93% |
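
If you want to slice this leaderboard programmatically, here is a minimal Python sketch that loads a handful of rows copied verbatim from the table above and runs a simple filter-and-sort query. The column interpretations are assumptions, not something the table states: Price is taken to be USD per 1M tokens, Latency the time to first token, T/s output tokens per second, and Uptime the provider's recent availability. Adjust if the source defines these metrics differently.

```python
from dataclasses import dataclass


@dataclass
class Row:
    model: str
    provider: str
    price_usd: float       # assumed: USD per 1M tokens
    latency_s: float       # assumed: time to first token, in seconds
    throughput_tps: float  # assumed: output tokens per second ("T/s")
    uptime_pct: float      # assumed: availability over the measurement window


# A few rows copied verbatim from the table above.
ROWS = [
    Row("Gemini 2.0 Flash (google)", "Google AI Studio", 0.10, 0.48, 168.9, 99.99),
    Row("GPT-4o-mini (openai)", "Azure", 0.15, 0.69, 269.0, 99.96),
    Row("Llama 3.3 70B Instruct (meta-llama)", "Cerebras", 0.13, 0.22, 2600.0, 99.89),
    Row("gpt-oss-120b (openai)", "Cerebras", 0.04, 0.33, 2898.5, 98.80),
    Row("Claude Sonnet 4.5 (anthropic)", "Anthropic", 3.12, 2.60, 61.8, 99.92),
]

# Example query: models above 99.9% uptime, cheapest first,
# breaking price ties by higher throughput.
candidates = [r for r in ROWS if r.uptime_pct >= 99.9]
for r in sorted(candidates, key=lambda r: (r.price_usd, -r.throughput_tps)):
    print(f"{r.model:40s} {r.provider:18s} ${r.price_usd:.2f}  {r.throughput_tps:7.1f} T/s")
```

Swapping the key function lets you rank by any other column, e.g. `key=lambda r: r.latency_s` for the most responsive endpoints.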