Model Leaderboard
Compare key performance metrics for LLM APIs.
Updated at: 9/2/2025, 9:01:59 PM
Model | Creator | Provider | Rank | Price | Latency | Throughput | Uptime
---|---|---|---|---|---|---|---
Gemini 2.0 Flash | google | Google AI Studio | #31 | $0.10 | 0.52s | 152.3 T/s | 99.98%
Gemma 3 12B | google | Cloudflare | - | $0.05 | 0.32s | 78.8 T/s | 91.48%
DeepSeek V3 0324 | deepseek | GMICloud | #14 | $0.21 | 3.65s | 810.5 T/s | 99.91%
Gemini 2.5 Flash | google | Google AI Studio | #7 | $0.32 | 0.43s | 111.1 T/s | 99.99%
GPT-4o-mini | openai | Azure | #14 | $0.15 | 1.05s | 214.5 T/s | 99.94%
Mistral Nemo | mistralai | Nineteen | - | $0.01 | 0.50s | 271.5 T/s | 100.00%
Qwen2.5 7B Instruct | qwen | NovitaAI | - | $0.04 | 0.61s | 171.5 T/s | 99.97%
Llama 3.3 70B Instruct | meta-llama | Cerebras | #66 | $0.04 | 0.41s | 3426.9 T/s | 99.99%
Gemini 2.0 Flash Lite | google | Google Vertex | - | $0.08 | 0.38s | 153.3 T/s | 100.00%
Gemma 3 4B | google | DeepInfra | - | $0.02 | 0.88s | 246.8 T/s | -
Llama 4 Maverick | meta-llama | Cerebras | - | $0.15 | 0.39s | 1155.4 T/s | 99.98%
GPT-4.1 Nano | openai | OpenAI | - | $0.10 | 0.43s | 102.5 T/s | 99.93%
GPT-4.1 Mini | openai | OpenAI | - | $0.41 | 0.51s | 86.2 T/s | 99.91%
GPT-4.1 | openai | OpenAI | - | $2.06 | 0.71s | 80.2 T/s | 99.88%
Qwen3 32B | qwen | Groq | - | $0.02 | 0.33s | 629.0 T/s | 99.89%
Qwen3 30B A3B | qwen | Chutes | - | $0.02 | 1.69s | 75.8 T/s | 97.75%
Claude Sonnet 4 | anthropic | Amazon Bedrock | - | $3.12 | 2.76s | 69.8 T/s | 99.98%
Grok 3 Mini | x-ai | xAI Fast | - | $0.30 | 0.67s | 175.9 T/s | 99.58%
Gemini 2.5 Pro | google | Google AI Studio | #1 | $1.33 | 2.78s | 98.5 T/s | 99.83%
Gemini 2.5 Flash Lite Preview 06-17 | google | Google Vertex | - | $0.10 | 0.34s | 91.2 T/s | 100.00%
Mistral Small 3.2 24B | mistralai | Mistral | - | $0.05 | 0.36s | 150.3 T/s | 99.43%
Gemini 2.5 Flash Lite | google | Google Vertex | - | $0.10 | 0.34s | 202.4 T/s | 99.99%
DeepSeek V3.1 | deepseek | SambaNova | - | $0.21 | 2.56s | 185.1 T/s | 99.89%
Grok Code Fast 1 | x-ai | xAI | - | $0.21 | 1.33s | 88.2 T/s | 92.00%
Mistral Tiny | mistralai | Mistral | - | $0.25 | 0.27s | 94.3 T/s | 4.00%
Gemini 1.5 Flash | google | Google AI Studio | - | $0.08 | 0.40s | 161.2 T/s | 100.00%
Llama 3.1 8B Instruct | meta-llama | Cerebras | - | $0.02 | 0.22s | 1500.0 T/s | 100.00%
Gemini 1.5 Flash 8B | google | Google AI Studio | - | $0.04 | 0.46s | 192.3 T/s | 99.98%
Nova Lite 1.0 | amazon | Amazon Bedrock | - | $0.06 | 0.47s | 351.6 T/s | -
Claude 3.7 Sonnet | anthropic | Google Vertex (Europe) | - | $3.12 | 2.59s | 61.6 T/s | 99.66%
Gemma 3 27B | google | Nebius AI Studio | - | $0.07 | 0.47s | 66.0 T/s | 98.63%
Llama 4 Scout | meta-llama | Cerebras | - | $0.08 | 0.50s | 1455.7 T/s | 99.89%
R1 0528 | deepseek | Nebius (Fast) | - | $0.21 | 1.20s | 217.0 T/s | 99.68%
Kimi K2 | moonshotai | Groq | - | $0.16 | 1.48s | 336.5 T/s | 99.75%
Qwen3 235B A22B Instruct 2507 | qwen | Cerebras | - | $0.08 | 0.57s | 1187.8 T/s | 99.92%
gpt-oss-120b | openai | Cerebras | - | $0.07 | 0.35s | 2568.4 T/s | 99.36%
GPT-5 Nano | openai | OpenAI | - | $0.05 | 3.03s | 60.7 T/s | 99.65%
GPT-5 Mini | openai | OpenAI | - | $0.27 | 4.54s | 52.7 T/s | 99.88%
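No single column tells the whole story: a cheap model with low throughput can cost more wall-clock time than a pricier, faster one. One way to combine the columns is throughput per dollar. The sketch below does this for three rows copied from the table above; the derived metric is an illustration, not part of the leaderboard itself, and assumes the price and throughput columns use consistent units across rows.

```python
# Rank a few leaderboard rows by throughput per dollar.
# Values are copied from the table above; the combined metric is
# illustrative only and assumes consistent units across rows.
rows = [
    # (model, price in $, throughput in tokens/s)
    ("Gemini 2.0 Flash", 0.10, 152.3),
    ("Llama 3.3 70B Instruct", 0.04, 3426.9),
    ("gpt-oss-120b", 0.07, 2568.4),
]

# Higher throughput per dollar first.
ranked = sorted(rows, key=lambda r: r[2] / r[1], reverse=True)

for model, price, tps in ranked:
    print(f"{model}: {tps / price:,.0f} tokens/s per $")
```

On these three rows, Llama 3.3 70B Instruct on Cerebras comes out far ahead: its throughput is over 20x Gemini 2.0 Flash's at less than half the price.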