Model Leaderboard
Compare key performance metrics for LLM APIs.
Updated at: 3/16/2026, 2:02:39 PM
| Model | Provider | | Price (USD) | Latency (s) | Throughput (tokens/s) | | | |
|---|---|---|---|---|---|---|---|---|
MiniMax M2.5 minimax | SambaNova | - | $0.26 | 1.11s | 202.0T/s | 100.00% | ||
Gemini 3.1 Flash Lite Preview google | Google AI Studio | - | $0.26 | 0.93s | 71.0T/s | 100.00% | ||
Healer Alpha openrouter | Stealth | - | $0.00 | 1.52s | 47.0T/s | 100.00% | ||
Hunter Alpha openrouter | Stealth | - | $0.00 | 2.18s | 38.0T/s | 100.00% | ||
Grok 4.1 Fast x-ai | xAI | - | $0.20 | 3.55s | 116.0T/s | 100.00% | ||
DeepSeek V3.2 deepseek | NovitaAI | - | $0.26 | 1.58s | 24.0T/s | 100.00% | ||
Gemini 3 Flash Preview google | Google AI Studio | - | $0.52 | 1.21s | 83.0T/s | 100.00% | ||
GPT-4o-mini openai | Azure | - | $0.15 | 0.78s | 56.0T/s | 100.00% | ||
Llama 3.1 8B Instruct meta-llama | Groq | - | $0.02 | 0.15s | 250.0T/s | 100.00% | ||
Gemini 2.0 Flash google | Google AI Studio | - | $0.10 | 0.61s | 85.0T/s | 100.00% | ||
Gemma 3 12B google | Cloudflare | - | $0.04 | 0.28s | 21.0T/s | 100.00% | ||
GPT-4.1 Mini openai | Azure | - | $0.41 | 3.48s | 68.0T/s | 100.00% | ||
Gemini 2.5 Flash google | Google Vertex (Global) | - | $0.32 | 0.68s | 67.0T/s | 100.00% | ||
Gemini 2.5 Flash Lite google | Google AI Studio | - | $0.10 | 0.52s | 133.0T/s | 100.00% | ||
gpt-oss-120b openai | SambaNova | - | $0.04 | 0.78s | 316.0T/s | 100.00% | ||
Kimi K2.5 moonshotai | Inceptron | - | $0.47 | 0.59s | 54.0T/s | 100.00% | ||
Trinity Large Preview (free) arcee-ai | Arcee (Prime Intellect) | - | $0.00 | 0.32s | 37.0T/s | 100.00% | ||
Step 3.5 Flash (free) stepfun | StepFun | - | $0.00 | 3.18s | 50.0T/s | 100.00% | ||
Claude Opus 4.6 anthropic | Google Vertex | - | $5.20 | 1.59s | 42.0T/s | 100.00% | ||
GLM 5 z-ai | Fireworks | - | $0.74 | 0.91s | 93.0T/s | 100.00% | ||
Claude Sonnet 4.6 anthropic | Google Vertex (Global) | - | $3.12 | 1.06s | 40.0T/s | 100.00% | ||
Gemini 3.1 Pro Preview google | Google AI Studio | - | $2.10 | 4.18s | 69.0T/s | 100.00% | ||
GPT-5.4 openai | OpenAI | - | $2.62 | 3.35s | 48.0T/s | 100.00% | ||
Nemotron 3 Super (free) nvidia | NVIDIA | - | $0.00 | 5.37s | 17.0T/s | 100.00% | ||
GLM 5 Turbo z-ai | Z.ai | - | $0.99 | 3.48s | 27.0T/s | 100.00% | ||
Grok 4 Fast x-ai | xAI | - | $0.20 | 2.86s | 149.0T/s | 100.00% | ||
Gemini 2.5 Flash Lite Preview 09-2025 google | Google AI Studio | - | $0.10 | 0.45s | 152.0T/s | 100.00% | ||
Claude Sonnet 4.5 anthropic | Amazon Bedrock | - | $3.12 | 2.39s | 43.0T/s | 100.00% | ||
Claude Haiku 4.5 anthropic | Google Vertex | - | $1.04 | 0.64s | 87.0T/s | 100.00% | ||
Ministral 3 3B 2512 mistralai | Mistral | - | $0.10 | 0.23s | 60.0T/s | 100.00% | ||
GPT-5.2 openai | OpenAI | - | $1.86 | 3.18s | 43.0T/s | 100.00% | ||
MiMo-V2-Flash xiaomi | Xiaomi | - | $0.09 | 2.13s | 44.0T/s | 100.00% | ||
Llama 3 8B Instruct meta-llama | DeepInfra | - | $0.03 | 0.20s | 45.0T/s | 100.00% | ||
GPT-4o openai | OpenAI | - | $2.58 | 0.54s | 47.0T/s | 100.00% | ||
Mistral Nemo mistralai | Mistral | - | $0.02 | 0.23s | 122.0T/s | 100.00% | ||
Llama 3.1 70B Instruct meta-llama | DeepInfra | - | $0.40 | 0.26s | 20.0T/s | 100.00% | ||
Llama 3.3 70B Instruct meta-llama | Groq | - | $0.10 | 0.25s | 189.0T/s | 100.00% | ||
Gemini 2.0 Flash Lite google | Google Vertex | - | $0.08 | 0.55s | 67.0T/s | 100.00% | ||
Gemma 3 27B google | DeepInfra | - | $0.03 | 0.47s | 46.0T/s | 100.00% | ||
DeepSeek V3 0324 deepseek | SambaNova | - | $0.21 | 0.59s | 77.5T/s | 100.00% | ||
Llama 4 Scout meta-llama | Google Vertex | - | $0.08 | 0.46s | 42.5T/s | 100.00% | ||
Llama 4 Maverick meta-llama | Parasail | - | $0.15 | 0.38s | 94.0T/s | 100.00% | ||
GPT-4.1 Nano openai | Azure | - | $0.10 | 0.61s | 85.0T/s | 100.00% | ||
GPT-4.1 openai | Azure | - | $2.06 | 0.80s | 52.0T/s | 100.00% | ||
Qwen3 32B qwen | Groq | - | $0.08 | 0.30s | 410.0T/s | 100.00% | ||
Gemini 2.5 Pro google | Google AI Studio | - | $1.33 | 3.05s | 107.0T/s | 100.00% | ||
Qwen3 235B A22B Instruct 2507 qwen | Weights & Biases | - | $0.07 | 0.36s | 67.0T/s | 100.00% | ||
gpt-oss-20b openai | Groq | - | $0.03 | 0.15s | 720.0T/s | 100.00% | ||
GPT-5 Nano openai | Azure | - | $0.05 | 11.37s | 95.0T/s | 100.00% | ||
GPT-5 Mini openai | Azure | - | $0.27 | 20.12s | 79.0T/s | 100.00% | ||
GPT-5 Chat openai | OpenAI | - | $1.33 | 0.70s | 95.0T/s | 100.00% | ||
DeepSeek V3.1 deepseek | Google Vertex | - | $0.16 | 0.78s | 115.0T/s | 100.00% | ||
Grok 4.20 Beta x-ai | xAI | - | $2.05 | 0.41s | 108.0T/s | 100.00% | ||
GLM 4.7 Flash z-ai | Venice | - | $0.06 | 0.74s | 52.0T/s | 100.00% | ||
Qwen3.5 397B A17B qwen | Together | - | $0.41 | 0.47s | 72.0T/s | 100.00% | ||
Qwen3.5-Flash qwen | Alibaba Cloud Int. | - | $0.10 | 0.50s | 67.0T/s | 100.00% | ||
Qwen3 VL 8B Instruct qwen | Alibaba Cloud Int. | - | $0.08 | 0.66s | 73.0T/s | 100.00% | ||
gpt-oss-safeguard-20b openai | Groq | - | $0.08 | 0.18s | 789.0T/s | 100.00% | ||
GPT-5.1 Chat openai | OpenAI | - | $1.33 | 1.50s | 54.0T/s | 100.00% | ||
GPT-5.1 openai | Azure | - | $1.33 | 2.14s | 63.0T/s | 100.00% | ||
Nemotron 3 Nano 30B A3B nvidia | DeepInfra | - | $0.05 | 0.60s | 93.0T/s | 100.00% | ||
Mistral Small Creative mistralai | Mistral | - | $0.10 | 0.16s | 41.0T/s | 100.00% | ||
GLM 4.7 z-ai | Google Vertex | - | $0.40 | 0.82s | 126.0T/s | 100.00% | ||
GPT-4o-mini (2024-07-18) openai | OpenAI | - | $0.15 | 0.57s | 34.0T/s | 100.00% | ||
Claude 3.5 Haiku anthropic | Amazon Bedrock (US-WEST) | - | $0.83 | 0.84s | 40.0T/s | 100.00% | ||
DeepSeek V3 deepseek-ai | DeepInfra | - | $0.33 | 0.41s | 25.0T/s | 100.00% | ||
Qwen-Turbo qwen | Alibaba Cloud Int. | - | $0.03 | 0.44s | 30.0T/s | 100.00% | ||
Gemma 3n 4B google | Together | - | $0.02 | 0.27s | 29.0T/s | 100.00% | ||
Claude Sonnet 4 anthropic | Amazon Bedrock | - | $3.12 | 1.51s | 68.0T/s | 100.00% | ||
Mistral Small 3.2 24B mistralai | DeepInfra | - | $0.06 | 0.49s | 72.0T/s | 100.00% | ||
GLM 4 32B z-ai | Z.ai | - | $0.10 | 1.11s | 3.0T/s | 100.00% | ||
GLM 4.5 Air z-ai | Nebius Token Factory | - | $0.14 | 0.33s | 51.0T/s | 100.00% | ||
Codestral 2508 mistralai | Mistral | - | $0.31 | 0.28s | 55.0T/s | 100.00% | ||
GPT-5 openai | Azure | - | $1.33 | 15.19s | 68.0T/s | 100.00% | ||
Kimi K2 0905 moonshotai | Groq | - | $0.42 | 0.28s | 161.0T/s | 100.00% | ||
Nano Banana (Gemini 2.5 Flash Image) google | Google AI Studio | - | $0.32 | 6.96s | 154.0T/s | 100.00% | ||
GPT-5.3-Codex openai | OpenAI | - | $1.86 | 4.09s | 44.0T/s | 100.00% |
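
One way to compare rows that trade latency against throughput is to fold both into a single estimated response time. The sketch below is illustrative only: it assumes the seconds column is the delay before output starts, that the T/s column is output tokens per second, and that total time is roughly latency plus tokens divided by throughput. The helper names `parse_row` and `estimated_seconds` and the 500-token response length are hypothetical choices, not part of the leaderboard.

```python
# Three rows copied verbatim from the table above.
ROWS = """\
Llama 3.1 8B Instruct meta-llama | Groq | - | $0.02 | 0.15s | 250.0T/s | 100.00% | ||
GPT-5 Mini openai | Azure | - | $0.27 | 20.12s | 79.0T/s | 100.00% | ||
gpt-oss-20b openai | Groq | - | $0.03 | 0.15s | 720.0T/s | 100.00% | ||
"""


def parse_row(line):
    """Split one pipe-delimited row into typed fields (hypothetical helper)."""
    cells = [c.strip() for c in line.split("|")]
    return {
        "model": cells[0],
        "provider": cells[1],
        "price_usd": float(cells[3].lstrip("$")),
        "latency_s": float(cells[4].rstrip("s")),
        "tokens_per_s": float(cells[5].replace("T/s", "")),
    }


def estimated_seconds(row, output_tokens=500):
    """Assumed model: time to start plus time to stream the output."""
    return row["latency_s"] + output_tokens / row["tokens_per_s"]


rows = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
ranked = sorted(rows, key=estimated_seconds)

for r in ranked:
    print(f"{r['model']:<36} {r['provider']:<8} {estimated_seconds(r):6.2f}s")
```

Under these assumptions a row with high throughput but a long initial delay can still rank below a slower-streaming row, which the raw columns do not make obvious at a glance.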