Model | Creator | Provider | Arena Rank | Price | Latency | Throughput | Uptime
---|---|---|---|---|---|---|---
Gemini 1.5 Flash 8B | google | Google AI Studio | #83 | $0.04 | 0.38s | 345.3T/s | 99.06%
Qwen2.5 7B Instruct | qwen | Together | - | $0.05 | 0.18s | 177.1T/s | 99.99%
Gemini 2.0 Flash | google | Google AI Studio | #14 | $0.10 | 0.52s | 183.0T/s | 99.87%
Gemini 1.5 Flash | google | Google AI Studio | #65 | $0.08 | 1.01s | 176.2T/s | 99.29%
GPT-4o-mini | openai | Azure | #6 | $0.15 | 1.27s | 176.7T/s | 98.83%
Mistral Nemo | mistralai | Parasail | - | $0.04 | 0.70s | 138.3T/s | 99.97%
Llama 3.1 8B Instruct | meta-llama | Groq | #109 | $0.02 | 0.81s | 1631.6T/s | 99.74%
Llama 3.3 70B Instruct | meta-llama | Groq | #42 | $0.10 | 0.42s | 372.1T/s | 99.98%
Claude 3.7 Sonnet (thinking) | anthropic | Google Vertex | #11 | $3.12 | 1.70s | 56.2T/s | 99.28%
Claude 3.7 Sonnet | anthropic | Google Vertex | #11 | $3.12 | 1.70s | 56.2T/s | 98.51%
Gemini 2.0 Flash Lite | google | Google AI Studio | #15 | $0.08 | 0.87s | 152.0T/s | 99.76%
DeepSeek V3 0324 | deepseek | SambaNova | #4 | $0.28 | 1.49s | 186.8T/s | 99.57%
MythoMax 13B | gryphe | Together | - | $0.07 | 0.44s | 137.1T/s | 99.99%
OpenChat 3.5 7B | openchat | Lepton | - | $0.07 | 0.31s | 96.3T/s | 99.99%
Claude 3 Haiku | anthropic | Google Vertex | #85 | $0.26 | 0.77s | 156.1T/s | 99.96%
WizardLM-2 8x22B | microsoft | Together | - | $0.50 | 0.44s | 79.3T/s | 99.98%
GPT-4o | openai | Azure | - | $2.58 | 1.61s | 180.6T/s | 98.67%
Hermes 2 Pro - Llama-3 8B | nousresearch | NovitaAI | - | $0.03 | 0.95s | 145.9T/s | 99.97%
Mistral 7B Instruct | mistralai | NovitaAI | #142 | $0.03 | 1.19s | 79.3T/s | 99.98%
Llama 3.1 70B Instruct | meta-llama | Fireworks | - | $0.12 | 0.68s | 113.8T/s | 99.94%
Hermes 3 405B Instruct | nousresearch | Nebius AI Studio | - | $0.81 | 0.99s | 29.1T/s | 99.82%
Qwen2.5 72B Instruct | qwen | Together | #57 | $0.12 | 0.51s | 90.6T/s | 99.96%
Llama 3.2 3B Instruct | meta-llama | SambaNova | #124 | $0.02 | 0.40s | 3590.9T/s | 100.00%
Claude 3.5 Sonnet | anthropic | Google Vertex | - | $3.12 | 1.13s | 59.1T/s | 99.82%
Claude 3.5 Sonnet (self-moderated) | anthropic | Anthropic | - | $3.12 | 0.90s | 56.3T/s | 99.61%
Nova Lite 1.0 | amazon | Amazon Bedrock | #80 | $0.06 | 0.43s | 189.6T/s | 98.60%
DeepSeek V3 | deepseek | Fireworks | #17 | $0.39 | 1.30s | 44.3T/s | 99.90%
DeepSeek V3 (free) | deepseek | Chutes | #17 | $0.00 | 1.49s | 43.0T/s | 94.04%
R1 | deepseek | SambaNova | #6 | $0.52 | 4.82s | 404.7T/s | 99.61%
R1 (free) | deepseek | Chutes | #6 | $0.00 | 9.81s | 67.1T/s | 98.40%
LFM 3B | liquid | Liquid | - | $0.02 | 0.76s | 29.7T/s | 99.76%
Mistral Small 3 | mistralai | Mistral | #74 | $0.07 | 0.31s | 131.2T/s | 99.96%
Gemma 3 27B | google | Parasail | #14 | $0.10 | 1.53s | 53.6T/s | 98.98%
Gemma 3 4B | google | DeepInfra | - | $0.02 | 0.34s | 79.0T/s | 95.43%
DeepSeek V3 0324 (free) | deepseek | Chutes | #4 | $0.00 | 1.68s | 39.0T/s | 88.36%
Llama 4 Maverick | meta-llama | SambaNova | #24 | $0.18 | 1.13s | 641.0T/s | 99.47%
GPT-4.1 Mini | openai | OpenAI | - | $0.41 | 0.46s | 89.6T/s | 99.76%
GPT-4.1 | openai | OpenAI | - | $2.06 | 0.54s | 59.8T/s | 99.56%
Mixtral 8x7B Instruct | mistralai | Fireworks | #116 | $0.24 | 0.23s | 210.0T/s | 100.00%
Mistral Tiny | mistralai | Mistral | - | $0.25 | 0.27s | 148.3T/s | 100.00%
Claude 3 Sonnet | anthropic | Anthropic | #71 | $3.12 | 0.53s | 71.3T/s | 96.88%
Claude 3 Sonnet (self-moderated) | anthropic | Anthropic | #71 | $3.12 | 0.53s | 71.3T/s | 98.02%
Claude 3 Haiku (self-moderated) | anthropic | Anthropic | #85 | $0.26 | 0.69s | 148.2T/s | 99.82%
Gemini 1.5 Pro | google | Google AI Studio | #40 | $1.29 | 0.69s | 71.2T/s | 95.18%
Llama 3 70B Instruct | meta-llama | Groq | #63 | $0.30 | 0.26s | 394.0T/s | 100.00%
Llama 3 8B Instruct | meta-llama | Groq | #101 | $0.03 | 0.36s | 2869.6T/s | 99.39%
Llama 3 Lumimaid 8B | neversleep | Mancer (private) | - | $0.10 | 0.58s | 64.2T/s | 99.98%
Llama 3 Lumimaid 8B (extended) | neversleep | Mancer (private) | - | $0.10 | 0.58s | 64.2T/s | 99.98%
GPT-4o-mini (2024-07-18) | openai | OpenAI | #43 | $0.15 | 0.39s | 81.8T/s | 99.98%
Llama 3 8B Lunaris | sao10k | NovitaAI | - | $0.02 | 0.61s | 77.9T/s | 100.00%
ChatGPT-4o | openai | OpenAI | #2 | $5.12 | 0.55s | 104.2T/s | 99.40%
Hermes 3 70B Instruct | nousresearch | Lambda | - | $0.12 | 1.07s | 38.2T/s | 99.97%
Llama 3.1 Euryale 70B v2.2 | sao10k | DeepInfra | - | $0.71 | 0.33s | 36.7T/s | 99.98%
Command R (08-2024) | cohere | Cohere | #84 | $0.15 | 0.41s | 70.9T/s | 99.95%
Llama 3.2 11B Vision Instruct | meta-llama | SambaNova | - | $0.05 | 1.75s | 573.1T/s | 99.42%
Rocinante 12B | thedrummer | Infermatic | - | $0.25 | 0.39s | 84.3T/s | 99.92%
Qwen2.5 7B Instruct (free) | qwen | Nineteen | - | $0.00 | 0.67s | 283.8T/s | 99.94%
Ministral 3B | mistralai | Mistral | - | $0.04 | 0.17s | 237.6T/s | 99.99%
Claude 3.5 Haiku (2024-10-22) | anthropic | Google Vertex | #43 | $0.83 | 1.29s | 51.8T/s | 99.86%
Claude 3.5 Haiku (2024-10-22) (self-moderated) | anthropic | Anthropic | #43 | $0.83 | 6.43s | 51.2T/s | 95.13%
Claude 3.5 Haiku | anthropic | Anthropic | - | $0.83 | 6.82s | 61.7T/s | 99.82%
Unslopnemo 12B | thedrummer | Infermatic | - | $0.50 | 0.58s | 79.5T/s | 99.99%
Qwen2.5 Coder 32B Instruct | qwen | Together | #71 | $0.07 | 0.61s | 72.2T/s | 99.77%
Mistral Large 2411 | mistralai | Mistral | #57 | $2.05 | 0.39s | 50.9T/s | 99.97%
GPT-4o (2024-11-20) | openai | OpenAI | - | $2.58 | 0.50s | 100.7T/s | 99.64%
Gemini 2.0 Flash Experimental (free) | google | Google Vertex | #8 | $0.00 | 0.92s | 200.6T/s | 84.08%
Grok 2 Vision 1212 | x-ai | xAI | - | $2.08 | 0.82s | 62.0T/s | 99.52%
Llama 3.3 Euryale 70B | sao10k | Infermatic | - | $0.71 | 0.51s | 48.2T/s | 99.98%
Phi 4 | microsoft | Nebius AI Studio | #84 | $0.07 | 0.34s | 124.8T/s | 99.34%
MiniMax-01 | minimax | Minimax | - | $0.21 | 1.82s | 28.4T/s | 99.75%
R1 Distill Llama 70B | deepseek | SambaNova | - | $0.10 | 3.17s | 1302.0T/s | 99.90%
LFM 7B | liquid | Liquid | - | $0.01 | 0.69s | 63.1T/s | 100.00%
R1 Distill Qwen 32B | deepseek | NovitaAI | - | $0.12 | 27.03s | 65.4T/s | 99.88%
o3 Mini | openai | OpenAI | #20 | $1.14 | 6.15s | 148.4T/s | 99.69%
Qwen-Max | qwen | Alibaba | - | $1.65 | 1.01s | 38.7T/s | 96.23%
Claude 3.7 Sonnet (self-moderated) | anthropic | Anthropic | #11 | $3.12 | 1.72s | 49.0T/s | 98.11%
QwQ 32B (free) | qwen | Nineteen | #126 | $0.00 | 5.33s | 438.0T/s | 98.37%
Gemma 3 12B | google | DeepInfra | - | $0.05 | 1.00s | 34.5T/s | 98.93%
Mistral Small 3.1 24B | mistralai | Mistral | - | $0.10 | 0.27s | 143.2T/s | 99.79%
Llama 4 Scout | meta-llama | Groq | - | $0.08 | 0.39s | 756.5T/s | 99.97%
Llama 4 Maverick (free) | meta-llama | Chutes | - | $0.00 | 1.05s | 73.6T/s | 99.69%
Grok 3 Beta | x-ai | xAI Fast | #4 | $3.12 | 0.74s | 65.3T/s | 99.45%
Grok 3 Mini Beta | x-ai | xAI Fast | - | $0.30 | 4.24s | 187.2T/s | 74.85%
GPT-4.1 Nano | openai | OpenAI | - | $0.10 | 0.40s | 245.2T/s | 99.92%
o4 Mini | openai | OpenAI | - | $1.14 | 4.45s | 81.9T/s | 82.78%
Gemini 2.5 Flash Preview | google | AI Studio Non-Thinking | - | $0.15 | 0.86s | 168.4T/s | 41.64%
Gemini 2.5 Pro Preview | google | Google Vertex | #1 | $1.33 | 9.79s | 438.1T/s | 97.69%
GPT-3.5 Turbo | openai | OpenAI | #106 | $0.51 | 0.38s | 146.4T/s | 99.65%
ReMM SLERP 13B | undi95 | Mancer | - | $0.57 | 1.68s | 43.6T/s | 99.99%
Mistral Large | mistralai | Mistral | #43 | $2.05 | 0.42s | 45.7T/s | 99.97%
Dolphin 2.9.2 Mixtral 8x22B 🐬 | cognitivecomputations | NovitaAI | - | $0.91 | 2.24s | 12.1T/s | 99.95%
Gemma 2 9B | google | Groq | #82 | $0.07 | 0.25s | 1046.2T/s | 100.00%
Llama 3.1 405B Instruct | meta-llama | Fireworks | - | $0.81 | 0.61s | 75.1T/s | 99.93%
GPT-4o (2024-08-06) | openai | Azure | #29 | $2.58 | 1.46s | 124.3T/s | 99.56%
Llama 3.2 1B Instruct | meta-llama | SambaNova | #152 | $0.01 | 0.14s | 2666.7T/s | 99.96%
Claude 3.5 Haiku (self-moderated) | anthropic | Anthropic | #43 | $0.83 | 6.82s | 61.7T/s | 89.96%
Mistral Large 2407 | mistralai | Mistral | #43 | $2.05 | 0.88s | 45.0T/s | 95.83%
Codestral 2501 | mistralai | Mistral | - | $0.31 | 0.27s | 186.3T/s | 99.96%
Gemini 2.0 Flash Thinking Experimental 01-21 (free) | google | Google AI Studio | #9 | $0.00 | 3.87s | 190.8T/s | 80.41%
Qwen2.5 VL 72B Instruct | qwen | Together | - | $0.71 | 1.16s | 37.9T/s | 99.51%
Qwen-Turbo | qwen | Alibaba | - | $0.05 | 1.09s | 105.5T/s | 99.87%
R1 Distill Llama 8B | deepseek | NovitaAI | - | $0.04 | 16.71s | 170.0T/s | 97.58%
QwQ 32B | qwen | Groq | #126 | $0.15 | 1.21s | 3558.6T/s | 99.71%
Sonar Reasoning Pro | perplexity | Perplexity | - | $2.06 | 10.75s | 112.5T/s | 95.32%
Phi 4 Multimodal Instruct | microsoft | Parasail | - | $0.05 | 1.91s | 99.0T/s | 88.18%
Skyfall 36B V2 | thedrummer | Parasail | - | $0.51 | 3.86s | 20.4T/s | 98.92%
Gemma 3 27B (free) | google | Chutes | - | $0.00 | 1.34s | 70.7T/s | 85.64%
Qwen2.5 VL 32B Instruct | qwen | Fireworks | - | $0.91 | 0.34s | 68.2T/s | 94.07%
Gemini 2.5 Pro Experimental (free) | google | Google Vertex | #1 | $0.00 | 13.51s | 417.8T/s | 57.08%
o4 Mini High | openai | OpenAI | - | $1.14 | 8.52s | 91.3T/s | 69.37%
Created by JC; data from OpenRouter and Chatbot Arena.