YPerf

Track & Compare LLM API Performance Metrics

Last updated: 1/13/2025, 7:01:22 AM

| Model | Author | Provider | Rank | Price | Latency | Throughput | % |
|---|---|---|---|---|---|---|---|
| MythoMax 13B | gryphe | Fireworks | - | $0.07 | 0.22s | 108.8T/s | 99.95% |
| Gemini Flash 1.5 | google | Google Vertex | #28 | $0.08 | 2.81s | 172.4T/s | 99.86% |
| GPT-4o-mini | openai | OpenAI | #23 | $0.15 | 0.65s | 76.3T/s | 97.97% |
| Llama 3.1 70B Instruct | meta-llama | Together | #39 | $0.12 | 0.43s | 83.6T/s | 100.00% |
| Llama 3.2 1B Instruct | meta-llama | Lepton | #149 | $0.01 | 0.31s | 298.6T/s | 100.00% |
| Gemini Flash 1.5 8B | google | Google AI Studio | #67 | $0.04 | 0.48s | 222.9T/s | 99.99% |
| DeepSeek V3 | deepseek | DeepSeek | #8 | $0.14 | 1.13s | 46.4T/s | 99.53% |
| Mistral Tiny | mistralai | Mistral | #138 | $0.25 | 1.23s | 50.9T/s | 100.00% |
| GPT-4o-mini (2024-07-18) | openai | OpenAI | #23 | $0.15 | 0.40s | 64.2T/s | 100.00% |
| Mistral Nemo | mistralai | Mistral | - | $0.04 | 1.36s | 79.5T/s | 99.99% |
| Llama 3.1 8B Instruct | meta-llama | Avian.io | #103 | $0.02 | 0.48s | 272.3T/s | 99.99% |
| Llama 3.3 70B Instruct | meta-llama | Avian.io | #23 | $0.12 | 0.50s | 88.1T/s | 99.97% |
| OpenChat 3.5 7B | openchat | Lepton | #121 | $0.06 | 0.40s | 83.4T/s | 100.00% |
| WizardLM-2 7B | microsoft | Lepton | - | $0.06 | 0.24s | 56.0T/s | 100.00% |
| WizardLM-2 8x22B | microsoft | Together | - | $0.50 | 0.46s | 66.0T/s | 100.00% |
| Llama 3 70B Instruct | meta-llama | Fireworks | #56 | $0.23 | 0.50s | 160.8T/s | 100.00% |
| Llama 3 8B Instruct | meta-llama | Fireworks | #94 | $0.03 | 0.26s | 213.1T/s | 99.98% |
| Mistral 7B Instruct v0.3 | mistralai | Together | - | $0.03 | 0.14s | 153.2T/s | 96.00% |
| Llama 3 Euryale 70B v2.1 | sao10k | DeepInfra | - | $0.71 | 0.41s | 32.0T/s | 99.99% |
| Hermes 3 405B Instruct | nousresearch | Lambda | #15 | $0.81 | 1.24s | 18.5T/s | 100.00% |
| Command R (08-2024) | cohere | Cohere | #69 | $0.15 | 0.25s | 90.4T/s | 100.00% |
| Qwen2.5 72B Instruct | qwen | Together | #38 | $0.23 | 1.61s | 93.6T/s | 100.00% |
| Llama 3.2 3B Instruct | meta-llama | Lambda | #121 | $0.02 | 0.34s | 356.6T/s | 100.00% |
| Ministral 8B | mistralai | Mistral | #82 | $0.10 | 1.25s | 60.7T/s | 100.00% |
| Claude 3.5 Sonnet | anthropic | Anthropic | #7 | $3.12 | 2.42s | 57.3T/s | 99.90% |
| Qwen2.5 Coder 32B Instruct | qwen | Fireworks | #56 | $0.07 | 0.64s | 95.8T/s | 100.00% |
| Mistral Large 2411 | mistralai | Mistral | #40 | $2.05 | 1.69s | 46.0T/s | 95.99% |
| GPT-3.5 Turbo 16k | openai | OpenAI | - | $0.51 | 0.31s | 123.5T/s | 100.00% |
| GPT-3.5 Turbo | openai | OpenAI | #99 | $0.51 | 0.34s | 127.4T/s | 99.97% |
| Hermes 13B | nousresearch | NovitaAI | - | $0.17 | 0.59s | 81.5T/s | 100.00% |
| OpenHermes 2.5 Mistral 7B | teknium | NovitaAI | #122 | $0.17 | 1.04s | 61.6T/s | 99.93% |
| Mixtral 8x7B Instruct | mistralai | Fireworks | #113 | $0.24 | 0.38s | 142.7T/s | 100.00% |
| Gemini Pro 1.5 | google | Google AI Studio | #9 | $1.29 | 0.83s | 53.7T/s | 100.00% |
| Mixtral 8x22B Instruct | mistralai | Together | #89 | $0.91 | 0.35s | 67.9T/s | 100.00% |
| Llama 3 8B Instruct (extended) | meta-llama | Mancer (private) | - | $0.20 | 0.47s | 33.9T/s | 100.00% |
| Hermes 2 Pro - Llama-3 8B | nousresearch | NovitaAI | - | $0.03 | 0.63s | 145.2T/s | 100.00% |
| Mistral 7B Instruct (nitro) | mistralai | Lepton | - | $0.07 | 0.25s | 69.1T/s | 68.00% |
| Mistral 7B Instruct | mistralai | Together | #138 | $0.03 | 0.27s | 171.7T/s | 100.00% |
| Mistral 7B Instruct (free) | mistralai | Lepton | - | $0.00 | 0.36s | 84.5T/s | 100.00% |
| Dolphin 2.9.2 Mixtral 8x22B 🐬 | cognitivecomputations | NovitaAI | - | $0.91 | 6.39s | 9.2T/s | 99.65% |
| Gemma 2 27B | google | Together | #46 | $0.27 | 0.37s | 64.6T/s | 99.97% |
| Llama 3.1 405B Instruct | meta-llama | Fireworks | - | $0.81 | 0.87s | 53.8T/s | 99.97% |
| GPT-4o (2024-08-06) | openai | Azure | #16 | $2.58 | 3.28s | 128.4T/s | 100.00% |
| Hermes 3 70B Instruct | nousresearch | Hyperbolic | #43 | $0.12 | 0.98s | 31.9T/s | 100.00% |
| Llama 3.1 Euryale 70B v2.2 | sao10k | DeepInfra | - | $0.71 | 0.40s | 36.4T/s | 99.98% |
| Command R+ (08-2024) | cohere | Cohere | #48 | $2.45 | 0.38s | 59.0T/s | 99.95% |
| Llama 3.2 1B Instruct (free) | meta-llama | SambaNova | #149 | $0.00 | 0.70s | 2148.5T/s | 98.77% |
| Rocinante 12B | thedrummer | Infermatic | - | $0.25 | 0.83s | 22.7T/s | 100.00% |
| Ministral 3B | mistralai | Mistral | - | $0.04 | 1.19s | 94.1T/s | 99.99% |
| Claude 3.5 Sonnet (self-moderated) | anthropic | Anthropic | - | $3.12 | 2.52s | 59.2T/s | 99.18% |
| Claude 3.5 Haiku | anthropic | Google Vertex | #22 | $0.83 | 2.82s | 64.8T/s | 100.00% |
| Unslopnemo 12b | thedrummer | Infermatic | - | $0.50 | 0.55s | 68.9T/s | 99.50% |
| GPT-4o (2024-11-20) | openai | OpenAI | #1 | $2.58 | 0.72s | 122.5T/s | 99.98% |
| Gemini Flash 2.0 Experimental (free) | google | Google Vertex | #5 | $0.00 | 1.29s | 130.6T/s | 72.22% |
| Grok 2 1212 | x-ai | xAI | - | $2.08 | 0.50s | 62.3T/s | 99.68% |
| Llama 3.3 Euryale 70B | sao10k | Infermatic | - | $1.51 | 20.70s | 10.1T/s | 100.00% |
| MythoMax 13B (nitro) | gryphe | Fireworks | - | $0.20 | 0.22s | 108.8T/s | 100.00% |
| ReMM SLERP 13B | undi95 | Mancer (private) | - | $0.81 | 1.47s | 27.0T/s | 100.00% |
| Mistral Small | mistralai | Mistral | - | $0.20 | 1.51s | 51.3T/s | 100.00% |
| Mistral Large | mistralai | Mistral | #28 | $2.05 | 1.54s | 28.9T/s | 99.92% |
| Claude 3 Haiku | anthropic | Google Vertex | - | $0.26 | 1.00s | 148.1T/s | 100.00% |
| Claude 3 Haiku (self-moderated) | anthropic | Anthropic | - | $0.26 | 0.73s | 132.6T/s | 99.62% |
| Command R | cohere | Cohere | - | $0.49 | 0.08s | 119.4T/s | 100.00% |
| Llama 3 8B Instruct (free) | meta-llama | Together (lite) | - | $0.00 | 0.49s | 154.0T/s | 100.00% |
| GPT-4o (extended) | openai | OpenAI | - | $6.14 | 0.53s | 110.2T/s | 80.00% |
| GPT-4o | openai | OpenAI | - | $2.58 | 0.53s | 110.2T/s | 99.98% |
| LlamaGuard 2 8B | meta-llama | Together | - | $0.18 | 0.25s | 41.0T/s | 100.00% |
| Claude 3.5 Sonnet (2024-06-20) | anthropic | Anthropic | - | $3.12 | 1.32s | 59.7T/s | 99.84% |
| Llama 3.1 70B Instruct (free) | meta-llama | SambaNova | - | $0.00 | 1.66s | 363.1T/s | 80.00% |
| Llama 3.1 Sonar 70B Online | perplexity | Perplexity | #132 | $1.01 | 1.44s | 56.4T/s | 100.00% |
| ChatGPT-4o | openai | OpenAI | - | $5.12 | 0.52s | 122.0T/s | 99.88% |
| o1-mini | openai | OpenAI | #14 | $3.10 | 3.90s | 80.5T/s | 99.84% |
| Llama 3.2 11B Vision Instruct | meta-llama | Together | - | $0.06 | 1.01s | 118.8T/s | 99.98% |
| Llama 3.1 Nemotron 70B Instruct | nvidia | Together | #39 | $0.12 | 0.59s | 75.7T/s | 99.93% |
| LearnLM 1.5 Pro Experimental (free) | google | Google AI Studio | - | $0.00 | 0.57s | 58.8T/s | 99.99% |
| Nova Pro 1.0 | amazon | Amazon BedRock | - | $0.83 | 0.11s | 63.8T/s | 100.00% |
| Command R7B (12-2024) | cohere | Cohere | - | $0.04 | 0.10s | 102.9T/s | 98.42% |
| Phi 4 | microsoft | DeepInfra | - | $0.07 | 0.31s | 79.3T/s | 99.83% |
Created by JCdata from OpenRouter and Chatbot Arena