Live Overview Cards
Compare

Gateway Tax = (gateway - direct).
Auto-pick: Direct + GW Mini trend · last samples
How We Measure

Methodology

Each latency figure comes from full chat prompts sent to the provider; we wait for streamed completions, capturing provider overhead, the model's thinking time, and transport.
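
As a rough illustration of timing a streamed completion end to end, here is a minimal sketch. It assumes latency is taken from the moment the prompt is sent until the stream is fully drained; the page does not spell out the exact start and stop points. `time_streamed_completion` and `fake_provider` are hypothetical names, and the fake provider only stands in for a real streaming chat call.

```python
import time
from typing import Callable, Iterable


def time_streamed_completion(send_prompt: Callable[[], Iterable[str]]) -> float:
    """Return wall-clock milliseconds from sending the prompt until the stream ends."""
    start = time.perf_counter()
    for _chunk in send_prompt():
        # Drain the stream; the chunk contents are irrelevant for timing.
        pass
    return (time.perf_counter() - start) * 1000.0


def fake_provider() -> Iterable[str]:
    """Hypothetical stand-in for a real streaming chat API call."""
    for token in ("Hel", "lo", "!"):
        time.sleep(0.01)  # simulated per-chunk delay (assumption)
        yield token


if __name__ == "__main__":
    print(f"{time_streamed_completion(fake_provider):.0f} ms")
```

Because the timer wraps the whole call, the figure includes transport, provider overhead, and model time together, matching the description above.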

  • Samples aggregate real conversation flows, not pings.
  • Gateway tax = gateway latency - direct latency.
  • Freshness badges combine sample count and last-run timestamps.
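
The gateway tax formula can be sketched as follows. This assumes the tax is computed on the mean of each path's samples; the page does not state which aggregate is used, and `gateway_tax_ms` is a hypothetical helper name.

```python
from statistics import mean
from typing import Sequence


def gateway_tax_ms(gateway_samples_ms: Sequence[float],
                   direct_samples_ms: Sequence[float]) -> float:
    """Return mean gateway latency minus mean direct latency, in milliseconds."""
    if not gateway_samples_ms or not direct_samples_ms:
        raise ValueError("need at least one sample on each path")
    # Assumption: the mean is the aggregate; a median would work the same way.
    return mean(gateway_samples_ms) - mean(direct_samples_ms)


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the dashboard.
    print(gateway_tax_ms([150, 170], [100, 120]))
```

A positive result means the gateway path is slower than the direct path; a negative result means the gateway path was faster for those samples.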
Rankings

Avg today (UTC)

Top 10 · min 3 samples
#2 Groq Direct openai/gpt-oss-120b
n=23
117 ms
#3 Groq Direct openai/gpt-oss-20b
n=36
130 ms
#4 Mistral Direct open-mistral-7b
n=23
169 ms
#5 Groq Direct llama-3.1-8b-instant
n=24
179 ms
#6 Mistral Direct voxtral-mini-latest
n=24
181 ms
n=23
189 ms
#8 Mistral Direct voxtral-small-latest
n=35
190 ms
#9 Groq Direct groq/compound
n=24
194 ms
#10 Mistral Direct voxtral-small-2507
n=35
201 ms

Fastest measurement (last 20 min)

Top 10 · min 1 sample
#2 Groq Direct openai/gpt-oss-20b
n=10
67 ms
#3 Groq Direct openai/gpt-oss-120b
n=6
93 ms
#4 Groq Direct groq/compound
n=7
125 ms
#5 Groq Direct llama-3.1-8b-instant
n=7
134 ms
#6 Mistral Direct voxtral-mini-latest
n=7
135 ms
#7 Mistral Direct open-mistral-7b
n=6
135 ms
#9 Mistral Direct voxtral-small-2507
n=9
137 ms
n=6
138 ms
Latency Trend

Legend

API Server Map

Live endpoints
Top regions
Markers show observed API endpoint IPs (GeoIP cached).
We measure from Germany; our server is pinned on the map.