Live Overview Cards
Compare

Gateway Tax = gateway latency - direct latency.
Auto-pick: Direct + GW Mini trend · last samples
How We Measure

Methodology

Each latency figure comes from a full chat prompt sent to the provider. We wait for the streamed completion, so the number captures provider overhead, the model's thinking time, and transport.
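As a rough illustration of timing a streamed completion, here is a minimal Python sketch. It is not the dashboard's actual harness: `time_stream` and `fake_stream` are hypothetical names, and the fake stream stands in for a real provider response. Whether the published figures correspond to time to first chunk or time to full completion is not stated here, so the sketch records both.

```python
import time
from typing import Callable, Iterable, Iterator


def time_stream(chunks: Iterable[str],
                clock: Callable[[], float] = time.perf_counter) -> dict:
    """Consume a stream of text chunks and report timings in milliseconds."""
    start = clock()
    first_chunk_ms = None
    pieces = []
    for chunk in chunks:
        if first_chunk_ms is None:
            # Elapsed time until the first chunk arrived.
            first_chunk_ms = (clock() - start) * 1000.0
        pieces.append(chunk)
    # Elapsed time until the stream was fully consumed.
    total_ms = (clock() - start) * 1000.0
    return {
        "first_chunk_ms": first_chunk_ms,
        "total_ms": total_ms,
        "text": "".join(pieces),
    }


def fake_stream() -> Iterator[str]:
    """Hypothetical stand-in for a provider's streaming response."""
    for piece in ("Hel", "lo", "!"):
        time.sleep(0.01)  # simulated network and generation delay
        yield piece


if __name__ == "__main__":
    print(time_stream(fake_stream()))
```

Because the clock starts before the first chunk is requested, everything between sending the prompt and receiving the last chunk lands in `total_ms`.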

  • Samples aggregate real conversation flows, not pings.
  • Gateway tax = gateway latency - direct latency.
  • Freshness badges combine sample count and last-run timestamps.
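The gateway tax formula and the freshness badge can be sketched as follows. This is an assumption-laden illustration: the function names, the badge labels ("fresh", "stale", "sparse"), and the default thresholds are all hypothetical, since the page only says that badges combine sample count and last-run timestamps.

```python
from datetime import datetime, timedelta, timezone


def gateway_tax_ms(gateway_ms: float, direct_ms: float) -> float:
    """Gateway tax = gateway latency - direct latency (both in ms)."""
    return gateway_ms - direct_ms


def freshness_badge(sample_count: int,
                    last_run: datetime,
                    now: datetime,
                    min_samples: int = 3,                       # assumed threshold
                    max_age: timedelta = timedelta(minutes=20)  # assumed threshold
                    ) -> str:
    """Combine sample count and last-run time into one label (labels are made up)."""
    if sample_count < min_samples:
        return "sparse"
    if now - last_run > max_age:
        return "stale"
    return "fresh"


if __name__ == "__main__":
    now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
    # Illustrative numbers only, not measurements from the dashboard.
    print(gateway_tax_ms(250.0, 180.0))
    print(freshness_badge(33, now - timedelta(minutes=5), now))
```

A negative tax is possible under this definition; it would mean the gateway route happened to be faster than the direct one for those samples.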
Rankings

Avg today (UTC)

Top 10 · min 3 samples
#2 Groq Direct openai/gpt-oss-120b · n=33 · 120 ms
#3 Groq Direct openai/gpt-oss-20b · n=50 · 121 ms
#4 Groq Direct llama-3.1-8b-instant · n=33 · 177 ms
#5 Mistral Direct open-mistral-7b · n=33 · 181 ms
#6 Mistral Direct voxtral-mini-latest · n=33 · 184 ms
n=33 · 185 ms
#8 Mistral Direct voxtral-small-latest · n=49 · 186 ms
#9 Groq Direct groq/compound · n=33 · 194 ms
#10 Mistral Direct voxtral-small-2507 · n=49 · 196 ms
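
A ranking like the one above (average latency per route, a minimum sample count, top 10) could be computed along these lines. This is a sketch under assumptions: `rank_by_average` and the `(route, latency_ms)` sample shape are hypothetical, and the data in the usage example is invented.

```python
from collections import defaultdict
from statistics import mean
from typing import Iterable, List, Tuple


def rank_by_average(samples: Iterable[Tuple[str, float]],
                    min_samples: int = 3,
                    top: int = 10) -> List[Tuple[str, int, float]]:
    """Return (route, n, average_ms) rows, fastest average first."""
    by_route = defaultdict(list)
    for route, latency_ms in samples:
        by_route[route].append(latency_ms)
    rows = [
        (route, len(values), mean(values))
        for route, values in by_route.items()
        if len(values) >= min_samples  # drop routes with too few samples
    ]
    rows.sort(key=lambda row: row[2])  # ascending by average latency
    return rows[:top]


if __name__ == "__main__":
    # Invented data for illustration only.
    demo = [("a", 100), ("a", 120), ("a", 140),
            ("b", 90), ("b", 95),
            ("c", 110), ("c", 110), ("c", 110)]
    for route, n, avg in rank_by_average(demo):
        print(f"{route} n={n} {avg:.0f} ms")
```

In the usage example, route "b" has the lowest raw latencies but is excluded because it has only two samples, which is the point of the minimum-sample filter.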

Fastest measurement (last 20 min)

Top 10 · min 1 sample
#1 Groq Direct openai/gpt-oss-20b · n=10 · 64 ms
#3 Groq Direct openai/gpt-oss-120b · n=7 · 109 ms
#4 Groq Direct groq/compound · n=7 · 118 ms
#5 Mistral Direct voxtral-mini-latest · n=7 · 132 ms
#6 Groq Direct llama-3.1-8b-instant · n=7 · 132 ms
#7 Mistral Direct voxtral-small-latest · n=10 · 134 ms
#8 Mistral Direct pixtral-12b-latest · n=7 · 135 ms
#9 Groq Direct qwen/qwen3-32b · n=7 · 144 ms
#10 Mistral Direct open-mistral-7b · n=7 · 144 ms
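
The "fastest measurement" view differs from the average ranking in two ways: it looks only at a recent time window and it keeps each route's single best sample. A minimal sketch follows, assuming samples are `(route, timestamp, latency_ms)` tuples; the function name and that tuple shape are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Iterable, List, Tuple


def rank_by_fastest(samples: Iterable[Tuple[str, datetime, float]],
                    now: datetime,
                    window: timedelta = timedelta(minutes=20),
                    top: int = 10) -> List[Tuple[str, int, float]]:
    """Return (route, n_in_window, fastest_ms) rows, fastest first."""
    cutoff = now - window
    best: Dict[str, float] = {}
    count: Dict[str, int] = {}
    for route, timestamp, latency_ms in samples:
        if timestamp < cutoff:
            continue  # sample is older than the window
        count[route] = count.get(route, 0) + 1
        best[route] = min(best.get(route, latency_ms), latency_ms)
    rows = [(route, count[route], best[route]) for route in best]
    rows.sort(key=lambda row: row[2])  # ascending by fastest latency
    return rows[:top]


if __name__ == "__main__":
    now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
    # Invented data for illustration only.
    demo = [("a", now - timedelta(minutes=5), 130.0),
            ("a", now - timedelta(minutes=10), 90.0),
            ("b", now - timedelta(minutes=30), 50.0),  # outside the window
            ("b", now - timedelta(minutes=1), 110.0)]
    print(rank_by_fastest(demo, now))
```

Since one fast sample is enough to place a route, this view is noisier than the daily average and is best read alongside the sample count.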
Latency Trend

Legend

API Server Map

Live endpoints
Top regions
Markers show observed API endpoint IPs (GeoIP cached).
We measure from Germany; our server is pinned on the map.