xAI · closed · 2026-04
Grok 4.3
not ranked n/a
The take: Cheap, fast, and genuinely good at tool use. Off the board because xAI now publishes almost nothing you can compare against other models, leaning on its own agentic metrics instead. Trust it for agents; verify everything else.
Not ranked: xAI doesn't publish enough comparable public benchmarks to place it fairly.
GPQA Diamond: 90.1
MMLU-Pro: n/a
- Context: 1M tokens
- Input: $1.25/M tokens
- Output: $2.50/M tokens
- Speed: 114 tok/s
- Modality: text, image
- Best-in-family agentic tool-calling
- Very low price for a frontier model
- Strong instruction-following, 1M context
- No standard academic per-benchmark scores
- Non-hallucination rate slipped vs. Grok 4.20
- Below GPT-5.5 / Gemini on the intelligence index
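To put the listed pricing and speed in concrete terms, here is a rough sketch of the arithmetic for a single request. The rates are the ones quoted on this card ($1.25/M input, $2.50/M output, 114 tok/s); the function names and the example token counts are made up for illustration, and the sketch assumes flat per-token pricing with no caching discounts or long-context surcharges, which this card does not confirm either way.

```python
# Rates taken from the card above. Everything else here is an
# illustrative assumption, not xAI's billing logic.
INPUT_USD_PER_M = 1.25    # USD per million input tokens
OUTPUT_USD_PER_M = 2.50   # USD per million output tokens
OUTPUT_TOK_PER_S = 114    # listed output speed, tokens per second


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Flat-rate cost estimate for one request, in USD."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


def estimate_output_seconds(output_tokens: int) -> float:
    """Time to stream the output at the listed speed (ignores latency to first token)."""
    return output_tokens / OUTPUT_TOK_PER_S


if __name__ == "__main__":
    # Hypothetical agent step: a 200k-token context and a 4k-token reply.
    cost = estimate_cost_usd(200_000, 4_000)
    seconds = estimate_output_seconds(4_000)
    print(f"cost ~ ${cost:.2f}, output time ~ {seconds:.0f} s")
```

With those hypothetical counts, the input side dominates the bill: 200k input tokens cost $0.25, while the 4k-token reply adds only $0.01. Long-context agent loops are where the low input price matters most.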