xAI · closed · 2025-07

Grok 4

#6 · index 86.3
the take

xAI's best-documented model, which is faint praise given how little the company publishes now. Math and coding were solid for its era, but the newer Groks are cheaper, and this one's 256K context is small by 2026 standards.

benchmarks
specs
Context: 256K tokens
Input: $3/M tokens
Output: $15/M tokens
Speed:
Modality: text, image
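
To make the per-million-token prices above concrete, here is a minimal sketch of the cost arithmetic. It assumes flat rates of $3/M input and $15/M output with no caching discounts or long-context tiers (the card does not say whether any apply); the function name and the example workload are hypothetical.

```python
# Prices taken from the specs above, in USD per one million tokens.
# Assumption: flat rates, no caching discounts or tiered pricing.
INPUT_USD_PER_M = 3.0
OUTPUT_USD_PER_M = 15.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost of one request in USD at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200,000 input tokens and 4,000 output tokens.
    print(f"${estimate_cost_usd(200_000, 4_000):.2f}")  # prints $0.66
```

At these rates output tokens cost five times as much as input tokens, so long generations dominate the bill well before a prompt approaches the context limit.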
strengths
  • Strong reasoning and math (AIME 91.7)
  • Decent coding (LiveCodeBench 79)
  • Mature, widely integrated
weaknesses
  • Small 256K context, high price
  • Superseded by Grok 4.20 / 4.3
  • No official MMLU-Pro or SWE-bench scores

Sources: [1][2]