4 Comments
Jonathan Pohl

If you ask me, benchmarks are generally irrelevant to user experience.

Jonathan Pohl

I apologize for being critical, but you often claim that Grok 4 is the best model on the planet. However, I cannot understand: best at what? I use it occasionally on various tasks, and I find it inferior to GPT-4o in terms of accuracy, relevance of answers, and understanding of broad context.

Jurgen Appelo

I agree. Grok 4 sucks. But it scored best in the benchmarks in July. See the link in the article.

No need to be critical of me because I didn't design those benchmarks. 🙂

Jonathan Pohl

Ah, then it’s fine. When digital cameras first emerged, people placed excessive importance on megapixels, too.
