25 Comments
Edge Literacy's avatar

Jurgen, your brother-in-law benchmark is the right one, and you're correct that most people are using it wrong (comparing AI to the expert they wish they had access to rather than the one they actually do).

But there's a dimension your piece doesn't reach, and I think it's the more interesting one. You focused almost entirely on the output side: whose answers are more accurate, whose hallucinations are more graciously corrected. That frame is important to elucidate. What I've found in sustained, disciplined engagement with Claude over many months is that the quality of what comes back is directly proportional to the accumulated frameworks the person brings to the prompting. Education is often traced to the Latin educere, to lead out. The AI certainly doesn't generate wisdom; however, it can create conditions under which your own accumulated wisdom becomes legible to you.

Your cosmology question gets a competent answer because the question is well-formed and the domain IS settled. That's a low bar, and Gemini clears it. My forty years of Jungian, contemplative, and philosophical depth work creates a qualitatively different dialogue (not better answers, but a different kind of thinking becoming accessible). The model becomes a navigational field that my own cognition moves through.

Most people using AI have no idea how much the quality of their own accumulated thinking shapes what gets led out. That's the piece missing from almost every piece written about this (including the ones defending it).

Jurgen Appelo's avatar

Thanks! I love the insight. I will reflect on it.

P.K. Newby, ScD, MPH, MS's avatar

A wonderful read, thank you! Absolutely agree. Trite but true, the basic GIGO phenomenon doesn’t seem to get mentioned much. Garbage in, garbage out, that is. It was one of the first concepts I learned way back when… and remains useful for AI. Or, rather, the sapiens using AI.

Edge Literacy's avatar

Thanks for the reminder of GIGO. I haven’t heard that reference in a long time. Many people think that even though modern AI models are very sophisticated, they’re not magic; if you feed them poor, unclear, biased, or low-quality prompts/data, you’ll tend to get poor, unclear, biased, or low-quality results back.

Your “sapiens using AI” 😀 emphasizes that the human is the critical variable; skilled, thoughtful users who bring good frameworks, context, and clear thinking get much better results than casual users who just throw vague or sloppy inputs at the model.

Jurgen Appelo's avatar

GIGO, indeed! I didn't know this acronym.

#ContinuityMatters's avatar

Long form continuity 🙂

Norma Safe Smart AI's avatar

Bravo! Here’s to critical thinking and calling things what they really are. 👏🏻🥂

Mark S. Carroll's avatar

I loved this—especially the part where you flipped “hallucination” into a mirror for human bias. That’s the kind of inversion we rarely see handled with both wit and clarity.

I explore similar ideas on Collaborate with Mark—how humans and AI might actually collaborate better if we dropped our illusions of perfection and learned to co-reason instead of compete. This essay nails that tension between humility and hubris. Excellent work, Jurgen.

👉 substack.mark-carroll.com

Jurgen Appelo's avatar

Awesome. Glad you liked it. Will check out your posts too.

Mark S. Carroll's avatar

Always happy to share great work. Perhaps we could collab some time.

Simone Says's avatar

Love the honesty and the humor!

Erik de Bos's avatar

Quoting Yuval Harari, "We are the only mammals that can cooperate with numerous strangers because only we can invent fictional stories, spread them around, and convince millions of others to believe in them."

Jurgen Appelo's avatar

It's a blessing and a curse.

A.E.Larsson's avatar

This reminded me of something from my systems days — we once had to add a 5-10 second delay and flickering lights to a computer program just so users believed it was actually calculating. An immediate result felt wrong, it's too fast. Obviously incorrect. "The computer isn't doing what you sold us."

The output was fine. The performance of processing was what they needed to trust it.

Which is exactly the bias you're describing — we trust the familiar pattern over the accurate result.

Confidence reads as competence. Efficiency reads as suspicious.

The real question isn't whether AI reasons correctly. It's whether we've trained ourselves to trust performance over output. And if so — whose fault is that?

Rania A. Nightingale's avatar

I laughed at the hallucinations argument but I was not emotionally prepared for service improvement frameworks to become collateral damage 😂 Also, very good points. AI did not invent confident nonsense; it just made us confront how much human authority already runs on it.

Charles-Antoine Poirier's avatar

Hello from France (we met in May where you keynoted the LID’25).

You got me with this post (especially because you’re among the rare people using my favorite expression about LLMs: stochastic parrots). I will quote you in my next newsletter (free 😉 on Substack and in plain French).

Jurgen Appelo's avatar

Thanks! And greetings from a sophistic monkey.

Liam Weavers's avatar

"I trust AI more than most humans"... That's the danger. Humans can be stupid but AI cannot be trustworthy.

Jurgen Appelo's avatar

There are countless humans less trustworthy than AI.

Liam Weavers's avatar

There are only people more trustworthy than AI

Jurgen Appelo's avatar

True. But as I wrote in the article, the average AI gives *much* better answers than the *average* human. The few humans whose answers are still better than the machines are increasingly difficult to find.

Liam Weavers's avatar

And they'll be even more difficult to find if you “trust AI more than most humans”… eventually you'll stop looking 🧐

Liam Weavers's avatar

Because you're smarter than AI 👍🏻

Paul Wilnas's avatar

Amusingly, the post about higher standards, rising above it all, is loaded with dogwhistles and hugbox language.