I'm so glad I found your article. We just released our protocol for AI alignment and have been looking for novel tests to demonstrate its sense-making capability. So I took your original prompt, the one you said would be good enough for you as the human, and did not use the refined prompt your AI agent gave you. Would your raw prompt deliver on its own?
It seems to me that it did, but I don't have your literary background in book lists.
Here is the link to the Token Alignment Protocol thread: https://chatgpt.com/share/68afb41a-83dc-800e-a96b-4f873c189d52
You can use TAP for free on GPT:
https://foundation.symbiquity.ai/token-alignment-protocol-tap-testing-prompts
Here is TAP's Top 40 list in a doc: https://docs.google.com/document/d/1XqWNYuN0LyNhoNhMvTiG57iSDMEB229FPgp1mR9sEk4/edit?usp=sharing
This is the first time I've come across your voice and thinking. I sent you a connection request on LinkedIn. We have alignment, you have experience.
I am so sorry to clog your feed here! 😆 Please delete if you need to. I put your prompt test on our foundation page: https://foundation.symbiquity.ai/token-alignment-protocol-tap-testing-prompts
If you ask me, benchmarks are generally irrelevant to user experience.
I apologize for being critical, but you often claim that Grok 4 is the best model on the planet. However, I cannot understand: best at what? I use it occasionally on various tasks, and I find it inferior to GPT-4o in terms of accuracy, relevance of answers, and understanding of broad context.
I agree. Grok 4 sucks. But it scored best in the benchmarks in July. See the link in the article.
No need to be critical of me because I didn't design those benchmarks. 🙂
Ah, then it's fine. When digital cameras first emerged, people placed excessive importance on megapixels too.