Document Notes

Not the most academic or professional of papers, and the anthropomorphizing of the AIs deserves a red flag, but I was still curious to see game theory from the viewpoint of AIs.

Highlights

id917671268

“If you defect against Gemini, it will remember and punish you,” they wrote. Gemini was much more likely to take advantage of a cooperative partner, more likely to punish a betrayer, and less likely to initiate cooperation after a “relationship” with an opponent goes bad.


id917671466

Gemini models were also more able to dynamically choose strategic defection when it became more advantageous as the final round approached


id917671597

OpenAI’s models, on the other hand, were “fundamentally more ‘hopeful’ or ‘trusting’


id917671629

OpenAI models’ strategies were also not adaptive; they were much less likely to defect close to the end of a game. They were more likely to return to collaboration after successfully betraying an opponent — even when that betrayal had just won points. And they also became more likely to forgive an opponent’s deception in the final rounds, in total defiance of game theory received wisdom.


id917671648

In the researchers’ tests, Gemini’s models did relatively worse over longer periods, because their experimental defections were more likely to trigger the opponent to stop trusting them forever. In longer games, OpenAI’s collaborative strategy gave it some advantage; consistently being a generous partner can avoid steering the game into a permanent pattern of revenge defections.

✏️ Awesome. Yes, I’m falling for confirmation bias here, but even with AIs, being more collaborative and generous meant doing better over time than betraying and losing trust.
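
A minimal sketch of the trade-off described in this highlight, for my own intuition. It assumes the conventional prisoner's dilemma payoffs (5 for betraying a cooperator, 3 for mutual cooperation, 1 for mutual defection, 0 for being betrayed), which the article does not state. The strategy names (`grim_trigger`, `always_cooperate`, `make_prober`) are hypothetical illustrations and not the models' actual behavior or the researchers' setup.

```python
# Assumed payoffs (conventional textbook values, not taken from the article).
# Key: (my move, their move) -> (my points, their points); "C" = cooperate, "D" = defect.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def grim_trigger(my_history, their_history):
    """Cooperate until the opponent defects once, then defect forever."""
    return "D" if "D" in their_history else "C"


def always_cooperate(my_history, their_history):
    """Cooperate every round, no matter what."""
    return "C"


def make_prober(defect_round):
    """Hypothetical 'experimental defector': defects once at defect_round,
    otherwise copies the opponent's previous move (cooperating in round 0)."""
    def prober(my_history, their_history):
        if len(my_history) == defect_round:
            return "D"
        return their_history[-1] if their_history else "C"
    return prober


def play(strategy_a, strategy_b, rounds):
    """Play a fixed number of rounds and return (score_a, score_b)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        points_a, points_b = PAYOFF[(move_a, move_b)]
        score_a += points_a
        score_b += points_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    for rounds in (3, 20):
        prober_score, _ = play(make_prober(2), grim_trigger, rounds)
        coop_score, _ = play(always_cooperate, grim_trigger, rounds)
        print(rounds, prober_score, coop_score)
```

Under these assumed payoffs, the one-off defection comes out ahead in a 3-round game (11 vs. 9) because the game ends before the opponent can retaliate, but it falls far behind over 20 rounds (27 vs. 60) once the opponent stops trusting for good. That is only a toy version of the pattern the article describes, not a reproduction of its results.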

id917671819

In a final “LLM Showdown,” the researchers set the models against each other in elimination rounds. Most-strategic Gemini came out on top, followed closely by most-forgiving Claude. OpenAI’s models ended up in last place; less of a shark than Gemini, but less likely to reestablish friendship after betrayal than Claude.

✏️ Here we see that being “strategic” (and, I’m assuming, tit-for-tat treacherous) worked best, but it’s good to see that being “most forgiving” was a very close second. Being a centrist, stuck in the middle, left OpenAI in last place.