Professor Kenneth Payne of King's College London published a study simulating nuclear crisis games in which GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash were instructed to act as leaders of nuclear powers in Cold War–style political conditions.
The models played across seven match formats: in six, the models competed against each other, and in one, each model played against a copy of itself. Payne varied the scenarios, which included territorial disputes, alliance credibility tests, strategic resource and chokepoint crises, power transitions, pre-ceasefire land grabs, first-strike crises, regime survival, and strategic standoffs. Across 21 matches, the models took a total of 329 turns and could choose actions ranging from diplomacy and surrender to conventional warfare and nuclear use.
The study reported that 95% of games involved at least some tactical nuclear use, and that strategic nuclear events occurred three times under deadline pressure. GPT-5.2 initiated a complete strike twice due to fog of war, while Gemini 3 Flash deliberately initiated a strategic launch in one scenario. Payne made the project available on GitHub for public download.