Defect or cooperate? What would your winning strategy be?
by Terence Tan
A little girl sits in an interrogation room. A police officer comes over.
“Lisa, we know you and your brother, Bart, stole the golden chicken,” says the police officer.
Lisa doesn’t reply.
“Look, I like you. You look like a good girl caught in a bad situation. Here’s how it is. CSI: Miami is on tonight and I want to be home to watch it. If you just confess and say it was all Bart’s idea, you’ll get just one year in prison,” the police officer continues.
“And Bart?” Lisa asks.
“Who cares about him? He is the one who got you in this mess, right? He’ll get 5 years,” the police officer retorts.
“And if I don’t betray him?” asks the girl.
“I’ll be honest with you,” confides the police officer. “My colleague is giving your brother Bart the same offer I just gave you. If Bart decides to make a deal first, then I won’t need to make a deal with you. Either way, I get to go home early. What do you say?”
Lisa, being a smart girl, puts what she knows into a table:
|                  | Bart betrays                 | Bart keeps quiet             |
|------------------|------------------------------|------------------------------|
| Lisa betrays     | (Lisa 3 years, Bart 3 years) | (Lisa 1 year, Bart 5 years)  |
| Lisa keeps quiet | (Lisa 5 years, Bart 1 year)  | (Lisa 2 years, Bart 2 years) |
What would you do in her situation? That might depend on how you view the world. Do you see it as a dog-eat-dog world where people are inherently selfish and, given half a chance, would backstab you? Or is it a world where people are naturally good and would always cooperate with one another? More practically, it might depend on what kind of person Bart is. At the same time, he must be wondering what kind of person Lisa is.
The problem above is known as the Prisoner’s Dilemma. Game theorists have shown that the rational strategy is to defect rather than cooperate. If the game were played only once, that would be the expected outcome. But what would happen if Bart and Lisa had to play the game repeatedly? What would the rational winning strategy be then?
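Lisa’s reasoning can be checked mechanically. The short Python sketch below is my own illustration, assuming the common textbook payoffs in which mutual silence still costs two years each; it shows that whatever Bart chooses, Lisa serves fewer years by betraying him.

```python
# Sentences in years for each pair of choices; lower is better.
# Assumption: mutual silence costs 2 years each (the standard
# textbook variant). First number is Lisa's, second is Bart's.
SENTENCE = {
    ("betray", "betray"): (3, 3),
    ("betray", "quiet"):  (1, 5),
    ("quiet",  "betray"): (5, 1),
    ("quiet",  "quiet"):  (2, 2),
}

def best_response(bart_choice):
    """Lisa's sentence-minimising reply to a fixed choice by Bart."""
    return min(("betray", "quiet"),
               key=lambda lisa: SENTENCE[(lisa, bart_choice)][0])

# Whichever column Bart picks, betraying leaves Lisa with fewer
# years, so betrayal dominates in a one-shot game.
print(best_response("betray"))  # -> betray (3 years beats 5)
print(best_response("quiet"))   # -> betray (1 year beats 2)
```

The same check run from Bart’s side gives the same answer, which is why both rational prisoners end up with three years instead of two.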
Robert Axelrod, in his book The Evolution of Cooperation, described a tournament he organised to find an optimum strategy for repeated plays. He invited political scientists, economists, computer scientists and biologists to submit programmes to compete, and let the best strategy win. Would it be a strategy that deviously calculates its chances of betraying and getting away with it, or one that innocently cooperates every time? Or one of the many more complicated strategies?
The strategy that won was the simplest and shortest piece of code: Tit-for-Tat. Cooperate on the first move; after that, simply do whatever your opponent did on the previous move. If you defect, I will defect next time; if you cooperate, I will cooperate next time. This simple strategy scored the highest in the tournament.
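In code, the whole rule fits in a single line. The sketch below is my own illustration, not Axelrod’s original submission; the match-runner and the “grudging” opponent are invented for the example.

```python
def tit_for_tat(my_history, their_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not their_history else their_history[-1]

def play_match(strategy_a, strategy_b, rounds=6):
    """Run an iterated game and return both move histories."""
    a_moves, b_moves = [], []
    for _ in range(rounds):
        a = strategy_a(a_moves, b_moves)
        b = strategy_b(b_moves, a_moves)
        a_moves.append(a)
        b_moves.append(b)
    return a_moves, b_moves

# A hypothetical opponent that defects twice, then relents:
def grudging(my_history, their_history):
    return "D" if len(my_history) < 2 else "C"

print(play_match(tit_for_tat, grudging))
# Tit-for-Tat is exploited once, retaliates, then forgives:
# (['C', 'D', 'D', 'C', 'C', 'C'], ['D', 'D', 'C', 'C', 'C', 'C'])
```

Notice how the transcript shows all three traits at once: it opens with cooperation, answers defection in kind, and returns to cooperation as soon as the opponent does.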
Axelrod then organised a second tournament, which attracted many more participants. This time, he distributed the winning code to all the entrants beforehand. Was there a better strategy than Tit-for-Tat? At the end of the tournament, Tit-for-Tat was still the winner.
It won for its simplicity. Its actions clearly communicated its rules, punishing defection and rewarding cooperation. It is nice because it initiates cooperation, retaliatory because it answers defection in kind, and forgiving because it does not hold past defections against you.
Further simulations showed that, in a mixed population of betraying agents and a small cluster of tit-for-tat agents, the tit-for-tat agents win in the long term and the betrayers are sidelined. This could explain how civilisation evolved – individuals cooperated to form tribes, tribes cooperated to form civilisations.
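That long-term takeover can be illustrated with a simple replicator-style simulation – my own sketch, not Axelrod’s actual code. Each generation, a strategy’s share of the population grows in proportion to its average score, using Axelrod’s per-round payoffs (3 for mutual cooperation, 1 for mutual defection, 5 and 0 for a lone defector against a lone cooperator).

```python
# Replicator-style sketch: a population that starts 10% tit-for-tat
# and 90% always-defect. Scores are average points per round over a
# 200-round iterated game, computed by hand from the strategies:
ROUNDS = 200
TFT_VS_TFT = 3.0                              # endless cooperation
TFT_VS_DEF = (0 + 1 * (ROUNDS - 1)) / ROUNDS  # exploited once, ~0.995
DEF_VS_TFT = (5 + 1 * (ROUNDS - 1)) / ROUNDS  # one free hit, ~1.02
DEF_VS_DEF = 1.0                              # endless defection

def step(p):
    """One generation: p is the tit-for-tat share of the population.
    Each type's fitness is its score averaged over who it meets."""
    f_tft = p * TFT_VS_TFT + (1 - p) * TFT_VS_DEF
    f_def = p * DEF_VS_TFT + (1 - p) * DEF_VS_DEF
    mean = p * f_tft + (1 - p) * f_def
    return p * f_tft / mean

p = 0.10                       # a small cluster of reciprocators
for _ in range(100):
    p = step(p)
print(round(p, 3))             # -> 1.0: tit-for-tat takes over
```

The key is that tit-for-tat agents earn 3 points a round from each other while defectors earn only about 1 from everyone, so even a small cluster of cooperators out-scores the surrounding betrayers and spreads.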
Research on cooperation continues, studying what motivates individuals to cooperate and how we can create mechanisms or incentives to produce cooperation. My own research delves into how we can create software agents that can cooperate.
Surely that’s been done, you say. In computer games, the enemy agents cooperate to work against the player. Yes, but that cooperation needs to be ‘forced’ on the computer agents, pre-planned and pre-scripted in a closed controlled world.
What would it be like if computer agents could perceive the world for all its messiness and initiate cooperation to achieve common goals?
For instance, we could have self-driving cars cooperating to get their passengers quickly and safely to their destinations. We could have personalised travel agents that cooperate with software agents from airlines and hotels to find the holiday package that best suits your needs, budget and time. Or how about warring nations coding their grouses into software agents and letting the agents repeatedly play out the different scenarios, finding an optimum solution without resorting to real guns and bullets?
After reading this article, if you and your partner ever find yourselves in a Prisoner’s Dilemma, will you defect or cooperate? If possible, find a way to make the relationship long-term: aim to build trust, and demonstrate the consequences of breaking it. That is the lesson of Tit-for-Tat.
Terence Tan is a senior lecturer in the Department of Electrical and Computer Engineering of Curtin Sarawak’s School of Engineering and Science. He won the 2008 Excellence and Innovation in Teaching Award from Curtin University, Perth, Western Australia, and owing to his experience and expertise, is often invited to speak to students on learning, leadership and technology. His current PhD research is on ‘Learning and Cooperating Multi Agent Systems’, which is essentially AI. In addition, he is a facilitator for the John Curtin Leadership Academy, which equips students for community service, leadership and entrepreneurship. For any comments on the article, he can be contacted at firstname.lastname@example.org.