

The prisoner’s dilemma and poker, starting with Liv Boeree

If you play poker, the odds are that game theory already interests and intrigues you.
After all, John von Neumann, the father of game theory, analyzed poker in “Theory of Games and Economic Behavior,” the book he co-wrote with Oskar Morgenstern in 1944, and David Sklansky, one of the first poker theorists, drew heavily on game theory for his strategies. And let us not forget how much poker owes to John Nash and his equilibrium.

The prisoner’s dilemma

All this was to introduce the prisoner’s dilemma, a problem in game theory that many of you will already know.
The problem looks very simple: two criminals are charged with a crime, and once arrested they can no longer communicate with each other. Investigators tell each of them that they can cooperate with the investigation (that is, confess) or not cooperate, with these possible outcomes:
If both cooperate, each serves 6 years in prison.
If neither cooperates, each serves 1 year in prison.
If only one of the two cooperates, he avoids punishment altogether, while the other is sentenced to 7 years.
The problem is interesting because intuition suggests that the best choice is not to cooperate, so that both criminals serve only one year in prison.

Is the solution disadvantageous?

The mathematical answer, however, points in exactly the opposite direction, because neither prisoner can be certain what the other will choose.
Each prisoner wants to minimize his own sentence, and by not cooperating he risks 1 or 7 years in prison, which is worse than the 0 or 6 years he risks by cooperating. Whatever the other does, cooperating costs him one year less.
In other words, the best joint outcome is not an equilibrium, because the “do not cooperate” strategy is dominated by the “cooperate” strategy.
For some, this amounts to a paradox: the result when both play the equilibrium (6 years in prison each) hurts both players more than the alternative in which neither cooperates (1 year each).
There is also a less widely accepted argument that not cooperating is the only option, if we assume that both prisoners reason with perfect logic and can therefore rule out every other possibility.
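The dominance argument above can be checked mechanically. What follows is a minimal sketch, not part of the original analysis: the sentences are the ones listed earlier, while the names `YEARS`, `strictly_dominates`, and `is_equilibrium` are made up for illustration, and "refuse" stands for "do not cooperate".

```python
# Years in prison for (my_choice, other_choice); lower is better.
# "cooperate" means cooperating with the investigators, as in the text.
YEARS = {
    ("cooperate", "cooperate"): 6,
    ("cooperate", "refuse"): 0,
    ("refuse", "cooperate"): 7,
    ("refuse", "refuse"): 1,
}
ACTIONS = ("cooperate", "refuse")


def strictly_dominates(a, b):
    """True if action a gives fewer years than b against every opponent action."""
    return all(YEARS[(a, other)] < YEARS[(b, other)] for other in ACTIONS)


def is_equilibrium(mine, theirs):
    """True if neither prisoner can shorten his own sentence by switching alone."""
    i_stay = all(YEARS[(mine, theirs)] <= YEARS[(alt, theirs)] for alt in ACTIONS)
    they_stay = all(YEARS[(theirs, mine)] <= YEARS[(alt, mine)] for alt in ACTIONS)
    return i_stay and they_stay


equilibria = [(a, b) for a in ACTIONS for b in ACTIONS if is_equilibrium(a, b)]

print(strictly_dominates("cooperate", "refuse"))  # True
print(equilibria)  # [('cooperate', 'cooperate')]
```

With these numbers the only stable pair is mutual cooperation with the investigators, even though mutual refusal would leave both prisoners better off.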

Liv Boeree and Split or Steal: a poker player grappling with the prisoner’s dilemma

Liv Boeree should need no introduction: a highly successful British poker player, for many years the face of PokerStars, and winner of a WSOP bracelet in a tag team event with her life partner Igor Kurganov. Besides being a poker player, Liv is also a model and, above all, an astrophysicist, and her solidly mathematical education can only have helped her career as a poker player… and as a quiz show contestant! Shortly before her success as a player, in 2007 to be exact, Liv took part in the program “Split or Steal,” grappling with precisely the prisoner’s dilemma (albeit in an alternative form).

In this game the two contestants compete for a sum of money, which in this case was £6,500.50, and after a brief discussion each chooses between Split and Steal.
If both choose “Split,” they go home with £3,250.25 each; if both choose “Steal,” they leave with empty pockets. But if only one of the two chooses “Steal,” he or she takes the entire sum and the other player gets nothing.

Poker psychology in Split or Steal

It is clear that Liv brought along more than a little useful poker knowledge, starting with the psychological game. Leaving aside for a moment that Liv is attractive, and that looks carry no small bias, Boeree bluffed really well, convincing poor Stuart to make the wrong choice. In hindsight it is all very obvious: smiling and nodding, pointing to the ball that supposedly contained “Split” (spoiler: it didn’t), complimenting her opponent (“I really like you”), playing on guilt (“Please don’t let me down”), and clapping as a pattern interrupt, a classic move used in hypnosis.

Differences with the prisoner’s dilemma

Between Split or Steal and the prisoner’s dilemma, however, there is one very big difference (although the basic idea is the same): here it is the cooperative choice, Split, that offers the smaller gain along with the risk of winning nothing, while Steal offers the greatest possible gain with the same risk of winning nothing.
Split wins £3,250 or £0; Steal wins £6,500 or £0. The right move might therefore seem to be Steal, but between two perfectly prepared players this can only end with both leaving empty-handed, which is the worst outcome of all. Paradoxically, it would be better for a player to choose Split even knowing he would lose: nothing would change for him, but at least someone would take the money home…

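Prisoner’s dilemma (years in prison, row player – column player):

                    Cooperate    Do not cooperate
Cooperate           6 – 6        0 – 7
Do not cooperate    7 – 0        1 – 1

Split or Steal (winnings in thousands of pounds, rounded, row player – column player):

         Split    Steal
Split    3 – 3    0 – 6
Steal    6 – 0    0 – 0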
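To see how Split or Steal differs from the prisoner’s dilemma, the same kind of check can be run on its payoffs. This is only an illustrative sketch: the pot is the £6,500.50 from the episode, and the helper names `weakly_dominates` and `is_equilibrium` are made up here.

```python
POT = 6500.50  # jackpot in pounds, from the episode described in the article

# Winnings for (my_choice, other_choice); higher is better.
WINNINGS = {
    ("split", "split"): POT / 2,
    ("split", "steal"): 0.0,
    ("steal", "split"): POT,
    ("steal", "steal"): 0.0,
}
ACTIONS = ("split", "steal")


def weakly_dominates(a, b):
    """True if a is never worse than b and strictly better at least once."""
    never_worse = all(WINNINGS[(a, o)] >= WINNINGS[(b, o)] for o in ACTIONS)
    sometimes_better = any(WINNINGS[(a, o)] > WINNINGS[(b, o)] for o in ACTIONS)
    return never_worse and sometimes_better


def is_equilibrium(mine, theirs):
    """True if neither contestant wins more by switching alone."""
    i_stay = all(WINNINGS[(mine, theirs)] >= WINNINGS[(alt, theirs)] for alt in ACTIONS)
    they_stay = all(WINNINGS[(theirs, mine)] >= WINNINGS[(alt, mine)] for alt in ACTIONS)
    return i_stay and they_stay


equilibria = [(a, b) for a in ACTIONS for b in ACTIONS if is_equilibrium(a, b)]

print(weakly_dominates("steal", "split"))  # True
print(equilibria)
# [('split', 'steal'), ('steal', 'split'), ('steal', 'steal')]
```

Steal is never worse than Split, but it is only strictly better when the other contestant splits. That is why a player facing a certain Steal loses nothing by choosing Split, as noted above.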


The really big difference between the two games, however, is that in Split or Steal the contestants can communicate, a crucial detail.
Usually, as in Liv’s video, a contestant tries to reassure the other person that he or she will split, and then sometimes betrays that trust and takes the whole jackpot.
Probably the best strategy is what poker slang calls a “toughness move”: telling the other person that, no matter what happens, we will choose Steal, but that we will give him his share of the jackpot after the game.
That way, the opponent’s only chance of making any money is to have faith and choose Split; otherwise he is certain to leave with nothing. For the record, this happened once on the show, and in the end the player who had sworn to choose Steal surprised everyone by choosing Split. He just wanted to make sure that the other person would not choose Steal under any circumstances.


(We will tell you more: we could even secure a larger share by convincing the opponent that we will choose Steal and then give him only 40 percent of the prize. Even then, he is still better off taking that amount than nothing.)
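The logic of this commitment can be sketched from the opponent’s side. The example assumes the opponent is fully convinced we will choose Steal. The `trust` parameter (how likely he thinks it is that we really pay him afterwards) and the function name `opponent_ev` are hypothetical, introduced only for illustration.

```python
POT = 6500.50  # jackpot in pounds, from the episode described in the article


def opponent_ev(choice, trust, share):
    """Expected winnings for the opponent once we have committed to Steal.

    trust: his estimate (0 to 1) that we will really pay him after the show
    share: fraction of the pot we promise him (0.5, 0.4, ...)
    """
    if choice == "steal":
        return 0.0  # Steal against Steal: nobody wins anything
    return trust * share * POT  # Split against Steal: paid only if we keep our word


for share in (0.5, 0.4):
    ev_split = opponent_ev("split", trust=0.5, share=share)
    ev_steal = opponent_ev("steal", trust=0.5, share=share)
    print(f"share={share}: split={ev_split:.2f}, steal={ev_steal:.2f}")
```

Under these assumptions Split is worth more than Steal to the opponent for any positive trust and any positive share, which is why a 40 percent offer still works. The whole argument depends on the threat to Steal being believed.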

The prisoner’s dilemma and poker: where can it be applied?

The prisoner’s dilemma cannot be applied directly to poker, for several reasons. Not the least of these are that in the prisoner’s dilemma both players know every possible payoff (in poker, information is incomplete because cards are hidden) and that the dilemma is played only once (in poker, the same situations often repeat).

You may have noticed that the prisoner’s dilemma situation is not easy to find in poker, but we can always learn something from it.

One example is the value of a randomizer and of balanced frequencies. In theory, if both players flipped a coin to decide what to do, each would average (6 + 0 + 7 + 1) / 4 = 3.5 years in prison, much better than the 6 years predicted by theory, which would make it the better choice.

The catch is that if two players could agree and trust each other to that extent, they might as well agree that neither cooperates.
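The coin-flip average can be verified by enumerating the four equally likely outcomes. This is a small sketch using the sentences given earlier; `YEARS` and `coin_flip_years` are illustrative names, and "refuse" stands for "do not cooperate".

```python
from fractions import Fraction

# Years in prison for (my_choice, other_choice), as listed in the article.
YEARS = {
    ("cooperate", "cooperate"): 6,
    ("cooperate", "refuse"): 0,
    ("refuse", "cooperate"): 7,
    ("refuse", "refuse"): 1,
}
ACTIONS = ("cooperate", "refuse")
half = Fraction(1, 2)  # a fair coin for each prisoner

# Each of the four outcomes has probability 1/2 * 1/2 = 1/4.
coin_flip_years = sum(
    half * half * YEARS[(mine, theirs)] for mine in ACTIONS for theirs in ACTIONS
)

print(float(coin_flip_years))        # 3.5
print(YEARS[("refuse", "refuse")])   # 1, the sentence if both simply agree to refuse
```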

What it does teach, however, is the importance of understanding your opponent and knowing how far to trust what you know. The dilemma becomes much easier if we know the other person’s decision, and the expected value changes if we know that the opponent has particular tendencies. To all intents and purposes, it is the shift from GTO to exploitative play.
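How the expected value moves with the opponent’s tendencies can be sketched with a single parameter: the probability that the other prisoner cooperates with the investigators. The function name `expected_years` and the sample probabilities are illustrative assumptions, not figures from the article.

```python
def expected_years(my_choice, p_other_cooperates):
    """Expected sentence given how often the other prisoner cooperates."""
    p = p_other_cooperates
    if my_choice == "cooperate":
        return 6 * p + 0 * (1 - p)  # 6 years if he cooperates too, 0 if he refuses
    return 7 * p + 1 * (1 - p)      # 7 years if he cooperates, 1 if he refuses too


for p in (0.0, 0.25, 0.5, 1.0):
    coop = expected_years("cooperate", p)
    refuse = expected_years("refuse", p)
    print(f"p={p}: cooperate={coop}, refuse={refuse}, gap={refuse - coop}")
```

In this particular payoff table the gap stays at one year for every value of p, so a read on the opponent changes the expected sentence but not the decision. In poker, where no option dominates in this way, the same kind of calculation is what lets a read change the play.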

The Levels of Thought

Even though mathematics has an answer, the prisoner’s dilemma is by no means exempt from what we know as leveling, or levels of thinking.

Level 1. We both do not confess
Level 2. If he doesn’t confess, I confess and get away with it
Level 3. If he confesses, I must also confess
Level 4. If you understood level 3, you know that not confessing is the best choice for both of you
Level 5. …

In poker, as in this dilemma, game theory was born precisely so that we would not have to worry about these things (or so that we could defend against them).
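One way to picture the first three levels is as repeated best responses: each level plays the best reply to the level before it. The sketch below assumes that reading of leveling, which is our interpretation and not a standard solver routine. `best_response` and `level_path` are made-up names, and "confess" corresponds to cooperating with the investigators.

```python
# Years in prison for (my_choice, other_choice); lower is better.
YEARS = {
    ("confess", "confess"): 6,
    ("confess", "stay_silent"): 0,
    ("stay_silent", "confess"): 7,
    ("stay_silent", "stay_silent"): 1,
}
ACTIONS = ("confess", "stay_silent")


def best_response(other_choice):
    """The choice that minimizes my sentence against a fixed opponent choice."""
    return min(ACTIONS, key=lambda mine: YEARS[(mine, other_choice)])


def level_path(start, levels):
    """Start from a level-1 assumption and best-respond to it repeatedly."""
    path = [start]
    for _ in range(levels):
        path.append(best_response(path[-1]))
    return path


print(level_path("stay_silent", 3))
# ['stay_silent', 'confess', 'confess', 'confess']
```

Once the chain reaches "confess" it stops moving, which is what makes it an equilibrium: further leveling on pure self-interest does not change the answer.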

If we consider the overall EV of a move and feed the situation to a solver, it will find the best possible balance. That balance should tend toward parity between the two players, and it breaks as soon as either of them adopts suboptimal play.

We will then find ourselves making moves recommended by game theory that sometimes lose, but that keep our game balanced and make us successful in the long run.

An “impossible” theory

This article was drawing to a close when we came up with a possible poker version of the prisoner’s dilemma. Let us know if you agree:

Playing from the blinds means losing money, but letting opponents steal them all the time is much more costly. The best way for no one to lose would be to agree that nobody steals the blinds and everyone plays only for value, but the moment one player takes advantage, the others have to defend themselves. And thus all the aggression of the steal dynamic is born.



© Top1Percent Jul 21, 2024 All rights reserved
