Breakthrough

The first AI program to beat professional human players in no-limit Texas Hold’em

Skill Ratings

Incorporates an “action informed value assessment tool” (AIVAT) for cutting-edge, in-depth poker analysis

Creators

Developed by the Computer Poker Research Group at the University of Alberta, Canada

The Power of AIVAT

Poker skill under the microscope – the perfect tool to hone one’s craft

Developments in poker AI will lead to a completely new way of understanding the game. Just as Doyle Brunson’s strategy book “Super System” raised the bar for mainstream poker ability in the late 70s, a new “action informed value assessment tool” (AIVAT) developed by the Computer Poker Research Group at the University of Alberta is set to have a similar and perhaps even greater impact on the world’s most popular eSport.

The trouble with poker is you often get away with your mistakes. A clear example is getting your chips ‘all-in’ in bad shape (let’s say queen-jack suited against two kings before the flop). If you get lucky with the five community cards dealt out (they contain an eight, nine and ten to make you a ‘straight’ for example) you will scoop a huge reward regardless. A more obscure example is getting away with a ‘bad bluff’. With complete ‘air’ you might get your opponent to fold their superior holding on the river (you holding just an unpaired ‘high card’ against their ‘top pair’ say).

But maybe your opponent’s strategy is to call 99 times out of 100 in that exact spot, and you simply got lucky that this was the 1 time in 100 they folded. What’s more, given the action of the hand, maybe a single pair is right at the bottom of your opponent’s ‘range’ of likely holdings, and they would have called with practically anything else they were likely to have. You’ve netted a positive result in chips, but in theory the play is clearly a mistake.

The team in Alberta have developed a new AI called “DeepStack”, which is the first program to beat professional human players in no-limit poker. A by-product of the way DeepStack is built is a method of analysing and rating poker play that overcomes the shortcomings of just counting chips. Not only does AIVAT return adjusted scores for each hand (more proportionate to the skill involved and less dependent on chance), but the data is also incredibly insightful for identifying strengths and weaknesses in poker strategy.

Below are a few key hands from a recent match between DeepStack and Martin Sturc, president of Austria’s Poker Federation, which help us take a closer look at how AIVAT works…

Hand 71

Here we see the biggest possible swing in chips (a full double-up for Martin), yet AIVAT rates the action as having gone roughly as expected, in fact with a small positive gain for DeepStack. How can this be?

The first thing to note about AIVAT is that it does not factor in only DeepStack’s exact holding (the ace-eight in this case). Instead, the technique computes a “range value” for Martin’s hand, given all possible cards that DeepStack could hold, weighted by how likely DeepStack would be to act in the same way in each case. Since DeepStack is a computer program with a fixed strategy, this range of possible holdings is known, so the value can be calculated and is a true measure of the expected gain or loss in chips given Martin’s hand, the community cards and DeepStack’s betting line.
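As a rough illustration of the weighting described above, the sketch below averages Martin’s payoff over a handful of DeepStack holdings, weighting each by how likely DeepStack is to hold those cards and play them the same way. The holdings, probabilities and payoffs are made-up placeholders rather than figures from the match, and `range_value` is a hypothetical helper name, not part of DeepStack or AIVAT.

```python
def range_value(holdings):
    """Weighted average of Martin's payoff over DeepStack's possible holdings.

    holdings: list of (label, reach_probability, martin_payoff_in_chips).
    reach_probability is how likely DeepStack is to hold those cards AND
    follow the observed betting line; the weights need not sum to 1.
    """
    total_weight = sum(p for _, p, _ in holdings)
    if total_weight == 0:
        raise ValueError("no holding is consistent with this betting line")
    return sum(p * payoff for _, p, payoff in holdings) / total_weight


# Hypothetical numbers for illustration only.
example_range = [
    ("strong hand", 0.05, -2000),
    ("medium hand", 0.15, 500),
    ("weak hand", 0.30, 1500),
]
value = range_value(example_range)
print(round(value))
```

Because the weights are normalised inside the function, only their relative sizes matter: a holding DeepStack rarely plays this way contributes little to the range value, however bad it would be for Martin.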

But there are additional sources of luck that come from these specific dealt cards and DeepStack’s chosen actions. The AIVAT technique therefore makes further “luck adjustments” to more accurately determine how skilfully Martin played his hand. Knowing DeepStack’s strategy for all possible holdings, plus looking at other lines of play that DeepStack could have followed, yields this immensely powerful tool for generating meaningful scores.

In this example, the range value of Martin’s ace-deuce (combined with the board that made him a straight) is a gain of 11134 chips. This is the average result of playing his hand the same way against DeepStack’s entire range. Two obviously lucky occurrences in this hand are the miraculous river card (making Martin’s straight) and the fact that DeepStack elected to call Martin’s all-in bet instead of folding.

So firstly, to assess how lucky the random cards were for Martin, AIVAT estimates their value in each case. Here we see that AIVAT attributes an average loss of 26 chips from simply being dealt ace-deuce on the button compared with any other two cards. However, the flop containing an ace certainly helps, and AIVAT attributes an average gain of 74 chips given this specific flop over any other random possibility. The turn is a slightly unlucky card (an average loss of 84 chips compared to any other possible turn card), but the river is like manna from heaven (a whopping average gain of 5707 chips compared to any other river card that could have been dealt in its place).

Next, AIVAT also determines whether Martin was fortunate in each of DeepStack’s randomized actions. When compared to folding or raising pre-flop, which DeepStack will indeed do some of the time in this spot, AIVAT suggests that Martin gained an average of 22 chips from a call instead. Again, this is not just factoring in being up against ace-eight, but being up against any two cards that DeepStack could hold. Each of DeepStack’s actions is analysed in this way and we again see a very large value at the river: the dramatic effect of Martin’s all-in bet being called. Against all possible holdings with which DeepStack would call 300 pre-flop, check and then re-raise to 900 on the flop, lead out 1200 on the turn, and then check the river, the average difference in results from Martin going all-in and getting called, as opposed to Martin going all-in and DeepStack folding, is a gain of 6997 chips.
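One way to picture these adjustments, for cards and for DeepStack’s actions alike, is as the value of what actually happened minus the probability-weighted average over everything that could have happened. The sketch below assumes that form; the text does not spell out AIVAT’s exact formula, and the function name, values and probabilities are all hypothetical.

```python
def luck_correction(outcome_values, probabilities, realised):
    """Value after the realised outcome minus the average over all outcomes.

    outcome_values: dict mapping each possible outcome (a card dealt, or an
        action DeepStack might take) to Martin's estimated value after it.
    probabilities: dict mapping each outcome to its known probability.
    realised: the outcome that actually occurred.
    """
    expected = sum(probabilities[o] * outcome_values[o] for o in outcome_values)
    return outcome_values[realised] - expected


# Hypothetical river decision: DeepStack folds 60% and calls 40% of the time.
values = {"fold": 1000, "call": 9000}
probs = {"fold": 0.6, "call": 0.4}

call_luck = luck_correction(values, probs, "call")
fold_luck = luck_correction(values, probs, "fold")

# Averaged over DeepStack's own strategy, the corrections cancel out.
average_luck = probs["call"] * call_luck + probs["fold"] * fold_luck
print(round(call_luck), round(fold_luck), round(average_luck))
```

The last line is the useful property: a correction built this way averages to zero, so subtracting it removes luck from a single hand without tilting the long-run score in either player’s favour.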

And so in summary for Hand 71, we would expect Martin to gain 11134 chips from the way he played his ace-deuce, given the community cards and DeepStack’s actions. But AIVAT additionally estimates that he got lucky by a total of 11434 chips. Martin’s net result is therefore an overall rating of 11134 - 11434 = -300.
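The arithmetic for Hand 71 can be laid out as a short ledger using only the figures quoted above. One step is an assumption rather than something the text states: that the 11434 total is simply card luck plus action luck, so the action share can be inferred as the remainder.

```python
# Figures quoted in the text for Hand 71 (chips, from Martin's point of view).
range_value = 11134   # expected gain against DeepStack's whole range
card_luck = {"hole cards": -26, "flop": 74, "turn": -84, "river": 5707}
total_luck = 11434    # AIVAT's total luck estimate

card_luck_total = sum(card_luck.values())

# Assumption: total luck = card luck + action luck. The text itemises only
# two of DeepStack's actions, so the action share is inferred, not quoted.
action_luck_total = total_luck - card_luck_total

rating = range_value - total_luck

print(card_luck_total, action_luck_total, rating)  # 5671 5763 -300
```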

So how good are these AIVAT estimates? The short answer is ‘pretty damn good’. But for those seeking further comfort, DeepStack’s intuition (or ‘heuristic evaluation of states’) comes from hundreds, perhaps thousands, of CPU years of computation. Advances in the understanding of deep neural networks have allowed DeepStack to teach itself the value of specific poker scenarios. By additionally being able to simulate a vast number of games against itself in real-time, this new value assessment tool, of cards and of actions, surpasses what human beings could possibly hope to achieve in several lifetimes of gameplay experience.

While the chips won and lost give a somewhat distorted view of the skill involved in a single hand of poker, over time they will converge on the same conclusion as the AIVAT ratings. The DeepStack research team has proven that AIVAT significantly reduces the variance in poker results in a fair and unbiased fashion. What AIVAT therefore gives us is the ability to assess a player’s true win-rate after vastly fewer hands than if we were just using chips, with additional in-depth insights into the merits and drawbacks of individual plays. By analysing trends in one’s own AIVAT data one can appreciate the game of poker and improve one’s own ability like never before. The International Federation of Match Poker is working with the University of Alberta’s Computer Poker Research Group to power an official skill rating system that adopts these revolutionary AI techniques. It is hoped that this data will be used to determine the finalists at the next IFMP World Poker Championships, and ultimately who is the best poker player on the planet.
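To see why lower variance translates into fewer hands, recall that the standard error of an average over n independent hands is the per-hand standard deviation divided by the square root of n. The sketch below applies that relationship; the standard deviations and the target precision are hypothetical round numbers, not measurements from the DeepStack matches, and `hands_needed` is an illustrative helper.

```python
import math


def hands_needed(std_dev_per_hand, target_std_error):
    """Hands required for the average result to reach a target standard error.

    Assumes hands are independent, so std_error = std_dev / sqrt(n).
    """
    return math.ceil((std_dev_per_hand / target_std_error) ** 2)


# Hypothetical per-hand standard deviations, in chips.
chips_sd = 2000.0    # scoring by raw chips won or lost
rating_sd = 500.0    # scoring by a luck-adjusted rating
target = 50.0        # desired standard error of the win-rate estimate

hands_with_chips = hands_needed(chips_sd, target)
hands_with_rating = hands_needed(rating_sd, target)
print(hands_with_chips, hands_with_rating)  # 1600 100
```

In this made-up case, cutting the standard deviation to a quarter cuts the hands required to a sixteenth, because the saving grows with the square of the reduction.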

Four more interesting hands from the Martin Sturc v DeepStack match:

Hand 223

“How many times shall we run it?”

Here’s another example of the players getting their chips all-in, but this time before the flop. We again see a huge difference between the chips and AIVAT score.

Martin is clearly fortunate to be dealt pocket queens, but in this all-in scenario, given the betting, we see that he is a massive statistical underdog against the full range of hands DeepStack will take all-in pre-flop in this way (one must assume mainly KK and AA). Additionally, we can deduce that the actual holding of ace-five suited is right at the bottom of DeepStack’s range here.

The AIVAT “all-in equity” chance correction is equivalent to running the remaining board cards infinitely many times, against each of DeepStack’s possible holdings, and taking the average.
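The “run it infinitely many times” idea can be sketched for a single fixed match-up: as the number of run-outs grows, the average result settles on the figure implied by the hand’s equity. The win probability and stake below are hypothetical, ties are ignored, and a seeded random draw stands in for dealing real boards. Repeating this for each of DeepStack’s possible holdings and weighting by how likely each is would give the full correction described above.

```python
import random


def run_it(times, win_probability, stake, rng):
    """Average chip result of dealing out the same all-in `times` times."""
    total = 0
    for _ in range(times):
        total += stake if rng.random() < win_probability else -stake
    return total / times


rng = random.Random(0)   # seeded so the sketch is repeatable
win_probability = 0.2    # hypothetical equity for the underdog
stake = 20000            # hypothetical chips at risk

# The limit of running it infinitely many times.
exact = (2 * win_probability - 1) * stake

once = run_it(1, win_probability, stake, rng)         # all or nothing
many = run_it(200_000, win_probability, stake, rng)   # close to `exact`
print(round(exact), once, round(many))
```

A single run-out can only ever return plus or minus the full stake, which is exactly the noise the all-in correction removes.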

Martin is expected to lose 16546 chips given this betting pattern against DeepStack and holding QQ. But AIVAT deems him to have been very unlucky in this scenario, to the tune of 16658 chips. Hence, the hand played out roughly as expected, with just a slight positive score for Martin overall (-16546 + 16658 = +112 chips). As ever, the chips alone neither tell the complete story nor give us a means of accurately rating the play.

Hand 225

“No more cold decks”

The poker term “cold decked” refers to being dealt a very good hand, only to be up against an even better one. Martin rivering trip jacks against a full house here is one such example. However, the beauty of AIVAT analysing the situation against DeepStack’s full range is that these occasional unlucky cold deck scenarios are eliminated. While the jack on the river is disastrous in the actual KJ vs AJ showdown (which is why Martin lost a big pot here), it is actually a lucky card for Martin given all of the holdings DeepStack could have had at this stage.

The less obvious insight here is that Martin got quite unlucky with DeepStack’s final re-raise. The AIVAT values can really help players understand and appraise the merits and drawbacks of their own strategies.

Hand 103

“Make sense of that!”

Here’s a hand that would be very hard to unravel without AIVAT. DeepStack flops ‘the nuts’ with a straight, but hitting top two pair couldn’t be much better for Martin’s hand either (another cold deck example). Against DeepStack’s full range Martin is in fact expected to win 981 chips with 68 and this betting pattern. And it is actually the turn card that affects the score by the greatest amount. Martin’s two pair is devalued substantially with four cards to a straight on the board. Martin is lucky that DeepStack’s randomized actions are to check down the turn and river, and on balance the expected gain in chips, despite the extremely unlucky turn card, nets Martin a very healthy score for this hand.

Hand 66

“To bluff or not to bluff…”

Martin was probably very pleased to get away with a cheeky re-raise bluff in this hand. But the large AIVAT value against the fold indicates that DeepStack is more likely to call or re-raise here. So Martin was fortunate, and the play is not a winning one long-term.