nonpareil
25 points
Recently signed up here, and am a big fan of all your PLO videos so far. I really enjoy hearing you reason through the hands, put people on ranges, and decide whether or not to attack certain textures. You've given me a lot to think about in my own game. Thanks for taking the time to share your thought process with us!
Oct. 29, 2013 | 9:13 p.m.
Hello all. As a poker programming enthusiast who has coded and experimented with several flavors of the CFR algorithm (which was used to build both bots), I hope you'll permit me to highlight a few things about these bots and some of their potential weaknesses.
Both bots use a pre-computed strategy that does not change between hands. At every decision point, they just look up their strategy and follow it, e.g., after this betting sequence, with this strength hand, check 75% and bet the other 25% of the time. They do not adapt to their opponent's ranges or sizing frequencies. (There's a trivial exception for hyperborean_iro that's safe to ignore here.)
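To make the "look up and follow" idea concrete, here's a rough Python sketch. The table contents, the key format, and the names `STRATEGY` and `choose_action` are all made up for illustration; this is not how either bot actually stores its strategy.

```python
import random

# Hypothetical pre-computed table (illustrative values only):
# (betting sequence, hand-strength bucket) -> action probabilities.
STRATEGY = {
    ("flop:check", "medium"): {"check": 0.75, "bet": 0.25},
}

def choose_action(betting_sequence, bucket, rng=random):
    """Sample an action from the fixed table; nothing is learned during play."""
    probs = STRATEGY[(betting_sequence, bucket)]
    actions = list(probs)
    weights = [probs[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]
```

Calling `choose_action("flop:check", "medium")` many times should give a bet roughly a quarter of the time, no matter what the opponent has been doing.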
Instead of seeing specific cards, the bots will group situations with similar board texture, hand strength, and hand potential. For example, they do not explicitly realize they have a nut flush draw. It's more like they know they have some medium hand strength with significant potential to beat most hands by the river. The nut flush draw is just one instance of that situation. A key result of this is that the bots will not fully appreciate the importance of blockers. For example, Ah2s and As2h are likely to be played identically on the river when the board has three hearts, since they have the same hand strength.
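Here's a toy sketch of why the blocker gets lost. I'm assuming the grouping is done purely on a hand-strength number with equal-width buckets, and the strength values below are invented; the real abstractions also use hand potential and are more elaborate than this.

```python
def bucket_index(hand_strength, num_buckets=10):
    """Map a hand strength in [0, 1] to one of num_buckets equal-width buckets."""
    if not 0.0 <= hand_strength <= 1.0:
        raise ValueError("hand_strength must be between 0 and 1")
    # Clamp so that a strength of exactly 1.0 falls in the top bucket.
    return min(int(hand_strength * num_buckets), num_buckets - 1)

# Invented river strengths on a three-heart board: both hands are ace-high,
# so they score the same even though only Ah2s blocks the nut flush.
river_strength = {"Ah2s": 0.31, "As2h": 0.31}

same_bucket = bucket_index(river_strength["Ah2s"]) == bucket_index(river_strength["As2h"])
```

Once both hands land in the same bucket, the strategy table has no way to treat them differently.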
The bots have "imperfect recall" with regard to card information as well. This means they have little (possibly zero) memory of board textures from previous streets, and no information on how their hand strength has changed since then. Roughly speaking, after preflop ends, they "forget" their hole cards and only look at the current hand strength/potential postflop (as mentioned above). This also means that by the river, they will not really understand how many draws missed, or whether a straight was possible on the flop or not. They look at the present and future, but not the past when it comes to the cards.
A big difference between the bots is which betting sequences they "recognize". There are too many sizing possibilities, so a strategy will only be generated for a few of them at each point (the choice is up to the designer). As a simple example, a bot may expect either a 1/2 pot bet or full pot bet in a situation, but let's say its opponent chooses an in-between sizing of 3/4 pot. Here the bot will probabilistically "map" the bet to either 1/2 pot or full pot and assume its opponent bet that exact amount for the remainder of the hand. Min-betting can induce some computer opponents to make terrible assumptions if the smallest sizing they expected was much larger. The authors of hyperborean_iro were careful about this and used a lot of min-betting themselves as seen in the video.
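A minimal sketch of the probabilistic mapping, with bet sizes expressed as fractions of the pot. The linear interpolation here is my own simplifying assumption; I don't know the exact mapping function either bot uses, and `prob_larger` and `map_bet` are hypothetical names.

```python
import random

def prob_larger(actual, smaller, larger):
    """Chance of treating the bet as the larger recognized size (linear rule)."""
    if not smaller <= actual <= larger:
        raise ValueError("actual bet must lie between the two recognized sizes")
    return (actual - smaller) / (larger - smaller)

def map_bet(actual, smaller, larger, rng=random):
    """Randomly map an unrecognized bet size to one of two recognized sizes."""
    if rng.random() < prob_larger(actual, smaller, larger):
        return larger
    return smaller
```

With recognized sizes of 1/2 pot and full pot, `prob_larger(0.75, 0.5, 1.0)` is 0.5, so a 3/4 pot bet gets treated as a full pot bet about half the time and as a 1/2 pot bet otherwise.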
The strength of the bots is how the CFR algorithm works to essentially balance their frequencies. I would call it GTO, but in a more pure form than what most humans expect. I use the word "pure" because we all make implicit assumptions about our opponents even before the first hand is dealt, which is technically exploitative. We as humans don't usually plan our ranges for opponents that might donk lead turns for 4x pot, checkraise 50% of rivers all-in, or slowplay top set two streets. However, the bots have to give consideration to the possibility of playing against an extreme outlier strategy, the likes of which you'd (assume to) never see on PokerStars. I believe this is a big factor in why they play highly mixed strategies and can take so many different lines in the same spot.
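For anyone curious how the balancing happens, the building block inside CFR is regret matching: at each decision point, play each action in proportion to how much you regret not having played it so far. Below is a small sketch on rock-paper-scissors rather than poker, using exact expected values instead of sampling. It's a toy under those assumptions, not the code behind either bot, and the lopsided starting regrets are arbitrary.

```python
ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors

def payoff(a, b):
    """+1 if action a beats action b, -1 if it loses, 0 on a tie."""
    return [0, 1, -1][(a - b) % 3]

def current_strategy(regrets):
    """Regret matching: play in proportion to positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / ACTIONS] * ACTIONS

def expected_payoff(action, opp_strategy):
    """Expected payoff of a fixed action against a mixed strategy."""
    return sum(q * payoff(action, b) for b, q in enumerate(opp_strategy))

def train(iterations):
    """Self-play; returns each player's average strategy over all iterations."""
    # Arbitrary lopsided start so player 0 begins by always throwing rock.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * ACTIONS for _ in range(2)]
    for _ in range(iterations):
        # Both players fix their strategies before either one updates.
        strategies = [current_strategy(r) for r in regrets]
        for player in range(2):
            opp = strategies[1 - player]
            values = [expected_payoff(a, opp) for a in range(ACTIONS)]
            node_value = sum(p * v for p, v in zip(strategies[player], values))
            for a in range(ACTIONS):
                regrets[player][a] += values[a] - node_value
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]
```

The strategy on any single iteration can swing around quite a bit; it's the average returned by `train` that drifts toward the balanced one-third mix for each throw, and that average is what you'd actually play.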
Anyway, I'll stop rambling here for now, but as this topic is very interesting to me I was eager to chime in. For fun, here's a pic of the final results of the match (from slumbot's perspective): http://imgur.com/Hvx2QdP
Nov. 25, 2013 | 5:18 p.m.