Brains vs. AI poker rematch

Posted in Gen. Poker

No Doug Polk this time.

But still "Big Dick" Donger Kim and Jason Les. They are joined by Jimmy Chou (ForTheSwaRMm) and Daniel McAuley (dougiedan).

http://triblive.com/local/allegheny/11727206-74/poker-computer-sandholm

Watch it live here:

https://www.riverscasino.com/pittsburgh/BrainsVsAI/

Let's hope the humans will crush these bots for more than 9bb/100 so that we have no more "statistical ties":

https://www.youtube.com/watch?v=gz9FJfe2YGE

Who do you think will win? Place your bets!

152 Comments

oakton55 8 years, 2 months ago

https://pokershares.com/pokershares/#/pokershares/4/eventOdds/110-113-114-0,

GameTheory 8 years, 2 months ago

Since there will be 120k hands played and the standard deviation will be around 180bb/100, these odds of 1.45 indicate around a 5bb/100 winrate for the humans, according to a simulation I did on http://pokerdope.com/poker-variance-calculator/

Last time humans were not playing to maximize their winrate. The bot is very slow on the river (waiting 30 seconds to check the river in a checkdown pot) so Doug Polk tried to avoid playing rivers by playing with more aggression on earlier streets.

This time the humans will be playing 2 tables so they will be less bored, which could help them.

Quido 8 years, 2 months ago

Why don't they battle the best NLHE players? This lineup seems quite weak. Imagine if the lineup was something like Sauce/Polk/Haxton/Jungleman/OTB. Nevertheless I am still rooting for humans, it will be a sad day when computers become better than real players.

GameTheory 8 years, 2 months ago

The top humans have an incentive to not give bot creators valuable information about their play. These guys are from Doug Polk's evil empire so they are one step below elite.

Also, the humans have to play for 20 days for 8 hours a day. The prizepool for the four players is only $200k, so not enough to make these elite players interested.

What could happen is that these players lose and that they make the elite players an offer to beat them for a much larger prizepool.

PEACE of MIND 8 years, 2 months ago

It seems like preventing bots in online poker is a problem that must be solved in the future.

Quido 8 years, 2 months ago

The programmer of Libratus said on Twitch that it runs on 50 x 28-core CPUs per game, so it might still be a little far from practical use, and he also said that Libratus won't be available to the public.

Kalupso 8 years, 2 months ago

He also said the algorithm is a general algorithm that is not designed to only play poker, so if the only thing it had to do was play poker they could simplify it a lot.

Kalupso 8 years, 2 months ago

"Sandholm noted that the Libratus AI is not specifically a poker program. Its algorithm could be applied to any number of situations that involve incomplete and misleading information, such as business negotiations, military strategy, cybersecurity and even medical treatment design.

Libratus developed its knowledge of the game and its strategy by analyzing the rules of the game, not by trying to copy the play of humans. The AI calculated its poker strategy using about 15 million core hours of computation on the Pittsburgh Supercomputing Center's Bridges computer. Ralph Roskies, scientific director of the PSC, said this use of Bridges already has generated 2 1/2 petabytes of data."

Source: http://www.cmu.edu/news/stories/archives/2017/january/poker-play.html

GameTheory 8 years, 2 months ago

Interesting hand for Jason Les, he 3-bets with T5o to 800, bot calls.
Flop TT6dd, Jason bets 640 and gets raised to 3520, call.
Turn Js, bot bets half pot for 4320, call.
River 8h, bot bets half pot (not all-in) and Jason tank calls.

I don't see why he tanked, clearly his T blocker is extremely valuable in this spot. And if the bot had a better T, going all-in for the bot would make more sense than this bet. So it is implied that the bot has weak fundamentals and/or bad blockers here. That makes every T a clear call.

He called and the bot had K9o.

I like how Doug Polk is correct (unlike tons of his video commentary) on this spot:

DougPolkPoker : YES
DougPolkPoker : also wtf at that slowroll
DougPolkPoker : bots are people too

GameTheory 8 years, 2 months ago

3betting T5off is good fundamentals? :D

Well he was facing a 2.5x open. Against a 60% uncapped range (the 2.5x range is probably somewhat stronger than the limp or 2x range, so this is a rough estimate) T5o has 36.3% equity.
You need to realize 150/(150+100+250) = 30% equity. Since T5o plays poorly postflop OOP against a range that is somewhat strong it will realize less than its estimated 36.3%. It could be around break even or slightly losing to call with T5o.
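
The 30% figure above is plain pot-odds arithmetic, and it can be sketched as follows. This is only an illustration: the function name `required_equity` is made up here, and the 100-chip big blind is an assumption read off the 150/(150+100+250) calculation.

```python
def required_equity(open_size, big_blind=100):
    """Equity the big blind must realize to break even on a call.

    Assumes heads-up play where the big blind has already posted
    `big_blind` chips and the opener raised to `open_size` chips in total.
    """
    to_call = open_size - big_blind   # chips still needed to match the raise
    final_pot = 2 * open_size         # both players end up with open_size in
    return to_call / final_pot


print(required_equity(250))  # 2.5x open: 150 / 500 = 0.3
print(required_equity(200))  # 2x open:   100 / 400 = 0.25
```

With 36.3% raw equity against a 30% requirement, the call only breaks even if the hand realizes roughly 83% of its equity, which is the point being made about T5o out of position.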

So it makes a lot of sense to 3-bet these hands with mediocre equity and little playability, because these hands benefit the most from getting a fold to the 3-bet.

Also, you seem to suggest that T5o is a 'bad' hand. It is the only hand in no limit hold'em that can make EVERY straight and TWO flushes. That is not bad in my book!

Disharmonist 8 years, 2 months ago

Yeah, but getting called by K9 off pre suggests that you probably only fold out weak hands that you block. And you can and should fold the very bottom of your range against raises. Just because you cannot call doesn't mean that you should raise at any frequency. I do understand the strategy behind it, but conversely this justification would make 3-betting 80% or so of your range profitable, which cannot be a good plan in the long run.

GameTheory 8 years, 2 months ago

K9o might be quite high up in the bot's range when it makes a 2.5x raise.

T5o is a 70th percentile hand against a random hand. So this hand is not the very bottom of your range. It is also a hand with more equity than hands like 52o, so it will have higher EV when getting called.

Also the bot folds some hands to a 3-bet. The bot (implicitly) designs its defense ranges against 3-bets to make certain hands breakeven between calling and folding. Naturally, the worse a hand gets the lower its EV for calling will be, so relative to calling the EV of 3-betting (and getting a fold) will get higher.

This probably goes down all the way to the very worst hands, but since you are not forced to call (you can fold) these hands will often have folding as their preferred action to a raise.

72o has 30.4% equity against the top 60% and it needs to realize 30% to call a 2.5x, so it is almost certainly not a call. But it might be a call or just a slightly losing call against a 2x raise where it only needs to realize 25%. So against a minraise it would make more sense to 3-bet hands like 72o, whereas they would still be folds against a 2.5x.

Btw, I'm not claiming that it is optimal or profitable to 3-bet T5o against this bot. I'm only claiming that it is reasonable to 3-bet some hands that are roughly as strong as T5o. I'm also not claiming that the humans have a correct 3-bet frequency. Just claiming that this is not an obvious and large fundamental mistake.

GameTheory 8 years, 2 months ago

Interesting hand. Jason Les starts to try different opening sizes. He opened up to 450 with T4o, got a call and checked back on T96cxx.
Turn Tc, bot leads pot for 900, call.
River 2c and bot pots again for 2700, call.

The bot had Jd4d, so no relevant flush blocker and no T blocker either but he had negative blockers for QJ and J8/J7. Seems like a minor leak that the humans should focus on to exploit.

Disharmonist 8 years, 2 months ago

But back to the AI's play; I also agree. What hands should be raised OTF theoretically that aren't a ten, a combo draw or the A- or K-high flush draw? I guess not that many, if any at all. At least the bot picks up a double gutter OTT and usually has more Tx in its range given the preflop action.

GameTheory 8 years, 2 months ago

K9 didn't have a double gutter on the TT6J turn.

If you only raise a ten or a strong draw with a flush, you have no bluffs on a diamond turn. And in general your range will be so strong that Jason can fold almost anything, so the bot would face many folds to its raise, so it gets no value.

Also it gives the bot an incentive to add a lot of bluffs to its range. On the river hands with no showdown value that don't block missed flushdraws are probably best to bluff.

GameTheory 8 years, 2 months ago

Interesting hand, 5-bet pot:
Action goes Jason opens to 272 with KQs, gets 3-bet to 680, 4-bets to 1530 and gets 5-bet to 3060, call.

Flop JT6ss, bot bets half pot 3060, Jason "takes his equity" and shoves. gets called after a little tanking.
Equitychop:

GameTheory 8 years, 2 months ago

Dong did very similar. He made a massive 24x 3-bet, and overbet 6000 into 4800 on the flop.
Shove turn (almost) on a blank turn and the bot folds.

GameTheory 8 years, 2 months ago

Another loss for Jason, he bet 2399 into 3808 on this river and called the shove and the bot had QTo:

GameTheory 8 years, 2 months ago

Dong doesn't manage to win a stack in the mirrored spot. The bot didn't 3-bet AJo.
Flop checked through, normal bet on the turn (bot traps).
River Dong bets normal, gets raised to 6252 and thinks a while about how large he should raise, goes with 18k and the bot snapfolds.

Did the humans get outplayed because the AJo didn't 3-bet?

GameTheory 8 years, 1 month ago

Yes all hands that Jason gets are mirrored for Dong. So when Jason got AcJs in the big blind on Kh6hQhTsQc against QsTc in hand 221/400, Dong got QsTc in the small blind on Kh6hQhTsQc against AcJs. This goes the same for all hands they get.

And Dan and Jimmy are paired/mirrored in the exact same way. The idea behind this is to reduce the variance that you get from a player getting better hands and/or runouts than the bot or vice versa.
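
A minimal sketch of how mirrored (duplicate) dealing from a shared seed could work. This is not CMU's actual software; the function names, the deck encoding and the seed value are all hypothetical and only illustrate the idea described above.

```python
import random

RANKS = "23456789TJQKA"
SUITS = "cdhs"


def deal(seed):
    """Deal two hole-card hands and a five-card board from a seeded shuffle."""
    rng = random.Random(seed)  # same seed -> same shuffle every time
    deck = [r + s for r in RANKS for s in SUITS]
    rng.shuffle(deck)
    return deck[0:2], deck[2:4], deck[4:9]


def mirrored_tables(seed):
    """Return the same deal seen from two tables with the seats swapped."""
    hand_a, hand_b, board = deal(seed)
    table_1 = {"human": hand_a, "bot": hand_b, "board": board}
    table_2 = {"human": hand_b, "bot": hand_a, "board": board}
    return table_1, table_2


# Hypothetical seed; both paired players would have to enter the same one.
t1, t2 = mirrored_tables(seed=12345)
print(t1)
print(t2)
```

Because each human's cards are the bot's cards at the partner table, a lucky deal for one side is cancelled out across the pair. It also shows why a mistyped seed breaks the pairing: the two tables would then see unrelated deals.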

GameTheory 8 years, 2 months ago

Another hand from Jason.
It was 356A with two hearts on the turn, bot bets 2.6k into 2k on the turn.
River A and the bot shoves with T7hh. Jason called A7cc without hesitation.
Another bad bluffing hand from the bot, negative blockers!

GameTheory 8 years, 2 months ago

Jason just owned the bot. He 3-bet with QJ and called down overbets on turn and river:

Quido 8 years, 2 months ago

He finished up 58k for the session. Kind of interested in how the rest of the guys did. I actually still think there is a chance for humanity in this one.

GameTheory 8 years, 2 months ago

Dan faces a river overbet after the flop and turn go pot pot.
He decides to jam the remaining 7.5k, will he be ahead when called?

Bot folded btw.

Quido 8 years, 2 months ago

Bot is up 181140 after 18040 hands, which roughly translates to 10bb/100
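
The conversion from chips to bb/100 can be checked with a few lines. This is just a sketch; the function name is made up and the 100-chip big blind is an assumption consistent with the bet sizes quoted in this thread.

```python
def winrate_bb_per_100(chips_won, hands, big_blind=100):
    """Convert a chip result into big blinds won per 100 hands."""
    big_blinds_won = chips_won / big_blind
    return big_blinds_won / (hands / 100)


print(round(winrate_bb_per_100(181140, 18040), 1))  # 10.0
```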
https://www.twitch.tv/libratus_extra

Quido 8 years, 2 months ago

They very well might, but I still think it is too early to call. It definitely looks like the players have an idea of how Libratus is playing.

GameTheory 8 years, 2 months ago

Yeah I'm glad the humans are making some adjustments.
For instance the T5o and A7cc calls, I assume they would now always call with their blockers for trips, no more (contemplating) ubernit folds.

GameTheory 8 years, 2 months ago

The Dong just outplayed the bot.
Very weird play by the bot, potting the turn and 2x pot overbetting the river. It is hard to see what the bot is repping, maybe TTTT?

Dong tanked for really long, looking at his notes, discussing. He was happy after he called.

GameTheory 8 years, 2 months ago

Bot called 3-bet and barreled it off: half pot on flop, pot on turn and overbet shove river:
Again poor play by the bot, blocking some straight draws that missed and probably running into a straight too often. It seems like every time you're at the river with a bluffcatcher and some card removal you can just click call.

GameTheory 8 years, 2 months ago

Dumping more handhistories.
Very loose call on the river here vs the overbet. It might have trouble figuring out ranges on this weird limp/raise-check-bet-raise spot:

Jimmy started limp/raising (with AA) also:

Bot 3-bet 5-bet jamming AKo:

Large 3-bet from Jason, followed by a check and check/fold. Might've been a slight mistake.

GameTheory 8 years, 1 month ago

Hundreds of hands from yesterday were invalid because the bot owners made a mistake. The humans were winning 50k+ in those hands. These hands have to be played over.

Dong is up 40k+ after 200 hands.

Meanwhile Jimmy has reached the high hand qualifier:

Bot folded to the raise.

GameTheory 8 years, 1 month ago

Quoting Jason:

At the start of each session and each table we enter a seed number to randomly generate our hands. Dong and I use the same seed numbers so our hands are mirrored.

I was instructed to enter a several digit seed number and I asked for the CMU representative to confirm it was correct. Coincidentally today I also asked for them to be confirmed on server side so we never make the mistake of playing hands that have to be invalidated.

Unfortunately that happened today on my Session 2 Table 1. As a result Dong and I played two separate strings of 350 hands. The CMU team and us are in agreement the hands have to be invalidated in order to preserve the integrity of the competition. Therefore, Dong and I are now 350 hands behind Jimmy/Dan and we will have to make them up at some point.

From a results oriented point of view it sucks because I won approximately 28k off that table and Dong won approximately 30k. We still had a winning day but unfortunately not as good as we INITIALLY thought. I wanted you guys to know this so you understood why we only gained 40k today when it would appear much more if you were watching on stream.

In order to prevent this from happening again, CMU will now be responsible for setting up every match. Therefore we will no longer have any responsibility on a match setup so if this happens again, it won't be our fault and the hands will have to stand as is.

GameTheory 8 years, 1 month ago

Strange play by Dong, 4-bet pot and he made small bets on the flop turn and river. He got raised on the flop and just called, bet 3k on the river.

Quido 8 years, 1 month ago

Libratus is up 159288 after 23340 hands, which is around 6.8bb/100. Looks like we are in the realm of a tie and a statistical error.

Quido 8 years, 1 month ago

https://www.twitch.tv/libratus_extra
Libratus is up just 1207bb after 26540 hands, which is 4.5bb/100. Looks like we are onto another tie.

Kalupso 8 years, 1 month ago

Remember that there is a reason they play 120k hands. Variance is still massive over the current sample compared to the edge that is to be expected. Because the bot just plays its approximation of Nash equilibrium the humans should be expected to get most of the edge after they find and exploit weaknesses in its programming.

In short it's too early to tell much about the final outcome from the results alone.

Quido 8 years, 1 month ago

I was just making fun of CMU's press release where they called a 9bb/100 loss over 80k hands a tie last time they played.

GameTheory 8 years, 1 month ago

To be fair to CMU, these bot vs bot matches often were played over much larger samples than 80k. So it is natural there to use 95% certainty to decide if a match is a draw or a win. But elite poker players don't want to play millions of hands HU against a bot that tanks for a minute in a checkdown pot for any amount of money that is affordable to CMU.

For humans beating someone for 9bb/100 over 80k hands is just crushing.

GameTheory 8 years, 1 month ago

Ok, interesting hand here. Jason folded to this river overbet, Doug Polk said in the chat that the fold was good.

But guess what the bot had?

(Dong folded to the large 3-bet)

GameTheory 8 years, 1 month ago

Looks like Jason won 40k and Dong won 30k the second part today.
Dan and Jimmy are not playing today due to being sick and unable to play.

X0RR0 8 years, 1 month ago

CMU has said that this year instead of two sigma for 95% confidence interval, they will use just one sigma, for 68% confidence interval.
What is not clear to me is what they take for the standard deviation. Using mirroring hands will decrease the variance for each pair, so I guess this should be taken into account. On the stream, Jason mentions only the std dev for individuals being around 160bb/100h.
Also, as the std dev is proportional to the square root of the number of samples, this figure should be multiplied by sqrt(1200) (1200 being 120K/100) and multiplied by $100 for a bb, yielding the magical number 3464.
So for example if std dev is 160bb/100, brains should be down $554,240 to be under one std dev.
If you want the figure in bb/100 over 120K hands as in http://pokerdope.com/poker-variance-calculator/, you divide the std dev by 34.64: 160/34.64 is 4.62bb/100 over 120K hands.
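
The arithmetic above can be reproduced with a short sketch. It assumes independent blocks of 100 hands and a $100 big blind, and it ignores the variance reduction from mirroring that the post mentions; the function names are made up for illustration.

```python
import math


def total_std_dollars(std_bb_per_100, hands, big_blind_dollars=100):
    """One standard deviation of the total result, in dollars."""
    blocks = hands / 100  # number of 100-hand blocks
    return std_bb_per_100 * math.sqrt(blocks) * big_blind_dollars


def std_of_winrate(std_bb_per_100, hands):
    """One standard deviation of the observed winrate, in bb/100."""
    return std_bb_per_100 / math.sqrt(hands / 100)


print(round(total_std_dollars(160, 120_000)))  # 554256
print(round(std_of_winrate(160, 120_000), 2))  # 4.62
```

The small gap between 554,256 here and 554,240 in the post comes from rounding sqrt(1200) x 100 down to 3464 before multiplying.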

Disharmonist 8 years, 1 month ago

No matter how you interpret the data, it can be said that there is at least a tendency that the bot is better than the humans. I'd say let's see how the entire thing turns out and then judge. The players will probably present the final outcome differently than the researchers, and there will always be room for discussion and interpretation.

GameTheory 8 years, 1 month ago

Wow, humans are getting crushed by almost 12bb/100 now. These bot developers have really improved the bot overnight, or is this just luck?

Disharmonist 8 years, 1 month ago

If you turn on a self-learning AI that has a set goal, it won't stop calculating and learning until the goal is completely fulfilled. In this case, it probably calculates all possible permutations of boards, ranges and lines.

Quido 8 years, 1 month ago

Bot ended up losing a little yesterday and is up 459154 after 49240 hands, which is around 9.3bb/100.

GameTheory 8 years, 1 month ago

Dong is down 40k, while Jason is up 10k. It must be the massage that Jason is getting now that is making him win. Humans should've started with massages from the start!

Quido 8 years, 1 month ago

There seems to be absolutely no way to stop Libratus. It is up 794392 after 64312 hands, which is around 12.4bb/100.

Kalupso 8 years, 1 month ago

It's starting to look like the bot's true win rate is somewhere between 6bb/100 and 16bb/100. I haven't watched the match much the last week, but I've been wondering where the bot gets its edge from. Using multiple turn and river bet sizes to extract max value from the range should give a few extra bb/100, but against a bot that doesn't adjust the humans should be able to do the same without fear of playing too unbalanced. The other main area I can think of is picking the best possible bluffs and bluff catchers based on equity, blockers etc.

Where do you think the edge is coming from?

Tyler Forrester 8 years, 1 month ago

It's all in the balance. You shouldn't see large differences in strategy between the bot and the humans. It just happens that math is better at solving equilibriums than intuition and bots will always be better at math.

(I'm of the personal opinion (quasi-backed up by PIO) that those big flop overbets are actually mistakes)

JimJon 8 years, 1 month ago

'Doesn't adjust'?
Isn't it a self-learning AI that's trying to find the best strategy against these specific players? I wonder whether it searches for one single best strategy against all 4 players or adjusts individually. And then, if it is a single one and Dong Kim knows some leaks of the other human players, he can play more exploitatively in some spots against the AI, right? :)

Kalupso 8 years, 1 month ago

It's playing an approximation of Nash equilibrium that doesn't change during a match, but it's also supposed to be able to learn from the matches overnight. From watching Jason's stream for a couple of hours yesterday I got the impression that not even the players know exactly what kind of learning and adjustments it makes to its strategy from day to day.

Quido 8 years, 1 month ago

Bot is up 754009 after 79508 hands, which is 9.5bb/100. Interestingly Dong Kim is up 126k.

Kalupso 8 years, 1 month ago

Impressive by Dong! It's going to be interesting to see if he can keep the lead for the rest of the challenge. His mirror partner Jason is down the most...

JimJon 8 years, 1 month ago

Seems like a big difference in results compared to Dong, since they have been paired to be in the same spots against the AI. If Dong manages to stay outside parentheses, can it really be a great result for the AI team if the goal is to show that this AI can outperform any human? Dong isn't the best HU player, the others seem worse than him, and from the old chess challenge we know it only counts when you can beat the best, right? :)

Michael 8 years, 1 month ago

I don't know, 11 hours a day and hudless - this doesn't seem like a very fair contest to me and is going to get written about in the most exaggerated manner by the media. I think it's going to be very bad for poker and am worried the Bot team will hit and run and book the win for bragging rights. After all, they did call last time's loss by 10bb/100 a statistical tie.

Disharmonist 8 years, 1 month ago

Winning by double digit bb per 100 is basically melting your HU opponent. And considering that it didn't take that much time to go from losing 10bb/100 in the previous test to winning 10bb/100, that amplifies the effect.

Still, poker is not dying as long as recreational players come and stay long enough. These are also more likely not to even acknowledge this challenge. I mean rationally, they shouldn't play the game at all when they are losing anyway.

The biggest challenge is for the poker sites: preventing bots from playing in the games and protecting the integrity of the game.

Quido 8 years, 1 month ago

The challenge is coming to an end. Libratus is now up 1,560,189 after 115,756 hands, which is around 13.5bb/100. A huge blow to humans.
https://www.twitch.tv/libratus_extra

What are the implications for online poker? Since the bot runs on 1400 CPU cores per game, I'd say we still have some time. Unfortunately it sure looks like this game will be dead in a couple of decades though, at least online.

ZenFish 8 years, 1 month ago

Ye, if it was that easy, everyone would go to uni and get a stellar job after, because the curriculum is available to anyone.

Quido 8 years, 1 month ago

I've never said it would be easy for an average programmer to do so, but for an entire team from one of the most prestigious universities that already created Libratus it should not be that difficult. Of course I am assuming that based upon what the programmer said on twitch.

GameTheory 8 years, 1 month ago

It is over now, humans have officially lost at 14.7bb/100.
Even Big Dick Donger Kim couldn't save humanity. Sad.

Mancuso 8 years, 1 month ago

I've heard from a HU pro that this will affect strategies/bet sizings/etc. For example, he mentioned the crazy way the bot was overbetting the river.

But I don't believe that we're about to see lots of bots at online poker too soon.
We already argue about this.

ZenFish 8 years, 1 month ago

Very sad news imo. Hope it won't affect online poker much.

Or we can see it as good news! It indicates that the gap between human play and perfect play is still huge. There should still be good money in the game for those who keep improving.

miami002 8 years, 1 month ago

Those arent even the best HU players in the world! I mean where are players like Dan "mademymillionsplayingpoker" Bilzerian and other high level dudes like him. I dont get it!!!

GameTheory 8 years, 1 month ago

This bot sucks, but these subelite humans don't know how to exploit it. That I can tell ya. If I could team up with TrueTeller, OtB_RedBaron and Isildur1 we could crush this bot so hard that CMU never will attempt another Brains vs AI challenge again. Believe me folks!

PEACE of MIND 8 years, 1 month ago

I think even if the bot can't beat these players, that is meaningless. It shows that the bot can already beat 99% of human players, so it poses enough of a threat to online poker.

GameTheory 8 years, 1 month ago

When Deep Blue faced Kasparov in 1996 and 1997 they took on the world champion, and not, for instance, the #8 ranked player Boris Gelfand.

There is no question that this bot can beat many human players. But to make the claim that AI is better than humans, it needs to beat the best humans. And not just some mi(d)stake regs.

Quido 8 years, 1 month ago

They can't afford the likes of jungle and sauce. Polk said it himself the lack of money was the reason he did not play.

GameTheory 8 years, 1 month ago

"They can't afford the likes of jungle and sauce. Polk said it himself the lack of money was the reason he did not play."

If you pay peanuts, you get monkeys. IBM didn't pay peanuts for Kasparov.

Quido 8 years, 1 month ago

But come on, these are still good high stakes regs. 15bb/100 is such a large difference I am not sure whether jungle would be able to beat it.

Also it could be good news for us. Maybe they will call poker 'solved' and move onto another project.
