A Quick Question on Game Theory
Posted by midori
Posted in Low Stakes
This might have been asked before by others, but let me shoot it again: what is the ultimate purpose of GTO play?
Let me elaborate a bit. Let's say we have K2o in the BB, the flop comes K96hh, and our opponent c-bets. I have a feeling that x/r'ing K2o here might be more +EV than x/c'ing it in a vacuum, for whatever reason. However, if my opponent knows that I x/r every king here, he can exploit it by a) 3-betting the flop for thin value, or b) barreling wide when I x/c, because my range is capped to mostly 9x or worse. That will result in me losing money with other hands, so it feels natural to think the GTO strategy should suggest I have some Kx in my flatting range.
However, I'm not sure if I really understand why.
They usually say I can "protect" other hands in my x/c range, but I don't quite get it either. Does it mean I can play those "other" hands more profitably, at the cost of playing K2o less profitably to a lesser extent, so that the overall strategy becomes more +EV? If this is the case, am I correct in understanding that GTO allows us to play our overall range in the most +EV manner, but not necessarily each individual hand?
A slightly different question: I hear lots of people saying "this is exploitative play but I'm gonna make it" in order to maximise profit, for example betting really small on a KK2r board. Now, what happens if they play every hand in their range in the so-called exploitative manner? Do we ever lose money with any of those hands? Well, I doubt it, because if we did we wouldn't have used the exploitative strategy to begin with (correct me if I'm wrong). But if we switch to GTO, do we now win more money with any of these hands compared to when we play exploitatively? Or are these really the same thing and I'm just confused by the semantics?
At any rate, going back to the K2o example: if I want to x/c K2o instead of x/r, should it be because I think that's the most +EV way of playing this specific hand? Or most +EV for my overall range, even though it might be -EV for K2o?
The purpose of GTO play, in a nutshell, is to make yourself unbeatable. It isn't about winning the most money, it's about ensuring your opponent cannot, regardless of his play, beat you (unless you're playing in a game where he has an inherent advantage in which case your GTO approach limits his potential margin of victory).
Consider rock/paper/scissors - the GTO strategy for this game is very easy to work out: perfect randomization where you play each of the three symbols exactly 1/3 of the time. Using this strategy there is nothing an opponent can do to beat you. He will win 1/3 of hands, tie 1/3, and lose 1/3. There is no pattern he can exploit, no imbalance, nothing. However, you also can't win as you will also win, draw, and lose exactly 1/3 of the time.
Now imagine you played against a guy whose strategy was to play rock 100% regardless of history. An exploitative player could win here 100% of the time, but the GTO player would chop. This is what we are looking to achieve when we deviate from a GTO(ish) strategy in poker - we have found someone who is making huge mistakes which will perform worse against a strategy which isn't our GTO(ish) standard.
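The rock/paper/scissors argument above can be checked numerically. The following is a minimal sketch, assuming a payoff of +1 for a win, 0 for a tie and -1 for a loss (the post does not state stakes, so those numbers are an assumption), with `ev` being a hypothetical helper name:

```python
from fractions import Fraction

MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(mine, theirs):
    """Assumed stakes: +1 for a win, 0 for a tie, -1 for a loss."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1

def ev(my_strategy, their_strategy):
    """Expected payoff when both players draw a move from their own mix."""
    return sum(
        p * q * payoff(m, t)
        for m, p in my_strategy.items()
        for t, q in their_strategy.items()
    )

uniform = {m: Fraction(1, 3) for m in MOVES}   # the equilibrium mix
always_rock = {"rock": Fraction(1)}            # the exploitable opponent
always_paper = {"paper": Fraction(1)}          # the maximally exploitative reply
always_scissors = {"scissors": Fraction(1)}    # the counter to that reply

print(ev(uniform, always_rock))          # 0: the uniform mix only chops
print(ev(always_paper, always_rock))     # 1: the exploiter wins every time
print(ev(always_paper, always_scissors)) # -1: but the exploiter can be countered
```

The last line illustrates the trade-off: the strategy that wins the most against the rock-only player is itself fully exploitable, while the uniform mix is not.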
One last thing: I don't want you to think that GTO means we'll always break even as we do in rock/paper/scissors. Unlike that very simple game, there are some games in which GTO will break even against another GTO approach but will beat sub-optimal strategies. Poker is one such game. Another would be rock/paper/scissors/fold. This game (which I just made up) is exactly the same as rock/paper/scissors, except that you can also choose to fold, which guarantees you a loss of 0.5 of the loss you would incur by picking the defeated option. Quite clearly folding is pointless (there is no reason to do it); however, if you played someone who used the fold option, your GTO approach would show a profit.
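As a rough illustration of the rock/paper/scissors/fold game, here is a sketch under stated assumptions: a normal loss costs 1 unit, folding costs 0.5 units against any non-fold move (and pays the opponent 0.5), and fold against fold is a tie. The post leaves these details open, so they are assumptions, as is the 30% folding frequency of the hypothetical opponent.

```python
from fractions import Fraction

BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(mine, theirs):
    """Assumed rules: win +1, loss -1, fold loses 0.5, fold vs fold ties."""
    if mine == theirs:
        return Fraction(0)
    if mine == "fold":
        return Fraction(-1, 2)
    if theirs == "fold":
        return Fraction(1, 2)
    return Fraction(1) if BEATS[mine] == theirs else Fraction(-1)

def ev(my_strategy, their_strategy):
    """Expected payoff when both players draw a move from their own mix."""
    return sum(
        p * q * payoff(m, t)
        for m, p in my_strategy.items()
        for t, q in their_strategy.items()
    )

# Never fold, play the other three moves one third of the time each.
gto = {m: Fraction(1, 3) for m in ("rock", "paper", "scissors")}

# Hypothetical opponent who folds 30% of the time and is uniform otherwise.
folder = {"fold": Fraction(3, 10)}
folder.update({m: Fraction(7, 30) for m in ("rock", "paper", "scissors")})

print(ev(gto, gto))     # 0: break even against the same strategy
print(ev(gto, folder))  # 3/20: a profit without adjusting to the opponent at all
```

The profit comes entirely from the opponent's use of a dominated option; the never-fold uniform mix did not change.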
Does that kind of explain the questions you have?
Note: You and I don't use the phrase "quick question" in the same way :p
Note 2: I say GTO(ish) 'cause no one in poker has gotten anywhere near a GTO strategy. The game is just too complex.
Thanks for your reply Tom, but I'm afraid my question still remains unanswered. Maybe I'm missing something here :(
Let me ask you a real quick question this time. Let's say you think x/r'ing K2o in the above example is more +EV than x/c'ing it in a vacuum. In this case, would there be any reason why you would x/c this hand? Like, my default play here is to x/c because I want to have some strong hands in my x/c'ing range, and whilst I think that's usually a good reason, I can't really explain it well in terms of the profit of each hand.
I know x/c'ing K2o would be good for my overall range, and I usually play GTO(ish) as a default and then deviate from it whenever necessary, but does that also mean I might be playing each hand less profitably when playing GTO(ish) instead of exploitatively? That was my original question.
Let's consider an opponent who will shove all hands worse than K2 OTT and open-fold all hands better (yes, he's just quitting the hand with the nuts), whilst 3-bet/shoving all hands better than K2 OTF and bet/folding all hands worse (a stupid strategy, obviously). Our vacuum play here is to call the flop and snap-call the turn, which results in a HUUUUUUUUUUUUGE profit.
Against an opponent who will shove all hands worse than K2 and fold all hands better to the check/raise OTF, the vacuum play is to check/raise.
Hopefully these examples show that thinking about what is a good vacuum play doesn't really bear on GTO: GTO is unconcerned with our opponent, because it is an unbeatable strategy regardless of what villain does.
Oh, I loved the fold option added to RPS, Tom. It explains well how sub-optimal strategies get used in poker and how GTO profits from them. Great example.
To add, midori, you could try watching Ben's newest vid. It's on the subject of GTO and its applications, and it's called toy games.
The purpose is to play a strategy that prevents an opponent from exploiting you. Playing a "GTO strategy" guarantees that you get your due value from the game, regardless of what your opponent does.
There are some different ways in which this is expressed in the poker community. Some call it optimal play, GTO play, nemesis or optimal strategy.
I personally find this terminology confusing. In game theory, this is generally called minimax. In zero-sum games, the minimax solution is the same as the equilibrium point (or Nash equilibrium). The equilibrium point is a stable game condition where neither player can improve by changing their strategy.
Against opponents who don't play the minimax strategy, a non-minimax strategy will generally be more profitable. This is what people think of as exploitative play. Note that this goes both ways: in order to play exploitatively, you also need to open yourself up to being exploited in return. Also note that playing minimax means you always maximise your EV against a rational opponent. In poker there is always a 0 EV option, so there's never a reason to play a hand in a -EV way (no loss leaders).
To bring this back to the K2o example, it comes back to how you would construct your flop ranges and how often you show up with certain types of hands. And this will be related to the pot size and remaining stacks (for no limit or pot limit games).
Yes, so when I construct those ranges, am I focusing on maximising the profit of my overall range? Or each individual hand? I have a feeling that they might not be the same but I might be wrong, this is quite confusing.
The way I see it, the strategy must always be built range vs range (or range vs multiple ranges if it's more than two players). You're essentially building a strategy for the whole game tree from that point. Each hand will have its own place in that strategy -- i.e. it will show up in different ranges (x/r, x/c, fold, ...). And in all likelihood, hands will show up in several ranges with different probabilities. This is called a mixed strategy, or a mixed strategy equilibrium -- you take different actions with the same hand, with a probability assigned to each action. All these different actions will have the same EV though.
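The claim that every action in a mixed strategy has the same EV can be seen in a toy 2x2 zero-sum game. This is only a sketch: the payoffs below are invented for illustration and are not real poker numbers, the villain responses "aggressive" and "passive" are hypothetical labels, and `indifference_mix` is a hypothetical helper that assumes the game has no pure-strategy equilibrium.

```python
from fractions import Fraction

def indifference_mix(a, b, c, d):
    """Mixed equilibrium of the zero-sum game [[a, b], [c, d]] (payoffs to hero).

    Returns (p, q): hero plays row 0 with probability p, villain plays
    column 0 with probability q. Assumes no pure-strategy equilibrium exists.
    """
    denom = a - b - c + d
    p = (d - c) / denom  # makes villain indifferent between the two columns
    q = (d - b) / denom  # makes hero indifferent between the two rows
    return p, q

# Invented payoffs to hero in bb. Rows: x/r, x/c. Columns: aggressive, passive.
a, b = Fraction(-1), Fraction(2)   # x/r vs aggressive, x/r vs passive
c, d = Fraction(1), Fraction(-1)   # x/c vs aggressive, x/c vs passive

p, q = indifference_mix(a, b, c, d)

ev_xr = a * q + b * (1 - q)  # hero's EV of always x/r'ing against villain's mix
ev_xc = c * q + d * (1 - q)  # hero's EV of always x/c'ing against villain's mix

print(p, q)          # 2/5 3/5
print(ev_xr, ev_xc)  # 1/5 1/5
```

Against villain's equilibrium mix, both of hero's actions earn the same amount, which is why hero is willing to mix between them. If one action earned more, hero would simply take it every time.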
As I understand it...
The answer to your question is that the 'ultimate purpose' of a GTO strategy is to play each hand in a way that maximises its EV against an opponent who plays perfectly against you. That is, if raising K2o produces a higher EV than flatting, then you will always raise it. If raising and calling both have equal EV, then you play a mixed strategy.
There's no such thing as a loss leader, because in an equilibrium strategy neither player can unilaterally deviate to increase their EV. If c/r had a higher EV than c/c (against an opponent playing perfectly against your strategy), you could increase your EV by c/r'ing K2o while the opponent's strategy remained static, so the original strategy would not have been an equilibrium.
So assuming that c/c is the GTO strategy for playing K2o in that situation, and you decide that against a specific opponent c/r has a higher EV, that's because you're exploiting some specific tendency that pushes the ev of c/r over that of c/c.
If you could play perfect GTO, the only time you would deviate is when you identify some specific trait of your opponent that allows you to play some hand in a way that is higher EV than your GTO strategy (e.g. if an opponent folds too much to 3b, you can flat some of your stronger hands and 3b more hands you would normally fold).
It may be that the GTO strategy for playing K2o is to c/c in those particular circumstances against a perfect opponent.
It may also be true that c/r K2o has a higher EV than c/c against basically all the opponents in your player pool. This is because some part of their strategy is sub-optimal.
Midori
The reason to construct ranges based on a GTO strategy is that you become unexploitable, not that you maximise your profits. Any deviation from it is no longer GTO. This sounds like it isn't the most profitable approach, but in reality it is. The reason to base your range construction on GTO is that if you deviate from it, you will have significant leaks that are open to counter-exploitation by other players, which will cost you significant losses, and that's where you actually see the benefit of GTO-based range construction.
The same goes for an individual hand.
Any time you choose to deviate from GTO and use an exploitative strategy, you have to be aware that it is open to counter-exploitation. That doesn't mean you can't use an exploitative strategy when you find it profitable. You just need to be aware of the upsides and downsides, and be sure you know what you are doing.
Let's say you face a tight player pool and decide to construct a looser range, because opening more and c-betting more should make a profit. Now, if you overdo it, players will notice and start 3-betting you wider and raising you more on the flop, and there you get the consequence of not being balanced. On the other hand, if you can get away with it unnoticed, you can and should do it, but the moment you start to be counter-exploited you need to switch back to GTO. The same goes for playing a single hand: if a player is folding to 4-bets 90 percent of the time, you should start 4-betting him with any two cards, of course, until he notices and does something about it.
I hope this helps you a bit in understanding GTO and its benefits.
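To see why 4-betting any two cards works against someone who folds 90 percent of the time, a back-of-the-envelope EV check helps. The sketch below makes several assumptions that are not in the post: the pot is 13.5bb before the 4-bet, the 4-bet risks a further 19bb, and the bluff has zero equity whenever villain does not fold (a pessimistic simplification). `bluff_ev` and `break_even_fold_freq` are hypothetical helper names.

```python
from fractions import Fraction

def bluff_ev(fold_freq, pot, risk):
    """EV of a pure bluff: win the pot when villain folds, lose the risk otherwise."""
    return fold_freq * pot - (1 - fold_freq) * risk

def break_even_fold_freq(pot, risk):
    """Fold frequency at which the pure bluff has exactly zero EV."""
    return risk / (pot + risk)

pot = Fraction(27, 2)  # assumed: 13.5bb already in the middle
risk = Fraction(19)    # assumed: extra chips put in by the 4-bet

print(bluff_ev(Fraction(9, 10), pot, risk))  # 41/4, i.e. +10.25bb per attempt
print(break_even_fold_freq(pot, risk))       # 38/65, i.e. roughly 58%
```

Under these assumed sizes the bluff only needs villain to fold a little under 60 percent of the time, so a 90 percent folder makes it hugely profitable. Once he adjusts and folds less than the break-even frequency, the same play loses money.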
Thanks for your replies, guys. It helped me greatly, and I agree with you for the most part.
One thing still bothers me, though. In retrospect, I think I was more concerned about how putting one hand into a certain range would affect the EV of other hands that were already in that range. For example, even if we know x/r'ing K2o is gonna be more +EV than x/c'ing it, can't we make other hands in the x/c'ing range more profitable by adding K2o to it, i.e. by strengthening our range? Or is this a myth?
Something like this:
Suppose we put hands A and B into the x/r and x/c range, respectively. The EVs of these hands are +0.5bb and 0bb. Now we move A from the x/r range to the x/c range, and let's say the EV becomes +0.3bb for each of them. If this were possible at all, which one would be the better strategy?
That's a very good point! But I think in this (overly) simple example we can assume we no longer have a x/r range. I wonder if this is a possible scenario at all, anyway :(
But wait, I think we're getting there. Let's say we have two different strategies with two different range constructions.
a) we x/r A and B, x/c C and D; EV = 0.5bb, 0.2bb, 0bb, 0bb for A-D
b) we x/r A and x/c B-D; EV = 0.4bb, 0.1bb, 0.2bb, 0.2bb for A-D
So if we move B from the x/r range to the x/c range, the EV of A and B will decrease by 0.1bb each, but that of C and D will increase by 0.2bb each, making the "overall" strategy more profitable. In this case, is b) the better strategy because it's more +EV overall, even though it's less profitable for hands A and B?
I know this might not be the most realistic example, but hopefully you got the idea.
Basically, I'm curious to know if playing certain hands in a less profitable way can be a part of the superior strategy, if they can make the overall range more profitable.
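The comparison between range constructions a) and b) above comes down to simple arithmetic. The sketch below just adds up the numbers from the post, assuming all four hands are equally likely (the post does not give hand frequencies, so that weighting is an assumption); `range_ev` is a hypothetical helper name.

```python
from fractions import Fraction

# Per-hand EVs in bb, taken from constructions a) and b) in the post.
strategy_a = {"A": Fraction("0.5"), "B": Fraction("0.2"),
              "C": Fraction("0"), "D": Fraction("0")}
strategy_b = {"A": Fraction("0.4"), "B": Fraction("0.1"),
              "C": Fraction("0.2"), "D": Fraction("0.2")}

def range_ev(evs):
    """Average EV over the range, assuming every hand is equally likely."""
    return sum(evs.values()) / len(evs)

# How much each hand gains or loses when switching from a) to b).
change = {hand: strategy_b[hand] - strategy_a[hand] for hand in strategy_a}

print(sum(strategy_a.values()), sum(strategy_b.values()))  # 7/10 9/10
print(range_ev(strategy_a), range_ev(strategy_b))          # 7/40 9/40
print(change["A"], change["B"], change["C"], change["D"])  # -1/10 -1/10 1/5 1/5
```

With these numbers b) has the higher total, so the question is whether such numbers can occur at all when the opponent responds optimally to each construction.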
Alright, that answers all my questions. Thanks a ton, Tom and others! :)
That is far too complicated a question for a game as huge and unsolved as poker, I think. It takes loads of math before you can get some perspective on the answer to that one.
But the question itself is at its core an exploitation question, not a GTO one, because GTO doesn't care what's more profitable; it is about making you unexploitable against a nemesis. As such, I don't think GTO can answer your question.
That's true, I should have asked the question without saying GTO :(
Never mind, it's good to see how much we can make our GTO pros laugh this year reading our ramble on the subject.