GTO and Nash equilibrium
Posted by suwasup
posted in
Low Stakes
If there is a Nash equilibrium in NLH, does that make NLH the same kind of game as rock paper scissors (I know it's a million times more complicated)?
For example, in rock paper scissors, let's say you earn 1 point when you win, lose 1 point when you lose, and earn 0 points when you tie. In this case, if you throw rock 1/3 of the time, paper 1/3 of the time, and scissors 1/3 of the time, you are playing pure GTO: no matter what your opponent does, you will always average 0 points in the long run. However, if your opponent doesn't use the same strategy as you (pure GTO), you can change your strategy to exploit him, but then you are no longer playing pure GTO, which means you are not at Nash equilibrium anymore.
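The rock paper scissors arithmetic above can be checked with a few lines of Python. This is only a sketch: the opponent mixes (`always rock`, `rock-heavy`) are made-up examples, not anything from a solver.

```python
from fractions import Fraction

MOVES = ("rock", "paper", "scissors")

# PAYOFF[i][j] is our score when we play MOVES[i] and the opponent plays MOVES[j].
PAYOFF = (
    (0, -1, 1),   # rock: ties rock, loses to paper, beats scissors
    (1, 0, -1),   # paper: beats rock, ties paper, loses to scissors
    (-1, 1, 0),   # scissors: loses to rock, beats paper, ties scissors
)

def expected_value(ours, theirs):
    """Expected points per game for the mixed strategy `ours` against `theirs`."""
    return sum(
        ours[i] * theirs[j] * PAYOFF[i][j]
        for i in range(3)
        for j in range(3)
    )

third = Fraction(1, 3)
uniform = (third, third, third)

# Hypothetical opponents, chosen only for illustration.
opponents = {
    "always rock": (Fraction(1), Fraction(0), Fraction(0)),
    "rock-heavy": (Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)),
    "uniform": uniform,
}

# The 1/3-1/3-1/3 mix scores exactly 0 against every one of them.
for name, opp in opponents.items():
    print(name, expected_value(uniform, opp))

# Exploiting the rock-heavy player by always throwing paper earns more than 0 ...
always_paper = (Fraction(0), Fraction(1), Fraction(0))
print(expected_value(always_paper, opponents["rock-heavy"]))

# ... but "always paper" is itself exploitable: always scissors beats it every time.
always_scissors = (Fraction(0), Fraction(0), Fraction(1))
print(expected_value(always_scissors, always_paper))
```

The last two lines show the trade-off described in the post: the exploit gains against a leaky opponent, but it gives up the guarantee that the uniform mix has.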
In conclusion, if you use pure GTO, no matter what your opponent does, your EV should be 0 in the long run?
Is this the correct understanding of GTO and Nash equilibrium? Please leave your comments.
Update:
I talked to someone in a group chat and found out that, unlike rock paper scissors, NLH has dominated strategies (basically, strategies that are worse than others). For example: going all in with 27o every time and folding every other hand. You will lose against a GTO strategy.
Dominated strategies don't exist in rock paper scissors. No matter what strategy villain uses, both of you will have 0 EV as long as you play the GTO strategy.
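One way to see the difference is to look for dominated options in a payoff table. The sketch below only checks domination by a single other pure option (not by mixes), and the "forfeit" move is an invented addition to RPS for illustration, not a poker model.

```python
def strictly_dominated_rows(payoff):
    """Return indices of rows that some other single row beats against every column."""
    dominated = []
    for i, row in enumerate(payoff):
        for k, other in enumerate(payoff):
            if k != i and all(o > r for o, r in zip(other, row)):
                dominated.append(i)
                break
    return dominated

# Rows are our options, columns are the opponent's rock / paper / scissors.
RPS = [
    [0, -1, 1],    # rock
    [1, 0, -1],    # paper
    [-1, 1, 0],    # scissors
]

# Hypothetical variant: a fourth option "forfeit" that loses 2 points no matter what.
RPS_WITH_FORFEIT = RPS + [[-2, -2, -2]]

print(strictly_dominated_rows(RPS))               # no dominated option in plain RPS
print(strictly_dominated_rows(RPS_WITH_FORFEIT))  # only the forfeit row (index 3)
```

In plain RPS every option is the best reply to something, so none can be thrown away. Once a game contains an option like "forfeit", anyone who uses it gives up EV against every opponent strategy.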
My guess is that NLH and RPS have different types of Nash equilibrium.
A way I like to think about it is that GTO is defense and prevents you from losing money by being exploited if your opponent is perfect. However, there are spots your opponent needs to also play perfectly in order to avoid losing money (i.e. follow a "dominant strategy").
For example, if GTO says a hand is a pure call and the opponent doesn't follow this advice, they are following a dominated strategy and lose equity. Likewise, if your opponent plays imperfectly preflop, GTO lets you capture all their leaked money and prevents you from being exploited by their strategy, etc.
I just had another thought on GTO.
Any imbalance or dominated strategy is defined by its deviation from the Nash equilibrium, which is the GTO strategy.
In other words, when you try to exploit villain, you find his leak, which is where his strategy deviates from the GTO strategy, and change your strategy to exploit it. But everything is based on an understanding of GTO.
Is this correct? If so, does that make GTO the only strategy that works (if everyone were as smart as a supercomputer)?
Ugh ... a few misunderstandings are raising their heads. :)
Let's start with two very basic misunderstandings and clear them up once and forever:
1) GTO IS NOT ABOUT DEFENSE - IT'S ABOUT MAXIMALLY EXPLOITIVE PLAY!!
2) GTO DOES NOT MEAN "BREAK-EVEN".
OK, start with the first one: it's a common idea to view GTO as a purely defensive strategy. That is nonsense. Let's dissect the term GTO - it means "game theory optimal". The exact term "GTO" actually is a synonym for the Nash equilibrium, hence the state where two game-theoretically optimal players play against each other. That becomes clear when we talk about a GTO strategy against a "weak player". Obviously the "optimal" strategy is not optimal anymore - even though we still call it a GTO strat.
What does that mean? Let's eliminate the word "optimal" for now - and talk about "game theory". What does THAT mean? Game theory is about maximally exploiting our opponent in each and every situation. BAMMM!!! :-) Got it? We "exploit" our opponent by always taking the maxEV strategy available in each and every spot. Now, when two players try the same, they eventually settle at a point where nobody can do better anymore - which is called Nash Equilibrium. This situation is somewhat static and does not really look aggressive in most cases (like two boxers dancing around each other and covering themselves while nobody leaves his cover to punch), but it's NOT a defensive strategy; the defensive look is just a side effect. Look at a situation where we get to the river with a polarized range. The GTO strat maximally overbets the river - and adds a healthy bunch of bluffs. Quite the opposite of a defensive strategy, right? And the defender (bluffcatcher) is not really "defending" himself either, he just tries to win as much as possible (or lose as little as possible).
Summarized, the "optimal" in GTO does not stand for the absolute optimal decision, but it stands for the optimal decision against an equally GTO-playing opponent. If that does not hold true, the game theoretical "optimal" strategy is an exploiting strategy - but that is not equivalent to the term GTO, so let's call it GTB - game theoretical best. :-) (Actually I'm sure I coined a new - different - term for that many years ago already, but I don't remember ... :-/).
For example, imagine we are IP on the river with a polarized range: 10x nuts, 10x air. We have one PSB (pot-sized bet) left. The GTO strat for IP is to bet 10x nuts and 5x air; OOP will call with 50% of his bluffcatching range. Right? But what if OOP calls 45% and folds 55%? "GTO" remains at 10/5, but GTB is now to bet 100%: 10x nuts and 10x air, because every bluff now wins the pot more often than it loses the bet.
Is that clear?
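The numbers in the river example can be reproduced with the usual indifference formulas. This is a minimal sketch under toy-game assumptions that are mine, not part of the post: the pot is 1 unit, the bet is 1 unit, OOP holds only bluffcatchers that beat air and lose to the nuts, and a bluff's EV is measured against giving up (EV 0). The function names are made up for illustration.

```python
from fractions import Fraction

def bluff_combos(value_combos, pot, bet):
    """Bluffs to add to `value_combos` so the bluffcatcher is indifferent to calling.

    The caller risks `bet` to win `pot + bet`, so bluffs : value = bet : (pot + bet).
    """
    return value_combos * Fraction(bet, pot + bet)

def indifference_call_frequency(pot, bet):
    """Call frequency that makes a pure bluff break even: pot / (pot + bet)."""
    return Fraction(pot, pot + bet)

def bluff_ev(call_frequency, pot, bet):
    """EV of betting air: win the pot when OOP folds, lose the bet when OOP calls."""
    return (1 - call_frequency) * pot - call_frequency * bet

POT, BET = 1, 1  # one pot-sized bet left

print(bluff_combos(10, POT, BET))              # 5 air combos next to 10 nut combos
print(indifference_call_frequency(POT, BET))   # 1/2
print(bluff_ev(Fraction(1, 2), POT, BET))      # 0: air is indifferent at equilibrium
print(bluff_ev(Fraction(45, 100), POT, BET))   # 1/10: every air combo is now a +EV bet
```

Because the bluff EV turns positive as soon as OOP calls less than 50%, the best response jumps from betting 5 air combos to betting all 10, even for a small deviation like 45%.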
OK, let's tackle the second point. GTO is not about break-even play. That again is a side effect in the bigger universe - namely a game with even starting positions. That's exactly the case in RPS - both players have the exact same limitations and options. As a result, nobody will win. The same principle holds true if you flip coins. In poker, it's true if we consider the entire game (with all positions as one entity, starting from preflop). But if we look at certain scenarios, we are far from being break-even. A player in the blinds, for example, will (still) lose with an optimal strategy. But he will lose less than if he played suboptimally. A player on the button will always win with an optimal strategy. In a 3-bet pot the aggressor will always have a positive EV on the flop - in a GTO scenario.
But all that does not really matter if we separate ourselves from such terms as winning, losing, attacking, defending, bluffing and such. Those are just "human" creations to label things to make it easier to talk about. At the end of the day the only thing that matters is EV. maxEV. Even the sign does not matter, positive or negative is just an "artificial" separation. The only important thing is the choice of the maximal EV strategy - regardless if that's -5 EV (when others are -10 or -20) or +5 (if the others are 4 or -1).
OK?
All that might shed some light on the final (or the initial) question - the difference between poker and RPS: OP talked about "dominated strategies" that exist in poker, leading a GTO strategy to "suddenly" win - and posited a difference between poker and RPS, formulating that "there seem to be two different types of Nash Equilibria". But that's wrong. There are no different types. Nash Equilibrium is a definition. It's precise. It means the point where no player can unilaterally deviate from his strategy to increase his own EV!
Do you recognize the important word? It's "increase"! It says neither "break-even" (at any point) nor that the result has to be static. That means the existence of a dominated strategy does not contradict that definition. It still holds true.
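The definition can be tested directly on a small zero-sum game. The payoff matrix below is a made-up toy (not a poker spot), built so that the equilibrium value is not zero and so that the column player has one option (column 2) that is worse for him than column 0 in every case. Checking pure deviations is enough, because a player's EV is linear in his own mix.

```python
from fractions import Fraction as F

# Hypothetical zero-sum game: entries are the row player's payoff,
# so the column player wants them as small as possible.
A = [
    [3, -1, 4],
    [-1, 1, 2],
]
ROWS, COLS = range(len(A)), range(len(A[0]))

def row_ev(x, y):
    """Row player's EV when row mixes with x and column mixes with y."""
    return sum(x[i] * A[i][j] * y[j] for i in ROWS for j in COLS)

def is_nash(x, y):
    """True if neither player can increase his own EV by deviating alone."""
    value = row_ev(x, y)
    row_cannot_gain = all(sum(A[i][j] * y[j] for j in COLS) <= value for i in ROWS)
    col_cannot_gain = all(sum(x[i] * A[i][j] for i in ROWS) >= value for j in COLS)
    return row_cannot_gain and col_cannot_gain

x_star = (F(1, 3), F(2, 3))        # row player's equilibrium mix
y_star = (F(1, 3), F(2, 3), F(0))  # column player never uses column 2

print(is_nash(x_star, y_star))                 # True
print(row_ev(x_star, y_star))                  # 1/3: the equilibrium is not break-even
print(is_nash((F(1, 2), F(1, 2)), y_star))     # False: column could deviate and gain
print(row_ev(x_star, (F(0), F(0), F(1))))      # 8/3: villain picks the dominated column
```

The row player's strategy never changes in the last line, yet his result jumps from 1/3 to 8/3 once the opponent chooses the dominated option. Nothing in the definition is violated, since it only says that nobody can increase his own EV by deviating alone.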
Which again means RPS and poker - GTO-wise - are absolutely comparable. The differences are rooted in the different setups of the games (rules, limitations and options, such as the existence of blinds), but the basic theory is absolutely the same.
I hope that was understandable and did not come off too wise-ass! :-D
BigFiszh
I have a thought that GTO is a required poker course, or let's say the foundation of modern poker, because any imbalance or dominated strategy that can be exploited is defined by its deviation from the Nash equilibrium, which is the GTO strategy.
People say you should not play a GTO strategy at micro stakes because your villains make too many mistakes, so you should try to exploit them as much as you can. But isn't that still based on an understanding of GTO? It's just that their mistakes are so obvious at micro stakes that you don't realize you are applying GTO strategy, or GTO thinking.
That is EXACTLY what I've been preaching for years. :D
Anyone who says that you don't need knowledge about GT(O) at micro stakes just proves that he does not understand GTO.
A lot of great points here, Fiszh, especially that understanding GTO lets us dominate! Yeah, some of my terms were fairly casual, but I still think they express the consensus of what I've learned, which is that GTO makes you impossible to exploit (defense) against perfect opponents, but strictly following GTO prevents you from maximally exploiting your imperfect ones (offence). I think the assumption and justification behind GTO is that the opponent has to be playing a near-perfect game, and the further he moves from it, the further we can deviate from GTO.
Let me explain: assume Pio tells us what a perfect GTO strategy is. Then in certain spots it will tell us to pure check hand 1 or pure call hand 2. But if you know your opponent (or pool) overfolds and underbluffs, respectively, then by strictly following Pio (aka GTO), you are not maximizing your profit as you should overbluff hand 1 and overfold hand 2. As in your example about GTB vs GTO - if you bet 2/3 value, that is the GTO strat. If you bet 50% value, that is NOT the GTO strat, as it is an exploit. Or to use 27o: if it's folded to you in SB and you know your BB opponent will fold everything except top 10% to a 6x bb open, are you not making a mistake following the GTO strategy which tells you to fold? The higher the confidence in your read, the higher your confidence in the 6x bet, etc. When you can't trust the read, the consensus I've heard is to fall back to something like a GTO / GTO adjusted line.
By "defense", I didn't mean passive or weak, I meant playing in a way to protect your equity and not be exploited. This can mean finding bluffs that most humans couldn't, or making hero calls because opponent will have these same bluffs.
Even though micro/low stakes probably don't always play that close to GTO, IMO the best way to learn to exploit opponents is to learn GTO and your opponents - the better we understand what perfect actually is, the better we will be able to notice their mistakes and our own.
It definitely didn't come across as wise-arse. I feel we're probably pretty close in understanding and it's about agreeing on general terms.
Suwasup: I'm not sure if you've seen or bought From The Ground Up yet, but I think Carroters does a great job explaining a baseline strategy. I think much of this is rooted in GTO theory as a starting point and then talks about some exploits based on typical leaks at the micro / lower stakes. For what it's worth, it's fairly reasonably-priced and I found it (and his regular videos) super helpful.
Yes, From The Ground Up is a great course, and it's only $50. I've seen it and it improved my game a lot, and I'm now reading his book, The Grinder's Manual.
Awesome! Me too! (I went from winning live player to "oh sh*t, gotta learn online properly" player over the past 3 months, feels like I'm getting there).
Do you find the book different enough from the material to recommend it? Been debating that book vs. 100hands right away.