willt

53 points

Comment | willt commented on Balance Riddle

well, the equilibrium for this situation has been given many times before this thread, but these threads keep coming up, so I don't think it hurts anything to remind ppl of the logic behind the answer.

April 14, 2014 | 5:35 a.m.

Comment | willt commented on Balance Riddle

if villain only makes our 0 equity stuff indifferent to bluffing, and we have no 0 equity stuff, then our max expl strategy has no bluffing. so in response, villain folds a lot when we do bet, which presumably motivates us to start bluffing a lot.  (and similarly if we only have a little 0 equity stuff, but not enough to make a bluffing freq that makes villain want to call.)

so obv it can't be equilibrium for villain to just be making our 0 equity stuff indiff.


April 13, 2014 | 9:15 p.m.

Michael,

I really don't mean to be insulting.

I believe all of your comments about figuring out players' frequencies are beside the point.  If a player plays a certain strategy, he has certain frequencies.  For example, in the PvBC river situation BigFizch described above, maybe the bluff-catching player's strategy was "call 50% of the time, randomly, and fold the other 50%".  Certainly figuring out opponents' frequencies is challenging at the tables for a variety of reasons, but that is a complete non-issue in solving for GTO play.

I believe your objection to people claiming to have a GTO strategy for the full game is a strawman argument.  I don't know of anyone making such a claim.  If someone is, I agree that he is wrong.

That said, the fact that no equilibrium of a big-bet game is known does not mean that anyone is free to say anything they want about what it might look like.  In fact, we can confidently say a good many things about it, starting from the definition and some properties of the game and applying logic.

I hope you will consider taking a step back and spending some time learning about how simple games are solved.  They are much easier to work with than the full game, and it might give you a better feel for some of the concepts involved.  You will be able to see why GTO is defined the way it normally is.  And some of the results even provide insights useful for real play.

Thanks again for starting this thought-provoking thread.

Will



June 11, 2013 | 7:29 p.m.

What I'm getting at is a deeper level of strategy
that involves the appearance of switching strategies, but really it was a
part of the strategy itself.

Well, that hand-wavy mystical stuff is all well and good, but don't lose track of what actually matters -- players' frequencies.  So you mean to say that unexploitable strategies can involve changing your frequencies in certain spots over time?



June 11, 2013 | 4:08 p.m.

Michael,

You seem to still be confused.  Certainly many games have multiple equilibria.  Are you implying that unexploitable play might involve switching from one equilibrium strategy to another in order to exploit our opponent?

Also, what do you think the word "strategy" means, precisely?

Will


June 11, 2013 | 2:55 a.m.

June 11, 2013 | 1:47 a.m.

Nah, at the equilibrium, Villain's bluffs have the same EV whether they bet or check because Hero is bluff-catching just the right frequency to make it so. 

That is, a pot-sized bluff and a check are both "0EV" for Villain's air hands when Hero is calling 50% of the time.  So Villain can't improve his EV by changing just his own strategy.  If he switches to always checking (or always bluffing for that matter) and Hero's strategy stays fixed, then Villain's air hands stay "0EV".
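To put numbers on that indifference, here is a minimal sketch. It assumes a pot of 1 unit, a pot-sized bet, an air hand that always loses at showdown (so checking is worth 0), and EV measured from the river decision onward. The function names are made up for illustration.

```python
def bluff_ev(pot, bet, call_freq):
    """EV of betting a hand that always loses at showdown.

    Villain wins the pot when Hero folds and loses the bet when Hero calls.
    Checking with the same hand is worth 0, so this is also the gain
    (or loss) from bluffing relative to giving up.
    """
    return (1 - call_freq) * pot - call_freq * bet


def indifference_call_freq(pot, bet):
    """Calling frequency that makes bluff_ev equal to 0."""
    return pot / (pot + bet)


pot, bet = 1.0, 1.0  # assumed sizes: a pot-sized bet
call_freq = indifference_call_freq(pot, bet)
print(call_freq, bluff_ev(pot, bet, call_freq))
```

If Hero calls less than that, bluffing becomes profitable, and if he calls more, bluffing loses money. Only at exactly that frequency does Villain's air not care.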

June 10, 2013 | 9:02 p.m.

First, if I understand this right, playing GTO means that you'll never change your strategy based on opponent's tendencies,

Yea, but there's nothing deep here.  Playing GTO means playing a particular strategy.

and it will be impossible to show a profit playing against someone that plays perfect GTO.

Yea, in heads-up play at least, two GTO players will break even against one another on average over both positions.  Obviously there's no reason to expect to break even if we're just looking at one spot in a vacuum (like BigFizch's polar versus bluff-catchers river situation) or even one hand (since being out of position is bad).  But on average over both positions, two GTO players facing each other expect to break even, and if one of the players stops playing GTO, he can only decrease his expectation by doing so.

But let's say villain decides to never bluff the river and
only valuebet, how can we not change our strategy and be unexploitable?
Villain could play GTO in all the other areas and only vbet this river
for value and his change in strategy would now give him an edge over
the other person.

No, this is fuzzy thinking -- go through it more carefully, and you can see that if Villain began only betting this river for value, it would not increase his EV.  His value hands have the same EV since they still bet and still get called the same amount.  And his air hands have the same EV when they just give up since Hero was calling the right amount to make them indifferent between bluffing and just giving up in the first place.  So, overall, Villain's EV does not change.
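The same argument can be checked with a quick sketch. The numbers are assumptions for illustration only: pot of 1, pot-sized bet, Villain's range half value and half air, value hands always betting and always winning when called, and Hero fixed at calling 50%.

```python
POT, BET = 1.0, 1.0   # assumed sizes: a pot-sized bet
VALUE_FRAC = 0.5      # assumed share of value hands in Villain's range
HERO_CALL_FREQ = 0.5  # Hero's strategy, held fixed


def villain_range_ev(bluff_freq):
    """Villain's EV over his whole range when his air bluffs with bluff_freq."""
    # Value hands always bet: they win the pot, plus the bet when called.
    value_ev = POT + HERO_CALL_FREQ * BET
    # Air wins the pot when Hero folds, loses the bet when called,
    # and gets 0 when it gives up.
    bluff_ev = (1 - HERO_CALL_FREQ) * POT - HERO_CALL_FREQ * BET
    air_ev = bluff_freq * bluff_ev
    return VALUE_FRAC * value_ev + (1 - VALUE_FRAC) * air_ev


for bluff_freq in (0.0, 0.5, 1.0):
    print(bluff_freq, villain_range_ev(bluff_freq))
```

Every bluffing frequency prints the same EV, so never bluffing gains Villain nothing as long as Hero's strategy stays put.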

This brings another question for me, that might answer the last one: Is
there any range stronger than the opponent's range in GTO, at any time
in a hand, if we suppose both players are playing perfectly against each
other? If there is no polarized vs. bluff catching range on the river,
then there is no problem anymore.

There may or may not be any pure polar vs bluff-catcher situations in the GTO play of the real game, but it's still a good situation to be familiar with if for no other reason than that it's a close approximation of real river situations that come up a lot.  Anyway, the PvBC river situation is a well-defined game that we can talk about solving, and I think the problem you refer to stems from a misunderstanding.


June 10, 2013 | 7:37 p.m.

You've been told the actual definition of GTO a bunch of times ITT.

June 6, 2013 | 4:18 p.m.

Looks good to me.

In addition to the "neither player can deviate to improve his EV" definition of equilibrium, there's an equivalent, possibly more intuitive way to say the same thing -- equilibrium is when both players are playing maximally exploitatively at the same time.  And BigFizch did a good job showing how both players' attempts to play as profitably as possible actually led to balanced/equilibrium play in this simple river situation.
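Here is a rough sketch of that "both maximally exploitative at once" check for a toy polar vs bluff-catcher river spot. Everything specific is an assumption for illustration: pot of 1, pot-sized bet, Villain's range half value and half air, value hands always bet, and Hero's bluff-catcher wins the pot whenever Villain checks. The function names are hypothetical.

```python
POT, BET, VALUE_FRAC = 1.0, 1.0, 0.5  # illustrative assumptions


def villain_ev(bluff_freq, call_freq):
    """Villain's EV over his whole range."""
    value_ev = POT + call_freq * BET
    air_ev = bluff_freq * ((1 - call_freq) * POT - call_freq * BET)
    return VALUE_FRAC * value_ev + (1 - VALUE_FRAC) * air_ev


def hero_ev(bluff_freq, call_freq):
    """Hero's EV with a bluff-catcher that beats air and loses to value."""
    bluff_mass = (1 - VALUE_FRAC) * bluff_freq        # air that bets
    check_mass = (1 - VALUE_FRAC) * (1 - bluff_freq)  # air that gives up
    call_ev = bluff_mass * (POT + BET) - VALUE_FRAC * BET
    return check_mass * POT + call_freq * call_ev


def is_equilibrium(bluff_freq, call_freq, tol=1e-9):
    """True if neither player gains by changing only his own frequency."""
    # EV is linear in each player's own frequency, so this grid (which
    # includes the endpoints 0 and 1) finds the best response.
    grid = [i / 20 for i in range(21)]
    villain_best = max(villain_ev(b, call_freq) for b in grid)
    hero_best = max(hero_ev(bluff_freq, c) for c in grid)
    return (villain_best <= villain_ev(bluff_freq, call_freq) + tol
            and hero_best <= hero_ev(bluff_freq, call_freq) + tol)


print(is_equilibrium(0.5, 0.5))  # balanced bluffing vs indifference calling
print(is_equilibrium(0.0, 0.5))  # Villain never bluffs, Hero still calls 50%
print(is_equilibrium(0.5, 1.0))  # Hero always calls
```

With these assumed numbers, only the first pair passes. In the second, Hero could do better by folding every time. In the third, Villain could do better by never bluffing.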

So, of course GTO/unexploitable play arises out of the desire to make as much money as possible.  That's the point of poker, so a theory of how to play the game that didn't try to do that at some level would be a pretty crappy theory.  GTO play is your most profitable strategy when your opponent is playing as profitably as possible also.



June 5, 2013 | 8:26 p.m.

Michael: np, glad to help :)

Aleksandra: yea, that was the Wikipedia page for Nash Equilibrium.  To see that the authors of MoP use "optimal" to mean the same thing, see the top of page 103 in the book.  For a discussion on this terminology (to which Jerrod Ankenman contributes), see the terminology sticky in the Poker Theory subforum on 2p2.


June 5, 2013 | 2:10 a.m.

Well, it's true we don't know the actual equilibrium of the whole game, but that doesn't mean it's not a useful concept in theory.

As far as practice goes, there's no way to find a perfect maximally exploitative strategy with any certainty either.  We have to make simplifying assumptions, approximations, etc, and do our best.  Similarly, we can find equilibria of approximate games which are simplified by various assumptions and can often gain a lot of insight by studying these. 

In fact, the question of whether we have anything to gain by changing our strategy is the same as the question, are we currently playing as exploitatively as possible?


June 5, 2013 | 1:32 a.m.

First two sentences of http://en.wikipedia.org/wiki/Nash_equilibrium :

In game theory, the Nash equilibrium is a solution concept of a non-cooperative game involving two or more players, in which each player is assumed to know the equilibrium strategies of the other players, and no player has anything to gain by changing only his own strategy unilaterally.
If each player has chosen a strategy and no player can benefit by
changing strategies while the other players keep theirs unchanged, then
the current set of strategy choices and the corresponding payoffs
constitute a Nash equilibrium.

edit: where do you see it otherwise?


June 5, 2013 | 1:07 a.m.

So, think about it like the word "work" in the physics community.  It has a well-defined, technical meaning: force times distance.  Sometimes this definition does not agree with people's intuition about what the word should mean.  For example, if you carry a weight across a room, such that the force you exert on the weight is always vertical (to support the weight) but the direction you move it is always horizontal (across the room), then you have done 0 work on the weight.

But it was heavy, and you carried it all the way across the (long) room!  Doesn't matter.  It turns out that "force times distance" is a very useful concept in physics, and someone gave it the word work, and that was that.  Now, whenever we say work in physics, we're referring to the very specific concept that may or may not always match the colloquial use of the term.  You definitely shouldn't try to make up your own definition in physics for the word "work" which already has a widely-used and precisely-defined meaning.
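The carried-weight example can be checked in a couple of lines. This sketch treats "force times distance" as the dot product of a constant force with a straight-line displacement. The 98 N supporting force and the 10 m room are made-up numbers.

```python
def work(force, displacement):
    """Work done by a constant force: dot product of force and displacement."""
    return sum(f * d for f, d in zip(force, displacement))


# Components are (horizontal, vertical), in newtons and metres.
supporting_force = (0.0, 98.0)  # purely vertical, holding the weight up
across_the_room = (10.0, 0.0)   # purely horizontal movement

print(work(supporting_force, across_the_room))
```

The result is zero however heavy the weight or long the room, because the force has no component along the direction of motion.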

In The Mathematics of Poker (and some but not all game theory literature), optimal means Nash equilibrium.  As this thread proves, the terminology confuses people in the poker community.  So, "game theory optimal" is useful, since it's less likely than with "optimal" that people will see it and think they understand what the writer means when they really don't. 

But anyway, Nash equilibrium has a very specific meaning: a set of strategies such that no player can increase his EV by unilaterally deviating.  This is a cold, hard, dry definition.  There's no touchy feely qualitative stuff like "defensive", "playing not to lose", etc.  But it turns out that, in the case of a 2-player zero-sum game like HUNL, this is enough to guarantee some pretty strong properties: in particular, at least break-even expectation when our EV is averaged over both positions.
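To make the definition concrete, here is a small sketch on a toy 2x2 zero-sum game. It is not poker: the payoff matrix is invented for illustration, and the "row" and "column" seats stand in for the two positions. It checks the no-unilateral-deviation condition and then averages EV over both seats.

```python
from fractions import Fraction as F

# Row player's payoffs; the column player gets the negative (zero-sum).
A = [[F(2), F(-1)],
     [F(-1), F(1)]]

PURE = [(F(1), F(0)), (F(0), F(1))]  # the two pure strategies


def ev(x, y):
    """Row player's EV when row mixes with x and column mixes with y."""
    return sum(x[i] * A[i][j] * y[j] for i in range(2) for j in range(2))


def is_nash(x, y):
    """True if neither player can gain by deviating unilaterally.

    Checking pure deviations is enough, since a mixed deviation can do
    no better than the best pure strategy it mixes over.
    """
    value = ev(x, y)
    row_ok = all(ev(p, y) <= value for p in PURE)
    col_ok = all(-ev(x, p) <= -value for p in PURE)
    return row_ok and col_ok


x = y = (F(2, 5), F(3, 5))  # equilibrium mix for this particular matrix
print(is_nash(x, y), ev(x, y))

# Playing the equilibrium once in each seat: +value as row, -value as column.
print((ev(x, y) + -ev(x, y)) / 2)
```

The row seat is worth 1/5 per game here, so one seat is simply better than the other, much like having position. Averaged over both seats, two equilibrium players break even.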

The Nash equilibrium may not have all the properties you want for your strategy.  In that case, tell us what properties you want, and then, hopefully, show that such a strategy exists and show us how to find it.  And give it a name.  But not GTO, because that name's already taken. :)

June 5, 2013 | 12:33 a.m.

As far as I know, the terminology "GTO" started after The Mathematics of Poker came out.  The authors used the word "optimal" to refer to minimax (i.e. Nash equilibrium) strategies.  But "optimal" confused people since it seems like an optimal strategy should be one that is as profitable as possible.  To make things even more confusing, the equilibrium strategy and the most profitable strategy are the same under certain conditions, but they're not the same concept in general. 

So, the terminology "game theory optimal" (GTO) was born to describe strategies that MoP just called optimal, i.e. Nash equilibrium strategies.  So that's my understanding of where "GTO" came from.  It's not used in math or economics literature, but in poker nowadays, it refers to Nash equilibrium strategies.

Now, "Nash equilibrium" has a very specific, technical definition, and equilibrium strategies have well-defined properties.  So, GTO isn't really a word that you can use to mean whatever you want.  It isn't just the best strategy to play as informed by a study of game theory.  An equilibrium is a set of strategies, one for each player, such that no player can improve his EV by unilaterally changing his own strategy.

That said, I really like your point that there's no essential conflict between exploitative and balanced play, and I agree that the problem of coming up with exploitative strategies such that your opponent does not realize he is being exploited is very interesting.  Thanks for the interesting article.

Cheers


June 4, 2013 | 7:24 p.m.
