PIO Experiments


Sauce123

POSTED Mar 22, 2019

Ben Sulsky, aka Sauce123, begins by responding to a question from his previous video thread, using PIO to check his line, before diving into some interesting spots with PIO.

36 Comments


Zachary Freeman 6 years ago

I like videos like this where we take Pio results and discuss the difficulty of applying them. Thanks. Haven't watched a vid in quite a while.

On the A73 experiment, we compared varying IP cbet strats and the resulting EV losses where OOP was clairvoyant to IP's strat and max exploiting, correct?

It might have been useful and interesting to also include what amount of EV loss the varying non-optimal strats would have experienced if we locked OOP flop turn river strat to the original GTO strat. I'd imagine these losses are closer to what we would likely encounter and would be smaller losses.

Sauce123 6 years ago

Zach, every hand in IP's range except 77/A7 were mixed strats, so if OOP's strat is locked to GTO IP can click buttons.

Zachary Freeman 6 years ago

Zach, every hand in IP's range except 77/A7 were mixed strats, so if OOP's strat is locked to GTO IP can click buttons.

I'm either misunderstanding your comment or the theory. Why can he "click buttons"? I get that his mistakes won't be maximally exploited but for example, if IP cbets 1/3p 100% freq, he is going to face a xr ~30% (OOP GTO) with many hands too weak to continue that he would have checked if playing GTO? Certainly, if OOP was playing max exploit his xr freq would have risen to better exploit the too weak cbet range but the mistake in IP cbet strat will still be punished by folding hands that were supposed to check. IP will experience ev loss compared to playing GTO. And I'd imagine it would be closer to the level of ev loss IP would encounter given no villain is likely to play an accurate max exploit strat.

Markus 6 years ago

I approached this situation the same way Zachary describes. My reasoning: the GTO solution is unexploitable, so if our response is not perfectly GTO but only the best we humans can achieve, deviating slightly from GTO, Villain should still gain EV in a multistreet game where he is closer to equilibrium than we are.

So as a training to get closer to GTO this was a valid way of trying to "plug in" humanly playable ranges and figure out how close I can get to what the solver is suggesting.

Why do you think this is not an appropriate way of becoming more knowledgeable on this topic?
Do you think that, because people accidentally or intentionally fail to play the equilibrium solution, they could be exploiting us in a way the solver cannot, since we nodelocked a solution that does not perfectly exploit our deviations?

I do feel like I/we are missing something, but at the same time this is the closest I have come to a way of learning how to have c-betting ranges that are as perfect as possible.
And since all of my results were worse than the best playable simplification of bet/check range X% of pot, I went back to trying to be more precise in this way and still am wondering if playing mixed strategies would gain myself more EV

Samu Patronen 6 years ago

I get that his mistakes won't be maximally exploited

They won't be "exploited" at all due to OOP being locked to a GTO solution.

the mistake in IP cbet strat will still be punished by folding hands that were supposed to check.

No it won't, and that is demonstrated by the EV numbers of our strategy against the equilibrium strategy: we're actually just indifferent with virtually every hand. The only reason we're forced to care about our frequencies is the fact that OOP can actually exploit us if we go too low or too high with our frequencies. Therefore locking OOP to playing GTO allows us to do whatever we want with every mixed strategy hand.

Zachary Freeman 5 years, 11 months ago

we're actually just indifferent with virtually every hand. The only reason we're forced to care about our frequencies is the fact that OOP can actually exploit us if we go too low or too high with our frequencies. Therefore locking OOP to playing GTO allows us to do whatever we want with every mixed strategy hand

Zach, every hand in IP's range except 77/A7 were mixed strats, so if OOP's strat is locked to GTO IP can click buttons.

Right! It took two of you explaining it before I saw it.

So, is it the case that only combos that we deviate from pure bet pure check will affect ev vs locked strat? Even when the freq is fringe like 99.5% bet 0.5% check. In practice we treat that as pure bet but pio would show no ev loss vs a locked strat if we changed it to 100% check, correct? But would show large ev losses if opponent can adjust.

Samu Patronen 5 years, 11 months ago

Yes, that seems correct to me. Mixed strategies exist in equilibrium merely because, if we played another way and used different frequencies, PIO could change its strategy and make more money against what we're doing somehow. But once the strategy is set in stone, it turns out that many hands are indifferent, especially on the flop, so we can actually "click buttons" with all of the mixed strategy hands, assuming no exploits from the opponent, and that is correct in PIO results because PIO is not doing any exploits after the fact.
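The indifference point above can be illustrated with a toy 2x2 zero-sum game. This is only a sketch: the payoff matrix is hypothetical and stands in for "bet" vs "check", it is not taken from any PIO output. Against the opponent's locked equilibrium mix, every mix of the two actions earns the same EV; against an opponent who is free to best-respond, the pure strategies lose EV.

```python
# Hypothetical payoffs for the row player (think "IP"): rows = bet, check.
# Columns are the opponent's two responses. Zero-sum, so OOP gets the negative.
PAYOFF = [[2.0, -1.0],
          [-1.0, 1.0]]

# Equilibrium mixes for this specific matrix (each side plays action 0 40% of the time).
GTO_ROW = [0.4, 0.6]
GTO_COL = [0.4, 0.6]


def ev(row_mix, col_mix):
    """Row player's EV when both sides play the given mixed strategies."""
    return sum(
        row_mix[i] * col_mix[j] * PAYOFF[i][j]
        for i in range(2)
        for j in range(2)
    )


def ev_vs_best_response(row_mix):
    """Row player's EV when the opponent picks the most damaging pure response."""
    return min(ev(row_mix, col) for col in ([1.0, 0.0], [0.0, 1.0]))


if __name__ == "__main__":
    for bet_freq in (0.0, 0.4, 1.0):
        mix = [bet_freq, 1.0 - bet_freq]
        print(
            f"bet {bet_freq:.0%}: "
            f"vs locked GTO = {ev(mix, GTO_COL):.2f}, "
            f"vs max exploit = {ev_vs_best_response(mix):.2f}"
        )
```

In this toy game, always betting and always checking both earn 0.2 against the locked mix, the same as the equilibrium mix, but drop to -1 once the opponent is allowed to adjust.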

therapist 6 years ago

I like this kind of video. I did my own exploration (same board, similar ranges). Some interesting things I found...

IP having a forced check outperforms a forced 1/3 cb. Also, I tried to make the forced cbet have the same betting volume as the optimal strat. In my case this was 16.5% pot. The result was a 0.65% of pot EV loss. But the EV-maximizing size for a forced cb was 12% pot, which achieved around 0.5% pot loss. That outperforms your GTO approximation.

My question is why we never see a 12% range cbet strat in game? Am I a genius pioneer or is there something I am missing?
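A minimal sketch of the "same betting volume" idea, assuming betting volume means the sum of frequency times bet size over all sizes in the strategy. The mixed strategy below is hypothetical and chosen only so the arithmetic lands on 16.5% pot; it is not the strategy from the sim described above.

```python
def betting_volume(strategy):
    """Sum of frequency * size (size as a fraction of pot) over all bet sizes.

    `strategy` is a list of (frequency, size) pairs; checks contribute nothing.
    """
    return sum(freq * size for freq, size in strategy)


# Hypothetical mixed strategy: bet 33% pot half the time, check otherwise.
mixed = [(0.5, 0.33)]

# A forced range bet (frequency 1.0) with the same volume uses this size.
range_bet_size = betting_volume(mixed) / 1.0

if __name__ == "__main__":
    print(f"range-bet size with equal volume: {range_bet_size:.1%} of pot")
```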

simrud 6 years ago

This is very interesting. I have seen very small sizes picked a lot when allowed in sims down to min bet. I think with enough computing power a bot does use very small and very big sizes (up to like 4x pot). It is probably a question of human implementation.

Markus 6 years ago

Very nice, I will try to follow this route for 3betpots trying to be more accurate.
For SRPs tho for example it is not possible to bet <20% due to only roughly 5bb being in the pot.
Thanks for this interesting idea!

Demondoink 6 years ago

therapist I was watching a HU recording between Linus and Trueteller and Linus was using a 10% c bet sizing on Axx boards in 3 bet pots. so it seems like you are second behind the GOAT with this idea :P

Samu Patronen 6 years ago

I also recall someone using a super small cbet on some Axx type of board in some old Tyler Forrester video, I don't remember who it was but I recall Tyler saying that he was one of the biggest winners at the given stake at the time. Even more interestingly the guy kept betting super small on the turn. Tyler had like a J9-high or something and seemed a little frustrated. :D

Demondoink 6 years ago

Samu Patronen I haven't seen that video (or at least if I have I can't remember it) but I guess it is these little things that they have studied and implemented successfully in to their respective games that allow them to become the biggest winners at a given stake.

and yeah likewise, Linus also followed up that tiny flop c bet with another very small bet on the turn. those small bets are actually pretty common on most board textures, especially in 3 bet pots, but most villains either don't include the small bet option in their PIO sims, want to simplify their strategy and only bet one/two turn sizings with a predominantly polarized range, or haven't thought of this idea yet.

you are also meant to use small sizings when something like a flush card comes in on the turn/river, but again, the majority of players are not doing this.

DF_Newb 6 years ago

Skipped around a bit so I apologize if this was mentioned but it rounds the cbet up to the nearest whole number. In the first example the cbet is 2 into 5 or 40%. Not sure how much this changes things. I accidentally did this last week in a huge script and had to scrap the whole thing.
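The rounding effect described above is easy to check by hand. The sketch below assumes bets are rounded to the nearest whole unit; the helper name `rounded_bet` is hypothetical and not part of any solver API.

```python
import math


def rounded_bet(pot, fraction):
    """Return (bet in whole units, effective fraction of pot) after rounding.

    Rounds half up to the nearest whole unit, which is an assumption about
    how the tree builder treats fractional bet sizes.
    """
    bet = math.floor(pot * fraction + 0.5)
    return bet, bet / pot


if __name__ == "__main__":
    bet, effective = rounded_bet(5, 1 / 3)
    print(f"requested 1/3 pot into 5: bet {bet}, effective size {effective:.0%}")
```

With a pot of 5, a requested 1/3-pot bet becomes 2 units, so the sim is really studying a 40% pot bet.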

Markus 6 years ago

This is exactly the experiment I have done before, wondering if simplifying strategies but camouflaging the stats with having pure Checking and Pure bettingboards (at least in SRP vs BB situations) would be better than trying to play mixed strategies that would perform worse than what the best pure strat would have as a result.

Additionally, we are simplifying our game tree by one node while villain still thinks we are playing with multiple options OTF.

So now getting to my question:

I think the amount that we get exploited on future streets comes from the concentration of certain hands in our range, which the solver is able to pick up on: it gives us less value on specific cards that we hit and maximizes its pot share on cards where its range performs better than ours. Which obviously is a terrible result.

Nonetheless, our opponents are never ever going to figure out what exact parts of our range we are "overcbetting" and which we are "undercbetting" and so in reality our opponent, if he tries to exploit certain things, is rather going to be imbalanced himself than being able to exactly punish us for those "imbalances"

I am putting these words in quotes, because I think for the solver we are for sure imbalanced, but in the grand scheme of range vs range interaction and frequencies, nailing what kind of hand is supposed to bet at what frequency should be our aim and is what makes us unexploitable in reality.

That is shown, in my humble opinion at least, in that you, as the great player you are and with the incredible amount of solver experience and equilibrium knowledge you have, can only pin down the approximate ranges with a 0,9% inaccuracy while in reality you are crushing because your frequencies with the hand-groups you chose are spot on due to the conceptual knowledge you have gained with observing dynamics in the solver.

So basically, I think this is a very, very useful training exercise that one can use to figure out which hands belong in which "category", and to try to get a better result than any simplified strategy that drops one node from the tree for the sake of more correct maneuverability on future streets. Still, I don't believe this grouping and estimating of how to play each part of the range results in the EV loss shown.

I am very curious about what you are thinking about my estimations and look forward to a response. Thanks for the insightful content :)

Sauce123 6 years ago

I don't have a strong opinion about this. On the highly mixed boards like A73s I'd expect the best explo players to have a large edge.

Markus 5 years, 11 months ago

2 Thoughts that I had:

1st) I think we can figure out whether we would really get exploited by figuring out how weird and unnaturally connected the cards are where the EV shifts. When it is, for example, that the EV and EQ distribution shifts on two very similar cards differently, then it is likely that we will not get exploited by anyone in reality as the solver would.

2nd) You have figured out that betting half your range has more EV than betting range. Since our range is structured exactly the same under either simplification, this is just an indicator that betting range simply has less EV than checking range, and therefore the EV changes. Checking range would therefore have more EV than playing a 50/50 strategy.

So basically what I wanted to figure out with my question above and where I think you have an intuitive answer that does not have to be correct, but I am simply curious on what you think:
Do you think it is higher EV to try to imitate solverranges or would you think a somehow camouflaged, explored simplified strategy would have a higher EV [if like in the given scenario we have less EV with an imitating strategy than with a simplified one].

I feel like you are for imitating because it is tougher to play against us (as stated in the video) but I would believe that is the same with a simplified one as long as villain does not know that we are doing it.

Maybe your answer is still the same, then sorry for bothering, but I thought by rephrasing it i might be able to get an opinion :P

Thanks for the great content, it really inspired me to work in solverland a bit more again

Markus 5 years, 11 months ago

I have now replicated this experiment and suits are (unsurprisingly) very important and I can get the difference between GTO and my assumptions down to -0,4% with mixed strats and -0,6% without any mixes.

I don't think I will be able to execute suitmixed going beyond 50% Bet/ 50% Check but I think this experiment will very likely be one of the best possible traininggrounds for being able to execute complex Cbettingstrategies and getting to know equilibrium-dynamics.

I will now compare these results with other 3BP as SBvsBU and BBvsSB to figure out what could be different there.

Very interesting to try to figure out in what areas simplified strategies could be executing better and in which we should get our brain working. Thanks for the inspiration

Demondoink 6 years ago

really enjoyed this video. was almost like a little thought experiment with PIO involved, which opens your eyes to the errors and opportunities for EV gains that we all have in abundance in our games.

fwiw I think a ton of players, even at 500z, end up c betting any Ax board at close to 100% frequency. so I guess I should be using the node locking feature much more efficiently :P

tdeans 5 years, 11 months ago

Hi Ben,

Did this exact same experiment before but drew different conclusions... after running the forced c bet IP, I ran another sim including a node lock for the pool's OOP response strategy. Pretty much 99% of villains aren't going to find the 33% xr required against the c bet range strategy imo. If you input the actual xr range of everybody excluding like 10 guys in the world, then c bet range pretty much always comes out with a lower ev loss. I don't think the meta in the pools goes much deeper than this, i.e. when studying, people seem to mostly only have time for going through their original sim and not a bunch of node locking. IMO most people skew passive also.

In the past I tried to employ a high x back strategy as you have sometimes advocated (as I agreed that people seem to play much worse in this line), but generally it led to pretty bad results over a large sample (although this could completely be down to execution). I then compared my database to the top guys and saw my c bet was 10-15% lower than otb linus and bigblingbets (I made sure to only include their 6 max numbers), who if I remember correctly were in the 70-80% range. I think I saw every high stakes reg had something like this. This led me to go pretty full circle and come back to a maybe overly simple but I think pragmatic solution: c bet range in a bunch of these spots, but then play like a savage vs population as the preflop defender (at least in a vacuum).

I also ran some deep sims, and even on boards where IP might have a very high x back, e.g. J89s in a HJ vs CO 3 bet spot, after node locking and seeing people's responses, I decided in a vacuum that just betting range was fine, even though I think it lost 3% of the pot if OOP responded correctly.

I guess my question is just this: What are your thoughts philosophically on this whole approach?

Appreciate your time. Thanks.

Sauce123 5 years, 11 months ago

I think you've chosen a very sharp way to think about it.

In my very limited experience playing NL lately, I agree with your read that the population isn't finding the exploitative high freq XR counters to high freq cb from IP. I also agree "playing like a savage" against their forced cb nodes is likely to produce a higher than GTO WR (you'll notice in my recent videos I'm playing very aggressively vs CB).

I disagree with the plan to cb a high freq even on boards like J98 HJ vs CO 3bp. 3% of the pot just seems like way too much EV loss to me.

I do think it's possible to execute X behind strategies that perform as well or better than high freq cb strategies, but it's difficult. I encourage you to look at some population tendencies for fold to delay cb and fold to double delay cb freq across various runouts and I think you'll find that they compare favorably to the deviations from GTO you're seeing in fold vs flop cb.

simrud 5 years, 11 months ago

IMO there are 2 main things in play here that are both very good but hard to say which is better:
1) Population does not find nearly enough c/r (and often folds too much also) so range betting just cannot be beat from an equity realization perspective
2) Population plays terrible vs delay lines (like all of them really) and makes huge mistakes often in this part of the game tree
Which one is better is hard to say. I think some villains will make more mistakes vs one rather than the other however.

discr3tion 5 years, 9 months ago

That's helpful to hear that the high freq XR counter is effective. I watched Phil's older MTT video How I Exploit You on raising flops and was pretty sure that it would apply to today's cash games as well, and now I can confidently proceed to employ the strat.

tdeans 5 years, 11 months ago

ps do you have any tips from your experience on how to handle feeling lost philosophically with your strategy/study/pio ?

Sauce123 5 years, 11 months ago

This is a video+ length topic, but I'll spout a couple of platitudes that perhaps will be useful :p

I think ~90min of engaged study per day, extended over the course of years, is sufficient, and a lot more effective than lots of hours and lots of breaks.

You want your study to create a feedback loop between emotionally engaged training of high EV decisions and learning. An easy way to do this for me is to look at a variety of nodes I've played over the course of a session and then I'll do some science to try and determine how I could have played better, and then next session I'll try and implement and extrapolate those conclusions to similar situations. I then iterate this process each session I play and tend to get better over time.

Julian Kopanskiy 5 years, 11 months ago

Hi Ben, while doing node locking, you should select fixed for the part of the tree you were editing. If you leave it proportional, the frequencies will end up being a bit lower than you wanted.

Zachary Freeman 5 years, 11 months ago

I ran the following Pio solve (it was one of the defaults provided with Pio). I then locked IP to bet small on flop 100% with all hands except 22 which I locked to bet large 100% given it was the only combo with a pure strat. I locked OOP to GTO for flop strat.

I'm getting an ev loss of 0.501 or 0.9% pot.

Last, I ran the sim with locked IP but OOP unlocked and not surprisingly IP had the largest loss in this scenario of 0.93 or 1.7% pot.

Conceptually I understand the reasonings you guys gave for why we shouldn't encounter EV losses when OOP is locked. However, the solve is showing the exact type of results I originally anticipated which is some EV loss but not as much as when OOP can adjust.

Thoughts why these results are the output?

SOLVER: stopped (requested)
Results:
EV OOP: 21.149
EV IP: 33.851
OOP's MES: 21.289
IP's MES: 33.997
Exploitable for: 0.143

SOLVER: stopped (requested)
Results:
EV OOP: 21.650
EV IP: 33.350
OOP's MES: 21.792
IP's MES: 33.491
Exploitable for: 0.141

SOLVER: stopped (requested)
Results:
EV OOP: 22.080
EV IP: 32.921
OOP's MES: 22.249
IP's MES: 33.041
Exploitable for: 0.145
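The percentages quoted above can be reproduced from the three result blocks. This sketch assumes the blocks are, in order, the unlocked equilibrium, the run with IP locked and OOP's flop locked to GTO, and the run with IP locked and OOP unlocked; that ordering is inferred from the quoted losses, not stated in the output. The pot is taken as EV OOP plus EV IP in the first block.

```python
# (EV OOP, EV IP) copied from the solver output above.
# The scenario labels are an assumption about which block is which.
RESULTS = {
    "equilibrium": (21.149, 33.851),
    "IP locked, OOP flop locked": (21.650, 33.350),
    "IP locked, OOP unlocked": (22.080, 32.921),
}

BASE_OOP, BASE_IP = RESULTS["equilibrium"]
POT = BASE_OOP + BASE_IP  # total EV up for grabs


def ip_ev_loss(scenario):
    """Return (absolute EV loss for IP, loss as a percentage of the pot)."""
    _, ip_ev = RESULTS[scenario]
    loss = BASE_IP - ip_ev
    return loss, 100.0 * loss / POT


if __name__ == "__main__":
    for name in RESULTS:
        loss, pct = ip_ev_loss(name)
        print(f"{name}: IP loses {loss:.3f} ({pct:.1f}% of pot)")
```

Under these assumptions the pot is 55, and the two locked runs come out to losses of 0.501 (about 0.9% of pot) and 0.930 (about 1.7% of pot), matching the figures in the post.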

Samu Patronen 5 years, 11 months ago

OOP can still adjust to our range on the turn and on the river, thus the EV loss. What you suggested originally was to lock flop, turn and river.

That said, we don't have to lock all streets and all runouts to figure out the EV of our strategy, because it is already demonstrated by the mixed options having the same EV. When solving for equilibrium and looking at the results, PIO has effectively been locked to GTO already, because it doesn't adjust after the sim is done and the results are out.

Zachary Freeman 5 years, 11 months ago

Samu Patronen By only locking OOP's flop strat, OOP had his blinders on for flop reactions to IP's choice to bet 100%, but on the turn he sees the exact makeup of IP's range (which we can assume is too weak/wide, since all of OOP's x/f's are removed given there's been postflop vpip on both sides). Now, even though both players can adjust, IP will experience some losses given he is showing up with a weaker range than he should have. Agree?
Whereas, if I locked every node of OOP strat for turn and river he would still be proceeding as if he knew nothing except PF ranges and ev results wouldn't change.

I'd like to run that sim and lock every node if I have the time to confirm.

The part that gets a little iffy is when we see stuff like 120% pot at 0.02% for a combo like J5s on J72dd. Pio would show no ev loss overbetting that combo 100% but in reality that 0.02% might be close enough to 0 that overbetting is supposed to not be a strategic option.

Markus 5 years, 11 months ago

But that's the exact reason why you should not node-lock all streets with mixed strategies.
The moment somebody deviates slightly from the solver's strategy, your 100% J5s overbet probably becomes catastrophic, without anybody even trying to exploit something. And on the other hand, if you are overbetting at more than the given frequency, which is probably very low, this node in your tree becomes way more substantial to your EV than it was before.

So basically, a part of your tree that was negligible is now becoming far more important. Therefore the EV might change, but not because you found a more or less sound strategy. It changes because a part of the game tree that might still have some solving inaccuracies, or that has a higher/lower EV because of the way the solver was balancing multiple strategies, has now shifted to become a bigger part of the game tree.

And the solver would therefore change its whole strategy in response, because you now have a lot of inaccuracies in your strategy (talking about your extreme example), but not letting it do that will just destroy the whole perception of what is possible to play in that spot.

I just think for what you want to figure out the solver is probably not the right tool.
I understand what you are trying to achieve and what you want to figure out and I have had that thought before as well, but I just think trying to estimate strategies and getting closer and closer to equilibrium without mixing too much giving one a better understanding of how to construct ranges in certain spots is the best way to get better at this.

And for stuff vs different assumptions or player types, you can nodelock the opponent's reaction and figure out what you have to do differently vs certain player types.
And you can still run some simplifications like betting range and figure out if this is better than your estimations and use that, which will be useful.

And if you are still far off, you can compare your result with the optimal result and figure out in what way your estimation can get exploited. That is basically the solution I have come up with, because what I tried to achieve with the solver was not really possible in the way you described and I wanted to do as well.

Hope that helps a bit
