by James Grosjean
Count me among the Netflix drones who loved The Queen’s Gambit (2020), but I’ve always been a chess enthusiast. During my college years, I probably ate a thousand chocolate croissants while watching the quirky, magnificent Murray Turnbull (aka “The Chess Master”) take on all comers in the town square—“$2, refund if you win or draw.” It was my honor to capture a photo of the great Karpov framed by the stained glass of Memorial Hall when he did a 40-board simul on campus. I was part of the student press when Kasparov made his then-controversial statement that a computer would be grand champion before a woman would be.
Saving the debate over Kasparov’s possible misogyny for another forum and another day, I took his statement as merely a projection based on empirical observation of the chess community. Female participation has always been low, and not meaningfully increasing, while the computers were already strong, and rapidly getting stronger. The machines will usher in a new equality—where all genders get crushed like ants.
Zermelo’s Theorem tells us that a finite game (the game WILL end after some number of moves) with full information (both players can see all the pieces on the chessboard) has a solution, and that if both sides play this optimal solution, then every game has the same result. Chess is complicated enough that we’re not sure what the result would be, but we think that White would win every time, in which case there is no Black response that can change the outcome. The game of Connect Four also falls under Zermelo’s Theorem, and the analysis has determined that in that game, the sneaky sis always wins if she goes first and plays optimally.
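The backward-induction logic behind Zermelo's Theorem can be sketched on a toy game. This is my own illustration, not anything from the article: a single-pile take-away game where each player removes 1, 2, or 3 stones and whoever takes the last stone wins.

```python
from functools import lru_cache

# Toy illustration of Zermelo's Theorem: a finite, full-information game
# (take 1-3 stones; whoever takes the last stone wins) is solved by
# backward induction, and optimal play on both sides fixes the outcome.

@lru_cache(maxsize=None)
def first_player_wins(stones: int) -> bool:
    """True if the player to move wins with optimal play."""
    if stones == 0:
        return False  # no move available: the player to move has lost
    # A position is winning if SOME move leaves the opponent in a losing one.
    return any(not first_player_wins(stones - take)
               for take in (1, 2, 3) if take <= stones)

# With optimal play the result is predetermined before anyone moves:
# the first player wins unless the pile size is a multiple of 4.
for n in range(1, 9):
    print(n, first_player_wins(n))
```

Chess and Connect Four are this same computation at an astronomically larger scale, which is why we know Connect Four's answer but can only guess at chess's.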
If you are an AP who liked QGambit, and are starving for more content during this never-ending pandemic, then your next assignment is to watch AlphaGo, a documentary about the rise of computers in the ancient game of Go, which is more complicated than chess. Not kidding, this movie is a tear-jerker for people who are interested in this field and appreciate the intense emotional drama for humanity’s champion, Lee Sedol—the best to ever do it.
The film captures Sedol’s distress, courage, brilliance, then humility, as he realizes that this match against the machine isn’t just a game, but the emergence of a new world order. Not a programmer, Sedol didn’t appreciate what he would be up against, but as an expert at his craft, on the board he could feel the relentless, impenetrable weight of his opponent.
After you enjoy AlphaGo, I recommend (actually, Google recommends) that you watch some of the poker match between old-school Dan Negreanu and modern computer-clone Doug Polk. The best player on earth is the machine, so a clever human like Polk emulates the machine’s strategy. Adapt or die.
I once had a brief exchange with Howard Lederer. I asked him about bots on the poker sites. He dismissed the issue by saying: “Poker isn’t like chess. Poker is a game of incomplete information. Computers aren’t good at that.” I couldn’t tell whether he was a naïve fool or a conman shill for Full Tilt Poker. Either way, I didn’t want to continue that conversation 15 years ago. But now here we are, in 2021, and it’s time to continue that conversation, by refuting that first fallacy regarding the GTO (game-theory optimal) computers, and all the other overlapping fallacies that the poker dinosaurs and self-proclaimed poker savants are desperately clinging to:
Fallacy #1: Computers aren’t good at games of incomplete information.
This is just ignorant. It is true that Zermelo’s Theorem does not apply to games like poker. But for poker there are OTHER theorems (Nash’s, most famously) that say there is still a solution to the game, and that the solution will generally involve “mixed strategies,” meaning there is some randomizing component to the strategy (such as throwing Scissors with probability 1/3). Computers are quite good, better than humans, at calculating expectation over probabilistic outcomes, especially when the probability distributions are known precisely, as they are in card games. For example, the computer knows exactly what the probability of completing a backdoor Flush is, and what pot odds it needs to justify chasing it. Though poker involves incomplete information, heads-up no-limit poker is a simpler game than Go, even though Go involves full information (common-knowledge information).
Fallacy #2: The computer’s superiority comes from being able to remember every hand I’ve played, and adjust accordingly.
While an “exploitive bot” would indeed analyze your past play and adjust to perceived weaknesses, a standard GTO bot (which we used to call a “Nash bot”) is the poker equivalent of basic strategy (BS) in blackjack. The GTO strategy does not change, regardless of how you played past hands. It doesn’t need that information, and doesn’t care.
Fallacy #3: The GTO solution is only “correct” if playing against another GTO bot, because that is what was assumed when the bot was developed—the bot “learned” by playing against itself.
This is false. “The bot played against itself to learn poker” is a mischaracterization of the development process. The media likes to hype its clickbait to make every result in computing sound like a generational breakthrough, invoking HAL and Skynet. A GTO bot doesn’t know a thing about poker. Deriving the GTO strategy is an exercise in calculation, made possible by the massive memory and CPU speed available in today’s computers and the development of an efficient algorithm to do the computation (“regret minimization”). We never used to describe the algorithm as “machine learning” or “AI”—we used to just call it “hill climbing” or “maximization” or “optimization.” At each step of the iterative algorithm, the computer has the current strategy under development for each seat at the table, and this current strategy could be popularly described as “itself,” as in: “PokerSnowie plays against itself.” But it’s really just an iteration on its path of climbing the hill to converge at the peak—an optimal strategy for poker. That optimum does not assume any particular opponent. There are other ways we could have computed the solution (though maybe not as fast), and they would be just as valid. This GTO strategy is “The Book” for poker, and it would never be at a disadvantage, regardless of its opponent. There is no strategy that can get an edge against it.
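The "regret minimization" iteration can be shown in miniature on Rock-Paper-Scissors. This is a simplified sketch of regret matching, the core loop behind counterfactual-regret solvers; real poker solvers run a far more elaborate version over enormous game trees, and none of the names below come from any actual solver. The point is that the "self-play" is just an iterative calculation whose average converges to the optimum, here 1/3 on each throw.

```python
import random

ACTIONS = 3  # 0 = Rock, 1 = Paper, 2 = Scissors

def payoff(a, b):
    # +1 win, 0 tie, -1 loss for the player throwing action a against b.
    return [[0, -1, 1], [1, 0, -1], [-1, 1, 0]][a][b]

def strategy_from_regret(regret):
    # Play each action in proportion to its positive accumulated regret.
    pos = [max(r, 0.0) for r in regret]
    total = sum(pos)
    return [p / total for p in pos] if total > 0 else [1 / ACTIONS] * ACTIONS

def train(iterations=100_000, seed=0):
    rng = random.Random(seed)
    regret = [0.0] * ACTIONS
    strategy_sum = [0.0] * ACTIONS
    for _ in range(iterations):
        strat = strategy_from_regret(regret)
        for a in range(ACTIONS):
            strategy_sum[a] += strat[a]
        # "Playing against itself": both throws are sampled from the
        # current strategy under development.
        me = rng.choices(range(ACTIONS), weights=strat)[0]
        opp = rng.choices(range(ACTIONS), weights=strat)[0]
        # Regret: how much better each alternative would have done.
        for a in range(ACTIONS):
            regret[a] += payoff(a, opp) - payoff(me, opp)
    total = sum(strategy_sum)
    return [s / total for s in strategy_sum]

print(train())  # average strategy converges toward [1/3, 1/3, 1/3]
```

Nothing in this loop "knows" Rock-Paper-Scissors in any meaningful sense; it is hill climbing toward an optimum that does not depend on any particular opponent.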
Fallacy #4: The GTO bot assumes I’ll play a certain way, but I’ll trick it by playing my off-suit 72 out of position.
Wrong. The GTO bot doesn’t assume anything about how you play. It doesn’t care. It is unbeatable against ANY opposing strategy. Imagine you have an upcoming fight against Floyd Mayweather, and you say, “Floyd expects me to show up in impeccable physical conditioning. He assumes I’m going to train hard for the next six months. I’ll trick him—I’ll just watch Netflix and eat donuts for the next six months.” Floyd has no idea how much you’ll train. He knows that if he himself shows up in perfect shape, then no opponent can get an edge against him. Does it make sense to say, “The bot assumes I will play well. I’ll trick the bot by playing bad poker!” Yeah, you sure showed them!
Fallacy #5: I found a weakness—when I have such and such, from such and such a position, then the bot should do X, but it does Y.
Wrong. The bot doesn’t have a weakness. You’re looking at a particular hand holding, and a particular result, but based on the likelihood of being in that scenario, and all the possible hands you could hold viewed from the bot’s point of view, its play is correct, and you can’t find a hole there. It is very dangerous to look at a play in isolation. The bot makes moves to balance its ranges, so that you can’t chisel in other situations, or if different cards came on the river. If you don’t see it, then the flaw is in your own poker thinking, not the bot’s.
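The range-balancing idea can be made concrete with the textbook river indifference calculation (a standard game-theory example, not output taken from any bot): a balanced betting range contains just enough bluffs that the opponent's call and fold have identical expectation, so no "weakness" exists either way.

```python
from fractions import Fraction

# On the river the bettor bets B into a pot P. Calling risks B to win
# P + B. The caller is indifferent when the bluff fraction f of the
# betting range satisfies:  f * (P + B) = (1 - f) * B
# which solves to:          f = B / (P + 2B)
def gto_bluff_fraction(pot, bet):
    return Fraction(bet, pot + 2 * bet)

print(gto_bluff_fraction(pot=100, bet=100))  # pot-sized bet: 1/3 bluffs
print(gto_bluff_fraction(pot=100, bet=50))   # half-pot bet: 1/4 bluffs
```

Look at any single bluff in isolation and it may appear "wrong"; averaged over the whole range, the frequency is exactly what makes the strategy unexploitable.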
Fallacy #6: If I play it for a while, I’ll figure out how it plays and find a weakness.
Wrong. There is no weakness. In fact, we could publish the bot’s strategy, and it wouldn’t make any difference. If I tell you that I’m going to play Scissors, Rock, and Paper with probability 1/3 on each, the fact that you know my strategy gives you no ability to get an edge. There is no Achilles heel.
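The Rock-Paper-Scissors claim is easy to verify directly. A minimal check (my own illustration): against the published uniform 1/3-1/3-1/3 strategy, every pure counter-move has expected value exactly zero, and since any mixed strategy is a blend of pure moves, so does every mixed one.

```python
# Payoff to "me" for each (my_move, opponent_move) pair in
# Rock-Paper-Scissors: +1 win, 0 tie, -1 loss.
PAYOFF = {('R', 'R'): 0, ('R', 'P'): -1, ('R', 'S'): 1,
          ('P', 'R'): 1, ('P', 'P'): 0, ('P', 'S'): -1,
          ('S', 'R'): -1, ('S', 'P'): 1, ('S', 'S'): 0}

def ev_vs_uniform(my_move):
    # Expected payoff of a pure move against the announced 1/3 mix.
    return sum(PAYOFF[(my_move, opp)] for opp in 'RPS') / 3

for move in 'RPS':
    print(move, ev_vs_uniform(move))  # each prints 0.0
```

Full knowledge of the strategy buys the opponent nothing, which is exactly what "no Achilles heel" means.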
Fallacy #7: The Heads-Up Limit bots introduced into casinos were highly beatable, so probably GTO bots are as well.
This is not a meaningful comparison. Some of the casino bots were instructed to not play their A game, because it was too strong against average humans. If the casino sets the bot to play its B game, to achieve, say, a 5% edge against most players, then a really good human could have made money against that GTSO bot (game-theory sub-optimal bot). But that’s a different issue. I don’t care who you are: If you play heads-up against PokerSnowie, you will lose.
Fallacy #8: GTO bots can beat weak players, but the bots will have trouble against top opponents like Phil Ivey or Dan Negreanu.
Wrong. A beautiful thing about a GTO bot is that it doesn’t matter who the opponent is. At best, someone could play even with the bot. A ring of GTO bots would be like a sink, with the money flowing clockwise chasing the button, and draining out the center of the table due to the rake. A practical problem that real-world pros used to have was deciding when their skill was sufficient to step up to the next higher stakes available, where the players were presumably stronger. But now, a player who mimics GTO strategy can sit down at any table in the world, at any stakes, and not have to worry about being the fish. At best, the game would be even (outside of the rake), and in practice, a GTO strategy confers a sizeable edge against anyone you’ll encounter in the wild.
Fallacy #9: Dan Negreanu is a longtime poker pro with N bracelets, so he’ll crush computer nits like Doug Polk who don’t understand the nuances of real poker.
If a guy like Polk just memorizes “the charts” and mimics GTO strategy, he doesn’t need to understand a damn thing. He doesn’t need to know what the word “nuance” means. The poker experience of pros like Negreanu is what enabled them to figure out the best play in complicated scenarios. That experience is obsolete now that the computers have simply computed what the right plays are. Perhaps in 1950, a player’s experience enabled him to determine that hitting 14 v T was better than standing. Once the Four Horsemen computed the BS chart, that blackjack experience became irrelevant. The Book renders experience unnecessary.
Now Doug Polk is not a GTO bot. He is a top pro who employs GTO strategies. So, Negreanu’s only shot to beat Polk is if Polk’s emulation of GTO is not accurate, and if the holes are big enough for Negreanu to find and exploit. I doubt it. Another longshot would be if they play live, and if Polk has physical tells that give away information about his hole cards, and if Negreanu can read him that way. Or if Polk has tilt issues and starts to stray from GTO if he has a bad run of cards. Not likely. Or, perhaps the game is short enough that Negreanu gets lucky in a small sample.
After playing against AlphaGo, Lee Sedol elevated his game and started crushing everyone (not that he didn’t already), but then retired from the game! He conceded that he had been bested (what a concept!), and that no human would ever again challenge the best player on earth, AlphaGo.
We will see if Negreanu will have the same epiphany. A recent tweet makes me wonder, because Negreanu seemed to be questioning a bot play, and suggesting that there is a thin line between genius and donkishness. I think he is still hoping that there is a flaw in the GTO strategy. There isn’t. Last I heard, Negreanu was catching up in the contest, and there is some indication that one reason is that, to his credit, Negreanu is practicing with PokerSnowie, and adapting! If Negreanu can quickly learn GTO strategy, then he could level the playing field, which would be a tremendous achievement. The only way a dinosaur can survive is by evolving. We’ll see in 2021.
[Next time, I’ll discuss some of the limitations and weaknesses of the poker bots, unless we thrash them out in the Comments below.]