Next Stop, the WSOP
January 23, 2017 7:26 AM

Carnegie Mellon University has developed an AI to play poker, Libratus. They are having it face off versus four poker pros, and halfway through the competition, Libratus is winning. The game is heads-up, no limit poker, which pits two players face to face, and doesn't constrain their betting, making the game an aggressive mix of math, psychology, and strategy.

This is a rematch from a previous 2015 competition, which Libratus' predecessor (Claudico) lost. Libratus is the work of Noam Brown and Tuomas Sandholm. Here is a conference paper about their poker algorithm, and, a short article about that algorithm's dominance. Poker Night in America [YouTube] featured Claudico's attempt at the competition, and interviewed several of its designers.
posted by codacorolla (52 comments total) 21 users marked this as a favorite
 
Well, humanity is still best at Calvinball
posted by thelonius at 7:28 AM on January 23 [9 favorites]


One thing that struck me about this is that Poker is an imperfect information game: meaning that you do not know the entire game-state the whole time. What remains in the deck, in what order, and what is in your opponent's hand, and what cards have been burned (discarded before being drawn to prevent cheating) are all unknown to you. This is in comparison with Chess, a perfect information game, where the entire game-state is always knowable to all players.

That may be important, because many of the serious, but game-like, things that AI could be turned towards are imperfect information games. For example, buying limited resources at auction (you do not know your opponent's maximum wager), or an election (you do not know every voter's current intention). Really, anything with a heavy social element to it. This AI could be a backbone to serious applications.
posted by codacorolla at 7:36 AM on January 23 [11 favorites]


Just let me know when I can turn on my TV and watch Phil Hellmuth angrily call a computer screen an idiot motherfucker for raising with pocket nines.
posted by delfin at 7:39 AM on January 23 [40 favorites]


I am continuously tickled by the idea that somehow "machines" win when the algorithm beats the human side. What actually happened is some other humans won, by building a tool.
posted by chavenet at 7:51 AM on January 23 [32 favorites]


I'm imagining a WOPR/WSOP crossover where they play a lovely game of Global Thermonuclear Squadoosh with commentary by Lon and Norm.
posted by Molesome at 7:54 AM on January 23 [3 favorites]


Indeed. This isn't man vs. machine, this is man vs. God.
posted by Hatashran at 7:55 AM on January 23 [2 favorites]


Play begins at 11 a.m. each day and ends after 8 p.m. The public is welcome to observe game play, which is in Rivers’ Poker Room.

If I could cope with the cigarette smoke at the casino, I would wander down to check it out; this is only three blocks from home.
posted by octothorpe at 7:57 AM on January 23


It wouldn't be the first imperfect information game an AI has done well at. Bridge robots are very strong - competitive with some of the best humans.
posted by edd at 8:15 AM on January 23 [2 favorites]


I am continuously tickled by the idea that somehow "machines" win when the algorithm beats the human side. What actually happened is some other humans won, by building a tool.

Yes and no. Absolutely, at its base, an algorithm like this is a human invention, and at the core of this thing is an AB-pruning routine that has been refined over the decades since it was first described. Of note, though, is that contemporary AI agents like this are essentially fire-and-forget kind of deals: you wire up your game states and your heuristics, and then you let it crawl over a hundred trillion hands to prime its search tree. The data set that your poker player then ends up with is so mind-numbingly vast and incomprehensible that no human could ever do anything useful with it. If you read the article, you also see the human players griping because they used to be able to huddle up after the first day of the tournament and compare notes about obvious weaknesses in the AI's strategy, and then crush it on Day 2. They can't do that anymore, because Libratus is able to spot its own weaknesses and adapt its strategy to compensate for them. Unless there's some horrible tournament management and reporting happening, I promise you there's no one on the periphery twiddling any knobs to make it do that.
posted by Mayor West at 8:21 AM on January 23 [5 favorites]


Does the machine have access to the pros' online play records?
posted by praemunire at 8:22 AM on January 23 [2 favorites]


This isn't man vs. machine, this is man vs. God.

Man destroys God. Woman inherits the earth.
posted by Mayor West at 8:23 AM on January 23 [3 favorites]


What does a computer bet with?
posted by codacorolla at 8:26 AM on January 23


Microchips.
posted by codacorolla at 8:26 AM on January 23 [59 favorites]


It's worth noting that Heads-Up LIMIT Hold Em has been essentially solved prior to this.

No Limit, especially deep-stacked (where the maximum possible bet is much higher than the minimum bet) is a MUCH more difficult nut to crack, and that's the one Libratus is working on. In fact, it's probably not solvable, in that there is probably not a strategy that can't be exploited by an opponent that knows your strategy. Of course, there is plenty of room for computers to outperform humans without solving the games (cf. chess).

As for 6 to 10 player No Limit Hold Em (which is how the game is normally played), the best bots are still a lot worse than even moderately decent players.
posted by 256 at 8:27 AM on January 23 [9 favorites]


Online gambling market is ~US$50B. That's one way for a sentient AI to bootstrap its financial holdings.
posted by gwint at 8:35 AM on January 23 [4 favorites]


Online gambling market is ~US$50B. That's one way for a sentient AI to bootstrap its financial holdings.

There's a great short story in there somewhere.
posted by jedicus at 8:50 AM on January 23 [8 favorites]


Mayor West is correct, to an extent. However, a number of people (including, apparently, people working with the AI playing Go at Google) have been saying that this case is overstated. The AIs being used are highly-trained expert systems. While there may be a very large number of possibilities that a machine may evaluate where a human could not, an AI has been trained not just by previous game data, but by restrictive rules telling it where not to look.

It's a system that uses novel learning that can both build from human experience and be constrained by the same, to keep it within reasonable bounds. There are still routes that will never lead to victory that a machine can't suss out on its own, and it could spend all that time branching through paths that we know go nowhere.

(There's also the idea that novel solutions that would never occur to humans could be found, but there are still bounds that we know that don't need to be computed)
posted by mikeh at 8:58 AM on January 23


In fact, it's probably not solvable, in that there is probably not a strategy that can't be exploited by an opponent that knows your strategy.

All games have a Nash Equilibrium, which means that there exists a strategy for a player in heads-up Hold 'Em that cannot be exploited. This strategy can be "mixed" meaning that it incorporates deliberate randomness.
posted by justkevin at 9:01 AM on January 23 [4 favorites]
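
To make "cannot be exploited" concrete, here is a minimal sketch using rock-paper-scissors as a stand-in, since the real poker strategy is far too large to write down. The payoff matrix and the function name are illustrative assumptions, not anything from Libratus. It measures how much an opponent who knows your mixed strategy can win per game by picking the best counter.

```python
# Row player's payoff in rock-paper-scissors (a toy zero-sum game,
# used here only as a stand-in for poker).
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper    vs rock, paper, scissors
    [-1, 1, 0],   # scissors vs rock, paper, scissors
]


def best_response_value(strategy):
    """Most an opponent can win per game against a known mixed strategy.

    `strategy` is a list of three probabilities (rock, paper, scissors).
    """
    values = []
    for j in range(3):
        # Our expected payoff if the opponent always plays action j.
        row_ev = sum(strategy[i] * PAYOFF[i][j] for i in range(3))
        # Zero-sum: the opponent's payoff is the negative of ours.
        values.append(-row_ev)
    return max(values)


uniform = [1 / 3, 1 / 3, 1 / 3]   # the equilibrium mix for this game
rock_heavy = [0.5, 0.25, 0.25]    # a biased mix

print(best_response_value(uniform))
print(best_response_value(rock_heavy))
```

The uniform mix leaves the opponent nothing to gain, while the rock-heavy mix can be beaten by always playing paper. The deliberate randomness is what removes the opening.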


If I could cope with the cigarette smoke at the casino, I would wander down to check it out; this is only three blocks from home.

I think that most poker rooms are non-smoking these days. If you're interested, you might want to call the poker room (412-566-4606) to check.
posted by Betelgeuse at 9:06 AM on January 23


Chess and especially Go AI already use lots of tools for dealing with imperfect information; the imperfection is not knowing the future! (And having an impossibly large state space to explore as a result.)
posted by kaibutsu at 9:07 AM on January 23


Chess and especially Go AI already use lots of tools for dealing with imperfect information; the imperfection is not knowing the future!

I think "perfect information" is a technical term of game theory - it means that both players have all information about the current game state and history. The future has nothing to do with it, nor does not knowing what the opponent is thinking. In chess you have all the objectively available information: you know where all the pieces are on the board, and you know whose turn it is to move, and you know if someone's already moved their King and can't castle.
posted by thelonius at 9:12 AM on January 23 [6 favorites]


Betelgeuse: "If I could cope with the cigarette smoke at the casino, I would wander down to check it out; this is only three blocks from home.

I think that most poker rooms are non-smoking these days. If you're interested, you might want to call the poker room (412-566-4606) to check.
"

Just walking into the door of that casino hits you with a blast of smoke. Even if the particular room is smoke-free the whole building just reeks.
posted by octothorpe at 9:16 AM on January 23 [1 favorite]


All games have a Nash Equilibrium, which means that there exists a strategy for a player in heads-up Hold 'Em that cannot be exploited. This strategy can be "mixed" meaning that it incorporates deliberate randomness.

That's a really good point. I am provably wrong about there not being an unexploitable strategy at deepstack HU NLHE. We know it must exist but no one knows what it is.

It's also good to know that there is a difference between unexploitable play and optimal play. For short-stacked NLHE (where you have less than about 10x the minimum bet), an unexploitable strategy has been figured out where you just fold or go all in depending on your hand, stack sizes, and the action before you. You can print out a reference chart. If you follow this strategy, even the best players in the world won't be making money off you on average (though your strategy has to change to something more fluid as soon as you double up). The thing is that, if you're playing against BAD players, you can often do better by deviating from Game Theory Optimal play in order to exploit their mistakes. Your play becomes more exploitable in turn, but as long as you are exploiting their deviations from GTO more than they are yours, it is profitable.
posted by 256 at 9:16 AM on January 23 [5 favorites]
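
A rough way to see why deviating against bad players can pay: the sketch below is a simplified chip-EV model of an all-in shove, not a real push/fold chart. The function name, the pot and stack sizes, and the 30% equity figure are all hypothetical numbers chosen for illustration, and the model ignores details like blinds already posted out of the stacks.

```python
def ev_shove(pot, stack, p_fold, equity_when_called):
    """Expected chips won by shoving `stack` into `pot` (simplified model).

    p_fold: chance the opponent folds to the shove.
    equity_when_called: our chance of winning the hand if called.
    """
    # If the opponent folds we simply pick up the pot.
    win_uncontested = p_fold * pot
    # If called, the final pot is the starting pot plus both stacks,
    # and we have risked `stack` to contest it.
    called = (1 - p_fold) * (equity_when_called * (pot + 2 * stack) - stack)
    return win_uncontested + called


# Hypothetical spot: 1.5 big blinds in the pot, 10 big blinds behind,
# and a weak hand that wins only 30% of the time when called.
print(ev_shove(1.5, 10, p_fold=0.5, equity_when_called=0.30))  # loses chips
print(ev_shove(1.5, 10, p_fold=0.9, equity_when_called=0.30))  # wins chips
```

Against an opponent who folds far too often, shoving the weak hand becomes profitable even though an equilibrium chart would fold it. The cost is that an observant opponent can start calling wider and punish the deviation.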


There is one more thing that separates poker from chess (beyond imperfect information): the element of luck. A player can be lucky and get good cards.

Over many games, luck tends to even out. But I would guess that in the limited number of games in a tournament with human players, luck can be a factor.

Does anyone have a feel for how large the element of luck is in this type of poker and this kind of tournament? Is there a metric of how important luck is in a particular game? Perhaps a scale with the game of "coin toss" on one end of the scale and the game of "chess" on the other.
posted by Triplanetary at 9:19 AM on January 23


The general consensus in the poker community (which loves doing robust statistical analysis on performance) is that you want 100,000+ hands of NLHE in order to be highly confident that your results represent your true results and not good/bad luck.

In fact, it's considered a truism that you can't ever know your win-rate because, by the time you have enough hands to evaluate properly, your play-style will have changed sufficiently to render the earlier hands in the sample unrepresentative.

That's less of a concern for a static AI, obviously. And it bodes well that this specific tournament is set to run for 120k hands. Statistically, if the AI ends up winning money, we should be able to be reasonably confident that it is actually better at the game than the average of the four pros it is playing against.

(Of course, it's statistics, so you can never be 100% sure, you can just quantify how unlikely it is that you are wrong. It is possible for the AI to get dealt AA while the pro gets dealt KK 120,000 hands in a row.)
posted by 256 at 9:28 AM on January 23 [3 favorites]


(Honestly, the fact that they chose to play 120k hands suggests to me that they probably arrived at that number based on some desired p-value/confidence interval.)

Oh and the amount it wins by will be extremely relevant. If the AI ends up +$1000 after 120k hands, we'll be able to say that it's more likely than not that it's better, but the confidence in that will be low. If it ends up +$1,000,000 then we can be a lot more confident that it surpassed the error bars.
posted by 256 at 9:34 AM on January 23 [1 favorite]
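
The point about the margin mattering can be sketched with a normal approximation. This is a back-of-the-envelope illustration, not the organizers' actual analysis; in particular the $1,000 per-hand standard deviation is a made-up placeholder, and the function name is hypothetical.

```python
import math


def luck_p_value(total_won, hands, sd_per_hand):
    """One-sided p-value: chance that a break-even player finishes at
    least `total_won` ahead over `hands` hands purely by luck.

    Uses a normal approximation; sd_per_hand is an assumed input.
    """
    # Standard deviation of the total grows with the square root of hands.
    sd_total = sd_per_hand * math.sqrt(hands)
    z = total_won / sd_total
    # Upper tail of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2))


HANDS = 120_000
ASSUMED_SD = 1_000.0  # hypothetical dollars per hand, for illustration only

print(luck_p_value(1_000, HANDS, ASSUMED_SD))       # close to a coin flip
print(luck_p_value(1_000_000, HANDS, ASSUMED_SD))   # very unlikely to be luck
```

Under that assumed variance, a +$1,000 finish is nearly indistinguishable from breaking even, while +$1,000,000 would be hard to explain by luck. A different per-hand standard deviation shifts the numbers but not the shape of the argument.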


"I'm imagining a WOPR/WSOP crossover where they play a lovely game of Global Thermonuclear Squadoosh with commentary by Lon and Norm."

So this is where I peek my head in and tell you guys about my Wargames fan theory.

When David Lightman (Broderick's character) finds the article about Professor Falken, the article is called:

"Poker and Armageddon: The Role of Bluffing in a Nuclear Standoff".

Poker is one of the games listed as being on WOPR. So "Joshua" (WOPR's AI) knows how to play poker and thus probably understands bluffing and thereby lying. So the whole ending of "what a strange game. The only way to win is not to play." might just have been a bluff.....
posted by I-baLL at 9:37 AM on January 23 [6 favorites]


I am continuously tickled by the idea that somehow "machines" win when the algorithm beats the human side. What actually happened is some other humans won, by building a tool.

I feel like this relies on an assumption that isn't fully thought out. Humans built a tool, yes. But then the tool was sent out to do its thing separate from the humans who created it.

I liken this to that whole monkey-selfie copyright controversy. A human set up a camera, the monkey hit the shutter. Whether or not the human is the author of the work is still in litigation (district court said no, appeal is pending in 9th circuit).

Obviously it's not a perfect analogy (Slater didn't make the monkey that took the picture) but it shows that the line that dictates where human involvement ends and begins isn't as clear as you might think it is.

So yes, some humans made a tool that is beating other humans at a game. I wouldn't go as far as to say that the first humans are "winning" though.
posted by sparklemotion at 10:18 AM on January 23


Let's see how well it does at 7-stud. ;)
posted by mrgrimm at 10:27 AM on January 23


Ohhhhh it's the Philly Rivers not the one by O'Hare.
posted by PMdixon at 11:08 AM on January 23


I-baLL: "Poker is one of the games listed as being on WOPR. So "Joshua" (WOPR's AI) knows how to play poker and thus probably understands bluffing and thereby lying. So the whole ending of "what a strange game. The only way to win is not to play." might just have been a bluff....."

You're vastly overestimating the sophistication of 80s computer gaming. Poker was a common game even on toy equipment like the VIC-20.
posted by Mitheral at 11:10 AM on January 23


The thing is that, if you're playing against BAD players, you can often do better by deviating from Game Theory Optimal play in order to exploit their mistakes. Your play becomes more exploitable in turn, but as long as you are exploiting their deviations from GTO more than they are yours, it is profitable

Oh god, yeah, this. Playing against nine bad players is basically playing against a randomized strategy that gets 9 more sets of cards than you do. You're always going to get rivered with some bullshit.
posted by schadenfrau at 11:26 AM on January 23 [5 favorites]


"You're vastly overestimating the sophistication of 80s computer gaming. Poker was a common game even on toy equipment like the VIC-20."

I'm talking about the movie Wargames which featured a very sentient AI.
posted by I-baLL at 11:39 AM on January 23


They structure the tournament to minimize variance by reversing the hole cards given to the computer and human players, e.g. Claudico receives cards "A" when playing player 1 with cards "B", then receives cards "B" when playing player 2 with cards "A". So traditional rules of thumb like 100k hands will overstate the variance experienced.
posted by pingu at 12:16 PM on January 23
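
A toy simulation shows why mirrored deals cut the variance. It assumes, purely for illustration, that each hand's result is a small skill edge plus card luck plus some play-dependent noise, and that in the mirrored hand the card luck is exactly reversed. The function name and all parameter values are hypothetical.

```python
import random
import statistics


def pair_variance(pairs, mirrored, edge=0.1, card_sd=10.0, play_sd=3.0, seed=0):
    """Variance of the combined result of two hands, over many pairs.

    mirrored=True models duplicate deals: the second hand's card luck
    is the exact opposite of the first hand's.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(pairs):
        luck_a = rng.gauss(0, card_sd)
        # Mirrored deals reverse the card luck; independent deals redraw it.
        luck_b = -luck_a if mirrored else rng.gauss(0, card_sd)
        hand_a = edge + luck_a + rng.gauss(0, play_sd)
        hand_b = edge + luck_b + rng.gauss(0, play_sd)
        totals.append(hand_a + hand_b)
    return statistics.pvariance(totals)


print(pair_variance(5000, mirrored=False))
print(pair_variance(5000, mirrored=True))
```

In this model the card luck cancels within each mirrored pair, leaving only the play-dependent noise, so the skill edge shows up in far fewer hands. Real hands do not cancel this cleanly, because different players take different lines with the same cards.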


Over a long enough play time, wouldn't human fatigue play a role in the overall results of play versus a computer? A computer can keep up perfect "concentration" over an unlimited amount of time. Even the best of players is going to lose mental sharpness over a long enough period and make mistakes. Not to mention distractions from the game such as hunger and physical comfort.
posted by Justin Case at 1:27 PM on January 23


Oh god, yeah, this. Playing against nine bad players is basically playing against a randomized strategy that gets 9 more sets of cards than you do. You're always going to get rivered with some bullshit.

Playing with several bad players is really a completely different challenge all its own. You really have to settle down and get comfy with them constantly sucking out. For me, it takes much more discipline than playing against "better" (i.e. more predictable) players.
posted by mrgrimm at 1:31 PM on January 23 [2 favorites]


Over a long enough play time, wouldn't human fatigue play a role in the overall results of play versus a computer? A computer can keep up perfect "concentration" over an unlimited amount of time. Even the best of players is going to lose mental sharpness over a long enough period and make mistakes. Not to mention distractions from the game such as hunger and physical comfort.

Sure. Another thing that has badly hurt chess Grandmasters in match play against computers is psychology. A computer program doesn't have any, but a person who lost badly yesterday has to struggle to recover their confidence to play well today. If the program, say, could have won yesterday, but forced a draw instead, it doesn't change anything about its play today.
posted by thelonius at 1:41 PM on January 23


I-baLL: "I'm talking about the movie Wargames which featured a very sentient AI."

But with no, IIRC, exhibited ability to lie or deceive. My point was that the mere existence of poker as one of the game choices isn't enough to infer Joshua could lie/bluff. Most (all?) of the poker variants available at the time didn't feature any sort of intelligent bluffing.
posted by Mitheral at 7:51 PM on January 23 [1 favorite]


PMdixon: "Ohhhhh it's the Philly Rivers not the one by O'Hare."

It's in Pittsburgh, not Philadelphia.
posted by Chrysostom at 8:48 PM on January 23


Oh hey, I know some of the people working on this. We've been making trips to the casino to root for the AI. :-)
posted by glass origami robot at 9:53 PM on January 23


Can we stop using the term AI for things that aren't? It's getting extremely annoying.
posted by GallonOfAlan at 11:05 PM on January 23


Yayogate gets over 2600 comments on 2+2 and *this* is the story that gets posted to Metafilter?

For shame.
posted by PeterMcDermott at 4:07 AM on January 24



It's in Pittsburgh, not Philadelphia


That only reduces it from a 15 hr drive to a 10 hr drive. (All this time I thought CMU was in Philadelphia. But I should have remembered that the casinos there are a Harrah's and Sugarhouse)
posted by PMdixon at 7:20 AM on January 24


Can we stop using the term AI for things that aren't? It's getting extremely annoying.

Care to educate us?

Yayogate gets over 2600 comments on 2+2 and *this* is the story that gets posted to Metafilter?

It's possible that I need another FPP to summarize this for me because the vast majority of reliable sources for this are on sites that my work's firewall disapproves of (GAMBLING BAD).
posted by sparklemotion at 7:27 AM on January 24 [1 favorite]


All this time I thought CMU was in Philadelphia

Both Andrew Carnegie and Andrew Mellon lived in Pittsburgh, you see.
posted by Chrysostom at 7:29 AM on January 24 [1 favorite]


So the whole [WarGames] ending of "what a strange game. The only way to win is not to play." might just have been a bluff....."

If this allows us to have a WarGames / Terminator universe mashup, I'm all on board.
posted by Kadin2048 at 8:01 AM on January 24


It's possible that I need another FPP to summarize this for me because the vast majority of reliable sources for this are on sites that my work's firewall disapproves of (GAMBLING BAD).

A gambling addict had himself banned from playing Blackjack at the major casinos for his own good, but still hung out with the gambling crowd (including a handful of minor semi-notable pros). He expressed to some of these gambling "friends" a desire to play again and lamented having had himself banned, so they arranged to run an illegal high stakes blackjack game in their house just for him. He lost $10k to his friends. A few days later they plied him with liquor and cocaine to get him to come back and he lost another $10k.

A few days later, they tried to get him to come back again with promises of more liquor and cocaine, but he was broke. They kept badgering him and he managed to borrow another $10k in order to try one more time. He ended up getting very lucky and running his $10k up to $500k, at which point his friends said they didn't have the money on hand, gave him back his borrowed $10k and a mere $5k on top, promising to pay him the rest the next day.

When he went to meet his friends the next day, they had hired muscle with them, accused him of cheating, and refused to pay him his winnings. With little left to lose, the guy went to the Internet and told his story, naming names.
posted by 256 at 12:04 PM on January 24 [1 favorite]


Heads up poker works much better for this type of computer simulation. A table with 7 other players and new faces and playing styles working through a field would make it more difficult. It won't be long but it needs to scale information processing by a few more orders of magnitude.
posted by ShakeyJake at 12:06 PM on January 24


Yayogate gets over 2600 comments on 2+2 and *this* is the story that gets posted to Metafilter?

Post about Yayogate yo' own self if you think it's all that
posted by thelonius at 12:28 PM on January 24 [5 favorites]


"But with no, IIRC, exhibited ability to lie or deceive. My point was that the mere existence of poker as one of the game choices isn't enough to infer Joshua could lie/bluff. Most (all?) of the poker variants available at the time didn't feature any sort of intelligent bluffing."

Yes, that's why I point out that the article written by Falken is called "Poker and Armageddon: The Role of Bluffing in a Nuclear Standoff" and we know that Falken was training Joshua to understand abstract concepts by having it play games with him.
posted by I-baLL at 8:58 AM on January 30


Whoa, way to totally airbrush co-author John McKittrick out of history!
posted by Chrysostom at 1:28 PM on January 30 [1 favorite]


I blame Joshua.
posted by I-baLL at 1:47 PM on January 30

