Known as Pluribus, the AI was developed by Carnegie Mellon University in collaboration with Facebook AI. Over the course of 10,000 hands of six-player poker against a rotating cast of 13 professionals - each with career winnings in excess of $1m - Pluribus came out on top.
In a separate experiment, the AI took on Darren Elias, who holds the record for most World Poker Tour titles, and Chris ‘Jesus’ Ferguson, winner of six World Series of Poker events. Playing 5,000 hands ‘heads up’ (one on one) against each pro, Pluribus again emerged the winner. But it is the multi-handed victory the researchers are heralding as most significant.
"Pluribus achieved superhuman performance at multi-player poker, which is a recognised milestone in artificial intelligence and in game theory that has been open for decades," said Professor Tuomas Sandholm from Carnegie Mellon's Computer Science Department.
"Thus far, superhuman AI milestones in strategic reasoning have been limited to two-party competition. The ability to beat five other players in such a complicated game opens up new opportunities to use AI to solve a wide variety of real-world problems."
Described in the latest issue of Science, Pluribus is the fruit of more than 16 years of poker AI research by Sandholm. He and fellow researcher Noam Brown had previously worked on a programme called Libratus that also came out on top against professionals playing heads up. But the classic Nash equilibrium strategies that succeed in two-player games are not always winning strategies in multi-handed play. As a result, Pluribus had to forgo theoretical guarantees of success in favour of a more mixed strategy that could still help it win.
Pluribus first computes a ‘blueprint’ strategy by playing six copies of itself against one another, which is sufficient for the first round of betting. From that point on, the AI does a more detailed search of possible moves in a finer abstraction of the game. It looks ahead several moves, but not all the way to the end of the game, which would involve too many permutations and would be computationally prohibitive.
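To give a flavour of how a strategy can emerge from self-play, here is a minimal Python sketch using regret matching on rock-paper-scissors. It is an illustration under stated assumptions, not Pluribus's actual training method: the article does not name the algorithm, and the toy game, the function names (`payoff`, `strategy_from_regrets`, `self_play`) and the iteration count are all chosen for this example.

```python
import random

ACTIONS = ["rock", "paper", "scissors"]


def payoff(a, b):
    """Payoff to the player choosing action index a against b: +1 win, 0 tie, -1 loss."""
    return [0, 1, -1][(a - b) % 3]


def strategy_from_regrets(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret yet, so fall back to a uniform mix.
    return [1.0 / len(regrets)] * len(regrets)


def self_play(iterations, seed=0):
    """Two copies of the same learner play each other; returns their average strategies."""
    rng = random.Random(seed)
    n = len(ACTIONS)
    regrets = [[0.0] * n for _ in range(2)]
    strategy_sums = [[0.0] * n for _ in range(2)]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        moves = [rng.choices(range(n), weights=s)[0] for s in strategies]
        for p in range(2):
            opponent_move = moves[1 - p]
            actual = payoff(moves[p], opponent_move)
            for a in range(n):
                # Regret: how much better action a would have done than the move played.
                regrets[p][a] += payoff(a, opponent_move) - actual
                strategy_sums[p][a] += strategies[p][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, average in enumerate(self_play(50_000)):
        print(player, [round(x, 3) for x in average])
```

In this toy game the averaged strategies should drift towards playing each action about a third of the time, which is the equilibrium for rock-paper-scissors. Six-player poker is vastly larger, which is why the article describes abstraction and search on top of the blueprint.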
Limited-lookahead search is a standard approach in perfect-information games, but is extremely challenging in imperfect-information games. According to the team, a new limited-lookahead search algorithm is the main breakthrough that enabled Pluribus to achieve superhuman performance at multi-player poker. The blueprint strategy was developed over just eight days using only 12,400 core hours, with Pluribus using just 28 cores during live play.
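For readers unfamiliar with the idea, limited lookahead in its standard perfect-information form can be sketched in a few lines of Python. The example below uses a toy stone-taking game (take one to three stones, whoever takes the last stone wins) and is an assumption made purely for illustration. It is not the new algorithm the team describes, which has to cope with hidden cards, and the names `legal_moves`, `heuristic`, `lookahead` and `best_move` are hypothetical.

```python
def legal_moves(pile):
    """A player may take 1, 2 or 3 stones, but no more than remain."""
    return [take for take in (1, 2, 3) if take <= pile]


def heuristic(pile):
    """Stand-in estimate used when the search is cut off before the game ends.

    Returning 0.0 means "unknown"; a real system would use a learned or
    hand-built estimate here.
    """
    return 0.0


def lookahead(pile, depth):
    """Value of the position for the player to move, searching `depth` moves ahead."""
    if pile == 0:
        # The previous player took the last stone, so the player to move has lost.
        return -1.0
    if depth == 0:
        # Search horizon reached: estimate instead of playing to the end.
        return heuristic(pile)
    # Negamax: a position is worth the negative of the opponent's best reply.
    return max(-lookahead(pile - take, depth - 1) for take in legal_moves(pile))


def best_move(pile, depth):
    """Pick the move whose resulting position looks best within the horizon (depth >= 1)."""
    return max(legal_moves(pile), key=lambda take: -lookahead(pile - take, depth - 1))


if __name__ == "__main__":
    print(best_move(5, depth=4))   # the win is within the horizon
    print(lookahead(21, depth=2))  # too far from the end, so only the estimate comes back
```

The trade-off is visible even here: a shallow search is cheap but leans entirely on the quality of the estimate at the cut-off, which is much harder to define when players cannot see each other's cards.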
"Playing a six-player game rather than head-to-head requires fundamental changes in how the AI develops its playing strategy," said Brown, a computer science PhD student at Carnegie Mellon who joined Facebook AI last year. "We're elated with its performance and believe some of Pluribus' playing strategies might even change the way pros play the game."
One of the strategies employed was ‘donk betting’, where Pluribus would end one round with a call but start the next round with a bet. Though this is classically considered a weak move, the AI used it much more frequently than the professionals and ultimately won.
"Its major strength is its ability to use mixed strategies," said Darren Elias, as he prepared for this year’s World Series of Poker main event. "That's the same thing that humans try to do. It's a matter of execution for humans - to do this in a perfectly random way and to do so consistently. Most people just can't."
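What Elias describes is easy for a computer and hard for a person, as a short Python sketch shows. A mixed strategy is simply a probability for each action, and the program samples from it afresh every time. The probabilities below are made up for illustration and are not taken from Pluribus; `sample_action` and `empirical_frequencies` are hypothetical helper names.

```python
import random
from collections import Counter


def sample_action(mixed_strategy, rng):
    """Draw one action according to the probabilities in `mixed_strategy`."""
    actions = list(mixed_strategy)
    weights = [mixed_strategy[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


def empirical_frequencies(mixed_strategy, trials, seed=0):
    """Play the same spot `trials` times and report how often each action was chosen."""
    rng = random.Random(seed)
    counts = Counter(sample_action(mixed_strategy, rng) for _ in range(trials))
    return {a: counts[a] / trials for a in mixed_strategy}


if __name__ == "__main__":
    # Hypothetical mix for a single decision: bet 30% of the time, check 70%.
    strategy = {"bet": 0.3, "check": 0.7}
    print(empirical_frequencies(strategy, trials=100_000))
```

Over many hands the observed frequencies settle close to the intended mix, with no pattern an opponent could exploit from one hand to the next.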