
This Poker-Playing A.I. Knows When to Hold ‘Em and When to Fold ‘Em

Pluribus won an average of around $5 per hand, or $1,000 per hour, when playing against five human opponents

Associate Editor, History

July 15, 2019

Poker poses a challenge to A.I. because it involves multiple players and a plethora of hidden information. Facebook/Carnegie Mellon University

A computer program called Pluribus has bested poker pros in a series of six-player no-limit Texas Hold’em games, reaching a milestone in artificial intelligence research. It is the first bot to beat humans in a complex multiplayer competition.

As researchers from Facebook’s A.I. lab and Carnegie Mellon University report in the journal Science, Pluribus emerged victorious in both human- and algorithm-dominated matches. Initially, Merrit Kennedy writes for NPR, five versions of the bot faced off against one professional poker player; in the next round of experiments, one bot played against five humans. Per a Facebook blog post, the A.I. won an average of around $5 per hand, or $1,000 per hour, when playing against five human opponents. This rate is considered a “decisive margin of victory” among poker professionals.

Speaking with Kennedy, four-time World Poker Tour champion Darren Elias explains that he helped train Pluribus by competing against four tables of bot rivals and alerting scientists when the A.I. made a mistake. Soon, the bot “was improving very rapidly, [going] from being a mediocre player to basically a world-class-level poker player in a matter of days and weeks.” The experience, Elias says, was “pretty scary.”

According to the Verge’s James Vincent, Pluribus—a surprisingly low-cost A.I. trained with less than $150 worth of cloud computing resources—further mastered poker strategy by playing against copies of itself and learning through trial and error. As Jennifer Ouellette notes for Ars Technica, the bot quickly realized its best course of action was a combination of gameplay and unpredictable moves.
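To make the idea of learning by playing against copies of itself more concrete, here is a minimal sketch in Python. It is not Pluribus's training code, and the article does not describe the bot's actual algorithm. As a stand-in, the sketch uses regret matching, a simple trial-and-error rule, on rock-paper-scissors rather than poker. The names `payoff`, `strategy_from_regrets` and `self_play` are hypothetical and chosen only for illustration.

```python
import random

ACTIONS = ("rock", "paper", "scissors")
WINS = {("rock", "scissors"), ("paper", "rock"), ("scissors", "paper")}


def payoff(mine, theirs):
    """Return +1 for a win, -1 for a loss and 0 for a tie."""
    if mine == theirs:
        return 0
    return 1 if (mine, theirs) in WINS else -1


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0:
        # No action looks better than another yet, so play uniformly.
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def self_play(iterations, seed=0):
    """Two copies of the same learner play each other and update regrets.

    Returns each copy's average strategy over all iterations.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    regrets = [[0.0] * 3 for _ in range(2)]
    strategy_sums = [[0.0] * 3 for _ in range(2)]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        moves = [rng.choices(range(3), weights=s)[0] for s in strategies]
        for player in range(2):
            opponent_move = ACTIONS[moves[1 - player]]
            actual = payoff(ACTIONS[moves[player]], opponent_move)
            for a in range(3):
                # Regret: how much better action `a` would have done.
                regrets[player][a] += payoff(ACTIONS[a], opponent_move) - actual
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


average_strategies = self_play(50_000)
```

In this toy game, each copy's average strategy should drift toward playing rock, paper and scissors about equally often, which is the unpredictable mix that cannot be exploited. Poker is vastly larger, so this only illustrates the general flavor of self-play.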

Most human pros avoid “donk betting,” in which a player ends one round with a call and starts the next with a bet, but Pluribus readily embraced the unpopular strategy. Ouellette reports that the A.I. also offered up unusual bet sizes and exhibited better randomization than its opponents.

“Its major strength is its ability to use mixed strategies,” Elias said, according to a CMU statement. “That's the same thing that humans try to do. It's a matter of execution for humans—to do this in a perfectly random way and to do so consistently. Most people just can't.”
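The mixed strategy Elias describes can be pictured as a set of probabilities over possible actions, with each decision drawn at random according to those odds. The short Python sketch below illustrates that idea only. The probabilities in `MIXED` and the helper `sample_action` are hypothetical, invented for illustration, and are not taken from Pluribus.

```python
import random
from collections import Counter


def sample_action(strategy, rng):
    """Draw one action at random according to the strategy's probabilities."""
    actions = list(strategy)
    weights = [strategy[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


# Hypothetical mixed strategy for a single decision point (assumed values).
MIXED = {"fold": 0.2, "call": 0.5, "raise": 0.3}

rng = random.Random(42)  # fixed seed so the run is repeatable
TRIALS = 10_000
counts = Counter(sample_action(MIXED, rng) for _ in range(TRIALS))
frequencies = {action: counts[action] / TRIALS for action in MIXED}
```

Over many hands, the observed frequencies stay close to the intended odds while each individual choice remains unpredictable. A computer does this effortlessly, which is the consistency Elias says most human players cannot match.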

Pluribus isn’t the first poker-playing A.I. to defeat human professionals. In 2017, the bot’s creators, Noam Brown and Tuomas Sandholm, developed an earlier iteration of the program called Libratus. This A.I. decisively defeated four poker pros across 120,000 hands of two-player Texas Hold’em, but, as the Facebook blog post explains, it was limited by the fact that it faced only one opponent at a time.

According to the MIT Technology Review’s Will Knight, poker poses a challenge to A.I. because it involves multiple players and a plethora of hidden information. By comparison, games such as chess and Go involve just two participants, and players’ positions are visible to all.

To overcome these obstacles, Brown and Sandholm created an algorithm engineered to predict opponents’ next two or three moves rather than gauge their steps through the end of the game. Although this strategy may seem to prioritize short-term gain over long-term winnings, the Verge’s Vincent writes that “short-term incisiveness is really all you need.”

Moving forward, multiplayer programs like Pluribus could be used to design drugs capable of fighting antibiotic-resistant bacteria, as well as improve cybersecurity and military robotic systems. As Ars Technica’s Ouellette notes, other potential applications include overseeing multi-party negotiations, pricing products and brainstorming auction bidding strategies.

For now, Brown tells Knight, the algorithm will remain largely under wraps—mainly to protect the online poker industry from incurring devastating financial losses.

The researcher concludes, “It could be very dangerous for the poker community.”