INNOVATION

Google’s New AI Is a Master of Games, but How Does It Compare to the Human Mind?

After building AlphaGo to beat the world’s best Go players, Google DeepMind built AlphaZero to take on the world’s best machine players

Katherine J. Wu NOVA

Correspondent

December 10, 2018

AI Chess — Google's new artificial intelligence program, AlphaZero, taught itself to play chess, shogi, and Go in a matter of hours, and outperforms the top-ranking AIs in the gameplay arena. Andriy Popov / Alamy Stock Photo

For humans, chess may take a lifetime to master. But Google DeepMind’s new artificial intelligence program, AlphaZero, can teach itself to conquer the board in a matter of hours.

Building on its past success with the AlphaGo suite—a series of computer programs designed to play the Chinese board game Go—Google boasts that its new AlphaZero achieves a level of “superhuman performance” at not just one board game, but three: Go, chess, and shogi (essentially, Japanese chess). The team of computer scientists and engineers, led by Google’s David Silver, reported its findings recently in the journal Science.

“Before this, with machine learning, you could get a machine to do exactly what you want—but only that thing,” says Ayanna Howard, an expert in interactive computing and artificial intelligence at the Georgia Institute of Technology who did not participate in the research. “But AlphaZero shows that you can have an algorithm that isn’t so [specific], and it can learn within certain parameters.”

AlphaZero’s clever programming certainly ups the ante on gameplay for human and machine alike, but Google has long had its sights set on something bigger: engineering intelligence.

The researchers are careful not to claim that AlphaZero is on the verge of world domination (others have been a little quicker to jump the gun). Still, Silver and the rest of the DeepMind squad are already hopeful that they’ll someday see a similar system applied to drug design or materials science.

So what makes AlphaZero so impressive?

Gameplay has long been revered as a gold standard in artificial intelligence research. Structured, interactive games are simplifications of real-world scenarios: Difficult decisions must be made; wins and losses drive up the stakes; and prediction, critical thinking, and strategy are key.

Encoding this kind of skill is tricky. Older game-playing AIs—including the first prototypes of the original AlphaGo—have traditionally been pumped full of codes and data to mimic the experience typically earned through years of natural, human gameplay (essentially, a passive, programmer-derived knowledge dump). With AlphaGo Zero (the most recent version of AlphaGo), and now AlphaZero, the researchers gave the program just one input: the rules of the game in question. Then, the system hunkered down and actively learned the tricks of the trade itself.

AlphaZero is based on AlphaGo Zero, part of the AlphaGo suite designed to play the Chinese board game Go, pictured above. Early iterations of the original program were fed data from human-versus-human games; later versions engaged in self-teaching, wherein the software played games against itself to learn its own strategy. Chad Miller / Flickr / CC BY-SA 2.0

This strategy, called self-play reinforcement learning, is pretty much exactly what it sounds like: To train for the big leagues, AlphaZero played itself in iteration after iteration, honing its skills by trial and error. And the brute-force approach paid off. Unlike AlphaGo Zero, AlphaZero doesn’t just play Go: It can beat the best AIs in the business at chess and shogi, too. The learning process is also impressively efficient, requiring only two, four, or 30 hours of self-tutelage to outperform programs specifically tailored to master shogi, chess, and Go, respectively. Notably, the study authors didn’t report any instances of AlphaZero going head-to-head with an actual human, Howard says. (The researchers may have assumed that, given that these programs consistently clobber their human counterparts, such a matchup would have been pointless.)

AlphaZero was also able to trounce Stockfish (the now unseated AI chess master) and Elmo (the former AI shogi expert) despite evaluating fewer possible next moves on each turn during game play. But because the algorithms in question are inherently different, and may consume different amounts of power, it’s difficult to directly compare AlphaZero to other, older programs, points out Joanna Bryson, who studies artificial intelligence at the University of Bath in the United Kingdom and did not contribute to AlphaZero.

Google keeps mum about a lot of the fine print on its software, and AlphaZero is no exception. While we don’t know everything about the program’s power consumption, what’s clear is this: AlphaZero has to be packing some serious computational ammo. In those scant hours of training, the program kept itself very busy, engaging in tens or hundreds of thousands of practice rounds to get its board game strategy up to snuff—far more than a human player would need (or, in most cases, could even accomplish) in pursuit of proficiency.

This intensive regimen also used 5,000 of Google’s proprietary machine-learning processor units, or TPUs, which by some estimates consume around 200 watts per chip. No matter how you slice it, AlphaZero requires way more energy than a human brain, which runs on about 20 watts.

The absolute energy consumption of AlphaZero must be taken into consideration, adds Bin Yu, who works at the interface of statistics, machine learning, and artificial intelligence at the University of California, Berkeley. AlphaZero is powerful, but might not be good bang for the buck—especially when adding in the person-hours that went into its creation and execution.

Energetically expensive or not, AlphaZero makes a splash: Most AIs are hyper-specialized on a single task, making this new program—with its triple threat of game play—remarkably flexible. “It’s impressive that AlphaZero was able to use the same architecture for three different games,” Yu says.

So, yes. Google’s new AI does set a new mark in several ways. It’s fast. It’s powerful. But does that make it smart?

This is where definitions start to get murky. “AlphaZero was able to learn, starting from scratch without any human knowledge, to play each of these games to superhuman level,” DeepMind’s Silver said in a statement to the press.

Even if board game expertise requires mental acuity, all proxies for the real world have their limits. In its current iteration, AlphaZero maxes out by winning human-designed games—which may not warrant the potentially alarming label of “superhuman.” Plus, if surprised with a new set of rules mid-game, AlphaZero might get flummoxed. The actual human brain, on the other hand, can store far more than three board games in its repertoire.

What’s more, comparing AlphaZero’s baseline to a tabula rasa (blank slate)—as the researchers do—is a stretch, Bryson says. Programmers are still feeding it one crucial morsel of human knowledge: the rules of the game it’s about to play. “It does have far less to go on than anything has before,” Bryson adds, “but the most fundamental thing is, it’s still given rules. Those are explicit.”

And those pesky rules could constitute a significant crutch. “Even though these programs learn how to perform, they need the rules of the road,” Howard says. “The world is full of tasks that don’t have these rules.”

When push comes to shove, AlphaZero is an upgrade of an already powerful program—AlphaGo Zero, explains JoAnn Paul, who studies artificial intelligence and computational dreaming at the Virginia Polytechnic Institute and State University and was not involved in the new research. AlphaZero uses many of the same building blocks and algorithms as AlphaGo Zero, and still constitutes just a subset of true smarts. “I thought this new development was more evolutionary than revolutionary,” she adds. “None of these algorithms can create. Intelligence is also about storytelling. It’s imagining things that are not yet there. We’re not thinking in those terms in computers.”

Part of the problem is, there’s still no consensus on a true definition of “intelligence,” Yu says—and not just in the domain of technology. “It’s still not clear how we are training critically thinking beings, or how we use the unconscious brain,” she adds.

To this point, many researchers believe there are likely multiple types of intelligence. And tapping into one far from guarantees the ingredients for another. For instance, some of the smartest people out there are terrible at chess.

With these limitations, Yu’s vision of the future of artificial intelligence partners humans and machines in a kind of coevolution. Machines will certainly continue to excel at certain tasks, she explains, but human input and oversight may always be necessary to compensate for the unautomated.

Of course, there’s no telling how things will shake out in the AI arena. In the meantime, we have plenty to ponder. "These computers are powerful, and can do certain things better than a human can," Paul says. "But that still falls short of the mystery of intelligence."

This article was originally published on NOVA.

Katherine J. Wu | | READ MORE

Katherine J. Wu is a Boston-based science journalist and Story Collider senior producer whose work has appeared in National Geographic, Undark magazine, Popular Science and more. She holds a Ph.D. in Microbiology and Immunobiology from Harvard University, and was Smithsonian magazine's 2018 AAAS Mass Media Fellow.