Last year, an artificial intelligence program called AlphaGo created by Google’s DeepMind team beat a human champion at Go, an ancient Chinese strategy game that is in many ways more complex than chess. As Emily Matchar reported for Smithsonian.com at the time, it was a stunning achievement, since as late as 1997 some people were predicting it would take 100 years for a computer to beat a human at Go.
Impressive as the feat was, AlphaGo learned the game by analyzing previous games played by humans. Now, as Merrit Kennedy at NPR reports, a new version of the artificial intelligence called AlphaGo Zero has figured out how to master the game entirely on its own, with no human input or examples to learn from. It's an advance with big implications for future AI development.
According to a press release from DeepMind, previous versions of AlphaGo learned to play the game by studying matches between professional and strong amateur players, absorbing the rules of the game and successful strategies of play. AlphaGo Zero, however, did not look at any games played by humans. Instead, it was given the rules of the game and then played against itself, using reinforcement learning to teach itself right and wrong moves and long-term strategies. As the AI played the game, it updated its advanced neural network to better predict its opponent’s moves.
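DeepMind's actual system pairs a deep neural network with Monte Carlo tree search, but the core idea of self-play reinforcement learning can be sketched in miniature. The toy program below is our own illustration, not DeepMind's code: it learns tic-tac-toe from scratch using simple tabular value updates, given nothing but the rules of the game and the outcomes of games it plays against itself.

```python
import random

# Winning lines on a 3x3 board, indexed 0-8.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if someone has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def legal_moves(board):
    return [i for i, s in enumerate(board) if s == ' ']

Q = {}  # (board, move) -> estimated value from the mover's perspective

def choose(board, epsilon):
    """Epsilon-greedy move selection over the learned values."""
    moves = legal_moves(board)
    if random.random() < epsilon:
        return random.choice(moves)           # explore
    return max(moves, key=lambda m: Q.get((board, m), 0.0))  # exploit

def self_play_episode(epsilon=0.2, alpha=0.5):
    """Play one game against itself, then update values from the outcome."""
    board, player, history = ' ' * 9, 'X', []
    while True:
        m = choose(board, epsilon)
        history.append((board, m))
        board = board[:m] + player + board[m + 1:]
        if winner(board) or not legal_moves(board):
            # +1 for the side that made the winning move, -1 for the
            # opponent's moves, 0 for every move in a draw.
            reward = 1.0 if winner(board) else 0.0
            for state, move in reversed(history):
                old = Q.get((state, move), 0.0)
                Q[(state, move)] = old + alpha * (reward - old)
                reward = -reward  # alternate perspective each ply back
            return
        player = 'O' if player == 'X' else 'X'

random.seed(0)
for _ in range(20000):
    self_play_episode()
```

As with AlphaGo Zero, nothing here encodes human strategy: the value table starts empty, and every update comes from the program's own games. The real system replaces the lookup table with a neural network so the approach scales to Go's astronomically larger state space.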
The researchers watched as the AI mastered the game in real time. After three days it was able to defeat a previous version called AlphaGo Lee, which beat Korean Go master Lee Sedol in four out of five games in 2016; the new program won 100 games to none. After 21 days it surpassed AlphaGo Master, the version that beat 60 top Go players online and the world's best player, Ke Jie, earlier this year. After 40 days, it reached levels of play no one had seen before. The research appears in the journal Nature.
“In a short space of time, AlphaGo Zero has understood all of the Go knowledge that has been accumulated by humans over thousands of years of playing,” lead researcher David Silver of Google's DeepMind says in a YouTube video. “Sometimes it’s actually chosen to go beyond that and discovered something that the humans hadn't even discovered in this time period and discovered new pieces of knowledge which are creative and novel in many ways.”
As Agence France-Presse reports, AlphaGo Zero reached this level of mastery much more efficiently than its predecessors. While the previous iteration had 48 data processing units and played 30 million training games over the course of several months, Zero had only 4 processing units and played 4.9 million training games over three days. “People tend to assume that machine learning is all about big data and massive amounts of computation but actually what we saw with AlphaGo Zero is that algorithms matter much more,” Silver tells AFP.
But the research is about more than just mastering a board game. As Ian Sample at The Guardian reports, this type of tabula rasa, or blank slate, learning could lead to a new generation of general-purpose artificial intelligence that could help solve problems in fields that can be well simulated in a computer, like drug discovery, protein folding or particle physics. By building its knowledge from the ground up without human biases or limitations, the algorithm could go in directions humans have not yet thought to look.
While many people in the AI community see AlphaGo Zero as a big accomplishment, Gary Marcus, a psychology professor at New York University who specializes in artificial intelligence, tells NPR's Kennedy that he doesn’t think the algorithm is truly tabula rasa, because prior human knowledge went into the construction of the algorithm. He also doesn’t think tabula rasa AI is as important as it seems. “[In] biology, actual human brains are not tabula rasa ... I don't see the principal theoretical reason why you should do that, why you should abandon lots of knowledge that we have about the world,” he says.
Even so, AlphaGo Zero's rapid mastery of the game is impressive, and a bit frightening.