One GIANT Leap For Mankind
by NM Dana Mackenzie
Today, like many people, I was shocked by the news in my Facebook news feed. AlphaZero beats Stockfish! For those who (like me) had never heard of AlphaZero, let me explain that it is a new deep-learning algorithm created by the same folks who gave you AlphaGo, the computer program that vanquished the human Go champion last year.
The main difference between AlphaZero and AlphaGo is that AlphaZero is designed to be an even more general-purpose program than AlphaGo was. Starting with only the rules of chess, and no more knowledge than that, within 4 hours AlphaZero was able to beat Stockfish, the 2016 Top Chess Engine Championship winner. After more training (it's not completely clear how much more), AlphaZero won a 100-game match against Stockfish by 64-36, with 28 wins, 72 draws and zero losses. As White, AlphaZero won by +25 =25 -0; as Black it won by +3 =47 -0.
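For readers unfamiliar with match scoring, the arithmetic above is easy to check: a win counts one point and a draw half a point. A couple of lines of Python (the helper name `match_points` is mine, just for illustration) confirms that the overall score and the per-color scores agree:

```python
def match_points(wins, draws, losses):
    """Chess match scoring: 1 point per win, 1/2 point per draw."""
    return wins + 0.5 * draws

# Overall: 28 wins, 72 draws, 0 losses in 100 games
print(match_points(28, 72, 0))  # 64.0 -- the 64-36 match result

# Per color: +25 =25 -0 as White, +3 =47 -0 as Black
print(match_points(25, 25, 0) + match_points(3, 47, 0))  # 64.0 again
```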
There are so many things to say about this wonderful event, which is a huge triumph for the DeepMind team. (DeepMind is a subsidiary of Google.) But let me address the very human reactions that came up in the Facebook feed.
"Maybe we can compare it to the discovery of how to control fire. Wonderful and terrifying at the same time." -- Dan Lucas. "I am terrified." -- Carl Boor. "Sky Net is getting close." -- Jay Stallings.
For chess players, there is absolutely nothing to be terrified of. We have known since the early 2000s that computers can play better chess than humans, and this doesn't do anything to change it. As I posted on Facebook, the people who should be quaking in their boots are chess programmers, because the leading approach to computer chess for the last fifty years, alpha-beta search, has just been busted. The current leading chess programs all use alpha-beta search, augmented with all sorts of bells and whistles provided by grandmaster consultants. After today, you can forget the grandmaster consultants and you can forget your proprietary software. A completely general-purpose program can do better.
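For readers who haven't seen it, alpha-beta search is ordinary minimax search with a pruning rule: branches that provably cannot change the result are never explored. Here is a minimal sketch on a toy game tree; the tree and its leaf values are invented for illustration, and a real chess engine would of course generate legal moves and call a position-evaluation function instead:

```python
def alphabeta(node, alpha, beta, maximizing):
    """Return the minimax value of `node`, pruning branches that
    cannot affect the result. Leaves are numbers (static evaluations);
    interior nodes are lists of children."""
    if isinstance(node, (int, float)):
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:  # cutoff: the opponent will avoid this line
                break
        return value
    else:
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)
            if beta <= alpha:  # cutoff from the other side
                break
        return value

# A classic textbook tree whose minimax value is 3; the second leaves
# of the last two branches are pruned without ever being examined.
tree = [[3, 5], [2, 9], [0, 1]]
print(alphabeta(tree, float("-inf"), float("inf"), True))  # 3
```

The "bells and whistles" in a real engine (move ordering, transposition tables, hand-tuned evaluation terms) all serve to make these cutoffs happen earlier and more often.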
But let me address a deeper concern that the mention of "Sky Net" brings up. The concern is that these computers are somehow intelligent and will somehow outcompete humans. The first statement is false; human-like artificial intelligence is as far away today as it was yesterday. This article by Adnan Darwiche, called "Human-Level Intelligence or Animal-Like Abilities?" phrases it just right. What we have given machines is a new level of ability. It is like building a car that goes faster or a program that recognizes faces better. These abilities should not be confused with intelligence. As Judea Pearl and I explain in our forthcoming book, The Book of Why (Basic Books, 2018), one of the basic aspects of human intelligence that all artificial intelligence programs lack is an understanding of cause and effect. AlphaZero has no more understanding of why its moves work than Stockfish did. It merely observes that in its huge database of self-play training games, 1. c4 has a high winning percentage for White and 1. e4 e6 has a low winning percentage for Black.
"When will AlphaZero develop the ability to explain these games in terms humans can utilize?" -- Hal Bogner.
This is exactly the right question! The answer is, never. Again, as Judea and I explain in our book, a requirement for AI is that computers should be able to talk with humans in our natural language of cause and effect. AlphaZero is not a step forward.
"I am more concerned with how this type of technology will play out if say it can make similar progress playing the stockmarket." -- Carl Boor
This is a good point. I'm out of my depth here because I am only a chess player, not a stock market player. It's possible that programs like this would tame the irrationalities of the stock market, as more and more people have access to programs that correctly (or more or less correctly) "evaluate the position." But I am far from sure about this, because unlike chess, the stock market is subject to human psychology in ways that could break the system. We might have a computer program that could predict the stock market better than humans 99 percent of the time, but would not be able to predict crashes like the one in 2008.
"The game where Stockfish gets its queen trapped on h8 is terrifying." -- Andy Lee "There's also the last game where the white queen goes from c4 to h4 to h1 in order to play Bg2-e4. All this while down a piece. (White had castled short with a typical fianchetto position.)" -- Michael Aigner
Yes! Hooray! Some chess comments! In fact, one of the most wonderful things about AlphaZero is the way it plays. I only played over two games, #5 and #10, but they were both amazing (not terrifying). AlphaZero is winning by playing incredible brilliancies, sacrificing pieces for nebulous compensation that somehow forces Stockfish to give back a rook 25 moves later. This isn't the death of chess, this is the rebirth of chess!
Finally, no one has mentioned it on Facebook, but Table 2 in the research paper is extremely interesting. It's as close as we're going to get to a "God's-eye view" of the openings. It shows how often AlphaZero chose each of the 12 favorite human openings in its training games against itself, as well as how it performed with those openings against Stockfish. This is a different database than the 100-game match; it's a series of 50-game mini-matches in each of those twelve openings. Suffice to say that AlphaZero really likes the English Opening, 1. c4 e5 2. g3 d5 3. cd Nf6 4. Bg2 Nxd5 5. Nf3, which at certain points in its training period appeared in close to 20 percent of its games. It's a little hard to say what this means, though, because they were self-training games, which means AlphaZero chose that opening line both as White and as Black. Interestingly, it also had a period of roughly four hours (from 2 hours to 6 hours after it learned how the pieces move) when it loved the Caro-Kann, 1. e4 c6 2. d4 d5 3. e5 Bf5 4. Nf3 e6 5. Be2 a6. That line was appearing in around 10 percent of its games. But around the 6-hour mark, it lost its taste for that opening; maybe it was losing too often as Black?
The results against Stockfish are easier to interpret. There we find AlphaZero's Achilles heel, such as it is: it does not score well in the Kan Variation of the Sicilian Defense, 1. e4 c5 2. Nf3 e6 3. d4 cd 4. Nxd4 Nc6 5. Nc3 Qc7 6. Be3 a6. As White, AlphaZero scored +17 =31 -2, or 65 percent, which was considerably worse than the 75 percent it scored when choosing its own openings. As Black, it scored +3 =40 -7, or 46 percent (!!), the only one of the 12 major opening lines where it scored below 50 percent. In general, AlphaZero seemed to play the Sicilian Defense very rarely in its training games.
This makes me wonder about cause and effect. Does it score badly in the Sicilian because it doesn't play enough training games in that opening, or does it not play many training games because it considers the Sicilian Defense a bad opening? Alas, because it can't talk with us about cause and effect, we do not know.