AlphaZero trained itself & in 4 hours destroyed Stockfish 28W-0L-72D!
You can read the academic paper here:
DeepMind’s AI became a superhuman chess player in a few hours, just for fun
The descendant of DeepMind’s world champion Go program stretches its muscles in a new domain
by James Vincent Dec 6, 2017, 8:11am EST
The end-game for Google’s AI subsidiary DeepMind was never beating people at board games. It’s always been about creating something akin to a combustion engine for intelligence — a generic thinking machine that can be applied to a broad range of challenges. The company is still a long way off achieving this goal, but new research published by its scientists this week suggests they’re at least headed down the right path.
In the paper, DeepMind describes how a descendant of the AI program that first conquered the board game Go has taught itself to play a number of other games at a superhuman level. After eight hours of self-play, the program bested the AI that first beat the human world Go champion; and after four hours of training, it beat the current world champion chess-playing program, Stockfish. Then for a victory lap, it trained for just two hours and polished off Elmo, one of the world’s best shogi-playing programs (shogi being a Japanese version of chess that’s played on a bigger board).
One of the key advances here is that the new AI program, named AlphaZero, wasn’t specifically designed to play any of these games. In each case, it was given some basic rules (like how knights move in chess, and so on) but was programmed with no other strategies or tactics. It simply got better by playing itself over and over again at an accelerated pace — a method of training AI known as “reinforcement learning.”
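To make the idea of learning purely from self-play concrete, here is a toy sketch in Python. It is not DeepMind's method: AlphaZero pairs a deep neural network with tree search, while this example uses a plain lookup table and a basic Q-learning update. The game is also an assumption chosen for brevity, a simple stone-taking game (take 1, 2, or 3 stones from a pile; whoever takes the last stone wins). All names here (`train_by_self_play`, `greedy_move`, and the parameter values) are hypothetical and exist only for illustration. What it does share with the article's description is the setup: the program is given only the rules, and both sides of every game are played by the same learner.

```python
import random

ACTIONS = (1, 2, 3)  # rules of the toy game: take 1, 2, or 3 stones per turn


def legal_moves(pile):
    """Moves allowed by the rules for a pile of the given size."""
    return [a for a in ACTIONS if a <= pile]


def greedy_move(q, pile):
    """Pick the move with the highest learned value for the player to move."""
    return max(legal_moves(pile), key=lambda a: q.get((pile, a), 0.0))


def train_by_self_play(max_pile=12, episodes=20000, alpha=0.5, epsilon=0.3, seed=0):
    """Learn move values from scratch by playing both sides of every game.

    q maps (pile, move) to an estimate in [-1, 1] of how good the move is
    for the player making it. No strategy is built in, only the rules.
    """
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        pile = rng.randint(1, max_pile)  # random start so every position gets seen
        while pile > 0:
            # Mostly play the best known move, but explore some of the time.
            if rng.random() < epsilon:
                move = rng.choice(legal_moves(pile))
            else:
                move = greedy_move(q, pile)
            remaining = pile - move
            if remaining == 0:
                target = 1.0  # taking the last stone wins
            else:
                # The opponent (the same learner) moves next, and their best
                # outcome is our worst, so the sign is flipped.
                target = -max(q.get((remaining, a), 0.0) for a in legal_moves(remaining))
            old = q.get((pile, move), 0.0)
            q[(pile, move)] = old + alpha * (target - old)
            pile = remaining  # the other side now moves, using the same table
    return q


if __name__ == "__main__":
    q = train_by_self_play()
    for pile in range(1, 13):
        print(f"pile={pile:2d}  learned move: take {greedy_move(q, pile)}")
```

For this game the known best strategy is to leave the opponent a multiple of four stones. After training, the learned move from any pile that is not already a multiple of four should match that strategy, even though it was never programmed in.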