TCEC Season 13 – the 13th Top Chess Engine Championship
Written by Guy Haworth and Nelson Hernandez
Reading, UK and Maryland, USA
This is the first in a new series of analytical articles on TCEC events. The full text can be read below on this webpage, and at the bottom you will find a link to the fully laid-out article in PDF format, including the important tables, graphs and images.
TCEC is very grateful to the authors for their kind permission to publish these substantial and scholarly analyses of its events!
After the successes of TCEC Season 12 (Haworth and Hernandez, 2018a), the Top Chess Engine Championship moved straight on to Season 13, starting August 3rd 2018 with the same divisional structure as for Seasons 11 and 12.
Five divisions, each of eight engines, played two or more ‘DRR’ double-round-robin phases each, with promotions and relegations following. Classic tempi gradually lengthened and the Premier division’s top two engines played a 100-game match to determine the Grand Champion.
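The round and game counts quoted for each division follow from the double-round-robin format. The sketch below is illustrative only: the function name is hypothetical, and it assumes an even number of engines so that every engine plays in every round.

```python
def drr_schedule(engines: int, drr_phases: int) -> tuple[int, int]:
    """Return (rounds, games) for a number of double-round-robin phases.

    Assumes an even number of engines, so all engines are paired in each round.
    """
    rounds_per_drr = 2 * (engines - 1)   # each opponent is met once with each colour
    games_per_round = engines // 2       # every engine plays exactly one game per round
    rounds = drr_phases * rounds_per_drr
    return rounds, rounds * games_per_round


print(drr_schedule(8, 2))  # (28, 112): two DRR phases, as in Divisions 4 to 1
print(drr_schedule(8, 4))  # (56, 224): four DRR phases, as in Division P
```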
The formidable 44-core server of TCEC11-12 (Intel, 2017) was joined by a second server sporting two Nvidia GeForce GTX 1080 Ti GPUs (Nvidia, 2018) to provide better support for two engines, LC0 and DEUS X, which both exploited LC0’s ‘NN’ neural network architecture. CHESS22K and IVANHOE were also new to TCEC while FRUIT chose to step away from the action this time. The tie-break sequence was changed to ‘number of disconnects’, ‘head-to-head results’, wins, 0-1 wins, Sonneborn-Berger score. Given CHIRON’s and others’ technical failure in Season 12, and the added risk factors associated with the more complex common platform, the rules for modifying engines were redefined to include mandatory scaling-down and one repair of engines between the games of a division.
Division 4: two DRR phases, 28 rounds, 112 games, tempo 30′+10″/m
As for TCEC12, each engine played both White and Black from 14 defined four-ply openings. The results are as in Fig. 3: ‘P%’ is the %-score and ‘ELO±’ is the change to the engine’s nominal ELO based on its performance. ‘nSB’ is the Sonneborn-Berger score, normalised as for one double round-robin.
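As an illustration of two of these columns, the following sketch computes a %-score and a raw Sonneborn-Berger score from a list of results. The engines A, B and C and their results are invented for the example, and the function names are hypothetical. The normalisation behind ‘nSB’ is not reproduced, since its exact form is not given here.

```python
from collections import defaultdict

# Each record is (white, black, white_score), with white_score in {1.0, 0.5, 0.0}.
Game = tuple[str, str, float]


def totals(games: list[Game]) -> dict[str, float]:
    """Total points scored by each engine."""
    points: dict[str, float] = defaultdict(float)
    for white, black, white_score in games:
        points[white] += white_score
        points[black] += 1.0 - white_score
    return dict(points)


def percent_score(games: list[Game], engine: str) -> float:
    """Points scored as a percentage of games played."""
    played = sum(1 for white, black, _ in games if engine in (white, black))
    return 100.0 * totals(games)[engine] / played


def sonneborn_berger(games: list[Game], engine: str) -> float:
    """Sum, over the engine's games, of its result times the opponent's total points."""
    points = totals(games)
    sb = 0.0
    for white, black, white_score in games:
        if white == engine:
            sb += white_score * points[black]
        elif black == engine:
            sb += (1.0 - white_score) * points[white]
    return sb


# Hypothetical three-engine double round-robin, not TCEC data.
games = [
    ("A", "B", 1.0), ("B", "A", 0.5),
    ("A", "C", 0.5), ("C", "A", 0.0),
    ("B", "C", 1.0), ("C", "B", 0.5),
]
print(percent_score(games, "A"))     # 75.0
print(sonneborn_berger(games, "A"))  # 4.5
```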
Online interest naturally focused on the new ‘NN approach’ engines LC0 and DEUS X (Silver, 2018), DEUS X being powered and trained by LC0 software. DEUS X is, however, trained from human games rather than from zero, which is the convention for training ALPHAZERO and LC0. As seen in the results of Fig. 3, and surprising to those not actually involved, LC0 justified its ‘wild card’ invitation with a comfortable win and DEUS X was the runner-up on debut and in its very first version.
Division 3: two DRR phases, 28 rounds, 112 games, tempo 30′+10″/m
Again, the eight engines involved played both sides of 14 prescribed four-ply openings. LC0 upgraded to a new version. 13 games were won ‘below the diagonal’ including NEMORINO-LC0 g11.1/45.
ETHEREAL (Grant, 2018) was way out on its own. LC0 and ARASAN were 6.5 points behind, ARASAN progressing courtesy of the one win between them, g19.1/73. The key error was 29. … Bg4?? which missed 30. c6+ Ka7 31. Rd3 Rxd5 32. Nxd5 Qxb2 33. Ne3 Qc1+ 34. Nd1 Qc2 35. Nf2. LC0 and DEUS X, the latter some 2.5 points behind LC0, were in fact frustrated by overheating in the GPU hardware which therefore had to be throttled back. First impressions are that LC0 and DEUS X play in a more human way, being relatively strong on strategy like ALPHAZERO but weaker on tactics. NEMORINO rather than HANNIBAL was demoted as it crashed twice, one being against ARASAN.
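Game references such as g19.1/73 are not spelled out in the text. One reading that fits this example is round, then game within the round, then the running game number in the division. The sketch below checks that reading. It rests on the assumption of four games per round, which follows from eight engines per division. The function names are hypothetical.

```python
import re

GAMES_PER_ROUND = 4  # assumption: eight engines, hence four games per round


def overall_game_number(round_no: int, game_in_round: int) -> int:
    """Running game number implied by a round and a game-within-round index."""
    return (round_no - 1) * GAMES_PER_ROUND + game_in_round


def parse_game_ref(ref: str) -> tuple[int, int, int]:
    """Split a reference such as 'g19.1/73' into (round, game in round, overall)."""
    match = re.fullmatch(r"g?(\d+)\.(\d+)/(\d+)", ref)
    if match is None:
        raise ValueError(f"unrecognised game reference: {ref!r}")
    round_no, game_in_round, overall = map(int, match.groups())
    return round_no, game_in_round, overall


round_no, game_in_round, overall = parse_game_ref("g19.1/73")
print(overall_game_number(round_no, game_in_round) == overall)  # True
```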
Division 2: two DRR phases, 28 rounds, 112 games, tempo 30′+10″/m
Nine of the 40 wins are below the diagonal in the cross-table of Fig. 5. ETHEREAL and CHESSBRAINVB quickly distanced the rest of the field and finished three points clear of XIPHOS, an engine that double-promoted from Division 4 in TCEC12. ETHEREAL won, adding another 28 games without defeat. In game 19.2/74, ETHEREAL reached an adjudicated KRKRPP mate in 42 moves against XIPHOS. There is always a question of what contribution the sub-7-man EGTs make. Here, XIPHOS was not using the Syzygy 6-man EGTs (de Man, 2018) while ETHEREAL was in the end consulting them more than 10 million times per move.
VAJOLET cut back on threads and power after two disconnects. ARASAN had one fewer win than NIRVANA and so was this time on the wrong end of the third tie-break.
Division 1: two DRR phases, 28 rounds, 112 games, tempo 60′+10″/m
Eight wins went to engines relatively lower in the final table, most notably g7.2/26 LASER’s defeat of CHIRON (a wild finale of 30 moves and two Q-sacrifices) and CHESSBRAINVB’s two wins over FRITZ, g6.2/22 and g20.2/78.
Other notable games included g1.3/3 CHIRON-FRITZ, EGT-adjudicated as a 61m mate after 83 moves: in fact, it had been a 7-man 46m mate after 72 moves but the shortest route to goal is not usually the one most easily traversed. BOOOT sadly got off on the wrong foot with disconnects in games 2.2 and 4.3. ETHEREAL playing Black swiftly demolished FRITZ in g7.3/27 and JONNY in g8.2/30. In g9.4/36, ETHEREAL demonstrated the value of the EGTs in beating CHESSBRAINVB after reaching a KRPKNP endgame with mate in 37 moves: ETHEREAL consulted the EGT over 100 million times on move 50w.
ETHEREAL and CHIRON had established their claims to the top spots on the podium with the first round-robin. They extended away with FIZBO a distant third. ETHEREAL has just one loss, to CHIRON, in its last 84 games and has uniquely promoted three times this season. It won both sides of an opening against CHESSBRAINVB, JONNY and FRITZ here. CHESSBRAINVB mysteriously worsened with each round-robin and returned to Division 2 after being third at the mid-point. The two early crashes by BOOOT led to its downfall and saved LASER from the same fate. Given that crashes are so disappointing for the online audience, TCEC could usefully pull together the known intelligence on how to avoid them.
Division P: four DRR phases, 56 rounds, 224 games, tempo 90′+10″/m
ANDSCACS, ETHEREAL, GINKGO, KOMODO and STOCKFISH updated for this season whereas CHIRON, FIRE and HOUDINI did not. A key question was whether CHIRON and ETHEREAL would stay in the top division after their promotions. The mandated openings from the second author here specified the first eight moves.
After the first round-robin, STOCKFISH led KOMODO with ANDSCACS, ETHEREAL and HOUDINI contesting third place. After colour-switching the engines in the second round-robin to level the playing field, a clearer potential podium suggested itself: STOCKFISH, HOUDINI, KOMODO, FIRE in equidistant line astern with ETHEREAL just fifth. However, a presumably updated ETHEREAL might fare better in the TCEC 2018 Cup (Haworth and Hernandez, 2018c), an interlude following this division. After the first quarter, where one might claim to be half-informed statistically, GINKGO and CHIRON were occupying the relegation zone. The matches STOCKFISH–ANDSCACS and KOMODO–GINKGO were 2-0 wins for the first-named engine.
At the half-way point, STOCKFISH had pulled 3½ points clear of HOUDINI, courtesy of two relatively successful results, 4-0 v ANDSCACS and 3½-½ v ETHEREAL. Both leaders remained unbeaten and had scored 3-1 against KOMODO which was clear 3rd. FIRE was a lonely 4th: the top half of Division P seems to be unchallenged and perhaps sequenced. ETHEREAL just edged 5th on number of wins but was only 1½ points clear of tail-ender GINKGO.
The third DRR saw KOMODO wake up, breathing fire. It inflicted a first loss on STOCKFISH and its third on FIRE and ETHEREAL: it sustained no losses itself. It finally overhauled the still unbeaten but win-shy HOUDINI with scores of 4½/7 in RR5 and 5½/7 in RR6. Would KOMODO continue in this vein? Would HOUDINI’s +2 against KOMODO save it in a tiebreak? Who would ultimately join STOCKFISH in the Superfinal? In RR8, g50.4/200, KOMODO beat STOCKFISH and two games later, STOCKFISH beat HOUDINI: the first game had plenty of play left after 73 moves but the second was a clearer and quicker win from an advantageous opening.
The division was marked by relatively few wins for Black, the long g14/4.2 FIRE-KOMODO battle being of particular interest. Perhaps the only two notable ‘underdog wins’ below the cross-table diagonal were by the demoted engines against ETHEREAL (games 25.4/100 and 52.1/209) which was only three points above demotion itself.
This season, TCEC introduced a change of mode between Division P and the Superfinal. This was the ‘TCEC Cup’, a knockout tournament involving all the TCEC 13 engines. It was an excellent innovation which will no doubt be repeated. The authors here report on its thirty-one matches separately (Haworth and Hernandez, 2018c).
Jeroen Noomen (2018) had adjusted his approach to choosing superfinal openings. His comments reveal how much thought goes into this aspect of TCEC. The 50 openings split across the ECO A/B/C/D/E range 13/12/12/6/7, the D/E lines being considered “too easy for top engines”. The openings aimed to leave a position with an advantage in the range [0.2, 0.55] and, despite the excellence of the engines, a win-rate of 20% was expected with 25% as target.
Once again, Jeroen made target. The win-rate was 22%, STOCKFISH winning 16 to KOMODO’s 6. STOCKFISH had two wins with Black to KOMODO’s one and there was only one game-pair, games 85-86, where both sides won. Thus, the final score was 55-45, see Fig. 8, a performance that would suggest an ELO difference of only 36. In fact, although KOMODO lost the match, it did marginally better than might have been expected.
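The arithmetic behind these figures can be retraced as follows. This is a sketch under one stated assumption: it uses the standard logistic Elo expectation formula, which gives a difference of roughly 35 for a 55% score. The figure of 36 quoted above is close to this, and the small gap may reflect a different conversion, for instance a tabulated one. The function name is hypothetical.

```python
import math


def elo_difference(score_fraction: float) -> float:
    """Rating difference implied by a score fraction under the logistic Elo model."""
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


# Superfinal figures from the text: 100 games, 16 wins to 6.
wins_stockfish, wins_komodo, games = 16, 6, 100
draws = games - wins_stockfish - wins_komodo       # 78 draws
score_stockfish = wins_stockfish + 0.5 * draws     # 55.0 points
win_rate = (wins_stockfish + wins_komodo) / games  # share of decisive games

print(score_stockfish, games - score_stockfish)    # 55.0 45.0
print(win_rate)                                    # 0.22
print(round(elo_difference(score_stockfish / games), 1))  # roughly 35
```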
Wool (2018) provides an admirably generous and informative commentary on the games, covering the wins of course but also showing the struggle inherent in the many draws.
The two innovative engines exploiting neural-network architecture progressed to Division 3 with LC0 nearly promoting again to Division 2. Shall we see one of them passing through Division 2 next season?!
TCEC are to be congratulated for taking on the cost, risk and controversy of including GPUs to facilitate these exciting NN developments. They are now being rewarded by positive momentum and results from these new engines. No doubt the overheating and reliability problems will be addressed and solved. Another highlight was ETHEREAL’s progression from Division 3 to Division P where it still gained ELO points despite shipping several losses.
The TCEC exploration of chess openings by the second author here and by Jeroen Noomen has been treated above. Terminations by the 50-move rule and ‘EGT wins’ are very rare as the engines anticipate these endings and evaluate accordingly.
Assaf Wool (2018), as mentioned above, continues to provide his usual statistics and perspective on the TCEC tournaments, picking out his own favourite games for each round robin. This is very much to be applauded. ‘GM TheChesspuzzler’ (2018) set up further playlists on YouTube. Kingscrusher (2018) is also commenting on TCEC and particularly LC0 in his comprehensive YouTube presence, 5000 videos and counting.
The pgn and logfiles for TCEC13, together with some chess and statistical analysis as in Fig. 10, are available (Haworth and Hernandez, 2018b) for further study. Some of the decisive games have had an exemplar playout added as a variation. Whether you are looking for opening novelties or subtle endgames, the longest, most balanced or the shortest, most dramatic battles, see Fig. 9, there is plenty of interest here, plenty of occasion for reflection. Feedback to the authors is most welcome.
To read the full article in pdf, click HERE