Saturday, January 10, 2015

Martin Thoresen's World Chess Championship

My third “Game On” column, "The Real Kings of Chess Are Computers," appears this weekend in The Wall Street Journal. I write about the "real world chess championship," which is known formally as the Thoresen Chess Engines Competition, or TCEC. This is a semi-annual tournament that pits almost all the top computer chess programs against one another. Since the best chess engines are now much stronger than even the best human players, a battle between the top two engines is a de facto world championship of chess-playing entities.

That battle was the Superfinal match of TCEC Season 7, and it was won last month by Komodo over Stockfish (both playing on the same 16-core computer). In a digital-only extra, "Anatomy of a Computer Chess Game," I try to explain a key moment in game 14 of the match, which gave Komodo a lead it never relinquished over the remaining 50 games.

As part of the research for these pieces, I interviewed TCEC impresario and eponym Martin Thoresen by email. Below is an edited transcript of our conversation, which took place between 29 December 2014 and 2 January 2015. The questions have been re-ordered to make the flow more logical.

CHRISTOPHER CHABRIS: Let’s start with the recent Season 7 Superfinal match. What is your opinion about the result? Do you think it shows that Komodo is a “better chess player” than Stockfish, in their current versions?
MARTIN THORESEN: I think the Superfinal was very close and exciting. The draw rate was slightly higher than what I expected, but then again the engines are very close in strength so this is quite natural. I think the result shows that Komodo is the better engine on the kind of hardware that TCEC uses. And for grandmasters with powerful computers this should be something to take note of when they analyze games using chess engines.

Do you believe that TCEC features the “best chess players” in the world?
Yes, I would say any of the top programs of say, Stage 3 and onwards would pretty much crush any human player on the planet using TCEC hardware.

Do you think it is a problem to have so many draws (53 out of 64 games)? It definitely distinguishes engine-engine matches from human-human matches to have so many draws, but I agree with you that it must result partly from the players being stronger than the best humans.
Personally I don’t mind the draw rate being this high in the Superfinal; it makes it very tense. But one of the main goals of TCEC is to entertain people. Too many draws detract from that, and too many one-sided openings would lower the quality overall, even if they lower the draw rate. I would be satisfied with a draw rate of roughly 75% in the Superfinal.
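
(An aside for readers who like to check the arithmetic: the short Python sketch below works out the draw rate from the figures quoted above, 53 draws in 64 games, and what a 75% rate would mean over the same match length. The function names are my own, chosen only for illustration.)

```python
def draw_rate(draws: int, games: int) -> float:
    """Fraction of games that ended in a draw."""
    return draws / games


def draws_for_target(games: int, target_rate: float) -> int:
    """Number of drawn games that corresponds to a target draw rate."""
    return round(games * target_rate)


# Figures quoted in the interview: 53 draws out of 64 Superfinal games.
superfinal_rate = draw_rate(53, 64)
decisive_games = 64 - 53

# A 75% draw rate over the same 64 games.
target_draws = draws_for_target(64, 0.75)

print(f"Season 7 Superfinal draw rate: {superfinal_rate:.1%}")
print(f"Decisive games: {decisive_games}")
print(f"Draws at a 75% rate: {target_draws} (leaving {64 - target_draws} decisive games)")
```

So the Superfinal's draw rate was a little under 83%, and a 75% rate would have meant 16 decisive games instead of 11.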

You must have watched more engine-engine games than almost anyone else. Were there any games or particular moves or positions that you thought were especially beautiful or revealing in this most recent Superfinal match?
I have not looked deeply at all the games yet, but games like #9 strike me as fascinating.

Let’s talk about some of the details of how TCEC works. Are the games played entirely on your personal computer at your home?
Yes, it’s a 16-core server I’ve built myself. It has two 8-core Intel Xeon processors and 64 GB RAM. It’s located at home here in Huddinge, a suburb of Stockholm, Sweden. I live in an apartment of about 45 square meters.

Why do the games run only one at a time? Because it all happens on one computer? Have you considered using multiple computers so that more games can happen at one time?
Yes exactly, they run only one at a time because the engines utilize all 16 cores to get maximum power, which makes it impossible to run more games. Using more computers is of course something I wish I could do, but then people need to donate more. ☺ The server cost me roughly €4000–€5000 to build. Of course it would be possible to limit each engine to say, four cores, then I could have four games running simultaneously, but then again the engines would be weaker due to the fewer cores. I want TCEC to show only the highest quality of games. Not to mention that I’d have to redesign the website to support many games at once.

How hard was it to write the code that “plays” the two engines against each other, passing moves back and forth, and so on? Do the engines provide you with an API, or do the engine authors give you a special version that corresponds to an API for your own server code? (I assume you wrote the server code yourself too, correct?)
The interface that plays the games is a small command-line tool called cutechess-cli, somewhat modified for TCEC by Jeremy Bernstein following my instructions. I did not write this tool myself. Cutechess is simply a UCI/Xboard interface tool that “runs” the engines in accordance with the UCI or Xboard specifications. Basically all chess engines comply with the UCI or Xboard protocols for I/O requests (time control, time left, the move it makes, etc.). Using this tool does not give you a chessboard to view the action like a GUI (Fritz, Arena, SCID, etc.), so ironically I can’t actually watch the game on the server; all I see is a bunch of text.
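
(For the programmers reading: here is a minimal Python sketch of what "passing moves back and forth" looks like at the level of UCI text. The `position`, `go`, and `bestmove` lines are standard UCI messages, but everything else is my own illustration and not cutechess-cli's code: the function names are hypothetical, the "engines" are scripted stand-ins that return canned replies instead of real processes, and the clock value is arbitrary.)

```python
from typing import Callable, List, Optional

# Hypothetical stand-in for a real engine process: it receives the two
# UCI command lines and returns the engine's final reply line.
Engine = Callable[[str, str], str]


def position_command(moves: List[str]) -> str:
    """UCI 'position' command for a game that began from the start position."""
    if not moves:
        return "position startpos"
    return "position startpos moves " + " ".join(moves)


def go_command(wtime_ms: int, btime_ms: int, winc_ms: int = 0, binc_ms: int = 0) -> str:
    """UCI 'go' command telling the engine how much clock time each side has."""
    return f"go wtime {wtime_ms} btime {btime_ms} winc {winc_ms} binc {binc_ms}"


def parse_bestmove(line: str) -> Optional[str]:
    """Extract the move from a 'bestmove e2e4 ponder e7e5' reply, if present."""
    parts = line.split()
    if len(parts) >= 2 and parts[0] == "bestmove":
        return parts[1]
    return None


def relay(white: Engine, black: Engine, max_plies: int, clock_ms: int = 60_000) -> List[str]:
    """Alternate between the two engines, feeding each the moves played so far."""
    moves: List[str] = []
    for ply in range(max_plies):
        engine = white if ply % 2 == 0 else black
        reply = engine(position_command(moves), go_command(clock_ms, clock_ms))
        move = parse_bestmove(reply)
        if move is None:  # no usable reply, so stop the sketch here
            break
        moves.append(move)
    return moves


def scripted(replies: List[str]) -> Engine:
    """Fake engine that ignores its input and returns canned replies in order."""
    pending = iter(replies)
    return lambda position, go: next(pending, "")


game = relay(
    scripted(["bestmove e2e4 ponder e7e5", "bestmove g1f3"]),
    scripted(["bestmove e7e5"]),
    max_plies=4,
)
print(game)
print(position_command(game))
```

A real tournament manager also has to launch the engine processes, track the clocks, and adjudicate results, all of which this sketch leaves out.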

Who developed the software to broadcast the games to the internet? As someone who followed the latest Superfinal and browsed the archives quite a bit, I can say that it has a very nice interface.
There are two parts of TCEC. One is the website which shows the games, the other is the server on which the games are played. These two are not run on the same machine (for obvious performance reasons), so the server uploads the PGN to the website each minute. The website is designed by me and it has had different designs in previous seasons. The core technology on which it is built is the free JavaScript chess viewer called pgn4web.
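
(Another aside: PGN is a plain-text format, so the snapshot that the server sends to the website each minute can be pictured as a few bracketed header tags followed by the moves so far. The Python sketch below is my own illustration, not TCEC's code. It writes only three of the standard header tags, the moves are placeholders, and the function names are hypothetical. The `*` result marker is the PGN convention for a game still in progress.)

```python
from typing import List


def movetext(san_moves: List[str], result: str) -> str:
    """Number the moves in pairs and end with the result marker."""
    tokens: List[str] = []
    for i, san in enumerate(san_moves):
        if i % 2 == 0:  # White's move starts a new move number
            tokens.append(f"{i // 2 + 1}.")
        tokens.append(san)
    tokens.append(result)
    return " ".join(tokens)


def pgn_snapshot(white: str, black: str, san_moves: List[str], result: str = "*") -> str:
    """Abbreviated PGN text for a game; '*' marks a game still in progress."""
    header = [
        f'[White "{white}"]',
        f'[Black "{black}"]',
        f'[Result "{result}"]',
    ]
    return "\n".join(header) + "\n\n" + movetext(san_moves, result) + "\n"


# Placeholder moves, not taken from any actual TCEC game.
print(pgn_snapshot("Komodo", "Stockfish", ["e4", "e5", "Nf3"]))
```

A viewer such as pgn4web then only needs to re-read that text file periodically to show the latest position.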

How much money would you estimate you have personally spent, and how much total has been spent, to run the TCEC since it started, and Season 7 specifically?
I have spent a lot of money. I am not quite sure how much, but I would estimate €6000–€7000 since TCEC started (hardware upgrades, power bills, etc.).

How many hours do you spend on it out of your own life?
For Season 7 I didn’t really code anything new for the website compared to Season 6, so I didn’t spend much time preparing this time around. But when I made the new (current) website for Season 6, I started right after Season 5 finished and coded for almost 3 months straight, sometimes as much as 4–6 hours a day. That left little sleep considering I had (and still have) a full-time job as well. But when a season is running, my attention goes mostly to moderating the chat and making sure the hardware runs as it should. So anything from 0–4 hours per day during a season.

Are there any major engines that did not participate over the past few seasons? If so, do you know why they declined?
I pick the engines myself, but there was the case of HIARCS for Season 6, where the programmer Mark Uniacke told me to withdraw it. I only did it because I did not buy his program—he sent it to me for free for Season 5. But if I had bought it myself, I would have included it. Other than HIARCS there have not really been any similar cases in TCEC history. Now and then the question of why Fritz does not participate pops up, but that has a simple answer: It does not come in a form that supports UCI or Xboard—it has a native protocol built into the Fritz GUI which makes it unusable. 

If I understand correctly, your goal is to include every major engine, and the only reasons they could be left out are (a) their authors explicitly withdraw them, or (b) they aren’t compatible with the required protocols. Do I have that right? And that HIARCS and Fritz are the only major engines not participating?
Yes, every major engine that is not a direct clone. The whole clone debate is a hot topic in most computer chess forums. So your (a) and (b) are both correct. HIARCS was not a part of Season 7 for the same reason as it was not a part of Season 6.
 
Has there been any recent criticism of the TCEC from chess engine developers that were not included (Fritz), or sat out (HIARCS), or others?
No, there has not.
 
How strong a chess player are you? Do you play in tournaments, a club, or online?
I am not very strong. I don’t even have a rating. I would estimate my strength at around 1500 FIDE on a good day.
 
Can you tell me a bit about yourself?
I am 33 years old and living with my dog. (For now!) I am currently working as an IT consultant and for the past 1.5 years I’ve worked for Microsoft as part of their international Bing search engine MEPM team. I have no formal education apart from what would equal high school in the U.S. Everything I’ve done so far is self-taught.
 
How many other people help regularly in organizing and running the TCEC? Are they all volunteers?
Nelson Hernandez is in charge of the openings, assisted by Adam Hair and international master Erik Kislik. Jeremy Bernstein has helped me with the cutechess-cli customization. Paolo Casaschi (author of pgn4web) has also helped me with some specific inquiries I’ve had about JavaScript code. They are all volunteers. ☺

How did the idea for the TCEC come to you?
Basically it started after I left the Computer Chess Rating Lists (CCRL) after a couple of years of being a member. I was tired of just running computer chess engine games for statistics; I wanted to slow down the time control and watch the games. Obviously, the idea of a live broadcast wasn’t new, and in the beginning it was very simple, just a plain website with moves and not much else. It has now evolved with a more advanced website that I think is kind of intuitive and nice to use and gives TCEC a kind of unique platform.
 
Why is there so little time between TCEC seasons? Why not one season per year, more like the human world championship? Do the engines change enough between seasons for such frequent seasons to be meaningful?
The rhythm the past few years has been roughly two seasons per year. One season takes 3–4 months, so basically you can watch TCEC for half a year per year. It is definitely debatable whether this is useful or meaningful, but that’s just how it has been. Of course, this might change in the future. I have no other good answer. ☺

What are your plans for the future of TCEC, short-term and long-term?
Short-term would be to take a (well deserved) break. ☺ Long-term would be to be recognized by some big company to “get the ball rolling.”

Are you planning any changes in the format or rules for Season 8?
There might be changes for Season 8. Nothing is decided yet.
 
Regarding rules, while following the Superfinal games I noticed that some games were declared drawn by the rules when there seemed to be a lot of life left in the position—for example, the final position of game 18, which human grandmasters might play on for either side. Do you think this rule might be revised?
I don’t think the TCEC Draw Rule or TCEC Win Rule will be changed. They have been there from the start (slightly modified since the beginning) and no one is really complaining. As for the particular example with game 18, both engines are 100% certain that this is a draw (both show 0.00) so even if we humans think it looks chaotic, the engines simply have it all calculated way in advance.
 
I noticed that endgame tablebases were not used in the Superfinal, and this must have resulted in some incorrect evaluations. For example, as I was watching one game, I saw that one engine’s principal variation ended in a KRB-vs-KNN position, which is a general win for the stronger side, but the evaluation was not close to indicating a forced win. Do you think that could have helped cause more draws to happen?
That is correct, tablebases were disabled for all engines for the whole of Season 7. Previously they had been available, but some fans wanted them disabled so I figured they would have their wish fulfilled for Season 7. What tablebases basically do is help the engines find the correct way into a winning endgame or, in the worst-case scenario, prevent a loss. It shouldn’t affect the draw rate overall since it would even out in the end. But the point is that without tablebases, the engines can only rely on their own strength in the endgame and the path for getting there.

Have you thought of inviting strong players to comment on the games live, as happens in the top human-versus-human tournaments and matches? Is it too expensive?
We’ve had some discussions, but nothing concrete yet. It could probably be something to do for the Superfinal if the required money could be arranged.

Have you approached any major companies like Intel, AMD, or Microsoft about sponsoring the event or making it much bigger in scope/publicity?
Not in a while. Back when I did, I got no reply or acknowledgment whatsoever.

Do you have data on how many people in total looked at the latest Superfinal on tcec.chessdom.com, and any other rough numbers on chat commenters, etc.? 
There were approximately 26,000 unique visitors there during the Superfinal. From memory, the number of users in the chat peaked at roughly 600 at one point during the match.

Do you think that the chess world should pay more attention to TCEC in particular, and to engine-versus-engine games in general? They are rarely quoted in discussions of opening theory, or of the best games, best moves, or most interesting positions. Do you have an opinion about why this is?
I think they should. There are so many beautiful games coming out of TCEC that can blow one’s mind. Why we see little reference to engine-versus-engine games is hard to say, but my guess is that it is related to the fact that a chess engine is basically an A.I., so people might have a hard time admitting that “a robot” can play even more beautiful chess than humans.
 
What intrigues me most about TCEC may be the fact that it is a very personal project for you, yet it has attained a measure of worldwide respect and fame without having a big sponsor or lots of money involved.
This project is of course very personal. Anton Mihailov of chessdom.com contacted me prior to Season 5 and we have continued our cooperation since. To have a hobby being acknowledged like that is of course very nice. With that said, if Intel or AMD or any other big company would be interested in sponsoring TCEC I would definitely be interested in having a talk with them too. Bottom line is: Most people regard TCEC as the official “world computer chess championship.” And I don’t think they are wrong about that! ☺

My thanks to Martin Thoresen, grandmaster Larry Kaufman (of the Komodo team), international master Erik Kislik (who made the final selection of openings for the match), and everyone else who answered my questions for these pieces. I am looking forward to Season 8 of TCEC!