Russell Sherwood

Lost in the Endgame

Russell Sherwood Thursday, April 20, 2017

Is there a Good, Better, Best Engine in the Endgame?

When it comes to Engines, the 2nd most common question asked is “Which Engine is the best in the X?”, the “X” being the Opening, Middlegame or Endgame.

It's generally held that Engines are still relatively weak in the Endgame and this is demonstrated in a number of games where the evaluation is significantly incorrect (typically showing Win/Loss for a draw).

This raises four interesting points for the CC Player:

Which kind of positions does the Engine typically misevaluate?
Which Engines tend to be better in the endgame?
Do any combinations of Engines improve the overall analysis?
Can I leverage any of this knowledge in my own games?

Misevaluation

The most common errors tend to occur in positions which, when stripped bare, are about Zugzwang or Fortresses. Misevaluation also happens because engines use generic methods which generate “good” moves but not necessarily the absolute best move. In endgame analysis, the engine often relies on search depth rather than knowledge to find the win, but the required winning move is often beyond its horizon.

Engines

My first analysis used a very large EPD file with 432 endgame positions – some simple, some very difficult. This analysis showed two interesting outcomes:

(1) Even the best engines were only scoring just over 80% or, more importantly, were getting around 20% of positions incorrect. That is not to say that these moves were blunders; many were 2nd best but not optimal.

(2) There was very little difference at the top, with Komodo (10.4) and Stockfish (in the form of Matefinder) being out in front and Houdini 5 just behind – we then see a chasing pack with Deep Shredder, Fritz 15 and older versions of Komodo and Stockfish. Looking at some historical analysis:

https://sites.google.com/site/computerschess/scct-ets-all

We can see that Komodo, in its most recent versions, appears to have made up ground on Stockfish and Houdini.

What is interesting is that each new version of the Big 3 tends to advance the score by 1-2%, this being due to improvements in general evaluation picking up a few extra “correct” moves, rather than any additional knowledge being added to the engine.

This knowledge is what tends to make a difference in Endgame analysis – even a fairly weak human player will know about Bishops of Opposite colour being generally drawish, but an engine which is a pawn ahead and does not have this built in will trade down, evaluating the game as won!

The view that knowledge is the key factor here is supported by running the tests again with significantly increased times – as expected, the scores only increased slightly.
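For readers who want to run a similar test, here is a minimal sketch in Python of how an EPD suite could be scored. It assumes each record carries the standard `bm` (best move) and `id` operations, and that the engine's chosen move for each position has already been collected by some other means. The two records and the `ANSWERS` dictionary are placeholders made up for illustration; they are not taken from the 432-position file and the `bm` entries are not verified analysis.

```python
def parse_epd(line):
    """Split one EPD record into its position and a dict of operations."""
    fields = line.strip().split(None, 4)
    position = " ".join(fields[:4])  # board, side to move, castling, en passant
    operations = {}
    if len(fields) == 5:
        # Simplification: assumes no ';' appears inside a quoted operand.
        for chunk in fields[4].split(";"):
            opcode, _, operand = chunk.strip().partition(" ")
            if opcode:
                operations[opcode] = operand.strip().strip('"')
    return position, operations


def score_suite(epd_lines, engine_moves):
    """Count records where the engine's move is one of the 'bm' moves."""
    solved = 0
    for line in epd_lines:
        _, operations = parse_epd(line)
        best_moves = operations.get("bm", "").split()
        if engine_moves.get(operations.get("id")) in best_moves:
            solved += 1
    return solved


# Placeholder records: the 'bm' entries are illustrative, not verified analysis.
SUITE = [
    '8/8/8/4k3/8/4K3/4P3/8 w - - bm Kd3 Kf3; id "toy.1";',
    '8/8/8/8/8/2k5/2p5/2K5 b - - bm Kd3; id "toy.2";',
]
# Hypothetical engine output, keyed by the record's id.
ANSWERS = {"toy.1": "Kf3", "toy.2": "Kb3"}

print(score_suite(SUITE, ANSWERS), "of", len(SUITE))  # prints "1 of 2"
```

A second-best move counts as a miss under this scoring, which matches the point made above that an "incorrect" answer is not necessarily a blunder.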

So I then moved onto a smaller 100 position test set with a view to looking at coverage.

So out of a score of 100 I obtained….

85 CorChess 1.2

84 Matefinder

82 Raubfisch ME262

81 Sugar Xpro 1.0

80 AsmFish

80 Houdini 1.5

79 Houdini 5

78 Komodo 10.4

74 DS 13

71 Fritz 15

71 Gull 3

68 Andscacs

65 Hiarcs 14

59 Hakkapeliitta

57 Critter 1.6

57 Chiron 3

41 Fire 4

I did run other, older engines through the test but in general their scores were at the bottom of the table. I also ran other Stockfish Clones through the test suite but these simply crowded out the top of the table.

So from this, we see that even the best modern engines are missing around 15% of these positions.

Some might be surprised that Houdini 1.5 does so well. To a certain extent this is due to the way that Chess Engines are developed. What is more important these days is developing engines which beat other engines and sit atop the rating list. This, combined with the self-test method utilised, means that on occasion 1 loss is sacrificed to gain 50 wins and the “Baby goes out with the bathwater”.

Does using a range of engines improve accuracy?

Even from this small-scale test, the answer appears to be a clear Yes!

The best single score is 85/100, but with a combination of Engines, we reach 99/100!

So looking at a few combinations:

CorChess 1.2/Houdini 5/Komodo 10.4 we get 92/100

For 99/100 we need(!!)

MateFinder/Houdini 5/Fritz15/Sugar/DS13/Gull3 and Hakkapeliitta! Rather impractical.

A rather more interesting choice is:

CorChess (or another Stockfish Clone) with Support from Houdini 1.5 and Hakkapeliitta with a score of 96/100
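The combination scores above are simply the number of positions solved by at least one engine in the group. A minimal Python sketch of that calculation follows; the engine names and solved-position sets are hypothetical toy data, not the figures from my spreadsheet, and the exhaustive search is only practical for small groups.

```python
from itertools import combinations


def coverage(solved_by_engine, engines):
    """Positions solved by at least one of the named engines."""
    return set().union(*(solved_by_engine[name] for name in engines))


def best_combination(solved_by_engine, size):
    """Try every group of the given size and keep the widest coverage."""
    return max(
        combinations(sorted(solved_by_engine), size),
        key=lambda group: len(coverage(solved_by_engine, group)),
    )


# Hypothetical data: position numbers each engine answered correctly.
SOLVED = {
    "Engine A": {1, 2, 3, 4},
    "Engine B": {1, 2, 3, 4, 5},
    "Engine C": {6, 7},
}

pair = best_combination(SOLVED, 2)
print(pair, len(coverage(SOLVED, pair)))  # ('Engine B', 'Engine C') 7
```

In the toy data the weakest single engine is part of the best pair, because it solves positions the others miss. This is the same effect that makes an older engine a useful partner in the combinations above.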

Can I leverage any of this knowledge in my own games?

The simplest takeaways from this brief analysis are:

(1) Engines do have gaps in their knowledge, which need to be taken into account both in analysis and CC play.

(2) These knowledge gaps are not universal and using a combination of engines to analyse a position is likely to bring us closer to the “right” move.

(3) Not all older engines are obsolete and some do give new insights into a position.

(4) Even with a combination of engines, the Player is still required to act as arbiter and, when doing so, he or she should consider that the highest rated engine is not always correct.

This brings us to the end of this brief review of Engines in the Endgame. I include a spreadsheet with engine results should the reader be interested in different combinations.

Download

Getting those Title Norms you know you deserve!?

Russell Sherwood Tuesday, April 18, 2017

In a prior article, I reviewed a path to the CCE Title but considering my own aspirations I thought it useful to review the approach I believe is necessary for Higher Titles.

The first thing we need to consider is getting access to the right level of Tournament. Although CCM, IM, SIM and GM Norms become available at Category G, 1, 4 and 7 respectively, very few Norms are scored at these levels because you are required to score around 79% (around 7 wins out of 12 games), which in modern CC is almost unheard of.

Preference is instead given to events 3 Categories higher (J, 4, 7 and 10), where the required performance drops to 4 wins from 12 games – difficult but not impossible.
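As a quick check of the arithmetic, the small Python sketch below converts a number of wins into a percentage score. It assumes every game that is not won is drawn (no losses), which is my reading of the figures above rather than anything taken from the Norm tables.

```python
def score_percentage(wins, games, losses=0):
    """Percentage score, counting every game not won or lost as a draw."""
    draws = games - wins - losses
    return 100 * (wins + 0.5 * draws) / games


print(round(score_percentage(7, 12), 1))  # 79.2
print(round(score_percentage(4, 12), 1))  # 66.7
```

So 7 wins and 5 draws from 12 games is the "around 79%" quoted, while 4 wins and 8 draws is roughly two thirds.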

So from a practical view, we need to build our rating up as high as possible to be able to try and gain access to these higher level events – either via direct access (such as Master Norm Tournaments) or selection (such as Invitational or National Team events). Outside of these, the best possible option for an aspiring player is the Champions League, where even in the lowest Division events up to Category 3 are possible without either the rating limit or selection process seen in most other events.

To boost our rating, we have to look to primarily win games against lower rated players. This is worthy of a very extensive article in its own right but the basics include:

Going into Opening variations which are not fully clear
Not following worn out book lines
Not slavishly following engine advice

Let’s assume we have reached the position of being able to secure entry to a suitable event. For me there are three areas we now need to consider:

General Event Preparation
Specific Opponent Preparation
Opening Preparation

All of these overlap to a certain extent, starting with:

General Event Preparation

Here we have several considerations:

What are my performance requirements for the Norm?
What other objectives am I, or are my opponents, likely to have for this event? (e.g. Do I want to be promoted to the next round of an event?)
How important is this event to me? (Am I willing to risk losing when trying to win?)
Are there wider considerations? (Team events place additional constraints on players)

These combine to give a general picture of the event and what is required of you to achieve your goals!

In addition, at this point it can also be a very useful exercise to determine the “average” expected Outcome of the event – this can be built up in a spreadsheet quite quickly and can give an indication of potential targets and threats.

Specific Opponent Preparation

If we have built the spreadsheet mentioned above we have a very useful indication of the games we are expected to win and lose. This can also be done by a simple examination of our games. As a general rule, we should be looking to win against anyone we are more than 100 Elo above; against anyone more than 100 Elo above us a draw is our likely target; and against anyone else a draw is also the likely outcome.
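The spreadsheet idea can also be sketched in a few lines of Python. The version below assumes the standard Elo logistic curve for the expected score, which is an assumption on my part (the ICCF rating rules use their own tables and may give slightly different figures), and it applies the 100-point rule of thumb described above. The ratings in the example are hypothetical.

```python
def expected_score(own_rating, opponent_rating):
    """Standard Elo logistic expectation (an assumption; ICCF tables may differ)."""
    return 1 / (1 + 10 ** ((opponent_rating - own_rating) / 400))


def game_plan(own_rating, opponent_rating, margin=100):
    """Apply the 100-point rule of thumb from the text."""
    gap = own_rating - opponent_rating
    if gap > margin:
        return "play for the win"
    if gap < -margin:
        return "draw is the target"
    return "draw is the likely outcome"


def forecast(own_rating, opponent_ratings):
    """Sum of expected scores: the 'average' outcome of the event."""
    return sum(expected_score(own_rating, r) for r in opponent_ratings)


# Hypothetical event: our rating and three opponents.
me = 2350
for opponent in (2200, 2340, 2480):
    print(opponent, round(expected_score(me, opponent), 2), game_plan(me, opponent))
print("expected total:", round(forecast(me, (2200, 2340, 2480)), 2))
```

The forecast total can then be compared with the Norm requirement to see how many games need to "buck the trend".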

From above we now have a basic list of likely targets but this is almost certainly going to be inadequate for us to reach our goal. The list above is based on the statistical outcome of the rating formula, which is generally correct over a large number of games, but we now need to look at the games that buck the trend.

At this point we now start to look closely at our opponents. A very basic method is as follows:

What is happening to their ratings over the last 5-6 rating lists: Up, Down, Flat or Choppy? Ideally our targets are on the way down, but choppy is also of interest!
How many games are they playing? (Whilst we cannot see this directly, the number of games finished per rating period is a good proxy.) If this number is too high it increases the potential for errors and gives us an insight into their approach.
What is their Win/Loss/Draw ratio over the last few rating periods? Some players (especially those with flat or choppy ratings) tend to fall into two groups: those who Win/Lose and those who draw. For our purposes, the Win/Lose player tends to make a better target for a result.
What Title do they have? I believe there is a sell by date on titles (those from before 2012 are from the pre-engine age and probably less of a threat than those achieved more recently). On the flip side, do remember that those same players do tend to have well rounded positional chess skills.
How active are they? What have their recent results been like? For example, if someone is a SIM but their recent performances have been at IM level or lower, this might indicate possibilities.
What other information can be gathered? Age, FIDE Rating, published ideas, Team participation. All potential little nuggets.
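The first check in the list, the rating trend, is easy to automate. The Python sketch below is one possible way to label a run of ratings; the 10-point band that separates a real move from noise is an arbitrary assumption, and a slow drift made up entirely of small steps would be labelled flat. The example ratings are hypothetical.

```python
def rating_trend(ratings, band=10):
    """Label a series of ratings as 'up', 'down', 'flat' or 'choppy'.

    A step only counts as a rise or fall if it exceeds the band (assumed value).
    """
    steps = [later - earlier for earlier, later in zip(ratings, ratings[1:])]
    rises = sum(1 for step in steps if step > band)
    falls = sum(1 for step in steps if step < -band)
    if rises and falls:
        return "choppy"
    if rises:
        return "up"
    if falls:
        return "down"
    return "flat"


# Hypothetical opponents, six rating lists each.
print(rating_trend([2400, 2380, 2350, 2345, 2320, 2300]))  # down
print(rating_trend([2300, 2330, 2290, 2320, 2285, 2310]))  # choppy
```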

All of this, when added to the statistical numbers, should now allow us to identify our target games and then move on to Opening Preparation.

Opening Preparation

Our first consideration is Opening strategy – Do we look to gain results via safe or challenging openings? In its simplest form, do we meet 1.e4 with a Sicilian or go 1...e5 and head to the Berlin Wall? This should be influenced by our opponent’s likely choices.

Again, an in-depth article is necessary here but in basic steps:

Obtain a large database of our opponent’s games
Examine it for general considerations – are there lines we don’t like to play? Or do they play the same lines as we do?
Examine their wins and losses in detail – Do they tend to use novelties early or late, or look to win in the endgame?
Do they seem to understand the opening themes of the lines they play (do you understand them?) If not, then early deviations or move-order shenanigans may be possible
Chessbase’s Prepare for and Dossier tools can be very useful here.
We now can build lines to be able to set our opponent new challenges.
At this point, we are able to start playing and if we have done our homework there should be very few surprises (for us!)

As we are now playing a few other considerations come to mind:

Don’t play into drawish lines early on (unless that’s your objective) – you don’t want to draw early with someone who then loses a number of other games – better to work harder in the game so the potential for a result remains longer.
Take notice of your opponent’s time handling method – if they play very quickly this may indicate insufficient analysis (although it could indicate prepared lines!) or do they run their clock down (at which point an unusual move could cause panic!)
Take your time – ~~double~~ Triple check your analysis.
Be patient, especially in favourable or winning positions – the result will come in the end and if they are delaying excessively there are approaches to deal with this!

Good Luck and Good Norm Hunting!

Lifting the Hood

Russell Sherwood Sunday, April 16, 2017

Taking a look at Chess Engine Settings

Most players will use Engines with their default settings, yet for Correspondence Chess this could be a major mistake as most engines are tuned for fast time control Engine v Engine games, not Analysis or Correspondence play.

Most Engines are very poorly documented (Komodo being a notable exception!) so it can be hard without extensive research to determine the effects of various parameters, options and settings, hence this brief (and superficial) review. In general I won’t suggest the best settings as these can be very hardware dependent.

Before we move onto some of the common adjustable settings it's worth thinking about the Engine itself. Chess Engines have two main working parts – Search and Evaluation:

Search is the method by which the engine decides which moves to evaluate. It has to do this because the number of possible moves is beyond astronomical and the number of moves it can evaluate, even with the best possible hardware is finite.

A very simple example – if we look at 20 (half) moves deep we are looking in the region of 10,000,000,000,000,000,000,000,000 possible positions – yet a typical engine on fairly modest hardware will reach this depth in a few seconds – during which time it will have “only” looked at a few million positions – clearly a tiny fraction of a fraction of a percent of the possible positions.
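To put numbers on that "fraction of a fraction", here is a short Python calculation. The figure of five million examined positions is an assumption, chosen as one reading of "a few million"; the total is the figure quoted above.

```python
possible_positions = 10 ** 25   # the figure quoted above for 20 half-moves
examined_positions = 5_000_000  # assumption: one reading of "a few million"

share_percent = examined_positions / possible_positions * 100
implied_branching = possible_positions ** (1 / 20)

print(f"{share_percent:.0e} % of positions examined")            # 5e-17 %
print(f"about {implied_branching:.1f} moves per position implied")  # about 17.8
```

The second line shows what the quoted total implies: an average of roughly 18 candidate moves at each of the 20 half-moves.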

This means that search techniques have evolved that aim to reduce the number of positions looked at, but the problem from our perspective is that sometimes the “Baby goes out with the bathwater”. This has been recognised and so a number of options exist which we can adjust to make the engine stronger for Analysis even though it will be weaker in rapid gameplay.

Evaluation

This is the method by which the positions are evaluated – Options here tend to alter the relative impact of certain parts of the evaluation – e.g. Structure vs Initiative.

Onto the options – not all are available in every engine and even then they might have a slightly different implementation.

LMR (Late Move Reduction)

This is a technique where the effort spent analysing each move depends on how it was ranked beforehand – moves ranked as more promising are searched to full depth, while moves ranked later are searched to a reduced depth. This can be observed if you watch an engine – it seems to spend a fair bit of time on the first few moves then rush through the later ones. If switched off, the later moves get more attention. The benefit of this is that moves where the real impact is “down the line” can be picked up; the downside is that the engine analyses much more slowly.
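To make the trade-off concrete, here is a much simplified sketch of Late Move Reduction in Python, run on a hand-built toy tree rather than real chess positions. It is not how any particular engine implements it: real engines add null-window searches and many extra conditions, and the two constants are assumed values chosen for the demonstration.

```python
INF = float("inf")
FULL_DEPTH_MOVES = 2   # assumed: the first two moves are never reduced
MIN_DEPTH = 3          # assumed: only reduce when at least this much depth remains


def search(node, depth, alpha, beta, use_lmr=True):
    """Negamax over a toy tree of (static_eval, children) tuples.

    static_eval is from the point of view of the side to move at that node;
    children are assumed to be ordered from most to least promising.
    """
    static_eval, children = node
    if depth <= 0 or not children:
        return static_eval
    best = -INF
    for index, child in enumerate(children):
        late = use_lmr and index >= FULL_DEPTH_MOVES and depth >= MIN_DEPTH
        reduction = 1 if late else 0
        score = -search(child, depth - 1 - reduction, -beta, -alpha, use_lmr)
        if reduction and score > alpha:
            # The late move looked better than expected: re-search at full depth.
            score = -search(child, depth - 1, -beta, -alpha, use_lmr)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break
    return best


quiet_a = (0, [])
quiet_b = (1, [])
# Third-ranked move: its point only shows up two half-moves later.
deep_point = (2, [(-1, [(-9, [])])])
root = (0, [quiet_a, quiet_b, deep_point])

print(search(root, 3, -INF, INF, use_lmr=False))  # 9
print(search(root, 3, -INF, INF, use_lmr=True))   # 0
```

With the reduction switched off the search finds the third move's deep point (score 9). With it switched on, that move is searched one half-move shallower, looks unpromising, and is passed over (score 0).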

Null move Pruning

This has a similar impact to LMR but is based on the concept that, in a given position, doing something is almost always better than doing nothing (a null move). If switched off, a much slower but more thorough search results, which can be useful for finding deep threats and zugzwang.

Reduction

This determines how deeply the engine searches along some lines. When used in Komodo, higher values will encourage a narrower, deeper search but with the consequence that it may miss a move!

Threads

If you are using a single engine this should be set to either the number of processors you have or this minus one if you intend to do other light work whilst analysing. There is an alternative view that this should be the number of physical processors your pc has but this is beyond this discussion.

Hashtable Size

This is the working memory your processors will use. General rule of thumb is that this should be around half the total RAM on your machine.
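The two rules of thumb above (Threads and Hashtable Size) can be turned into the UCI commands a GUI sends to the engine. The Python sketch below only builds the command strings; `Threads` and `Hash` are the option names used by Stockfish and Komodo, but other engines may name them differently, and each engine has its own upper limit for Hash, so treat the output as a starting point for testing rather than a recommendation.

```python
import os


def uci_setup(total_ram_mb, logical_cpus=None, reserve_one=True):
    """Build 'setoption' commands from the rules of thumb in the text.

    total_ram_mb must be supplied by the user; the CPU count is detected
    if not given. Option names are assumed to be 'Threads' and 'Hash'.
    """
    cpus = logical_cpus or os.cpu_count() or 1
    threads = max(1, cpus - 1) if reserve_one else cpus
    hash_mb = max(1, total_ram_mb // 2)  # around half of total RAM
    return [
        f"setoption name Threads value {threads}",
        f"setoption name Hash value {hash_mb}",
    ]


# Example: a machine with 8 logical processors and 16 GB of RAM.
for command in uci_setup(16384, logical_cpus=8):
    print(command)
```

For the example machine this prints `setoption name Threads value 7` and `setoption name Hash value 8192`.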

Smart CPU Usage

This determines the default number of Cores the engine starts with. Should be on.

Contempt

Should be set to zero. This is a setting for engine v engine play used to improve results against inferior opponents.

Syzygy Probe Depth

This determines the search depth at which the engine will probe the tablebases. Generally left at default value.

Syzygy Probe Limit

This is the maximum number of pieces that may remain on the board for the engine to access any available Tablebases; with more pieces than this, no probing takes place. It should be set to match the size of tablebases you have installed.
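If you are unsure what size of tablebases you have, the file names tell you: Syzygy files are named after the material, for example KQvK.rtbw or KRPvKR.rtbz. The Python sketch below counts the piece letters in each name and reports the largest. It assumes that naming pattern, it cannot tell whether a set is complete, and the `SyzygyProbeLimit` option name in the final comment is the one Stockfish uses, so check your own engine's option list.

```python
def pieces_in_tablebase(filename):
    """Count the pieces named in a Syzygy file name such as 'KRPvKR.rtbw'."""
    stem = filename.rsplit(".", 1)[0]
    return sum(1 for letter in stem if letter in "KQRBNP")


def probe_limit(filenames):
    """Largest piece count among the Syzygy files found (0 if none)."""
    counts = [
        pieces_in_tablebase(name)
        for name in filenames
        if name.endswith((".rtbw", ".rtbz"))
    ]
    return max(counts, default=0)


# Hypothetical directory listing.
files = ["KQvK.rtbw", "KRPvKR.rtbw", "KRPvKR.rtbz", "readme.txt"]
print(probe_limit(files))  # 5
# A GUI could then send, for example: setoption name SyzygyProbeLimit value 5
```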

Large Pages

Where available this allows the memory allocation to be “together” generating a slight speed increase.

Tactical Mode

When utilised, much more emphasis is given to Tactics in both the Search and Evaluation. This is the equivalent of looking for a knock out punch. Of course, this is at the cost of positional concerns. This mode is very useful in tactically charged positions where the feeling is that “Something is on”!

Razoring

Is a method used to reduce the search based upon the assumption the opponent will be able to find at least one good move. Generally best left in its default state but if a wider but shallower search is required then experimentation is required.

Futility Pruning

Is another method used to prune the search tree – again best left at default.

Split Depth

This determines the depth at which the analysis will be split between multiple Cores.

Load Hash/Save Hash

A very useful tool where available for CC players. Here the Hashtable can be saved and then reloaded at a later date so that analysis can continue from the same or very near point.

Table Memory

This is a Komodo internal setting utilised for various helper tables. The default of 128 is for fast games and for CC the player should experiment with higher values up to a maximum of 1024 as this is hardware dependent.

King Safety

This is the relative weight given in the evaluation to King Safety. Set too high, it can cause play to become stodgy and defensive; set too low, it can cause the engine to become rather more carefree in its attacks, potentially leaving itself open to counter-attacks.

Selectivity

Another Komodo setting, which controls how aggressively the engine prunes the tree. Higher values mean Komodo will search deeper but is more likely to miss something; lower values lead to a much shallower search which is less likely to miss a move.

Dynamism

Controls the relative weighting of Dynamic and Static parts of the evaluation. An interesting comment in the Komodo Readme file is that values of 80 will give more accurate evaluations than the default of 100 but the engine will be around 20-30 Elo weaker.

Progress Threshold

This parameter determines how quickly the engine starts to look at the 50 move rule.

Variety

Is a setting for engine v engine and determines how likely the engine will make a different move in the same position.

Extend Checks

This is an option seen in some Stockfish variants. It is related to extending the search in certain positions. Generally best left at the default setting.

This brings us to the end of this review – for the aspiring player my suggestion is to experiment with settings to see which gives the optimal results!

Infinite Variety

Russell Sherwood Wednesday, April 5, 2017

I've had an interesting discussion recently about Engine based Analysis techniques. Now to the uninitiated there is only one method - plug in the position, run IA (Infinite Analysis) for a rather long time and then mindlessly play the move. Pish!

In a fairly short brainstorm we came up with over 20 different methods that can be used (a partial list is below) - you would need to know more detail to be able to use these but the point is.....think with creativity about how to perform analysis!

(i) Infinite Analysis
(ii) Multi PV Analysis
(iii) Selective Analysis
(iv) Forward Sliding
(v) BackSliding
(vi) Breadcrumbs
(vii) Deep Analysis (Fritz)
(viii) Deep Analysis (CB)
(ix) IDEA
(x) Monte Carlo
(xi) Shootout
(xii) Let's Check
(xiii) Comparative Analysis
(xiv) Brabo Methods
(xv) Paired Deepening
(xvi) Saved Tables
(xvii) 2nd Opinion
(xviii) My Move - Your Move
(xix) Patched Super Deepening
(xx) Remote Engines
(xxi) Subtractive Analysis

A Peek behind the Curtain!

Russell Sherwood Wednesday, March 8, 2017

Often players struggle to understand the evaluation that a Chess Engine gives a position.

Now it's possible to see the breakdown of the Stockfish Evaluation function.

https://hxim.github.io/Stockfish-Evaluation-Guide/

Please note two things

This is not a replacement for an Engine!
The position that is being evaluated is the one at the end of the line, not the start (or current) position

Enjoy!

Blast from the Past

Russell Sherwood Tuesday, March 7, 2017

Cleaning up some old links.....

came across the Congress Video from Cardiff 2015!

Enjoy!

It's all just a matter of time!

Russell Sherwood Monday, March 6, 2017

Every so often I get complaints from a player that their opponent is playing incredibly slowly.

To be able to review this objectively I created the attached spreadsheet, into which the time stamps from the game can be copied, as can the player's holidays.

(This information can be found under Board => Show details in the game itself)

Once the data is reviewed a more detailed picture can be formed of the game.
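The core calculation behind such a review is simple enough to sketch in Python: the time between a move arriving and the reply being sent, less any holiday taken in between. This is only an illustration of the idea, not the contents of the spreadsheet and not the official ICCF clock calculation; the dates are hypothetical and the holiday periods are assumed not to overlap each other.

```python
from datetime import datetime, timedelta


def overlap(start, end, other_start, other_end):
    """Length of time shared by two intervals (zero if they do not meet)."""
    shared = min(end, other_end) - max(start, other_start)
    return max(shared, timedelta(0))


def reflection_time(received, replied, holidays=()):
    """Time taken over one move, ignoring any holiday periods.

    Assumes the holiday periods do not overlap each other.
    """
    used = replied - received
    for holiday_start, holiday_end in holidays:
        used -= overlap(received, replied, holiday_start, holiday_end)
    return used


# Hypothetical time stamps and holiday, not taken from a real game.
received = datetime(2017, 1, 1, 12, 0)
replied = datetime(2017, 1, 11, 12, 0)
holidays = [(datetime(2017, 1, 3), datetime(2017, 1, 6))]

print(reflection_time(received, replied, holidays).days)  # 7
```

Repeating this for every move gives a per-move picture of where the time in the game actually went.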

It should be noted that slow play, in itself, does not break any ICCF rule, other factors need to be evident for a claim to be made.

An interested player should look in the TD Manual - Server found on ICCF.com to review the guidelines before looking to make any claims!

Download

What's in a Norm?

Russell Sherwood Monday, March 6, 2017

As the new CCE/CCM titles came into play I did some research into Norm requirements. This culminated in the attached tables.

The purpose of the tables is twofold:

As an organiser it aids in the selection of the optimum category for the purpose of Norm generation (e.g. for CC, Cat I is far better than Cat H)
As a player it aids in selecting events most likely to give a Norm (The Green zone!)

It is worth adding that Norm Categories are changing. In simple terms the Category of an event used to be determined by the average rating of all the players, so if the average was 2332 it would be a Category 4 event with the same Scoring requirements for all players.

This system is now changing. The Category of the event is still calculated as above but this is only for communication purposes. The category for the player is based on the average rating of the opposition. Overall this does not make much of a difference, but if we take an extreme example:

12 Players rated 2500 and 1 player of 2100 make up an event. The Category of the event would be 9 with an average rating of 2469. However, for the 2100 rated player the average rating of his/her opposition would be 2500 or Category 10 - which has Norm requirements generally half a point less. For the 2500 rated players the requirements would not change.
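The example can be reproduced with a few lines of Python. The sketch assumes numbered categories run in 25-point bands starting just above 2250, which fits the three figures quoted in this article (2332 is Category 4, 2469 is Category 9, 2500 is Category 10). The lettered categories below that, and any official rounding of averages, are not modelled.

```python
import math


def category(average_rating):
    """Numbered category for an average rating (assumed 25-point bands above 2250)."""
    if average_rating <= 2250:
        return None  # lettered categories are not modelled here
    return math.ceil((average_rating - 2250) / 25)


def player_category(ratings, index):
    """Category seen by one player: based on the average of the opposition only."""
    opposition = ratings[:index] + ratings[index + 1:]
    return category(sum(opposition) / len(opposition))


field = [2500] * 12 + [2100]

print(category(sum(field) / len(field)))  # 9  (event average of about 2469)
print(player_category(field, 12))         # 10 (the 2100 rated player)
print(player_category(field, 0))          # 9  (any 2500 rated player)
```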

I hope this helps in the selection of Tournaments!

Download

So you want to be a Correspondence Chess Expert?

Russell Sherwood Monday, March 6, 2017

I have had a few players ask “How do I become a Correspondence Chess Master?” To answer this fully would be a book in itself but a condensed answer is given here.

Here we will focus on the ICCF Correspondence Chess Expert Title. The Blueprint for higher titles is similar but with the addition of a few additional steps.

Starting at the end: To achieve the Title we need 24 games with the necessary performance level in International Title Tournament events. Typically this will come from either 2 or 3 events.

To be able to access these events a rating of between 2000 and 2100 is generally required, as an absolute minimum. In addition you need an invitation to one of these events, which are handled via your National Federation. This can be difficult as places are few.

A few other options do exist:

Champions League. This is a very strong team event and is one of the few where Norms are possible in the lowest tier of the event
International Opens. Norms tend to become available in the 2nd round, so you will have to battle through the first to get there.
Regional Tournaments – events such as the British Championships are starting to offer Norm Opportunities at the lower levels

So to achieve those Norms and/or the rating necessary to gain access to them we need to win games! In the past a different tactic was utilised by some – aiming to draw out games with higher rated opposition. Recently changes have been made to the rating mechanism to make this a far less effective method!

So how do we push up our rating and generate those wins?

You will need at least one Chess Engine, even if it is only to blunder check your moves. This is possible regardless of the platform (Phone, Tablet, Laptop/PC). The effective use of engines is a massive topic but as a minimum the use of two engines is recommended. Stockfish generally should be one of these.
A Database program is required to record your games and the analysis and ideas you have. This is very platform dependent, with options ranging from Chessbase to open source options such as SCID.
Access to a large source of games for preparation. The ICCF database can be downloaded and makes an excellent basic source.
An opening book. This is a contentious area in terms of effectiveness but at the very least it can be used to prevent you repeating other people's mistakes. Some Opening books work within database programmes. Free online options such as https://www.365chess.com/opening.php are also available.
Time – a mistaken belief held by many rookie CC players (and a bias from some misguided OTB players) is that it is simply a case of putting the engine into Infinite analysis and then entering its move. The reality is that this method is not used by the majority of stronger CC players. More effective analysis methods exist and these take a lot of player time to put into use. Some of these are described or linked in my Resources for CC article.

So this is a very loose indication of what is necessary to achieve the title. Putting in place the set up described above and working through the “Resources for CC” will put you on the path to success!

Optimising your Engine Set Up for Newbies

Russell Sherwood Monday, February 27, 2017

We would all love to have a Monster PC to support our analysis efforts but the reality is that the vast majority of players have relatively modest hardware to work with.

So a few areas to consider:

Have you shut down all other applications when analysing? If you have other programmes running this can reduce your nps by 50%
Is the RAM the maximum your machine can utilise? This is a very cheap way to upgrade performance and simple to install (If you are not confident in doing this local computer shops can do this in a few minutes)
Is the Hash setting on your engine correct? Often engines can have different optimal settings but as a rule of thumb it's worth setting to half your overall RAM. Some testing is sensible to see which gives the best results.
If you are using Tablebases - do you have an SSD? This can make 20-30 difference in speed.
How many Cores is your engine set to? Exact settings are a long article in itself but if you want to max out then either all your cores or all minus one gives maximum power. Of course this is not necessarily the best way to do things! Do remember that the more cores you utilise the more stress you are putting on the processor - a bit like I can drive a Mini at 100mph all the time but it will wear out quicker.
If using Chessbase consider if Smart CPU should be checked. This can make a significant difference to processor utilisation.

Anyway that's a few ideas for now - more advanced ideas another time!