A standard Elo rating system computes the expected winning probability by Se = 1 / (1 + exp((r2 - r1) / a)), where a is the constant 400 / ln(10).
For a 100 Elo difference, this predicts a winning probability of about 64% for the higher-rated player.
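As a quick check of that figure, here is a minimal Python sketch of the standard Elo expectation. The names `ELO_SCALE` and `elo_expected` are only illustrative and do not come from any rating library.

```python
import math

# The constant a from the formula above (about 173.7).
ELO_SCALE = 400 / math.log(10)

def elo_expected(r1: float, r2: float) -> float:
    """Expected score of the player rated r1 against the player rated r2."""
    return 1 / (1 + math.exp((r2 - r1) / ELO_SCALE))

# A player rated 100 Elo above the opponent.
print(round(elo_expected(2100, 2000), 3))  # 0.64
```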
The EGD rating system uses a similar calculation: Se = 1 / (1 + exp((r2 - r1) / a(min(r2, r1)))), where a is not a constant but a function of the (GoR) rating: a = (4100 - rating) / 20. On this scale, a 100 GoR difference should equal 1 grade and, equivalently, 1 full handicap stone.
To fit the expected winning probabilities more closely to the observed winning probabilities, a was modified to a = (3300 - r) / 7 in the revised system.
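The sketch below shows how the two definitions of a differ across the rating range, and how the original EGD expectation uses a evaluated at the lower of the two ratings. The function names are hypothetical. Passing the revised a into the same min-based formula is done here only for comparison, not as a description of how the revised system computes Se.

```python
import math

def a_egd(rating: float) -> float:
    """Scale parameter a of the original EGD system."""
    return (4100 - rating) / 20

def a_revised(rating: float) -> float:
    """Scale parameter a of the revised system."""
    return (3300 - rating) / 7

def egd_expected(r1: float, r2: float, a_func=a_egd) -> float:
    """Expected score of the player rated r1 against the player rated r2,
    with a evaluated at the lower of the two ratings."""
    a = a_func(min(r1, r2))
    return 1 / (1 + math.exp((r2 - r1) / a))

# A smaller a means that the same rating gap gives a more lopsided expectation.
for rating in (100, 1000, 2000, 2700):
    print(rating, a_egd(rating), round(a_revised(rating), 1))
```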
Because a is not actually a constant like in a standard Elo system, the revised system uses the more fundamental Bradley-Terry formula:
Se = 1 / (1 + exp(β(r2) - β(r1)))
where β is the integral of 1 / a: β(r) = -7 * ln(3300 - r).
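A minimal Python sketch of this Bradley-Terry expectation follows; the function names are illustrative. Because exp(β(r2) - β(r1)) simplifies to ((3300 - r1) / (3300 - r2))^7, the same value can also be computed without logarithms, which the last lines use as a cross-check. Ratings are assumed to be below 3300, since the logarithm is undefined otherwise.

```python
import math

def beta(r: float) -> float:
    """Integral of 1 / a for a = (3300 - r) / 7; requires r < 3300."""
    return -7 * math.log(3300 - r)

def expected_score(r1: float, r2: float) -> float:
    """Expected score of the player rated r1 against the player rated r2."""
    return 1 / (1 + math.exp(beta(r2) - beta(r1)))

# Cross-check against the closed form 1 / (1 + ((3300 - r1) / (3300 - r2))^7).
r1, r2 = 2100, 2000
closed_form = 1 / (1 + ((3300 - r1) / (3300 - r2)) ** 7)
print(abs(expected_score(r1, r2) - closed_form) < 1e-12)  # True
```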
In the revised rating system, ratings are updated by: rating' = rating + con * (Sa - Se) + bonus,
where Sa is the actual game result (1.0 = win, 0.5 = jigo, 0.0 = loss),
con is a factor that determines rating volatility (similar to K in regular Elo rating systems): con = ((3300 - r) / 200)^1.6,
and bonus (not found in regular Elo rating systems) is a term included to counter deflation: bonus = ln(1 + exp((2300 - rating) / 80)) / 5 (determined empirically).
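Put together, a single-game update could be sketched as follows. This is only an illustration under simplifying assumptions: it covers even games only, it assumes Se is the Bradley-Terry expectation given earlier, and it ignores handicap and any other adjustments not described on this page. All function names are hypothetical.

```python
import math

def beta(r: float) -> float:
    return -7 * math.log(3300 - r)

def expected_score(r1: float, r2: float) -> float:
    """Se for the player rated r1 against the player rated r2."""
    return 1 / (1 + math.exp(beta(r2) - beta(r1)))

def con(r: float) -> float:
    """Volatility factor, comparable to K in a regular Elo system."""
    return ((3300 - r) / 200) ** 1.6

def bonus(r: float) -> float:
    """Small positive term intended to counter deflation."""
    return math.log(1 + math.exp((2300 - r) / 80)) / 5

def updated_rating(r: float, opponent: float, result: float) -> float:
    """New rating after one game; result is 1.0 (win), 0.5 (jigo) or 0.0 (loss)."""
    return r + con(r) * (result - expected_score(r, opponent)) + bonus(r)

# Example: a 2000-rated player plays a 2100-rated opponent.
print(updated_rating(2000, 2100, 1.0))  # after a win
print(updated_rating(2000, 2100, 0.0))  # after a loss
```

Both con and bonus shrink as the rating rises, so ratings of stronger players move less per game and receive a smaller anti-deflation term.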
The very top of human play (rare genius level) seems to be about 3040 GoR. The formulae above assume that this is about 3 stones handicap away from perfect play, so a perfect player would have about 3300 GoR (3 stones handicap corresponds to 250 GoR).
Pros already seem to need this much handicap against top AI, which is not perfect, but the fairly short time control in those experimental games seems to favour the AI.
It would be interesting to learn how much handicap top European players need against KataGo in serious games with a longer time control.
On the Player Rating History page you can compare the rating histories computed with this revised rating system with the original EGD rating histories.
The domain name of this site is similar to goratings.org from Rémi Coulom, but there is no connection. I did ask Rémi if he is ok with me using this domain name and he did not mind.
Rémi's WHR system seems to use a standard Elo rating scale. The β function can be used to estimate a rough conversion from the European ratings to Rémi's WHR pro ratings by Elo = -7 * ln(3300 - GoR) * 400 / ln(10) + 10500:
goratings.eu GoR | > | goratings.org Elo |
---|---|---|
2700 (1 p) | > | 2721 |
2730 (2 p) | > | 2784 |
2760 (3 p) | > | 2849 |
2790 (4 p) | > | 2919 |
2820 (5 p) | > | 2993 |
2850 (6 p) | > | 3071 |
2880 (7 p) | > | 3155 |
2910 (8 p) | > | 3245 |
2940 (9 p) | > | 3342 |
2970 (10 p) | > | 3448 |
3000 (11 p) | > | 3564 |
3030 (12 p) | > | 3692 |
3060 (13 p) | > | 3835 |
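The table rows can be reproduced with a few lines of Python. The function name `gor_to_elo` is illustrative; the formula and the 10500 offset are the ones given above.

```python
import math

def gor_to_elo(gor: float) -> float:
    """Rough conversion from GoR to an Elo scale; requires gor < 3300."""
    return -7 * math.log(3300 - gor) * 400 / math.log(10) + 10500

# One row per pro rank, from 2700 GoR in steps of 30 GoR.
for gor in range(2700, 3061, 30):
    print(gor, round(gor_to_elo(gor)))
```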
DeepMind, the creator of AlphaGo, also uses an Elo rating scale to measure strength.
The β function can be used to estimate a rough conversion from the ratings in their papers about AlphaGo by GoR = 3300 - exp(((10500 - Elo) / 7) * ln(10) / 400) (note that DeepMind extrapolates 230 Elo points per rank downward from 7d, which is incorrect IMO):
Player | goratings.eu GoR | < | DeepMind Elo |
---|---|---|---|
Fan Hui | 2770 (3p) | < | 2872 |
AlphaGo Fan | 2877 (7p) | < | 3146 |
AlphaGo Lee | 3040 (12p) | < | 3738 |
AlphaGo Master | 3196 (18p) | < | 4852 |
AlphaGo Zero | 3221 (18p) | < | 5187 |
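The reverse conversion is the algebraic inverse of the GoR to Elo formula. A small sketch with illustrative function names, including a round-trip check:

```python
import math

def gor_to_elo(gor: float) -> float:
    """Rough conversion from GoR to an Elo scale; requires gor < 3300."""
    return -7 * math.log(3300 - gor) * 400 / math.log(10) + 10500

def elo_to_gor(elo: float) -> float:
    """Inverse of gor_to_elo; the result is always below 3300."""
    return 3300 - math.exp(((10500 - elo) / 7) * math.log(10) / 400)

# Elo values taken from the table above.
for name, elo in [("Fan Hui", 2872), ("AlphaGo Lee", 3738), ("AlphaGo Zero", 5187)]:
    print(name, round(elo_to_gor(elo)))

# Converting to Elo and back should return the original GoR.
print(abs(elo_to_gor(gor_to_elo(2900)) - 2900) < 1e-6)  # True
```

Because of the exponential, very large Elo values are compressed towards the 3300 GoR ceiling, which is why AlphaGo Master and AlphaGo Zero end up only 25 GoR apart.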