Warzone

Posts 1 - 6 of 6

Rating deviation: 3/13/2011 11:08:29

TeddyFSB

Level 60

I was curious how accurate my rating was after I played a bunch of games, so I did a little research, and it appears that the standard deviation for Elo ratings is

STD = 400/sqrt(Ngames)

In my case, I've played 29 games, and my rating as of this post is 1851. STD = 400/sqrt(29) ≈ 74.

A 95% confidence interval is roughly +/- 2 STD, or about +/- 150, so my rating is somewhere between 1700 and 2000. More data is needed!

P.S. I am not totally sure about 400/sqrt(N). If anyone knows better, please let the forum know.
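The arithmetic in this post can be reproduced with a short Python sketch. It takes the 400/sqrt(N) rule at face value (the poster is not sure of it) and assumes a normal approximation with z = 1.96 for the 95% level. The function names `elo_std_heuristic` and `approx_interval` are made up for illustration and are not part of any rating tool.

```python
import math


def elo_std_heuristic(n_games: int, scale: float = 400.0) -> float:
    """Rule-of-thumb standard deviation from the post: scale / sqrt(N)."""
    if n_games <= 0:
        raise ValueError("n_games must be positive")
    return scale / math.sqrt(n_games)


def approx_interval(rating: float, n_games: int, z: float = 1.96):
    """Symmetric interval rating +/- z * STD (z = 1.96 is an assumed 95% level)."""
    half_width = z * elo_std_heuristic(n_games)
    return rating - half_width, rating + half_width


# Numbers quoted in the post: 29 games, rating 1851.
std = elo_std_heuristic(29)
low, high = approx_interval(1851, 29)
print(f"STD ~ {std:.0f}, 95% interval ~ {low:.0f} to {high:.0f}")
```

With these inputs the interval comes out at roughly 1705 to 1997, which rounds to the 1700 to 2000 range quoted above.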

Rating deviation: 3/13/2011 12:22:07

Math Wolf

Level 64

Since this is BayesElo and not standard Elo, this is not correct, as far as I understand it.
1. The ratings are defined in a Bayesian framework.
2. They use the data of the games of everybody else too to calculate your rating.

Standard deviations and confidence intervals are frequentist methods and can't be applied directly in a Bayesian framework. Similar methods exist (credible intervals, for example), but I'm not a Bayesian statistician myself, so I'm not a specialist on this matter.

Rating deviation: 3/13/2011 16:16:16

Perrin3088

Level 49

http://blog.warlight.net/index.php/2011/02/running-your-own-ladder-simulations/

Rating deviation: 3/13/2011 16:30:59

crafty35a

Level 3

The Bayeselo program actually gives estimated confidence interval numbers when it runs the ratings. It provides columns labelled + and -. Unfortunately, the BayesElo log link provided by Randy (http://warlight.net/Data/BayeseloLog.txt) no longer seems to contain any games, so I can't check what your range is. I'm guessing that the link changed when the 2v2 ladder was added.

Rating deviation: 3/13/2011 20:10:25

Fizzer

Level 64
Warzone Creator

The bayeselo log has been fixed. Please refer back to the blog post for the new links.

Rating deviation: 3/13/2011 20:58:16

TeddyFSB

Level 60

So yeah, the output contains +/- numbers. For me, they are +141/-125. Assuming these are the bounds of a 95% confidence interval, that's consistent with what I wrote (+/- 150). The interval is a little tighter, which I think is usually the case for a Bayesian approach compared with a frequentist one.
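To make the comparison in this post concrete, here is a small Python sketch using only the numbers quoted in the thread. It treats the + and - columns as offsets from the rating of 1851 and, as the poster does, assumes they describe a 95% interval; neither point is confirmed in the thread. The variable names are illustrative only.

```python
# Numbers quoted in the thread.
rating = 1851
plus, minus = 141, 125        # Bayeselo "+" and "-" columns for this player
heuristic_half_width = 150    # the +/- 150 from the 400/sqrt(N) estimate

# Bayeselo interval, assuming "+" and "-" are offsets from the rating.
bayeselo_low = rating - minus
bayeselo_high = rating + plus
bayeselo_width = plus + minus

# Symmetric interval from the rule of thumb.
heuristic_low = rating - heuristic_half_width
heuristic_high = rating + heuristic_half_width
heuristic_width = 2 * heuristic_half_width

print(f"Bayeselo:  {bayeselo_low} to {bayeselo_high} (width {bayeselo_width})")
print(f"Heuristic: {heuristic_low} to {heuristic_high} (width {heuristic_width})")
```

The Bayeselo range (1726 to 1992, width 266) sits inside the heuristic range (1701 to 2001, width 300) and is asymmetric, with more room above the rating than below.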
