Discussion is locked - replying not allowed

Posts 1 - 15 of 15   
Steep Ladder / Game Creation Algorithm / Ratings: 2/27/2011 19:52:11


chas 
Level 43
My statistical mind loves this ratings system. The thing I'm finding that I don't like is how hard it has been to overcome some bad early losses to players that aren't very highly rated now. It's only a week into this, but I cringed when I went back and looked at how I played in my "bad losses". I know I can hang with the better players, but I haven't gotten much of a chance to prove it. My rating is currently 1546 and my overall record is 15-8.

Maybe the latest update to the game creation algorithm addresses this, but you would think with a rating around 1500 that I would have played about as many players above me as below me. In reality, I have completed games against 16 players rated below me and only 7 rated higher than me.

In addition, even if I was able to play and defeat the current top 3 players (Teddy, Impaller and Poop) in my next 3 matches, I would still only climb from 30th to 20th. While time will even everything out eventually, it just seems my early "bad losses" have been very difficult to overcome. After starting 1-3, I've won 14 out of my next 19 matches.

It seems to me that the ratings system would be much more effective if everyone had a more even distribution of opponents until a player's Ladder position was better established with a larger sample of matches. I think the game creation algorithm should try to schedule games more evenly throughout the Ladder for the first 20-30 matches a player plays. Then, once they settle into a ranking, it might make more sense to try to schedule more matches against similarly ranked opponents, while still allowing people a fair chance to climb the ladder.
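A rough sketch of what that two-phase idea could look like. This is only an illustration of the suggestion above, not the ladder's actual game creation algorithm; `opponent_pool`, the 25-game provisional cutoff, and the 10-rank window are all made-up names and numbers.

```python
def opponent_pool(games_played, rank, ladder_ranks, provisional_games=25, window=10):
    """Return the ladder ranks this player could be matched against.

    Hypothetical rule: during the first `provisional_games` matches the pool
    is the whole ladder; afterwards it shrinks to ranks within `window`
    places of the player's own rank.
    """
    others = [r for r in ladder_ranks if r != rank]
    if games_played < provisional_games:
        # Provisional phase: anyone on the ladder is a valid opponent.
        return others
    # Established phase: only nearby ranks are valid opponents.
    return [r for r in others if abs(r - rank) <= window]
```

On a 60-player ladder, a player at rank 30 with 5 games played would draw from all 59 other ranks, while the same player with 40 games played would only draw from ranks 20-29 and 31-40.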

MathWolf's suggestion in the "Skewed ratings results" thread ([Link text](http://warlight.net/Forum/Thread.aspx?ThreadID=1097&Offset=31)) to reweight the results as a continuously decreasing function of time would be a **huge** benefit. I think there will be a lot of players who will make dramatic improvements in their play. In my mind, weighting results equally for the past 3 months will prove to be too long of a tail to accurately reflect a player's current skill. The Whole-History Rating that Crafty found seems like the ideal solution!
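To make "a continuously decreasing function of time" concrete, here is a minimal sketch using exponential decay. It is not MathWolf's exact proposal and it is not Whole-History Rating; the function names and the 30-day half-life are assumptions picked purely for illustration.

```python
def decay_weight(age_days, half_life_days=30.0):
    """Weight of a game that finished `age_days` ago (1.0 for a game finished today).

    Assumed form: the weight halves every `half_life_days` days.
    """
    return 0.5 ** (age_days / half_life_days)


def weighted_win_rate(games, half_life_days=30.0):
    """Decay-weighted win fraction.

    `games` is an iterable of (age_days, won) pairs, where `won` is a bool.
    Returns 0.0 when there are no games.
    """
    games = list(games)
    total = sum(decay_weight(age, half_life_days) for age, _ in games)
    wins = sum(decay_weight(age, half_life_days) for age, won in games if won)
    return wins / total if total else 0.0
```

With a 30-day half-life, a game from 90 days ago counts for 0.125 of a game finished today, so an early bad loss fades instead of weighing the same as yesterday's result for three months.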
Steep Ladder / Game Creation Algorithm / Ratings: 2/27/2011 20:06:12


Doushibag 
Level 17
I agree that a game from 3 days ago should matter a lot more than one from nearly 3 months ago. That makes a whole lot more sense for a game like Warlight, and it's one of the things I don't like about the current rating system. We need a rating system that can appropriately handle evolving player skill. If it could also handle the fact that skill doesn't always determine who wins, that would be nice too, i.e., sometimes as a more or equally skilled player you can still lose just because of bad luck (unlike in some other games, where luck is little to no factor in the ultimate outcome).
Steep Ladder / Game Creation Algorithm / Ratings: 2/27/2011 20:13:22


Ace Windu 
Level 58
I'm in pretty much the same boat. I'm not a top player, but I think I'm better than 1498. I've won 8 and lost 4, 2 of them against bombfrog very early. Most players I've played have a lower rating than me as well. I know it's VERY early to get a true ranking, but I'm wondering how long it'll take?

Also chas, you're probably looking at your one against me as a bad loss. I did get very lucky :). Hopefully we'll get another one soon and you can see if that luck evens out.
Steep Ladder / Game Creation Algorithm / Ratings: 2/27/2011 23:08:18


chas 
Level 43
Ace, I would like a rematch, but mostly because I played poorly. You had some great luck in Central America, but the Finland bad luck you had just about evened it out...

[Game link](http://warlight.net/MultiPlayer.aspx?GameID=1216882)
Steep Ladder / Game Creation Algorithm / Ratings: 2/28/2011 01:19:45


NecessaryEagle 
Level 59
Good luck getting a rematch; the current system will try and place you against every opponent you have not played before it will let you rematch someone close to you.
Steep Ladder / Game Creation Algorithm / Ratings: 2/28/2011 01:37:12

Fizzer 
Level 64

Warzone Creator
|> the current system will try and place you against every opponent you have not played before it will let you rematch someone close to you.

This is not true.

Please try to refrain from posting casual observations as fact like this. Since you presented your post as fact, anyone reading it would assume you're correct and proceed under incorrect information.

The algorithm is posted in its entirety on the Help tab. Players will start to receive duplicate matches as soon as they've played half of the players in the ladder.
Steep Ladder / Game Creation Algorithm / Ratings: 2/28/2011 02:36:04


crafty35a 
Level 3
Chas, I'm definitely with you on this one. I have put feelers out to the very few people who I have found online that have created tools to calculate Whole-History Rating (which is essentially the Bayeselo system WL uses, but designed to track players with changing strength levels). I'm really hoping that someone is willing to share their code or application.

I also found a very similar system to WHR called TrueSkill Through Time, which was developed by Microsoft. They were kind enough to publish source code ( http://blogs.technet.com/b/apg/archive/2008/04/05/trueskill-through-time.aspx ). However, the code will not compile on newer versions of F#, and I don't have access to an old version of Visual Studio/F# to compile the code with. Unfortunately they don't provide a pre-compiled executable. If anyone has the ability to compile in an older version of F#, please let me know (the last version they explicitly mention working is 1.9.6.2)!
Steep Ladder / Game Creation Algorithm / Ratings: 3/1/2011 05:02:09


chas 
Level 43
I'm getting a bunch of matches against higher rated opponents now. Maybe the algorithm is reading these posts :)
Steep Ladder / Game Creation Algorithm / Ratings: 3/1/2011 05:30:34


Elucidar 
Level 61
WL Algorithm has become self-aware.
Steep Ladder / Game Creation Algorithm / Ratings: 3/1/2011 20:07:34


Duke 
Level 5
"Players will start to receive duplicate matches as soon as they've played half of the players in the ladder."

It would've been clearer if you said only one of the players needs to have completed games against half the field -- then they get duplicate matches against anyone (who may or may not have played half the field). I just got another duplicate (vs. fatguylittle coat) and only he's played half the field.

Not sure that's a desirable outcome -- it should probably be both players. I hope you've prevented it from going to 3 games against the same player until you've cleared the field entirely.
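For what it's worth, the difference between the two readings is a single boolean. The sketch below is only my interpretation of the rule as described in this thread (the real algorithm is on the Help tab and is not reproduced here); `played_half_field`, `rematch_allowed`, and the exact "half" threshold are assumptions.

```python
def played_half_field(distinct_opponents, ladder_size):
    """True once a player has completed games against at least half the ladder.

    Assumption: "half" is measured against the total ladder size.
    """
    return distinct_opponents * 2 >= ladder_size


def rematch_allowed(a_opponents, b_opponents, ladder_size, require_both=False):
    """Whether players A and B may be given a duplicate match.

    require_both=False: one qualifying player is enough (the behavior observed above).
    require_both=True:  both players must have played half the field (the suggested change).
    """
    a_ok = played_half_field(a_opponents, ladder_size)
    b_ok = played_half_field(b_opponents, ladder_size)
    return (a_ok and b_ok) if require_both else (a_ok or b_ok)
```

On a 60-player ladder, a player with 30 distinct opponents and a player with 5 would get a duplicate under the first reading but not under the second.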
Steep Ladder / Game Creation Algorithm / Ratings: 3/1/2011 20:52:39


Perrin3088 
Level 49
I think the reason for that, Duke, is that the Elo system, and thus this system, was designed primarily on the assumption of same-game tournaments: either brackets, or round robins with equal games played for each player, *within 1*.

But this is pure speculation.
Steep Ladder / Game Creation Algorithm / Ratings: 3/2/2011 19:09:57


Ace Windu 
Level 58
On reflection, most of my recent games have been against people higher than me and I think chas is the only one I've beaten, so this rating is probably about right for me at the moment.
Steep Ladder / Game Creation Algorithm / Ratings: 3/2/2011 19:19:07


Perrin3088 
Level 49
I think I am like 13-12, and most of them I probably deserve, and I am actually quite happy with where my rating is. If I can stay around 1650-1750 I'll be pretty happy, lol
Steep Ladder / Game Creation Algorithm / Ratings: 3/5/2011 02:08:53

The Impaller 
Level 9
Should the algorithm always give you a ladder game when you're at less than the maximum number of games? Or should there be times where it waits to pair you up with someone who would be the optimal pairing for you?

Example: Player A has a rating of 2050. Player B has a rating of 2030. They are in spots 1 and 2 on the ladder but they have not played yet. The ladder should pair A and B together to get an accurate representation of who should truly be on top of the ladder between them. However, this will never happen unless they both happen to finish a ladder game in the same 2-hour block. If they don't finish games during the same 2-hour period, or worse yet, collaborate with each other to ensure that this doesn't happen, then they will never face off against each other, and you will see things like Player A being paired against a 1200 player and Player B being paired against a 1300 player, but not against each other.

I'm curious, because I am continually being paired against players who are rated 800+ points lower than I am, and I'm wondering if the pairing system wouldn't just be better served by waiting to pair me in a game until a slot opens up for me to play Fizzer or Teddy who are ahead of me on the ladder, or any of the players who are within 15% of the ladder behind me.
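To put a number on how lopsided those pairings are, here is the standard Elo expected-score curve on the usual 400-point scale. The ladder uses Bayeselo, whose exact parameters aren't stated in this thread, so treat the figures as illustrative only; `expected_score` is just a name chosen for this sketch.

```python
def expected_score(rating_a, rating_b):
    """Expected score (roughly, win probability) of A against B
    under the standard Elo logistic curve with a 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))
```

Under this curve an 800-point favorite is expected to score about 0.99, so a win tells the system almost nothing new, whereas the 2050 vs. 2030 matchup from the example is close to even at about 0.53.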
Steep Ladder / Game Creation Algorithm / Ratings: 3/5/2011 02:24:30

The Impaller 
Level 9
My personal opinion is that the ladder should wait to pair you with people who are within 15% or 20% (or whatever the arbitrary value is) of your spot on the ladder and not pair you against anyone who isn't within that range. In situations where nobody fits that criterion, the system simply won't give you a game. The next time around, the system should then give preferential treatment to the person who has been waiting the longest for an available pairing to open up.

I think a system like this is going to be the fastest way to get accurate ratings for the ladder. The current system just kind of hopes to randomly have good players pair each other from time to time and you're not going to get very meaningful results for a long time because it's possible one or two players simply avoid playing good players for a really long time due to when their games start or end.
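A minimal sketch of the pairing rule described above, under stated assumptions: the window is a fraction of the ladder size measured in rank positions, the longest-waiting player picks first, and ties go to the closest rank. `pair_players` and its parameters are hypothetical, and the sketch ignores rematch rules and per-player game limits.

```python
def pair_players(waiting, ladder_size, window=0.15):
    """Pair waiting players whose ranks are within `window` of the ladder size.

    `waiting` is a list of (name, rank, hours_waiting) tuples.
    Returns (pairs, still_waiting): a list of (name, opponent) tuples and
    the names of players left without a game this round.
    """
    max_gap = window * ladder_size
    # Longest-waiting players get first choice of opponent.
    queue = sorted(waiting, key=lambda p: -p[2])
    paired = set()
    pairs = []
    for name, rank, _ in queue:
        if name in paired:
            continue
        # Unpaired players inside the rank window, as (rank gap, name).
        candidates = [
            (abs(rank - other_rank), other)
            for other, other_rank, _ in queue
            if other != name and other not in paired
            and abs(rank - other_rank) <= max_gap
        ]
        if candidates:
            _, opponent = min(candidates)  # closest rank wins
            pairs.append((name, opponent))
            paired.update((name, opponent))
    # Anyone without an in-window opponent simply waits for the next round.
    still_waiting = [name for name, _, _ in queue if name not in paired]
    return pairs, still_waiting
```

For example, on a 100-player ladder with ranks 1, 2, and 50 waiting, ranks 1 and 2 are paired and rank 50 keeps waiting rather than being thrown in against someone far above or below.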