A new year, a new beginning.

Zac
Private First Class
Posts: 359
Joined: Fri Oct 20, 2006 11:59 pm

A new year, a new beginning.

Post by Zac »

I was perusing the 1vs1.bzflag.net website, and I thought that perhaps there was room for improvement. Not only with the website, but with the 1vs1 league itself.

Personally I would like to see the entire 1vs1 league rebooted, all previous data archived and players required to re-register in order to play.

The 1vs1 league is old. In the 10 or so years that it has been around, the playerbase has changed significantly.

"BIG LADDA" is a mess. It's clogged with aliases, cheaters, people who have never even played a match, scoreboard campers and inactive/retired players.
I just think that with so few players actually playing bz atm, it would be nice for those who still play 1vs1 to have a fresh start, and compete against actual players on a scoreboard. Not ghosts of players past.

Please post thoughts and opinions.
dang dizzy white
Private First Class
Posts: 69
Joined: Tue Oct 29, 2013 11:30 pm
Location: Flowery Path Twelve

Re: A new year, a new beginning.

Post by dang dizzy white »

You've explained it beautifully in your post, so I'm just gonna go ahead and say: Full support.
A total 1v1 League refresh would be welcome, combined with mandatory registration on the site.
kierra
Lieutenant, Junior Grade
Posts: 4107
Joined: Wed Mar 23, 2005 1:02 am
Location: outer Slovenia

Re: A new year, a new beginning.

Post by kierra »

Zac, I like your idea and think it has merit. Others have said the same over the last few years.
"Sometimes people try to expose what's wrong with you, because they can't handle what's right about you."
"Measure your words -- they determine the distance of your relationships"
"If serving is beneath you, leadership is beyond you."
strayer
Sergeant Major
Posts: 191
Joined: Sat May 24, 2003 3:54 pm
Location: Germany

Re: A new year, a new beginning.

Post by strayer »

Zac wrote:I was perusing the 1vs1.bzflag.net website, and I thought that perhaps there was room for improvement. Not only with the website, but with the 1vs1 league itself.
You are right. The website (and its design and features) is pretty old. Since I'm lazy and kinda conservative, I never saw the need to change that. But compared to other web pages it does indeed look old-fashioned. So I am open to improvements that people think are worth it.
Zac wrote:Personally I would like to see the entire 1vs1 league rebooted, all previous data archived and players required to re-register in order to play.
That's possible.
Zac wrote:The 1vs1 league is old. In the 10 or so years that it has been around, the playerbase has changed significantly.
That's correct.
Zac wrote:"BIG LADDA" is a mess. It's clogged with aliases, cheaters, people who have never even played a match, scoreboard campers and inactive/retired players.
That's correct and it will be the same mess after a reset. Just the list length will be different.
Zac wrote:I just think that with so few players actually playing bz atm, it would be nice for those who still play 1vs1 to have a fresh start, and compete against actual players on a scoreboard. Not ghosts of players past.
That's correct.

As kierra pointed out, these arguments aren't new, and neither are my concerns. Since other players (whether still active or not) support your idea: Well, why not?

You are right, the 1vs1 League is almost 10 years old (as you can check at the stats page). Rebuilding the league and its website won't be done within a week. So why not plan a restart for 2014-07-01? That will be the 10th birthday of the 1vs1 League and leaves the necessary time.

So you would have enough time to find volunteers for further ideas, planning, implementation and testing. I'll provide as much support as I can.
The only wishes (not necessarily prerequisites!) I have are:
1) readable / documented code (This is a problem right now. People other than me will have trouble with the current code. I was young and...),
2) an equal or extended feature set (see the plug-in with auto-match-reporting etc., which allows the league to work even without permanent maintenance),
3) acceptance by current players (which means fewer complaints about the reset / new website than now),
4) availability of the old match information (via a special page or so) and
5) finding a solution / replacement for the league's biggest issue (in my eyes) - the rating algorithm.
A pessimist is an optimist with experience... ;o)
Bullet Catcher
Captain
Posts: 564
Joined: Sat Dec 23, 2006 7:56 am
Location: Escondido, California

Re: A new year, a new beginning.

Post by Bullet Catcher »

Changing the scoring system to give less weight to results as they get older would give emphasis to current players without having to do an occasional reset.
Another approach is "seasons" as used in the Ducati league, where a reset is done on a regular schedule with recognition for the top scores at the end of each season.
An SQUERRILz
Private First Class
Posts: 91
Joined: Wed Apr 25, 2007 2:08 am

Re: A new year, a new beginning.

Post by An SQUERRILz »

strayer wrote:5) finding a solution / replacement for the league's biggest issue (in my eyes) - the rating algorithm.
Why is it an issue?
strayer
Sergeant Major
Posts: 191
Joined: Sat May 24, 2003 3:54 pm
Location: Germany

Re: A new year, a new beginning.

Post by strayer »

@Bullet Catcher: Good points.
I spent several days (or even weeks) just playing around with different formulas. So, your points were something we already discussed. (Unfortunately, the rather old postings on that topic have been deleted due to the old "auto-delete" settings. But a newer one with some facts still exists.)

1) The weighting of old matches can be handled by using algorithms like TrueSkill or Glicko2. If you find acceptable parameter values, let me know. I tested Glicko2 a lot and was never happy with the results. (Just using an additional factor to adjust the score weighting wouldn't solve other issues...see below.) A big disadvantage of the mentioned algorithms is the rating score itself. It never is a fixed value because of the deviation factors. I'm not sure players will accept/understand why it behaves like it does.
By the way, team based leagues like GU and Ducati would benefit much more from TrueSkill or Glicko2 because they allow mixing player and team strength values. This is especially interesting when a player changes to a different team. They also allow to find good matching player groups.

2) Seasons aren't a bad idea. But if you concentrate too much on them (like GU and Ducati), some players could act tactically and stop playing matches at the end of a period, waiting for the next one instead. That's counterproductive and could be countered by letting players carry some extra points over to the next period, for example up to 10% on top of the initial rating after a reset. Yes, this could demotivate others. Well, many players argued about that already. (There were surely some sub-points I forgot to mention.)


@jadespicy: Where to start?...

Let's take the current ZELO rating formula as a base. (Read the page to get the abbreviations.)

As you know (or can read if you follow the link above) the 1vs1 League uses a slightly modified ELO rating which takes the match result difference (D) into account (S). Furthermore, it uses a very high weighting factor (K) to get recognizable score differences. While D is based on the 10-kill limit, S is a (foul) compromise between players who don't want to blame weaker players with 10-0 results and opponents who are almost even. K, by contrast, is the result of keeping matches attractive. The original ELO rating formula uses much lower weighting factors - even when beginners are playing each other. Furthermore, it doesn't care about the real initial player strength, because all new players get a fixed score instead of a "well-guessed value" that is based on the first opponents' scores and allows higher fluctuations during the first matches and lower ones after a while. A similar idea would be the option to reduce the "anchor effect" of a score in relation to the time between matches. (That would address Bullet Catcher's first point.)
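For readers who want to experiment with the parameters discussed above, here is a minimal Python sketch of a standard Elo update with an optional margin blend. The win expectation is the textbook Elo formula. The `blended_result` function, the default `k=50` and the way `s` mixes in the kill share are assumptions made purely for illustration; they are not the league's actual ZELO formula.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Textbook Elo win expectation for player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def blended_result(kills_a: int, kills_b: int, s: float) -> float:
    """Hypothetical match result for player A (NOT the league's formula).

    s = 0 gives plain Elo (1 for a win, 0.5 for a draw, 0 for a loss).
    Larger s gives more weight to A's share of the kills.
    """
    if kills_a > kills_b:
        win = 1.0
    elif kills_a < kills_b:
        win = 0.0
    else:
        win = 0.5
    total = kills_a + kills_b
    share = kills_a / total if total > 0 else 0.5
    return (1.0 - s) * win + s * share


def update(rating_a: float, rating_b: float,
           kills_a: int, kills_b: int,
           k: float = 50.0, s: float = 0.25) -> tuple[float, float]:
    """Return the new ratings of A and B; k and s are assumed example values."""
    delta = k * (blended_result(kills_a, kills_b, s)
                 - expected_score(rating_a, rating_b))
    # The update is zero-sum: what A gains, B loses.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    print(update(1200, 1200, 10, 0))  # clear win between equal players
    print(update(1200, 1200, 10, 8))  # close win between equal players
    print(update(1700, 1000, 10, 9))  # narrow win by a heavy favourite
```

With `s=0` this reduces to plain Elo. Under this naive blend a heavy favourite who wins narrowly can end up with a result below the expectation and lose points, which is the sort of side effect any modification has to be checked for.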

Well, there are many more details to keep in mind. That's why we decided to choose the ELO rating (like the other BZFlag leagues) which has some disadvantages but is (at least) simple to understand. Just risk a deeper look into the current ZELO rating formula and play around with the parameters and modifications. That'll give you an idea of how much time you could spend with it. And finally, in case you have a good idea, you have to defend your results/decisions against others who have a different point of view and find circumstances where your modifications might have fatal side-effects.
*jokingly*
An SQUERRILz
Private First Class
Posts: 91
Joined: Wed Apr 25, 2007 2:08 am

Re: A new year, a new beginning.

Post by An SQUERRILz »

Changing the scoring system to give less weight to results as they get older
This is basically some sort of system where players' rating gain/loss decays in relation to their past history. However this would mean inactives with 900 will gain rating, whereas inactive 2000 player will lose rating.
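One way to read the behaviour described here is a decay that pulls an inactive player's rating back toward a baseline. The sketch below assumes an exponential decay with a hypothetical baseline of 1200 and a one-year half-life; neither value comes from the league, and the function name is made up for this example.

```python
def decay_toward_baseline(rating: float, days_inactive: float,
                          baseline: float = 1200.0,
                          half_life_days: float = 365.0) -> float:
    """Pull a rating toward the baseline the longer a player is inactive.

    baseline and half_life_days are assumed example values.
    """
    # Fraction of the distance to the baseline that is still left.
    remaining = 0.5 ** (days_inactive / half_life_days)
    return baseline + (rating - baseline) * remaining


if __name__ == "__main__":
    for start in (900.0, 2000.0):
        print(start, "->", decay_toward_baseline(start, 365))
```

After one assumed half-life the inactive 900 player has drifted up and the inactive 2000 player has drifted down, which is exactly the asymmetry pointed out above.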
Another approach is "seasons" as used in the Ducati league, where a reset is done on a regular schedule with recognition for the top scores at the end of each season.
1vs1 already has a monthly cup and it is surprisingly successful, with all sorts of players getting the top score.
strayer wrote: A big disadvantage of the mentioned algorithms is the rating score itself. It never is a fixed value because of the deviation factors. I'm not sure players will accept/understand why it behaves like it does.
Glicko player rating is a fixed rating when no matches happen. But players don't need to understand how a rating system works. The less they understand it the more they will realise that on some level it is just a magic number and not the basis of the game. If they want to become #1 rated like Dotcom then they're playing the wrong game.
strayer wrote:By the way, team based leagues like GU and Ducati would benefit much more from TrueSkill or Glicko2 because they allow mixing player and team strength values. This is especially interesting when a player changes to a different team. They also allow to find good matching player groups.
I've never seen anything suggesting Glicko can work with Glicko-rated players within Glicko-rated teams. Mixing any of rating, RD or volatility makes little sense.
strayer wrote:Just risk a deeper look into the current ZELO rating formula and play around with the parameters and modifications. That'll give you an idea of how much time you could spend with it.
It is very easy to understand ZELO and the parameter effects.
strayer wrote:Furthermore, it uses a very high weighting factor (K) to get recognizable score differences.
The only real justification for a high K factor in this case is possibly due to low activity. The idea that every match should get recognizable score differences is flawed. If a 1700 beats a 1000 and gets 1 point he/she should not complain or be surprised. One should play to win [or have fun], not play to gain rating.
strayer wrote:S is a (foul) compromise between players who don't want to blame weaker players with 10-0 results and opponents who are almost even.
Then why not just use normal ELO? Distinguishing a match based on closeness is dodgy. Closeness differs based on the map. Players go easy at the start, or players make a certain number of mistakes each game. That does not make the win any less meaningful. That said, S = 1/4 compared to S = 0 (ELO) does not differ dramatically.
strayer wrote:That's why we decided to choose the ELO rating (like the other BZFlag leagues) which has some disadvantages but is (at least) simple to understand. And finally, in case you have a good idea, you have to defend your results/decisions against others who have a different point of view and find circumstances where your modifications might have fatal side-effects.
No rating system is remotely near perfect.
Elo: invented to be simple enough to compute by hand, however its win expectation does reasonably match the dynamics of skill in 1vs1.
Glicko[2]: attempts to solidify Elo's lack of concept of rating convergence but also performance variation. Not designed for real-time rating updates and low match activity - in practice will cause unrealistic rating overshoots and some oscillation.

In summary I don't share your concerns as being substantial because every rating system has scenarios in which it has flawed behavior.
strayer
Sergeant Major
Posts: 191
Joined: Sat May 24, 2003 3:54 pm
Location: Germany

Re: A new year, a new beginning.

Post by strayer »

@jadespicy
Too bad. I don't seem to understand you and/or vice versa. So, a last try because vacation ends today...
jadespicy wrote:
Changing the scoring system to give less weight to results as they get older
This is basically some sort of system where players' rating gain/loss decays in relation to their past history. However this would mean inactives with 900 will gain rating, whereas inactive 2000 player will lose rating.
You aren't telling me anything new. This concept is well known. (If you had read the background information on TrueSkill and Glicko/Glicko2, you would know I am aware of that. Interesting that your comments below suggest you know the Glicko rating.)
jadespicy wrote:
Another approach is "seasons" as used in the Ducati league, where a reset is done on a regular schedule with recognition for the top scores at the end of each season.
I hope I didn't understand you...
1vs1 already has a monthly cup and it is surprisingly successful, with all sorts of players getting the top score.
We were talking about resetting the ZELO rating and the player lists and not the monthly trophy. As you might have noticed, the ZELO rating is an eternal one, while the monthly trophy uses a damn stupid (and easy to understand) formula to attract players. But the monthly stuff doesn't say anything about skill (which is the intention of the ZELO score...irrespective of whether it suits that or not). Don't mix those up. That is exactly the reason why I would prefer replacing the current ZELO formula with a better one (with a rising deviation based on a defined volatility over players' inactivity times) over anything like a season-like reset.
...in case I misunderstood you: did you want to say we should leave the monthly trophy as it is, without any season-like reset whatsoever?
jadespicy wrote:
strayer wrote: A big disadvantage of the mentioned algorithms is the rating score itself. It never is a fixed value because of the deviation factors. I'm not sure players will accept/understand why it behaves like it does.
Glicko player rating is a fixed rating when no matches happen. But players don't need to understand how a rating system works. The less they understand it the more they will realise that on some level it is just a magic number and not the basis of the game. If they want to become #1 rated like Dotcom then they're playing the wrong game.
"fixed rating" isn't correct. A player's "strength" is the combination of a player's last calculated score value and the (over time) increasing deviation. So we are not talking about a fixed score but about a range. Furthermore, you might check a few pinball leagues using Glicko/Glicko2 to realize that some of them use the score reduced by expected deviation. I prefer such a pessimistic approach.
However, when you write players are playing the wrong game when they focus on a ZELO score (and a rating they might not understand) then we don't need to reset it - simply removing it would be enough to keep them onboard.
jadespicy wrote:
strayer wrote:By the way, team based leagues like GU and Ducati would benefit much more from TrueSkill or Glicko2 because they allow mixing player and team strength values. This is especially interesting when a player changes to a different team. They also allow to find good matching player groups.
I've never seen anything suggesting Glicko can work with Glicko-rated players within Glicko-rated teams. Mixing any of rating, RD or volatility makes little sense.
My fault. This is something that is done with TrueSkill, not with Glicko. And using that rating to find player pairs/triples/... that seem to harmonize is just a research idea from a paper I read.
jadespicy wrote:
strayer wrote:Just risk a deeper look into the current ZELO rating formula and play around with the parameters and modifications. That'll give you an idea of how much time you could spend with it.
It is very easy to understand ZELO and the parameter effects.
Don't forget these two sentences were the conclusion of the previous paragraph. I even wrote ELO is a kinda easy to understand rating and ZELO is just a simple modification. I know my English is far away from "good" and I tend to create nested sentences. So don't pick out a word and ignore the context.
jadespicy wrote:
strayer wrote:Furthermore, it uses a very high weighting factor (K) to get recognizable score differences.
The only real justification for a high K factor in this case is possibly due to low activity. The idea that every match should get recognizable score differences is flawed. If a 1700 beats a 1000 and gets 1 point he/she should not complain or be surprised. One should play to win [or have fun], not play to gain rating.
I know and I see it the same way, but be realistic! You need ugly compromises (either a periodical reset, which perverts the idea of a strength rating, or some noticeable score change if a "1700 beats a 1000", which is kinda disproportional). Otherwise you run into the same situation that existed/exists in the 1vs1 League - that some players refuse to match much weaker players because a potential loss doesn't outweigh the win in their eyes. And if you want to ignore all success-oriented players, then we don't need any league for BZFlag. (This isn't a black-or-white view but you know that nice people can become strange inside a competition.)
jadespicy wrote:
strayer wrote:S is a (foul) compromise between players who don't want to blame weaker players with 10-0 results and opponents who are almost even.
Then why not just use normal ELO? Distinguishing a match based on closeness is dodgy. Closeness differs based on the map. Players go easy at the start, or players make a certain number of mistakes each game. That does not make the win any less meaningful. That said, S = 1/4 compared to S = 0 (ELO) does not differ dramatically.
I understand your arguments. But what about players who are almost equal but one player is always a bit better? How often do you think the slightly weaker player will match the better one? You could, instead, raise S to 80%. And even then the results wouldn't change dramatically. (For single matches "yes" but overall "no". I checked that.)
jadespicy wrote:No rating system is remotely near perfect.
Elo: invented to be simple enough to compute by hand, however its win expectation does reasonably match the dynamics of skill in 1vs1.
Glicko[2]: attempts to solidify Elo's lack of concept of rating convergence but also performance variation. Not designed for real-time rating updates and low match activity - in practice will cause unrealistic rating overshoots and some oscillation.

In summary I don't share your concerns as being substantial because every rating system has scenarios in which it has flawed behavior.
So, taking into account your final words and the arguments above: What about removing the ZELO score from the 1vs1 web pages? (Or making it much less important and eye-catching, at least.)
The ZELO score becomes senseless (as a more or less useful eternal rating) when resetting it periodically. For organizing groups for further contests it would be used but as long as the scores are invisible for the players, nobody can complain about weak points in the design of the formula or behave egoistic when a much weaker player wants to match. The current monthly trophy is easy to understand and does what it is good for.
An SQUERRILz
Private First Class
Posts: 91
Joined: Wed Apr 25, 2007 2:08 am

Re: A new year, a new beginning.

Post by An SQUERRILz »

strayer wrote:I hope I didn't understand you...We were talking about resetting the ZELO rating and the player lists and not the monthly trophy.
Bullet Catcher mentioned Seasons. A one-time ZELO reset... fair enough. But seasonal resets do not make sense in the context of rating systems. It is also impractical to "see who gets the highest zelo each season" (that is the only other reason why one would even consider season resets) because the mentioned rating systems don't work toward that end and we already have the monthly cup that is successful and has had many winners.
jadespicy wrote:This is basically some sort of system where players' rating gain/loss decays in relation to their past history. However this would mean inactives with 900 will gain rating, whereas inactive 2000 player will lose rating.
You aren't telling me anything new. This concept is well known.
Pessimistic rating or not, it is not normal for inactive low rated players to gain rating over time. BC alludes to a specific detail that is not present in that form in the known rating systems. Therefore it is more likely he is thinking of some ad-hoc decay mechanism.

FWIW, Glicko2 is going to be less stable than ZELO due to low number of matches per period. For example if everyone starts with 1000 Glicko2 rating and Zac wins 50 games to become 1800, I could just beat Zac 5 times and become 2200 and camp the scoreboard. Potentially the problem of him worrying about losing points is reduced, but the new problem of him giving me points? Then he might hope for another reset :twisted:
strayer wrote:Furthermore, you might check a few pinball leagues using Glicko/Glicko2 to realize that some of them use the score reduced by expected deviation. I prefer such a pessimistic approach.
Now your implied decay mechanism makes more sense. I have not seen it in practice, and RD has a cap, meaning that a bloated rating of 2000 will only decay by at most a fixed amount, say 400. So at 1600 after not playing for a year (say) he is still high in the leaderboard.
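The cap argument can be made concrete with the Glicko-1 rule for inactivity, where the rating deviation grows as RD' = min(sqrt(RD^2 + c^2 * t), 350). The sketch below combines that rule with a pessimistic display rating of r - m * RD. The multiplier `m`, the constant `c = 34.64` and the function names are assumptions for illustration, not values used by any league mentioned in this thread.

```python
import math

MAX_RD = 350.0  # Glicko-1 upper bound on the rating deviation


def inflate_rd(rd: float, periods_inactive: float, c: float) -> float:
    """Glicko-1 growth of the rating deviation over inactive rating periods."""
    return min(math.sqrt(rd * rd + c * c * periods_inactive), MAX_RD)


def pessimistic_rating(rating: float, rd: float, m: float = 1.0) -> float:
    """Display rating reduced by m deviations (m is an assumed parameter)."""
    return rating - m * rd


if __name__ == "__main__":
    c = 34.64  # assumed example constant
    for periods in (0, 10, 100, 1000):
        rd = inflate_rd(50.0, periods, c)
        print(periods, round(rd, 1), round(pessimistic_rating(2000.0, rd), 1))
```

With `m = 1` the display rating of an inactive 2000 player can never drop below 1650 in this sketch, however long the absence, which is the cap effect described above.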
strayer wrote: However, when you write players are playing the wrong game when they focus on a ZELO score (and a rating they might not understand) then we don't need to reset it - simply removing it would be enough to keep them onboard.
Otherwise you run into the same situation that existed/exists in the 1vs1 League - that some players refuse to match much weaker players because a potential loss doesn't outweigh the win in their eyes.
That doesn't justify a high K factor, since instead of gaining 1 for a win and losing 12 for a loss you simply scaled it up to gaining 2 for a win and losing 24 for a loss (as an example). I expect the same selective matchers in 1vs1 don't match at all anyway. It is not known that catering to their "needs" will plausibly bring them to life, as 1vs1 matches are usually arranged by word of mouth.
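The scaling argument can be checked with the standard Elo expectation. In the sketch below the rating gap is chosen so that the favourite's win expectation is 12/13 (a gap of roughly 432 points), and `k=13` and `k=26` are picked only to reproduce the 1/12 and 2/24 figures from the example; none of these numbers are the league's real parameters.

```python
import math


def expected_score(rating_gap: float) -> float:
    """Standard Elo win expectation of the stronger player for a given gap."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


def gain_and_loss(rating_gap: float, k: float) -> tuple[float, float]:
    """Points the favourite gains for a win and loses for a loss."""
    e = expected_score(rating_gap)
    return k * (1.0 - e), k * e


if __name__ == "__main__":
    gap = 400.0 * math.log10(12.0)  # expectation of exactly 12/13
    for k in (13.0, 26.0):
        gain, loss = gain_and_loss(gap, k)
        print(k, round(gain, 3), round(loss, 3), round(loss / gain, 3))
```

Doubling K doubles both numbers, so the ratio of risk to reward for the favourite stays the same regardless of K.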
jadespicy wrote:
strayer wrote:Furthermore, it uses a very high weighting factor (K) to get recognizable score differences.
The only real justification for a high K factor in this case is possibly due to low activity. The idea that every match should get recognizable score differences is flawed. If a 1700 beats a 1000 and gets 1 point he/she should not complain or be surprised. One should play to win [or have fun], not play to gain rating.
I know and I see it the same way, but be realistic! You need ugly compromises (either a periodical reset, which perverts the idea of a strength rating, or some noticeable score change if a "1700 beats a 1000", which is kinda disproportional). And if you want to ignore all success-oriented players, then we don't need any league for BZFlag. (This isn't a black-or-white view but you know that nice people can become strange inside a competition.)
jadespicy wrote:
strayer wrote:S is a (foul) compromise between players who don't want to blame weaker players with 10-0 results and opponents who are almost even.
Then why not just use normal ELO? Distinguishing a match based on closeness is dodgy. Closeness differs based on the map. Players go easy at the start, or players make a certain number of mistakes each game. That does not make the win any less meaningful. That said, S = 1/4 compared to S = 0 (ELO) does not differ dramatically.
I understand your arguments. But what about players who are almost equal but one player is always a bit better?
If you are better, you deserve to win points off the other guy. Unlike in chess, equal performance is not something you explicitly aim to measure in a 1vs1 because it can't be done realistically. I can beat someone 10-0 every time on one map or 10-6 every time on another map, doesn't mean I played any worse. You are trying to cater for a scenario that is not your concern/responsibility as a rating system designer.
For organizing groups for further contests it would be used but as long as the scores are invisible for the players, nobody can complain about weak points in the design of the formula or behave egoistic when a much weaker player wants to match.
Has anyone complained about ZELO after the remake where a winner can't lose points?

Every system has trade-offs, and to dislike certain trade-offs does not mean that the system should not be accepted. Rating systems are just a statistic for interest, but when you go into the real competition it doesn't "state your chances as if you are bounded by its prediction". However it might be considered to remove just the Zelo gain/loss report from the plugin.

TrueSkill is patented.

By the way, I am not specifically advocating for or against anything in 1vs1.
macsforme
General
Posts: 2069
Joined: Wed Mar 01, 2006 5:43 am

Re: A new year, a new beginning.

Post by macsforme »

I know this is a fairly isolated case, but I found the 1vs1 league's ZELO ratings to be useful as a way to rank players for the HiX Doubles tournament. Even though it was perhaps not designed (and not a perfect fit) for this usage, there are few other options as far as player skill ratings that cover such a large portion of the player population. I feel that it would be a loss if all of that was reset, especially if it was reset regularly, without keeping some more long-term data available.
Zac
Private First Class
Posts: 359
Joined: Fri Oct 20, 2006 11:59 pm

Re: A new year, a new beginning.

Post by Zac »

consti, I agree that as a quick point of reference, the 1vs1 site was a handy way of sorting out the rankings for your tournament, but I can see several issues with it. For starters, a 1vs1 ranking doesn't necessarily represent the skill of a player in a team situation. A player could have played few matches in 1vs1, giving them a lower rating, or they could have played many matches selectively, giving them a higher ranking. In your situation I would have suggested that you pit teams against each other based on how good you thought they were (whether you wanted a 2nd and 3rd party to weigh in would have been up to you), simply because the data only accounts for a single variable in a league which has a somewhat different objective.

My suggestion would be to have an "archives" link on the website which allows players to access big ladda rankings of players prior to reset. Personally I don't see the point. The 1vs1 site is so old it's just eye candy, and not particularly good eye candy at that (sorry strayer :P ).
jh^
Private First Class
Posts: 41
Joined: Mon May 29, 2006 5:10 pm
Location: kuopio,finland

Re: A new year, a new beginning.

Post by jh^ »

Yep pls reset the stats, it's silly that retired players keep 1st or whichever high place forever.
kierra
Lieutenant, Junior Grade
Posts: 4107
Joined: Wed Mar 23, 2005 1:02 am
Location: outer Slovenia

Re: A new year, a new beginning.

Post by kierra »

There is some validity to resetting. Have a Hall of Fame page for historical purposes.

In the 1v1 Tourney last summer, one player told me he wasn't going to register for it because a loss would affect his stats & ranking.
jh^
Private First Class
Posts: 41
Joined: Mon May 29, 2006 5:10 pm
Location: kuopio,finland

Re: A new year, a new beginning.

Post by jh^ »

Finally....redders name is not on top anymore :P. ty.
Snake12534
Private First Class
Posts: 216
Joined: Thu Oct 04, 2012 9:41 pm
Location: Austin, Texas

Re: A new year, a new beginning.

Post by Snake12534 »

2014 1vs1 Tournament?
retired
strayer
Sergeant Major
Posts: 191
Joined: Sat May 24, 2003 3:54 pm
Location: Germany

Re: A new year, a new beginning.

Post by strayer »

Snake12534 wrote:2014 1vs1 Tournament?
I'm not sure whether that's the right topic for your question, but: Yes, why not. Are you asking as a voluntary organizer? :)