Why are Elo ratings less efficient for Yugioh Duel Links than for chess?

Yugioh Duel Links is a digital collectible card game (CCG) that can be played on mobile devices. Like many other CCGs, it has an in-game Ladder system, where players compete with each other to prove themselves the best duelist in the world. However, many players complain about the mechanism of this Ladder and suggest replacing it with the Elo system. In this post, I will show you, thanks to the cutting-edge research of DeepMind, that the Elo system, or any other system that relies on averaging, is unavoidably inefficient for Duel Links.

According to my previous research, a diverse and accessible meta must resemble a game of rock-paper-scissors. This phenomenon can be observed clearly in the current meta, where Koa’ki Meiru is dominated by Buster Blader, which is dominated by Ancient Gears, which is in turn dominated by Koa’ki Meiru. These three archetypes prey on each other and hence prevent any one of them from becoming predominant.

This phenomenon is called cyclic domination, or nontransitivity in some contexts. It has recently been revealed by DeepMind that Elo ratings are a poor measure for games of this nature. In the following, I will explain their research result in layman’s terms and make it clear why Elo ratings are inefficient for Duel Links.

Let us start with the simplest case: a meta whose three strongest archetypes are Rock, Paper, and Scissors, and whose relationship can be represented as in the following figure (where the domination is strict).

Let us suppose that there is an equal number of Rocks, Papers, and Scissors in the player pool. For every duel in which the Rock player is paired with a Scissors player (against whom he wins), he will also be paired with a Paper player (against whom he loses). On average, he wins half of his duels.

If no off-meta archetypes are in play, his win-rate will be 50%. It is only thanks to the off-meta archetypes that he is able to boost his win-rate above 50%. If one archetype among Rock, Paper, and Scissors has a higher win-rate than the other two, it is because it wins against off-meta archetypes more efficiently (e.g., more quickly or more consistently).
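The arithmetic above can be checked with a small sketch. The mirror matches are assumed to be 50/50, and the "OffMeta" deck with its win probabilities (0.8, 0.7, 0.6) is made up purely for illustration; none of these numbers are measured Duel Links data.

```python
# Probability that the row deck beats the column deck.
# Meta matchups are strict; mirror matches are assumed to be 50/50.
# The "OffMeta" column holds hypothetical numbers chosen for illustration.
WIN = {
    "Rock":     {"Rock": 0.5, "Paper": 0.0, "Scissors": 1.0, "OffMeta": 0.8},
    "Paper":    {"Rock": 1.0, "Paper": 0.5, "Scissors": 0.0, "OffMeta": 0.7},
    "Scissors": {"Rock": 0.0, "Paper": 1.0, "Scissors": 0.5, "OffMeta": 0.6},
}


def expected_win_rate(deck, pool):
    """Average win probability of `deck` against opponents drawn from `pool`.

    `pool` maps each opposing deck to its (unnormalised) share of the ladder.
    """
    total = sum(pool.values())
    return sum(share / total * WIN[deck][opp] for opp, share in pool.items())


meta_only = {"Rock": 1, "Paper": 1, "Scissors": 1}
with_off_meta = {"Rock": 1, "Paper": 1, "Scissors": 1, "OffMeta": 1}

for deck in WIN:
    print(deck,
          round(expected_win_rate(deck, meta_only), 3),
          round(expected_win_rate(deck, with_off_meta), 3))
```

With only meta decks in the pool, every deck sits at 50%. Adding the off-meta share lifts Rock to 57.5%, Paper to 55%, and Scissors to 52.5%, and that ordering comes entirely from the hypothetical off-meta column.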

Therefore, duels among the meta archetypes cancel out on average, and only the duels against off-meta archetypes can help boost the win-rate.

The Elo system, which is just a smarter way to calculate the win-rate, cannot offer anything more. Let us consider a scenario where the domination is soft and the relative strength can be represented by the following chart.

In this chart, the higher the number is, the stricter the domination will be. Let us call this chart the flow chart. According to Hodge theory, any such flow can be written as the sum of a divergence flow and a rotation flow. The decomposition for the example above is illustrated in the following figure.

The divergence flow represents the absolute strength, and the rotation flow represents the cyclic domination. Interestingly, the Elo rating faithfully reports the divergence flow but completely discards the rotation flow.
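For a small meta, the decomposition can be sketched in a few lines. The flow numbers below are hypothetical and are not the ones from the chart in this post. The sketch also assumes that every deck meets every other deck equally often, in which case each deck's average advantage over the field plays the role of its absolute strength, which is roughly the quantity an Elo-style rating would recover.

```python
DECKS = ["Rock", "Paper", "Scissors"]

# FLOW[i][j] > 0 means deck i is favoured over deck j, and
# FLOW[j][i] == -FLOW[i][j]. Hypothetical numbers for illustration only.
FLOW = [
    [0, -1, 4],   # Rock: slightly behind Paper, far ahead of Scissors
    [1, 0, -1],   # Paper: slightly ahead of Rock, slightly behind Scissors
    [-4, 1, 0],   # Scissors: far behind Rock, slightly ahead of Paper
]

n = len(DECKS)

# Absolute strength: each deck's average advantage over the field.
strength = [sum(row) / n for row in FLOW]

# Divergence flow: the part of the flow explained by strength differences alone.
divergence_flow = [[strength[i] - strength[j] for j in range(n)]
                   for i in range(n)]

# Rotation flow: whatever remains, i.e. the cyclic domination.
rotation_flow = [[FLOW[i][j] - divergence_flow[i][j] for j in range(n)]
                 for i in range(n)]

print("strength:", dict(zip(DECKS, strength)))
print("rotation flow:", rotation_flow)
```

Every row of `rotation_flow` sums to zero, so it contributes nothing to any deck's average advantage. A rating that summarises each deck by a single number has no place to store it.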

In other words, the Elo rating can only reflect a very small portion of the information generated by your dueling results. That is why you have to play many games to get a precise Elo rating, which is frustrating.
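To make the lost information concrete, here is a minimal sketch of the standard Elo update applied to strict rock-paper-scissors. The K-factor of 32, the starting rating of 1500, and the fixed round-robin schedule are assumptions chosen for illustration; they are not how the Duel Links Ladder works.

```python
K = 32  # a commonly used Elo K-factor; assumed here


def expected_score(r_a, r_b):
    """Standard Elo expected score of a player rated r_a against one rated r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def record_win(ratings, winner, loser):
    """Move rating points from loser to winner according to the Elo update rule."""
    gain = K * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += gain
    ratings[loser] -= gain


ratings = {"Rock": 1500.0, "Paper": 1500.0, "Scissors": 1500.0}

# Each round, every matchup is played once and the dominant deck always wins.
for _ in range(1000):
    record_win(ratings, "Rock", "Scissors")
    record_win(ratings, "Scissors", "Paper")
    record_win(ratings, "Paper", "Rock")

print({deck: round(r, 1) for deck, r in ratings.items()})
```

Every single result in this schedule is perfectly predictable, yet the three ratings stay close together, so the rating alone predicts something near a coin flip for each matchup.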

This inefficiency is not present in other games like chess, where the strength is transitive. In those games, the rotation flow is zero. Therefore, the Elo rating reflects the totality of the game results.

In the above, I explained why Elo ratings are inefficient for Duel Links. Although they are inefficient, they can still capture some information.

The first thing they can capture is the capacity to win disadvantageous matchups. If a Rock player manages to win against a Paper player (e.g., by making the opponent miss the timing), it surely demonstrates the Rock player’s skill.

The second thing is the capacity to win the mirror match. Yami Yugi won against Arkana and proved himself the true Dark Magician master.

The third thing is the capacity to predict the meta at any given moment. When the Rock players are more frequent than the other two archetypes, a skilled player will switch to Paper. Similarly, when the Paper players become more frequent later, that skilled player will quickly switch to Scissors.
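This kind of meta call amounts to picking the deck with the best expected win-rate against the current pool. A rough sketch, assuming strict domination, 50/50 mirror matches, and made-up pool shares:

```python
# Probability that the row deck beats the column deck (strict domination,
# mirror matches assumed to be 50/50).
WIN = {
    "Rock":     {"Rock": 0.5, "Paper": 0.0, "Scissors": 1.0},
    "Paper":    {"Rock": 1.0, "Paper": 0.5, "Scissors": 0.0},
    "Scissors": {"Rock": 0.0, "Paper": 1.0, "Scissors": 0.5},
}


def best_deck(pool):
    """Return the deck with the highest expected win-rate against `pool`.

    `pool` maps each deck to its (unnormalised) share of the ladder.
    """
    total = sum(pool.values())

    def win_rate(deck):
        return sum(share / total * WIN[deck][opp] for opp, share in pool.items())

    return max(WIN, key=win_rate)


# Hypothetical pool shares, in percent.
print(best_deck({"Rock": 50, "Paper": 25, "Scissors": 25}))   # Rock-heavy meta
print(best_deck({"Rock": 25, "Paper": 50, "Scissors": 25}))   # Paper-heavy meta
```

In the Rock-heavy pool the best choice is Paper, and once Paper becomes the most common deck the best choice moves on to Scissors.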

Conclusion

Elo ratings mostly represent deck strength. Only at very high ratings do they reflect player skill to some extent. They capture a small portion of the information, and they do so very inefficiently. To get a precise Elo rating, you need to play a large number of duels.

Written on March 12, 2019