League of Legends can be a bit overwhelming to beginners. The basic goal is simple—destroy the enemy Nexus. There are a bunch of small AI enemies, some big AI enemies, and 5 humans that want to destroy your nexus first.
Beyond that, there are a ton of nuances—it wouldn’t be a successful e-sport if there weren’t. One thing that can be intimidating is the sheer number of champions. Riot! lists roles for each of them, but people don’t always play the champions as intended. Playing a champion in a bad role, especially for a beginner, makes the game harder than it needs to be, and can reduce fun.
ChampMap is a simple visualization that shows you how real players use different champions. While it isn’t going to revolutionize the way people learn about League, I think it’s pretty cool. It could also serve as a tool for Riot! when looking at how champions are actually being used, or what types of champions to target for upcoming releases.
Click to view the full interactive visualization!
How it Works
ChampMap is the output of a neural network trained on around 7.8M team compositions on Summoner’s Rift. The network takes the identifiers for four champions on a team and tries to predict the team’s final champion. Let’s first think about how a human might do this for a hypothetical team.
Our team has
If you’ve played LoL at all, you probably recognize that we’re missing an AD carry. Someone like Caitlyn, Ashe, or Ezreal would be alright here. It’s unlikely that you’ll see someone like Leona, because you wouldn’t have enough damage.
How can that intuition be codified? Maybe we want to cluster the champions, and hope that the different roles will be nicely captured in the clustering. That could work, but not every champion fits into a single role. Nunu, for example, can be played in the jungle or as a support.
What if we take each champion’s stats (attack damage, attack speed, mana, move speed, etc.), and perform some sort of dimensionality reduction like Principal Component Analysis? That could work too, but it would only reflect the rules, not how people actually play the game.
ChampMap takes a different approach, one that is very similar to the word embeddings used in Natural Language Processing (see this paper as an example). Each champion is assigned two numbers that are meaningless outside of the model (the champion’s embedding). When we see four members of a team composition, their embeddings are concatenated into an 8-dimensional vector. That vector feeds one hidden layer, and then we try to predict the remaining champion. When we see a new team composition, the embeddings, as well as the weights in the hidden and final layers, are updated to account for that newly observed team.
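To make the data flow concrete, here is a minimal sketch of the forward pass only, in plain Python rather than Theano. The toy roster, the hidden layer width, the tanh activation, the lack of bias terms, and the names `predict_fifth`, `W_hidden`, and `W_out` are all assumptions for illustration and are not taken from the original code. Training, which would nudge the embeddings and both weight matrices after each observed team, is left out.

```python
import math
import random

random.seed(0)

# Toy roster; the real model covered every champion available at the time.
CHAMPIONS = ["Caitlyn", "Ashe", "Ezreal", "Leona", "Nunu", "Annie", "Teemo"]
EMBED_DIM = 2   # two numbers per champion, as described above
CONTEXT = 4     # four known teammates
HIDDEN = 16     # hidden layer width: an assumption, not from the post


def rand_matrix(rows, cols, scale=0.1):
    """Small random weights; a trained model would have learned these."""
    return [[random.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]


embeddings = rand_matrix(len(CHAMPIONS), EMBED_DIM)
W_hidden = rand_matrix(HIDDEN, CONTEXT * EMBED_DIM)
W_out = rand_matrix(len(CHAMPIONS), HIDDEN)


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_fifth(teammates):
    """Four champion names in, one probability per champion out."""
    ids = [CHAMPIONS.index(name) for name in teammates]
    # Concatenate the four 2-d embeddings into one 8-d input vector.
    x = [value for i in ids for value in embeddings[i]]
    hidden = [math.tanh(h) for h in matvec(W_hidden, x)]
    return softmax(matvec(W_out, hidden))


probs = predict_fifth(["Leona", "Nunu", "Annie", "Teemo"])
for name, p in zip(CHAMPIONS, probs):
    print(f"{name}: {p:.3f}")
```

With untrained random weights the probabilities come out close to uniform. Only after training on real team compositions would an AD carry get a higher score for a team that lacks one.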
Here’s a picture showing that same structure.
The visualization is then just the plot of all champions’ embeddings. If I started this from scratch today, I’d use Pylearn2, but since that wasn’t around when I did most of this work (around April 2013), I used Theano directly.
The more savvy among you might realize that the embeddings don’t look quite right given the current state of League. The newer champions aren’t shown, and Annie only shows up among the AP mid champs. This is because the data aren’t new: they were scraped over the course of a few months, and most of the games are from between March and July of 2013. Here’s a look at the number of games in my data over time.
You see the weird patterns because I ran the crawler every few days, and each run captured the most recent 10 games of each summoner in my database.
This is an artifact of the way I got the data. I used the no-longer-free Elophant API, which works a lot like the new Riot! API. It would have been ideal to take all games in a slice of time, or some random sample of games by time, but that feature isn’t exposed. Instead, I did something akin to Snowball Sampling.
- Seed the crawler with the summoner names of a few high-ranking players
- Get the recent match history for those players (10 games)
- Add all of the summoners that they’ve played against to a queue
- Get the match history for the summoner at the top of the queue
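The steps above can be sketched as a breadth-first crawl. This is an illustration under stated assumptions, not the original crawler: `fetch_recent_matches` is a hypothetical callable standing in for the API, and the match layout (a dict with an `"id"` and a list of `"opponents"`) is invented for the example.

```python
from collections import deque


def crawl(seed_summoners, fetch_recent_matches, max_summoners=75000):
    """Snowball-sample match histories starting from a few seed players.

    fetch_recent_matches is a hypothetical stand-in for the API call: it
    takes a summoner name and returns that summoner's recent matches.
    """
    queue = deque(seed_summoners)
    seen = set(seed_summoners)
    matches = {}
    while queue:
        summoner = queue.popleft()  # summoner at the top of the queue
        for match in fetch_recent_matches(summoner):
            # Key by match id so a game reached via two players is stored once.
            matches[match["id"]] = match
            for opponent in match["opponents"]:
                # Stop enqueueing new summoners once the cap is reached.
                if opponent not in seen and len(seen) < max_summoners:
                    seen.add(opponent)
                    queue.append(opponent)
    return matches, seen


# Tiny fake match history so the sketch runs without any network access.
FAKE_HISTORY = {
    "alice": [{"id": 1, "opponents": ["bob", "carol"]}],
    "bob": [{"id": 1, "opponents": ["alice"]}, {"id": 2, "opponents": ["dave"]}],
    "carol": [],
    "dave": [{"id": 2, "opponents": ["bob"]}],
}

matches, seen = crawl(["alice"], lambda name: FAKE_HISTORY.get(name, []))
print(sorted(matches), sorted(seen))
```

A sample built this way is biased toward players connected to the seeds, which is the usual trade-off with snowball sampling.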
Since getting information about a summoner consumes an API call, I changed this up a bit after running it for a few weeks. By the time I had match histories for around 75,000 summoners, I stopped adding new summoners and just fetched new matches for the existing list of summoners.
For those who are interested, I stored the data in MongoDB, which was easier than a SQL database because the schema was quite complex, and I wasn’t sure what I would need. Most of the analysis was done in Python, with a little bit in R because I love ggplot2.
How it Turned Out
I’m really happy with the way this turned out. It’s clear that AD carry is a really well defined role, and so is support. You have a few champions that can be played as AP mid or support, and they show up as between the areas inhabited by those roles. The distinction between jungle and top-lane bruisers is fuzzy, and Teemo is a bit of an outlier!
What do You Think?
Is this kind of interesting, really lame, or the coolest thing you’ve seen in a while? Does anything jump out as being weird, or unexpectedly insightful? Let me know in the comments below!