Rob Wiblin is the research director of 80000 Hours. His writings about philanthropy, economics, and life advice are read by thousands of people. Last week, Rob tweeted:
This tweet was read by 4 million people.
Let it not be said that I can’t take a hint. I’ve written extensively about philanthropy, economics and life advice, so it’s time I switched to writing about what really matters: Pokémon. Specifically, let’s answer the main question on the nation’s mind these days: how long will it take to, in fact, catch ’em all.
This isn’t a trifling matter to be addressed by guesstimating a few numbers and tossing them in an equation. In my quest for the ultimate pokemonumber I observed the best hunters, counted every single pokemon in my own deck, sourced the wisdom of the internet, estimated a Bayesian parametric model, programmed a simulation, and braved the stormy seas of GitHub for the first time so you can play with the program yourself. And of course, it all took twice as long as it should have because I frequently had to stop writing and go on a pokewalk in my neighborhood. All in the name of research.
The first thing we know about catching ’em all is that one guy out of the 20 million Americans with Pokemon Go on their phones has indeed done so, and that 19,999,999 still haven’t. It took Nick Johnson about 100-120 hours over two weeks to complete the American pokedex. That’s high but not uniquely so: there were probably several thousand people who played as many hours, and Nick turned out to be the luckiest among them.
There’s a limit to how much we can extrapolate from a single data point, but it gives us a useful edge case: our simulation should catch ’em all in as little as 100 hours about once every several thousand tries. The average result should be on the order of several hundred hours.
To get better detail, I looked at my own play two weeks after installing the app. IGN’s wiki informs us that Pokemon Go uses 4-8 MB of cellular data per hour. I’m using LTE so my own usage is probably on the high end; let’s say 7 MB/hr. I think I play for about an hour a day, so my Pokemon data usage should be around 100 MB for 14 hours of play.
Holy crap this game is hard to put down! Ok, so that makes it 30 hours of play. During these 30 hours I have captured 502 pokemon, but I ignore about every third pokemon I see so that means that I’ve seen around 753, or 25 pokemon sightings per hour of play.
The next thing to figure out is the rarity of each specific pokemon. Here’s the best chart the internet currently has to offer:
This is a really useful chart, but there are three things missing from it:
- It only has one pokemon from each of the 69 evolutionary lines (66 that are available on each continent). I couldn’t find information on the chances of seeing an evolved pokemon, so we’ll start by estimating how long it will take to catch one pokemon in each genus and extrapolate from there.
- As people have commented on Reddit, the rarity of pokemon is very location-dependent. Drowzees overrun Toronto at night, but there are hardly any in Queens in daytime. The chart has zubats outside the most common category, which sounds crazy to anyone in Pittsburgh, where every other pokemon is a zubat. This means that the general structure of this chart is useful, but we won’t rely on the specific types of pokemon in each category.
- The main thing missing from this chart is numbers. That’s where I come in.
What we need is a numerical rarity table for each pokemon. What we have is the chart above and the 753 pokemon I have meticulously recorded like Darwin on The Beagle. Our challenge lies in creating a general and broadly applicable model from very limited and specific data. The art and science of model selection consists in finding the right balance between applicability and precision, or equivalently between variance and bias, or parsimony and fit to the data.
Out of the 753 pokemon that I have seen, 37 have been pidgeys, 31 weedles and 4 ponytas. An overly specific model would say that the probability of seeing a pidgey is thus exactly 37/753 = 4.9%, for weedles it’s 4.1% and 0.53% for ponytas. This fits my observations perfectly, but it’s unlikely to generalize. A priori, it’s much likelier that I have simply lucked into more pidgeys than that Niantic programmed pidgeys to show up exactly 4.9137% of the time for every player. This approach would use the data too much and overfit the model. In model selection language, we would be estimating too many parameters: 66 of them if we’re picking a unique probability for each pokemon type.
On the other hand, saying that every pokemon has the same 1/66 chance of appearing is too general a model. It may be a reasonable guess ahead of time, but it doesn’t fit the data at all. The chance of seeing 10 times as many pidgeys as ponytas if they were equally likely is one in 20 million.
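One way to arrive at a figure of this size is a one-sided binomial tail. The setup below is an assumption, not necessarily the exact calculation used: take the 41 sightings that were either a pidgey or a ponyta, treat each as a fair coin flip, and ask how often 4 or fewer would come up ponyta.

```python
from math import comb

pidgeys, ponytas = 37, 4           # counts quoted in the post
n = pidgeys + ponytas              # 41 sightings that were one or the other

# If both types were equally likely, the number of ponytas among these
# n sightings would be Binomial(n, 1/2). One-sided tail: 4 or fewer ponytas.
p_tail = sum(comb(n, k) for k in range(ponytas + 1)) / 2 ** n

print(f"P(at most {ponytas} ponytas out of {n}) = {p_tail:.2e}")
print(f"roughly 1 in {1 / p_tail:,.0f}")
```

Under that assumed setup the tail probability lands between 1 in 19 million and 1 in 20 million, in line with the figure above.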
A balanced model tries to get close enough to the observed data, while using the least number of parameters, in accordance with Occam’s Razor. For example, we can assume that pidgeys and weedles fall in some category with other types that all have a roughly 4-5% chance of appearing, while ponyta is in another category that has a different, lower probability.
The reddit chart has 9 rarity categories plus “special”, but I’m not sure what’s special about the latter except that some of them are available as starter pokemon. We’ll stick with 9 groups. Each category has a different number of pokemon in it, and that strikes me as an unnecessary complication. I haven’t seen any of the 3 mystical pokemon yet so I’ll take the chart’s word on 6 and 3 pokemon in epic and mystical. On the everywhere side, there are 4 pokemon that show up much more often for me than any others: zubats, rattatas, drowzees and doduos. I will assume that the remaining 53 pokemon are evenly spread among the 6 categories in the middle: 9 in each (one group will have 8). That’s because there’s no reasonable amount of data the chart maker could have gathered that will show that there are exactly 7 pokemon in common and 9 in uncommon and not 8 and 8 or 9 and 7. In the absence of evidence, we should assume that the groups are equally sized to be parsimonious.
Now the tricky part: how to assign a probability to each pokemon type without “using up” too many parameters? Here, the edge cases are useful. The four everywhere pokemon account for about one third of all the ones I see, so I’ll give every pokemon in that category around 8% (1/4 of 33%). I have seen 3 of the 6 epic pokemon, so the chance that a single pokemon I come across is epic is roughly 3/753, or 0.4%, and each of the 6 epic pokemon has around .06%. Epic is 7 categories away from everywhere, and .06% * 2^{7} ≈ 8%. Hmm. Could the per-pokemon probability in each successive category differ by a factor of 2? That would certainly be elegant, let’s see if it fits.
If each category has half the probability per pokemon as the one above it, each pokemon in the virtually everywhere category will have a 4% chance of showing up. 4% * 753 = 30, so if that’s the case I’ll expect that the 8 or 9 most common pokemon after the first 4 will show up around 30 times each. Here’s what I’ve actually seen:
Pokemon | Times seen |
Spearow | 40 |
Pidgey | 37 |
Weedle | 31 |
Caterpie | 26 |
Voltorb | 25 |
Krabby | 20 |
Magnemite | 18 |
Tentacool | 15 |
Not perfect, but not unreasonable as an approximation. The fit gets better as I look at the other categories. For example, the uncommon pokemon are ranked 31-39 based on rarity and I expect to see each of them 4 times out of 753 (0.5%). Looking at my records, I have in fact seen each pokemon in places 31-39 between 3 and 5 times. If my model is correct, one of the three mystical pokemon will show up once every 1000 tries or so, since 3 (mystical types) * 8% * 2^{-8} ≈ 1/1000. The fact that I haven’t seen any in 753 tries yet fits the model. It also means that I’m frickin’ due to find one soon.
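To put a rough number on that last point, here is a minimal sketch of how likely a mystical-free run of 753 sightings is. It assumes each sighting is an independent draw and uses the per-sighting mystical probability of 3 * 8% * 2^{-8} from the paragraph above.

```python
n_sightings = 753
p_mystical = 3 * 0.08 * 2 ** -8     # any of the 3 mystical types, per sighting

# Probability that every one of the sightings is non-mystical,
# assuming sightings are independent draws from the same distribution.
p_none = (1 - p_mystical) ** n_sightings

print(f"P(mystical per sighting) = {p_mystical:.5f} (about 1 in {1 / p_mystical:.0f})")
print(f"P(no mystical in {n_sightings} sightings) = {p_none:.2f}")
```

Under these assumptions the chance of going 753 sightings without a mystical is close to a coin flip, so an empty mystical column is unremarkable.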
This then is my best guess at the probability distribution of pokemon of each type:
Group name | N in Group | Examples (where the chart and I agree) | P(each) |
Everywhere | 4 | Zubat, drowzee, pidgey | 8% |
Virt. Everywhere | 8 | Oddish, weedle | 4% |
Very Common | 9 | Magikarp, krabby | 2% |
Common | 9 | Nidoran, clefairy | 1% |
Uncommon | 9 | Geodude, jigglypuff | 0.5% |
Rare | 9 | Tangela, koffing | 0.25% |
Very Rare | 9 | Rhyhorn, scyther | 0.125% |
Epic | 6 | Electabuzz, hitmonlee | 0.063% |
Mystical | 3 | Snorlax, porygon | 0.031% |
I like this model: it matches both the chart and my own pokecounts, and instead of sixty-six parameters we described the data using only nine: 9 categories; equal probability within each category; 4, 8, 6, 3 pokemon in some categories, 9 in the rest; 8% for the probability of each everywhere pokemon and 1/2 probability drop for each consecutive category. And wouldn’t you know it, when you add the probabilities of each pokemon appearing it all adds up to just about 100% (99.3%, to be exact).
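The whole table can be rebuilt from those few numbers. A small sketch (the variable names are made up for illustration): group sizes as in the table, 8% at the top, and a halving at each step down.

```python
# Group sizes in order from Everywhere down to Mystical (from the table above).
group_sizes = [4, 8, 9, 9, 9, 9, 9, 6, 3]
p_top = 0.08                        # per-pokemon probability in Everywhere

# Halve the per-pokemon probability at each step down the rarity ladder.
p_each = [p_top / 2 ** i for i in range(len(group_sizes))]

# One probability per pokemon type, 66 in total.
probs = [p for n, p in zip(group_sizes, p_each) for _ in range(n)]

total = sum(probs)
print(f"{len(probs)} types, total probability = {total:.4f}")
for n, p in zip(group_sizes, p_each):
    # Expected number of sightings of one such pokemon among 753.
    print(f"n={n}  P(each)={p:.4%}  expected in 753: {753 * p:.1f}")
```

The total comes to about 99.3%, so rescaling the probabilities to sum to exactly 1 changes very little.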
Who’s the boss of Bayesian data analysis? I’m the boss.
(Just kidding, Andrew Gelman is the boss.)
OK, one more step to go: given the probabilities of coming across each pokemon type, how many pokemon will you see before seeing one of each? This question is too difficult for my puny brain to calculate using algebra, but it’s not too difficult for my laptop to calculate using brute force. Ladies and gentlemen, welcome to the Putanumonit simulations GitHub repository.
pokemon.py is a small Python program simulating Pokémon Go games. Each game consists of turns in which you see a single pokemon based on a given probability distribution. The function counts the turn on which you saw each pokemon for the first time. The sim function allows you to simulate many games and calculate aggregate statistics. Here’s what I got from running the simulation 10,000 times with the probability distribution I estimated above:
- The median number of turns before seeing each of the 66 pokemon types is 5,800, or 232 hours of play (at 25 pokemon/hour).
- The mean is higher, at 6,600 turns. That’s because the “unlucky” simulations do much worse (~20,000 turns) than the “lucky” ones do well (~2,000). The “unluckies” skew the mean upwards, but not the median.
- The quickest any of my 10,000 sims did was 1,150 turns, or 46 hours. We can assume that Nick Johnson is about 1-in-10,000 lucky himself, and it took him 100-120 hours. This means that catching all 142 distinct pokemon takes 2-3 times as long as catching just the 66 evolutionary types. This is a very rough estimate based on a single data point, but it doesn’t sound utterly unreasonable. Whatever the real ratio is, it’s probably closer to 2 or 3 than to 1.01 or to 30.
- So, my best estimate is that catching all 142 pokemon, if you aren’t as lucky as Nick, will take at least 500-700 hours, or about a year of your life if you play for 2 hours a day.
- Yes, that sounds like a lot, but what else were you really going to do with the rest of 2016-2017?
- In 65% of simulations the last pokemon caught was one of the three mysticals, and these simulations took 7,460 turns on average. In 30% the last pokemon caught was epic, and these took 5,310 turns. In only 5% of the simulations was the last pokemon caught one of the 57 types that aren’t in the rarest two groups; these were the quickest games of all at 3,740 turns on average. This means that the length of the game is mostly determined by how long it takes you to catch the 3-6 pokemon that are rarest in your area. The amount of time to complete the other 60 common types in the pokedex is relatively fixed.
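For anyone who wants the gist without opening the repository, here is a minimal sketch of this kind of simulation. It is not the contents of pokemon.py: the names `play_game` and `simulate`, the seed, the batch size and the small number of games are all assumptions made for illustration.

```python
import random
from statistics import mean, median

GROUP_SIZES = [4, 8, 9, 9, 9, 9, 9, 6, 3]
# One weight per pokemon type: 8% at the top, halving with each group.
WEIGHTS = [0.08 / 2 ** i for i, n in enumerate(GROUP_SIZES) for _ in range(n)]

def play_game(weights, rng, batch=1000):
    """Return the turn on which the last unseen type first appears."""
    types = range(len(weights))
    unseen = set(types)
    turn = 0
    while True:
        # random.choices normalizes the weights, so they need not sum to 1.
        for t in rng.choices(types, weights=weights, k=batch):
            turn += 1
            unseen.discard(t)
            if not unseen:
                return turn

def simulate(n_games, weights=WEIGHTS, seed=0):
    """Play n_games independent games and return the list of game lengths."""
    rng = random.Random(seed)
    return [play_game(weights, rng) for _ in range(n_games)]

if __name__ == "__main__":
    turns = simulate(200)
    print(f"median turns: {median(turns):,.0f} ({median(turns) / 25:.0f} hours at 25/hour)")
    print(f"mean turns:   {mean(turns):,.0f}")
```

With only a couple of hundred games the numbers will bounce around more than the 10,000-game results above; raise `n_games` for steadier estimates.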
You can play around with the program and see how the games play out under different assumptions. You’re also welcome to comment with the data you gathered on your own hunts and other questions you want answered. From now on, this blog will forgo any economic, philanthropic, political, or scientific distractions and focus exclusively on Pokémon until the madness subsides. Or until we catch ’em all.
Interesting.
So for the obsessive completists… that’s about a year, and also trips to Australia, North America, and Japan :-p (do the region ones work like that?)
Pretty much. The game is out in Aussie and NZ, most of Europe, US and Canada in the Americas and only Japan so far in Asia. I should start a cruise line taking Pokémon Go players around the world, with the ship moving slowly enough for all your 10Ks to hatch 🙂
You’ve said that 3/753 = .04% whereas it is actually 0.4%. I’m not sure how much this will affect your model.
Great catch, thanks. I actually had the correct numbers in the table and the model (0.4% for the category, 0.063% for each pokemon in it) but I mistakenly added an extra 0 in the paragraph. The model is still good, the typo is fixed.
I’m really surprised your blog doesn’t have hundreds of comments on every post. I usually don’t even glance down here because I assume I’ll be one voice in a crowd, but it seems that everyone is thinking the same way. Anyway, I want to say I love your work, I check for new entries every day and I make sure my friends read everything you write.
I can’t add much on Pokemon Go, I’m from Argentina and it came out just yesterday. The crime rates are quite a bit higher down here, so it’s really going to be an adventure to go on a Pokewalk. Too many Team Rockets. Nice model, though!
Aw, thanks!
I think that people will want to comment once every post gets to 10-20 comments and it feels like an actual discussion is going on, like on “Vote Against” on “Secret Society”. I try to do what I can to encourage people to comment, especially if they’re writing to tell me I’m wrong. Bottom line: don’t be shy, if you have a thought on a blog post or someone else’s comment, go ahead and post it!
Once you formulate the model, it becomes the non-uniform Coupon Collector’s problem. See the last bullet point here
https://en.wikipedia.org/wiki/Coupon_collector's_problem#Extensions_and_generalizations
where p_i are the probabilities of catching a certain pokemon, i.e. the P in your table.
So instead of Monte-Carlo, one can simply numerically evaluate the integral there to get the exact result 🙂
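A minimal sketch of that numerical evaluation, assuming the formula from the linked section, E[T] = ∫_0^∞ (1 − ∏_i (1 − e^{−p_i t})) dt, and the group sizes and probabilities from the table in the post. The trapezoid step and the cutoff are arbitrary choices, and the probabilities are rescaled to sum to 1.

```python
from math import exp

GROUP_SIZES = [4, 8, 9, 9, 9, 9, 9, 6, 3]
raw = [0.08 / 2 ** i for i in range(len(GROUP_SIZES))]
total = sum(n * p for n, p in zip(GROUP_SIZES, raw))
P_EACH = [p / total for p in raw]       # rescaled so all 66 sum to 1

def integrand(t):
    """1 - prod_i (1 - exp(-p_i * t)), with equal p_i grouped together."""
    prod = 1.0
    for n, p in zip(GROUP_SIZES, P_EACH):
        prod *= (1.0 - exp(-p * t)) ** n
    return 1.0 - prod

def expected_turns(t_max=100_000.0, step=1.0):
    """Trapezoid rule on [0, t_max]; the tail beyond t_max is negligible here."""
    n_steps = int(t_max / step)
    area = 0.5 * (integrand(0.0) + integrand(t_max))
    area += sum(integrand(k * step) for k in range(1, n_steps))
    return area * step

print(f"expected turns to see all 66 types: {expected_turns():,.0f}")
```

This gives the mean rather than the median, so the number to compare it with is the simulated mean of roughly 6,600 turns.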