Tails of Great Soccer Players

Isn’t it strange that the Chinese aren’t world champions in every single team sport? Here’s why it’s strange: China has 19% of the world’s population. For individual sports that may not be a huge deal: if tennis ability and opportunity are distributed equally around the world, there would be only a 19% chance that the best tennis player hails from China and 81% that he is Swiss, Serbian, Spanish, Scottish or from any other country. It is somewhat surprising seeing the top 5 superior servers and strikers of soft springy spheres with swings of stringed racquets all come from sovereign states that start with “S”, but that’s a separate story.

In team sports that should be different. If soccer talent were equally spread, China should have on average 19 of the top 100 players in each generation, and almost never fewer than 11. Countries like Spain, Germany and France on the other hand would expect to have 1 player in the top 100, maybe 2 or 3 if they’re lucky. That would be no match for the loaded Chinese squad. Even a top 3 player can’t dominate all by himself in a team-based sport like soccer, as evidenced by the picture of sad Ronaldo below.

[Image: sad Cristiano Ronaldo, Portugal, 2008]
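The "19 of the top 100, almost never fewer than 11" estimate can be sanity-checked with a binomial model. This is a minimal sketch, assuming (my simplification, not a claim about real scouting) that each of the top 100 slots independently has a 19% chance of going to a Chinese player. The names `binom_pmf` and `prob_fewer_than` are made up for this illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_fewer_than(k, n, p):
    """Probability of strictly fewer than k successes."""
    return sum(binom_pmf(i, n, p) for i in range(k))

N_TOP = 100          # size of the "top players" list
CHINA_SHARE = 0.19   # share of world population, taken from the post

expected_count = N_TOP * CHINA_SHARE
p_under_11 = prob_fewer_than(11, N_TOP, CHINA_SHARE)

print(f"Expected Chinese players in the top {N_TOP}: {expected_count:.0f}")
print(f"Chance of fewer than 11: {p_under_11:.2%}")
```

Under that independence assumption the chance of fewer than 11 comes out at a low single-digit percentage, which is what "almost never" is shorthand for.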


And yet, the Chinese team is not good at soccer, and I’m putting that more mildly than some. The Chinese men’s national soccer team is ranked 84th in the world, a few spots below Antigua and Barbuda – a nation with a population of 90,000. That’s roughly equal to a single neighborhood in Shanghai. Motivation is often brought up as an explanation: perhaps the Chinese have the talent and opportunity to play soccer, but all 1.3 billion of them choose not to. Perhaps instead of playing soccer they choose to study. Those that play soccer the least and study the most can go into medicine, and those that study hardest of all and have no room for soccer make it into top medical schools in the US.

Certainly we don’t expect those Chinese to play soccer at all, and yet below is a group photo of the Emory University medical school soccer club. The summer I was there we played at least 4 hours a week. You can easily find me in the photo: I’m one of three non-Chinese people on the team.

[Image: Emory soccer club group photo]


The success of a national soccer team should depend on two factors: the pool of available players (population) and some combination of natural talent, infrastructure and opportunity that determine roughly how successful an average person in that country can be at soccer. I’ll call the combined second thing national soccer affinity, and will immediately note that it’s a huge simplification to throw so many disparate things into a single factor. My goal is to separate the effects of population, so affinity is basically everything that’s independent of a country’s total size. I am making no guesses regarding the components of soccer affinity (maybe it’s all about having enough sunshine days for kids to play outdoors); I’m only interested in how countries compare. The question I want to investigate is:

Relative to their population, which countries are the best and worst at soccer? And why?

[Image: soccer bell curve]

If we imagine that soccer affinity is normally distributed, a country’s population is the size of the bell curve and the national affinity is how far to the right on the ability axis the center of the bell curve is. The level of a country’s national team is how far on the ability axis the best 11 men and women are. Clearly, having a larger bell curve (more people at every level of play) and shifting the curve to the right (better players on average) should both contribute to boosting the level of the national team. The fact that there are over 15,000 Chinese for each Antiguan, and yet the soccer teams are comparable in level, presents the following puzzle:

Why does it seem that national team level depends on affinity much more than on population?


The answer to that puzzle is: Because the tails of a normal distribution fall much faster than you think. 

In plain(er) English: every point on a bell curve is some distance away from the middle (the mean). The further away from the mean you go, the fewer points there are (lower curve). These distances are often measured in standard deviations, or SD, shown by the vertical red lines on the picture. On a standard bell curve, just over 68% of the points are found a distance of less than 1 SD from the mean in either direction.

[Image: normal curve with standard deviations marked]

Looking naively at the familiar bell picture, it seems that the curve drops sharply over the first 2 or 3 SD to either side and then levels off around 0 when you move further away. That’s extremely misleading: the relative height of the curve actually drops faster the further out you go. It’s invisible on the chart because the line further than 3 SD out is squished very close to 0. The height of the curve at 1 SD is about 4.5 times that at 2 SD. The height at 5 SD is roughly 250 times that at 6 SD, and the drop keeps getting steeper and steeper.
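Those ratios are easy to check. Here is a small sketch using Python's standard `statistics.NormalDist`; the helper name `height_ratio` is just something I made up for this illustration.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

def height_ratio(z_near, z_far):
    """How many times taller the bell curve is at z_near than at z_far."""
    return std_normal.pdf(z_near) / std_normal.pdf(z_far)

# Ratio between each SD mark and the next one out.
for z in range(1, 6):
    print(f"{z} SD vs {z + 1} SD: {height_ratio(z, z + 1):,.1f}x")
```

The ratio between neighboring SD marks grows at every step, because the density is proportional to exp(-z²/2), so the ratio between z and z + 1 is exp(z + 1/2).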

The best male soccer player in China (Zheng Zhi?) is almost literally one in a billion, which means that he’s almost 6 standard deviations better than the average Chinese. If the population of China doubled (they’re working on it!), there would be 2 players as good as Zheng is. However, if the population of China became just one standard deviation better at soccer, there would be over 200 players at least as good, and a few dozen who are much better.
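To compare the two scenarios side by side, here is a rough sketch under the same normal-model assumption. `upper_tail` is a helper I wrote for this example; it uses `math.erfc` so that the tiny far-tail probabilities don't get lost to rounding.

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

ELITE_Z = 6.0  # "one in a billion" is roughly 6 SD above the mean

rarity = upper_tail(ELITE_Z)

# Doubling the population doubles the expected number of players past the bar.
gain_from_doubling = 2.0

# Shifting the whole curve right by 1 SD leaves the bar only 5 SD above the mean.
gain_from_shift = upper_tail(ELITE_Z - 1.0) / rarity

print(f"Fraction of people beyond {ELITE_Z:.0f} SD: {rarity:.2e}")
print(f"Doubling the population multiplies elite players by {gain_from_doubling:.0f}")
print(f"A 1 SD shift multiplies elite players by about {gain_from_shift:.0f}")
```

The multiplier from the shift lands in the hundreds, in line with the "over 200" figure, while doubling the population only ever gives a factor of 2.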


It could be that a normally distributed soccer skill model is wholly wrong, but it does seem to explain some of what we see in reality. For anything that’s distributed roughly like a bell curve, the quality of the best people in a large enough group (like a country) depends much more on small differences in the average level than on large differences in total population.

For illustration, let’s use the one trait that we can all agree is close to normally distributed and varies among nations: human height. The average Indian dude (sorry for the androcentrism, ladies, there’s just better data on male heights and male soccer teams) is 165 cm (5′ 5″) and there are roughly 630 million of them. The average Norwegian dude is 180 cm (5′ 11″) and there are 2.5 million. The standard deviation of male height is around 6 cm around the world. If heights were distributed in a perfect normal bell curve with those parameters they would look like:

[Image: Indian and Norwegian male height distributions]

As we plot them side by side, the Indian curve completely dwarfs the Norwegian one, even for pretty tall dudes. There are 9 Indians who are exactly 180 cm (5′ 11″) tall for every Norwegian. 5′ 11″ is tall, but not super tall. The higher mean effect only kicks in for the real outliers, so let’s zoom the above plot in to the really tall dudes.

[Image: the same height distributions, zoomed in on the tallest men]

Here, the picture reverses completely. There are 100 times as many Norwegians above 195 cm (6′ 4″) as there are Indians. Under a normal distribution assumption, the tallest Indian at 6′ 7″ would only match the 1,000th tallest Norwegian.
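The reversal can be reproduced from the parameters quoted above (means of 165 cm and 180 cm, an SD of 6 cm, 630 million and 2.5 million men). This is a sketch under the perfect-normal assumption; the function names are mine, and exact outputs will move around if you plug in slightly different inputs.

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def expected_taller_than(cutoff_cm, mean_cm, sd_cm, population):
    """Expected number of men taller than cutoff_cm under a normal model."""
    z = (cutoff_cm - mean_cm) / sd_cm
    return population * upper_tail(z)

SD_CM = 6.0
INDIA = {"mean_cm": 165.0, "population": 630e6}
NORWAY = {"mean_cm": 180.0, "population": 2.5e6}

for cutoff in (180, 195):
    indians = expected_taller_than(cutoff, INDIA["mean_cm"], SD_CM, INDIA["population"])
    norwegians = expected_taller_than(cutoff, NORWAY["mean_cm"], SD_CM, NORWAY["population"])
    print(f"Over {cutoff} cm: {indians:,.0f} Indians, {norwegians:,.0f} Norwegians")
```

Above 180 cm the much larger Indian population still wins on headcount. Above 195 cm the Norwegians outnumber the Indians by a factor on the order of 100.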


It’s important to remember that a normal bell curve is a very simplistic model: real life is messy, and Dharmendra Singh is 8′ 1″. Even inside the realm of mathematics, a normal distribution has narrower tails (the height drops faster as you get away from the mean) than most other widely used distributions that look sorta like a bell curve (like the Student’s t or the gamma distributions). A normal model underestimates the number of outliers and overstates the importance of shifting the mean.
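To get a feel for how much the choice of model matters in the tails, here is a sketch comparing the standard normal density with a Student's t density. The choice of 5 degrees of freedom is arbitrary and only for illustration, and the t curve is not rescaled to unit variance, so treat the numbers as qualitative.

```python
import math

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def student_t_pdf(x, dof):
    """Density of Student's t distribution with dof degrees of freedom."""
    coeff = math.gamma((dof + 1) / 2) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2))
    return coeff * (1 + x * x / dof) ** (-(dof + 1) / 2)

DOF = 5  # hypothetical choice, picked only to show a fatter-tailed bell shape

for x in (0, 2, 4, 6):
    ratio = student_t_pdf(x, DOF) / normal_pdf(x)
    print(f"x = {x}: t density is {ratio:,.2f} times the normal density")
```

Near the center the two curves are almost the same height, but far out the t curve is higher by orders of magnitude. If real soccer skill has fatter tails like that, population size matters more than the normal model suggests.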

With that said, my main point stands: it should not surprise anyone that the achievement of extreme performers doesn’t strongly depend on the population of a country but does on the average. There doesn’t have to be something horribly wrong with China to account for its disappointing soccer team; it could be just a little bit to the left of other countries on national soccer affinity. We still don’t know what makes up soccer affinity, just that it’s enough to explain the disconnect between populations and team performance. With the math lesson behind us comes the fun part: in the next posts we’ll rank the world’s countries by average soccer affinity, throw a bunch of data at it to see what it correlates with, and see if we can get any insight into what makes countries good or bad at soccer.

43 thoughts on “Tails of Great Soccer Players”

  1. The graphs of the normal bell curve here made me wonder exactly where those two inflection points on either side of the center are. It turns out the inflection points are at ±1 standard deviation, which I guess are the only locations that make sense.

    1. Yep, you can differentiate the normal density function twice and the second derivative is 0 at ±1 SD. I think this result is mainly useful when you’re drawing bell curves by hand on a piece of paper. I had gotten really good at this by the time I finished this post.

  2. Thank you for this post. This also goes a long way in explaining over/underrepresentation of ethnic groups among male porn actors (assuming a porn industry like the one in California that consistently casts only the most well-endowed male performers, and ignoring all cultural factors, such as audience demand for ethnicity of performers and performers’ own cultural affinity for joining the industry) despite statistically sound medical studies that show very great overlaps in the penis size distribution curves for ethnic groups.

    1. Aargh, I can’t believe I didn’t think to use it as an example myself :)

      Actually, is penis size really the main job requirement? I imagine that the work of a male porn actor is incredibly demanding both in talent and effort. There should be quite a variety among industry members.

  3. This makes even more sense if you consider that “proximity to a great soccer player” is probably also a condition for being an elite player. I would imagine that even if you have the talent to be at the very top of the world, you won’t get there unless you encounter someone fairly early on who is quite good who can coach you/give you feedback. Top players aren’t always great coaches, but many coaches were pretty good players. A slightly higher than average soccer affinity would really blow open the number of available coaches and the small population size would pretty much ensure one of these would be near enough to the raw-elite-talent people to make a difference. Population density may play a part here too, but I’m less sure about that.

  4. “There are 9 Indians who are exactly 180 cm (5′ 11″) tall for every Norwegian. 5′ 11″ is tall”

    Since there are real statistics in this post, you probably want to say

    “There are 9 Indians who are *approximately* 180 cm tall for every Norwegian”

    1. Yeah, I of course meant “9 Indians whose height, rounded to the nearest 1 cm, is 180 cm”. I’m looking to balance scrupulous detail with ease of reading, I assume that my readers are pretty smart. With that said, I’ll take your comment as evidence and update the balance a bit in favor of precision.

  5. Lol this is such an unnecessary complication of things. It’s just very simple: population size doesn’t say a thing about how good a population is at something. Not at all. If you just apply your population size premise then the conquistadores should never have stood a chance against the native Americans. Or, to put it slightly more extremely, why hasn’t the massive number of ants around the world been able to achieve a level of civilization similar to that of human beings?

    Such a premise assumes there are certain “talents” proportionally distributed among the population, while in fact “talent” really is a myth and only resources and hard work matter when it comes to “hard” skills like playing football. Certainly Messi is probably blessed with some exceptional genes for playing football (such as his low center of gravity and extreme agility) but most top-flight players are just completely normal human beings no different than any other guy.

    The thing is just that football infrastructure has been so lacking in China and India. Catching up takes time. Now there has been a huge flow of capital into football in China so things might be picking up, but for that we’ll have to wait and see. Capital doesn’t solve everything and English football has already illustrated that perfectly.

  6. Also I’m not so sure about Chinese students getting into medical schools. If they’re indeed Chinese nationals, not American-born Chinese, then their families are likely so rich that they’ve already been in the US for their undergraduate education or even earlier schooling. That doesn’t say much about the whole Chinese population.

  7. OP: “The curve at 5 SD is 250 times higher than that at 6 SD and it keeps getting steeper and steeper.”

    This can’t be correct, can it? Shouldn’t it say: The curve at 1 SD is 250 times higher than that at 6 SD and it keeps getting steeper and steeper?

    1. Thank you, Iggy, for emphasizing the point of how unintuitive the drop off is at the far tails of the normal distribution. The curve actually is about 250 times higher at 5 SD than at 6 SD; at 1 SD it’s 39,824,784 times higher than at 6 SD! That’s what happens when you have an exponent of a square in there.

      You can verify this on any calculator, or even in Excel:
      =NORM.DIST(5,0,1,0) / NORM.DIST(6,0,1,0)

  11. You are missing obvious factors: the amount of interest in, and tradition of, a sport. How many great cricket players are there outside the five countries that play cricket?
