So far in this series we have seen why population doesn’t matter much for national soccer success, and then ranked the countries by their calculated average soccer ability. To conclude this series, I’ll present three things that do matter. The epistemic status of these ranges from “makes sense” to “wild speculation”. In other words: if you agree with me – I’ll take the praise; if you don’t – this is all just meant to stimulate discussion.
The hot new release in macroeconomics literature (yes, that’s a thing) is Hive Mind by Garrett Jones. Jones argues that the average IQ of a nation’s citizens affects each citizen’s prosperity much more than that citizen’s own smarts do. The argument is that a modern economy requires collective intelligence, allowing all participants to learn, cooperate and benefit from each other. I propose that a similar collective soccer ability can help explain why average skill trumps population size in determining national outcomes.
Weightlifting and sprinting can probably be practiced in isolation, but a soccer game requires dealing with 21 other players at all times. Soccer players improve only as far as the level of their teammates and competition pushes them.
As we saw in part 1, small changes in average skill create large differences in the number of extreme performers. Let’s say that Lilliputians are a tiny bit better at soccer, on average, than Blefuscans are. At 5 years old, a Lilliputian of exceptional talent will play against the kids on his block, who are at almost the same level in either country. At age 8, his small town will have more decent players to challenge him than a similar town in Blefuscu. By the time he’s 12 and playing for a top youth team, the Lilliputian will find a lot more opponents high enough on the bell curve to hone his skill against, creating a large gap in soccer level.
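The tail arithmetic behind this can be checked in a few lines. The sketch below assumes skill is normally distributed with the same spread in both countries and that Lilliput’s mean sits 0.1 standard deviations higher; both numbers are made up for illustration, not estimated from any data.

```python
from statistics import NormalDist

# Assumed, illustrative numbers: same spread, Lilliput's mean 0.1 SD higher.
blefuscu = NormalDist(mu=0.0, sigma=1.0)
lilliput = NormalDist(mu=0.1, sigma=1.0)

def share_above(dist, cutoff):
    """Fraction of a population whose skill exceeds the cutoff."""
    return 1.0 - dist.cdf(cutoff)

def tail_ratio(cutoff):
    """Per capita, how many times more Lilliputians than Blefuscans clear the cutoff."""
    return share_above(lilliput, cutoff) / share_above(blefuscu, cutoff)

# Cutoffs are measured in standard deviations above the Blefuscan mean.
for cutoff in (0, 1, 2, 3, 4):
    print(f"{cutoff} SD above the Blefuscan mean: ratio = {tail_ratio(cutoff):.2f}")
```

The ratio climbs as the cutoff moves further up the bell curve, which is the widening gap the young Lilliputian runs into at each stage.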
In a real life example, the 1987-born cohort of FC Barcelona’s legendary youth academy had Lionel Messi, Cesc Fàbregas and Gerard Piqué playing together since they were 12 (joined by Pedro a few years later). Looking through the biographies of top players, it’s almost impossible to find any that weren’t playing against world class competition by age 14 at the latest (many start competing seriously around age 7-8). In contrast, even if China produced 20 children with enormous soccer potential in a generation, and even if they devoted themselves to soccer, it would be hard for them to improve as fast. The level of competition in their home locales is likely too low, while the size of China and the lack of specialized infrastructure make it hard to concentrate them in a few super-academies.
I often hear discussions about whether some great American athlete such as Adrian Peterson could have been great at soccer. Had he stayed through elementary school in Palestine, Texas, a city of 18,712 that is very unlikely to contain even one other soccer player of similar talent, he could not have been.
This is the section where you’ll learn something about math; I’m not sure you will learn anything about soccer.
Instead of correlating soccer level with a hundred different country variables and finding spurious jelly beans, I restricted myself to testing three hypotheses regarding the effect of the national economy on soccer:
- GDP per capita will increase soccer level because richer countries can afford better sports infrastructure.
- Government spending as % of GDP will increase soccer level because there’s no private profit in youth sports but it can be pushed by a big government.
- Unemployment rate will increase soccer level, because if it’s hard to find a job you may as well play soccer.
With data from Wikipedia and the IMF, I ran a regression with all three variables and the regions of the world.
2 out of 3 on what were pretty wild guesses! I’m ready to accept my PhD in macroeconomics. Richer countries and those with bigger governments do in fact get better at soccer; unemployment showed an effect in the predicted direction, but it wasn’t significant.
Here’s the stats-geeky part of today’s show: it seemed interesting to combine the two significant variables into one – government spending per capita. That variable by itself has a strong positive correlation with soccer level, but when I added it to the regression it showed as insignificant, and suddenly so did GDP per capita! What happened to turn a good variable bad?
When you add a variable that is a product of two others to a regression, it measures the effect of the interaction between the two, not a separate factor. A high coefficient means that the effect of the first variable (here, GDP) increases when the second variable (government spending) is high and decreases when it is low, and vice versa. This doesn’t happen here: richer countries are better by the same amount regardless of the size of their government.
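Written out, a regression with a product term has the form below, where $L$ is soccer level, $G$ is GDP per capita and $S$ is government spending as a share of GDP. The symbols are shorthand introduced here for illustration, not output from the actual model.

```latex
L = \beta_0 + \beta_1 G + \beta_2 S + \beta_3 (G \cdot S) + \varepsilon
\qquad\Longrightarrow\qquad
\frac{\partial L}{\partial G} = \beta_1 + \beta_3 S
```

A $\beta_3$ close to zero means the payoff from extra GDP is $\beta_1$ whatever the size of the government, which is the "same amount regardless" case.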
Why did GDP per capita become insignificant? There aren’t huge differences in government spending between countries: it’s between 25% and 42% of GDP for the majority. GDP numbers, however, are all over the map: a quarter of the countries are in the $300-$2,000 range and another quarter are between $20,000 and $150,000. A Liechtensteinian produces more in a day than a Malawian in a year. This caused the variation in government spending per capita to be almost wholly dependent on GDP per capita, the more volatile factor. The correlation between the two is 0.95.
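A quick simulation shows why the product ends up tracking its more volatile factor. The numbers below are simulated, not the real country data: GDP per capita is drawn log-uniformly over roughly the range quoted above, and the government share is drawn independently from 25% to 42%. The sample size and the choice of distributions are assumptions.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)
n_countries = 200  # made-up sample size

# Simulated data: GDP per capita spread log-uniformly over $300-$150,000,
# government share drawn independently from 25%-42%.
gdp = [math.exp(rng.uniform(math.log(300), math.log(150_000)))
       for _ in range(n_countries)]
share = [rng.uniform(0.25, 0.42) for _ in range(n_countries)]
spend_per_capita = [g * s for g, s in zip(gdp, share)]

r_gdp = pearson(gdp, spend_per_capita)
r_share = pearson(share, spend_per_capita)
print(f"corr(GDP per capita, spending per capita)   = {r_gdp:.2f}")
print(f"corr(government share, spending per capita) = {r_share:.2f}")
```

Even with the share drawn independently of GDP, spending per capita is nearly a rescaled copy of GDP per capita, so it carries almost no information of its own.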
A linear regression doesn’t cope well with two predictors that correlate so closely, since it can’t tell which one actually affects the result. Example: sex and drugs could both correlate with rock n’ roll. However, everyone who does sex also does drugs. If you regressed rock n’ roll on both sex and drugs, you wouldn’t know if both sex and drugs caused rock n’ roll or if, for example, drugs caused extreme rock n’ roll and sex ameliorated the effect a bit. Since sex and drugs go together, either one, or both, could be the cause.
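The instability is easy to see on simulated data. The sketch below is a toy, not the regression from this post: it assumes a true model in which both predictors matter equally, fits ordinary least squares on many random datasets, and compares how much the fitted coefficient of the first predictor jumps around when the two predictors are uncorrelated versus correlated at 0.95.

```python
import math
import random
import statistics

def ols_two_predictors(x1, x2, y):
    """Least-squares slopes (b1, b2) for y = b0 + b1*x1 + b2*x2."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2  # shrinks toward zero as x1 and x2 line up
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

def slope_spread(rho, n=200, trials=300, seed=0):
    """Std. dev. of the fitted b1 across simulated datasets with corr(x1, x2) near rho."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        x1 = [rng.gauss(0, 1) for _ in range(n)]
        x2 = [rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for a in x1]
        # Assumed true model for this toy: y = x1 + x2 + noise.
        y = [a + b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
        estimates.append(ols_two_predictors(x1, x2, y)[0])
    return statistics.stdev(estimates)

spread_independent = slope_spread(rho=0.0)
spread_collinear = slope_spread(rho=0.95)
print(f"spread of b1, uncorrelated predictors: {spread_independent:.3f}")
print(f"spread of b1, predictors at r = 0.95:  {spread_collinear:.3f}")
```

The true effect is identical in both runs; only the correlation between the predictors changes, and with it how confidently the regression can split the credit.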
A problem with linear regression is that it’s just so… linear. It doesn’t account well for other types of relationships, such as threshold effects. Perhaps rock n’ roll only kicks in for a certain dose of drugs? A fun way to explore other effects is a decision tree model: each node is a yes/no question, and the path along the nodes from the root leads to a prediction of the category or variable. Here’s a tree generated with a machine learning package in R; the numbers in the bottom nodes represent the best guess of the soccer level (which mostly ranges between −1,000 and +1,000):
For each of the six extreme nodes I looked at the countries that fit the macroeconomic profile and tried to find commonalities between them. Being rich is good for soccer, especially if your pockets are full of a currency that isn’t Euros. If you do pay in Euros, you want to play soccer on a sunny Mediterranean beach. On the other side: being broke, scorching hot or having now or recently been under the thumb of communist dictators makes you bad at soccer. Basically if you’re wealthy, free and the weather is nice everything seems to be going for you, including soccer. If you live in a generally sucky place, soccer is no solace. Life just ain’t fair.
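To make the yes/no-question idea concrete, here is a minimal sketch of how a regression tree picks a single split. It is not the R package used for the tree above, and the data is invented: a hypothetical "dose" variable with a pure threshold effect.

```python
def best_split(xs, ys):
    """Find the yes/no question 'x <= t?' that minimizes squared error.

    Returns (threshold, mean_if_yes, mean_if_no).
    Assumes xs contains at least two distinct values.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        sse = (sum((y - mean_l) ** 2 for y in left)
               + sum((y - mean_r) ** 2 for y in right))
        if best is None or sse < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (sse, threshold, mean_l, mean_r)
    return best[1:]

# Toy data, invented for illustration: rock n' roll only kicks in above a dose of 11.
dose = list(range(20))
rock = [0.0 if d < 12 else 10.0 for d in dose]

threshold, low_level, high_level = best_split(dose, rock)
print(f"dose <= {threshold}? yes -> {low_level}, no -> {high_level}")
```

A full tree repeats the same search inside each branch and across every variable; the bottom-node numbers are then the branch averages, like the two levels returned here.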
A commenter on my last post claimed that “talent” is a myth and that soccer is all about hard work. I refrained from asking him if this view extends to other sports: perhaps LeBron James’ diligence made him 6’8″ with a 40+ inch vertical jump? Endurance, agility, speed, strength, quickness, balance and accuracy are all critical in soccer and are to significant degrees inborn. The most naturally gifted players train as fanatically as everyone else. The only consolation for those who rail against congenital advantages has been that at least, unlike in most other sports, short people are better at soccer. If only that were true.
Average male height correlates with national soccer level at 0.32, higher than the 0.2 for both GDP and government spending and as high as any single factor is likely to get for such a noisy measure of such a complex phenomenon.
Everyone’s favorite exception, Lionel Messi, does in fact prove the rule more than he refutes it. Messi is 170 cm tall (5’7″), barely an inch shorter than the median Argentinian. His teammates on the Argentine national team average 181.1 cm, a full 3 inches above their compatriots. Like the Little Corporal (who was shorter than Messi but taller than the average Frenchman of the time), Messi only appears short next to other soccer players. The players at the 2014 World Cup averaged 181.3 cm. The tallest in the tournament were the Germans, who ended up lifting the cup to their full 185.4 cm (6’1″) stature.
Messi’s own career was in danger when he was diagnosed with growth hormone deficiency and the Argentine clubs refused to pay for the drug. Barcelona got Leo two years of growth hormone (HGH) treatment which literally made him tall enough to play soccer, along with possibly leading to increased strength, motor development and reduced body fat. Perhaps what contributes to soccer ability isn’t height but rather HGH with all its other benefits, and height is just a measurable proxy. I couldn’t find data on international variation in HGH levels, but it definitely offers a huge boost to athletic performance (and as a result is a popular doping agent in various sports).
Soccer is a sport for short people, for values of “short” equal to “actually tall, but not freakishly tall”.
So what can China do to lift a World Cup? The easiest way is to bribe Sepp Blatter, of course. Failing that, other options are on the table: China can send the most promising players out of kindergarten to a few strong youth academies built in subtropical beachfront towns with money that isn’t denominated in Euros. Or, it may be easier to just pump them all full of Chinese HGH and set them loose on an unsuspecting soccer world.
Next post is up, my apologies to anyone who thought that this is a soccer blog ;-)