Ladies, gentlemen, and cats walking randomly on keyboards: welcome to Putanumonit, the blog that puts a number on it. I have just written a long post detailing the philosophy and goals of this blog. This is not that post. Between us, philosophy just isn’t the most exciting subject for our first date. Let us instead grab a bottle of the second cheapest wine on the menu and jump straight into the claptrap that this blog will one day be famous for: answering questions of low consequence using numbers of low soundness.
Question #1 – Is putanumonit.com a good domain name?
What’s in a (domain) name? It’s intuitively obvious that shorter names made up of common words and a .com TLD are better than whtthehlllisthiscrp.info, but this blog isn’t about obvious intuitions. Let’s put a number on it!
MOZ has a ranked list of the top 500 domain names, going from facebook.com, twitter.com and google.com to 123-reg.co.uk and pcworld.com. Let’s see how the rank (lower is better) depends on the domain name length (between the www. and the .xyz):
We see a very slight upward trend in rank as length increases, but at least it’s in the right direction. There is one interesting thing that pops out: there are quite a few domain names of more than 11 characters, but not a single one is ranked higher than 59th (creativecommons.org). Here’s another visualization of the same data:
The height of the bar at a number N is the difference between the average ranks of domains longer than N and domains with up to N characters. Going from 2 to 5 letters only helps you, and 6-9 appears to be the sweet spot (facebook = 8, twitter = 7, google = 6). The greatest difference is between names with up to 11 and over 11 characters: 58 ranking spots! The 12th letter costs 20 spots by itself.
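For the curious, here is a rough sketch of how a bar height like that could be computed. It is not the code behind the chart: the function names `label_length` and `rank_gap` are my own, and the toy list uses made-up domains and made-up ranks rather than the Moz data.

```python
from statistics import mean


def label_length(domain: str) -> int:
    """Number of characters between an optional 'www.' and the first dot."""
    name = domain.lower()
    if name.startswith("www."):
        name = name[4:]
    return len(name.split(".")[0])


def rank_gap(ranked, n):
    """Average rank of domains longer than n characters minus the
    average rank of domains with up to n characters.

    `ranked` is a list of (domain, rank) pairs. Returns None when one
    of the two groups is empty, since the gap is undefined there.
    """
    longer = [rank for domain, rank in ranked if label_length(domain) > n]
    shorter = [rank for domain, rank in ranked if label_length(domain) <= n]
    if not longer or not shorter:
        return None
    return mean(longer) - mean(shorter)


# Made-up domains and ranks, for illustration only (NOT the Moz list).
toy = [
    ("www.cats.com", 1),
    ("keyboard.com", 2),
    ("putanumonit.com", 3),
    ("secondcheapestwine.org", 7),
    ("whtthehlllisthiscrp.info", 9),
]

for n in (4, 8, 11):
    print(n, rank_gap(toy, n))
```

A positive gap means the longer names rank worse on average, which is what a tall bar in the chart stands for.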
A look at the TLDs confirms our intuitions as well: .com is 10 times as popular as any other TLD, and correlates with a very slight (15 spots) improvement in rank.
Focusing on blogs exclusively, this snapshot of Technorati’s top 100 blogs of 2014 tells a similar story. The median domain length is exactly 11, and .coms rule 88 of the top 100. Perhaps the success of all those blogs is due to top-notch content, professional teams and affiliations with media empires rather than a good domain name, but who knows? Until I get the content, the team and the empire, it makes sense to launch at a .com domain comprising no more than 11 characters: putanumonit.
Question #2 – Is that result about 11 letters statistically significant?
If you test the hypothesis “Domains with 12 or more letters are lower ranked” by itself, you get a p-value of ~1% and can conclude that it’s statistically significant. That would also be a lie and a damned lie for a number of reasons I’ll get to in later posts. Remember kids: p-values are like drugs. They’re enticing, but they’re not your friends.
Significance aside, the chart above still looks cool and has non-zero educational value. Putanumonit will vary in rigor level from “here’s a number I found Googling, let’s tell a story about it!” (rigor minimis) to “here’s a huge dataset, let’s apply quadratic discriminant analysis to it!” (rigor mortis). The number of bad puns I use is a good measure of the seriousness of each post.
Question #3 – With the domain choice out of the way, let’s look at content. What will this blog actually be about?
My favorite site that puts numbers on things is fivethirtyeight.com, which publishes articles in 5 main categories: politics, economics, science, life and sports. Their site doesn’t have a categorized archive, but we can use a trick to estimate the number of posts by category: each category tab lists the last 10 articles written in it. Subtracting the date of the 10th article from today’s date gives the time it took to publish those 10 articles. Dividing that time by 10 gives the average time between publications, which is the inverse of the publication rate:
Politics: 10 articles in 9 days, 1.11 per day or 19% of total articles.
Economics: 10 articles in 21 days, 0.48 per day or 8% of total articles.
Science: 10 articles in 35 days, 0.29 per day or 5% of total articles.
Life: 10 articles in 7 days, 1.43 per day or 25% of total articles.
Sports: 10 articles in 4 days, 2.5 per day or 43% of total articles.
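The arithmetic behind that list can be reproduced in a few lines. This is just a sketch that follows the same recipe (10 articles divided by the days they span, then each rate divided by the sum of all rates); the variable names are mine.

```python
# Days spanned by the 10 most recent articles in each category,
# taken from the list above.
days_for_ten = {
    "Politics": 9,
    "Economics": 21,
    "Science": 35,
    "Life": 7,
    "Sports": 4,
}
ARTICLES = 10

# Publication rate in articles per day.
rates = {cat: ARTICLES / days for cat, days in days_for_ten.items()}

# Each category's share of the combined rate across all five categories.
total = sum(rates.values())
shares = {cat: rate / total for cat, rate in rates.items()}

for cat in days_for_ten:
    print(f"{cat}: {rates[cat]:.2f} per day, {shares[cat]:.0%} of total")
```

The shares only cover these five tabs, so anything the site publishes outside them is not counted.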
Politics is a strong category for fivethirtyeight (which is named after the number of electors in the US Electoral College) and the subject area that launched Nate Silver into geeky stardom. Unfortunately, I am not overly interested in it and am currently not eligible to vote in any country on the globe. I’ll write about metapolitics occasionally, including an analysis of the downsides of voting and what you can do instead on election day to achieve the same impact on democracy.
No one cares about economics and I slept through macroecon class, so this category may be sparse.
I do like science, but I don’t really have any original research to publish. I’ll occasionally critique the statistical “analysis” of some research papers that I come across; there’s one very popular TED talk in particular that I have my sights on.
Three quarters (or 403.5/538) of fivethirtyeight is not about life! I wonder what it’s about, then. I have a few posts planned on analysis of romance, from presenting a persona (or online profile) to choosing a long-term partner. I also have another writeup on the estimated cost and importance of providing every person in Africa with access to clean water. If you disagree that quantitative analysis can teach people about dating or about helping the needy, maybe this blog isn’t for you.
Finally, sports is the #1 topic on fivethirtyeight! Who knew that so many people are into both sports and math? I only watched Moneyball for Brad Pitt’s dreamy eyes. Kidding aside, this is right in my wheelhouse. Sports posts in the pipeline: why some countries are good at soccer and India isn’t, how to optimize betting auctions like fantasy, why you should ignore and/or reverse most statistics-based advice you hear about picking NFL games, and much more.
Question #4 – Are you trying to imitate fivethirtyeight.com?
I’m mostly hoping to become 1/538th as good. In seriousness, my approach is quite different. Fivethirtyeight focuses on journalism: collecting and presenting a lot of information, with the analysis and narrative attached. I will focus on sharing (and discovering for myself) the art of understanding the world with numbers, hopefully inspiring my readers to apply the approaches I use in their own contexts.
Next post: my overarching philosophy and approach of putting numbers on things.