Player of Games

Does game theory apply to real life? It’s easy to fall into one of two errors:

  1. Thinking that human interactions can be precisely modeled by a 2×2 payout matrix, and being shocked by the vagaries of human psychology.
  2. Thinking that since humans aren’t rational, game theory doesn’t apply to them, and being shocked when they follow incentives in predictable patterns after all.

Of course, human behavior is a combination of both elements: the mathematical structure of payoffs, incentives, and equilibria, and the idiosyncrasies of culture, mood, and personality. Understanding both sides can be quite a superpower. I know people who made millions and bankrupted competitors by tweaking some small features of a public auction. This stuff is hard, but it works.

I took every available class on the math of game theory as an undergrad. Since then, I’ve been catching up on the squishy human part of the equation. There’s plenty of great research in this area, but the best way to learn about people is to observe them in their habitat. So when my friend Spencer told me that he’s organizing a live game theory party, I jumped at the opportunity.

I will break with tradition by neither repeating the descriptions of common games that you’ve heard 100 times before nor deriving the rational strategies and equilibria. If you need to catch up on the math, you can follow the links to Wikipedia etc.

What game is it anyway?

The first game we played had the following stated rules:

  1. Each person starts the game with 5 tokens and plays a single round with 6 different players.
  2. Each player can play “blue” or “red”, and both players reveal their choice simultaneously.
  3. The payouts are as follows: if both picked “blue” they each receive one token from “the bank”, if both picked “red” they have to pay one token to the bank, and if they show different colors, “blue” pays two tokens to “red”.

Or in matrix form:

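[Image: payoff matrix for the stated rules. Blue/blue: +1 each; red/red: -1 each; blue/red: blue -2, red +2.]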

What game is this? Take a minute to think for yourself.

If you said that this is the prisoner’s dilemma, with “blue” and “red” standing for “cooperate” and “defect”, congratulations! You paid attention in class and get an A-.
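One way to check that answer is to write the stated payouts down and test the usual prisoner’s dilemma ordering (temptation > reward > punishment > sucker’s payoff). This is only a minimal sketch; the names `PAYOFF` and `is_prisoners_dilemma` are my own labels, not part of the party rules.

```python
# Token change for (my_move, their_move), taken from the stated rules.
PAYOFF = {
    ("blue", "blue"): (1, 1),    # both receive one token from the bank
    ("red", "red"): (-1, -1),    # both pay one token to the bank
    ("blue", "red"): (-2, 2),    # blue pays two tokens to red
    ("red", "blue"): (2, -2),
}


def is_prisoners_dilemma(payoff, cooperate="blue", defect="red"):
    """Check T > R > P > S from the row player's point of view."""
    t = payoff[(defect, cooperate)][0]     # temptation: I defect, they cooperate
    r = payoff[(cooperate, cooperate)][0]  # reward: both cooperate
    p = payoff[(defect, defect)][0]        # punishment: both defect
    s = payoff[(cooperate, defect)][0]     # sucker: I cooperate, they defect
    return t > r > p > s


print(is_prisoners_dilemma(PAYOFF))  # True
```

With these numbers the ordering is 2 > 1 > -1 > -2, so on paper red dominates blue.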

For extra credit, consider that we call both the following games “prisoner’s dilemma”:

Game 1 – If you defect you get $1. If you cooperate the other person gets $1,000,000.

[Image: payoff matrix for Game 1 (“pd friendly”)]

Game 2 – If you defect, you get $1,000. If you cooperate, the other person gets $1,001.

[Image: payoff matrix for Game 2 (“pd vicious”)]

Think about what you’d actually do if you played a single round of each game with an unseen stranger. Is the first game even much of a dilemma? If you pride yourself on cooperation, would you really do so in the second game knowing that you can always donate the extra $1,000 to charity if the other person cooperates? How about if you play Eliezer’s version of the “true” prisoner’s dilemma?

“Prisoner’s dilemma” seems to cover many games that summon very different intuitions and behaviors. There’s more to understanding a game than just the preference ordering of the cells in the payoff matrix.
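To make that concrete, here is a rough sketch that builds both games from their one-line descriptions and compares them. The helper `additive_pd` and the “keep/give” framing are my own shorthand, and the ratio at the end is just one illustrative way to measure how tempting defection is.

```python
def additive_pd(keep, give):
    """Payoffs (T, R, P, S) when defecting pays you `keep`
    and cooperating pays the other player `give`."""
    temptation = keep + give  # I defect, they cooperate
    reward = give             # both cooperate
    punishment = keep         # both defect
    sucker = 0                # I cooperate, they defect
    return temptation, reward, punishment, sucker


def has_pd_ordering(t, r, p, s):
    return t > r > p > s


game1 = additive_pd(keep=1, give=1_000_000)
game2 = additive_pd(keep=1_000, give=1_001)

for name, game in (("Game 1", game1), ("Game 2", game2)):
    t, r, p, s = game
    # Extra payoff from defecting on a cooperator, relative to mutual cooperation.
    premium = (t - r) / r
    print(name, game, has_pd_ordering(*game), premium)
```

Both games pass the same ordering test, but the premium for defecting is about a millionth of the cooperative payoff in Game 1 and almost all of it in Game 2.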

But let’s get back to the version of the game I actually played at the party. Here is some more information that may or may not be relevant to analyzing it:

  1. The reward for having the most tokens (out of 30 party participants) at the end of the game was a small box of gummy bears.
  2. The participants were educated young professionals who piled a table high with snacks and drinks that they brought to the meetup.

Does this change your mind about the sort of game being played?

In the sixth and final round of the game, I was matched with a woman I had spent some time talking to before the meetup started. We had a pleasant chat and discovered that we had much in common. When our round started, I had one token of the original five remaining (can you guess my strategy?) and she had six.

She asked me what my strategy is, saying that she’s always trying to match what the other person will do. I said that I always cooperate, and showed her my single remaining token. She retorted that losing tokens isn’t proof that I cooperate. I answered that no proof is really possible, but my lack of tokens is at the very least strong evidence. Cooperating always nets you fewer tokens than defecting, so having fewer tokens has to be evidence of a cooperator.

We counted to 3. I showed blue. The lady showed red.

Who won the game?

Since I did, in fact, study game theory, I constructed a payoff matrix in my head before the game had started. I noted that I have little chance of acquiring the gummy bears, and also little desire to do so. My real matrix was as follows:

[Image: my real payoff matrix (“pd not pd”)]

The best outcome in the real PD, when I defect and she cooperates, is now my worst outcome. The only thing I gain is a nasty reputation, since the gummy bears are in any case beyond my reach. We both prefer the cells where we match our strategies to those we don’t, and we prefer that both cooperate rather than both defect. I called this game “cheap virtue signaling”, but the actual name for it in the literature is a pure coordination game.

As expected, upon seeing my blue token the woman’s face fell and she spent the next minute profusely apologizing and offering excuses for her moral failing. By defecting against me she showed not only that she’s not a nice cooperator, but also that she’s a bad judge of character since she didn’t guess my strategy.

I finished the first game with no tokens and a big smile.

Tits and tats

The next game we played had a similar structure, except that we would play four times in a row with each of two partners.

My first matchup was a guy I knew from a couple of previous meetups. I guessed that:

  1. He was a generally nice person.
  2. He knew that I’m generally a nice person.
  3. He’s not familiar with the theory of the prisoner’s dilemma iterated over a fixed number of rounds.

I announced that I would play tit-for-tat: I will cooperate in the first round and subsequently do whatever he did in the preceding round. He agreed that it’s the sensible thing to do.

We both showed blue in the first round. And in the second round. And in the third. In the fourth round, he cooperated and I defected. I was up 5 tokens, and my partner seemed more impressed than upset.

Unlike the first game, where winning came down to being a very lucky defector, the iterated game had a bit of strategy involved and I thought that mine was close to optimal. If all went as planned I would win and get to demonstrate my cleverness, along with handing out gummy bears.

On the second go-round, I was matched with a gentleman I had never seen before. I explained my tit-for-tat plans, and he nodded in agreement. The first round started – I showed blue and he showed red.

I explained that in accordance with tit-for-tat I would now have to defect in the second round, but that he could still come out ahead by cooperating. Counting from where he stood after the first round, he would end up down three tokens if he kept defecting, while if he cooperated and we then went blue-blue in the last two rounds he would be back at 0. He nodded again and said that the math made sense. The second round started – I showed red, he showed red as well.
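The token arithmetic in these matchups is easy to replay. The sketch below assumes the iterated game used the same per-round payouts as the first game (the rules only say the structure was “similar”); `ROUND` and `totals` are names I made up for the illustration.

```python
# Token change for (my_move, their_move) in one round: "B" = blue, "R" = red.
# Assumption: same payouts as the first game.
ROUND = {
    ("B", "B"): (1, 1),
    ("R", "R"): (-1, -1),
    ("B", "R"): (-2, 2),
    ("R", "B"): (2, -2),
}


def totals(my_moves, their_moves):
    """Sum token changes for both players over a sequence of rounds."""
    mine = theirs = 0
    for a, b in zip(my_moves, their_moves):
        da, db = ROUND[(a, b)]
        mine += da
        theirs += db
    return mine, theirs


# First matchup: three rounds of blue-blue, then I defect in the last round.
print(totals("BBBR", "BBBB"))   # (5, 1)

# Second matchup, rounds 2-4 only, counted from where he stood after round 1.
print(totals("RRR", "RRR")[1])  # he keeps defecting: -3
print(totals("RBB", "BBB")[1])  # he cooperates, then blue-blue twice: 0
```

The last two lines are the two options I laid out for him: three tokens down, or break even.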

I realized that the first prize is now beyond me, and didn’t bother explaining any more math since it was obvious we’re both just going to defect in the last two rounds. In the third round, I showed red and the dude showed blue, then looked dismayed at my dastardly betrayal. In the fourth round, he indignantly defected and I cooperated just to mess with him.

Outside the box

If my second partner had gone along with my not-quite-tit-for-tat, I would have ended up with 10 tokens more than I had started with. Winning 10 required that my partners cooperate 8 out of 8 times without fail and that I would also get away with two defections. But when the winner for the second game was announced, he was up 14 tokens! What possible strategy could be that much more effective than mine?

Again, I advise you to take a minute to see if you can come up with the answer.

The winning strategy was simple – a couple conspired to have the woman give away all her tokens to her boyfriend by repeatedly cooperating while he defected.

When word of this got out, some participants were miffed at the winner for cheating. I was grateful! I came to the meetup to learn about people, and creative cheating is what people do.

Be the smartest or be dumb

Next, we played two collective games that required guessing what the rest of the room would do. In each game, we had to write down a number between 0 and 10,000 on a piece of paper.

In the first game, the winner was the person whose guess was closest to the group’s average. What’s your strategy?

I took a page from the winning couple in the last game and conspired with my wife, Terese. I told her that I’m putting 10,000 on my paper. Assuming that everyone else would average to 5,000, she put down 5,200. She ended up finishing second – the average was 5,720 and the winner had guessed 5,600.
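Here is the back-of-the-envelope calculation behind that guess, as a small sketch. It assumes 30 players (the headcount mentioned earlier) and that the other 29 guesses, Terese’s included, average out to 5,000; the function name is hypothetical.

```python
def average_with_plant(n_players, plant_guess, assumed_other_average):
    """Group average if one player plants `plant_guess` and the other
    n_players - 1 guesses average out to `assumed_other_average`."""
    total = plant_guess + (n_players - 1) * assumed_other_average
    return total / n_players


shifted = average_with_plant(30, 10_000, 5_000)
print(round(shifted, 1))  # 5166.7
```

A single 10,000 drags the expected average up by about 167, which is why 5,200 was a reasonable place to land.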

In the second game, the winner was the person who would be closest to half the group’s average. I put 10,000 as usual, Terese put 400.

The distribution of guesses turned out to be this: three people put down 10,000 (including me), five people put 0, ten people put 1, and the remaining twelve were all over the range between 2 and 5,000. The average was close to 2,000, and a couple of people were closer to 1,000 than Terese was.

Wikipedia has a good analysis of the second game, which involves iterated reasoning about what the other players will do. The reasoning steps go something like this:

  1. Zeroth level thinking – thinking about other people is hard, so I’ll pick a number at random.
  2. First level thinking – if the naive average is 5,000, I’ll guess half of that, namely 2,500.
  3. Second level thinking – wait, if everyone realizes they should choose 2,500 I should choose 1,250. But everyone will realize that, so 625, then 312… I get it! The game converges to 0! I should guess 0!
  4. Third level thinking – but if everyone guesses 0, I should probably guess 1 to be unique. Then if one person guesses high, I win.
  5. Fourth level thinking – ???
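The halving chain inside the second step can be written out as a tiny loop. In this sketch `k` counts halving steps rather than the numbered levels above, and the naive average of 5,000 is the assumption from the first-level step.

```python
def level_k_guess(k, naive_average=5_000, factor=0.5):
    """Guess after k rounds of 'everyone else will think one step less':
    each step multiplies the previous guess by `factor`."""
    guess = naive_average
    for _ in range(k):
        guess *= factor
    return guess


print([level_k_guess(k) for k in range(1, 6)])
# [2500.0, 1250.0, 625.0, 312.5, 156.25]
```

Keep iterating and the guess shrinks toward 0, which is where the “I should guess 0!” conclusion comes from.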

Here’s the interesting part – writing a random number (0th level) is a reasonably good strategy that sometimes wins, especially if you adjust your guess based on experience with the group. But putting down 0 or 1 (2nd or 3rd level thinking), which half the people did, is a terrible strategy that almost never wins.

If a lot of people guess 1 they will at best split the prize, and it often takes just two players going high to dominate that strategy. In the presence of a single troll who puts 10,000 (which 10% of the people did in my case), 0s and 1s lose to all the guesses in the 2-350 range. People can end up in that range either with 4th+ level thinking, or just 0th level thinking and some luck. Third level thinking, in this case, works if and only if everyone else is on level 2, no higher and no lower. Picking a random low number, on the other hand, is quite robust to distributions of other players’ thinking levels and always gives you some chance of victory.
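The single-troll claim can be checked with a few lines of arithmetic. The room below is an assumption made for illustration: 30 players, one troll at 10,000, one player making the guess being tested, and everyone else guessing 1. Under those assumptions the winning range comes out as 2 to 344, close to the rough figure above.

```python
def beats_the_ones(guess, n_players=30, troll_guess=10_000):
    """True if `guess` lands strictly closer to half the average than a guess of 1.

    Assumed room: one troll, one player making `guess`, everyone else guessing 1.
    Distances are scaled by 2 * n_players so everything stays in integers.
    """
    n_ones = n_players - 2
    total = troll_guess + guess + n_ones * 1
    scale = 2 * n_players  # target = total / scale
    my_distance = abs(scale * guess - total)
    ones_distance = abs(scale * 1 - total)
    return my_distance < ones_distance


winners = [g for g in range(0, 10_001) if beats_the_ones(g)]
print(winners[0], winners[-1])  # 2 344
```

Mixing some 0s in with the 1s, or adding a second troll, shifts the upper end of the range but not the basic picture.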

If you can’t be the smartest person in the room (fourth level and up, in this case), you want to be as many levels behind the frontier as you can – stupid and unpredictable. Going through 100 steps of reasoning in a game like this is guaranteed to lose if at least one person has gone through 101, or if everyone is at 30.

Being smart is only worth it if you’re smarter than everyone else. If you aren’t, it’s better to be dumb.

This logic applies to many situations where a lot of people are trying to guess at once what everyone else is thinking. The stock market offers a great illustration of this.

Suppose that you’re a trader, and you’ve followed a stock that has traded around $10 for a while. Everyone seems to agree that $10 is the fair price for it. An analyst report is published by a small sell-side bank (i.e., a bank whose job it is to sell you stocks) that praises the stock and gives it a “buy” rating, without disclosing any new insight or information. The stock immediately jumps to $14 and keeps trading there.

What’s the fair value of the stock? Should you buy or sell it at $14?

We can imagine a chain of reasoning similar to the averages game, with each step taking more and more of other people’s thinking into account.

  1. I’m dumb, the markets are efficient, and if the stock trades at $14 then the fair price is $14.
  2. The stock was worth $10 five minutes ago, and the only thing that happened was the analyst report. But sell-side analyst reports are worthless. Obviously, they’re going to tell you that the stock they’re selling is great! Everyone knows that analyst ratings don’t mean anything and there’s no reason to update – Matt Levine said so. The fair price is $10.
  3. Ok, somebody thinks that the stock is worth at least $14. Otherwise, they wouldn’t keep buying it at $14. So if I think that it’s worth $10 and they think $14, I’ll be humble and accept that the true answer is somewhere in the middle. The fair price is maybe $12.
  4. Wait, there are also the people selling at $14. Why aren’t they selling at lower prices? If all the sellers thought that the fair price is $12, they would compete with each other to sell until the price fell. If I try to sell the stock at $13.99 and succeed, it means that nobody else was willing to sell that low, so not only the buyer but also the sellers all think that the price is at least $14. Fuck it, I guess $14 is the right price after all.

The higher up the chain of reasoning you go, the closer your answer gets to the position of maximum ignorance. In the stock market, being smart is worthless if you’re not the smartest, and being informed is worthless if you’re not the first to be informed.

If you want to read more in-depth examples of this sort of thing by someone who knows game theory, hop over to Zvi’s blog.

Theorem of the troll

In the game where the goal was to guess half the group’s average, three people guessed 10,000 – a number that is certain to lose. I did it to conspire with my wife, and the other two people did it… for the lulz. They didn’t even plan to tell anyone, I had to dig through the slips of paper to find their names and ask them.

And what about my second partner in the iterated PD/virtue signaling game, the one who messed up my tits and tats? Was he too stupid to understand the strategy? Was he playing 12-dimensional chess beyond my comprehension? Was he just trolling?

Jacob’s Theorem of the Troll – If you play a game with more than N people, at least one of them is a troll who will play the game in the most incorrect way possible.

Or from another angle: if at least N people are playing a game, at least one of them is playing a different game.

What’s N? If the stakes are low and the game isn’t the most interesting thing going on, N can be as low as 4-5 players. I think that in most cases an N of 25 is sufficient to expect a troll, loosely riffing off Scott’s Lizardman’s Constant, which is equal to 4%.
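For what it’s worth, here is the arithmetic behind N = 25 as a quick sketch. It assumes each player is independently a troll with probability 4%, which is my simplification and not something the Lizardman’s Constant itself claims.

```python
def troll_odds(n_players, troll_rate=0.04):
    """Expected number of trolls and chance of at least one, assuming
    each player is independently a troll with probability `troll_rate`."""
    expected = n_players * troll_rate
    p_at_least_one = 1 - (1 - troll_rate) ** n_players
    return expected, p_at_least_one


for n in (5, 25, 100):
    expected, p = troll_odds(n)
    print(n, round(expected, 2), round(p, 2))
```

At 25 players the expected number of trolls is exactly one, and the chance of at least one is roughly 64%, so “expect a troll” is meant in the averaging sense rather than as a guarantee.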

You think that you have a game figured out, that the rules and the incentives and the strategies are clear for all to see, and yet without fail somebody will be doing the exact opposite of what they’re supposed to do. Maybe they’re confused. Maybe they are facing pressures and incentives that you’re not aware of. Maybe they’re much dumber than you are, and maybe much smarter. Maybe they just want to watch the world burn.

The theorem of the troll is true for your pickup soccer game, for an orgy, for your team at work, for geopolitics, for the NBA. With enough participants, a troll is inevitable.

Two examples:

The paper industry is perhaps the most boring industry in the world. The global market for paper products barely changes from year to year, and the market in each region is dominated by a few huge companies making small but stable profit margins. There’s little ongoing innovation in paper products, and a paper mill costs millions to build and runs 24/7 for a century.

If a single company started building new mills and increased supply beyond the current demand, it would crash the market and cause every single paper company to go from making small profits to suffering huge losses. But because there are so few players, it is possible to avoid trolls. No one increases capacity, and the industry keeps chugging along successfully.

On the other hand, the restaurant and bar industry in New York has around 25,000 participants. The average profit margins are 3-5%, which isn’t enough to cushion against the wild swings in the ebb and flow of business. 60% of NYC restaurants go out of business within three years of opening. And a big reason why it’s so hard to run an NYC restaurant at a profit is the staggering number of restaurants that are trolling the industry by operating at a loss.

Why would someone run a business at a loss? Because they always dreamed of doing it and never did the math. Because they suck at restaurateuring but are too arrogant to admit it. Because they’re burning through a loan and are going to declare bankruptcy anyway. Because they’re trying to make up for negative margins by increasing volume. You can open the best taco joint in the world and someone will troll you by selling cheap crap tacos across the street at a massive loss.

If you’re playing a game that’s sensitive to other people’s behavior, ignore the troll theorem at your peril.

Trolls vs. equilibria

But there’s an upside to trolls – they allow us to break out of inadequate equilibria.

Eliezer’s book describes how civilizations end up in traps where everyone is unhappy, but no one can fix their own or everyone else’s situation by changing their own behavior. A classic example is academic publishing. Researchers have to submit to prestigious journals so they can get good jobs, universities have to hire based on publications in prestigious journals so that they get good researchers, and readers have to pay for the journals to read the good research.

Of course, companies like Elsevier rope off the prestigious journals and extract massive rents in money and effort from all the other participants without contributing anything to the advancement of science. Nobody can unilaterally stop using the prestigious journals – neither readers, nor researchers, nor universities. They’re all stuck in a bad equilibrium.

Except for people like Andrew Gelman, a prestigious academic who trolls journals with great gusto. He publishes his research early and for free on his website, takes blog comments as seriously as peer review, and occasionally muses about an academic world in which journals no longer exist.

And of course, there’s Sci-Hub (check the first link in this article), whose entire existence is nothing but a giant middle finger aimed at paywalled journals. Sci-Hub lets you access almost any journal article for free, and yet I’ve donated more money to it than I would ever have spent on buying PDFs. There’s no payoff matrix that explains why I donate to Sci-Hub and refuse to pay for journals; I’m just trolling the paywalled academic publishing game in the hope that it dies.

Trolls mess up your careful strategies and favorite taco joints, but they also topple dictators, bust monopolies, and pirate journal articles. If you’re applying game theory to people, you have to account for the trolls because trolling never dies. Long live the trolls!

5 thoughts on “Player of Games”

  1. The last part reminds me of a piece by Nate Soares on what used to be called “Dark Arts” in the halcyon days of 2014.

    “In many games there is no “absolutely optimal” strategy. Consider the Prisoner’s Dilemma. The optimal strategy depends entirely upon the strategies of the other players. Entirely.

    Intuitively, you may believe that there are some fixed “rational” strategies. Perhaps you think that even though complex behavior is dependent upon other players, there are still some constants, like “Never cooperate with DefectBot”. DefectBot always defects against you, so you should never cooperate with it. Cooperating with DefectBot would be insane. Right?

    Wrong. If you find yourself on a playing field where everyone else is a TrollBot (players who cooperate with you if and only if you cooperate with DefectBot) then you should cooperate with DefectBots and defect against TrollBots.

    Consider that. There are playing fields where you should cooperate with DefectBot, even though that looks completely insane from a naïve viewpoint. Optimality is not a feature of the strategy, it is a relationship between the strategy and the playing field.”

    You know how sometimes advice can sound stupidly obvious, but be worded in such a way that it slips through someone’s psychological barriers and fixes certain ways in which they were thinking poorly? This bit was kind of like that for me back then. It really helped me get out of some weird underconfidence I had about trusting myself to notice when I was in a strange position.
