Monday, April 18, 2016

What's the Expected Marginal Impact of Voting?

I've always felt conflicted about voting. On one hand, I like the idea of participating in democracy. On the other hand, there is almost no chance of an election being so close that my vote, one out of millions, will break a tie and have a marginal impact.

In the past, I have argued it is so unlikely my vote will matter that voting is not worthwhile, even if I altruistically account for the vast number of people whom elections affect. That is, I've argued that the expected outcome of my vote (the chance it will swing the election, times the impact that would have on each person, times the number of people the election impacts) is next to nothing. The expected causal effect of voting is so small, I claimed, that it would be more altruistic for me to write a nice letter to my grandmother than to vote.

Last week I finally thought through it carefully... and it turns out I was wrong.

I now think the expected net impact of one vote in a typical US presidential election is on the same order of magnitude as the impact of the election on one eligible voter. So if you care about the way your vote will affect the rest of the country and the world (and you think you know what effect it will have!), voting may be a very valuable use of your time.

In this post, I'll explain my old argument against voting, show why it was wrong, and, with minimal amounts of math, ballpark a very rough estimate of the expected marginal impact of a vote. (If you're not interested in the fallacious argument, just skip ahead.)

Disclaimers

  1. This whole post works within the framework of a plurality (whoever gets the most votes wins) election between two parties. The electoral college is somewhat more complicated, with aggregation of votes happening at more local levels. For the purpose of the ballpark arguments in this post, I don't think these details matter, but I'm happy to discuss in the comments if anyone disagrees.
  2. I'm also ignoring the possibility that the election will be tied after I vote, and I'm ignoring the fact that very close elections are decided by complicated political processes I don't understand. Again, I don't think these matter to first order, but I'm happy to discuss in the comments.
  3. There are lots of arguments for (and against) voting, and their omission does not represent a stance on any of them. I am simply focusing on causal impact.
  4. None of the reasoning or math in this post is particularly sophisticated. I write it not because it is interesting but because it is important! And because I owe it to everyone to whom I've made the wrong argument.


My old, fallacious argument against voting

My thinking went like this: If we anonymize all of the $N$ ballots in an election and then consider them one by one, we can think of each as an identical random variable $X_i$. This random variable can take the value $-1$ (Democrat), $0$ (didn't vote), or $1$ (Republican). Since I don't know who voter $i$ is, and since he or she might make a mistake or forget to vote, $X_i$ could equal any of $-1, 0,$ or $1$, so it has some variance $\sigma^2$. Further, there is some bias to the votes, i.e., people on average slightly prefer one candidate to the other, so $X_i$ has some non-zero expected value $b$.

What is the chance that I swing the election? It's the same as the chance that the election was tied before I voted. The probability of a tie is just
$$P(\sum X_i = 0) = P(\sum X_i \leq 0) - P(\sum X_i \leq -1) \\
= P(\sum (X_i - E(X_i)) \leq -b \cdot N) - P(\sum (X_i - E(X_i)) \leq -b \cdot N - 1) \\
= P(\frac{1}{\sigma \sqrt{N}} \sum (X_i - E(X_i)) \leq \frac{-b \sqrt{N}}{\sigma}) - P(\frac{1}{\sigma \sqrt{N}} \sum (X_i - E(X_i)) \leq \frac{-b \sqrt{N} - 1/\sqrt{N}}{\sigma})$$
Now, applying the central limit theorem (which tells us that the average of many independent, identically distributed, mean-zero random variables converges to a normal distribution distributed "very tightly" around zero), we can say that the probability of a tie is
$$\approx P(Z \leq \frac{-b \sqrt{N}}{\sigma}) - P(Z \leq \frac{-b \sqrt{N} - 1/\sqrt{N}}{\sigma}) \\
\approx \frac{1}{\sigma \sqrt{N}} \cdot \phi(-b \sqrt{N} / \sigma) = \frac{1}{ \sigma \sqrt{2 \pi N}} e^{-\frac{b^2 N}{2 \sigma^2}}
$$
This is absurdly small for large $N$ (many voters). But more importantly, even when we compute my expected impact, multiplying the chance of swinging the election by the number of voters $N$ and the impact $I$ of the election on each voter, it is still basically zero. For example, if we suppose that the election has a whopping $\$$100,000 effect on each voter, that there are only one million voters, that voters are biased toward the Democrats by only half a percent (so that $|b|=0.01$; only $b^2$ enters the formula), and that $\sigma = 1$, the expected impact of my vote is
$$\text{(number of voters)} \cdot \text{(impact per voter)} \cdot \text{(probability of impact)} \\
= N \cdot I \cdot \frac{1}{\sigma \sqrt{2 \pi N}} e^{-\frac{b^2 N}{2 \sigma^2}} \\
= \frac{I}{\sqrt{2 \pi} \sigma} \sqrt{N} e^{-\frac{b^2 N}{2 \sigma^2}} \\
\approx \frac{10^5}{\sqrt{2 \pi} \cdot 1} \sqrt{10^6} e^{-\frac{10^{-4} \cdot 10^6}{2 \cdot 1}} \\
= \$0.0000000000000077
$$
If that didn't make much sense to you, here's the basic intuition: A million votes being cast is like a million coins being flipped. If each of the coins is weighted a bit toward heads, then after enough flips, more than half will be heads. The chance that there are comparably many heads and tails is exceptionally small; in fact, it falls off exponentially as the number of flips increases. Similarly, when the average voter is a little biased toward the Democrats, the chance that all of the random factors (mis-checked boxes, people forgetting to go vote after work) favor the Republicans enough to cancel out that bias declines exponentially as the number of voters increases. Meanwhile, the impact of the election's outcome grows only in proportion to the number of voters. And an exponential decline will always dominate a linear growth when the numbers are big.
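For readers who want to check the arithmetic, here is a minimal Python sketch of the fixed-bias estimate. The function names are my own, and the inputs are the ones from the example above ($N = 10^6$, $I = \$100{,}000$, $|b| = 0.01$, and $\sigma = 1$ as in the calculation); it reproduces the fallacious argument, not a correct model.

```python
import math


def tie_probability_fixed_bias(n_voters, bias, sigma=1.0):
    """Normal-approximation chance of an exact tie when every vote has the same known mean."""
    exponent = -(bias ** 2) * n_voters / (2 * sigma ** 2)
    return math.exp(exponent) / (sigma * math.sqrt(2 * math.pi * n_voters))


def expected_impact_fixed_bias(n_voters, impact_per_voter, bias, sigma=1.0):
    """(number of voters) * (impact per voter) * (probability of a tie)."""
    return n_voters * impact_per_voter * tie_probability_fixed_bias(n_voters, bias, sigma)


# Values from the example in the text: N = 10^6, I = $100,000, |b| = 0.01, sigma = 1.
impact = expected_impact_fixed_bias(n_voters=10**6, impact_per_voter=1e5, bias=0.01)
print(f"expected impact: ${impact:.1e}")

# With the bias held fixed, the estimate keeps collapsing as the electorate grows.
for n in (10**6, 2 * 10**6, 4 * 10**6):
    print(n, expected_impact_fixed_bias(n, 1e5, 0.01))
```

The first printed value is on the order of $10^{-15}$ dollars, in line with the figure above, and each doubling of $N$ shrinks it by many more orders of magnitude.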

Why that argument was wrong

In the argument above, I said "there is some bias to the votes, i.e., people on average slightly prefer one candidate to the other" and interpreted this by giving each vote some non-zero mean $b$. While it's true that the votes will almost certainly have some bias one way or the other, it's suspicious of me to treat this bias as a fixed number, because I am uncertain of its value. Rather, I should model $b$ as another random variable. Of course, I can assign near-zero probability to $b=0$, but even an infinitesimal chance that $b=0$ may matter after I multiply it by $N$, the large number of voters whom the election affects.

Another way to understand the fallacy in my argument is to see that I misapplied the central limit theorem. If we interpret $b$ not as the actual bias of each vote, but rather as my best guess of the bias of each vote, then the $X_i$ are not independent: if the first hundred votes I observe have a sample mean that is, say, less than $b$, then I will conclude that $b$ was probably an overestimate and expect the one-hundred-and-first vote to be less than $b$ on average. Since the $X_i$ aren't independent, I can't apply the (standard) central limit theorem.
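To see how much this matters, here is a rough Python sketch that treats $b$ as uncertain and averages the fixed-bias tie probability over a prior on $b$. Everything specific here is an assumption for illustration: the normal prior, its mean of $0.01$ and standard deviation of $0.02$, and the function names. The votes are treated as independent only once $b$ is given.

```python
import math


def tie_probability_given_bias(n_voters, bias, sigma=1.0):
    """Normal-approximation chance of a tie if the bias were known exactly."""
    exponent = -(bias ** 2) * n_voters / (2 * sigma ** 2)
    return math.exp(exponent) / (sigma * math.sqrt(2 * math.pi * n_voters))


def normal_pdf(x, mean, sd):
    """Density of a normal distribution, used here as a hypothetical prior on b."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def tie_probability_uncertain_bias(n_voters, prior_mean, prior_sd, sigma=1.0, steps=2001):
    """Average the known-bias tie probability over the prior (trapezoid rule)."""
    # The integrand is negligible beyond a few multiples of sigma / sqrt(N) from b = 0.
    half_width = 10 * sigma / math.sqrt(n_voters)
    db = 2 * half_width / (steps - 1)
    total = 0.0
    for k in range(steps):
        b = -half_width + k * db
        weight = 0.5 if k in (0, steps - 1) else 1.0
        total += weight * tie_probability_given_bias(n_voters, b, sigma) * normal_pdf(b, prior_mean, prior_sd)
    return total * db


n = 10**6
p_fixed = tie_probability_given_bias(n, 0.01)
p_uncertain = tie_probability_uncertain_bias(n, prior_mean=0.01, prior_sd=0.02)
print(p_fixed, p_uncertain, normal_pdf(0.0, 0.01, 0.02) / n)
```

Under these assumptions, the averaged tie probability comes out close to the prior density at $b = 0$ divided by $N$. It falls off like $1/N$ rather than exponentially, so multiplying by $N$ no longer sends the expected impact to zero.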

The (actual) expected impact of voting 

Rather than separately considering many sources of uncertainty (what the net bias of the population is, how many people will accidentally check the wrong box, who will forget to show up), we can model them all simultaneously, by thinking about my subjective probability distribution on the sum of the votes.

So, the day of a presidential election, what does my subjective distribution of $\sum X_i$ look like? A quick Google search suggests that on the day of an election, betting markets typically reflect about 90% odds in favor of one candidate. If I knew better than the betting markets, I could be making a lot of money, so it's reasonable to assume my beliefs are similar to theirs. That means I assign a 10% chance of the predicted-to-lose candidate getting 50% or more of the vote, and, since there's basically no chance that the predicted loser gets more than 60% of the vote, we can say I assign a 10% chance that the predicted loser gets between 50% and 60% of the vote.

So how likely is it that there is a tie? We'd expect that if the predicted loser wins, it will be by the skin of their teeth. But to be very conservative, let's say it's as likely that the predicted loser gets 50.0% as it is they get 50.1%, as it is they get 50.2%, ..., all the way to 60%. That is, conditional on winning by between 0% and 10%, each number of votes the predicted loser might get is equally likely. Receiving 50% to 60% of the vote corresponds to receiving $\frac{5N}{10}$ to $\frac{6N}{10}$ actual votes, so there are $\frac{N}{10}$ possible numbers of votes that candidate might receive. So if each such number is equally likely, then there is a $10\% / (\frac{N}{10}) = 1/N$ chance that they get exactly 50% of the votes.

If there is a $1/N$ chance that one candidate gets exactly half the votes, then there is a $1/N$ chance that I swing the election. So the expected impact of my vote is just
$$\text{(number of voters)} \cdot \text{(impact per voter)} \cdot \text{(probability of impact)} = N \cdot I \cdot 1/N = I$$
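The same arithmetic can be checked exactly with Python's `fractions` module. This is only a sketch: the function names, the turnout of 130 million, and the per-voter impact of $\$10{,}000$ are hypothetical values chosen for illustration; the 10% upset probability and the 10-point range come from the argument above.

```python
from fractions import Fraction


def tie_probability_from_odds(n_voters, upset_probability, upset_share_range):
    """Chance of an exact tie if the underdog's vote count is uniform over the upset range."""
    possible_counts = upset_share_range * n_voters  # number of equally likely vote totals
    return upset_probability / possible_counts


def expected_impact(n_voters, impact_per_voter, tie_probability):
    """(number of voters) * (impact per voter) * (probability of impact)."""
    return n_voters * impact_per_voter * tie_probability


n = 130_000_000  # hypothetical number of voters
p_tie = tie_probability_from_odds(n, Fraction(1, 10), Fraction(1, 10))
print(p_tie == Fraction(1, n))            # the tie probability is exactly 1/N
print(expected_impact(n, 10_000, p_tie))  # the N cancels, leaving I
```

Changing `n` changes the tie probability but not the expected impact, which is the point of the cancellation.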

Wait, really?

How can this be? No US presidential election has ever come within one vote. Is it really reasonable to think this might happen? 

These questions are tempting, but ultimately misguided. We've never seen a tie before, and, since there is only a $1/N \approx 0.0000003\%$ chance of it happening in each presidential election, we shouldn't expect that we ever will. But on the off-chance that there is a tie, each vote will have a marginal impact whose magnitude is as large as the off-chance is small. Since our brains are bad at understanding both tiny probabilities and huge impacts, and since this problem requires us to weigh the two against each other, we shouldn't really expect this to be intuitive.

Loose ends

So far, I've left all estimates in terms of $I$, which I've called the average impact of the election on a voter. By this, I mean the expected difference in outcomes for an average person if your preferred candidate is selected instead of the other one.

It's important to be aware that $I$ may be negative; you might choose a candidate who will actually do a lot of harm to other voters (not to mention the rest of the world!). If you are really humble, you might think that you have no better idea of what is good for people than does anyone else, in which case your $I$ is about zero, and you'll need to find some other reasons to vote.

However, you might also think that you are better informed or better educated than other voters, or that your values are "better" than theirs in some moral sense. In this case, $I$ could be quite large, since different presidents have significantly different priorities. I'll guess that I'd put my own $I$ in the range of thousands or tens of thousands of dollars (though I cringe at the idea of trying to monetize outcomes across such a wide swath of topics, as well as at having to put a number on something about which I'm so uncertain). This is huge, considering that voting will take me at most a few hours.

tl;dr

In case you weren't going to already, you should really vote, and you should make an informed decision about whom to support. It is unfathomably unlikely that you will swing the election, but if you do, you will impact an unfathomably large number of people.

Thanks

To Margaret, for having a conversation about voting that finally prompted me to formalize these arguments. To Jake, for some helpful edits and comments.
