by Mike Ossipoff

To write to Mike write nkklrp before the "@" sign, and then write hotmail.com after the "@" sign.

The strategy described in Approval Strategy III assumed that there are 2 candidates who are perceived as the likely frontrunners.

Now, say that the election is not zero-info, and that there are no completely unacceptable but winnable candidates, and that there's no estimate that some 2 candidates are more likely to be frontrunners.

In this message I'm going to describe a suggestion for estimating the Pi from the Wi: that is, for estimating the candidates' individual probabilities of being in a tie or neartie if there is one, from the candidates' individual estimated probabilities of winning. I'll call that Tideman's estimating method. But first let me mention a few other possibilities, only one of which seems as good as Tideman's estimate.

Of course you could just try to estimate Pij for each possible pair of candidates i & j. That would be difficult. I doubt that someone would have a feel for those probabilities, to estimate them.

Easier would be estimating directly the Pi. But, again, one doesn't have an intuitive feel for a candidate's probability to be in a tie or neartie if there is one. One could use those 2 methods, but the estimates won't come from an intuitive feel for those probabilities.

Here's another possibility: For a few candidates who seem to be the "villains", rate them numerically according to how likely they are to be the worse candidate in a tie or neartie--how likely they are to be the villain of the piece. Divide each villain's rating by the sum of the ratings of all those likely villains whom you've rated, to get that candidate's estimated probability of being the villain of the tie or neartie. Remember that we're not trying to do this exhaustively. We only rate a few most likely villains, and we assume that one of them _will_ be the villain of the tie or neartie.

Then, for each of those villains, rate each of the candidates better than him according to how likely they seem to be the hero who will tie or neartie him.

Again, for each of the heroes considered with respect to a particular villain, divide each hero's hero rating by the sum of all the heroes' hero ratings for that particular villain, to get that hero's probability of being the hero in the tie or neartie with that villain, in the event that that villain is the one who is worst in a tie or neartie.

While this procedure ignores all but the most likely villains, and, for each villain, ignores all but the most likely heroes who could tie or neartie him, it's accurate enough, because it considers the most likely heroes & villains. Of course they're the ones whose likelihoods are most natural to estimate anyway.

I repeat that approximations and simplifications can be justified by pointing out how great are our uncertainties in estimating probabilities and utilities.

So, the above procedure gives you the Pij for the more likely tie/neartie pairs of candidates. Not exhaustive, but good enough. I'll call that method hero/villain. Its estimates seem to be estimates that we'd have an intuitive feel for.
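The hero/villain procedure above can be sketched roughly as follows. This is only an illustration: the function name, the candidate labels, and the ratings are made up, and the sketch assumes that a pair's Pij is the villain's probability multiplied by the hero's probability given that villain, which the description implies but doesn't spell out.

```python
def hero_villain_pij(villain_ratings, hero_ratings):
    """Estimate Pij for the likeliest (hero, villain) pairs.

    villain_ratings: {villain: rating}, how likely each rated villain is
        to be the worse candidate in a tie or neartie.
    hero_ratings: {villain: {hero: rating}}, how likely each better
        candidate is to be the one who ties or nearties that villain.
    Returns {(hero, villain): estimated probability}.
    """
    villain_total = sum(villain_ratings.values())
    pij = {}
    for villain, v_rating in villain_ratings.items():
        # Probability that this villain is the villain of the tie/neartie.
        p_villain = v_rating / villain_total
        heroes = hero_ratings[villain]
        hero_total = sum(heroes.values())
        for hero, h_rating in heroes.items():
            # Assumption: pair probability = P(villain) * P(hero | villain).
            pij[(hero, villain)] = p_villain * (h_rating / hero_total)
    return pij


# Hypothetical ratings, for illustration only.
villains = {"D": 3, "C": 1}
heroes = {"D": {"A": 2, "B": 1, "C": 1}, "C": {"A": 1, "B": 1}}
pij = hero_villain_pij(villains, heroes)
```

Because only the rated villains and heroes are considered, and one of the rated villains is assumed to be the villain, the estimates for the rated pairs add up to 1.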

Now, Tideman's estimating method. Tideman & hero/villain are both good, for the situation described above when something like them is needed.

Tideman's estimate is based on the idea that we have a good feel for how surprised we'd be if a candidate won, or how confident we feel that a candidate will win, or how much we fear that a candidate will win. So we estimate the candidates' win-probabilities, the Wi.

The argument is that it's a reasonable guess to assume that the candidates' Pi are proportional to the square root of their Wi. I'll tell how that guess is justified:

Tideman's argument is geometric. Please feel free to skip this. The geometric argument isn't the only argument for the square root estimate. I'll describe a simpler non-geometric argument later.

I'll describe this for 3 candidates. The principles are the same, and the regions that I describe exist, for any number of candidates. With more candidates the geometry has more dimensions. But don't worry-- there's no need to deal with many-dimensional geometry; though the number of dimensions varies with the number of candidates, the regions that I describe exist no matter how many candidates & dimensions there are, and the results regarding the relation of the Pi & Wi are the same.

Just for an example, let's say that there are 3 candidates: X, Y, & Z. Let's define a co-ordinate system, with 3 mutually perpendicular co-ordinate axes. For instance, they could be "up", "north", & "east". As in all such co-ordinate systems, the co-ordinate axes all start from a point called the "origin", the zero point for all the axes. If you point north, east, and up, and in so doing you're pointing along the 3 co-ordinate axes, then you are at the origin: the point of zero displacement from where you are.

To say it differently, if we place a cube so that 3 of its edges coincide with the 3 co-ordinate axes, the corner of the cube from which those edges originate is called the origin of the co-ordinate system.

Now, distances along each axis are measured by votes for a candidate. Distances on the X axis are measured by votes for X. Say we draw a cube like the one that I mentioned in the previous paragraph. The length of each edge of this cube that we draw is equal to the number of voters in the election, the number of people who voted.

Any point inside that cube defines a voting outcome-- a number of votes for X, a number of votes for Y, and a number of votes for Z.

Such a point is called an "outcome point". The space in that cube is called "outcome space".

The outcome point can be anywhere in that outcome space.

There's a region of that outcome space in which X wins. There's a region in the outcome space in which Y wins. And there's a region in which Z wins--in the event that the outcome point appears in a particular one of those win-regions.

Between the X-win region and the Y-win region is an XY-tie/neartie region, a thin plate of space joining those 2 win-regions.

So we have a win-region for each candidate, and a tie/neartie region for each possible pair of candidates.

Now, of course all the win-regions are the same size as each other. And all the tie/neartie regions are the same size as each other.

The probability density in a region is the probability per unit volume that the outcome point will be in that region. The probability density could vary continuously throughout the outcome space.

But if we knew the probability density in the X-win region and in the Y-win region, then we could guess what it might be in the XY-tie region: Since that tie region is right between the 2 win-regions, we can guess that the probability density in the XY-tie region is equal to the average of the probability densities in the X-win and Y-win regions. That's a reasonable guess. I'm talking about the average probability density in those regions.

Now, because all the win regions are the same size, the probability densities in the candidates' win regions are proportional to their probabilities of winning. And the probability densities in the tie-regions are proportional to the probabilities that the adjacent candidates will tie.

So, just as the probability density in the XY-tie region is guessed to be the average of the density in the X & Y win regions, so we can, from that, derive a guess that the probability of the ties is proportional to the average of the corresponding win probabilities.

So we have:

Pij is proportional to the average of Wi & Wj:

Pij = k*average(Wi, Wj), for some constant k.

Geometric or Arithmetic mean will do. The geometric mean is what works computationally for this, and so:

Pij = k*square root(Wi*Wj)

Substituting Pi*Pj for Pij, a reasonable approximation,

Pi*Pj = k*square root( Wi*Wj)

= k*( square root(Wi)*square root(Wj) ).

If we're assuming that a candidate's Pi can be judged from his Wi, then it's reasonable from the above that candidates' Pi are proportional to the square root of their Wi.

That's what was intended to be demonstrated.
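As a rough numerical illustration of the argument above, the following sketch compares the arithmetic and geometric means of Wi & Wj for each pair, and checks that the geometric mean splits into a factor for i times a factor for j. The win probabilities and the function name are made up for the example.

```python
from itertools import combinations
from math import sqrt

# Hypothetical win-probability estimates, for illustration only.
W = {"X": 0.5, "Y": 0.3, "Z": 0.2}


def pair_estimates(W):
    """For each pair, return (arithmetic mean, geometric mean, factored form)."""
    rows = {}
    for i, j in combinations(sorted(W), 2):
        arithmetic = (W[i] + W[j]) / 2
        geometric = sqrt(W[i] * W[j])
        # The geometric mean separates into one factor per candidate,
        # which is what lets Pi be taken as proportional to sqrt(Wi).
        factored = sqrt(W[i]) * sqrt(W[j])
        rows[(i, j)] = (arithmetic, geometric, factored)
    return rows


for pair, (am, gm, fact) in pair_estimates(W).items():
    print(pair, round(am, 3), round(gm, 3), round(fact, 3))
```

The arithmetic mean has no such per-candidate factoring, which is the computational reason for preferring the geometric mean here.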

I said that there was a simpler demonstration:

In Approval Strategy III, in regards to the better-than-expectation strategy, I suggested a guess that candidates' Pi are proportional to their Wi. That's justified by the fact that a candidate has a higher Pi because he's a stronger contender. And if he's a stronger contender that gives him a higher Wi.

There's nothing wrong with that approximation, because, as I said, its errors aren't great compared to the uncertainties in our estimates of utilities & probabilities. Under the conditions for which methods like these are needed, I'd often use the better-than-expectation strategy.

But, as ok as that approximation is, is there a better one? A win for a particular candidate can be regarded as a 2-part achievement: He must be one of the top-2 votegetters; and he must be the one of those top-2 who outpolls the other. And if he's one of the top-2 votegetters, it must be that, if there's a tie or neartie for 1st place, then he's in that tie or neartie.

Since, for all that we know, someone who's twice as likely to be in a tie or neartie might well also be twice as likely to win if one of the top 2 votegetters, it's reasonable to guess that those 2 separate achievements may have probabilities that, among the candidates, are proportional to each other.

And that means that we'd guess that the probability of winning would vary as the square of the probability of being in a tie or neartie if there is one, since to win requires being one of the top 2 votegetters and being the one of the top 2 who outpolls the other, and because if a candidate is one of the top 2, he must be in a tie or neartie if there is one.

So, among the candidates, the Pi are proportional to the square root of the Wi, we'd guess. This reasonable guess agrees with what the geometric argument suggested.

So it's been shown, geometrically and otherwise, that it's a reasonable guess that the candidates' probabilities of being in a tie or neartie for 1st place are proportional to the square root of their probabilities of winning.

So we start by guessing the candidates' win probabilities, their Wi. Again, a good way to do that is to rate the candidates according to how likely they are to win, giving them some numerical rating. We could give a rating of 1 to the most winnable, and then rate the others as fractions of that likelihood. Or we could give a rating of 1 to the least winnable, and then judge how many times more likely to win the others are.

This time we don't need to calculate the actual win probabilities. The ratings that we've estimated are good enough; they're proportional to the Wi, and that's all we need. Likewise, when we take the square roots of those win-likelihood ratings, we have something that's proportional to the Pi. And, when we multiply those numbers for i & j together, we have something proportional to the candidate-pairs' Pij, and, again, that's all we need in the strategic value formula in Approval Strategy I.

As before, though, we don't have to include all the candidates in this calculation. We can just make those estimates for some of the more winnable-seeming ones, and treat the others' Wi as zero. We can leave those less winnable ones out of the calculation.
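Here is a minimal sketch of that calculation, from win-likelihood ratings to numbers proportional to the Pij. The function name and the ratings are hypothetical. Candidates rated zero are simply left out, as described above.

```python
from itertools import combinations
from math import sqrt


def tideman_pair_weights(win_ratings):
    """Return numbers proportional to Pij, given win-likelihood ratings.

    win_ratings: {candidate: rating proportional to Wi}.
    """
    # Square roots of the ratings are proportional to the Pi.
    # Candidates rated 0 are treated as Wi = 0 and left out.
    p = {c: sqrt(r) for c, r in win_ratings.items() if r > 0}
    # The product for i & j is proportional to Pij.
    return {(i, j): p[i] * p[j] for i, j in combinations(sorted(p), 2)}


# Hypothetical ratings: the most winnable candidate is rated 1,
# and the others are rated as fractions of that likelihood.
ratings = {"A": 1.0, "B": 0.25, "C": 0.04, "D": 0.0}
weights = tideman_pair_weights(ratings)
```

The results are only proportional to the Pij, not probabilities in themselves; they aren't normalized, because only their relative sizes are needed.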

As I said, a good way to rate a candidate's likelihood of winning is to ask how surprised you'd be if s/he won, or how confident you feel that s/he'll win, or how afraid you are that s/he'll win. That's the justification for saying that win probabilities are something that we naturally have a feel for.

Lorrie Cranor discusses some other, more elaborate, ways of estimating the Pij, based on previous election results.

I don't recommend those Pij estimating methods, unless you want lots of mathematical work.

One of those estimating methods is Hoffman's method. Hoffman's method is very similar to the geometric approach already described, except that Hoffman suggests integrating the probability distribution over each of the tie-zones. For that, of course, it becomes necessary to actually do the many-dimensional geometry.

The position of the outcome point of the previous election (with the same candidates or parties) is assumed to be the most likely outcome point position for the current election. Probability density for the outcome point is assumed to decrease with distance from that most likely outcome point, in accordance with the normal distribution.

Hoffman's method assigns, as the length of the outcome-space cube, the number of votes cast, rather than the number of voters. We don't know in advance how many votes will be cast? No, but we don't know how many people will vote, either. But it doesn't matter, because those quantities only affect the scale of the diagram, not the relative areas or relative probability densities. The reason for letting the cube-edge-length be the number of votes cast instead of the number of voters is that in so doing we reduce by 1 the number of dimensions of the locus of the outcome point. That might make a big difference in the computational work.

We can avoid the many-dimensional geometry by considering, for each candidate, the probability distribution for his vote percentage. That can be estimated from past results for him or his party. From that we can calculate the various Pij.
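One possible way to get Pij from per-candidate vote-percentage distributions is by simulation rather than by direct calculation. The sketch below is only that, under loudly stated assumptions that are not from the methods cited: each candidate's vote share is drawn independently from a normal distribution, a "neartie" means the top 2 are within a chosen margin, and the means, standard deviations, margin, and function name are all made up.

```python
import random
from itertools import combinations


def simulate_pij(share_dists, margin=0.01, trials=20000, seed=0):
    """Estimate Pij by simulation.

    share_dists: {candidate: (mean_share, std_dev)}, e.g. from past results.
    margin: how close the top 2 shares must be to count as a tie/neartie.
    Returns {(i, j): fraction of trials in which i & j tied or neartied
    for 1st place}.
    """
    rng = random.Random(seed)  # seeded, so repeated runs give the same result
    counts = {pair: 0 for pair in combinations(sorted(share_dists), 2)}
    for _ in range(trials):
        # Assumption: independent normal draws for each candidate's share.
        draw = {c: rng.gauss(mu, sd) for c, (mu, sd) in share_dists.items()}
        first, second = sorted(draw, key=draw.get, reverse=True)[:2]
        if draw[first] - draw[second] <= margin:
            counts[tuple(sorted((first, second)))] += 1
    return {pair: n / trials for pair, n in counts.items()}


# Hypothetical vote-share distributions, for illustration only.
dists = {"A": (0.40, 0.03), "B": (0.38, 0.03), "C": (0.22, 0.03)}
pij = simulate_pij(dists, margin=0.02)
```

Even this shortcut takes more work than the square root estimate, and its output is only as good as the assumed distributions.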

But, after all that, I'd like to repeat that if you agree with me that our elections have candidates who are completely unacceptable, but winnable, then Approval strategy is greatly simplified: Vote for all of the acceptables, and for none of the unacceptables. But be particular whom you call acceptable, ok?

Mike Ossipoff