Here is some analysis of the recent WA 2020 primary, with a particular focus on the House races.
Votes in Washington State’s primary earlier this week are still being counted, but here are some results so far [as of 8/6/2020]. Data here is publicly available from the Secretary of State.
There was a competitive field of GOP candidates running against a 2-term incumbent Democrat. Here were the results of the top vote getters. The top 2 (Inslee and Culp) will go on to the general.
Here’s a further breakdown of GOP candidates per county. Circle size is weighted by county population.
Here’s the combined Democrat (Blue) vs. Combined Republican (Red).
“Gerrymandering” is manipulating political boundaries to favor a party. Wikipedia has an excellent summary and examples of the concept:
We’ll take a purely data-driven approach to measure if Washington State’s legislative districts are gerrymandered. While there is no absolute mathematical definition for gerrymandering – and therefore no definitive test – there are good objective statistical tests to measure anomalies.
This article focuses on applying these approaches to the legislative boundaries in WA state.
- First, we’ll look at the actual election results and see if there’s anything suspicious on the surface.
- Then we’ll run a standard statistical test – the McGhee test developed at the University of Chicago.
- And then we’ll run some genetic algorithms to produce deliberately gerrymandered maps and compare them to the actual results.
To simplify nomenclature throughout this analysis, we’ll summarize results from the GOP perspective. The results can all be directly flipped to switch to the Democrat perspective (i.e., almost always, an x% GOP result means a (100-x)% Democrat result).
There are no definitive criteria for creating district boundaries. Districts must be contiguous and similar in population. However, even these criteria are tricky. For example, a district boundary is set for 10 years, so as population grows and shifts over time, districts’ populations may shift. Districts can’t be simple shapes like hexagons because they may need to account for geographic boundaries or roads.
Here is a map of legislative boundaries. Since districts are population based (and not based on square-miles), one can see the districts are more concentrated in dense population centers.
 Looking at actual election results
WA has 49 legislative districts, and each district has 2 house members and 1 senate member.
As of the last statewide legislative election in Nov ‘16, the GOP/Democrat split in the senate was 24-25. In the house, the GOP/Democrat split was 48-50. Roughly 90% of the 49 districts have all three members from the same party, indicating that individual districts carry a definitive partisan bias. But when tallying up all the legislative races, the overall split is almost evenly divided between parties in both the house and senate caucus.
Here’s how the GOP results in the legislative caucuses compare to statewide results (mostly from 2016) in races between a Democrat and a Republican candidate. [Source: WA Secretary of State]:
|GOP Candidate||Percent Vote|
|2016 Secretary of State (Wyman)||54.74%|
|2016 GOP House||48.98%|
|2016 GOP Senate||48.98%|
|2012 Governor (McKenna)||48.50%|
|2016 Auditor (Miloscia)||47.69%|
|2016 Public Lands (McLaughlin)||46.84%|
|2016 Governor (Bryant)||45.61%|
|2016 Lt. Governor (McClendon)||45.61%|
|2016 Insurance Commissioner (Schrock)||41.66%|
|2016 Senate (Vance)||40.99%|
|2016 President (Trump)||38.07%|
So clearly the Republicans have done better in the legislature than in most statewide races. Only Kim Wyman has outperformed the caucuses.
Some may suggest gerrymandering as the only way that the Republican caucuses could outperform statewide races. But a statewide race requires a single candidate to appeal to all 49 districts, whereas legislative races have a different candidate per district, allowing each candidate to “fit the district”.
The real test of boundaries is to focus on a single partisan candidate and compare what percent of legislative districts they’d “win” to their statewide percentage. For example, Trump got 38.07% of the statewide vote. He also won 19 / 49 legislative districts, which is 38.78% – nearly the same ratio that he had statewide. That is a strong indicator that the districts aren’t gerrymandered.
We can see the results from other partisan statewide candidates:
|GOP Candidate||% of Statewide vote||% of Districts won|
|2016 Secretary of State (Wyman)||54.74%||78%|
|2016 Auditor (Miloscia)||47.69%||53%|
|2016 Public Lands (McLaughlin)||46.84%||47%|
|2016 Governor (Bryant)||45.61%||47%|
|2016 Lt. Governor (McClendon)||45.61%||45%|
|2016 President (Trump)||38.07%||39%|
|2016 Insurance Commissioner (Schrock)||41.66%||33%|
|2016 Senate (Vance)||40.99%||27%|
This analysis looks at a broad range of races across a roughly 17-point spread in statewide vote share. If the districts were actually gerrymandered, we’d expect GOP candidates to perform consistently better (or, from the Democrat’s perspective, worse) in ‘% of districts won’ than in ‘% of statewide vote’. But they do not. There’s an almost linear correlation between these results (R² = 0.83). Candidates that won more statewide votes also won more individual districts. Some GOP candidates benefited from the legislative boundaries; some performed worse.
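The comparison above can be reproduced with a few lines of Python. This is only a sketch: the numbers are copied from the table above, the variable and function names are illustrative, and the R² here is a plain squared Pearson correlation, which may not be exactly how the original figure was computed.

```python
# (% of statewide vote, % of districts won) per GOP candidate, from the table above.
results = {
    "Wyman": (54.74, 78),
    "Miloscia": (47.69, 53),
    "McLaughlin": (46.84, 47),
    "Bryant": (45.61, 47),
    "McClendon": (45.61, 45),
    "Trump": (38.07, 39),
    "Schrock": (41.66, 33),
    "Vance": (40.99, 27),
}


def r_squared(pairs):
    """Squared Pearson correlation between the two columns."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


# Trump won 19 of the 49 legislative districts.
trump_district_share = 19 / 49 * 100
r2 = r_squared(list(results.values()))

print(f"Trump district share: {trump_district_share:.2f}%")
print(f"R^2 across candidates: {r2:.2f}")
```

With the table’s values, the district share comes out to 38.78% and the R² lands at about 0.83, in line with the figures quoted above.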
 Bring out the math – running the statistical tests
The mathematical test we’ll run is the McGhee test, developed by Eric McGhee of the Public Policy Institute of California and Nicholas Stephanopoulos of the University of Chicago Law School. As “Wired” explains: “In that paper, they proposed a simple measure of partisan symmetry, called the ‘efficiency gap,’ which tries to capture just what it is that gerrymandering does. At its core, gerrymandering is about wasting your opponent’s votes: packing them where they aren’t needed and spreading them where they can’t win.”
The test defines a “wasted vote” as any vote that does not directly contribute to a victory. If you win a district, any vote past 50% is considered wasted (it wasn’t necessary to win); if you lose a district, all of the votes were wasted. Practically, this means:
- Unless you win a district with exactly 50% + 1 vote, there are at least some “wasted” votes.
- Large blowout victories and 49.9% “close calls” produce the most “wasted” votes.
It then defines an “efficiency gap” as the difference between the two parties’ wasted votes divided by the total votes cast. There is no definitive threshold for the efficiency gap that defines gerrymandering, but McGhee calculated that the average efficiency gap in 2012 was 6%, and the egregious gerrymandering examples are over 10%.
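To make the definition concrete, here is a minimal Python sketch of the wasted-vote and efficiency-gap calculation. It assumes a two-party race, uses exactly half the district total as the winning threshold, and takes the sign convention (GOP wasted minus Democrat wasted) as an assumption chosen so that negative values favor the GOP. The function names and the three toy districts are made up for illustration; they are not real WA data.

```python
def wasted_votes(gop, dem):
    """Return (gop_wasted, dem_wasted) for one two-party district.

    The winner wastes every vote beyond half the district total;
    the loser wastes all of its votes. A tie is counted as a
    Democrat win here purely to keep the sketch short.
    """
    half = (gop + dem) / 2
    if gop > dem:
        return gop - half, dem
    return gop, dem - half


def efficiency_gap(districts):
    """(GOP wasted - Democrat wasted) / total votes, over all districts."""
    gop_wasted = dem_wasted = total = 0
    for gop, dem in districts:
        gw, dw = wasted_votes(gop, dem)
        gop_wasted += gw
        dem_wasted += dw
        total += gop + dem
    return (gop_wasted - dem_wasted) / total


# Hypothetical districts as (GOP votes, Democrat votes).
toy_districts = [(70, 30), (45, 55), (45, 55)]
gap = efficiency_gap(toy_districts)
print(f"Efficiency gap: {gap:.1%}")
```

In this toy map the GOP takes 160 of 300 votes but only one of three seats: its voters are packed into one blowout district and narrowly lose the other two. The GOP wastes 110 votes to the Democrats’ 40, so the gap is 70/300, or about +23.3%, which under this sign convention means the map works against the GOP.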
We apply the McGhee test to the 2016 statewide races across the legislative districts, using election data from the Secretary of State:
|GOP Candidate||Percent Vote||Egap|
|2016 Secretary of State (Wyman)||54.74%||-16.3%|
|2016 Auditor (Miloscia)||47.69%||-6.3%|
|2016 Public Lands (McLaughlin)||46.84%||-2.1%|
|2016 Lt. Governor (McClendon)||45.61%||-2.2%|
|2016 Governor (Bryant)||45.61%||-4.5%|
|2016 Insurance Commissioner (Schrock)||41.66%||2.6%|
|2016 Senate (Vance)||40.99%||6.8%|
|2016 President (Trump)||38.07%||-3.7%|
The average signed gap across this spectrum of WA races is -3.2%, a magnitude well below the 6% national average. So our statistical test suggests that the districts are not gerrymandered.
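The average can be checked directly from the Egap column. This short Python sketch takes a plain signed mean of the eight table values, on the assumption that this is how the summary figure was derived.

```python
# Egap values (in percent) copied from the table above.
egaps = {
    "Wyman": -16.3,
    "Miloscia": -6.3,
    "McLaughlin": -2.1,
    "McClendon": -2.2,
    "Bryant": -4.5,
    "Schrock": 2.6,
    "Vance": 6.8,
    "Trump": -3.7,
}

# Signed mean: positive and negative gaps are allowed to offset each other.
mean_gap = sum(egaps.values()) / len(egaps)
print(f"Mean efficiency gap: {mean_gap:.1f}%")
```

The signed mean comes out to about -3.2%.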
 What would gerrymandering look like?
A final approach we take is to work backwards: we can deliberately produce gerrymandered maps and compare them to the actual map.
Here, we use a genetic algorithm, which starts with an initial configuration and then mutates it over a series of iterations as it “evolves” towards a goal. Mutations must preserve certain rules like contiguous boundaries. In this case, the goal was to maximize the number of GOP legislative victories, where victories were calculated using a Monte Carlo simulation driven by previous election turnout results from record poor GOP years. We used election results that initially gave the GOP only 21 of the 49 districts – simulating a “worst case scenario” for the GOP that put them near their historical lows. After a series of genetic mutations, the final result was a map with 26 of 49 GOP wins – a pickup of 5 seats. The chart here shows the evolution progressing along the top.
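The general idea can be sketched in a few dozen lines of Python. This is a heavily simplified toy, not the simulation described above: precincts sit on a line so that any run of consecutive precincts is contiguous, district size limits stand in for “similar in population”, wins are counted from fixed vote tallies instead of a Monte Carlo turnout model, and the search is mutation-only with a single surviving map. All names and vote counts are hypothetical.

```python
import random


def gop_wins(precincts, cuts):
    """Count districts where GOP votes exceed Democrat votes.

    precincts: list of (gop, dem) tallies laid out on a line.
    cuts: sorted interior indices where one district ends and the next begins.
    """
    bounds = [0] + list(cuts) + [len(precincts)]
    wins = 0
    for start, end in zip(bounds, bounds[1:]):
        gop = sum(p[0] for p in precincts[start:end])
        dem = sum(p[1] for p in precincts[start:end])
        if gop > dem:
            wins += 1
    return wins


def mutate(cuts, n_precincts, rng, min_size, max_size):
    """Move one boundary by one precinct; return None if a size rule breaks."""
    new = list(cuts)
    i = rng.randrange(len(new))
    new[i] += rng.choice((-1, 1))
    bounds = [0] + new + [n_precincts]
    sizes = [b - a for a, b in zip(bounds, bounds[1:])]
    if any(s < min_size or s > max_size for s in sizes):
        return None
    return new


def evolve(precincts, cuts, generations=2000, seed=0, min_size=2, max_size=4):
    """Keep any mutated map that wins at least as many GOP districts."""
    rng = random.Random(seed)
    best, best_score = list(cuts), gop_wins(precincts, cuts)
    for _ in range(generations):
        candidate = mutate(best, len(precincts), rng, min_size, max_size)
        if candidate is None:
            continue
        score = gop_wins(precincts, candidate)
        if score >= best_score:
            best, best_score = candidate, score
    return best, best_score


# Hypothetical tallies for 12 precincts, split evenly into 4 districts to start.
toy_precincts = [(60, 40), (60, 40), (30, 70), (30, 70), (60, 40), (60, 40),
                 (30, 70), (30, 70), (60, 40), (60, 40), (30, 70), (45, 55)]
start_cuts = [3, 6, 9]
final_cuts, final_wins = evolve(toy_precincts, start_cuts)
print("start wins:", gop_wins(toy_precincts, start_cuts))
print("final cuts:", final_cuts, "final wins:", final_wins)
```

Because a mutated map is only kept when it does at least as well as the current one, the GOP win count can never go down over the run. A fuller version would keep a population of maps, work on real two-dimensional precinct adjacency, and score each map over many simulated turnouts.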
However, we notice that the boundaries here definitely look suspicious. They’re clearly warped and have unnatural borders designed to carve out an advantage.
What this also shows is that truly gerrymandered results could produce a significant GOP advantage – even in a year with record poor Republican voter turnout.
- The legislative results are close to the statewide governor results. And when measured across a wide range of candidates, there is no consistent advantage from district boundaries over a pure statewide vote.
- The house and senate GOP caucuses do perform exceptionally well – particularly compared to the statewide performance of most GOP candidates. But this appears to be due more to the caucuses picking candidates to fit their districts than to gerrymandering.
- If we deliberately create theoretical gerrymandered districts via computer simulation, the potential GOP advantage would be significantly higher than what we witness.
In the absence of any contradicting evidence, we would conclude that WA state’s legislative boundaries are fairly drawn and not gerrymandered.
WA State’s vote-by-mail is underway for the 2016 General election!
[Updated 11/15/2016 10am] – we’re up to 79% turnout. Here’s a further breakdown and some tools to help track ballot returns.
TRC is a canvassing tool that can pull data from a variety of different sources. For example:
|Voter names, age, addresses||Secretary of State VRDB||Perfect – the SOS is the source of truth.|
|Map view||Geocoding address to get a Latitude and Longitude||High – we try to get the pin right on the house.|
|GOTV – did you mail in your ballot?||County auditor||High – but there can be a lag between when the ballot is mailed and when the auditor reports it.|
|Voter history||Secretary of State historical files||Perfect|
|Past Precinct Results||Secretary of State||Perfect – although precinct boundaries and populations change over time.|
What about Party Id?
Party Id identifies which party a voter is aligned with. Democrat? Republican? Libertarian? Other? A common convention is assigning a “party id score” on a scale of 1 (hard GOP) … 5 (hard Democrat). 3 is independent, 0 means unknown. This is crude and deeply flawed (how do you represent people that split their tickets?), but it’s still widely used by campaigns.
While most of the data has an official source, there’s no definitive list of party identification. So organizations that provide party id must make an educated guess based on the data they do know – such as if you voted in the Democrat presidential primary or if your PDC donations show strong contributions to Republicans. If new data comes in, we update the guess. This gets awkward when the first guess was right and the second guess is wrong.
Pinned vs Floating
TRC helps you cope with this uncertainty. Most tools treat the values as static numbers. This unfortunately means you don’t know the source or confidence of a value. TRC instead treats them as “Pinned” or “Floating”.
1. Once you change a value, it is “pinned” and that changed value should never get overwritten by somebody else. (You can see the full audit history here in the History tabs or in the Blame plugin.)
2. But before you change it in TRC, the value is “floating” and can be updated underneath you when we rebuild the models.
TRC addresses this by letting you “pin” values, and by giving each user their own “sandbox” that lets them track their own specific values.
We mark any Floating values with a “?” after the party id. This lets you know that it’s a guess and may change underneath you.
So for example, M Dunwiddie starts with a party id of “5?”.
The ‘?’ means that the data can change. The 5 means our guess is hard democrat. But what if we then see that M Dunwiddie voted in a Republican primary and donated $100 to a Republican candidate? The data team could pick up that data and update the model to a ‘1’. But even then, new data could flip it back to a ‘5’ (such as if the data team later found she donated $10,000 to a democrat).
But regardless of what the data team does with floating values, say I then go in and explicitly change her to a 1. The cell goes green, and the question mark is now removed.
And when I refresh the browser, the green highlights reset but the question mark stays removed. The lack of a question mark tells me the value is now “pinned” and won’t change. This only applies to the Party column.
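The pinned vs. floating behavior can be modeled with a small Python sketch. This is only an illustration of the rules described above, not TRC’s actual implementation; the `PartyCell` class and its method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PartyCell:
    """One party id value: 1 (hard GOP) .. 5 (hard Democrat), 3 independent, 0 unknown."""
    score: int
    pinned: bool = False

    def model_update(self, new_score):
        """A rebuilt model may overwrite the value only while it is floating."""
        if not self.pinned:
            self.score = new_score

    def user_edit(self, new_score):
        """An explicit change by a user pins the value."""
        self.score = new_score
        self.pinned = True

    def display(self):
        """Floating values are shown with a trailing '?'."""
        return str(self.score) if self.pinned else f"{self.score}?"


# Replay the M Dunwiddie example.
cell = PartyCell(score=5)
print(cell.display())   # floating guess: hard Democrat
cell.model_update(1)    # new data flips the guess
cell.model_update(5)    # and later data flips it back
cell.user_edit(1)       # a user explicitly sets it, which pins it
cell.model_update(5)    # ignored: pinned values are not overwritten
print(cell.display())
```

The first print shows `5?` and the last shows `1`: once the value is pinned, later model rebuilds leave it alone.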
1. Once the cell is green, it’s saved to the server. This means if the cell does not turn green, it hasn’t been saved.
2. If the party column has a “?” next to it, the value may change on you. This means if it has the correct value, but has a ‘?’ next to it, then go in and deliberately change it and make it turn green. That will pin the correct value.