Partisan divide widening dramatically in Olympia

For the past few years I’ve posted on our interactive online tool that analyzes the partisan distribution of our state legislature.  The goal is to call out members with the courage to vote independently of their caucus.  Often candidates will run as moderate or independent during the campaign, but we find that their floor votes in Olympia are right down party lines.  This tool provides some transparency into their actions, versus their intent.

If you recall, the methodology is simple:  The partisanship score for each floor vote is calculated as the percentage of Republican supporters minus the percentage of Democrat supporters, giving each a range from 100 (exclusively Republican) to -100 (exclusively Democrat) with unanimous votes scoring zero.  The member’s aggregate score is just the average of all the scores of floor votes they supported, minus the scores from those they opposed.
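
To make the arithmetic concrete, here is a minimal Python sketch of the scoring. It is only an illustration, not the code behind the tool. It assumes that “percentage of Republican supporters” means the share of voting Republicans who voted yes (the reading under which a unanimous vote scores zero), and that a member’s aggregate is the average of the vote scores with the sign flipped for votes they opposed. The function names are made up for this example.

```python
from statistics import mean


def vote_score(rep_yes, rep_total, dem_yes, dem_total):
    """Score one floor vote from +100 (only Republicans supported it)
    to -100 (only Democrats supported it). Unanimous votes score 0.

    Assumption: percentages are taken within each caucus.
    """
    rep_pct = 100.0 * rep_yes / rep_total
    dem_pct = 100.0 * dem_yes / dem_total
    return rep_pct - dem_pct


def member_score(votes):
    """Aggregate score for one member.

    `votes` is a list of (score, supported) pairs, where `score` comes from
    vote_score() and `supported` is True if the member voted yes.
    Assumption: opposing a vote counts as its score with the sign flipped.
    """
    signed = [score if supported else -score for score, supported in votes]
    return mean(signed) if signed else 0.0
```

Under these assumptions, a member who supports every party-line Republican bill and opposes every party-line Democrat bill ends up at 100, and one who does the opposite ends up at -100.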

Looking back to 2003, we see a relatively normal distribution curve for both parties.  And while there isn’t as much overlap in the middle as there’s been in generations past, we do see that there are moderates on both sides of the aisle and even some true independents that have represented us in Olympia.

Partisan Leaderboard - All Policy Areas, Both Chambers (2013-2020)

It is interesting to note that during this time period the most independent members have run as Republicans and that Democrats are generally much less likely to vote against their party.  You can also easily identify the three members who have switched caucuses.

So now, consider the 2019 legislative session results:

Partisan Leaderboard - All Policy Areas, Both Chambers (2019-2020)

Notice a problem?

Both parties are now considerably more partisan and there are no independents (or arguably even moderates) left in the state legislature.

So how has this changed over time?  Let’s take a look…

Partisan Distribution by Party (2003-19)

In the above graph, range lines show a standard deviation above and below the mean.  Markers represent the median value.  The bars in the center represent the party balance, which has almost always favored Democrats.

What can we conclude?

  • Both parties have trended more partisan during this time period.
  • Among Democrats, the median consistently falls to the left of the caucus mean.
  • Democrats are now over twice as partisan as they were in 2003-04 under Gov. Gary Locke.
  • The partisan divide is almost twice as wide now as it was in 2007-08, when the Democrats had a 42-seat advantage.


I see this “death of the middle” as an unhealthy development for our state (and not just because I was one of the moderates unseated by this wave of political polarization).  Compromise is a necessary part of the political process, and we need moderates on both sides of the aisle willing to bridge the divide to find common ground.  During the 5 years there was divided control of the legislature, it admittedly took much longer to hammer out bipartisan agreements…but the resulting work product was worth it.  Our bipartisan budgets typically passed with 90% support, while only 57% voted for this last biennial budget (including no Republicans and not even every Democrat).

Clearly, changes are needed.

How partisan are WA organizational donors?

In previous posts we’ve looked at how our political system has become increasingly polarized, with our analysis of voting records showing that many self-styled moderates and independents are exhibiting much more partisan behavior when voting on the House or Senate floor.  Having served for a few years now on a PDC open data advisory group, I realized that the same type of analysis could be done for organizational donors.  Despite this being public data, not many people in Washington state know who the biggest political donors are, much less who they support.

So, are many of these “non-partisan” or “bi-partisan” organizational donors more partisan than they let on?

We can figure this out, and the methodology is pretty simple.  The PDC Open Data Portal publishes all contributions to candidates and political committees back to 2008.  For this first analysis, we looked at donations to partisan legislative and statewide candidates from businesses, unions and political action committees.  The partisanship score for each donor is the difference in percentage of donations by party, giving each a range from 100 (exclusively Republican) to -100 (exclusively Democrat).
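
As a rough illustration, the donor score can be computed like this in Python. This is a sketch under one assumption: that the percentages are shares of dollars given (they could just as well be shares of the number of contributions). The function name is made up for this example.

```python
def donor_score(rep_dollars, dem_dollars):
    """Partisanship score for one donor, from +100 (gave only to
    Republicans) to -100 (gave only to Democrats).

    Assumption: the score is based on dollar totals per party.
    """
    total = rep_dollars + dem_dollars
    if total == 0:
        return 0.0  # no partisan giving on record
    rep_pct = 100.0 * rep_dollars / total
    dem_pct = 100.0 * dem_dollars / total
    return rep_pct - dem_pct
```

For example, a donor who gave $750 to Republicans and $250 to Democrats would score 50 under this definition.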

Most of the effort here goes into the fuzzy matching algorithm, since candidates are very inconsistent in how they report the names of these organizations to the PDC.  I use a trigram matching method, which isn’t perfect…but it does pretty well.  Currently, contributions are grouped under the same donor if 80% or more of the trigrams match, but this is adjustable.
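
For readers who haven’t seen trigram matching before, here is a small Python sketch of the general idea. It is not the matching code used for this analysis. It assumes that “80% of the trigrams match” means the shared trigrams divided by all distinct trigrams across the two names (Jaccard similarity), and the padding and greedy grouping are choices made just for this example.

```python
def trigrams(name):
    """Return the set of 3-character substrings of a normalized name."""
    # Lowercase, turn punctuation into spaces, and collapse whitespace.
    cleaned = "".join(ch if ch.isalnum() else " " for ch in name.lower())
    padded = "  " + " ".join(cleaned.split()) + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a, b):
    """Share of trigrams the two names have in common (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def group_donors(names, threshold=0.8):
    """Greedily group names whose similarity to a group's first member
    meets the threshold. Returns a list of groups (lists of names)."""
    groups = []
    for name in names:
        for group in groups:
            if similarity(name, group[0]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups
```

Lowering the threshold merges more spelling variants but also raises the risk of lumping two different organizations together, which is why it helps to keep it adjustable.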

Here are the results:

Partisan Donors (2008-18)

If you visit, you’ll now find an interactive and animated version of our organizational donor partisan analysis that allows you to filter by jurisdiction type and date range.


If you’d like to do further analysis, feel free to hit the Download button for a tab-delimited table of the charted data.  Note that for now we don’t display or download the complete donor dataset, as that includes over 8,000 organizations.

A few things become apparent when reviewing the results:

  • Some of the biggest legislative donors give heavily to candidates from both parties.
  • About 6% more is given to Democrats than Republicans in legislative races, but that increases to 28% when looking at statewide races.

Up next:  We’ll take a close look at hard money vs. soft money contributions.

Why should you save your data back to the cloud

A major benefit of a mobile canvassing app is that your work is automatically recorded. However,  it’s common for people to export their data to another system and work off that; or print out their lists and work off a printed walk list. In those cases, be sure to update your data in TRC afterwards! When using paper lists, here’s why it’s worth the extra effort to save your results back to TRC:


1. Ensure your data is saved and secure

Paper can get lost or stolen, whereas data in TRC is safe and secure. It’s saved on the cloud and TRC’s sandbox model guarantees that your data will never get accidentally overwritten.

2. Easy sharing with other campaigns

Once your data is in TRC, it’s easy to conditionally share portions of it with other campaigns. For example, suppose you’re running for a city council race which overlaps another schoolboard race. TRC can automatically figure out just the overlapping records and just share those. That analysis is hard to do with paper.

Furthermore, suppose your canvassing team is asking three separate questions and you only want to share results from one of the questions. Again, once your data is in TRC, you can easily do that controlled granular sharing.

3. Make sure you don’t double-contact the same people.

Updating the data on the server ensures your campaign doesn’t accidentally contact the same person multiple times, especially when you have multiple canvassers operating independently.  Accidentally contacting the same people multiple times would be wasting resources and could also be perceived as harassment.

4. Enables searching for patterns and identifying new supporters

Knowing your specific supporters lets us run predictive analytics to identify other potential supporters.  For example, suppose your district has 50,000 voters. If your canvassing activity identifies 200 supporters and another 100 non-supporters, we can then use analytics to search for patterns. Perhaps you’re doing well with voters who care about certain issues; we can then use predictive analytics to find new likely supporters who are also interested in those issues. That can further refine your target.

5. Get GOTV reporting

TRC provides campaign-wide reports for Get-out-the-vote and election predictions. In 2016, these reports were frequently 99% accurate for legislative district races. The more data you provide back to TRC, the more accurate the predictions and reports it can provide back to you.

The TRC Sandbox

The TRC canvassing app is sitting in front of a powerful general-purpose service for sharing tabular data files like CSVs.

This provides:

  1. Integrations – a means to two-way sync that tabular data with external data sources (like NationBuilder, Salesforce, Google Sheets, Dropbox, etc.)
  2. User-management – you can share out a sheet with users, track per-user activity, and revoke permissions.
  3. Source Control – TRC provides both branching and full history tracking. Branching means you can create a “child” sheet that has a subset view of the “parent” sheet. Full history tracking means you can track every change.
  4. A compute engine – TRC has a general compute fabric for processing joins, merges, and aggregations on your data.

Sandbox and Isolation

Each campaign gets its own isolated view of the data, which we call a “sandbox”.  This means that two candidates can run against each other for the same position in a primary, and each gets their own sandbox.

Sandboxes can then be further partitioned among your volunteers.


For example, R1…R5 are individual records.   User U1 is synchronizing the data with some external data source, such as NationBuilder or Salesforce.  S1 refers to U1’s sandbox.

U1 then shares out subsets of the sheet with volunteers on the campaign.  This creates a new sandbox S2 for User U2, which has access to rows R3,R4. And another new sandbox, S3, for users U5,U6,U7 which get access to R1 and R2.  Since U5,U6, U7 are in the same sandbox (S3), they can see each other’s changes.

The system is a distributed hierarchy, so any operation U1 can do in S1; U2 can do in S2. U2 is fully empowered within their sandbox! So U2 can divide their own sandbox and create a new sandbox for U3 and U4 to each edit R4.  Since U3 and U4 each have their own sandbox (in contrast to U5,U6,U7 who all share a sandbox),  U3 and U4 can edit the same record without conflicting with each other.   Their parent (U2) can then resolve any conflict.
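
The walkthrough above can be modeled with a toy Python class. To be clear, this is a conceptual sketch and not TRC’s actual API or storage model: the class, its method names, and the labels S4 and S5 for the sandboxes of U3 and U4 are all invented for illustration.

```python
class Sandbox:
    """Toy model of a sandbox: a private copy of some records, plus
    child sandboxes carved out of it."""

    def __init__(self, name, rows, parent=None):
        self.name = name
        self.rows = dict(rows)   # record id -> value (private copy)
        self.parent = parent
        self.children = []

    def share(self, name, record_ids):
        """Create a child sandbox that sees only the given records."""
        child = Sandbox(name, {r: self.rows[r] for r in record_ids}, self)
        self.children.append(child)
        return child

    def edit(self, record_id, value):
        """Edit a record in this sandbox only; the parent is untouched."""
        if record_id not in self.rows:
            raise KeyError(f"{record_id} is not visible in {self.name}")
        self.rows[record_id] = value

    def pending(self):
        """Child values that differ from this sandbox, per record."""
        changes = {}
        for child in self.children:
            for record_id, value in child.rows.items():
                if value != self.rows[record_id]:
                    changes.setdefault(record_id, {})[child.name] = value
        return changes

    def accept(self, record_id, child_name):
        """Resolve a record by taking the named child's value."""
        child = next(c for c in self.children if c.name == child_name)
        self.rows[record_id] = child.rows[record_id]


# Mirror the example: S1 holds R1..R5, S2 gets R3 and R4, and U3 and U4
# each get their own sandbox (called S4 and S5 here) containing R4.
s1 = Sandbox("S1", {r: "" for r in ["R1", "R2", "R3", "R4", "R5"]})
s2 = s1.share("S2", ["R3", "R4"])
s4 = s2.share("S4", ["R4"])
s5 = s2.share("S5", ["R4"])
s4.edit("R4", "supporter")
s5.edit("R4", "undecided")
conflicts = s2.pending()   # both edits are held side by side for U2
```

In this model, neither edit touches S2 until U2 calls `accept`, and nothing reaches S1 until U1 does the same one level up.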

This has several benefits:

  1. It allows multiple candidates to run against each other in a primary. They each get their own sandbox.
  2. Within a sandbox, it ensures that your data is never overwritten.
  3. It allows untrusted people to submit data to your campaign. The data is just quarantined in its own sandbox and not integrated until proven safe.
  4. The full audit log allows purging bad data even after an integration.

Analyzing the WA Political Spectrum

It’s no secret that our political system has become increasingly polarized in recent years.  In fact, Pew Research regularly publishes studies on the topic and the situation is perhaps more grim than you might think.  Looking back over the past 60 years, Congress has never been so politically divided, and the result is D.C. gridlock.

So how bad is the situation here in Washington State?

After serving two terms now in the House of Representatives and being elected to caucus leadership, I have my opinions.  There’s certainly evidence to support the assertion that the Legislature is doing much better than Congress, but the partisan divide is alive and well in Olympia.  Rather than rely on anecdotal evidence, I was inspired by what Pew and others have done and set out to quantify the problem.

A couple of weeks ago I published my first analysis of the partisan distribution of the Washington State Legislature.  It also included a Partisan Leaderboard, calling out the members who are least and most likely to cross the aisle.

The methodology is simple:  The partisanship score for each floor vote is calculated as the percentage of Republican supporters minus the percentage of Democrat supporters, giving each a range from 100 (exclusively Republican) to -100 (exclusively Democrat) with unanimous votes scoring zero.  The member’s aggregate score is just the average of all the scores of floor votes they supported, minus the scores from those they opposed.

This approach differs from many other studies, like the McCarty & Shor (2015) Measuring American Legislatures Project, which use the Political Courage Test (formerly the National Political Awareness Test).  Theirs are based on subjective questionnaires, while our scores are based on recorded floor votes.

Now we’ve taken this analysis to the next level…


If you visit, you’ll now find an interactive and animated version of our partisan analysis that allows you to filter by policy area, chamber and date range.  Each member is “stacked” in a histogram, allowing you to roll over the chart and see more detailed information in tooltips.  Also, the median Democrat and Republican scores are indicated as vertical lines.  (Technically, the median is more meaningful than the average for these scores, with half the members being above and half below the line.)

And finally, this updated analysis includes floor votes on amendments, so the scores may be slightly different than those previously published.


If you’d like to do further analysis, feel free to hit the Download button for a tab-delimited table of the charted data.

A few things become apparent when reviewing the results:

  • Democrats are typically over twice as partisan as Republicans, and even more so in years when there was divided control of the House and Senate.
  • The majority party will control the floor agenda, and so will exhibit more “cohesion” and be less likely to allow members to cross the aisle.
  • The Democrat-controlled House was over twice as likely as the Republican-controlled Senate to bring bills to the floor that were rejected by the opposing party.
  • Some members show willingness to regularly cross the aisle for specific policy areas that are important to them or their constituents.  Lobbyists and advocates should take note of these policy areas.


Our goal here is simple:  To provide some transparency into partisan behavior in our state legislature.

There’s nothing inherently wrong with being a partisan voter, and one could argue that a certain degree of caucus cohesion is necessary to be a functioning majority.  However, many constituents elected their representatives with the expectation that they would exercise independent thought and work aggressively across the aisle to get results.  What they may discover looking at this data is that some of their representatives are more loyal to their partisan ideology than they are to a process that involves compromise to find common ground.  Given the example being set in Washington D.C., I also think we should take this moment to recognize those members on both sides of the aisle with the courage to break the partisan gridlock and work in the best interests of our entire state.  If you consider yourself an independent, these members deserve your support.


Ballot Chase!

Washington State is a vote-by-mail state and voters have about 3 weeks before the election to mail in their ballots.  Voter-Science tracks the ballots that are received and provides several tools to aid in your Get-Out-The-Vote (GOTV) efforts.

1) The GOTV Reports

Voter-Science provides GOTV reports – see the Turnout Report plugin. Note – your account must be enabled for Ballot Chase in order for this plugin to work.

This report includes useful information like:

  • voter turnout statistics
  • breakdowns by party and targets
  • breakdown by result of canvassing
  • identified supporters that haven’t yet voted
  • per-precinct breakdowns
  • and even heat maps of turnout:


2) Names are crossed off in the List View

For example, in the screen shot below, Nancy and Marvin have already voted and so their names have been automatically crossed off.


This is critical for get-out-the-vote: if somebody has already cast their ballot, there’s no need to contact them further for GOTV.

3) Mobile app tells you the ballot received

The mobile apps will also tell you when a voter’s ballot has been received.

4) Usage with Filters  

Ballots are tracked by creating a new “XVoted” column in your sheet.  It’s a ‘1’ if the ballot has been received. You can also use the Filter tool to filter on XVoted just like any other column and use that to create custom heat maps (Supporters that haven’t voted) or specific child sheets.

For map users, a common “Targeted voters” filter is “IsFalse(XVoted) && IsTrue(XTargetPri)”.   This means “only include people whose ballot is not yet received and who are on the targeted list”. 
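
If it helps to see the logic spelled out, here is a small Python sketch of what that filter expresses when applied to a list of rows. It does not use the Filter tool or any Voter-Science API. It assumes that “true” means the column holds a ‘1’ (as described for XVoted) and that XTargetPri is flagged the same way. The row for Alice is made up.

```python
def is_true(value):
    """Treat '1' as true and anything else (including blank) as false.
    Assumption: flag columns are stored as '1' when set."""
    return str(value).strip() == "1"


def is_false(value):
    return not is_true(value)


def targeted_unvoted(rows):
    """Rows on the target list whose ballot has not been received yet,
    i.e. IsFalse(XVoted) && IsTrue(XTargetPri)."""
    return [
        row for row in rows
        if is_false(row.get("XVoted", "")) and is_true(row.get("XTargetPri", ""))
    ]


rows = [
    {"Name": "Nancy", "XVoted": "1", "XTargetPri": "1"},   # already voted
    {"Name": "Marvin", "XVoted": "1", "XTargetPri": ""},   # already voted
    {"Name": "Alice", "XVoted": "", "XTargetPri": "1"},    # still to chase
]
still_to_contact = [row["Name"] for row in targeted_unvoted(rows)]
```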


Technical details

There is some delay between when a person puts their ballot into the mail, it’s received by the county auditor, and the auditor reports having received it. This is tracked per-county, and counties report at different speeds.  This means that if you see a name crossed off, you can be confident the ballot was received.

Can you win with 49%?

If voters are 50% likely to vote for you, you obviously have a 50% chance of winning the election. But what are your odds of winning if voters are only 49.9% likely to vote for you?

Let’s do the math …

The Model and Assumptions
For simplicity, assume that all voters have the same percent P of voting for you. In practice, this more resembles just the “swing” voters, and you will have different categories of voters such as your “base” that is very likely to vote for you and your opponent’s “base” that will never vote for you. But this simplification is still sufficient to illustrate the concepts.
So if P = 50%, obviously you have a 50% chance of winning the election.

But what if P drops to 49.5%?  Perhaps there’s a natural bias against you due to party, etc.  Certainly, your odds of pulling an upset and winning are still greater than 0. But it’s not still 49.5% either. So what are the odds?

Doing the math
We’ll  compute these numbers using a Monte Carlo simulation.  Source code is available at:
Say the district size is N. If N=5000 people, dropping P from 50% to 49.5% support means your chances of winning the election would drop from 50% to about 23%!  And when P drops to 49%, odds of victory are 7%.
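
For readers who want to try this themselves, here is a minimal Monte Carlo sketch in Python. It is not the source code referenced above, just one straightforward way to set up the same experiment: every voter independently votes for you with probability P, and a simulated election counts as a win only with a strict majority (ties count as losses, which is an assumption of this sketch). The function name, trial count and seed are arbitrary choices.

```python
import random


def win_probability(p, n_voters=5000, trials=500, seed=0):
    """Estimate the chance of winning when each of n_voters independently
    votes for you with probability p."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        # Simulate one election: count the voters who pick you.
        votes = sum(rng.random() < p for _ in range(n_voters))
        if 2 * votes > n_voters:   # strict majority needed to win
            wins += 1
    return wins / trials


if __name__ == "__main__":
    for p in (0.50, 0.495, 0.49):
        print(f"P = {p:.3f}: estimated win chance {win_probability(p):.2f}")
```

With only a few hundred trials the estimates are noisy, so expect them to land near the figures quoted above rather than match them exactly. Raising `trials` tightens the estimate at the cost of run time.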

Here’s a chart showing the full curve.  The horizontal axis is P (the % that an individual voter will vote for you).  The vertical axis is the % that you’ll win the overall election (assuming population size 5000.)


Note that this is not linear! Your chance of winning the election is not simply equal to P.

How does this depend on population size?
It turns out that, due to the law of large numbers, this curve gets even sharper as the population size (N) increases.  The law of large numbers means that the larger your sample size, the lower the chance of anomalies occurring. It’s easy to flip 2 heads in a row (25% odds). It’s much less likely to flip 10 heads in a row (about a 0.1% chance).   In this case, winning an election when P < 50% is the “anomaly”.
Say voters are 49.9% likely to vote for you. Your odds of winning drop off rapidly as the population increases.
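
Simulating very large districts gets slow, so one way to explore the effect of N is to swap the simulation for a normal approximation to the binomial distribution. To be clear, that is a different technique from the Monte Carlo runs used for the charts, and the function below is only a sketch under the same model (independent voters, strict majority needed to win).

```python
from statistics import NormalDist


def approx_win_probability(p, n_voters):
    """Approximate P(votes > n_voters / 2) when votes ~ Binomial(n_voters, p),
    using a normal approximation with a continuity correction.
    Requires 0 < p < 1."""
    mean = n_voters * p
    sd = (n_voters * p * (1 - p)) ** 0.5
    # Smallest winning vote count is n_voters // 2 + 1; step back half a vote.
    threshold = n_voters // 2 + 0.5
    return 1.0 - NormalDist(mean, sd).cdf(threshold)


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000, 1_000_000):
        chance = approx_win_probability(0.499, n)
        print(f"N = {n:>9,}: win chance about {chance:.3f}")
```

The standard deviation of the vote count grows like the square root of N while the gap to 50% grows like N, which is why the odds keep shrinking as the district gets bigger.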


1. If voters are only 49.9% likely to vote for you, you still have a chance of winning the election. But it’s a steep dropoff (the blue chart).
2. The chances drop rapidly with population size (the red chart)
3. A 49% – 51% election result is actually a solid loss if the population is large.

Voter Database Decay Rate

How much should you pay to keep your data up to date? How much does it cost you to use stale data?

Let’s look at a real example using the voter database (VRDB) provided by the secretary of state. This tells you the voters in your district, and a campaign uses it to know who to contact for voter outreach. This is perhaps the most critical piece of data for any campaign.

Suppose it costs you $1 to mail a postcard to a voter. If 1000 voters have moved out of your district and can no longer vote for you, you’d arguably be wasting $1000 to continue sending them postcards asking for their vote.

So not updating your copy of the voter database costs you money in wasted resources. But ingesting a new copy of the voter database costs you something too. So where’s the sweet spot? How frequently do you need to refresh your copy of the voter database?

Don’t guess! Let’s measure it …

1. Establish a “difference function”
We must establish a difference function to compare two tables (or CSV files).  This is somewhat arbitrary, but we’ll count the “decay” as the number of deltas to convert the first file into the second. The difference function should be symmetric.

We’ll count the following as differences:
– if a voter is in one file but not the other. This may mean a voter has moved into the district or left the district.
– If a voter has changed (such as a different last name or different precinct number). This may mean the voter has moved within the district or changed their name.

If the two files have N1 and N2 rows respectively, then the maximum number of differences would be (N1+N2).
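
Here is a minimal Python sketch of a difference function with these properties. It is not the implementation used for this study, just an illustration. It assumes each file has already been loaded into a dictionary keyed by a unique voter id, with the remaining fields stored as a tuple. The function name and the sample fields are made up.

```python
def count_differences(old, new):
    """Count the deltas between two voter tables.

    `old` and `new` map a voter id to a tuple of that voter's fields.
    A voter present in only one table counts as one difference, and a
    voter present in both with different fields counts as one difference.
    """
    only_old = old.keys() - new.keys()      # voters who left
    only_new = new.keys() - old.keys()      # voters who arrived
    changed = [vid for vid in old.keys() & new.keys() if old[vid] != new[vid]]
    return len(only_old) + len(only_new) + len(changed)


old = {1: ("Smith", "p1"), 2: ("Jones", "p2"), 3: ("Lee", "p3")}
new = {2: ("Jones", "p5"), 3: ("Lee", "p3"), 4: ("Kim", "p1")}
deltas = count_differences(old, new)   # voter 1 left, 4 arrived, 2 changed
```

Swapping the arguments gives the same count, so the function is symmetric, and two tables with no voters in common hit the N1+N2 maximum.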

For this study, we use an implementation from

2. Get the data
Here, we’ll look at VRDBs from Oct’12 through Feb’16.  Voter Databases can be obtained from the Secretary of State at

3. Apply

Here’s the result of applying the difference function. We start with Oct’12, use that as a baseline, and compare each later VRDB back to it.




4. Observation and Analysis
Within 1.5 years, there were over a million differences. At $1 per contact, a statewide campaign operating with a VRDB that’s even 2 years out of date could potentially be wasting $1 million.

We’d expect the decay to slow down and not be linear. Once a person moves, subsequent moves don’t count as additional differences. For example, say person X starts in precinct p1 in Jan’14, moves to precinct p2 in June ’14, and then moves to precinct p3 in Dec ’14. That’s only 1 total difference from Jan’14 to Dec ’14 (moving from p1 to p3) even though there were 2 moves.

5. Next steps?
Possible future explorations here:

  1. Refine the difference function. Is there an ideal difference function?
  2. Repeat with more data.
  3. This was for Washington state. Compare to other states.
  4. Compare the vrdb decay rate in urban vs. rural counties.
  5. Analyze the empirical data and correlate it with specific events. For example, why was there a decrease in voter registration records in Mar’14?
  6. Develop a theoretical model and match to the empirical data here.

Analytics Blog Entry Contest with a Cash Prize

Voter-Science is hosting a contest with a $500 cash prize! The winner will be announced at the TechRoanoke conference on May 14th. You can register for the conference here.

The challenge is to write a blog entry demonstrating data, statistics, and analytics as applied to the campaign or political sphere. Possible ideas:

  • Propose an algorithm for rating the quality of predictions on party id.
  • Show statistically which is easier: winning a single statewide race or winning the majority of legislative seats?
  • Show a unique visualization that makes an argument on a pertinent issue.
  • See an example of showing voter database decay rate.

Articles will be judged by Voter-Science and appear on the voter-science blog. Criteria include a) technical rigor, b) innovation and relevance, c) general blog quality.

1st place prize is $500 in cash. 2nd place is $250.
Feel free to ask for any clarifications.

Entries must be submitted to by May 12 8am PST.