The Data Lifecycle

This article describes a best practice for how campaigns can use data to target voters, and how you can achieve that with TRC.

The flow is: get initial public data from the county auditor, merge in microtargeting information, choose the targets, canvass, and iterate on the model.


Conceptually, we can think of data as a giant spreadsheet (CSV file). Each row is a voter, and the columns are information about that voter. We’ll walk through the different phases with a small sample of 8 records, but TRC can help you do this with your entire district of 80,000 records.

1. Start with initial voter lists


First, you get an initial voter list telling you the people in your district. This is often public information that can be obtained from a county auditor or Secretary of State. Voter databases (aka “voter rolls” or “vrdb”) often have over 20 columns of information, but the core fields are generally the name, address, district, gender, and age of every registered voter.
In this article, we’ll start with a simple example and just use Name and Address columns to keep it more concise:


Please note:

  • This will have a primary key that uniquely identifies each voter. In this case, it’s the first column, “RecId”. County and State Auditors may have different ids.
  • The order of the rows doesn’t matter. If there’s an important ordering, such as priority or walk-order, then that should be captured as another column that you can sort on.
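
As a rough illustration of these two points, the sketch below loads a tiny sheet keyed by RecId and sorts it on an explicit column. The names, addresses, and the "WalkOrder" column are made up for the example; this is not TRC's own format or API.

```python
import csv
import io

# Hypothetical sample data; real voter rolls have many more columns.
VOTER_CSV = """RecId,Name,Address,WalkOrder
R3,Carol Example,12 Oak St,2
R1,Alice Example,10 Oak St,1
R2,Bob Example,48 Pine Ave,3
"""

def load_sheet(text):
    """Read CSV text into a dict keyed by the RecId primary key."""
    sheet = {}
    for row in csv.DictReader(io.StringIO(text)):
        rec_id = row["RecId"]
        if rec_id in sheet:
            # A primary key must identify exactly one voter.
            raise ValueError(f"duplicate primary key: {rec_id}")
        sheet[rec_id] = row
    return sheet

sheet = load_sheet(VOTER_CSV)

# Row order in the file is irrelevant; sort on an explicit column when order matters.
walk_list = sorted(sheet.values(), key=lambda r: int(r["WalkOrder"]))
print([r["RecId"] for r in walk_list])
```

Keying the sheet on RecId makes later lookups and merges cheap, and the duplicate check catches a file whose id column is not actually unique.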

You will need to geocode addresses into (Latitude, Longitude) pairs to display them on a map. Services like Google will do this for a small number of addresses, but accurately geocoding a large volume can be tricky. TRC can geocode for you.

The sample here is just 8 voters so that we can see it all on one page, but a legislative district is commonly around 80,000 voters. At those sizes, it’s prohibitively expensive to send everybody a mailer or knock on every door. That’s where targeting comes in. What’s the subset of these 80,000 that you need to target?

2. Add in microtargeting


Next, add in additional columns of data that will help with the targeting. Common categories of data are:

  • party affiliation – Some states track party affiliation in their voter database. WA state does not.
  • issue preference
  • likelihood of voting
  • “have they voted in the current election” – this is critical during a ballot chase for vote-by-mail states and can involve daily updates.

Many political organizations, especially the RNC and DNC, maintain a database with voter rolls and some targeting information. Often you need to buy additional microtargeting from a data vendor. Data vendors in turn get this data by first buying lists from other sources, such as advertisers, online profiles, magazine subscriptions, and conference attendance lists, and then using analytic techniques to merge them into meaningful columns. For example, a hunting magazine may sell its subscription list to a data vendor, who then matches those names to the voter roll and tags them as socially conservative.

Think of everything as a spreadsheet, and these new microtargeting files are just more spreadsheets that will get merged into our master sheet.
Suppose we’re able to get a party id model, like so:


This has a primary key that matches our master sheet, so we can easily merge this back into our master sheet.
Suppose we also get a social and fiscal model, but in this case, we just have names and no primary key, like so:

When you’re pulling data from different sources, you need to match everything to the same primary key so that you can merge them. (This is referred to as “voter matching” or sometimes “mapping”). So in this case, we’d need to match the names back to the voter database and get a primary key. The rows here are also in a different order than our master sheet.
Once we’ve matched the names, we then have a RecId column that we can use for the merge. Note that some names, like Ziggy, didn’t match back to anything. Other voters, like George (R7), are missing from the file entirely.
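
A minimal sketch of name matching might look like the following. It assumes an exact match on a normalized name, which is far simpler than what real voter matching needs (production matching typically also uses address, birth date, and fuzzy comparison). The function names and sample rows are hypothetical.

```python
def normalize(name):
    """Lowercase and collapse whitespace so trivial differences don't block a match."""
    return " ".join(name.lower().split())

def match_names(voter_roll, vendor_rows):
    """Attach a RecId to each vendor row whose name matches exactly one voter."""
    index = {}
    for rec_id, name in voter_roll.items():
        index.setdefault(normalize(name), []).append(rec_id)

    matched, unmatched = [], []
    for row in vendor_rows:
        candidates = index.get(normalize(row["Name"]), [])
        if len(candidates) == 1:
            matched.append({**row, "RecId": candidates[0]})
        else:
            unmatched.append(row)  # no match at all, or an ambiguous one
    return matched, unmatched

# Hypothetical data: George (R7) is on the roll but absent from the vendor file,
# and Ziggy is in the vendor file but not on the roll.
voter_roll = {"R1": "Alice Example", "R2": "Bob Example", "R7": "George Example"}
vendor_rows = [
    {"Name": "bob  example", "Fiscal": "Conservative"},
    {"Name": "Ziggy Example", "Fiscal": "Liberal"},
    {"Name": "Alice Example", "Fiscal": "Moderate"},
]
matched, unmatched = match_names(voter_roll, vendor_rows)
print([m["RecId"] for m in matched], [u["Name"] for u in unmatched])
```

Refusing to guess when two voters share a name keeps bad data out of the master sheet; the unmatched rows can be reviewed by hand or matched again with more fields.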


Once we have the different files matched back to the voter database via their primary key (RecId), we can merge them back into a single unified sheet:
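
The merge itself can be sketched as a left join on RecId: every voter in the master sheet is kept, and voters the extra file doesn't cover get a blank. The helper name and sample values below are assumptions for illustration only.

```python
def merge_columns(master, extra, columns):
    """Left-join `extra` onto `master` by RecId; missing values become ''."""
    merged = {}
    for rec_id, row in master.items():
        source = extra.get(rec_id, {})
        merged[rec_id] = {**row, **{c: source.get(c, "") for c in columns}}
    return merged

# Hypothetical master sheet and party model, both keyed by RecId.
master = {
    "R1": {"Name": "Alice Example"},
    "R2": {"Name": "Bob Example"},
    "R7": {"Name": "George Example"},
}
party_model = {
    "R1": {"Party": "GOP"},
    "R2": {"Party": "DEM"},
    "R9": {"Party": "GOP"},  # not in the master sheet, so it is dropped
}
unified = merge_columns(master, party_model, ["Party"])
print(unified["R7"])
```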


[TRC Tip]: With TRC, you can upload these additional columns to your sheet via the DataUploader. You can also share these microtargeting columns with others or buy them from Voter-Science already matched.

Tip: Data is often sparse. It can be difficult and expensive to get meaningful data on every single voter. For example, a vendor may offer to sell you 400 columns worth of data, but you may find that the columns are only 5% filled in.
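
One way to check sparseness before paying for a file is to compute a fill rate per column. The sketch below assumes a blank string means "no data"; the column names and rows are invented for the example.

```python
def fill_rates(rows, columns):
    """Return the fraction of rows with a non-blank value for each column."""
    total = len(rows)
    rates = {}
    for col in columns:
        filled = sum(1 for r in rows if str(r.get(col, "")).strip())
        rates[col] = filled / total if total else 0.0
    return rates

# Hypothetical vendor sample.
rows = [
    {"RecId": "R1", "Party": "GOP", "Hunter": ""},
    {"RecId": "R2", "Party": "", "Hunter": ""},
    {"RecId": "R3", "Party": "DEM", "Hunter": "1"},
    {"RecId": "R4", "Party": "GOP", "Hunter": ""},
]
rates = fill_rates(rows, ["Party", "Hunter"])
print(rates)
```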

Tip: Be aware that data goes stale over time (the rate at which this happens is often called the “decay rate”). Voters move around, so voter rolls get out of date (and you can measure it). Swing voters may switch parties.

Tip: Be careful with email and phone numbers. Campaigns often want email addresses and phone numbers for all voters. These are difficult to obtain accurately, and there are restrictions on how you’re allowed to use them, so be sure you know the law and have a plan.

See also: Pinned vs. Floating values

3. Choose your targets (“build a model”)


Now that you have some data, you need to decide which voters to target. You don’t want to waste dollars attempting to persuade people that will never vote for you. But you need to target enough people to actually win.

There’s tremendous strategy here and it can be an iterative process. For example, in our sheet, we may start by targeting GOP, but find that’s only 3 of the 8 (37.5% of total) and not enough votes to win. Then we may expand the targets to be “GOP or Fiscal conservatives”, in which case we pick up two more targeted voters (R4 and R6). “Targeted voters” becomes yet another column in our sheet, like so:


(We’ve highlighted the cells that matched our targeting criteria.)

Here, a “1” means the voter in that row is targeted. A blank means they’re not.
The act of determining a target list is called “building a model”. We’re building simple models here based on basic field selection criteria. We may bring the names up on a map and target geographically as well. But for larger data sets, there are more advanced mathematical techniques. Analytics is about building better models. And the more data available, the better the model can be.
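
The simple field-selection models described here can be sketched as a predicate applied to every row. In the example below, only the counts follow the article (3 of 8 voters are GOP, and R4 and R6 are added by the fiscal criterion); which specific voters are GOP and all other cell values are made up.

```python
# Hypothetical values; only the counts (3 GOP, plus R4 and R6) follow the article.
sheet = {
    "R1": {"Party": "GOP", "Fiscal": "Conservative"},
    "R2": {"Party": "GOP", "Fiscal": ""},
    "R3": {"Party": "DEM", "Fiscal": "Liberal"},
    "R4": {"Party": "", "Fiscal": "Conservative"},
    "R5": {"Party": "GOP", "Fiscal": "Moderate"},
    "R6": {"Party": "DEM", "Fiscal": "Conservative"},
    "R7": {"Party": "DEM", "Fiscal": ""},
    "R8": {"Party": "", "Fiscal": "Liberal"},
}

def apply_model(sheet, predicate):
    """Write a Target column: '1' when the predicate holds, blank otherwise."""
    for row in sheet.values():
        row["Target"] = "1" if predicate(row) else ""
    return [rec_id for rec_id, row in sheet.items() if row["Target"] == "1"]

# First model: GOP only.
gop_only = apply_model(sheet, lambda r: r["Party"] == "GOP")
share = len(gop_only) / len(sheet)

# Second model: GOP or fiscal conservatives.
expanded = apply_model(
    sheet, lambda r: r["Party"] == "GOP" or r["Fiscal"] == "Conservative"
)
print(share, sorted(set(expanded) - set(gop_only)))
```

Because each model is just a function of a row, iterating on the targeting means swapping the predicate and recomputing the column.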

Tip: You should be measuring the accuracy of your model, and always continue to improve your models.

Tip: Generally, target the swing voters. Your base will already vote for you, although you may wish to still target them to recruit them to your campaign as volunteers or donors.

4. Create your survey

Now come up with the survey questions to ask. An obvious one should be “Do you support my campaign?”, but you can include other questions relevant to your race.
Survey questions are conceptually just more columns in our sheet, but these are editable columns, whereas our previous columns were all read-only. As we canvass, we’re effectively filling in the survey columns.


TRC provides a default set of questions or you can edit them. See here for details.

5. Partition: Assign to workers!

Now that we have some targets, we need to partition the list out to canvassers. Canvassers will go to each target, ask them questions, and spread the word about your candidate.
When you have an entire district of 80k records, partitioning can be a challenge.

TRC lets you partition several ways:

  • Automatically split by precinct or some other column. For example, an average legislative district can be split into about 100 precincts.
  • GeoFence – where you bring up names on a map and can draw boundaries. See here for details.
  • Filter – You can filter on a combination of column selection and geography. See here for details.
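
The first option, splitting on a column, can be sketched as a simple group-by over the targeted rows. This is only an illustration of the idea, not how TRC implements it; the "Precinct" values and the helper name are assumptions.

```python
from collections import defaultdict

def partition_by(sheet, column):
    """Group targeted RecIds by the value of `column` (for example, precinct)."""
    parts = defaultdict(list)
    for rec_id, row in sheet.items():
        if row.get("Target") == "1":
            parts[row[column]].append(rec_id)
    return dict(parts)

# Hypothetical sheet with a precinct column and a Target column.
sheet = {
    "R1": {"Precinct": "P-101", "Target": "1"},
    "R2": {"Precinct": "P-102", "Target": "1"},
    "R3": {"Precinct": "P-101", "Target": ""},
    "R4": {"Precinct": "P-101", "Target": "1"},
}
partitions = partition_by(sheet, "Precinct")
print(partitions)
```

Each resulting list can then be handed to one canvasser or team.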

We refer to the scheme for partitioning a sheet as the “topology”.

Once you’ve partitioned, TRC lets you quickly share the pieces out to your team.

6. Canvass (Collect the data)


Canvassing is going door to door to collect the information. This could also include phone banking, surveys, or any other means of collecting information back from the voters.
Conceptually, this is filling in cells in the survey columns.

A canvassing app will also capture additional columns that track the mechanics of administering the survey, such as:

  • who canvassed it?
  • when did they do it?
  • where were they?
  • what device did they use to canvass?

TRC will automatically collect this for you and use it in reporting. You can then use this “meta” information to evaluate your canvassing operation itself. For example, you can see if the timestamps and location of canvassing match what you’d expect.
If you visualize all that meta-information as yet more columns in your sheet, it might look like this once the surveys are filled out:


That may seem like a lot of information, but that’s incredibly useful for monitoring your field operation and measuring its effectiveness. You can also mine it for anomalies to verify the canvassers hit the area you expected.
TRC uses this information for a variety of reports – see below.
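
As one example of mining this meta-information, the sketch below flags visits recorded outside an expected bounding box or outside expected canvassing hours. The column names ("When", "Lat", "Lon"), the coordinates, and the 9:00 to 21:00 window are all assumptions for the example, not TRC's actual schema or rules.

```python
from datetime import datetime

def find_anomalies(visits, box, start_hour=9, end_hour=21):
    """Flag visits recorded outside the expected area or outside canvassing hours."""
    min_lat, max_lat, min_lon, max_lon = box
    flagged = []
    for v in visits:
        when = datetime.fromisoformat(v["When"])
        in_area = min_lat <= v["Lat"] <= max_lat and min_lon <= v["Lon"] <= max_lon
        in_hours = start_hour <= when.hour < end_hour
        if not (in_area and in_hours):
            flagged.append(v["RecId"])
    return flagged

# Hypothetical turf boundary: (min_lat, max_lat, min_lon, max_lon).
turf = (47.60, 47.70, -122.20, -122.10)
visits = [
    {"RecId": "R1", "When": "2024-10-05T14:30:00", "Lat": 47.65, "Lon": -122.15},
    {"RecId": "R2", "When": "2024-10-05T03:10:00", "Lat": 47.66, "Lon": -122.14},
    {"RecId": "R3", "When": "2024-10-05T15:00:00", "Lat": 47.20, "Lon": -122.15},
]
flagged = find_anomalies(visits, turf)
print(flagged)
```

A flagged row is a prompt to ask questions, not proof of a problem: GPS readings drift, and a canvasser may enter results after leaving the turf.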

7. Reporting + Analysis

As you collect data from the field, you should be analyzing it daily to determine two things:

Is your field operation effective? Look at the reporting. Are you actually contacting the voters or just running a literature-drop operation? Which of your canvassers are the top performers and which need coaching?

TRC generates several reports automatically to help you with this. See TRC reporting for details.
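
To make the contact-versus-literature-drop question concrete, a per-canvasser contact rate can be computed from the survey and meta columns. The sketch assumes a blank survey answer means the voter was not reached; the "Canvasser" and "Supporter" column names and the data are hypothetical, and this is not a description of TRC's built-in reports.

```python
from collections import Counter

def contact_rates(visits):
    """Per canvasser: fraction of door knocks that produced a survey answer."""
    knocks, contacts = Counter(), Counter()
    for v in visits:
        knocks[v["Canvasser"]] += 1
        if v["Supporter"]:  # a non-blank answer means the voter was reached
            contacts[v["Canvasser"]] += 1
    return {name: contacts[name] / knocks[name] for name in knocks}

# Hypothetical canvassing results.
visits = [
    {"Canvasser": "ann", "Supporter": "Yes"},
    {"Canvasser": "ann", "Supporter": "No"},
    {"Canvasser": "ann", "Supporter": ""},
    {"Canvasser": "ann", "Supporter": "Yes"},
    {"Canvasser": "ben", "Supporter": ""},
    {"Canvasser": "ben", "Supporter": ""},
]
rates = contact_rates(visits)
print(rates)
```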

Can you use the data to update your model?
For example, you may notice trends that help you adjust targeting.
Or you may get in new information about who has voted and use that to refine your list.

Tip: A primary election is an excellent source of data because it will provide you with per-precinct results and voter turnout. But this means you need to be very active before the primary so that you have something to measure!

  • Look at the precincts you hit and see where your efforts made a measurable difference in the precinct results.
  • Compare your canvassing to the actual voter turnout. Did the people you talked to have a higher turnout than the people you didn’t?
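
The turnout comparison in the second point can be sketched as two simple rates. The "Contacted" and "Voted" columns and the sample rows are invented for illustration.

```python
def turnout_by_contact(rows):
    """Return (turnout among contacted voters, turnout among everyone else)."""
    def rate(group):
        return sum(r["Voted"] for r in group) / len(group) if group else None

    contacted = [r for r in rows if r["Contacted"]]
    others = [r for r in rows if not r["Contacted"]]
    return rate(contacted), rate(others)

# Hypothetical post-election data for 8 voters.
rows = [
    {"RecId": "R1", "Contacted": True, "Voted": True},
    {"RecId": "R2", "Contacted": True, "Voted": True},
    {"RecId": "R3", "Contacted": False, "Voted": True},
    {"RecId": "R4", "Contacted": True, "Voted": False},
    {"RecId": "R5", "Contacted": True, "Voted": True},
    {"RecId": "R6", "Contacted": False, "Voted": False},
    {"RecId": "R7", "Contacted": False, "Voted": True},
    {"RecId": "R8", "Contacted": False, "Voted": False},
]
contacted_rate, other_rate = turnout_by_contact(rows)
print(contacted_rate, other_rate)
```

A gap between the two rates is suggestive rather than conclusive, since the voters you chose to contact may already have differed from the rest (for example, in their voting history).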

Update your model and repeat!

Given the new data that you’ve collected, go back and update your model!
After all, those survey columns can also be used as data columns. This means data is also flowing in a cycle. Targets drive canvassing, and then canvassing results along with microtargeting influence your targets.

