
Operator-Assisted Tabulation of Optical Scan Ballots

Kai Wang∗ Eric Kim† Nicholas Carlini† Ivan Motyashov† Daniel Nguyen†

David Wagner†

∗University of California, San Diego †University of California, Berkeley

Abstract

We present OpenCount: a system that tabulates scanned ballots from an election by combining computer vision algorithms with focused operator assistance. OpenCount is designed to support risk-limiting audits and to be scalable to large elections, robust to conditions encountered using typical scanner hardware, and general to a wide class of ballot types—all without the need for integration with any vendor systems. To achieve these goals, we introduce a novel operator-in-the-loop computer vision pipeline for automatically processing scanned ballots while allowing the operator to intervene in a simple, intuitive manner. We evaluate our system on data collected from five risk-limiting audit pilots conducted in California in 2011.

1 Introduction

In this paper, we develop techniques to count optical scan ballots, based only upon scanned images of those ballots.

There are several reasons why it might be useful to be able to count the ballots using a system that is independent of the official, certified voting system. Some have suggested that, to gain confidence in the election results, all ballots should be scanned and the images should be published, so that anyone interested can count the ballots on their own [7] [16, § 3.3]. We build a tool that can be used to perform the counting process, and thus could be used for this purpose. Alternatively, our tool could be used by election officials to check the accuracy of official vote tallies before they are certified.

Perhaps most importantly, our tool can play an important role in supporting risk-limiting audits of elections [12]. Risk-limiting audits require the ability to export cast vote records (CVRs) and vote totals from the voting system, separated by precinct and other criteria. Unfortunately, many current voting systems cannot report vote tallies in this fashion, and provide no way to export cast vote records [20], posing a barrier to adoption of risk-limiting audits. Our tool provides a way to generate CVRs and makes it easy to generate vote tallies for arbitrary batch sizes, enabling risk-limiting audits to be more efficient. Thus, our tool eliminates a key barrier to adoption of risk-limiting auditing [6, § 4].

Finally, our tool can support ballot-level audits [4, 3]. For example, if the order of ballots is maintained after scanning, our tool enables ballot-level audits using the order of ballots [3, § 2]. Thus, our work eliminates a barrier to ballot-level auditing that was identified in prior work [19, § 6]. See Section 2.4 for further discussion.

We are not the first to articulate this vision. The Humboldt Election Transparency Project initially proposed parallel tabulation [7], and in support of this goal, the ground-breaking TEVS system [21] (the successor to Ballot Browser) was built to automate the process of tabulating an election from scanned ballot images. Our work was motivated by an attempt to use TEVS during several risk-limiting audit pilots in California in 2011. We found that at this point in its development, TEVS required some adaptations to the code and/or (due to its reliance upon OCR) manual post-processing of its output for each election [17, 18]. We design techniques to address these problems and build a tool, OpenCount, that provides software support for risk-limiting audits. Our early prototype was used in California pilots in 2011, and we plan to use the refined tool to assist with risk-limiting audits in 6 more counties in 2012.

This paper makes the following contributions:

• We design techniques to recognize and tabulate votes, given only scanned ballot images. Our methods require no support from the official voting system and do not rely upon election definition files.

• We implement these techniques and show that they scale to large elections and are robust enough to handle the cases that arise in practice.


• We demonstrate our techniques and our tool on five elections held in five California counties in 2011 and show that OpenCount is at least as accurate as currently deployed voting systems for this dataset.

2 Problem Statement

2.1 Terminology

We define some terms we will use throughout the paper. A blank ballot is a paper ballot as it was originally printed, with no voter marks on it. A voted ballot is the ballot after a voter has marked his/her ballot and cast it.

Each ballot contains a set of contests. A contest includes a list of candidates, with one voting target per candidate. A voting target is an empty oval, broken arrow, or other location on the ballot where the voter should mark her ballot, if she wants to indicate a vote for the associated candidate. A cast vote record (CVR) is a record of all selections made by the voter on a single voted ballot.

The ballot style is the set of contests found on the ballot as well as the visual organization of these contests on the ballot. For example, an English-language ballot may have the same set of contests as a Spanish-language ballot, but because their text is different, we consider them as two different ballot styles. Ballots may also contain a precinct number or a tally group for accumulation (e.g., absentee vs. polling-place). We do not distinguish between blank ballots whose content is visually identical.

The grouping patch is a region on the ballot that uniquely determines the ballot style and blank ballot. For example, the grouping patch might be a location on the ballot where the precinct number is printed. In some elections, we use multiple grouping patches: for example, one grouping patch might encode the precinct, another might encode the language, and (in a primary election) a third might encode the party affiliation.

2.2 Goal

Our goal is to count a set of paper ballots, given just two kinds of information: scans of all of the voted ballots, and scans of blank ballots. In particular, we use only the human-readable information that is found on the face of the ballots—the same information that voters see. We want to compute vote tallies and CVRs, given this set of scans.

Deployed voting systems typically use election definition files, generated by the ballot layout definition system, to determine where to look for voter marks and associate them with particular candidates. We deliberately avoid relying upon election definition files, both because the election definition file can be faulty (causing deployed systems to mis-count votes) and because we want to independently count the ballots without any dependencies on other voting software.

We assume that someone has scanned all of the voted ballots, and has scanned one instance of each visually-unique blank ballot, using a standard document scanner. The logistical details of the scanning process are out of scope for this paper.

2.3 Technical Challenges

There are several non-trivial challenges in building a system to count the ballots. First, our system must be scalable: it must be able to process all of the data associated with large elections, in a reasonable amount of time (hours or days, not weeks). This imposes severe restrictions on the kinds of algorithms that we can use: a system that takes 1 second of computation per ballot may be acceptable, but one that takes 100 seconds per ballot would be unworkable (for an election with 100,000 ballots, processing would take about 16 weeks).

Second, and closely related, the system must be robust. We have found that, when dealing with large numbers of ballots, unusual cases are common. There will always be some ballots that are scanned imperfectly – effects from transformation (rotation, skew), distortion (scanner noise, specks of dust), and illumination differences may be prominent. An extreme (yet observed) case includes the physical destruction of the paper ballot itself (tears, creases, and other damage). Our system must be able to handle all of these cases gracefully. A method that works on 99.9% of ballots is too fragile; in an election with 100,000 ballots, such a method would fail on 100 ballots. The robustness requirements rule out many techniques that may at first glance appear promising.

Third, the system should be general: it should not rely upon hard-coded assumptions that are specific to ballots from a particular voting vendor. For instance, it should not rely upon decoding barcodes or timing marks that may be found on the ballot, as those are vendor-specific. We do not want the system to be limited to handling ballots from a particular set of voting system vendors.

Lastly, the system must be self-contained: it must not require any additional data sources beyond the scanned images. This poses algorithmic challenges. For instance, when processing a voted ballot, we need to identify the corresponding blank ballot, so that we can identify the set of contests on the ballot, find the location of all voting targets, and then check for marks at those locations. However, when an election might have 100 or 1000 different types of blank ballots, it is not clear how to quickly map each voted ballot to its corresponding blank ballot, within the time budget. We devise algorithms to solve this problem efficiently.


2.4 Applications to Auditing

OpenCount could be used to support election auditing in three ways, elaborated below.

Parallel audits. If election officials scanned all paper ballots and made the scans available to the public, OpenCount could be used to count those scanned ballots. Because OpenCount is open source, any candidate, observer, or interested member of the public could use OpenCount to interpret and count the ballots. However, this procedure assumes that OpenCount is correct and that the published scans accurately reflect the paper ballots actually cast in the election [16], which limits the level of confidence attainable through such an approach.

Batch-level risk-limiting audits. OpenCount can be used to support risk-limiting audits, performed at a batch granularity. Generally speaking, the smaller the batch, the more efficient the audit is, so there are good reasons to want to use batches that are as small as possible. The audit procedure needs vote tallies for each batch. Unfortunately, many currently deployed voting systems cannot produce vote tallies for batches that are smaller than a precinct. This is a key barrier to broader use of small-batch risk-limiting audits [6, § 4].

OpenCount meets this need: it supports counting arbitrarily-defined batches. Batches can be defined by any combination of attributes printed on the ballot (e.g., precinct number, mode, ballot type, party affiliation, language), as well as any structure inherent in how the ballots are scanned (e.g., if ballots are scanned in batches, OpenCount can produce corresponding batch-level tallies). Thus, OpenCount satisfies a prerequisite for adoption of small-batch risk-limiting audits. Election officials can scan their ballots in batches using an ordinary document scanner, process the ballots using OpenCount to obtain batch-level tallies, and then verify those tallies using a batch-level risk-limiting audit.

OpenCount also solves a related challenge associated with auditing of absentee ballots. Some jurisdictions do not sort their absentee ballots by precinct; instead, they scan them in batches. The natural way to audit such ballots is using those same batches. Unfortunately, many deployed voting systems cannot export vote tallies for each batch, which makes it difficult to audit the absentee ballots in this situation. OpenCount solves this problem.

Ballot-level risk-limiting audits. Ballot-level audits have the potential to be especially efficient, because the batch size is a single ballot. However, most deployed voting systems do not provide the information that is needed to perform ballot-level audits [3, § 2].

Ballot-level audits require a cast vote record (CVR) for each ballot and some way to uniquely associate each paper ballot to its CVR. Most deployed voting systems cannot provide this information. OpenCount meets this need. Election officials can scan the ballots, maintain them in the order they were scanned, and process the scanned images using OpenCount. OpenCount can produce a CVR for each ballot, in the same order as the paper ballots. The ordering of the ballots provides a way to link each paper ballot to its CVR. Then, given these CVRs, one could apply SOBA [3] or another ballot-level risk-limiting audit method.

Alternatively, if ballots are stamped with a unique number while they are scanned (a feature that is supported by many commercial scanners), OpenCount can be used to enable ballot-level auditing without any support from or changes to the official certified voting system [4, 3].

Note that using OpenCount to facilitate risk-limiting audits does not require trusting OpenCount; the audit process verifies the accuracy of OpenCount’s results. This is a transitive audit [12]: to address limitations in the official voting system, we count the ballots a second time using OpenCount, check that OpenCount reports the same winner as the official voting system, and then use a risk-limiting audit to verify the accuracy of OpenCount’s results. The benefit of using OpenCount is that it can produce the information needed by risk-limiting audit procedures—something that is beyond the capabilities of many currently deployed voting systems.

3 Overview of Approach

3.1 Principles

Our design is influenced by several principles. First and foremost, OpenCount uses a hybrid of computer vision algorithms together with human assistance. Neither alone is sufficient; human classification does not scale, but computer vision cannot ensure that ballots are interpreted as a voter or election official would. Thus, we use vision-based algorithms for scalability, and we rely upon assistance from a human operator to guide the algorithms in edge cases or situations where voter intent is ambiguous. The technical challenge is to identify how to direct the computation so the operator’s workload is minimized. In short, OpenCount can be thought of as a hybrid man/machine system; one of the novel contributions of OpenCount lies in the specific details of which tasks are performed by algorithms and which are performed by the operator.

Second, to ensure accuracy, OpenCount emphasizes verification. Because computer vision algorithms can make mistakes, we ensure that every computation performed has a simple operator-assisted verification step.


[Figure 1 diagram: the OpenCount pipeline. Blank-ballot steps: Preprocessing, Target+contest detection, Contest data entry, Grouping patch selection. Voted-ballot steps: Preprocessing, Ballot grouping, Grouping verification, Target extraction, Target verification.]

Figure 1: The OpenCount pipeline. Blank ballots are shown in black, and voted ballots in blue. Steps requiring operator assistance are labeled with the person icon; all others are automated. The steps shown in grey are only required when the election has multiple blank ballots. Section 3.2 describes each step.

3.2 Phases of computation

We now provide a brief overview of each phase of OpenCount. Figure 1 shows a diagram of the major steps in our system. Our pipeline is split up into two phases: blank ballot processing and voted ballot processing.

3.2.1 Blank ballot processing

The first phase involves interleaving computer vision and operator assistance in order to annotate the blank ballot images.

Preprocessing. The preprocessing stage transforms the scanned images (both the blank ballots and the voted ballots) into a normalized state which makes them easier to manipulate during the pipeline. In particular, preprocessing first straightens the images, and then resizes all images. These two steps simplify further computation.

Target and contest detection. In this step, the operator identifies the voting targets: the operator draws a bounding box around an example of a voting target, and OpenCount uses this to automatically find other voting targets. The operator can inspect the result and adjust the automatic results if necessary. Then, the operator collaborates with OpenCount to cluster the targets by contest.

Contest data entry. Next, the operator enters the title and candidate list for each contest. The purpose of this step is twofold: first, it allows OpenCount to correctly link all appearances of a contest on multiple ballots (e.g., the President contest may appear on many different ballot styles); and second, it allows the result of the election to be accumulated in a human-readable format. We do not rely on Optical Character Recognition (OCR); it is not reliable enough for our purposes.

Grouping patch selection. If the election has more than one blank ballot, the operator draws a bounding box around the grouping patch for each blank ballot. If the election has just a single blank ballot, this step is skipped. The grouping patch is a part of the scanned ballot that allows us to distinguish between blank ballots, based solely upon the contents of this part of the ballot. We assume elections have some grouping patch which can be used for this purpose.

3.2.2 Voted ballot processing

After the blank ballots have been annotated for an election, we perform processing on voted ballots for tabulation.

Ballot grouping. “Grouping” is the process of associating each voted ballot with its corresponding blank ballot. OpenCount examines the grouping patch of each voted ballot to find its corresponding blank ballot. This allows OpenCount to accurately locate all voting targets on each voted ballot.

Grouping verification. The operator then verifies that the grouping was accurate. We describe our methods for doing this efficiently in Section 4.6. Any mistakes found during verification can be corrected.

Target extraction. After all ballots have been grouped, we extract the voting targets from each voted ballot. For each voting target on its corresponding blank ballot, we extract the region at the same location on the voted ballot: this is the voting target as it appears on the voted ballot. The result is a collection of images of the voting targets (whether marked or not) on each voted ballot.
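This extraction step amounts to cropping the voted-ballot image at the blank ballot's target coordinates. A minimal sketch, assuming the voted ballot has already been registered to its blank ballot and boxes are given as (x, y, w, h) in pixels (the function name is ours, not OpenCount's):

```python
import numpy as np

def extract_targets(voted_img, target_boxes):
    """Crop the region at each blank-ballot target location (x, y, w, h)
    from the aligned voted-ballot image."""
    return [voted_img[y:y + h, x:x + w].copy() for (x, y, w, h) in target_boxes]
```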

Target classification and verification. Next, OpenCount helps the operator classify each extracted voting target image as either marked or unmarked. OpenCount displays these images, sorted by average pixel intensity and organized in a grid. The operator sets a threshold cutoff value that is used to separate marked and unmarked targets. The operator then corrects any classification errors that may remain. The operator can also view the entire ballot associated with a single target if more context is needed in order to determine the intent of a mark.

(a) First pass (b) Second pass

Figure 2: Preprocessing detects and corrects for rotation by detecting long lines. We show the lines discovered by the first pass and second pass of our method.
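The core of this classification step, ranking extracted target images by mean intensity and splitting at an operator-chosen threshold, can be sketched as follows. This is a simplified illustration; the function name is ours, and real scans would need normalization for illumination differences:

```python
import numpy as np

def classify_targets(target_imgs, threshold):
    """Sort extracted voting-target images by average pixel intensity and
    split marked from unmarked at an operator-chosen threshold.

    Dark ink lowers mean intensity, so targets below the threshold are
    treated as marked. Returns (order, marked) where order lists target
    indices from darkest to lightest."""
    intensities = np.array([img.mean() for img in target_imgs])
    order = np.argsort(intensities).tolist()   # darkest (most-marked) first
    marked = (intensities < threshold).tolist()
    return order, marked
```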

Postprocessing. Finally, OpenCount collects the target classifications and generates a CVR for each voted ballot.

4 Algorithms

4.1 Preprocessing

During the ballot scanning process, images typically experience small, random amounts of rotation and translation (i.e., a shift in some direction) between each scan. To better manage this variability, we perform preprocessing on all ballots to undo rotation at a coarse level.

In order to detect rotation, we use a linear Hough transform that is selective for near-vertical and near-horizontal lines, defined as being at some relatively small angle θ (θ ≤ 4◦) to either the vertical or the horizontal axis. We operate under the assumption that each ballot type will contain 2 or more relatively long vertical or horizontal lines. The parameters to the Hough transform are selected to detect lines whose length is at least 4/5 of the ballot width; they are adjusted dynamically if no lines are found or if too many lines are found. The rotation detection occurs in two passes.

First pass. We first run a rough Hough transform, sensitive to 0.1◦ for the entire range of possibilities (e.g., 4◦ from the horizontal or the vertical in either direction). For each discovered line, we calculate its angle of rotation from the vertical axis. To filter out possible outliers, we take a trimmed mean of the discovered angles as our first-order approximation of the rotation angle.

Figure 3: After the operator selected one voting target on this blank ballot from Merced County, OpenCount automatically detected the location of most of the voting targets. The green bounding-box is the exemplar target—the rest were auto-detected by OpenCount. Note the two undetected targets: the operator will have to identify another exemplar target to help OpenCount detect the rest.

Second pass. To improve precision, we run another Hough transform, this time over a narrower range of possible angles (0.1◦ in either direction from the estimate obtained in the first pass), sensitive to 0.01◦. To estimate the overall rotation angle, we take the median of the resulting values. Figure 2 displays the lines found by both passes on a sample ballot.
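The per-pass aggregation described above (a trimmed mean in the first pass, then a median over a narrow window in the second) can be sketched as follows. This sketch takes detected line segments as input; in practice the segments would come from a Hough transform such as OpenCV's HoughLinesP, which we omit here. All function names and parameter defaults are ours, not OpenCount's:

```python
import math

def axis_deviation(x1, y1, x2, y2):
    """Angle of a segment measured from the nearest axis, in degrees, in [-45, 45)."""
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
    return ((angle + 45.0) % 90.0) - 45.0

def estimate_rotation(lines, max_dev=4.0, trim=0.2, window=0.1):
    """Two-pass rotation estimate from line segments (x1, y1, x2, y2).

    Pass 1: trimmed mean of the per-line deviations (outlier-resistant).
    Pass 2: median of the deviations within `window` degrees of the
    pass-1 estimate (mimicking the narrower second Hough pass)."""
    devs = sorted(d for d in (axis_deviation(*l) for l in lines) if abs(d) <= max_dev)
    if not devs:
        return 0.0
    k = int(len(devs) * trim)
    trimmed = devs[k:len(devs) - k] or devs
    coarse = sum(trimmed) / len(trimmed)          # first pass
    near = sorted(d for d in devs if abs(d - coarse) <= window)
    return near[len(near) // 2] if near else coarse  # second pass
```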

4.2 Voting target detection

OpenCount needs to identify the locations of all voting targets on all ballots. This task is nontrivial. Different ballot vendors deploy widely-varying ballot styles, each with their own style of voting target. In addition, while empty voting targets (with no voting marks) all look identical, marked voting targets exhibit broad variation, as different voters may mark their ballot differently. This makes it difficult to identify the location of voting targets from the voted ballots.


Our solution is to find the location of all voting targets by analyzing the blank ballots. On the blank ballots, the voting targets are uniformly empty (unmarked) and look visually identical, so it is easy to recognize all targets if we are given information about what an empty voting target looks like. We ask the operator to identify an example of an empty voting target by drawing a bounding box around it. OpenCount then automatically identifies other targets using template matching, a process for finding other places where a template image appears in a second image. The operator can inspect the results; if this automatic process has missed any voting targets, the operator can indicate them and OpenCount will use template matching to find additional instances it missed in the first round, repeating until all targets have been found. Figure 3 shows the result after OpenCount has located voting targets using one round of template matching.

Template matching takes as input two images: an image I, and a template image T. A search is performed on I in order to try to identify possible locations where T might appear in I by comparing all T-sized patches in I to T with some comparison metric. OpenCount currently uses the Normalized Correlation Coefficient (NCC) [11] as the metric.

Due to differences in image conditions across ballot scans (e.g., scanner noise, skew), template matching may not detect all voting targets. If this happens, the operator can draw a second bounding box around another voting target. This allows OpenCount to template match on that voting target as well. If template matching produces false matches, as a last resort the operator can move, resize, or delete targets. In our experience, typically it suffices to identify one or two examples of voting targets. The operator must then review the detected target locations on all blank ballots, but this can usually be done rapidly.
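NCC-based template matching can be written directly in a few lines of numpy; the brute-force sketch below is for illustration only (a production system would use an optimized routine, e.g. OpenCV's matchTemplate with the TM_CCOEFF_NORMED method). Names and the threshold default are ours:

```python
import numpy as np

def ncc_map(image, template):
    """Normalized correlation coefficient of the template against every
    valid template-sized patch of the image (brute force)."""
    th, tw = template.shape
    t = template - template.mean()
    tnorm = np.sqrt((t * t).sum())
    H, W = image.shape
    out = np.zeros((H - th + 1, W - tw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            p = image[y:y + th, x:x + tw]
            p = p - p.mean()
            denom = np.sqrt((p * p).sum()) * tnorm
            out[y, x] = (p * t).sum() / denom if denom > 0 else 0.0
    return out

def find_matches(image, template, threshold=0.9):
    """Top-left corners (y, x) of patches whose NCC score meets the threshold."""
    scores = ncc_map(image, template)
    return [(int(y), int(x)) for y, x in zip(*np.where(scores >= threshold))]
```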

4.3 Clustering targets into contests

Next, OpenCount attempts to automatically identify which voting targets are part of the same contest, using heuristics based upon the spacing and layout of the targets. The operator can inspect this clustering and adjust it manually if needed.

The intuition behind our algorithm is simple: on any given ballot, within a contest C, there typically is a fixed distance D between all voting targets within C. We use the smallest distance D′ between any two targets as an estimate for D. Then, if two voting targets T1 and T2 are within (1 + ε) · D′ pixels of each other (where ε is a small error factor), we merge T1 and T2 into the same contest. All distances are measured using the Euclidean distance.
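A minimal sketch of this merging rule, using union-find over target centers. The names and the default ε are ours; the paper's heuristics also consider layout, which this sketch omits:

```python
import math

def cluster_targets(targets, eps=0.25):
    """Group voting-target centers (x, y) into contests: estimate D from the
    smallest pairwise distance, then merge any pair within (1 + eps) * D."""
    n = len(targets)
    if n < 2:
        return [list(range(n))]
    def dist(a, b):
        return math.dist(targets[a], targets[b])
    d_min = min(dist(i, j) for i in range(n) for j in range(i + 1, n))
    limit = (1 + eps) * d_min
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    for i in range(n):
        for j in range(i + 1, n):
            if dist(i, j) <= limit:
                parent[find(i)] = find(j)  # union the two clusters
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```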

Figure 4: The user interface for entering contest titles and candidate names.

4.4 Labeling contests

Next, OpenCount asks the operator to label each contest with its title (e.g., “President”), as well as the names of all candidates running in each contest (e.g., “Barack Obama”, “John McCain”, etc.). The operator manually enters this information for each contest on each different kind of blank ballot. If the same contest appears on many ballots, the operator only needs to enter the text once; thereafter, the operator can select the contest from a drop-down of previously entered contests.

While this may seem like a scenario in which Optical Character Recognition (OCR) is easily applicable, we have found that open-source OCR libraries make too many errors and are too unreliable for this purpose. In addition, OCR is a “black box”, so if it does not work well for some inputs, one has limited recourse, which makes systems that rely upon OCR more fragile. This makes OCR an unappealing choice for our purposes.

The user interface is shown in Figure 4. This screen includes a picture of the entire ballot (with color highlights used to indicate which contest the operator is currently working on and which contests have been completed) and an enlarged display of the current contest, which can be panned or zoomed by the operator if needed. OpenCount uses several heuristics to identify which part of the ballot to display enlarged; however, because the operator can pan this image, the heuristics do not need to be perfect.

4.5 Grouping

OpenCount automatically finds, for each voted ballot, its corresponding blank ballot by checking which one has a matching grouping patch.

The grouping computation assumes that ballots con-tain an image region, called the grouping patch, that al-lows for a unique mapping from a voted ballot to a blank

6

Page 7: Operator-Assisted Tabulation of Optical Scan Ballots...Operator-Assisted Tabulation of Optical Scan Ballots Kai Wang Eric Kim† Nicholas Carlini† Ivan Motyashov† Daniel Nguyen†

(a) San Luis Obispo County (b) Alameda County

Figure 5: Examples of grouping patches shown in bluefor two types of ballots. For ballot (a), we use the timingmarks as the grouping patch. For ballot (b), we used aprecinct number and some text (which uniquely identi-fies the language) as the grouping patch.

ballot. This might be a portion of the timing marks (see Figure 5(a)), or it might be a region that captures the precinct number and ballot language (see Figure 5(b); the language was crucial in this example, because the voting targets are at different locations on different-language ballots). We utilize the NCC-based template matching technique, along with a direct pixel-based registration algorithm, to find the blank ballot whose grouping patch best matches that of the voted ballot. For each voted ballot, the algorithm simply scores how well the patch from the voted ballot matches the patch from each possible blank ballot; the closest match is assumed to be correct.

Direct pixel-based registration. A key challenge is that the voted ballot might not be perfectly aligned to the blank ballot, so we must correct for this. We model transformations between a voted ballot and a blank ballot using a rigid model. Motion between a point x and x′ is governed by the equation x′ = Rθ x + t, where Rθ is a rotation matrix, parameterized by the angle of rotation θ, and t is a translation vector. This model allows for translation and rotation between the two images, but does not allow for other changes, such as scale or skew. Though other models have the power to capture broader families of transformation, the rigid model has fewer degrees of freedom, making it easier to estimate. Furthermore, the rigid model is sufficient to capture the variation typically introduced during the scanning process. We adopt the Lucas-Kanade [14] algorithm, which finds the parameters θ, t that minimize the least squared error between two images:

squared error = Σ_x [I(W(x; θ, t)) − T(x)]²

Algorithm 1 Pyramid ballot grouping
b: Voted ballot image
G: Set of grouping patches from blank ballots
mx: Maximum scale

 1: procedure BALLOTGROUPING(b, G, mx)
 2:   p ← SUPERREGION(b, G)
 3:   s ← INITMINSCALE(G)
 4:   step ← (mx − s)/log2 |G|
 5:   while |G| > 1 do
 6:     for i = 1, 2, . . . , |G| do
 7:       ri ← NCC+LK(p, Gi, s)
 8:     end for
 9:     Sort r, to get π such that rπ(1) ≤ · · · ≤ rπ(|G|)
10:     G ← {Gπ(1), Gπ(2), . . . , Gπ(|G|/2)}
11:     s ← s + step
12:   end while
13: end procedure

The function W(x; θ, t) represents the location of pixel x after being warped by the transformation with parameters θ and t, which capture rotation and translation. We refer the reader to Lucas, Baker, et al. [14, 2] for a detailed review of this class of algorithms.
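For concreteness, the rigid model and its least-squares registration error can be sketched in NumPy. This is an illustrative sketch, not OpenCount's implementation: the function names are hypothetical, and nearest-neighbour sampling stands in for proper interpolation.

```python
import numpy as np

def rigid_warp_coords(shape, theta, t):
    """Map each pixel coordinate (x, y) to x' = R_theta x + t (rigid model)."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    c, s = np.cos(theta), np.sin(theta)
    xw = c * xs - s * ys + t[0]
    yw = s * xs + c * ys + t[1]
    return xw, yw

def registration_error(I, T, theta, t):
    """Sum of squared differences between the warped image I and template T."""
    xw, yw = rigid_warp_coords(T.shape, theta, t)
    # Nearest-neighbour sampling; out-of-bounds coordinates are clamped.
    xi = np.clip(np.round(xw).astype(int), 0, I.shape[1] - 1)
    yi = np.clip(np.round(yw).astype(int), 0, I.shape[0] - 1)
    return float(np.sum((I[yi, xi] - T) ** 2))
```

Lucas-Kanade then performs gradient descent over (θ, t) to minimize this error; with θ = 0 and t = (0, 0) the error of an image against itself is exactly zero.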

NCC+LK. To compute the similarity between a ballot and a particular grouping patch, we pair the NCC template-matching algorithm with Lucas-Kanade. This is done for two reasons: (1) variation in the scanning may affect the exact location of the grouping patch, and (2) Lucas-Kanade is a gradient-descent-type algorithm and is sensitive to initialization.

To score how well a grouping patch matches a blank ballot, we first perform NCC using the patch, then run Lucas-Kanade on the best matching location. The score output by NCC+LK is the least-squares registration error from Lucas-Kanade.
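A minimal sketch of the NCC+LK scoring idea, assuming grayscale NumPy arrays. The exhaustive NCC search below is illustrative only, and `refine` is a hypothetical interface standing in for the Lucas-Kanade refinement step:

```python
import numpy as np

def ncc_best_location(region, patch):
    """Exhaustive normalized cross-correlation; returns (row, col) of best match."""
    ph, pw = patch.shape
    p = (patch - patch.mean()) / (patch.std() + 1e-9)
    best, best_rc = -np.inf, (0, 0)
    for r in range(region.shape[0] - ph + 1):
        for c in range(region.shape[1] - pw + 1):
            win = region[r:r + ph, c:c + pw]
            w = (win - win.mean()) / (win.std() + 1e-9)
            score = float(np.mean(p * w))
            if score > best:
                best, best_rc = score, (r, c)
    return best_rc

def ncc_lk_score(region, patch, refine):
    """NCC proposes an initial location; `refine` (e.g. Lucas-Kanade)
    polishes it and returns the final least-squares registration error."""
    r, c = ncc_best_location(region, patch)
    window = region[r:r + patch.shape[0], c:c + patch.shape[1]]
    return refine(window, patch)
```

In practice one would use an FFT-based NCC [11] rather than this quadratic loop; the point is only the division of labor: NCC supplies the coarse location, Lucas-Kanade supplies the final registration error used as the score.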

The pyramid optimization. We have developed an optimization that significantly speeds up this computation, by first performing the comparison on smaller, downsampled (lower-resolution) versions of the images to quickly prune away poor matches.

In computing the NCC+LK, we only consider the region in the ballot around the union of the grouping patch regions. We define SUPERREGION(b, G) to compute the union of the bounding boxes of G, expanded, and cropped from ballot b. We adopt a pyramid scheme to efficiently find the closest match in G. The pyramid scheme initially applies NCC+LK to smaller-scale versions of the patches, then throws away the worst matches and repeats the process on the surviving pairs at a higher resolution.

Let the initial set of grouping patches be G. Let p ← SUPERREGION(b, G). We downsample p and the


Algorithm 2 Starting scale computation
T: Set of blank ballot images
G: Set of grouping patches from T
mn: Minimum scale
mx: Maximum scale
d: Scale step

 1: procedure INITMINSCALE(T, G, mn, mx, d)
 2:   for i = 1, 2, . . . , |G| do
 3:     G′ ← G \ {Gi}
 4:     g ← SUPERREGION(Ti, G′)
 5:     s ← mx
 6:     for j = 1, 2, . . . , |G′| do
 7:       rj ← NCC+LK(g, G′j, s)
 8:     end for
 9:     Let x be the index that minimizes rx
10:     repeat
11:       s ← s − d
12:       for j = 1, 2, . . . , |G′| do
13:         rj ← NCC+LK(g, G′j, s)
14:       end for
15:       Sort r, so rπ(1) ≤ · · · ≤ rπ(|G′|)
16:     until π⁻¹(x) ≥ |G′|/2 or s ≤ mn
17:     yi ← s
18:   end for
19:   return MAX(y)
20: end procedure

patches in G to an initial minimum scale s. Next we measure the NCC+LK response of each patch in G to p and prune away the worst scoring half, retaining the remaining set. Then we increment the scale s and repeat the matching and pruning process on the pruned set G. This continues until only a single element remains in G. The algorithm is summarized in Algorithm 1. Gi represents the ith element of G.
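The pruning loop of Algorithm 1 can be sketched in Python as follows. This is illustrative only: `score` is a stand-in for the NCC+LK registration error, and the crude striding in `downsample` is a placeholder for proper image resampling.

```python
import numpy as np

def downsample(img, scale):
    """Crude box subsampling to roughly a fraction `scale` of original size."""
    step = max(1, int(round(1.0 / scale)))
    return img[::step, ::step]

def pyramid_group(ballot_patch, blank_patches, score, min_scale=0.25, max_scale=1.0):
    """Score all candidates at a coarse scale, keep the better half,
    raise the scale, and repeat until one candidate survives."""
    candidates = list(range(len(blank_patches)))
    steps = max(1, int(np.ceil(np.log2(len(blank_patches)))))
    scale = min_scale
    step = (max_scale - min_scale) / steps
    while len(candidates) > 1:
        scores = [(score(downsample(ballot_patch, scale),
                         downsample(blank_patches[i], scale)), i)
                  for i in candidates]
        scores.sort()                     # lower registration error = better
        candidates = [i for _, i in scores[:max(1, len(scores) // 2)]]
        scale = min(max_scale, scale + step)
    return candidates[0]
```

Because the candidate set halves at every scale increase, most comparisons happen on the cheapest, smallest images.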

This approach depends on knowing the smallest scale at which to begin the pyramid search. Given the blank ballots T and patches G, we compute the smallest scale that allows for discrimination between grouping patches. The intuition behind our approach is that for any given blank ballot, we find the most similar grouping patch at the full scale, then continue reducing the scale until that patch is no longer among the top half of closest matches to that blank ballot.

In more detail, consider a blank ballot Ti. We crop out the patch from Ti, call it g. We compare g to the patches G′ from all other blank ballots, computing the NCC+LK response for each at full scale. Suppose g∗ ∈ G′ is the closest match to g at full scale. Intuitively, this means that g∗ is the patch that is hardest to distinguish from g. We repeatedly reduce the scale and recompute the NCC+LK response at each scale until the patch g∗ falls

(a) Cluster needing split (b) New cluster after split

Figure 6: A screenshot of the grouping verification UI. The figure on the left (a) shows a cluster overlay that contains at least one erroneously grouped ballot; the operator should click split to separate these ballots into two or more sub-clusters. The figure on the right (b) shows the resulting sub-cluster. One can readily verify that all of the ballots in the new sub-cluster are correctly grouped.

outside the top half of responses. This algorithm is summarized in Algorithm 2. We track the order of similarity of the elements of G to be used later in grouping verification.

In our pyramid approach, the most NCC+LK comparisons are performed at the smallest scale, with the number of comparisons halved at every scale increase. The performance of both NCC and Lucas-Kanade is highly dependent on the scale of the image patches: they are much more efficient at small scales. With this approach, we can prune away the most dissimilar patches at lower resolutions while only comparing the most similar elements at the highest resolutions. In the situation where we have a large number of blank ballots, this results in significant computational gains. In practice this procedure is run twice, once for the original ballot image and once for a version rotated by 180°, to handle flipped ballots.

4.6 Grouping verification

Following the automatic grouping, the operator is presented with a simple interface to correct and verify the resulting mapping from voted ballots to blank ballots. We refer to the set of ballots matched to the same blank ballot as a ballot cluster. The operator views a summary image of all ballots in a ballot cluster, one cluster at a time (see Figure 6). For each cluster, the operator can (a) accept the cluster as accurately grouped, (b) indicate that the cluster contains voted ballots from multiple different underlying blank ballots, erroneously merged, or (c) correct errors by changing the blank ballot that all of the ballots in the cluster are associated with.

Verification. We visually summarize all of the ballots in the cluster using an image overlay [5]. An overlay is a


Inputs: ballot (left) and template (right)

Global alignment Isolate region Local alignment

Figure 7: The target extraction procedure. The voted ballot is shown in blue and the blank ballot is in black. (1) First, we align the ballot to the blank ballot, globally. Since the transformation introduced by scanning is not perfectly uniform across a ballot, we (2) isolate a region around each voting target and (3) perform alignment once more, locally. Finally, we extract the bounding box corresponding to the location of the target on the blank ballot.

concise way of summarizing a large collection of images that are allegedly nearly identical. We show both a min overlay, which intuitively contains the union of the black pixels in the collection (if any image in the collection has a black pixel at location (x, y), then so does the overlay image), and a max overlay, which contains the intersection of the black pixels. To verify a cluster, the operator views an overlay of all the grouping patches from the cluster, juxtaposed with the grouping patch from the blank ballot that these were associated with, and the operator judges whether or not the association is correct.
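Assuming the usual grayscale convention (black = 0, white = 255), the two overlays reduce to element-wise minima and maxima over the stack of images. A sketch in NumPy (illustrative, not OpenCount's code):

```python
import numpy as np

def overlays(images):
    """Min overlay: darkest pixel across the stack (union of black pixels).
    Max overlay: lightest pixel across the stack (intersection of black pixels)."""
    stack = np.stack(images)
    return stack.min(axis=0), stack.max(axis=0)
```

If every image in a cluster is truly near-identical, the two overlays look alike; a stray mis-grouped ballot shows up as ghosting in the min overlay.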

Our experience is that overlays make it easy for a human operator to quickly check whether all of the ballots were correctly associated to their matching grouping patch. See, for instance, Figure 6 for an example where the cluster contains at least one incorrectly grouped ballot (Fig. 6(a)) and an example where all ballots were correctly grouped (Fig. 6(b)).

Re-label cluster. Another action the operator can perform is to re-label the cluster. If the operator observes that the overlays have been incorrectly labelled (i.e., associated with the wrong blank ballot), she may use a drop-down menu to correct the error and select the matching blank ballot. The choices are ordered using scores from the grouping stage.

The ordering of blank ballots for a cluster is created in the following way. From the grouping step we have a ranked list of blank ballots, called Lb for a ballot b. Our goal is to combine the lists of all ballots in a cluster to form a single ranked list, to be used for the drop-down menu. We implement the following weighted voting scheme: for each ballot, go through its ordered list Lb and cast a vote worth 2^(1−j) for each blank ballot, where j is the position of the blank ballot in Lb. Thus, a blank ballot will receive 1 vote from ballots for which it is the best match, 1/2 vote from ballots for which it is the second-best match, etc. The final ranked list of the cluster is obtained by sorting these votes in descending order.
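This weighted voting scheme can be sketched as follows (an illustrative implementation; here j is counted from 0, so the best match casts a vote worth 1, the second-best 1/2, and so on):

```python
from collections import defaultdict

def rank_cluster(ranked_lists):
    """Combine per-ballot ranked lists of blank ballots into one ranked list.
    The j-th entry (0-indexed) of each ballot's list casts a vote worth 2**-j."""
    votes = defaultdict(float)
    for ranked in ranked_lists:
        for j, blank in enumerate(ranked):
            votes[blank] += 2.0 ** (-j)
    # Sort blank ballots by total vote weight, best first.
    return sorted(votes, key=votes.get, reverse=True)
```

The geometric weighting means a blank ballot that is the best match for even a few ballots outranks one that is merely a frequent second choice.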

Split cluster. Finally, if the operator observes that the cluster of ballots appears to represent multiple different blank ballots (i.e., some ballots were wrongly merged into this cluster), the operator can correct the error by splitting the cluster. This action will divide the cluster into smaller clusters based on scores from the grouping stage. The system then guides the operator through examining each of the smaller clusters, using the same interface.

To split a cluster of ballots, we again utilize each ballot's ranked list of candidate blank ballots, and group the ballots according to the nth blank ballot on the list, where n is the smallest rank index at which we observe a difference in blank ballots among ballots in the cluster. For example, if all the ballots in a cluster agree on the first four entries of their ranked lists but differ at the 5th, then the ballots are re-grouped based on the 5th entry of their lists.
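A sketch of this splitting rule (illustrative; ballots are represented by their ranked lists of blank-ballot identifiers, and the function returns groups of ballot indices):

```python
from collections import defaultdict

def split_cluster(ranked_lists):
    """Find the smallest rank index n at which the ballots' ranked lists
    disagree, and regroup the ballots by their n-th entry."""
    depth = min(len(r) for r in ranked_lists)
    for n in range(depth):
        if len({r[n] for r in ranked_lists}) > 1:
            groups = defaultdict(list)
            for i, r in enumerate(ranked_lists):
                groups[r[n]].append(i)
            return list(groups.values())
    return [list(range(len(ranked_lists)))]  # lists identical: nothing to split
```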

4.7 Voting target extraction

Once the voted-ballot-to-blank-ballot association is established, we extract the voting targets from each voted ballot. The blank ballots tell us where the voting targets are located; we then look at the same location in the voted ballot, and crop out regions around the voting targets so that later stages can classify them into filled vs. empty.

The main challenge is to align the voted ballot and its corresponding blank ballot, so that the locations of the voting targets in each are aligned. We solve this in two steps: (a) coarse global registration and (b) fine local registration around each individual target. We use the content of the ballot itself for alignment, rather than the ballot-specific registration marks, which are sometimes obscured or missing from the scanning process. In this phase, we again use a rigid transformation model and direct registration algorithms (as described in Section 4.5).

The first step is to estimate the transformation between the voted ballot and its corresponding blank ballot at


a coarse level, using the entire ballot image. We use the Lucas-Kanade algorithm, with a rigid transformation model. The ballot and blank ballot are both downsampled prior to coarse registration for performance reasons.

This produces an approximate alignment, but it is not perfect. Though the rigid transformation can model much of the variation between scanned images, often the variation introduced by scanning is not perfectly uniform across the image. Thus, it is possible for one region of the ballot to be well-aligned, while another region is not. Our second step, local alignment, addresses this problem. During local alignment, we apply the Lucas-Kanade algorithm to small regions around each voting target, separately and individually. Figure 7 shows an example of this two-step process.
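The two-step extraction can be sketched as follows. This is illustrative only: `align` and `warp` are hypothetical interfaces standing in for the rigid Lucas-Kanade estimate and the corresponding resampling, and each box is (y0, x0, y1, x1) in blank-ballot coordinates.

```python
import numpy as np

def extract_targets(ballot, blank, target_boxes, align, warp, pad=10):
    """Two-step target extraction sketch: one global alignment of the whole
    page, then a fresh local alignment of a padded region around each target."""
    global_xf = align(ballot, blank)            # coarse, whole-page estimate
    aligned = warp(ballot, global_xf, blank.shape)
    targets = []
    for (y0, x0, y1, x1) in target_boxes:
        ry0, rx0 = max(0, y0 - pad), max(0, x0 - pad)
        region = aligned[ry0:y1 + pad, rx0:x1 + pad]
        template = blank[ry0:y1 + pad, rx0:x1 + pad]
        local_xf = align(region, template)      # fine, per-target estimate
        local = warp(region, local_xf, template.shape)
        # Crop the target box out of the locally aligned region.
        targets.append(local[y0 - ry0:y1 - ry0, x0 - rx0:x1 - rx0])
    return targets
```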

4.8 Quarantining

For robustness, our processing pipeline is designed to automatically sanity-check its results and detect anomalies. Anomalous ballots are quarantined for manual review. Figure 12(b) shows examples where operator assistance was necessary. The top image shows an example where unusual voter marks can result in unpredictable situations that are best interpreted by a human; the bottom image illustrates a case where only part of the ballot was scanned, a situation that we cannot resolve automatically.

We detect anomalous ballots using an outlier detection procedure. We use distance-based outliers [10]: a value is a DB(p, D)-outlier when at least a fraction p of all other values in the dataset E are of distance greater than D from it. One appealing aspect of this approach is that the observed data need not follow a standard distribution.

During target alignment, we track the final least-squares registration error from each target's local alignment step and use those values for outlier detection. This is a single value for each target that estimates the goodness of fit of the registration around that target. For each voting target, we form the set E of least-squares registration errors for that target (across all ballots in the election) and check for DB(p, D)-outliers among the set E. In other words, we declare a registration error e ∈ E an outlier if we have |e − e′| > D for at least a p fraction of all e′ ∈ E. This can be computed efficiently by sorting the set E and iterating over its entries. In our work, we set p = 0.999 and D = (max(E) − min(E))/2 and perform the outlier test separately for each different voting target. Ballots containing any flagged targets are quarantined and set aside for manual review.
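Since E is one-dimensional, the DB(p, D)-outlier test reduces to counting, for each value, how many others lie within distance D; after sorting, each count is just two binary searches. A sketch (illustrative, not OpenCount's code):

```python
import bisect

def db_outliers(errors, p=0.999, D=None):
    """Flag e as a DB(p, D)-outlier when at least a fraction p of the other
    values lie farther than D from it. Sorting makes each test O(log n)."""
    if D is None:
        D = (max(errors) - min(errors)) / 2.0
    s = sorted(errors)
    n = len(s)
    flags = []
    for e in errors:
        lo = bisect.bisect_left(s, e - D)
        hi = bisect.bisect_right(s, e + D)
        near = hi - lo - 1          # values within D, excluding e itself
        flags.append((n - 1 - near) >= p * (n - 1))
    return flags
```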

Figure 8: The mark classification step. The main screen shows all the targets in a grid. In purple, we show a zoomed-in version of a portion of this screen image. The operator can both set a global threshold and mark targets that are mis-classified by the threshold. Also, the operator may select a particular target and view the ballot containing that target, to help adjudicate voter intent.

4.9 Mark classification

The final step in OpenCount is to classify each extracted target from the voted ballots as either “marked” or “unmarked”. We display all of the extracted targets in a grid, sorted by their average pixel intensity (similar to BallotTool [13]). The operator can then select a threshold that separates the marked voting targets from the unmarked.

In nearly all cases, voting targets are clearly marked or unmarked, as a majority of voters completely fill in the voting targets, so this approach works well for the overwhelming majority of voting targets. However, a small fraction of targets are borderline or less clear-cut. We handle these marginal marks through manual review. We treat the operator as a domain-specific expert who has the final say, sidestepping the difficult problem of automatically inferring voter intent for marginal marks. In particular, the operator can adjudicate the status of any individual target and override the threshold-based classification for that target. If voter intent is unclear from the voting target alone, the operator can view the corresponding ballot (Figure 8) to help determine voter intent.

We have found that the grid-based interface is very helpful, as it allows the operator to efficiently scan a large collection of voting targets. Unusual or borderline cases tend to stand out visually.

As an optimization, OpenCount initially estimates a suggested threshold using clustering techniques. We make the simplifying assumption that the average pixel intensities of marked and unmarked targets each follow normal distributions, and we set the initial threshold as the value that minimizes the difference between the two actual distributions and the best-fit normal distributions.
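As a stand-in for the normal-fitting criterion described above, the sketch below picks the threshold minimizing the pooled within-class variance of the two groups (an Otsu-style simplification, not OpenCount's exact method):

```python
import numpy as np

def suggest_threshold(intensities):
    """Return the split point that minimizes the pooled within-class
    variance of the two groups of average pixel intensities."""
    x = np.sort(np.asarray(intensities, dtype=float))
    best_t, best_score = x[0], np.inf
    for i in range(1, len(x)):
        left, right = x[:i], x[i:]
        score = len(left) * left.var() + len(right) * right.var()
        if score < best_score:
            best_score, best_t = score, (x[i - 1] + x[i]) / 2.0
    return best_t
```

Either way, the suggested threshold is only a starting point: the operator can move it and can override individual targets.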

We do not attempt to resolve write-ins [9]. Instead, we treat all write-in votes as votes for a single synthetic candidate, “write-in.”


Election Name            Voted Ballots   Blank Ballots   Resolution
Alameda                  1374            8               1460×2100
Merced                   7120            1               1272×2100
San Luis Obispo (SLO)    10689           26              1700×2200
Stanislaus               3151            1               1700×2800
Ventura                  17301           1               1408×3000

Table 1: We use data from five California counties.

Dataset      Prepr.  Template      Ballot        Grouping   Target        Target     Avg. per     Operator    Total
             (C)     creation (H)  grouping (C)  check (H)  extract. (C)  check (H)  ballot (C)   total (H)   (min)
Stanislaus   594s    31s           -             -          478s          53s        0.34s        84s         19
Merced       1011s   108s          -             -          3045s         712s       0.57s        820s        81
Ventura      2851s   107s          -             -          6197s         490s       0.52s        597s        161
Alameda      848s    510s          403s          87s        896s          162s       1.56s        759s        48
SLO          1924s   438s          6219s         998s       5722s         29s        1.30s        1465s       256

Table 2: Timing information for each dataset. Steps that use operator interaction time are labeled with (H) and steps that use computing time are labeled with (C). In template creation, we sum the total time required by the operator to detect targets, perform data entry, and select grouping patches.

5 Evaluation

We evaluated our system on data from the California 2011–2012 post-election risk-limiting audit pilot program [1]. The data covers elections in five counties and includes voted ballots, blank ballots, and CVRs. See Table 1 for a summary.

The CVRs obtained from the dataset were produced from a much earlier prototype, which we used to conduct the audits in 2011. During those audits, we compared the CVRs against the official voting tallies. In the few cases where discrepancies were found, these CVRs were consistently found to be more accurate than the official tallies. For the sake of evaluating our current system, we regard the existing CVRs as ground truth and measure the accuracy of OpenCount by comparing its output to the CVRs provided in the datasets.

In our evaluation, we measure both operator interaction time and computation time; see Table 2. A member of the team that developed OpenCount served as operator in all the tests. All timings were measured on a four-core machine with an Intel i7-950 CPU and 12GB of RAM.

We emphasize that we do not claim to develop the most sophisticated fully-automated vote tabulation system. Rather, our system helps a trained human operator find the difficult cases and make an informed judgement. We highlight all such difficult cases in each evaluation. We also highlight interesting examples of ballots detected as outliers for each dataset.

Three of the elections (Merced, Stanislaus, and Ventura) had only a single ballot style, and thus did not exercise the grouping step. Two others had multiple ballot styles and required grouping. We use OpenCount to process all datasets with no additional tuning. See Figures 9, 12, and 13 for example ballots.

(a) Stanislaus (b) Merced (c) Ventura

Figure 9: Example ballots from three counties.

In every one of the five elections, with the possible exception of a few judgement calls, we achieve perfect accuracy. These judgement calls represent ambiguous cases where the human operator made a different decision in classifying the target, and where it is not clear what the right outcome should be. These are unrelated to the performance of our system.

Stanislaus. The Stanislaus County ballots were processed in 19 minutes (0.34 seconds of computation per ballot). The CVRs generated by OpenCount matched the ground truth completely. In this election only a single contest was audited, resulting in minimal operator effort during contest labeling.

Merced. The Merced County ballots took 81 minutes to process. We observed some discrepancies between the CVRs produced using OpenCount compared to the ground truth. Upon closer inspection, we believe that all



Figure 10: In these examples, the ground truth registered a vote but OpenCount's operator treated them as non-votes. In (g), the target appears to have been darkened through erasure, damaging the ballot.

disagreements consist of borderline cases of interpreting voter intent. Figures 10 and 11 (in the appendix) display every target disagreed upon, highlighted in blue. In all cases, the OpenCount operator regarded the particular target as unmarked while the ground truth CVRs treat them as marked. As we rely upon a human operator to adjudicate such cases, and it is not the purpose of these experiments to evaluate the operator's judgements, we make no claims about which result is “correct.” It is enough that OpenCount helped the operator identify these cases, so that the operator could form a judgement.

The goal of our system is to steer the operator's attention to the difficult cases, so the operator can make an informed decision. OpenCount was successful in doing so. In the borderline cases mentioned above, the only difference between the results is that the operator judged the target differently.

Ventura. The Ventura County ballots took 161 minutes to process. We identified only one discrepancy between our CVRs and the ground truth, shown in Figure 12(a). We consider this another difficult case, which is left up to election officials.

We discovered interesting ballots that OpenCount automatically flagged as suspicious (Figure 12(b)). In one example, the scanned image is completely occluded,


Figure 11: In these Merced “vote for three” contests, the ground-truth CVRs counted these as valid marks and treated the contest as overvoted. OpenCount's operator considered them hesitation marks (i.e., not a valid mark), so the contest is interpreted as containing three valid votes.

halfway down the ballot. In the other, a voter scribbled out a marked vote, perhaps to make a correction. Both cases were automatically flagged as outliers and set aside for manual review.

Alameda. The Alameda County ballots took 48 minutes to process. OpenCount produced the same CVRs as the ground truth.

The Alameda County case study presented an interesting challenge. Ballots were cast in four different precincts, and each precinct had an English/Spanish and an English/Chinese version of each ballot. As Figure 13 shows, the choice of language influences the locations of the voting targets. As such, the choice of grouping patch was especially important in order for OpenCount to be able to correctly tabulate the voted ballots. Note that the timing marks, which are labeled in purple and encode the precinct number, did not appear to encode the language used. Therefore, we defined a grouping patch that included both the precinct number and a portion of text from the ballot (to capture the language).

San Luis Obispo (SLO). The SLO dataset was distinctive due to the relatively large number of blank ballots. OpenCount processed the ballots in 256 minutes. Comparing the CVRs from our system to the ground truth yielded only one discrepancy, shown in Figure 14. This is another case of a judgement call in which the human operator made a different decision. All other ballots matched the ground truth CVRs completely.

Discussion. Overall, our experience with OpenCount has been positive. However, we have identified several


Figure 12: Examples from Ventura. (a) contains a case where the ground truth claimed three votes, while OpenCount claimed two votes. (b) contains examples of ballots automatically flagged for manual inspection.

Figure 13: Example ballots from Alameda, which has four precincts (purple box) as well as two different language styles. We show two ballots from the same precinct, but with different languages; in green we zoom in on a single contest. As the red dashed line indicates, the language affects the location of the voting target. The blue grouping patch captures both the language and the precinct information in order to identify the ballot style.

areas for future improvement. While our current approach scales satisfactorily to elections with dozens of ballot styles, the operator effort to label all contests and verify the grouping results becomes prohibitive if there are thousands of different ballot styles. In addition, we have learned that the requirement to find and scan one of each kind of blank ballot is labor-intensive for election officials. A way to eliminate this requirement would be valuable.

Figure 14: The one discrepancy for the SLO election. The operator of OpenCount adjudicated this as a vote for YES; the ground truth classifies it as an overvote.

6 Related Work

We were inspired by the pioneering work of TEVS [21] and Votoscope [8], which attempt to solve the same problem. As mentioned before, one difference between TEVS and OpenCount is that TEVS is specialized to a vendor's particular ballot layout, whereas we try to avoid making assumptions about ballot layout and, for robustness, avoid relying upon OCR. The current implementation of TEVS supports only Hart ballots. In addition, TEVS does not provide the operator an opportunity to verify or visualize intermediate results of the computation, whereas it is a key goal of OpenCount to enable the operator to verify the results of its computations.

BallotTool is another system to assist a human operator in processing a set of ballot images [13]. BallotTool requires the operator to define the ballot layout and identify voting targets, whereas we attempt to largely automate these tasks. Our work distinguishes itself in our tight integration of computer vision techniques with focused operator interaction. Our careful interleaving of processing and verification enables us to reduce the operator workload while being robust to errors exhibited by the scanning process and general to a wide class of ballot styles.

Nagy et al. describe a method for identifying candidate locations for voting targets, based upon analysis of the timing marks on the edge of the ballot [15]. However, we have found that not all vendors' ballot styles contain these timing marks. For instance, Alameda and Ventura Counties use Sequoia ballots, which do not have the necessary timing marks. In addition, their method failed on 1% of ballots in their experiments, which is too high for our purposes. Therefore, we developed other methods.

Many recent works have looked at various aspects of the ballot analysis problem [5, 22, 9, 13, 23]. Works that are complementary to ours include tools for guiding an operator to discover sources of errors in scanned


ballots [5], analysis of write-in regions [9], and capturing a stream of paper ballots at a distance using video [22]. Xiu et al. describe more sophisticated approaches to fully automated target classification [23], though we have not found a need for these techniques in our experiments to date.

7 Conclusion

In this paper, we present OpenCount: a scalable, robust, and accurate tool that helps support election auditing. OpenCount's unique combination of computer vision techniques and operator assistance allows us to operate on real-life elections, without relying upon or integrating with vendor systems. OpenCount has been successfully used to support risk-limiting audits in five California counties.

The OpenCount software is available to the public at https://code.google.com/p/opencount/.

8 Acknowledgments

We thank the California Secretary of State's office and election officials at Stanislaus, Merced, Ventura, Alameda, and San Luis Obispo counties for sharing scanned ballots. This research would not have been possible without their generous assistance. We also thank Philip Stark for helpful guidance, the anonymous reviewers and our shepherd Alex Halderman for many helpful comments, and Ed Smith and Dominion Voting Systems for helpful information about Dominion's future voting systems. We thank Andy Chang for experiments analyzing TEVS. Finally, we thank Piotr Dollar for his valuable feedback and for making his Matlab toolbox1 available.

This work was supported by National Science Foundation grant CNS-0524745. Any opinions, findings, conclusions or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

References

[1] Post Election Risk-Limiting Audit Pilot Program 2011–2012. http://www.sos.ca.gov/voting-systems/oversight/risk-limiting-pilot.htm.

[2] BAKER, S., AND MATTHEWS, I. Lucas-Kanade 20 Years On: A Unifying Framework. International Journal of Computer Vision 56, 3 (March 2004), 221–255.

[3] BENALOH, J., JONES, D., LAZARUS, E. L., LINDEMAN, M., AND STARK, P. B. SOBA: Secrecy-preserving Observable Ballot-level Audits. In Proceedings of EVT/WOTE 2011.

1 http://vision.ucsd.edu/~pdollar/toolbox/doc/index.html

[4] CALANDRINO, J. A., HALDERMAN, J. A., AND FELTEN, E. W. Machine-assisted election auditing. In Proceedings of EVT 2007.

[5] CORDERO, A., JI, T., TSAI, A., MOWERY, K., AND WAGNER, D. Efficient user-guided ballot image verification. In Proceedings of EVT/WOTE 2010.

[6] HALL, J. L., STARK, P. B., MIRATRIX, L. W., BRIONES, M., GINNOLD, E., OAKLEY, F., PEADEN, M., PELLERIN, G., STANIONIS, T., AND WEBBER, T. Implementing Risk-Limiting Post-Election Audits in California. In Proceedings of EVT/WOTE 2009.

[7] Humboldt County Election Transparency Project. http://humtp.com.

[8] HURSTI, H. Votoscope software, October 2005. http://vote.nist.gov/comment_harri_hursti.pdf.

[9] JI, T., KIM, E., SRIKANTAN, R., TSAI, A., CORDERO, A.,AND WAGNER, D. An analysis of write-in marks on optical scanballots. In Proceedings of EVT/WOTE 2011.

[10] KNORR, E. M., AND NG, R. T. Algorithms for MiningDistance-Based Outliers in Large Datasets. In Proceedings of the24th International Conference on Very Large Data Bases (VLDB’98), pp. 392–403.

[11] LEWIS, J. Fast Normalized Cross-Correlation, February 1998. http://www.idiom.com/~zilla/Papers/nvisionInterface/nip.html#tex2html1.

[12] LINDEMAN, M., AND STARK, P. B. A gentle introduction to risk-limiting audits. IEEE Security and Privacy (2012). Special Issue on Electronic Voting, to appear.

[13] LOPRESTI, D., NAGY, G., AND SMITH, E. B. A document analysis system for supporting electronic voting research. In Proceedings of the 8th IAPR International Workshop on Document Analysis Systems (DAS '08), pp. 167–174.

[14] LUCAS, B. D., AND KANADE, T. An iterative image registration technique with an application to stereo vision. In Proceedings of the 7th International Joint Conference on Artificial Intelligence (IJCAI '81) - Volume 2, pp. 674–679.

[15] NAGY, G., LOPRESTI, D., SMITH, E. H. B., AND WU, Z. Characterizing Challenged Minnesota Ballots. In Proceedings of SPIE (2011), vol. 7874.

[16] RESCORLA, E. Understanding the Security Properties of Ballot-Based Verification Techniques. In Proceedings of EVT 2009.

[17] STARK, P. Report on second risk-limiting audit under AB 2023 in Monterey County, California. Verified Voting Blog, May 2011. http://blog.verifiedvoting.org/2011/05/07/1370.

[18] STARK, P. Personal communication, May 2012.

[19] STARK, P. B. Efficient post-election audits of multiple contests: 2009 California tests. In 4th Annual Conference on Empirical Legal Studies (CELS 2009).

[20] STARK, P. B., AND WAGNER, D. A. Evidence-based elections. IEEE Security and Privacy (2012). Special Issue on Electronic Voting, to appear.

[21] TRACHTENBERG, M. Trachtenberg Election Verification System(TEVS). https://code.google.com/p/tevs/.

[22] WANG, K., RESCORLA, E., SHACHAM, H., AND BELONGIE, S. OpenScan: a fully transparent optical scan voting system. In Proceedings of EVT/WOTE 2010.

[23] XIU, P., LOPRESTI, D., BAIRD, H., NAGY, G., AND SMITH, E. B. Style-Based Ballot Mark Recognition. In Proceedings of the 9th International Conference on Document Analysis and Recognition (ICDAR 2009).
