Voter Choice in the 2006 Mexican Presidential Election
Inés Levin*
R. Michael Alvarez**
April 18 2010
Abstract
Until recently, a popular explanation of voter choice in Mexico was that voter decisions
were driven by attitudes toward the hegemonic PRI regime. However, political conditions
changed in Mexico after the victory of PAN’s candidate Vicente Fox in 2000. In this article,
we use survey data from the 2006 Mexico Panel Study to analyze the determinants of voter
choice in Mexico’s new democracy, using statistical methodologies appropriate for
modeling voter choice in multiparty elections. We find that, unlike what is often found
in more stable democracies, region and party identification are the most important factors
for explaining voter choice, although demographic factors, retrospective evaluations of the
economy and assessments of candidate‐specific traits also have considerable impacts on
voter decisions.
*Graduate student, Division of Humanities and Social Sciences, California Institute of
Technology, Pasadena, CA 91125; corresponding author, e‐mail: [email protected].
**Professor of Political Science, Division of Humanities and Social Sciences, California
Institute of Technology, Pasadena, CA 91125; e‐mail: [email protected].
Introduction
In 1995, two scholars started their analysis of the 1988 and 1991 Mexican elections asking
“How do citizens who have long been governed by the same party in an authoritarian
regime vote when there seems to be a chance that they could turn the incumbent out of
office?” (Domínguez and McCann 1995, page 34); their analysis revealed that the main
force driving voter choice was attitudes towards the PRI regime. However, by 2006, the
situation had changed. The 2006 Mexican presidential election was the first held after the
transition towards a multi‐party regime that started in the late seventies and culminated in
the election of Vicente Fox in 2000. This election differed from previous ones in that
defeating the old regime was no longer the main electoral issue (Domínguez 2009a;
Klesner 2007, 2009; Lawson 2009; Moreno 2006). Unlike Fox in 2000, Felipe Calderón did
not run under a slogan of institutional change, but under a promise of continuity and
stability (Domínguez 2009a; Klesner 2007; Lawson 2009; Moreno 2009). The main
purpose of this paper is to study the determinants of voter choice in a new democracy,
using novel statistical methodologies well suited for studying voter behavior in multiparty
elections.
How do citizens vote in a new democracy, and how well do existing academic
theories of political behavior account for voter behavior in such a context? First, existing
socio‐psychological theories may account for how voters decided which candidate to
support in the 2006 Mexican contest, meaning that we should
look to factors such as social and demographic characteristics, geographic region, and
partisanship to study voter choices (Lawson 2006; Klesner 2006, 2007, 2009; Moreno
2007). Second, to what extent are Mexican voters motivated to choose a presidential
candidate based on spatial issues or ideology, and thus to what extent does the standard
spatial model of elections account for voter choices in this election? (Alvarez and Nagler
1995, 1998; Downs 1956; Enelow and Hinich 1984). Third, did Mexican voters engage in
retrospective voting in the 2006 election, evaluating the incumbent party’s candidate
according to the recent evolution of the
national economy (Alvarez and Nagler 1995, 1998; Fiorina 1978, 1981; Kiewiet 1981;
Kinder and Kiewiet 1981), or did they care only about their personal
economic conditions and behave in line with pocketbook explanations of voter behavior
(Markus 1988)? Finally, given recent episodes of electoral corruption and voter fraud in
Mexico (Aparicio 2002; Cornelius 2002; Bruhn 2009; Díaz‐Santana 2002; Eisenstadt 2007;
Infante 2005; Lawson 2007b; Schedler 2004), is voter choice affected by citizens’ trust in
the cleanness of the electoral process? In this paper we test these theories using survey
data from the 2006 Mexico Panel Study (Lawson 2007; Lawson and Moreno 2007).
In this article we build upon previous research on voter behavior in Mexico to
further our understanding of the effects of voter confidence on voter choice. In previous
studies of Mexican elections, scholars have found that voter choice varies by region, socio‐
demographic factors, incentives for Duvergerian strategic voting, as well as risk attitudes
and uncertainty about candidate abilities (Magaloni 2006b; Morgenstern and Zechmeister
2001; Poiré 2000). However, there has been relatively scant attention paid to testing the
spatial theory of elections in the Mexican context, or how confidence in the cleanness of the
electoral process might influence voter choice. Of course, related questions have been
investigated in the past (McCann and Domínguez 1998), but we believe the relevance of the
problem justifies a more in‐depth and careful analysis.
Furthermore, as there have been great advances in political methodology in recent
years that allow us to model voter choice in a multi‐candidate context like the 2006
Mexican presidential election, we take into account issues like sampling uncertainty and
how non‐response affects explanatory variables. As discussed below, we use Bayesian
methods for conducting multiple imputations (van Buuren and Groothuis‐Oudshoorn
2009), and for the estimation of flexible discrete dependent variable voter choice models
(Imai and van Dyk 2005a,b). As we demonstrate below, application of these new
methodologies sheds new light on voting behavior in Mexican presidential elections.
Elections in Mexico are ideal for our study for a number of reasons. First, the
country recently experienced a democratic transition after 70 years of hegemonic
government by the Partido Revolucionario Institucional (PRI) (Craig and Cornelius 1995;
Diaz‐Cayeros and Magaloni 2001; Diaz‐Cayeros et al. 2003; Klesner 2001; Klesner and
Lawson 2001; Lawson 2007b; Magaloni 2006a; Schedler 2000). Second, the 2006
presidential race—which was hotly contested between a member of the incumbent Partido
de Acción Nacional (PAN) and a candidate from the opposition Partido de la Revolución
Democrática (PRD)—was preceded by criticisms of the fairness of the electoral process and
followed by a challenge of the official election outcome (Bruhn 2009; Eisenstadt 2007;
Estrada and Poiré 2007; Klesner 2007). Also, the 2006 election coincided with the
emergence of strong left‐wing presidential candidates in several Latin American countries,
resulting in the media portraying political behavior in these elections in ideological terms
(Bruhn 2009; Bruhn and Greene 2007, 2009; Greene 2009; Lawson 2009; McCann 2009;
Moreno 2006, 2007). Finally, in recent years there has been increasing attention paid to
elections in Mexico by scholars, and there has been a concerted effort to collect quality
voter survey data in Mexican national elections.
In line with previous studies, we find that region and party identifications remain
important factors for explaining voter choice in 2006. However, region and party
identification do not account for all the variation in voter behavior in the 2006 election. We
find that retrospective evaluations of the economy, as well as evaluations of candidates’
abilities and honesty, are all factors contributing to the decision. In contrast, candidate‐
specific variables such as ideological distances and opinions about commercial
relationships with the US have only small and insignificant effects on voter choice. Finally,
we find that there are considerable effects of voter confidence (although these effects go
in an unexpected direction), but not so much of voters’ willingness to participate in
protests against official election outcomes.
The rest of the article is structured as follows. First, we discuss the main hypotheses
of our study. After that, we describe the data used to estimate our model and the
distribution of explanatory variables across respondents as a function of voter choice. In
the following section, we present the results of our empirical analysis. Subsequently, we
close with a discussion of our main findings and implications for further study. Finally, the
appendix contains detailed descriptions of the statistical methodology applied in this
paper, the multiple imputation method, and the procedure used to simulate choice
probabilities.
Hypotheses
In previous studies of the 2006 Mexican election, regional and demographic factors were
considered among the main explanations of voter choice (Klesner 2006, 2007, 2009;
Lawson 2006, 2009; Moreno 2007). Our first two hypotheses are constructed based on
these results as well as regular findings in the Mexican politics literature.
(1) Support for Calderón (PAN) was higher in the industrial and export‐oriented North and
Center of the country; support for López Obrador (PRD) was higher in Mexico City and the
agriculturally oriented South; and support for Madrazo (PRI) was relatively higher in the South
compared to the North, Center and Mexico City.
(2) Support for Calderón was higher in urban locations, among female, young, religious,
educated and wealthy citizens; support for López Obrador was higher in urban locations,
among men, middle‐aged, non‐religious and educated citizens; and support for Madrazo was
higher in rural locations, among older, non‐religious and less educated citizens.
Another variable considered important for explaining voter choice in the 2006 Mexican
presidential election is party identification (Klesner 2007; Moreno 2007). Also, it is often
argued that PAN and PRD voters behaved more faithfully, relative to PRI supporters. Since
the beginning of the campaign PRI’s candidate Madrazo was perceived negatively by most
voters, including those identified with the PRI (Lawson 2009). Additionally, Madrazo ran
third during most of the campaign and this might have motivated PRI identifiers to behave
strategically. In this article we test whether voters used partisan cues, as well as whether
people with old partisan attachments were less likely to desert Madrazo.
(3) Citizens voted for the candidate who shared their party identification, and those with older
partisan attachments were more likely to support Madrazo.
Ideology and issue positions have been found to have impacts on voter choice in elections
in other countries (Alvarez 1997; Alvarez and Nagler 1995, 1997; Alvarez et al. 1999,
2000). The 2006 presidential election in Mexico coincided with the emergence of strong
leftist figures in several Latin American countries, and Mexican presidential candidates
were perceived as having polarized policy preferences on issues such as continuation of
previous economic policies, privatization of government‐owned companies and
commercial relations with the U.S. In particular, López Obrador was depicted as offering a
populist message and was compared to Venezuela’s Hugo Chavez, while Calderón was
perceived as a more conservative candidate who would continue the policies of the
previous PAN government (Klesner 2007; Lawson 2009; McCann 2009; Moreno 2007).
Still, some scholars argue that voters were not as polarized as elites (Bruhn and
Greene 2007, 2009; Greene 2009). Specifically, while PAN and PRD politicians differed
substantially in their opinions about the role of government, abortion, and privatization of
the electric sector, voters located themselves closer to the center of the issue spectrum,
except for relations with the U.S., where voters exhibited negative positions relative to
politicians. Nonetheless, Moreno (2007, page 16) claims that “the election represented a struggle of
left against right, of state against market.” Thus, since ideology seemed to be an important
feature of the 2006 presidential race, we test whether voters behaved according to spatial
models of voting like those introduced by Downs (1956) or Enelow and Hinich (1984):
(4) Voters supported the candidate closest to them in the left‐right ideological dimension, as
well as on issue positions such as commercial relations with the U.S.
Retrospective assessments of the economy are another factor that has influenced voter
behavior in other countries (Alvarez and Nagler 1995, 1997; Alvarez et al. 1999, 2000;
Duch and Stevenson 2008; Lewis‐Beck and Stegmaier 2000, 2006). Thus, another possible
explanation for voter choice in the 2006 Mexican election is that citizens decided based on
a retrospective evaluation of the national and personal economy. In the year before the
election, there was a considerable increase in economic growth and decrease in the
inflation rate, and Fox was perceived as successful in achieving macroeconomic stability
(Klesner 2007; Moreno 2007, 2009). Also, some scholars point out that one of the main
PAN campaign messages was that this stability could be compromised if López Obrador
won the election (Lawson 2009). Moreno (2007) argues that “assessments of the economy
are, in fact, one of the strongest explanatory factors of presidential vote in 2006.”
Accordingly, we formulate the following hypothesis:
(5) Support for Calderón was higher among those voters making positive retrospective
evaluations of national and personal economic conditions.
Also, scholars have pointed out that voters exhibited varying judgments about the honesty
of the different candidates, as well as their ability to manage the economy, crime and
poverty. At the beginning of the campaign López Obrador exhibited an advantage in voters’
evaluations due to his recent experience as mayor of the Federal District and relative
anonymity of PAN’s candidate Calderón. However, by the end of the campaign Calderón
managed to improve in voters’ evaluations and to raise doubts about López Obrador’s
ability to maintain macroeconomic stability (Lawson 2009; Greene 2009). Our next
hypothesis is that candidate evaluations like the ones described above may have had an
impact on voter choice:
(6) Voters decided based on candidate evaluations such as honesty, or ability to manage the
economy, poverty and crime.
Previous studies have found that voter confidence in the fairness of the electoral process
has a significant impact on other dimensions of voter behavior in Mexican elections, such as
political engagement and turnout (Hiskey and Bowler 2005; McCann and Domínguez 1998;
Klesner and Lawson 2001). Thus, it is possible that voter confidence also affected voter
choice in 2006. Specifically, we expect voters who thought the election would be unfair to
punish both the incumbent party and the opposition candidate with a history of alleged
electoral fraud, Madrazo.1 Also, we expect support for López Obrador to be
higher among those voters willing to participate in post‐electoral conflicts.
1 Madrazo had been accused of electoral fraud in the 1994 Tabasco gubernatorial election.
(7) Voters who thought the electoral process would be clean and who were not willing to
participate in post‐electoral protests supported Calderón. Also, support for Madrazo was
lower among those voters with lower confidence in the cleanness of the electoral process.
In contrast, support for López Obrador was higher among those exhibiting lower confidence in
the cleanness of the electoral process and who were willing to demonstrate against official
election outcomes.
Data and Methodology
Mexico Panel Study 2005‐2006
We use data from the Mexico Panel Study (MPS) 2005‐2006, a project conducted by the
Grupo Reforma and the Massachusetts Institute of Technology (Lawson et al. 2007), consisting
of in‐person interviews with Mexicans aged 18 or older. The study consists of three
survey waves: one conducted in October 2005 (approximately 9 months before the
election), another conducted in April‐May 2006 (approximately 2 months before the
election) and a third one conducted in July 2006, after the election.2 In this article, we
consider those individuals who provided an affirmative answer to the turnout question,
answered the voter choice question in the 3rd wave, and preferred one of the three main
candidates (approximately 76% of the 3rd wave sample). Table 1 shows the proportion of
respondents supporting each candidate, according to regional indicators, socio‐demographic
variables, political attitudes and candidate evaluations. Also, Table 2 shows
the proportion of missing values for each explanatory variable, as well as means of imputed
values for each imputed data set.3
2 The first wave contains 2,400 interviews, including a national sample of 1,600 individuals
as well as over‐samples of the Federal District and rural locations of 500 and 300
individuals, respectively. Then, during the 2nd and 3rd waves researchers re‐interviewed
1,776 and 1,594 of the 1st wave respondents, respectively.
Dependent variable: The objective of our multivariate analysis is to explain voter choice
based on 3rd wave post‐election survey responses. The original wording of the question is:
“For the purposes of this survey, I will give you a sheet where you can mark how you voted
on the last presidential elections, without my seeing you, and then deposit it in this bag. For
whom did you vote for president?” In total, 1,218 of the 3rd wave respondents answered
this question and chose one of the three main candidates. Among them, 20% chose
Madrazo, 42% chose López Obrador and 38% chose Calderón (see Table 1). If we compare
these figures with the observed 3‐party election outcome, where Madrazo obtained 24% of
the vote, López Obrador obtained 38% of the vote and Calderón obtained 38% of the vote,
then it follows that the 3rd wave sample is not completely representative of the Mexican
electorate—it overrepresents López Obrador supporters and underrepresents Madrazo
supporters.
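The comparison above amounts to simple arithmetic, sketched below for illustration only. The figures are the rounded percentages and counts reported in the text; the variable names are hypothetical and not part of the study.

```python
# Rounded three-candidate vote shares reported in the text (percent).
sample_share = {"Madrazo": 20, "López Obrador": 42, "Calderón": 38}
official_share = {"Madrazo": 24, "López Obrador": 38, "Calderón": 38}

# Positive gap: over-represented in the sample; negative: under-represented.
gap = {name: sample_share[name] - official_share[name] for name in sample_share}

# Share of 3rd wave respondents who enter the analysis
# (1,218 analysed respondents out of 1,594 re-interviewed in the 3rd wave).
analysis_n = 1218
third_wave_n = 1594
retained = analysis_n / third_wave_n

for name, diff in gap.items():
    print(f"{name}: {diff:+d} percentage points")
print(f"Retained share of the 3rd wave sample: {retained:.1%}")
```

Under these rounded figures, the gap is +4 points for López Obrador, -4 points for Madrazo and zero for Calderón, and the retained share matches the "approximately 76%" quoted earlier.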
Explanatory variables: The first set of variables we use are indicators of region,
urbanization and partisanship of state government. Figure 1 shows how official vote
shares varied across states, and Table 1 shows the regional distribution of voter choice
according to the 3rd wave MPS survey. While Calderón was stronger in the industrial‐ and
export‐oriented North and Center, López Obrador was relatively more popular in Mexico
City and the agriculturally oriented South. In contrast, Madrazo was very unpopular in
Mexico City but obtained a substantial share of the vote in the South. As expected, López
Obrador and Calderón obtained a majority of the votes in states with PRD or PAN
governorship, respectively. However, Madrazo obtained an average of only 23% of the
vote in states with PRI government.
3 The appendix at the end of the paper contains details about the procedure used to impute
missing values and about the wording and coding of all variables used in our analysis.
The second set of explanatory variables consists of socio‐demographic characteristics like
age, race, gender, education, church attendance, income and whether the individual ever
received benefits from the Oportunidades or Progresa anti‐poverty programs. Table 1
shows that Calderón was relatively more popular among young citizens, those with white
skin, female voters, highly educated individuals and those with a high monthly income.
In contrast, López Obrador received relatively more support from middle‐aged citizens,
those with light‐ or dark‐brown skin, male voters, those with middle and high levels of education,
those with middle and low levels of monthly income, those who never received benefits from anti‐
poverty programs, as well as those with low levels of church attendance. Finally, Madrazo
received relatively more support among old citizens, those with dark brown skin, female voters,
individuals with low levels of education, low levels of monthly income, those who received
benefits from anti‐poverty programs, as well as those who attend church very often.
As mentioned in the previous section, as a result of the regional political context, the
2006 presidential election was often portrayed in ideological terms. The third set of
explanatory variables included in our model corresponds to ideology indicators, issue
positions, approval of president Fox, and party identifications measured eight to nine
months before the election.4 According to Table 1, Calderón and Madrazo received
relatively more support among right‐of‐center voters, and López Obrador received more
support among left‐of‐center voters. Regarding foreign policy positions, while those
preferring an increase in commercial relations between Mexico and the United States
preferred Calderón relatively more often, those preferring a decrease in commercial
relations between both countries tended to prefer López Obrador relatively more often.
Regarding social policy positions, Madrazo was relatively more popular among those who
did not support abortion in case of rape, and the opposite happened with López Obrador.
With regard to party identification, PAN and PRD identifiers overwhelmingly voted for
Calderón and López Obrador, respectively. However, only 55% of PRI respondents voted
for Madrazo. Also, independent respondents split their vote evenly between Calderón and
López Obrador, and approximately 1 in 10 voted for Madrazo. Finally, Madrazo was
relatively more popular among respondents with old partisan attachments, and the
opposite happened with López Obrador.
4 Note that all explanatory variables were taken from the 1st or 2nd waves of the MPS.
Whenever possible and appropriate, we used 2nd wave responses. The variables
intentionally taken from the 1st wave of the MPS were: party identification, evaluations of
the national and personal economy and candidate evaluations. We did so
because 2nd wave respondents may already have had a well‐formed idea of their voter choice in
the forthcoming election, and this might have affected their 2nd wave answers to these questions.
Other explanatory variables taken from the 1st wave are reception of benefits from anti‐poverty
programs and church attendance, because these questions were not included in
the 2nd wave of the MPS.
The fourth and last set of explanatory variables included in our model are candidate
evaluations—specifically, judgments of candidates’ ability to manage the economy, reduce
crime and poverty and their honesty—evaluations of the national and personal economy,
voter confidence in the fairness of the electoral process, and attitudes toward post electoral
conflicts. Table 1 shows that voters perceiving an improvement in the national or personal
economy voted for Calderón more frequently, while those perceiving deterioration in
economic conditions voted more frequently for López Obrador or Madrazo. With regard to
voter confidence, those thinking that the election would be somewhat or totally fair voted
more often for Calderón, while the opposite pattern is observed in support for López
Obrador. Finally, support for López Obrador was substantially higher among those willing
to participate in a post‐electoral conflict, and the opposite is observed in the case of
Calderón vote shares.
Modeling Voter Choice in Mexican Presidential Elections
In the Mexican elections literature, some authors study voter choice using binary logistic
regressions (Cortina et al. 2008; Domínguez and McCann 1998). This approach is
problematic because it rests upon implausible assumptions about voter behavior (Alvarez
and Nagler 1998). For instance, Cortina et al. (2008) use a multilevel logistic regression to
explain voter choice between Calderón and López Obrador in the 2006 presidential election,
disregarding the preferences of 20% of the electorate who supported Madrazo. Implicitly,
this procedure assumes that if the latter were forced to choose between Calderón and
López Obrador the conditional probability of voting for one or the other would be similar
to that of voters observed choosing Calderón or López Obrador, which (as we show
later) is not necessarily the case. In contrast, Domínguez and McCann (1998) consider the
binary choice between support for the incumbent PRI party and opposition voting in the
1988 presidential election. This approach is also problematic because it implies an
inaccurate coding of the dependent variable. Suppose, for instance, that PAN voting in 1988
was positively associated with being female, while the opposite held for PRD voting;
then the model specification of Domínguez and McCann (1998) may lead to meaningless
estimates because gender effects go in different directions for different opposition parties.
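The pooling problem can be illustrated with a small numerical sketch. The utilities below are hypothetical and chosen only so that gender raises the utility of one opposition party and lowers that of the other by the same amount; they are not estimates from any study, and the logit form is used purely for convenience.

```python
import math

def mnl_probs(utilities):
    """Multinomial logit choice probabilities from systematic utilities."""
    exps = {party: math.exp(u) for party, u in utilities.items()}
    total = sum(exps.values())
    return {party: e / total for party, e in exps.items()}

def utilities(female):
    # Hypothetical values: being female raises PAN utility by 1
    # and lowers PRD utility by 1; PRI is the reference alternative.
    return {"PRI": 0.0, "PAN": 1.0 * female, "PRD": 1.0 * (1 - female)}

women = mnl_probs(utilities(1))
men = mnl_probs(utilities(0))

# Collapsing PAN and PRD into a single "opposition" category.
opp_women = women["PAN"] + women["PRD"]
opp_men = men["PAN"] + men["PRD"]

print(f"P(PAN): women {women['PAN']:.3f}, men {men['PAN']:.3f}")
print(f"P(PRD): women {women['PRD']:.3f}, men {men['PRD']:.3f}")
print(f"P(opposition): women {opp_women:.3f}, men {opp_men:.3f}")
```

In this constructed case gender clearly matters for each opposition party, yet the pooled probability of an opposition vote is the same for women and men, so a binary incumbent-versus-opposition model would report no gender effect.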
Still, the most common approach for modeling voter choice in Mexican elections
employs multinomial logit (MNL; Domínguez and McCann 1995; Lawson 2006; Magaloni 2006;
Moreno 2007). The primary limitation of the MNL model is that it assumes disturbances
are uncorrelated across alternatives, and therefore imposes the condition of Independence of
Irrelevant Alternatives (IIA) whereby adding or removing one of the candidates does not
affect the relative probabilities of choosing among the remaining candidates. This is an
implausible assumption in the context of multi‐candidate elections; in the Mexican case, this
assumption requires that voter decision‐making be unaffected by changes in the set of
candidates who are running for office.
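The IIA property can be checked numerically. The following minimal sketch uses hypothetical utilities for a single voter (not values from any estimated model) and shows that, under multinomial logit, dropping a candidate leaves the ratio of the remaining choice probabilities unchanged.

```python
import math

def mnl_probs(utilities):
    """Multinomial logit choice probabilities from systematic utilities."""
    exps = {name: math.exp(u) for name, u in utilities.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}

# Hypothetical systematic utilities for one voter.
u = {"Calderón": 0.5, "López Obrador": 0.7, "Madrazo": 0.0}

full = mnl_probs(u)
without_madrazo = mnl_probs({k: v for k, v in u.items() if k != "Madrazo"})

# Ratio of Calderón to López Obrador probabilities with and without Madrazo.
ratio_full = full["Calderón"] / full["López Obrador"]
ratio_reduced = without_madrazo["Calderón"] / without_madrazo["López Obrador"]

print(f"ratio with Madrazo:    {ratio_full:.6f}")
print(f"ratio without Madrazo: {ratio_reduced:.6f}")
```

Both ratios equal exp(0.5 - 0.7), whatever utility is assigned to Madrazo. If PRI identifiers who abandon Madrazo move disproportionately toward one of the two remaining candidates, this restriction would be violated.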
Therefore, modeling voter choice in multiparty elections requires a representation
accommodating potential correlation between disturbances across alternatives, such as the
multinomial probit model (MNP, Hausman and Wise 1978). Morgenstern and Zechmeister
(2001) estimate an MNP model of voter choice in the 1997 Mexican legislative election. Still,
their specification is limited because it does not include choice‐specific right‐hand‐side
variables.5 In this article, we estimate an MNP model considering choice‐specific variables
such as ideology distances, issue distances and pre‐election candidate evaluations. More
formally, our model specification can be written as follows:
$$U_{ij} = X_{ij}\beta + a_i\psi_j + v_{ij}, \quad \text{for } j = 1, 2, 3$$
where $X_{ij}$ is a vector of alternative‐specific covariates, $a_i$ is a vector of individual‐specific
covariates, and $v_{ij}$ is a normally distributed error term. Also, while $X_{ij}$ is associated with a
vector of coefficients $\beta$ which does not vary by candidate or individual decision maker, $a_i$
is associated with a choice‐specific vector of coefficients $\psi_j$ which takes different values for
each of the three candidates. As in other random utility specifications, we assume voters
are rational and choose the candidate who maximizes their utility.
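To make the random utility logic concrete, the sketch below approximates choice probabilities by simulation when errors are jointly normal and possibly correlated across candidates. It is a simplified illustration under stated assumptions: the systematic utilities and the covariance matrix are hypothetical, the function names are ours, and it does not reproduce the normalization or the estimation procedure described in the appendix.

```python
import math
import random

def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower

def simulate_choice_probs(systematic, covariance, draws=20000, seed=0):
    """Monte Carlo estimate of P(choose j) when U_j = systematic_j + v_j, v ~ N(0, covariance)."""
    rng = random.Random(seed)
    lower = cholesky(covariance)
    n = len(systematic)
    counts = [0] * n
    for _ in range(draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Correlated errors: v = L z, where L is the Cholesky factor.
        errors = [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
        utilities = [systematic[i] + errors[i] for i in range(n)]
        # The voter picks the utility-maximizing candidate.
        counts[utilities.index(max(utilities))] += 1
    return [c / draws for c in counts]

# Hypothetical systematic utilities (the covariate terms) for three candidates.
systematic = [0.0, 0.4, 0.3]
# Hypothetical covariance: errors for candidates 2 and 3 are positively correlated.
covariance = [[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.5],
              [0.0, 0.5, 1.0]]

probs = simulate_choice_probs(systematic, covariance)
print([round(p, 3) for p in probs])
```

Setting the off-diagonal terms to zero removes the correlation between alternatives; comparing the two sets of simulated probabilities shows how correlated disturbances change substitution patterns between candidates.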
In addition to improving the model specification relative to previous studies of voter
choice in Mexican elections, we estimate our MNP model using Bayesian estimation, which
exhibits advantages relative to the simulated likelihood approach used in previous studies
(Albert and Chib 1993; Imai and van Dyk 2005a; McCulloch and Rossi 1994; McCulloch et al.
2000).6 Specifically, Bayesian estimation has computational advantages relative to
simulation methods, and does not rely on asymptotic approximation. In a Bayesian
framework a Markov chain based sampler “can be used to draw directly from the
posterior distribution and perform finite sample likelihood inference to any degree of
accuracy” (McCulloch and Rossi 1994, 208). In this article, we fit our voter choice model
using R's MNP package (Imai and van Dyk 2005b). Details of the specification and estimation of
our voter choice model can be found in the appendix.
5 In addition, even though Morgenstern and Zechmeister (2001) consider individual
positions on issues, they disregard other variables that may affect both voter choice and
issue positions, such as regional dummies or party identifications, and this may result in
omitted variable bias.
Results
In Table 3 we present the estimated parameters of our multivariate model. The estimation
was done setting Madrazo as the baseline alternative, so coefficients should be interpreted
in terms of propensities to vote for López Obrador or Calderón, relative to Madrazo. The
first column shows the names of the explanatory variables included in our model. If we
focus on individual‐specific covariates (from the “intercept” row to the “personal economy”
row) then the second, third and fourth columns show the means and 95% confidence
intervals (CI) for López Obrador relative to Madrazo, and the fifth, sixth and seventh
columns show the means and 95% CI for Calderón relative to Madrazo. Then, if we focus
on the set of candidate‐specific variables (from the “ideology distance” row to the “shares
governor party ID” row), then the second, third and fourth columns give the means and 95%
CI for the coefficients. Finally, the last three rows show the mean and 95% CI for the
normalized value of López Obrador’s variance, the correlation between López Obrador and
Calderón, and Calderón’s variance.7
6 Morgenstern and Zechmeister (2001) estimate their MNP model using simulated
maximum likelihood.
We use the results of the model to compute the probability that each individual, conditional
on his or her own characteristics and other explanatory variables, votes for each of the
candidates. As mentioned previously, the observed proportions in our sample are: 20.0%
for Madrazo, 41.5% for López Obrador and 38.4% for Calderón. If we compute predicted
probabilities of voting for each candidate and then use them to predict individual choices,
then predicted proportions are 19.7% for Madrazo, 44.8% for López Obrador, and 35.5%
for Calderón.8 Overall, our model correctly classifies 72.5% of the individuals who
reported having voted for Madrazo, López Obrador or Calderón. This suggests that our
model contributes to explaining individual behavior, although it slightly under‐estimates
the proportion of voters supporting Calderón, and over‐estimates the proportion of voters
supporting López Obrador.9
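The classification figures can be cross-checked with a short calculation. The sketch below uses only the rounded proportions reported in the text and in footnote 9, and it assumes that the per-candidate rates in footnote 9 are shares of each candidate's actual supporters who were correctly classified; the variable names are hypothetical.

```python
# Observed sample proportions reported in the text.
observed = {"Madrazo": 0.200, "López Obrador": 0.415, "Calderón": 0.384}
# Per-candidate correct classification rates reported in footnote 9.
hit_rate = {"Madrazo": 0.631, "López Obrador": 0.791, "Calderón": 0.703}

# The rounded proportions sum to 0.999, so we renormalize by their total.
total = sum(observed.values())

# Overall accuracy as a weighted average of the per-candidate rates.
overall = sum(observed[c] * hit_rate[c] for c in observed) / total

# Model-free benchmarks mentioned in footnote 9.
uniform_guess = sum(observed[c] / 3 for c in observed) / total
coin_flip_top_two = 0.5 * (observed["López Obrador"] + observed["Calderón"]) / total
always_lopez_obrador = observed["López Obrador"] / total

print(f"overall accuracy:           {overall:.1%}")
print(f"equal-chance guess:         {uniform_guess:.1%}")
print(f"coin flip between top two:  {coin_flip_top_two:.1%}")
print(f"always López Obrador:       {always_lopez_obrador:.1%}")
```

Under that assumption, the weighted average reproduces the 72.5% overall rate, and the three benchmarks come out at roughly one third, 40% and 41.5%, in line with the figures quoted in footnote 9.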
7 The trace of the variance covariance matrix is set to 1 (see methodological appendix).
8 Average probabilities are 20.9%, 45.5% and 33.6% for Madrazo, López Obrador and
Calderón, respectively.
9 In particular, it performs substantially better than model‐free classification criteria such
as assuming individuals have equal chance of supporting any of the candidates (which
would correctly classify approx. 1/3 of our sample), assuming individuals have 50% chance
of voting for López‐Obrador or Calderón (which would correctly classify approx. 40% of
our sample), as well as assuming everyone would vote for either López Obrador or
Calderón (which would correctly classify 41.5% and 38.4% of our sample, respectively).
18
Region, urbanization and state government. According to Table 3, regional indicators
were significant for explaining voter choice, and effects are in the expected directions, with
individuals in the Center and Mexico City being more likely to support López Obrador
relative to Madrazo, and individuals in the South being less likely to support Calderón
relative to Madrazo. Also, respondents living in urban locations were more likely to
support Calderón relative to Madrazo, compared to respondents living in rural or mixed
locations. Finally, we find no evidence suggesting that voters were more likely to support a
candidate sharing his or her governor’s party identification.
Sociodemographic characteristics. At conventional levels of significance (>90%), race
was the only socio‐demographic variable significant for explaining voter choice. In
particular, we find that people with white skin were less likely to support López Obrador
relative to Madrazo, compared to respondents with light‐ or dark‐brown skin. The
remaining socio‐demographic variables were not significant, although signs went in the
expected direction—with women being less likely to support López Obrador over Calderón
but more likely to support Calderón over Madrazo—and more affluent individuals being
more likely to support Calderón relative to Madrazo.
9 (cont.) Still, the proportion of correctly classified individuals varies by alternative. Specifically, this
proportion was 63.1% for Madrazo, 79.1% for López‐Obrador and 70.3% for Calderón.
Ideology distance, issue positions and party identifications. We do not find that
individuals were more likely to support candidates closer to them in the ideological scale,
or that they based their choice on differences in opinions about commercial relationships
between México and the US. Also, individual positions on the issue of abortion in case of
rape were not significant for explaining voter choice. Moving to party identifications, we
find that PRD and PAN partisanships significantly affected voter choice in the expected
direction—with PRD and PAN identifiers being more likely to vote for López Obrador and
Calderón over Madrazo, respectively. In contrast, we find that people with PRI partisanship
were not more likely to vote for Madrazo relative to López Obrador or Calderón, a very
surprising finding. Still, we do find that respondents with older partisan attachments were
more likely to vote for Madrazo relative to both López Obrador and Calderón. A probable
explanation of the latter two results is that while recent PRI partisans were likely to desert
their party’s nominee and vote for an opposition candidate—perhaps based on strategic
motivations—old PRI partisans were more likely to follow party cues and support their
party nominee. Last, in line with expectations, individuals approving of President Fox were
significantly more likely to support Calderón relative to Madrazo.
Candidate evaluations and the economy. Regarding retrospective evaluations of the
national economy, we find that higher evaluations had a positive and significant effect on
the probability of voting for Calderón over Madrazo. In contrast, evaluations of personal
economic conditions were not significant for explaining voter choice. In addition,
evaluations of candidates’ ability to manage the economy and poverty, as well as
evaluations of candidates’ honesty, had a positive and significant effect on voter choice.
Evaluations of candidates’ ability to fight crime were not significant, although signs went in
the expected direction.
Voter confidence and attitudes towards post‐electoral conflicts. Our model includes
both separate indicators of voter confidence and willingness to participate in post‐electoral
conflicts, as well as an interaction between both variables. At conventional levels of
significance (>90%), none of these variables had a significant direct effect on voter choice.
Also, signs contradict our seventh hypothesis—individuals with more confidence in the
cleanness of the electoral process were less likely to vote for Calderón over Madrazo.
In addition, the estimated value of the normalized covariance between voting for
López Obrador and voting for Calderón is positive and significant. This is evidence of
violation of the condition of Independence of Irrelevant Alternatives (IIA), and implies that
if Madrazo were to drop‐out from the race, then the relative probabilities of choosing
between López Obrador and Calderón would not remain the same. Therefore, it would be
inappropriate to model voter choice in the 2006 presidential election using specifications
that assume IIA holds, such as multinomial logit regressions.
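To see what IIA imposes, the following Python sketch computes multinomial logit choice probabilities with and without the third candidate. The utilities are hypothetical values chosen for illustration and are not estimates from Table 3; the point is only that, under logit, the odds between the two remaining candidates cannot change when Madrazo is removed.

```python
import math

def logit_probs(utilities):
    """Multinomial logit choice probabilities for a dict of systematic utilities."""
    denom = sum(math.exp(u) for u in utilities.values())
    return {c: math.exp(u) / denom for c, u in utilities.items()}

# HYPOTHETICAL systematic utilities for one voter (illustrative values only).
u3 = {"Madrazo": -0.8, "Lopez Obrador": 0.3, "Calderon": 0.4}
u2 = {c: u for c, u in u3.items() if c != "Madrazo"}  # Madrazo drops out

p3, p2 = logit_probs(u3), logit_probs(u2)

# Odds of Calderon relative to Lopez Obrador in each scenario.
odds_with = p3["Calderon"] / p3["Lopez Obrador"]
odds_without = p2["Calderon"] / p2["Lopez Obrador"]
print(odds_with, odds_without)  # both equal exp(0.4 - 0.3), up to floating point error
```

Because the logit odds depend only on the difference between the two utilities, no pattern of substitution between candidates can be captured, which is the restriction the estimated covariance suggests is violated here.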
Finally, we note that when the model is estimated within each imputed data set,
standard deviations are usually smaller than after pooling. Therefore, some coefficients are significant within
imputations but not after pooling results across imputed data sets. For instance, receiving
benefits from anti‐poverty programs, PRI partisanship, voter confidence, willingness to
participate in protests, as well as the interaction between the last two variables are all
significant for explaining voter choice within several of the imputed data sets, but not after
computing the overall results. This suggests that the multiple imputation procedure indeed
reflects the uncertainty about the missing values, and this uncertainty translates into lower
levels of significance for some model parameters.
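The loss of significance after pooling can be illustrated with Rubin's (1987) combining rules, in which the pooled variance adds a between‐imputation component to the average within‐imputation variance. The Python sketch below is an assumption‐laden illustration: the estimates and standard errors are hypothetical, and the exact pooling procedure used for the Bayesian model in this paper is the one described in the methodological appendix, which may differ.

```python
import math
from statistics import mean, variance

def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed data sets with Rubin's rules.
    Returns (pooled estimate, pooled standard error)."""
    m = len(estimates)
    q_bar = mean(estimates)                          # pooled point estimate
    within = mean([se ** 2 for se in std_errors])    # average within-imputation variance
    between = variance(estimates)                    # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between           # total variance
    return q_bar, math.sqrt(total)

# HYPOTHETICAL estimates and standard errors from five imputed data sets.
est = [0.42, 0.55, 0.31, 0.60, 0.47]
se = [0.20, 0.21, 0.19, 0.22, 0.20]

q, pooled_se = pool_rubin(est, se)
print("pooled estimate:", q)
print("pooled SE:", pooled_se, "largest within-imputation SE:", max(se))
```

In this example several individual estimates exceed twice their own standard error, but the pooled estimate does not exceed twice the pooled standard error, mirroring the pattern described in the text.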
Marginal effects
The multinomial probit specification assumes that voter choice is non‐linear in the
explanatory variables. Even though the separate analysis of the coefficients presented in
Table 3 may provide an idea of the direct effect of right‐hand‐side variables on voter
choice, it does not provide an accurate idea of the total impact or magnitude of these
changes. For this purpose, it is better to compute the change in the probability of
supporting the different candidates as a result of marginal changes in each of the
explanatory variables. In Table 4 we show estimated marginal effects for a hypothetical
voter.10 The probabilities that this individual supports Madrazo, López Obrador or
Calderón are 12.6%, 43.0% and 43.6%, respectively.
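A marginal effect of this kind is the difference between two sets of predicted choice probabilities, one for the baseline voter and one after changing a single covariate. The Python sketch below approximates multinomial probit probabilities by simulation. It is a simplified illustration: the utilities and covariance matrix are hypothetical, errors are placed on utility levels rather than on normalized differences, and the paper's own computations rely on the posterior draws described in the appendix.

```python
import numpy as np

def mnp_probs(utilities, cov, n_draws=200_000, seed=0):
    """Monte Carlo choice probabilities for a multinomial probit:
    each draw adds correlated normal errors to the systematic utilities,
    and the alternative with the highest total utility is chosen."""
    rng = np.random.default_rng(seed)
    errors = rng.multivariate_normal(np.zeros(len(utilities)), cov, size=n_draws)
    choices = np.argmax(np.asarray(utilities) + errors, axis=1)
    return np.bincount(choices, minlength=len(utilities)) / n_draws

# HYPOTHETICAL values, ordered (Madrazo, Lopez Obrador, Calderon).
cov = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.3],
                [0.0, 0.3, 1.0]])
baseline = [-1.0, 0.2, 0.2]   # utilities for the baseline voter
changed = [-1.0, 0.2, 0.5]    # same voter after a covariate raises Calderon's utility

# The same seed is used twice so both scenarios share error draws,
# which reduces simulation noise in the difference.
p0 = mnp_probs(baseline, cov)
p1 = mnp_probs(changed, cov)
print("change in probabilities (percentage points):", 100 * (p1 - p0))
```

The changes sum to zero across candidates by construction, which is why an increase for one candidate in Table 4 is always offset by decreases for the others.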
10 Our hypothetical individual is female, has brown skin, is 41 years old, lives in an urban
location in Mexico City, has completed middle or technical school, attends church once a
month, earns between 2,600 and 3,900 pesos per month, receives or has received
benefits from anti‐poverty programs, supports abortion in case of rape, is somewhat
confident in the cleanness of the electoral system, is unwilling to participate in protests
against the official election outcome, approves a little of Fox, is an independent voter, is
located at the center of the ideology spectrum, and believes that the state of both the
national and personal economy has stayed the same.
Effects of region, urbanness, demographic variables and party identifications
The magnitude of regional effects suggests that this is one of the most important factors for
explaining voter choice. Specifically, when the location of our hypothetical voter changes
from Mexico City to the South there is a 17 percentage point increase in the probability of
supporting Madrazo, and 10 and 8 percentage point decreases in the probability of
supporting Calderón and López Obrador, respectively. Also, changing location from Mexico
City to the Center causes a 10 percentage point decrease in the chance of voting for López
Obrador. Finally, changing location from Mexico City to the North causes 13 and 8 percentage
point increases in the probability of voting for Calderón and Madrazo, respectively, and a
21 percentage point reduction in the probability of voting for López Obrador. In addition,
urbanness effects reflect that when our hypothetical voter changes from an urban to a rural
or mixed location, the chance of supporting Madrazo increases by 5 percentage points.
Moving to the impact of socio‐demographic variables, we find that race, gender and
receiving benefits from anti‐poverty programs have considerable effects on voter choice.
Specifically, having white instead of brown skin causes a 10 percentage point decrease in
the probability of voting for López Obrador. Regarding gender, a male hypothetical voter is
11 percentage points more likely to vote for López Obrador and 10 percentage points less
likely to vote for Calderón, compared to a female voter. Last, switching from current or past
receipt of anti‐poverty benefits to never having received these types of benefits
decreases the probability that our hypothetical voter supports Calderón by 7 percentage
points, and increases the probability that she supports López Obrador by 6 percentage
points (although these effects are not significant at conventional confidence levels).
Next, the other set of variables causing large changes in voting probabilities are
those related to party identification. When our hypothetical voter is a PRI identifier instead
of independent, the chance that she votes for Madrazo is 7 percentage points higher, and
the chances that she votes for López Obrador or Calderón decline by 5 and 2 percentage
points, respectively (although these effects are not significant at the 90% level). Also, in line with the
findings of Table 3, being a PRD or PAN identifier, instead of an independent voter, greatly
increases the probabilities of voting for López Obrador and Calderón by 41 and 23
percentage points, respectively, while it significantly decreases the likelihood of voting for
other candidates.
Effect of the economy and candidate evaluations
Changes in retrospective evaluations of the national economy have small but significant effects on the
behavior of our hypothetical voter. In particular, increasing the evaluation of the national
economy from “stayed the same” to “improved a little” raises the probability of supporting
Calderón by 5 percentage points. On the other hand, assessments of personal economic
conditions had small and non‐significant effects on voter choice. Moving to assessments of
candidate‐specific traits, we find that evaluations of candidates’ abilities to manage the
economy and poverty have significant effects on the chances of voting for the different
candidates. Since coefficients associated with these evaluations do not change across
candidates, we only show the impact of these effects for López Obrador, although effects
should go in similar directions for Calderón and Madrazo. Table 5 shows that when
evaluations go from “not at all capable” to “very capable” of managing the economy, crime
or poverty, the probability of voting for López Obrador increases between 7 and 10
percentage points. Most importantly, when the honesty evaluation changes from “not at all
honest” to “very honest” the probability of voting for López Obrador increases by 21
percentage points, showing that perceptions of honesty are also very important for
explaining voter choice.
Effect of changes in ideology distances
To measure the effect of candidate movements in the ideology scale, we computed the
probabilities of voting for López Obrador and Calderón for different levels of candidate
ideology, leaving constant the ideology position of our hypothetical individual (who is
located at the center of the ideology scale) as well as the position of other candidates and the
level of all other explanatory variables (see Figure 2). Everything else constant, choice
probabilities are maximized when the candidates move to the center of the ideology
spectrum. Specifically, if López Obrador moves from “center‐left” to “center” his choice
probability increases from approximately 43.0% to 44.7%, and something similar happens
when Calderón moves from “center‐right” to “center.” Still, the effects of these
movements are statistically indistinguishable from zero, and are small compared to the
impact of other variables such as region or party identifications.
Effect of voter confidence and attitudes toward postelectoral conflicts
According to Table 4, when our hypothetical voter changes from being somewhat confident
in the cleanness of the electoral process to being totally confident, the probability she
supports Calderón decreases significantly by 6 percentage points, while the probability
that she supports López Obrador increases by a similar magnitude. This suggests that it is
not the case that higher levels of voter confidence increase support for the incumbent. In
addition, when our hypothetical voter changes from being unwilling to participate in
demonstrations against official election outcomes to being willing to participate in protests,
the chance that she supports López Obrador increases by 6 percentage points and the
chance that she supports Calderón or Madrazo decreases by 4 and 2 percentage points,
respectively. Still, the impact of willingness to participate in protests is not significant at
conventional confidence levels (90%).
Effect of changes in the number of candidates running for office
In this section we address the following question: What would have happened if the third
candidate, Madrazo, had dropped out from the race? Would the relative probabilities of
choosing Calderón over López Obrador have remained the same? Or would the odds have shifted
decisively in favor of one of the remaining candidates? One of the advantages of estimating
a multinomial probit model—compared, for instance, to the multinomial logit specification
usually employed in the Mexican politics literature—is that it allows error terms to be
correlated across choices, and enables the simulation of changes in choice probabilities as a
result of a change in the set of candidates running for office.11 In this paper, we used
estimated coefficients and components of the variance‐covariance matrix of normalized
error terms to simulate changes in individual choice probabilities under a hypothetical
scenario where Calderón and López Obrador are the only two candidates running for
office.12
11 In the multinomial logit specification, relative probabilities are assumed to be unaffected
by changes in the choice set. This assumption is often termed “Independence of Irrelevant
Alternatives.” For details about how the multinomial probit specification relaxes IIA, see
Alvarez and Nagler 1995.
12 The appendix contains a detailed explanation of the simulation procedure.
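The idea behind such a simulation can be sketched in Python as follows. This is not the procedure from the appendix: the utilities and covariance matrix are hypothetical, errors are placed on utility levels, and the strong correlation between the Madrazo and López Obrador errors is chosen only to make the effect easy to see (the paper's estimated covariance concerns López Obrador and Calderón). The sketch reuses the same error draws for both choice sets and compares the odds of Calderón relative to López Obrador.

```python
import numpy as np

def simulate_shares(utilities, cov, keep, n_draws=200_000, seed=0):
    """Simulated choice shares among the alternatives indexed by `keep`,
    using the same error draws regardless of the choice set."""
    rng = np.random.default_rng(seed)
    errors = rng.multivariate_normal(np.zeros(len(utilities)), cov, size=n_draws)
    total = (np.asarray(utilities) + errors)[:, keep]   # keep only available candidates
    winners = np.argmax(total, axis=1)
    return np.bincount(winners, minlength=len(keep)) / n_draws

# HYPOTHETICAL inputs, ordered (Madrazo, Lopez Obrador, Calderon).
utilities = [0.0, 0.0, 0.0]
cov = np.array([[1.0, 0.9, 0.0],
                [0.9, 1.0, 0.0],
                [0.0, 0.0, 1.0]])

full = simulate_shares(utilities, cov, keep=[0, 1, 2])   # three-candidate race
two_way = simulate_shares(utilities, cov, keep=[1, 2])   # Madrazo drops out

odds_full = full[2] / full[1]        # Calderon vs Lopez Obrador with Madrazo running
odds_two = two_way[1] / two_way[0]   # same odds after Madrazo drops out
print(odds_full, odds_two)
```

With correlated errors the two odds differ, because the candidate who drops out releases voters disproportionately to the alternative whose unobserved appeal is most similar; a multinomial logit would force the two numbers to coincide.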
Figure 3 shows the distribution of the normalized odds of voting for Calderón,
relative to López Obrador under two different scenarios: one where PRI's candidate
Madrazo is running for office (black lines); and another one where Madrazo drops out from
the race (grey lines).13 These densities are computed including all voters, as well as for
groups of respondents who reported voting for a specific candidate: Madrazo (upper right
plot), López‐Obrador (lower left plot) and Calderón (lower right plot). Overall, the
distribution of odds is very similar under both hypothetical scenarios, and there is no
significant change in mean or median values of these distributions. However, a visual
examination of the distribution of preferences of Madrazo voters suggests that these
individuals are less certain about their preferences for López‐Obrador and Calderón after
Madrazo drops out from the race—in that scenario, there is a larger mass of individuals
located closer to the center of the distribution. In addition, while the distribution is clearly
multimodal when Madrazo takes part in the contest (with one set of voters preferring
López Obrador relative to Calderón, and another one with opposite preferences), the mode
located to the left of the scale is smoothed when Madrazo drops out from the race,
suggesting that there are fewer voters feeling highly certain about their preference for
Calderón over López Obrador. Even though we do not find significant changes in the
predicted vote shares received by the different candidates, Figure 3 suggests that a
scenario where Madrazo dropped out from the race would have been somewhat more
challenging for Calderón.
13 Higher numbers correspond to higher probability of voting for Calderón relative to López
Obrador.
Conclusion
The purpose of our analysis was to use survey data from the 2006 Mexican presidential
election to learn about how citizens vote in a new democracy. Similar to previous studies,
we found that region and party identification are central for explaining presidential choice,
followed by socio‐demographic characteristics such as race, gender and reception of anti‐poverty
benefits. To a lesser degree, we found that retrospective evaluations of the national
economy, as well as evaluations of candidates’ honesty and ability to manage the economy
and poverty, also have a significant impact on voter decisions. Regarding ideology
distances, even though effects were small, we found that movements of any candidate to
the center of the ideology scale could have caused decisive changes in the election
outcome—given the closeness of the race. Finally, we found that voter confidence and
attitudes towards post‐electoral conflicts had substantial effects on voter choice,
suggesting that, even after the fall of the old regime, voter choice was affected by citizens’
perception of the quality of the electoral process.
How should we interpret the finding that geography looms so large in voter
decisions? We discard the proposition that the unequal distribution of socio‐economic
factors across the country explains the spatial distribution of voter choice. Our model
controlled for a diverse set of socio‐economic indicators, including household income,
educational attainment, and reception of anti‐poverty benefits; and even then, region
remained significant for explaining voter choice. A second proposition that we do not find
support for is that the larger industrial development in the North, and strong commercial
relationships with the US in that region, drive differences in voter choice between northern
and southern regions of the country. Our model controlled for individual positions on
Mexico‐US commercial relationships, as well as whether the respondent lived in an urban
location—which should be positively correlated with industrial employment, and
negatively correlated with rural employment such as peasantry—and even then, region
remains an important factor for explaining voter choice. In a similar way, we do not find
evidence in favor of alternative explanations of the importance of region based on ideology,
on differences in distributions of party identifications, on confidence in the cleanness of the
electoral process, as well as economic explanations of the effect of region. After controlling
for all of these factors, the direct effect of region on voter choice remains strong and
significant. This leads us to conclude that geographic representation is of utmost
importance in Mexico’s new democracy.
The other variable found to have a large impact on voter choice is party
identification. This result stands in contrast to recent trends in advanced democracies,
where researchers have found a decrease in the importance of party identification as an
explanation of voter choice—to the extent that independent voters have been found to be
more consistent in their partisan choice than party identifiers (Clarke and Stewart 1998,
362). A possible explanation of this difference is that partisan attachments may not have
the same meaning in Mexico as they do, for instance, in the United States. In particular,
partisanship in Mexico may not be a long‐term attitude toward one particular party, but a
short‐term feeling developed during the electoral period, indistinguishable from voter
choice—it may be indeed the case that some “cognitive and attitudinal constructs are so
proximate to vote choice that they constitute part of what is to be explained” (Evans 2000,
401). In addition, Magaloni (2006, 209) writes that the concept of party identification is
problematic in Mexico, when researchers use cross‐sectional data, because respondents do
not distinguish the difference between preferences for the different candidates and voting
intentions. If that were the case, the inclusion of party identification in our model would be
somewhat questionable.
Still, the party identification measure used in our model was taken from the first
wave of the Mexico Panel Study, conducted approximately nine months before Election
Day, when some of the candidates were still largely unknown to parts of the population.14
Therefore, we are confident that the measure of party identification used in our analysis is
not equivalent to voter choice. Moreover, the phrasing of the partisanship question used in
the MPS is similar to that used in ANES studies in the US. According to Clarke and Stewart
(1998, 361) “it is argued that the ANES question prompts long‐term thinking,” compared to
alternative phrasings such as that used by Gallup which asks about party identification “as
of today.” Since more than 39% of our sample reports having relatively old partisan
attachments (older than 11 years), excluding party identification from our model would be
problematic, because we would not be able to distinguish between determinants of voter
choice and determinants of partisan attachments. Still, since party identification has such a
strong impact on voter choice, even after controlling for a broad set of socio‐economic
variables and political attitudes, we believe it is important that future studies focus on
exploring the determinants of party identification in Mexico. One possible explanation of
the impact of party identification is that local politics determine preferences for
presidential candidates. We reject this proposition because our model contained an
indicator of the party identification of the state governor, and we found that sharing the
same party identification as the state governor has no impact on the probability that voters
choose a given candidate.
14 Actually, all explanatory variables included in our model were measured two to nine
months before the election, while the dependent variable was measured after Election Day.
Overall, our results do not break with previous findings in the Mexican literature,
but add novel and interesting insights about the causes of political behavior. Therefore, our
results should be relevant to scholars of Mexican politics as well as those interested in
voter behavior in new democracies. Moreover, our analysis is also important for its
methodological contributions, including: use of multiple imputation to incorporate our
uncertainty about the nature of the non‐responses; specification of a statistical model
appropriate for studying voter choice in multi‐candidate elections; and consideration of
candidate‐specific variables, such as ideology distances, that are useful for testing the spatial
theory of voting or the effect of strategic movements by candidates or parties.
Therefore, this article should also be pertinent to a more general audience interested in
statistical models of voter choice. We found that our multiple imputation procedure
affected both the magnitude and significance of our parameters; that there is positive and
significant correlation between voting for López Obrador and Calderón, suggesting a
violation of the IIA condition; and we also found that several candidate‐specific variables
were significant for explaining voter choice. These results suggest that our methodological
advances were worthwhile and indeed necessary to gain an accurate understanding of voter
behavior in the 2006 Mexican presidential election.
References
Albert, James H. and Siddhartha Chib. 1993. Bayesian Analysis of Binary and Polychotomous Response Data. Journal of the American Statistical Association 88(422):669‐79.
Alvarez, R. Michael. 1997. Information and Elections. Ann Arbor: University of Michigan Press.
Alvarez, R. Michael and Gabriel Katz. 2008. Structural Cleavages, Electoral Competition and Partisan Divide: A Bayesian Multinomial Probit Analysis of Chile’s 2005 Election. Electoral Studies 28(2): 177‐89.
Alvarez, R. Michael and Jonathan Nagler. 1995. Economics, Issues and the Perot Candidacy: Voter Choice in the 1992 Presidential Election. American Journal of Political Science 39(3):714‐44.
Alvarez, R. Michael and Jonathan Nagler. 1997. Economics, Entitlements and Social Issues: Voter Choice in the 1996 Presidential Election. American Journal of Political Science 42(4):1349‐63.
Alvarez, R. Michael and Jonathan Nagler. 1998. When Politics and Models Collide: Estimating Models of Multiparty Elections. American Journal of Political Science 42(1): 55‐96.
Alvarez, R. Michael, Jonathan Nagler and Shaun Bowler. 2000. Issues, Economics, and the Dynamics of Multiparty Elections: The British 1987 General Election. American Political Science Review 94(1): 131‐49.
Alvarez, R. Michael, Jonathan Nagler and Jennifer Willette. 2000. Measuring the Relative Impact of Issues and the Economy in Democratic Elections. Electoral Studies 19(2‐3): 237‐53.
Aparicio, Ricardo. 2002. La Magnitud de la Manipulación del Voto en las Elecciones Federales del Año 2000. Perfiles Latinoamericanos 20: 79‐102.
Bruhn, Kathleen. 2009. López Obrador, Calderón and the 2006 Presidential Campaign. In Consolidating Mexico’s Democracy, ed. Jorge I. Domínguez, Chappell Lawson and Alejandro Moreno. Maryland: Johns Hopkins University Press. Pp. 169‐88.
Bruhn, Kathleen and Greene, Kenneth F. 2007. Elite Polarization Meets Mass Moderation in Mexico’s 2006 Elections. PS: Political Science and Politics 40(1):33‐8.
Bruhn, Kathleen and Greene, Kenneth F. 2009. The Absence of Common Ground Between Candidates and Voters. In Consolidating Mexico’s Democracy, ed. Jorge I. Domínguez, Chappell Lawson and Alejandro Moreno. Maryland: Johns Hopkins University Press. Pp. 109‐28.
Clarke, Harold D. and Marianne C. Stewart. 1998. The Decline of Parties in the Minds of Citizens. Annual Review of Political Science 1:357‐78.
Cornelius, Wayne A. 2002. La Eficacia de la Compra y Coacción del Voto en las Elecciones Mexicanas de 2000. Perfiles Latinoamericanos 20: 11‐31.
Cortina, Jeronimo, Andrew Gelman and Maria Lasala‐Blanco. 2008. One vote, many Mexicos: Income and Vote Choice in the 1994, 2000, and 2006 Presidential Elections. Manuscript. Paper presented at the annual meeting of the MPSA Annual National Conference, 2008.
Craig, Ann L., and Wayne A. Cornelius. 1995. Houses Divided: Parties and Political Reform in Mexico. In Building Democratic Institutions: Party Systems in Latin America, eds. Scott Mainwaring and Timothy R. Scully. Stanford: Stanford University Press, pp. 249‐97.
Díaz‐Cayeros, Alberto and Beatriz Magaloni. 2001. Party Dominance and the Logic of Electoral Design in Mexico’s Transition to Democracy. Journal of Theoretical Politics 13(3):271‐93.
Díaz‐Cayeros, Alberto, Beatriz Magaloni, and Barry R. Weingast. 2003. Tragic Brilliance: Equilibrium Hegemony and Democratization in Mexico. Manuscript. Stanford University.
Díaz‐Santana, Héctor. 2002. El Ejercicio de las Instituciones Electorales en la Manipulación del Voto en México. Perfiles Latinoamericanos 20: 101‐128.
Domínguez, Jorge I. 2009a. The Choices of Voters During the 2006 Presidential Election in Mexico. In Consolidating Mexico’s Democracy, ed. Jorge I. Domínguez, Chappell Lawson and Alejandro Moreno. Maryland: Johns Hopkins University Press. Pp. 285‐303.
Domínguez, Jorge I., Chappell Lawson and Alejandro Moreno. 2009b. Consolidating Mexico’s Democracy. Maryland: Johns Hopkins University Press.
Domínguez, Jorge I. and James A. McCann. 1995. Shaping Mexico’s Electoral Arena: The Construction of Partisan Cleavages in the 1988 and 1991 National Elections. American Political Science Review 89(1): 34‐48.
Downs, Anthony. 1957. An Economic Theory of Democracy. New York: Harper and Row.
Duch, Raymond M. and Randolph T. Stevenson. 2008. The Economic Vote: How Political and Economic Institutions Condition Election Results. New York: Cambridge University Press.
Eisenstadt, Todd A. 2007. The Origins and Rationality of the “Legal versus Legitimate” Dichotomy Invoked in Mexico’s 2006 Post‐Electoral Conflict. PS: Political Science and Politics 40(1):39‐43.
Enelow, James M. and Melvin J. Hinich. 1984. The Spatial Theory of Voting: An Introduction. New York: Cambridge University Press.
Estrada, Luis and Alejandro Poiré. 2007. Taught to Protest, Learn to Lose? Journal of Democracy 18(1):73‐87.
Evans, Geoffrey. 2000. The Continued Significance of Class Voting. Annual Review of Political Science 3: 401‐17.
Fiorina, Morris P. 1978. Economic Retrospective Voting in American National Elections: A Micro‐Analysis. American Journal of Political Science 22(2):426‐43.
Fiorina, Morris P. 1981. Retrospective Voting in American National Elections. New Haven: Yale University Press.
Gelman, Andrew and Donald B. Rubin. 1992. Inference for Iterative Simulation Using Multiple Sequences. Statistical Science 7: 457‐472.
Geweke, John, Michael Keane and David Runkle. 1994. Alternative Computational Approaches to Inference in the Multinomial Probit Model. The Review of Economics and Statistics 76(4):609‐32.
Greene, Kenneth F. 2009. Images and Issues in Mexico’s 2006 Presidential Election. In Consolidating Mexico’s Democracy, ed. Jorge I. Domínguez, Chappell Lawson and Alejandro Moreno. Maryland: Johns Hopkins University Press. Pp. 246‐67.
Hausman, Jerry A. and David A. Wise. 1978. A Conditional Probit Model for Qualitative Choice: Discrete Decisions Recognizing Interdependence and Heterogeneous Preferences. Econometrica 46(2): 403‐26.
Hiskey, Jonathan T. and Shaun Bowler. 2005. Local Context and Democratization in Mexico. American Journal of Political Science 49(1): 57‐71.
Infante, José M. 2005. Elecciones en México: Restricciones, Fraudes y Conflictos. Confines 1/2: 65‐78.
Imai, Kosuke and David A. van Dyk. 2005a. A Bayesian Analysis of the Multinomial Probit Model using Marginal Data Augmentation. Journal of Econometrics 124(2):311‐34.
Imai, Kosuke and David A. van Dyk. 2005b. MNP: R Package for Fitting the Multinomial Probit Model. Journal of Statistical Software 14(3).
Kiewiet, D. Roderick. 1983. Macroeconomics and Micropolitics: The Electoral Effects of Economic Issues. Chicago: University of Chicago Press.
Kinder, Donald R. and D. Roderick Kiewiet. 1981. Sociotropic Politics: the American Case. British Journal of Political Science 11(2):129‐41.
Klesner, Joseph L. 2007. The 2006 Mexican Elections: Manifestations of a Divided Society? PS: Political Science and Politics 40(1):27‐32.
Klesner, Joseph L. 2001. The End of Mexico’s One‐Party Regime. PS: Political Science & Politics 34(1): 107‐14.
Klesner, Joseph L. 2009. A Sociological Analysis of the 2006 Elections. In Consolidating Mexico’s Democracy, ed. Jorge I. Domínguez, Chappell Lawson and Alejandro Moreno. Maryland: Johns Hopkins University Press. Pp. 50‐70.
Klesner, Joseph L. and Chappell Lawson. 2004. “Adiós” to the PRI? Changing Voter Turnout in Mexico’s Political Transition. Mexican Studies 17(1): 17‐39.
Lawson, Chappell. 2009. The Mexican 2006 Election in Perspective. In Consolidating Mexico’s Democracy, ed. Jorge I. Domínguez, Chappell Lawson and Alejandro Moreno. Maryland: Johns Hopkins University Press. Pp. 1‐25.
Lawson, Chappell. 2007b. How Did We Get Here? Mexican Democracy After the 2006 Elections. PS: Political Science and Politics 45‐8.
Lawson, Chappell. 2007a. The Mexico 2006 Panel Study. Waves 1, 2 and 3. http://web.mit.edu/polisci/research/mexico06 (Dec 22, 2009).
Lawson, Chappell and Alejandro Moreno. 2007. El Estudio de Panel 2006: Midiendo el Cambio de Opiniones Durante La Campaña Presidencial. Política y Gobierno 14(2): 437‐65.
Lewis‐Beck, Michael S. and Mary Stegmaier. 2000. Economic Determinants of Electoral Outcomes. Annual Review of Political Science 3:183‐219.
Lewis‐Beck, Michael S. and Mary Stegmaier. 2006. Economic Models of Voting. In Oxford Handbook of Political Behavior, ed. Russell Dalton and Hans‐Dieter Klingemann. Oxford: Oxford University Press.
Little, Roderick J. A. 1988. Missing‐Data Adjustments in Large Surveys. Journal of Business & Economic Statistics 6(3):287‐301.
Magaloni, Beatriz. 2006a. Voting for Autocracy: Hegemonic Party Survival and its Demise in Mexico. New York: Cambridge University Press.
Magaloni, Beatriz. 2006b. How Voters Choose and Mass Coordination Dilemmas: Evidence From the 1994, 1997, and 2000 National Elections. In Voting for Autocracy: Hegemonic Party Survival and its Demise in Mexico. New York: Cambridge University Press.
Markus, Gregory B. 1988. The impact of personal and national economic conditions on the presidential vote: a pooled cross‐sectional analysis. American Journal of Political Science 32(1): 137‐54.
McCann, James A. 2009. Ideology in the 2006 Campaign. In Consolidating Mexico’s Democracy, ed. Jorge I. Domínguez, Chappell Lawson and Alejandro Moreno. Maryland: Johns Hopkins University Press. Pp. 268‐83.
McCann, James A. and Jorge I. Domínguez. 1998. Mexicans React to Electoral Fraud and Political Corruption: an Assessment of Public Opinion and Voting Behavior. Electoral Studies 17(4): 483‐503.
McCulloch, Robert, Nicholas G. Polson and Peter E. Rossi. 2000. A Bayesian Analysis of the Multinomial Probit Model with Fully Identified Parameters. Journal of Econometrics 99(1): 173‐93.
McCulloch, Robert and Peter E. Rossi. 1994. An Exact Likelihood Analysis of the Multinomial Probit Model. Journal of Econometrics 64(1‐2): 207‐40.
Moreno, Alejandro. 2007. The 2006 Mexican Presidential Election: The Economy, Oil Revenues, and Ideology. PS: Political Science and Politics 40(1): 15‐9.
Moreno, Alejandro. 2009. The Activation of Economic Voting in the 2006 Campaign. In Consolidating Mexico’s Democracy, ed. Jorge I. Domínguez, Chappell Lawson and Alejandro Moreno. Maryland: Johns Hopkins University Press. Pp. 209‐228.
Moreno, Alejandro. 2006. Changing Ideological Dimensions of Party Competition, 1990‐2006. In The Global System, Democracy and Values, ed. Yilmaz Esmer and Thorleif Pettersson. Uppsala: Uppsala University Press.
Morgenstern, Scott and Elizabeth Zechmeister. 2001. Better the Devil You Know than the Saint You Don’t? Risk Propensity and Vote Choice in Mexico. Journal of Politics 63(1): 93‐119.
Nobile, Agostino. 1998. A Hybrid Markov Chain for the Bayesian Analysis of the Multinomial Probit Model. Statistics and Computing 8(3): 229‐242.
Poiré, Alejandro. 2000. Un Modelo Sofisticado de Decisión Electoral Racional: El Voto Estratégico en México, 1997. Política y Gobierno 7(2): 353‐82.
Roberts, Gareth O. 1996. Markov Chain Concepts Related to Sampling Algorithms. In Markov Chain Monte Carlo in Practice, ed. Walter R. Gilks, Sylvia Richardson and David J. Spiegelhalter. London: Chapman & Hall.
Rossi, Peter E., Greg M. Allenby and Robert McCulloch. 2005. Bayesian Statistics and Marketing. West Sussex: John Wiley and Sons.
Rubin, Donald B. 1987. Multiple Imputation for Nonresponse in Surveys. New York: John Wiley and Sons.
Schedler, Andreas. 2004. “El Voto es Nuestro”: Cómo los Ciudadanos Perciben el Clientelismo Electoral. Revista Mexicana de Sociología 66(1): 57‐97.
Schedler, Andreas. 2000. Mexico’s Victory: The Democratic Revelation. Journal of Democracy 11(4): 5‐19.
Tierney, Luke. 1994. Markov Chains for Exploring Posterior Distributions. Annals of Statistics 22(4): 1701‐28.
van Buuren, Stef and Karin Groothuis‐Oudshoorn. 2009. MICE: Multivariate Imputation by Chained Equations in R. Forthcoming in Journal of Statistical Software.
APPENDIX
A1. Tables and Figures
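Table 1: Summary Statistics

| Variable | Category | Madrazo | López Obrador | Calderón |
|---|---|---|---|---|
| Overall | | 20 | 42 | 38 |
| Region | Center | 21 | 29 | 51 |
| | Mexico and DF | 9 | 62 | 30 |
| | North | 25 | 15 | 59 |
| | South | 32 | 39 | 28 |
| Urbanness | Rural or mixed | 33 | 33 | 34 |
| | Urban | 13 | 46 | 41 |
| Income | 0 to 1,299 | 32 | 43 | 25 |
| | 1,300 to 1,999 | 24 | 39 | 37 |
| | 2,000 to 2,599 | 17 | 46 | 37 |
| | 2,600 to 3,999 | 26 | 42 | 32 |
| | 4,000 to 5,199 | 15 | 40 | 45 |
| | 5,200 to 6,499 | 20 | 45 | 36 |
| | 6,500 to 7,899 | 18 | 48 | 34 |
| | 7,900 to 9,199 | 13 | 40 | 48 |
| | 9,200 to 10,499 | 19 | 39 | 42 |
| | 10,500 or more | 6 | 32 | 62 |
| Benefit antipoverty program | No | 15 | 47 | 38 |
| | Yes | 31 | 32 | 36 |
| Age | Younger than 30 | 18 | 38 | 44 |
| | 30 to 44 years old | 20 | 42 | 37 |
| | 45 to 59 years old | 20 | 44 | 36 |
| | Older than 59 | 28 | 39 | 33 |
| Race | Dark brown | 27 | 42 | 31 |
| | Light brown | 18 | 43 | 39 |
| | White | 19 | 33 | 48 |
| Gender | Male | 19 | 47 | 34 |
| | Female | 23 | 36 | 42 |
| Education | Has no schooling | 30 | 33 | 38 |
| | Incomplete elementary | 30 | 33 | 36 |
| | Complete elementary | 26 | 39 | 35 |
| | Incomplete middle/technical | 13 | 60 | 28 |
| | Complete middle/technical | 24 | 42 | 34 |
| | Incomplete high | 11 | 45 | 44 |
| | Complete high | 15 | 45 | 40 |
| | Incomplete college | 8 | 52 | 40 |
| | Complete college or more | 12 | 40 | 48 |
| Church attendance | Never | 19 | 54 | 26 |
| | Only on special occasions | 18 | 46 | 36 |
| | Once a month | 16 | 44 | 40 |
| | Once a week | 21 | 37 | 42 |
| | More than once a week | 29 | 34 | 37 |
| Abortion | No | 27 | 32 | 41 |
| | Yes | 18 | 46 | 36 |
| Protest | Not willing | 22 | 36 | 42 |
| | Willing | 20 | 48 | 31 |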
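Table 1: Summary Statistics (cont.)

| Variable | Category | Madrazo | López Obrador | Calderón |
|---|---|---|---|---|
| Overall | | 20 | 42 | 38 |
| Voter confidence | Not at all | 24 | 52 | 23 |
| | A little | 20 | 46 | 33 |
| | Somewhat | 21 | 39 | 40 |
| | Totally | 20 | 35 | 45 |
| Fox approval | Disapproves a lot | 25 | 68 | 8 |
| | Disapproves a little | 23 | 64 | 14 |
| | Neither | 36 | 32 | 32 |
| | Approves a little | 22 | 36 | 43 |
| | Approves a lot | 12 | 18 | 71 |
| Party ID (short) | None | 13 | 43 | 44 |
| | PAN | 5 | 17 | 78 |
| | PRD | 4 | 89 | 8 |
| | PRI | 55 | 19 | 27 |
| Party ID (long) | None | 13 | 43 | 44 |
| | Strong PAN | 5 | 10 | 85 |
| | Strong PRD | 2 | 93 | 5 |
| | Strong PRI | 63 | 17 | 20 |
| | Weak PAN | 4 | 20 | 75 |
| | Weak PRD | 5 | 87 | 9 |
| | Weak PRI | 47 | 20 | 33 |
| Length of partisanship | None | 13 | 43 | 44 |
| | Less than two years | 9 | 56 | 36 |
| | Between 2 and 5 | 14 | 46 | 40 |
| | Between 5 and 11 | 14 | 41 | 45 |
| | More than 11 | 39 | 30 | 31 |
| National economy | Worsened a lot | 24 | 67 | 9 |
| | Worsened a little | 22 | 55 | 23 |
| | Same | 19 | 48 | 32 |
| | Improved a little | 20 | 30 | 49 |
| | Improved a lot | 14 | 21 | 65 |
| Personal economy | Worsened a lot | 23 | 62 | 15 |
| | Worsened a little | 21 | 61 | 18 |
| | Same | 20 | 43 | 38 |
| | Improved a little | 19 | 29 | 52 |
| | Improved a lot | 19 | 26 | 55 |
| Ideology | Strong left | 6 | 73 | 21 |
| | Somewhat left | 16 | 62 | 22 |
| | Center‐left | 15 | 53 | 33 |
| | Center‐center | 19 | 41 | 40 |
| | Center‐right | 23 | 31 | 45 |
| | Somewhat right | 29 | 25 | 46 |
| | Strong right | 34 | 17 | 49 |
| Mexico‐US relations | Decrease | 20 | 50 | 30 |
| | Stay the same | 19 | 43 | 39 |
| | Increase | 21 | 38 | 41 |
| State government | PAN | 22 | 20 | 58 |
| | PRD | 16 | 59 | 25 |
| | PRI | 23 | 32 | 44 |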
Table 2: Multiple Imputation: Summary Statistics

| Variable | Missing values | Mean, data set 1 | Mean, data set 2 | Mean, data set 3 | Mean, data set 4 | Mean, data set 5 |
|---|---|---|---|---|---|---|
| Center | 0.0% (0) | ‐‐‐ | ‐‐‐ | ‐‐‐ | ‐‐‐ | ‐‐‐ |
| South | 0.0% (0) | ‐‐‐ | ‐‐‐ | ‐‐‐ | ‐‐‐ | ‐‐‐ |
| Mexico City | 0.0% (0) | ‐‐‐ | ‐‐‐ | ‐‐‐ | ‐‐‐ | ‐‐‐ |
| Urban | 0.0% (0) | ‐‐‐ | ‐‐‐ | ‐‐‐ | ‐‐‐ | ‐‐‐ |
| Age | 13.7% (219) | 41.2 | 41.2 | 41.1 | 41.2 | 41.1 |
| White | 13.8% (220) | 20.6% | 20.9% | 20.8% | 21.1% | 21.2% |
| Female | 13.6% (216) | 48.8% | 49.1% | 49.4% | 48.7% | 49.3% |
| Education | 13.9% (221) | 5.2 | 5.2 | 5.2 | 5.2 | 5.2 |
| Church | 0.9% (15) | 3.2 | 3.2 | 3.2 | 3.1 | 3.2 |
| Abortion | 20.6% (329) | 63.1% | 63.6% | 63.3% | 63.1% | 63.7% |
| Income | 22.3% (355) | 4.1 | 4.1 | 4.1 | 4.1 | 4.1 |
| Benefit antipoverty program | 8.2% (130) | 20.9% | 21.0% | 20.5% | 20.8% | 20.2% |
| Voter confidence | 17.6% (281) | 2.7 | 2.7 | 2.7 | 2.7 | 2.7 |
| Protest | 18.1% (289) | 35.6% | 34.8% | 34.8% | 34.7% | 34.9% |
| Fox approval | 18.3% (291) | 3.4 | 3.4 | 3.4 | 3.4 | 3.4 |
| PRI | 3.7% (59) | 22.2% | 21.3% | 28.8% | 15.6% | 25.7% |
| PAN | 3.7% (59) | 5.7% | 12.7% | 8.3% | 8.1% | 6.1% |
| PRD | 3.7% (59) | 16.2% | 9.9% | 8.1% | 15.8% | 5.0% |
| Length partisanship | 5.3% (85) | 1.5 | 1.5 | 1.6 | 1.4 | 1.4 |
| National economy | 3.5% (55) | 3.4 | 3.5 | 3.4 | 3.4 | 3.4 |
| Personal economy | 2.5% (40) | 3.3 | 3.3 | 3.3 | 3.3 | 3.3 |
| Ideology | 33.2% (530) | 4.0 | 4.0 | 4.0 | 4.0 | 4.0 |
| MX‐US | 21.1% (336) | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 |
| MX‐US distance AMLO | 19.0% (303) | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 |
| MX‐US distance Calderon | 23.2% (370) | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 |
| MX‐US distance Madrazo | 20.7% (330) | 0.2 | 0.3 | 0.2 | 0.3 | 0.3 |
| Evaluation economy AMLO | 11.9% (190) | 2.4 | 2.4 | 2.4 | 2.4 | 2.4 |
| Evaluation economy Calderon | 24.2% (386) | 2.3 | 2.3 | 2.3 | 2.3 | 2.3 |
| Evaluation economy Madrazo | 14.1% (224) | 2.1 | 2.1 | 2.1 | 2.1 | 2.1 |
| Evaluation poverty AMLO | 11.4% (182) | 2.3 | 2.3 | 2.3 | 2.3 | 2.3 |
| Evaluation poverty Calderon | 22.8% (363) | 2.4 | 2.4 | 2.4 | 2.4 | 2.4 |
| Evaluation poverty Madrazo | 13.7% (218) | 2.0 | 2.1 | 2.0 | 2.0 | 2.0 |
| Evaluation crime AMLO | 12.0% (191) | 2.3 | 2.3 | 2.3 | 2.3 | 2.3 |
| Evaluation crime Calderon | 22.8% (363) | 2.2 | 2.2 | 2.2 | 2.2 | 2.2 |
| Evaluation crime Madrazo | 13.7% (219) | 2.0 | 2.1 | 2.0 | 2.0 | 2.0 |
| Evaluation honesty AMLO | 13.4% (214) | 2.3 | 2.3 | 2.3 | 2.3 | 2.3 |
| Evaluation honesty Calderon | 24.5% (391) | 2.4 | 2.4 | 2.4 | 2.4 | 2.4 |
| Evaluation honesty Madrazo | 14.0% (223) | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 |
| PRI state government | 0.0% (0) | ‐‐‐ | ‐‐‐ | ‐‐‐ | ‐‐‐ | ‐‐‐ |
| PRD state government | 0.0% (0) | ‐‐‐ | ‐‐‐ | ‐‐‐ | ‐‐‐ | ‐‐‐ |
| PAN state government | 0.0% (0) | ‐‐‐ | ‐‐‐ | ‐‐‐ | ‐‐‐ | ‐‐‐ |

Note: Means correspond to the average of the last 250 iterations. N = 1594.
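Table 3: Bayesian Multinomial Probit Coefficients

| Variable | López Obrador/Madrazo β | 90% C.I. | Calderón/Madrazo β | 90% C.I. |
|---|---|---|---|---|
| Intercept | 0.14 | [-0.71, 1.00] | -0.90 | [-1.84, 0.00] |
| Center | 0.35 | [0.05, 0.68] | 0.00 | [-0.30, 0.30] |
| South | 0.10 | [-0.22, 0.44] | -0.53 | [-0.83, -0.24] |
| Mexico City | 0.64 | [0.31, 0.98] | 0.05 | [-0.27, 0.36] |
| Urban | 0.08 | [-0.17, 0.31] | 0.32 | [0.09, 0.57] |
| Age | 0.00 | [-0.01, 0.01] | 0.00 | [-0.01, 0.00] |
| White | -0.27 | [-0.51, -0.03] | 0.00 | [-0.26, 0.27] |
| Female | -0.15 | [-0.34, 0.04] | 0.18 | [-0.02, 0.39] |
| Education | 0.02 | [-0.07, 0.03] | 0.01 | [-0.04, 0.07] |
| Church | 0.03 | [-0.12, 0.05] | 0.02 | [-0.10, 0.07] |
| Abortion | 0.04 | [-0.17, 0.25] | 0.01 | [-0.22, 0.19] |
| Income | 0.03 | [-0.02, 0.07] | 0.04 | [0.00, 0.09] |
| Benefit antipoverty program | 0.04 | [-0.28, 0.20] | 0.17 | [-0.07, 0.40] |
| Voter confidence | 0.05 | [-0.12, 0.21] | -0.14 | [-0.29, 0.02] |
| Protest | 0.44 | [-0.28, 1.13] | -0.59 | [-1.49, 0.26] |
| Voter confidence * Protest | 0.09 | [-0.34, 0.16] | 0.20 | [-0.09, 0.51] |
| Fox approval | -0.05 | [-0.14, 0.04] | 0.19 | [0.09, 0.30] |
| PRI | -0.28 | [-0.68, 0.12] | -0.25 | [-0.67, 0.16] |
| PAN | 0.46 | [0.05, 0.87] | 0.99 | [0.58, 1.41] |
| PRD | 1.30 | [0.86, 1.77] | 0.16 | [-0.34, 0.65] |
| Length partisanship | -0.18 | [-0.29, -0.07] | -0.11 | [-0.22, 0.01] |
| National economy | 0.01 | [-0.11, 0.13] | 0.14 | [0.01, 0.27] |
| Personal economy | 0.01 | [-0.13, 0.12] | 0.06 | [-0.07, 0.18] |

| Alternative‐specific variable | β | 90% C.I. |
|---|---|---|
| Ideology (distance from voter) | -0.04 | [-0.11, 0.03] |
| Mexico‐US relations (distance from voter) | 0.02 | [-0.07, 0.11] |
| Ability manage economy | 0.09 | [0.01, 0.17] |
| Ability manage poverty | 0.07 | [0.00, 0.14] |
| Ability manage crime | 0.06 | [-0.02, 0.15] |
| Honesty | 0.18 | [0.11, 0.25] |
| Shares governor party ID | 0.05 | [-0.04, 0.13] |

| Variance parameter | Estimate | 90% C.I. |
|---|---|---|
| $\tilde\sigma^2_{AMLO}$ | 0.94 | [0.67, 1.21] |
| $\tilde\sigma_{AMLO,CALD}$ | 0.40 | [0.15, 0.61] |
| $\tilde\sigma^2_{CALD}$ | 1.064 | [0.79, 1.34] |

N=1,218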
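Table 4: First Differences

Change with respect to hypothetical voter (5%, 50% and 95% quantiles for each candidate)

| Variable | Madrazo 5% | Madrazo 50% | Madrazo 95% | López Obrador 5% | López Obrador 50% | López Obrador 95% | Calderón 5% | Calderón 50% | Calderón 95% |
|---|---|---|---|---|---|---|---|---|---|
| Baseline probability of support | 5.6 | 12.6 | 23.2 | 28.4 | 43.0 | 58.3 | 29.4 | 43.6 | 58.6 |
| Region: Mexico City to south | 9 | 17 | 27 | -17 | -8 | 1 | -19 | -10 | 0 |
| Region: Mexico City to center | -2 | 4 | 12 | -19 | -10 | -1 | -4 | 5 | 15 |
| Region: Mexico City to north | 1 | 8 | 18 | -32 | -21 | -12 | 2 | 13 | 23 |
| Urban: urban to rural or mixed | 0 | 5 | 11 | -3 | 5 | 14 | -18 | -10 | -2 |
| Age: 41 to 54 | -1 | 0 | 3 | -3 | 1 | 4 | -5 | 2 | 2 |
| Race: brown to white | -2 | 3 | 11 | -18 | -10 | -3 | -2 | 7 | 15 |
| Gender: female to male | -4 | 0 | 4 | 4 | 11 | 17 | -17 | -10 | -4 |
| Education: complete middle to complete high | -2 | 0 | 3 | -6 | -2 | 1 | -1 | 2 | 6 |
| Church attendance: once a month to once a week | -1 | 1 | 3 | -4 | 1 | 2 | -3 | 0 | 3 |
| Abortion: yes to no | -4 | 0 | 5 | -9 | 2 | 5 | -6 | 2 | 9 |
| Income: 2,600‐3,999 to 5,200‐6,499 | -3 | -1 | 0 | -3 | 0 | 3 | -1 | 2 | 5 |
| Benefit antipoverty: yes to no | -4 | 1 | 6 | -3 | 6 | 15 | -16 | -7 | 1 |
| Voter confidence: somewhat to totally fair | -2 | 1 | 5 | 0 | 6 | 11 | -11 | -6 | -1 |
| Protest: unwilling to willing | -6 | 2 | 2 | -2 | 6 | 13 | -10 | -4 | 3 |
| Fox: approves a little to approves a lot | -4 | -2 | 0 | -10 | -7 | -4 | 6 | 9 | 12 |
| Party ID: independent to PRI | -2 | 7 | 17 | -19 | 5 | 9 | -16 | 2 | 11 |
| Party ID: independent to PAN | -19 | -10 | -4 | -25 | -13 | -1 | 10 | 23 | 35 |
| Party ID: independent to PRD | -21 | -11 | -4 | 28 | 41 | 53 | -42 | -29 | -17 |
| Length of party ID: 2‐5 years to 5‐11 years | 1 | 4 | 8 | -8 | -4 | -1 | -3 | 0 | 4 |
| State of national economy: same to improved a little | -4 | -2 | 1 | -8 | -4 | 1 | 1 | 5 | 9 |
| State of personal economy: same to improved a little | -3 | 0 | 2 | -6 | 2 | 2 | -2 | 2 | 7 |

Note: Each row shows changes in choice probabilities for a hypothetical voter, resulting from a marginal change in one variable, leaving everything else constant.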
Table 5: Candidate evaluations and voter choice

| Evaluation | Ability to manage economy | Ability to manage poverty | Ability to manage crime | Honesty |
|---|---|---|---|---|
| Not at all | 36 | 37 | 38 | 29 |
| Little | 40 | 40 | 41 | 36 |
| Somewhat | 43 | 43 | 43 | 43 |
| Very | 46 | 46 | 45 | 50 |

Note: The table shows choice probabilities for a hypothetical voter, for different levels of candidate evaluations, leaving everything else constant.
Figure 1: Regional cleavages in the 2006 Presidential Election
First Place Candidate
Second Place Candidate
Note: blue indicates Felipe Calderón (PAN), yellow indicates Andrés M. López Obrador (PRD) and green indicates Roberto Madrazo (PRI). Source: Instituto Federal Electoral (IFE)
Figure 2: Ideology Distance and Voter Choice
López Obrador
Calderón
Note: Plots show probabilities of choosing López Obrador (upper) and Calderón (lower), respectively, for different levels of candidate ideology positions. Simulations were done based on the characteristics of a hypothetical voter.
Figure 3: Two‐Candidate Race (López Obrador vs. Madrazo)
A2. Multiple Imputation Procedure
One common problem when using complete data methods is the high incidence of missing
values and non‐responses in survey data. For instance, we find that most explanatory
variables included in our voter choice model contain between 1% and 37% missing values
(see Table 2). One possible “quick” solution for addressing this issue is to delete
observations with missing values in one or more of the variables included in the complete
data model (in this case, in the voter choice model). A drawback of this solution is that it is
very inefficient: it makes poor use of the information available. For example, if we
dropped all observations with missing values in the regressors, we would be left with only
432 observations (27% of the 3rd wave sample). Moreover, since the remaining
observations are not representative of the original sample, this procedure might lead to
biased estimates. Another common solution is to fill or impute non‐responses with single
values such as column means or predicted values computed using observed data. The
problem with the single‐imputation approach is that estimates computed using a single
imputed data set will not reflect uncertainty about non‐responses, and they may reduce or
inflate relationships between variables (citations). No single‐imputation method will
exactly predict missing values; as stated by Rubin (1987, page X), “if one value were really
adequate, then that value was never missing.”
Instead, we follow a multiple‐imputation procedure that, on top of having similar
advantages as single‐imputation methods, allows us to account for the uncertainty about
the origin of the non‐response (Rubin 1987). In short, this method consists of
replacing each missing value with several imputations, and then using each set of imputations
to construct multiple imputed data sets. We carry out this step using the MICE software
package in R, which estimates the imputed values using a Bayesian approach, and we
monitor iterations to make sure imputed values converged across data sets. After that, we
use each complete data set to estimate similar voter choice models, and follow an
innovative procedure for obtaining overall estimates which reflect the uncertainty about
the missing values. Specifically, instead of using Rubin’s (1987) rules for computing overall
estimates and total variance based on within‐ and between‐ imputation variability, we take
advantage of the fact that our voter choice model is estimated using Bayesian methods to
approximate the posterior conditional distribution of the coefficients taking into account
both within‐ and between‐ imputation variability. We do so by first estimating three MCMC
chains of model coefficients for each of the five imputed data sets, each with different
starting values; we then pool the fifteen chains and use them to summarize the
posterior conditional distribution of the coefficients.
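The pooling step can be illustrated with a minimal Python sketch. It is not the code used for the estimates reported here: the function names `pool_chains` and `summarize` are hypothetical, and the fifteen chains are simulated stand-ins for real sampler output (three chains for each of five imputed data sets, with a small shift between data sets to mimic between‐imputation variability).

```python
import random
import statistics

def pool_chains(chains):
    """Concatenate the MCMC draws of all chains into one pooled sample."""
    return [draw for chain in chains for draw in chain]

def summarize(draws, level=0.90):
    """Posterior mean and equal-tailed credible interval from pooled draws."""
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[round(tail * (n - 1))]
    upper = ordered[round((1.0 - tail) * (n - 1))]
    return statistics.fmean(ordered), lower, upper

# Simulated stand-in for sampler output (assumption, not real estimates):
# 5 imputed data sets x 3 chains, 2000 retained draws of one coefficient each.
rng = random.Random(0)
chains = []
for data_set in range(5):
    center = 0.30 + rng.gauss(0.0, 0.05)   # between-imputation variability
    for chain in range(3):
        chains.append([rng.gauss(center, 0.10) for _ in range(2000)])

pooled = pool_chains(chains)
post_mean, ci_low, ci_high = summarize(pooled)
print(len(chains), len(pooled), round(post_mean, 2))
```

Because the pooled sample mixes draws from all imputed data sets, the spread of `pooled` reflects both the within‐chain posterior variance and the differences between imputations.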
More formally, we specify a probability model on the complete data set $X$,
where $X = (X^{obs}, X^{mis})$ is an $N \times K$ matrix of explanatory variables. In doing so, we assume that
the non‐response mechanism is “ignorable” (Rubin 1987), which is the same as assuming
that the missingness is generated at random conditional on observables (MAR condition).
The ignorability condition implies that we can specify a probability distribution over
$x_1, \ldots, x_K$ and estimate parameters of the imputation‐model using observed values. After
that, we can use observed values in $x_1, \ldots, x_{K-1}$ and estimated parameters of the
imputation‐model to generate $M$ imputations for missing values in $x_K$.
If non‐response were explained by unobserved variables correlated with the outcome of
interest or other explanatory variables, then the multiple imputation procedure would not
preserve the relationship between the variables and would lead to biased estimates. Thus,
a recommendation usually made is that the imputation‐model includes all variables
included in the complete‐data model (in our case, all those variables included in the voter
choice model), all other variables useful for explaining non‐response, as well as any other
variable explaining a substantive proportion of the variance which may help reduce our
uncertainty about the missing values (citations). In this article the imputation‐model
includes all explanatory variables included in the complete‐data model as well as 22 other
variables which may help explain variability and non‐response (see Table 2 and Table
A.X).15
15 The variables which we include in the imputation‐model but not in the voter‐choice model are: participation in the 2000 presidential election; trust in the Federal Electoral Institute (IFE); evaluation of Mexico’s democracy; trust in family and friends; marital status; receiving remittances from the US; frequency with which the person speaks about politics; interest in politics; awareness of the campaign; watching the news; whether the person saw the presidential debate; time cost of voting; visit from a party representative; strength of partisanship; support for the death penalty; women’s role in society; union membership status; and occupation dummies (blue collar worker, white collar worker, peasant, student or housewife).
An important factor to take into account is that the missing data model should
account for the different types of variables in X (binary, ordered, continuous, and so on).
Therefore, in choosing a probability model, assuming all variables are normally distributed
may be unrealistic and lead to nonsensical imputations. In this article we compute the
multiple imputations using “predictive mean matching” (PMM, Little 1988). As explained
by Horton and Lipsitz (2001), PMM first specifies a linear model for each variable $x_k$ using
the remaining variables in $X$ as regressors:
(A.1) $E(x_k) = \beta_0 + \beta_1 x_1 + \cdots + \beta_{k-1} x_{k-1} + \beta_{k+1} x_{k+1} + \cdots + \beta_K x_K$
After that, observed values of $x_1, \ldots, x_{k-1}, x_{k+1}, \ldots, x_K$ and draws from the distribution of the
parameters are used to generate $M$ predicted values $\hat{x}_k^1, \ldots, \hat{x}_k^M$ for each observation with
missing values in $x_k$. Finally, PMM imputes the observed value in $x_k$ which is closest to $\hat{x}_k^m$,
for $m = 1, \ldots, M$. As a result, when using PMM, “imputed values have the same gaps as in the
observed data and are always within the range of the observed data” (van Buuren and
Groothuis‐Oudshoorn 2009, page X).16
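A minimal Python sketch of predictive mean matching with a single predictor may help fix ideas. It is a simplification under stated assumptions, not the MICE implementation: it uses ordinary least squares point estimates rather than draws from the distribution of the parameters, it matches each missing case to the single donor whose predicted value is closest (the usual way the matching step is carried out), and the function names and toy data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x with one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

def pmm_impute(x, y):
    """Fill None entries of y with the observed y of the closest donor."""
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    b0, b1 = fit_line([xi for xi, _ in observed], [yi for _, yi in observed])
    # Each donor is stored as (predicted value, observed value).
    donors = [(b0 + b1 * xi, yi) for xi, yi in observed]
    completed = []
    for xi, yi in zip(x, y):
        if yi is None:
            predicted = b0 + b1 * xi
            yi = min(donors, key=lambda d: abs(d[0] - predicted))[1]
        completed.append(yi)
    return completed

# Toy data (assumption): y is an ordinal survey item with two non-responses.
x = [1.0, 2.0, 2.5, 4.0, 5.0, 6.0]
y = [1, 2, None, 4, 4, None]
filled = pmm_impute(x, y)
print(filled)
```

Every imputed value is copied from an observed respondent, which is why imputations always stay within the range, and on the grid, of the observed data.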
Another important issue when conducting multiple imputations is the simultaneous
occurrence of non‐response in several variables in $X$. In that case, estimating equation
(A.1) is not straightforward. One possible solution is to specify a joint distribution over
$x_1, \ldots, x_K$, $P(X \mid \theta)$, where $\theta$ are the imputation‐model parameters. A drawback of this
approach is that when there are multiple types of variables in $X$, $P(X \mid \theta)$ may have no
known form. In this article we use an alternative solution, which consists of using a “fully
conditional specification” (FCS). In particular, we conduct multiple imputation by chained
equations using R’s package MICE (van Buuren and Groothuis‐Oudshoorn 2009). The idea
is to specify separate conditional distributions for each incomplete variable in $X$,
16 Alternative methods for computing imputed values are: simple linear regression,
propensity score methods, logistic regression, polynomial regression, and discriminant
analysis.
$P(x_k \mid X_{-k}, \theta_k)$, where $X_{-k}$ are the other variables in $X$ and $\theta_k$ are imputation‐model
parameters specific to variable $x_k$. Then, Gibbs sampling can be used to repeatedly draw
from the “chain” of univariate conditional distributions and generate $M$ imputations for
each missing value in $X$. Let $t$ be the iteration number; then the Gibbs sampler draws
successively from the following conditional distributions (citations):
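$$
\begin{aligned}
\hat{\theta}_1^{t} &\sim P(\theta_1 \mid x_1^{obs}, x_2^{t-1}, \ldots, x_K^{t-1}) \\
\hat{x}_1^{t} &\sim P(x_1 \mid x_1^{obs}, x_2^{t-1}, \ldots, x_K^{t-1}, \hat{\theta}_1^{t}) \\
&\;\;\vdots \\
\hat{\theta}_K^{t} &\sim P(\theta_K \mid x_K^{obs}, x_1^{t}, \ldots, x_{K-1}^{t}) \\
\hat{x}_K^{t} &\sim P(x_K \mid x_K^{obs}, x_1^{t}, \ldots, x_{K-1}^{t}, \hat{\theta}_K^{t})
\end{aligned}
$$
Since we use PMM, $x_k^{t} = (x_k^{obs}, \tilde{x}_k^{t})$, and $\tilde{x}_k^{t}$ are the observed values in $x_k^{obs}$ closest to $\hat{x}_k^{t}$.
As explained by van Buuren and Groothuis‐Oudshoorn (2009), previous iterations of $\hat{x}_k^{t}$ only affect $\hat{x}_k^{t}$
indirectly, “through its relationship with other variables” (page 7). This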
reduces the autocorrelation between imputations and allows quick convergence:
according to van Buuren and Groothuis‐Oudshoorn (2009), 10 to 20 iterations may be enough.
In our case most imputed values converged quickly (such as missing values in voter
confidence, see figure A.2) although others required more than 50 iterations to converge
(such as missing values in ideology distances for Calderón and Madrazo, see Table A.2).
Conservatively, we decided to run the imputation‐model with 250 iterations. Table 2 shows
that the mean of the imputed values was almost the same for most variables across the five
complete data sets, the party identification indicators being the main exception.
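The chained‐equations idea can be sketched in a few lines of Python for the two‐variable case. This is a deliberately simplified illustration, not the MICE algorithm itself: it cycles through the variables and re‐imputes each one by predictive mean matching on the other, but it uses least squares point estimates instead of drawing the imputation‐model parameters, so it omits the Gibbs sampling step described above. The function names and the toy data are hypothetical.

```python
import random

def ols(xs, ys):
    """Least squares intercept and slope for one predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return mean_y, 0.0
    b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - b1 * mean_x, b1

def chained_pmm(cols, iterations=10, seed=0):
    """Cycle through two incomplete columns, re-imputing each by PMM."""
    rng = random.Random(seed)
    names = list(cols)
    missing = {k: [v is None for v in cols[k]] for k in names}
    # Starting values: fill each gap with a random observed value.
    data = {}
    for k in names:
        observed = [v for v in cols[k] if v is not None]
        data[k] = [rng.choice(observed) if v is None else v for v in cols[k]]
    for _ in range(iterations):
        for k in names:
            other = [name for name in names if name != k][0]
            rows = [i for i, m in enumerate(missing[k]) if not m]
            b0, b1 = ols([data[other][i] for i in rows],
                         [data[k][i] for i in rows])
            donors = [(b0 + b1 * data[other][i], data[k][i]) for i in rows]
            for i, is_missing in enumerate(missing[k]):
                if is_missing:
                    predicted = b0 + b1 * data[other][i]
                    data[k][i] = min(donors,
                                     key=lambda d: abs(d[0] - predicted))[1]
    return data

# Toy data (assumption): two ordinal items with non-responses in both.
cols = {
    "ideology": [1, 2, None, 4, 5, None, 3],
    "fox_approval": [1, None, 3, 4, None, 5, 2],
}
completed = chained_pmm(cols)
print(completed)
```

In each pass the current imputations of one variable serve as regressors for the other, which is the sense in which the conditional models are “chained”.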
How many imputations are needed? According to Rubin (1987), efficient estimates
can be achieved with only 3 to 5 imputations per missing value, where “efficiency” is a
function of “how much more precise the estimate might have been if no data had been
missing” (Schafer and Olsen, page 548). In our model, the fraction of missing data in the
explanatory variables varies between 1% and 37% (see Table 2) and we set M=5.
According to Rubin’s rule about the efficiency of the estimates, our results should be more
than 90% efficient.
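The efficiency claim can be checked directly from Rubin’s (1987) formula, $(1 + \gamma/M)^{-1}$, where $\gamma$ is the fraction of missing information. The short Python sketch below treats the fraction of missing data as a stand‐in for $\gamma$, which is an approximation; the function name is hypothetical.

```python
def relative_efficiency(gamma, m):
    """Rubin's relative efficiency of m imputations: (1 + gamma / m) ** -1."""
    return 1.0 / (1.0 + gamma / m)

# Lowest and highest fractions of missing data reported in the text, with M = 5.
for gamma in (0.01, 0.37):
    print(gamma, round(relative_efficiency(gamma, 5), 3))
```

With $M=5$ the efficiency is about 0.998 when 1% of the values are missing and about 0.931 when 37% are missing, consistent with the statement that the results should be more than 90% efficient.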
A3. Model Specification and Estimation
Random utility models assume voter choice is based on the relative utility voters perceive
from each alternative. Specifically, when voters choose among three candidates, we assume
observed choice $Y_i$ depends on utility vector $U_i = (U_{i1}, U_{i2}, U_{i3})$, such that $Y_{ij}$ equals one if
$U_{ij} = \max(U_i)$ and zero otherwise, for $j=1,2,3$. Moreover, even though utilities are
unobserved or latent, the multinomial probit specification assumes we can model them
according to the following random utility system (Alvarez and Nagler 1995, 1998):
$$U_{ij} = X_{ij}\beta + a_i\psi_j + v_{ij}, \quad \text{for } j=1,2,3$$
where $X_{ij}$ is a vector of alternative‐specific covariates, $a_i$ is a vector of individual‐specific
covariates, and $v_{ij}$ is a normally distributed error term. Also, while $X_{ij}$ is associated with a
vector of coefficients $\beta$ which does not vary by candidate or individual decision maker, $a_i$
is associated with a choice‐specific vector of coefficients $\psi_j$ which takes different values
for each of the three candidates.
Note that voter choice is unchanged when we add the same constant to every utility.
Thus, the above model is unidentified. Following the convention in the
literature, we solve this identification problem by rewriting the model in terms of the
utility perceived from the third candidate, such that $\tilde{U}_i = (\tilde{U}_{i1}, \tilde{U}_{i2})$ with $\tilde{U}_{ij} = U_{ij} - U_{i3}$ for
$j=1,2$ and
$$\tilde{U}_{ij} = \tilde{X}_{ij}\beta + a_i\tilde{\psi}_j + \tilde{v}_{ij}$$
where $\tilde{X}_{ij} = X_{ij} - X_{i3}$, $\tilde{\psi}_j = \psi_j - \psi_3$, $\tilde{v}_{ij} = v_{ij} - v_{i3}$, and $\tilde{v}_i \sim N(0, \tilde{\Sigma})$.
According to this normalization, the voter chooses the third alternative when both
$\tilde{U}_{i1}$ and $\tilde{U}_{i2}$ are negative, or otherwise chooses $\arg\max_j \tilde{U}_{ij}$. Thus, the probability of
choosing the first alternative equals
$$P(\tilde{U}_{i1} > 0 \;\&\; \tilde{U}_{i1} > \tilde{U}_{i2})$$
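The choice rule under this normalization can be illustrated with a small Monte Carlo sketch in Python. It is only an illustration under stated assumptions: the function name is hypothetical, the systematic parts of the two differenced utilities are passed in as plain numbers, and the probabilities are approximated by simulation rather than by the estimation method described below.

```python
import math
import random

def choice_probabilities(mean1, mean2, var1, var2, cov12, draws=50_000, seed=1):
    """Simulated shares for alternatives 1 and 2 and the base alternative 3."""
    rng = random.Random(seed)
    # Cholesky factor of the 2x2 covariance matrix of the differenced errors.
    l11 = math.sqrt(var1)
    l21 = cov12 / l11
    l22 = math.sqrt(var2 - l21 ** 2)
    counts = [0, 0, 0]
    for _ in range(draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        u1 = mean1 + l11 * z1
        u2 = mean2 + l21 * z1 + l22 * z2
        if u1 < 0 and u2 < 0:
            counts[2] += 1      # both differenced utilities negative: base wins
        elif u1 > u2:
            counts[0] += 1
        else:
            counts[1] += 1
    return [c / draws for c in counts]

# Symmetric check: with zero means and this covariance (independent errors of
# variance 0.5 before differencing) each alternative should get about 1/3.
symmetric = choice_probabilities(0.0, 0.0, 1.0, 1.0, 0.5)

# Hypothetical differenced utilities combined with illustrative variance terms.
example = choice_probabilities(0.3, 0.5, 0.94, 1.064, 0.40)
print(symmetric, example)
```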
Estimating this probability requires high‐dimensional integration, and is
computationally demanding. However, the recent development of Markov Chain Monte
Carlo (MCMC) methods enables efficient and accurate estimation of
multinomial probit models, using Bayesian simulation (Geweke et al. 1994; Imai and van Dyk
2005; McCulloch and Rossi 1994; McCulloch et al. 2000; Nobile 1998; Rossi et al. 2005). As
explained by Imai and van Dyk (2005), doing so requires use of a data augmentation algorithm
to iteratively sample from the conditional distributions of $U$ and $\theta = (\beta, \tilde{\psi}_1, \tilde{\psi}_2, \tilde{\Sigma})$. Under
certain regularity conditions and an adequate burn‐in period (Gelman and Rubin 1992;
Roberts 1996; Tierney 1994, 1996), the resulting Markov chain converges to the
conditional distribution of $U$ and $\theta$, $P(U, \theta \mid Y)$.
However, the above model is still unidentified: rescaling each $\tilde{U}_{ij}$ by a positive constant
leaves voter choice unchanged. For instance, we can write each random utility equation as
$$\alpha\tilde{U}_{ij} = \tilde{X}_{ij}(\alpha\beta) + a_i(\alpha\tilde{\psi}_j) + \alpha\tilde{v}_{ij}$$
where $\alpha$ is a positive constant, and $\alpha\tilde{v}_i \sim N(0, \alpha^2\tilde{\Sigma})$.
We say that $(\beta, \tilde{\psi}_1, \tilde{\psi}_2)$ and $\tilde{\Sigma}$ are the identified parameters, while
$(\alpha\beta, \alpha\tilde{\psi}_1, \alpha\tilde{\psi}_2)$ and $\alpha^2\tilde{\Sigma}$ are the unidentified parameters. Solving the identification
problem requires the specification of proper prior distributions (McCulloch and Rossi
1994; Imai and van Dyk 2005a). In this paper, we follow the approach of Imai and van Dyk (2005)
who place a prior on the identified parameters, $\beta \sim N(\beta_0, A^{-1})$, setting $\alpha^2 = \sigma^*_{11}$, and obtain
the prior of the unidentified parameters (marked with an asterisk) as
$\beta^* \mid \Sigma^* \sim N(\sqrt{\sigma^*_{11}}\,\beta_0,\; \sigma^*_{11}A^{-1})$, $\Sigma^* \sim \text{inverse‐Wishart}(\nu, S)$, where
$\beta_0$ and $A^{-1}$ are the prior mean and variance‐covariance matrix of $\beta$, $\nu$ are the degrees of
freedom of the inverse Wishart distribution, and $S$ is the prior scale of $\Sigma$.17 This approach
allows us to easily interpret prior distributions and their impact on posterior results,
and to specify flat priors. Thus, it improves upon the approach followed by
McCulloch and Rossi (1994), who specify the prior of the unidentified parameters as
$\beta^* \sim N(\beta_0, A^{-1})$, and obtain the prior of the identified parameters as a byproduct (see Imai
and van Dyk 2005a).
17 For an application of the algorithm developed by McCulloch and Rossi (1994), see Alvarez and Katz (2008).
According to Imai and van Dyk (2005a), their procedure improves upon the one
proposed by McCulloch and Rossi (1994), Nobile (1998) and McCulloch et al. (2000) in
terms of interpretability of the prior distributions, simplicity of the computation and speed
of convergence. The last achievement is a result of the use of a fully marginalized data
augmentation algorithm, whereby the distribution of the working parameter α is
independent of that of the identified parameters, and repeated
sampling from conditional distributions is done averaging over α, thus reducing the
autocorrelation between sampler draws. In this paper, we fit the model using the MNP
package (Imai and van Dyk 2005b), implemented in R, which executes scheme 1 of
algorithm 1 introduced by Imai and van Dyk (2005a).
A4. Simulation of Choice Probabilities
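With Madrazo set as the baseline category, the variance‐covariance matrix of the MNP
model is:
$$
\tilde{\Sigma} =
\begin{pmatrix}
\sigma^2_{AMLO} + \sigma^2_{MAD} - 2\sigma_{MAD,AMLO} & \sigma^2_{MAD} - \sigma_{MAD,AMLO} - \sigma_{MAD,CALD} + \sigma_{AMLO,CALD} \\
\sigma^2_{MAD} - \sigma_{MAD,AMLO} - \sigma_{MAD,CALD} + \sigma_{AMLO,CALD} & \sigma^2_{CALD} + \sigma^2_{MAD} - 2\sigma_{MAD,CALD}
\end{pmatrix}
$$
Further, to simplify notation we write: $\tilde{\sigma}^2_{AMLO} = \sigma^2_{AMLO} + \sigma^2_{MAD} - 2\sigma_{MAD,AMLO}$,
$\tilde{\sigma}^2_{CALD} = \sigma^2_{CALD} + \sigma^2_{MAD} - 2\sigma_{MAD,CALD}$, and
$\tilde{\sigma}_{AMLO,CALD} = \sigma^2_{MAD} - \sigma_{MAD,AMLO} - \sigma_{MAD,CALD} + \sigma_{AMLO,CALD}$. Identification requires fixing one of
the parameters of this matrix, or a linear combination of them. In this paper, we fix the
trace equal to 1.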
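3 Candidate Election
To compute the probability that López Obrador wins, we need to evaluate the cdf
of a bivariate normal with vector of means $(\tilde{U}_{i,AMLO} - \tilde{U}_{i,MAD},\; \tilde{U}_{i,AMLO} - \tilde{U}_{i,CALD})^T$ and
variance‐covariance matrix
$$
\begin{pmatrix}
\sigma^2_{AMLO} + \sigma^2_{MAD} - 2\sigma_{MAD,AMLO} & \sigma^2_{AMLO} - \sigma_{MAD,AMLO} - \sigma_{AMLO,CALD} + \sigma_{MAD,CALD} \\
\sigma^2_{AMLO} - \sigma_{MAD,AMLO} - \sigma_{AMLO,CALD} + \sigma_{MAD,CALD} & \sigma^2_{AMLO} + \sigma^2_{CALD} - 2\sigma_{AMLO,CALD}
\end{pmatrix}
$$
Since our model does not yield estimates of $\sigma^2_{AMLO}$, $\sigma^2_{CALD}$ and $\sigma_{AMLO,CALD}$, but of $\tilde{\sigma}^2_{AMLO}$, $\tilde{\sigma}^2_{CALD}$ and
$\tilde{\sigma}_{AMLO,CALD}$, we need to express the elements of this matrix as linear combinations of the
estimated parameters. Note that: $\sigma^2_{AMLO} + \sigma^2_{MAD} - 2\sigma_{MAD,AMLO} = \tilde{\sigma}^2_{AMLO}$,
$\sigma^2_{AMLO} + \sigma^2_{CALD} - 2\sigma_{AMLO,CALD} = \tilde{\sigma}^2_{AMLO} + \tilde{\sigma}^2_{CALD} - 2\tilde{\sigma}_{AMLO,CALD}$, and
$\sigma^2_{AMLO} - \sigma_{MAD,AMLO} - \sigma_{AMLO,CALD} + \sigma_{MAD,CALD} = \tilde{\sigma}^2_{AMLO} - \tilde{\sigma}_{AMLO,CALD}$. So in terms of estimated
parameters, the variance‐covariance matrix is
$$
\begin{pmatrix}
\tilde{\sigma}^2_{AMLO} & \tilde{\sigma}^2_{AMLO} - \tilde{\sigma}_{AMLO,CALD} \\
\tilde{\sigma}^2_{AMLO} - \tilde{\sigma}_{AMLO,CALD} & \tilde{\sigma}^2_{AMLO} + \tilde{\sigma}^2_{CALD} - 2\tilde{\sigma}_{AMLO,CALD}
\end{pmatrix}
$$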
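Similarly, to compute the probability that Calderón wins, we need to evaluate the
cdf of a bivariate normal with vector of means $(\tilde{U}_{i,CALD} - \tilde{U}_{i,MAD},\; \tilde{U}_{i,CALD} - \tilde{U}_{i,AMLO})^T$ and
variance‐covariance matrix
$$
\begin{pmatrix}
\sigma^2_{CALD} + \sigma^2_{MAD} - 2\sigma_{MAD,CALD} & \sigma^2_{CALD} - \sigma_{MAD,CALD} - \sigma_{AMLO,CALD} + \sigma_{MAD,AMLO} \\
\sigma^2_{CALD} - \sigma_{MAD,CALD} - \sigma_{AMLO,CALD} + \sigma_{MAD,AMLO} & \sigma^2_{AMLO} + \sigma^2_{CALD} - 2\sigma_{AMLO,CALD}
\end{pmatrix}
$$
Again, note that: $\sigma^2_{CALD} + \sigma^2_{MAD} - 2\sigma_{MAD,CALD} = \tilde{\sigma}^2_{CALD}$,
$\sigma^2_{AMLO} + \sigma^2_{CALD} - 2\sigma_{AMLO,CALD} = \tilde{\sigma}^2_{AMLO} + \tilde{\sigma}^2_{CALD} - 2\tilde{\sigma}_{AMLO,CALD}$, and
$\sigma^2_{CALD} - \sigma_{MAD,CALD} - \sigma_{AMLO,CALD} + \sigma_{MAD,AMLO} = \tilde{\sigma}^2_{CALD} - \tilde{\sigma}_{AMLO,CALD}$. So in terms of estimated
parameters, the variance‐covariance matrix is
$$
\begin{pmatrix}
\tilde{\sigma}^2_{CALD} & \tilde{\sigma}^2_{CALD} - \tilde{\sigma}_{AMLO,CALD} \\
\tilde{\sigma}^2_{CALD} - \tilde{\sigma}_{AMLO,CALD} & \tilde{\sigma}^2_{AMLO} + \tilde{\sigma}^2_{CALD} - 2\tilde{\sigma}_{AMLO,CALD}
\end{pmatrix}
$$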
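The change of baseline in the three‐candidate case can be verified numerically with a short Python sketch. The full 3×3 covariance matrix used below is hypothetical (it is not estimated by the model and serves only as a check), as are the function names; the sketch confirms that the matrix built from the Madrazo‐based parameters equals the one computed directly with another candidate as baseline.

```python
def diff_cov(sigma, base, a, b):
    """Covariance of (v_a - v_base, v_b - v_base) from a full covariance."""
    def c(i, j, k, l):
        # Cov(v_i - v_j, v_k - v_l)
        return sigma[i][k] - sigma[i][l] - sigma[j][k] + sigma[j][l]
    return [[c(a, base, a, base), c(a, base, b, base)],
            [c(b, base, a, base), c(b, base, b, base)]]

def rebase(var_own, var_other, cov):
    """Covariance with the 'own' candidate as baseline, from Madrazo-based terms."""
    off = var_own - cov
    return [[var_own, off],
            [off, var_own + var_other - 2.0 * cov]]

MAD, AMLO, CALD = 0, 1, 2
# Hypothetical full covariance of (v_MAD, v_AMLO, v_CALD), for checking only.
sigma = [[1.0, 0.2, 0.1],
         [0.2, 0.8, 0.3],
         [0.1, 0.3, 1.2]]

tilde = diff_cov(sigma, MAD, AMLO, CALD)        # what the model estimates
var_amlo, var_cald, cov = tilde[0][0], tilde[1][1], tilde[0][1]

from_estimates = rebase(var_amlo, var_cald, cov)  # López Obrador as baseline
direct = diff_cov(sigma, AMLO, MAD, CALD)
print(from_estimates, direct)
```

Calling `rebase(var_cald, var_amlo, cov)` gives the corresponding matrix for Calderón, since the two derivations are symmetric.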
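2 Candidate Election
To compute the probability that an individual votes for López Obrador if Madrazo drops
out of the race, we need to evaluate the following standard normal cdf:
$$P_{i,AMLO} = \Phi\left(\frac{\tilde{U}_{i,AMLO} - \tilde{U}_{i,CALD}}{\sqrt{\sigma^2_{AMLO} + \sigma^2_{CALD} - 2\sigma_{AMLO,CALD}}}\right)$$
Note that: $\sigma^2_{AMLO} + \sigma^2_{CALD} - 2\sigma_{AMLO,CALD} = \tilde{\sigma}^2_{AMLO} + \tilde{\sigma}^2_{CALD} - 2\tilde{\sigma}_{AMLO,CALD}$, so we can
estimate $P_{i,AMLO}$ by evaluating the following normal cdf:
$$P_{i,AMLO} = \Phi\left(\frac{\tilde{U}_{i,AMLO} - \tilde{U}_{i,CALD}}{\sqrt{\tilde{\sigma}^2_{AMLO} + \tilde{\sigma}^2_{CALD} - 2\tilde{\sigma}_{AMLO,CALD}}}\right)$$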
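The two‐candidate probability is simple enough to compute with the Python standard library. The sketch below is illustrative only: the function names are hypothetical, the utilities passed in are made‐up values, and the variance terms are the point summaries reported in Table 3 (0.94, 1.064 and 0.40) rather than full posterior draws.

```python
import math

def normal_cdf(z):
    """Standard normal cdf, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def two_candidate_prob(u_amlo, u_cald, var_amlo, var_cald, cov):
    """P(vote for López Obrador) when only he and Calderón remain."""
    scale = math.sqrt(var_amlo + var_cald - 2.0 * cov)
    return normal_cdf((u_amlo - u_cald) / scale)

# Hypothetical differenced utilities; variance terms taken from Table 3.
p_amlo = two_candidate_prob(0.6, 0.4, 0.94, 1.064, 0.40)
p_cald = two_candidate_prob(0.4, 0.6, 0.94, 1.064, 0.40)
print(round(p_amlo, 3), round(p_cald, 3))
```

In practice the same calculation would be repeated for every retained MCMC draw, so that the resulting probabilities carry the posterior uncertainty of the coefficients and variance terms.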
A5. Wording and Coding of Variables
Voter choice (3rd wave). For the purposes of this survey, I will give you a sheet where you can mark how you voted in the last presidential election, without my seeing it, and then deposit it in this bag. For whom did you vote for president?
Region indicators. North if state equals Baja California, Baja California Sur, Sonora, Sinaloa, Nayarit, Chihuahua, Coahuila, Durango, Zacatecas, San Luis Potosí, Nuevo León or Tamaulipas; Center if state equals Jalisco, Aguascalientes, Colima, Michoacán, Guanajuato, Querétaro, Morelos, Hidalgo, Tlaxcala or Puebla; México State or Federal District if state equals Estado de México or Distrito Federal; South if state equals Guerrero, Oaxaca, Chiapas, Veracruz, Tabasco, Yucatán, Campeche or Quintana Roo.
Urbanization. 1. Urban; 2. Rural or mixed.
Partisanship of state governor (in 2006). PRI if state equals Sonora, Chihuahua, Coahuila, Nuevo León, Tamaulipas, Sinaloa, Nayarit, Colima, Puebla, Estado de México, Veracruz, Hidalgo, Durango, Oaxaca, Tabasco, Campeche or Quintana Roo; PAN if state equals Baja California, San Luis Potosí, Querétaro, Guanajuato, Tlaxcala, Jalisco, Aguascalientes, Morelos, or Yucatán; PRD if state equals Baja California Sur, Michoacán, Guerrero, Chiapas, Zacatecas or Distrito Federal.
Age (2nd wave). How old are you?
Female (2nd wave). Answered by the interviewer. 1. Female; 0. Male.
White (2nd wave). Answered by the interviewer. 1. White; 0. Light brown or dark brown.
Education (2nd wave). How many years of schooling have you had? 1. No schooling; 2. Incomplete elementary; 3. Complete elementary; 4. Incomplete middle/technical; 5. Complete middle/technical; 6. Incomplete high; 7. Complete high; 8. Incomplete college; 9. Complete college or more.
Church attendance (1st wave). How often do you attend religious services? 1. More than once a week; 2. Once a week; 3. Once a month; 4. Only on special occasions; 5. Never.
Abortion (2nd wave). In your opinion, should abortion in case of rape be legal or illegal? 1. Legal; 0. Illegal.
Income (2nd wave). I will show you a card with different income levels. Which one would your household monthly income fall in, counting all wages, salaries, pensions and other sources of income? (in Mexican pesos). 1. 0 to 1,299; 2. 1,300 to 1,999; 3. 2,000 to 2,599; 4. 2,600 to 3,999; 5. 4,000 to 5,199; 6. 5,200 to 6,499; 7. 6,500 to 7,899; 8. 7,900 to 9,199; 9. 9,200 to 10,499; 10. 10,500 or more.
Benefit from antipoverty programs (1st wave). Question A: Do you or a relative who lives in this household receive benefits from the program Oportunidades? Question B: Before, did you receive benefits from the program Progresa? 0. No I/we do not (question A) AND No, I did not receive (question B). 1. Yes, I/we do (question A) OR Yes, I received (question B).
Voter confidence (2nd wave). How fair do you think elections will be this year: totally fair, more or less fair, a little fair, or not at all fair? 1. Not at all; 2. A little; 3. More or less; 4. Totally fair.
Protest (2nd wave). Let’s suppose the election is over and IFE announces that the candidate you voted for has lost. Your candidate rejects the result and calls on his followers to join him in protest demonstrations. Would you join the demonstration to support your candidate or would you not join the demonstration? 0. No, would not join it; 1. Yes, I would join it.
Fox approval (2nd wave). In general, do you approve or disapprove of the way in which Vicente Fox is doing his job as president? (Insist) A lot or a little? 1. Disapprove a lot; 2. Disapprove a little; 3. Neither; 4. Approve a little; 5. Approve a lot.
Partisanship (1st wave). In general, would you say you identify with the PAN, the PRI or the PRD? (Insist) Would you say you identify strongly with the (party name) or only somewhat with the (party name)? PAN coding: 1. Strong PAN or weak PAN; 0. Otherwise. PRI coding: 1. Strong PRI or weak PRI; 0. Otherwise. PRD coding: 1. Strong PRD or weak PRD; 0. Otherwise.
Length of partisanship (1st wave). For how long have you identified with the (PAN/PRI/PRD)? 0. Independent; 1. Less than 2 years; 2. Between 2 and 5 years; 3. Between 5 and 11 years; 4. More than 11 years.
Evaluation of the national economy (1st wave). Since Fox became president, would you say the national economy has gotten better, has gotten worse, or stayed the same? (Insist) Would you say it has gotten a lot (better/worse) or a little (better/worse)? 1. A lot worse; 2. A little worse; 3. Stayed the same; 4. A little better; 5. A lot better.
Evaluation of the personal economy (1st wave). Since Fox became president, would you say your personal economic situation has gotten better, has gotten worse, or stayed the same? (Insist) Would you say it has gotten a lot (better/worse) or a little (better/worse)? 1. A lot worse; 2. A little worse; 3. Stayed the same; 4. A little better; 5. A lot better.
Ideology, respondent (2nd wave). In politics, would you consider yourself on the left, on the right, or in the center? (If left or right) Very or somewhat on the left/right? (If center) Center‐left, center‐right, or center‐center? 1. Very on the left; 2. Somewhat on the left; 3. Center‐left; 4. Center‐center OR None; 5. Center‐right; 6. Somewhat on the right; 7. Very on the right.
Ideology, candidate (Comparative Study of Electoral Systems, México Survey, July 2006). In politics people often talk about “left” and “right”. Using the scale shown in the fourth card, where 0 means left and 10 means right, where would you locate (Felipe Calderón/Roberto Madrazo/Andrés Manuel López Obrador)? Coding: We use linear interpolation to translate responses from the original 11‐point dimension to a 7‐point dimension comparable to the ideology coding in the MPS. Also, we compute the average standing of each candidate among respondents who have finished primary school or achieved higher levels of education, and round to the closest integer. As a result, we assign an average standing of 3 to López Obrador, of 4 to Madrazo, and of 5 to Calderón, in a 7‐point ideology dimension.
Mexico‐US relations, respondent (2nd wave). What would you prefer: that commercial relations between Mexico and the United States increase, decrease, or remain the same? ‐1. Decrease; 0. Remain the same; 1. Increase.
Mexico‐US relations, candidates (2nd wave). Do you believe (candidate name) will try to increase commercial relations between Mexico and the United States, decrease commercial relations between Mexico and the United States or keep relations between Mexico and the United States the same? ‐1. Decrease; 0. Stay the same; 1. Increase.
Candidate evaluations (1st wave). Now I’d like to ask your opinion about some presidential candidates. I’ll start with Felipe Calderón. In your opinion, how (capable of managing the economy/capable of reducing crime/capable of reducing poverty/honest) is Calderón; very, somewhat, not very or not at all? Now Andrés Manuel López Obrador. How (capable of managing the economy/capable of reducing crime/capable of reducing poverty/honest) is López Obrador; a lot, somewhat, a little or nothing? Now Roberto Madrazo. How (capable of managing the economy/capable of reducing crime/capable of reducing poverty/honest) is Madrazo; a lot, somewhat, a little or nothing? Coding for each candidate: 1. Not at all; 2. Not very; 3. Somewhat; 4. Very.
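As an illustration of the candidate ideology coding described above, the following Python sketch maps a 0 to 10 placement onto the 1 to 7 scale by linear interpolation. The exact mapping is an assumption (the endpoints 0 and 10 are sent to 1 and 7, respectively), and the function name is hypothetical.

```python
def rescale_11_to_7(placement):
    """Map a 0-10 left-right placement onto a 1-7 scale (linear interpolation)."""
    if not 0 <= placement <= 10:
        raise ValueError("placement must lie between 0 and 10")
    return 1.0 + 6.0 * placement / 10.0

# Endpoints and midpoint of the original scale.
print(rescale_11_to_7(0), rescale_11_to_7(5), rescale_11_to_7(10))
```

Candidate standings would then be obtained by averaging the rescaled placements over the relevant respondents and rounding to the closest integer.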