Professor B. Jones
University of California, Davis
The Nature of Research in Political Science
Hypotheses
Working Example: Immigration
Normative
◦ Value judgments
◦ What ought to be?
◦ The problem?
Normative conclusions are often passed off as causally inferred or scientifically derived.
But it is difficult to sustain an inference derived solely from normative judgment.
Also, the way we want the world to work may cloud our understanding of it!
Information Exposure
Implications? Be careful!
Don’t confuse “entertainment” with scientific research.
Philosophers, classical political theorists, literary figures, ethicists
…all very important work!
Purports to account for “what is”
Empirically based
Grounded in scientific method
Often mathematical in its treatment
Important “names”
◦ Harold Gosnell, Charles Merriam, William Riker
Always much harder than you may think
The “relationship” posed undergirds your “research question.” It connects y to x.
Big vs. small questions
◦ Big questions may be interesting…but hard to answer; small questions may be trivial.
Why do democratic states tend to not engage each other in conflict?
Do Supreme Court justices vote ideologically?
How did the 1965 VRA affect congressional redistricting?
Did 19th-century changes to the ballot affect how members of Congress behave?
Does electoral system variability impact the behavior of legislators?
Spend time!
Quickly derived questions will be trivial (usually)…
And very hard to answer/study
My experience: students are way too broad in the kinds of questions they ask
Research questions may originate from
◦ Personal observation or experience
◦ Writings of others
◦ Interest in some broader social theory
◦ Practical concerns like career objectives
How are two or more variables related?
◦ A variable is a concept with variation.
◦ An independent variable is thought to influence, affect, or cause variation in another variable.
◦ A dependent variable is thought to depend upon or be caused by variation in an independent variable.
Variables can have many different kinds of relationships:
◦ Multiple independent variables usually needed
◦ Antecedent variables
◦ Intervening variables
◦ An arrow diagram can map the relationships
Causal relationships are the most interesting.
A causal relationship has three components:
◦ X and Y covary.
◦ The change in X precedes the change in Y.
◦ Covariation between X and Y is not a coincidence or spurious.
We can state relationships in hypotheses.
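The covariation and non-spuriousness components can be sketched in a few lines of Python. This is a minimal illustration, not a method from the lecture: the data are made up so that a third variable z drives both x and y, and the partial correlation is one common check that is swapped in here for "not spurious." Temporal order (the second component) cannot be checked from data like these.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def partial_corr(xs, ys, zs):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Made-up data (an assumption for illustration): z drives both x and y,
# and x has no effect of its own on y.
z = [1, 1, 2, 2, 3, 3, 4, 4]
x = [1.5, 0.5, 2.5, 1.5, 3.5, 2.5, 4.5, 3.5]
y = [1.5, 0.5, 1.5, 2.5, 3.5, 2.5, 3.5, 4.5]

r_xy = pearson(x, y)                   # x and y covary strongly
r_xy_given_z = partial_corr(x, y, z)   # but not once z is held constant
print(f"r(x, y)     = {r_xy:.2f}")
print(f"r(x, y | z) = {r_xy_given_z:.2f}")
```

Here x and y covary, yet the relationship vanishes once z is controlled for, which is what a spurious relationship looks like.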
The research question puts boundaries on the problem:
Why did illegal immigration increase in the mid 90s/2000s?
The explanation leads you to think of y and the xk (i.e. the dependent and independent variables)
Let’s turn to a working example
[Figure: Unauthorized Migrants Living in U.S. (Pew Estimates, 2005). X-axis: Year (1992, 1996, 2000, 2004). Y-axis: Number (in Millions). Data labels: 3.9, 5, 8.4, 10.3.]
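To put the Pew chart in numbers, the growth between data points can be computed directly. A minimal sketch, assuming the four data labels pair with the four axis years in order (the chart residue lists them separately):

```python
# Pew estimates in millions; the year-to-value pairing is an assumption
# based on the order in which the chart lists them.
estimates = {1992: 3.9, 1996: 5.0, 2000: 8.4, 2004: 10.3}

def percent_change(old, new):
    """Percent change from old to new."""
    return (new - old) / old * 100.0

years = sorted(estimates)

# Change between each pair of consecutive data points.
for start, end in zip(years, years[1:]):
    change = percent_change(estimates[start], estimates[end])
    print(f"{start}-{end}: {change:+.1f}%")

# Change over the whole period.
total_change = percent_change(estimates[years[0]], estimates[years[-1]])
print(f"{years[0]}-{years[-1]}: {total_change:+.1f}%")
```

This is the y in the working example: the thing to be explained.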
Attitudes of Americans toward immigration?
The number of anti-immigrant protests/rallies?
Court/congressional action on immigration?
Legislation dealing w/immigration?
Hate crimes?
News coverage? (Look at some data)
[Figure: Number of Articles Referencing "Border" and "Immigration" in Washington Post, Charlotte Observer, and Minneapolis Star Tribune (1996-2005). X-axis: Year. Y-axis: Number of Articles. Series: Washington Post, Charlotte Observer, Minneapolis Star Tribune.]
[Figure: Number of Articles Referencing "Border" and "Immigration" in Arizona Daily Star and Sacramento Bee (1996-2005). X-axis: Year. Y-axis: Number of Articles. Series: Arizona Daily Star, Sacramento Bee.]
What are the factors increasing undocumented migration?
These are your x factors.
Possible suspects
◦ Crushing poverty in Mexico and Latin America?
◦ Willingness of American firms to hire undocumented workers?
◦ Terrorism?
◦ State policies promoting migration?
◦ Lax enforcement among U.S. agencies?
In fact, all of these probably had an impact.
The problem? What kinds of variables are these?
Antecedent vs. Intervening Variables
Getting the explanatory story straight can be difficult!
Operation Gatekeeper defined
Massive increase in immigration post-O.G.
“Causal Explanation”:
◦ In-flows = f(Operation Gatekeeper)
◦ Satisfied with this?
Problems with the “explanatory story”?
◦ Time series vs. cross-sectional data
◦ Perhaps O.G. was an antecedent variable
“A variable that occurs prior to all other variables and that may affect other independent variables.” (i.e. other xk)
O.G.------->Increase of Migrants
Suppose Operation Gatekeeper did not have a “direct effect” on in-migration?
“Hidden Effects”
◦ O.G. shifted migration hubs
◦ Stretched INS razor thin
◦ Adoption of OTM category
◦ Made migration an option to other Lat. Am. countries
[Figure: Deportable Aliens Apprehended in U.S.: Total and Mexican. X-axis: Fiscal Year (1997-2004). Y-axis: Number Apprehended (in Millions), ticks from 0 to 2,000,000. Series: Total, Mexican. Source: Dept. of Homeland Security.]
[Figure: OTM Apprehensions: The Top Five List. X-axis: Fiscal Years 2002-2004 (axis ticks: 2002, 2003, 2004, 2005). Y-axis: Number Apprehended. Series: Honduras, Brazil, El Salvador, Guatemala, Nicaragua. Source: Congressional Research Service.]
Breakdown of 2004 Unauthorized Population (Pew Estimates)
◦ Mexican, 57%
◦ Other Latin America, 24%
◦ Asia, 9%
◦ Europe/Canada, 6%
◦ Africa/Other, 4%
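A quick arithmetic check on the breakdown, as a minimal sketch. The percentages are the ones on the slide; the variable names are only illustrative.

```python
# Shares of the 2004 unauthorized population, in percent (Pew estimates from the slide).
shares = {
    "Mexican": 57,
    "Other Latin America": 24,
    "Asia": 9,
    "Europe/Canada": 6,
    "Africa/Other": 4,
}

total = sum(shares.values())                                     # should cover the whole population
non_mexican = total - shares["Mexican"]                          # everyone not from Mexico
latin_america = shares["Mexican"] + shares["Other Latin America"]

print(f"Total: {total}%")
print(f"Non-Mexican: {non_mexican}%")
print(f"Mexico plus other Latin America: {latin_america}%")
```

A sizable non-Mexican share is one reason to suspect an explanation focused only on Mexico is incomplete.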
O.G. probably not directly connected to in-flow
That is
◦ O.G. ---> ? ---> In-flow increase
◦ What “?” is would constitute your real x factor.
Other things learned from data?
◦ Terrorism explanations simply do not account for increases in y.
◦ Perhaps the problem extends beyond Mexico
◦ América (Brazilian telenovela)
For illustration, imagine x corresponds to regional variables (e.g. different states, sectors, etc.)
Causal Explanation:
◦ Regional Variation ---> Increased in-flows
Does this model make sense? …maybe
◦ Southern border much more difficult than Northern.
◦ Tucson/Yuma sectors the toughest of all.
The real question: what is it about region that elicits this effect?
Suppose law enforcement varied across regions: some sectors are tougher than others.
New Model: Region ---> Law Enforcement ---> Increased in-flows
Here, law enforcement acts as an intervening variable.
Classic example: education and voting
◦ Education may induce feelings of civic duty
◦ Thus: education ---> civic duty ---> voting
Antecedents: factors occurring “back in time.”
◦ Temporally, prior to x
Intervening Variables: occurring “closer in time.”
◦ Their relationship is related to x
Law enforcement is connected to region. Civic duty is connected to education.
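An arrow diagram can be written down as a small directed graph, which makes it easy to ask which variables sit between x and y. A minimal sketch using the two chains from the slides; the `arrows` dictionary and the `intervening` helper are hypothetical names introduced only for this illustration.

```python
# Arrow diagram as an adjacency list: each variable points to the variables it affects.
arrows = {
    "region": ["law enforcement"],
    "law enforcement": ["in-flows"],
    "education": ["civic duty"],
    "civic duty": ["voting"],
}

def intervening(arrows, x, y):
    """Return variables lying on a directed path from x to y, excluding x and y."""
    def paths(node, seen):
        # Depth-first walk that yields every arrow path from node to y.
        if node == y:
            yield [node]
            return
        for nxt in arrows.get(node, []):
            if nxt not in seen:
                for rest in paths(nxt, seen | {nxt}):
                    yield [node] + rest

    found = []
    for path in paths(x, {x}):
        for var in path[1:-1]:
            if var not in found:
                found.append(var)
    return found

print(intervening(arrows, "region", "in-flows"))
print(intervening(arrows, "education", "voting"))
```

If no arrow path connects x to y, the function returns an empty list, which signals that the diagram posits no relationship between the two.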
Statements about a relationship
◦ How does it work?
◦ In what direction are the effects?
◦ i.e. positive? negative?
In some sense, it’s an educated guess.
Therefore, it’s inherently PROBABILISTIC
You may be wrong!
Good Hypotheses
◦ Empirical statements
◦ Testable: you can evaluate the relative accuracy of the statement
◦ General statements (interesting vs. trivial)
Bad Hypotheses
◦ Normative statements (Why?)
◦ Not testable: impossible to bring data to bear on your statement
◦ Non-general: the triviality problem
The Good
◦ Levels of law enforcement are related to in-flows of undocumented migrants
  Where the presence of law enforcement is high, in-flows will be lower
  Where the presence of law enforcement is low, in-flows will be higher
◦ These illustrate “directional” hypotheses
The Bad
◦ Immigration is a bad thing.
◦ …or immigration is a good thing.
Normative judgments are very difficult to evaluate.
Another example
◦ America lost the Olympics bid because of Obama
The Ugly
◦ The desire for a better life among impoverished Mexicans has led to an increase in undocumented migration.
Why “ugly”?
Another example
◦ Undocumented aliens hurt the U.S. economy
Six characteristics of a good hypothesis:
1. Should be an empirical statement that formalizes an educated guess about a phenomenon that exists in the political world
2. Should explain general rather than particular phenomena
3. Logical reason for thinking that the hypothesis might be confirmed by the data
4. Should state the direction of the relationship
5. Terms describing concepts should be consistent with the manner of testing
6. Data should be feasible to obtain and would indicate if the hypothesis is defensible
Hypotheses must specify a unit of analysis:
◦ Individuals, groups, states, organizations, etc…
Most research uses hypotheses with one unit of analysis.
Definitions of concepts should be
◦ Clear
◦ Accurate
◦ Precise
◦ Informative
Otherwise, the reader will not understand the concept correctly.
Many of the concepts used in political science are fairly abstract, so careful consideration is necessary.
If it’s testable, you’ll need data. But which data?
Units of Analysis
◦ Defined as the level upon which you’ll collect/analyze data
◦ Countries, regions, individuals???
Our working example:
◦ UOA: perhaps Border Patrol sectors
Another example:
◦ Education and Turnout
◦ UOA? (Group vs. Individuals)
Does the choice matter?
Yes! Beware the Ecological Fallacy
Quick definition: conclusions about individuals are based on aggregated data (or group-level data)
History
◦ Problem identified by William Robinson (1950)
◦ Literacy and immigration
Found literacy rate was positively correlated with percentage of people born outside the U.S. (r=.53)
However, at the individual level, he found immigrants were less literate than native born. (r=-.11)
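The sign flip between group-level and individual-level correlations can be reproduced with a toy dataset. A minimal sketch: the counts below are made up for illustration and are not Robinson's data, and the region names are hypothetical. Within every region the foreign-born are less literate than the native-born, yet regions with more foreign-born residents have higher literacy rates.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up counts per region, keyed by (foreign_born, literate) with 1 = yes, 0 = no.
regions = {
    "A": {(1, 0): 1, (0, 1): 6, (0, 0): 3},
    "B": {(1, 1): 2, (1, 0): 1, (0, 1): 6, (0, 0): 1},
    "C": {(1, 1): 4, (1, 0): 1, (0, 1): 5},
}

# Expand the counts into one (region, foreign_born, literate) record per person.
people = [
    (name, foreign_born, literate)
    for name, cells in regions.items()
    for (foreign_born, literate), count in cells.items()
    for _ in range(count)
]

# Individual level: correlate the two 0/1 indicators across all people.
r_individual = pearson([p[1] for p in people], [p[2] for p in people])

def share(name, index):
    """Mean of a 0/1 indicator within one region (a share or a rate)."""
    rows = [p for p in people if p[0] == name]
    return sum(p[index] for p in rows) / len(rows)

# Group level: correlate each region's foreign-born share with its literacy rate.
r_group = pearson([share(n, 1) for n in regions], [share(n, 2) for n in regions])

print(f"Group-level r:      {r_group:+.2f}")
print(f"Individual-level r: {r_individual:+.2f}")
```

The group-level correlation comes out positive and the individual-level correlation negative, so inferring individual behavior from the regional pattern would get the direction wrong. This is why the choice of unit of analysis matters.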
Theories, data, and measurement.