May 09th, 2014
Who moves and why? Determinants of interstate migration in the United States using data from the 2011 American population survey.
Duc Trinh
Abstract
Many past empirical studies have tried to establish the relationship between interstate migration
and separate, seemingly unrelated variables such as sex, ethnicity and marital status, which are
believed to be the main contributors to an individual’s decision to move. In this paper, I include
many of the same variables together in a larger model. I find that many of them are correlated with
each other, but I ultimately reach findings quite similar to those of past research.
I. Introduction
The phenomenon of migration has always been an interesting topic for economists
and politicians alike due to its importance to the dynamics of the economy. By
understanding the patterns, sources and reasons behind the decision to move from one
state to another, policy makers could come up with sound plans to leverage the
beneficial effects of migration or mitigate many of its unfavourable consequences.
The primary objective of this paper is to study the roles of variables such as age,
race, marital status and occupation that could determine interstate migration in
America, using data from the 2011 IPUMS USA survey database. The remainder of the
paper proceeds as follows. Section II provides a brief survey of the relevant
literature and a theoretical explanation for the accompanying empirical models.
Section III describes and summarizes the data used. Section IV reports and
interprets the empirical results. Finally, Section V draws some conclusions and describes
possible future studies based on the shortcomings of the data used in this paper.
II. Literature review and theoretical analysis
Many past studies have found correlations between migration and socio-demographic
variables. I summarize and review these studies below in order to provide a
theoretical basis for my empirical models.
Migration tends to peak around the ages of 25 to 30, when young people transition
into adulthood. Factors such as career, relationships and personal preferences for
moving play an important role in the increased mobility of this segment of the
population. Older people move less often as they settle in their communities. Rogers
(1988) theorizes that there is a consistent pattern of elderly migration in many
countries around the world and uses mathematical expressions called “model schedules”
to summarize these regularities. By fitting these schedules to observed data, Rogers
found that there are 4 peaks of migration during the typical life cycle of a person.
The first migration peak occurs around age 16, when teens move away from the parental
home due to job shifts, household formation, marriage and the transition into the
military or university. The second is around age 60, when people move away from the
family home for housing-related reasons. The last peak of migration occurs at the age
of 75, when older people move away from the retirement home into a dependent status as
they become more prone to the illnesses of old age. McInnis (1971) estimated a model
of migration by means of linear regression for age groups, occupation and education
for males in Canada. He stated that the younger group of migrants (between the ages of
20 and 24) is the most responsive to regional differences in earnings: they move to
places with better economic opportunities. For older age groups, earnings
differentials between regions do not really affect migration. Sandefur (1985)
investigated the variations in the migration of men during the early stages of the
life cycle, defined by jobs, marriage and childbearing. He concluded that there is a
general inverse relationship between age and migration.
Race also affects the individual's decision to move. Wilson et al. (2009)
systematically compared the likelihood of return migration between people from
different racial groups and found higher odds of return migration for Hispanics and
Blacks than for Whites. These return migration differentials could be explained by
established ethnic communities and by the distribution of initial settlements.
Although not normally linked with migration, veteran status probably influences
the decision to move for many people. People on active duty might have to move more
frequently due to the requirements of their jobs, and this may affect their decisions
after they retire from the armed forces. The required relocation could reduce the cost
of gathering information about new destinations and weaken ties with initial home
communities. Military service also gives veterans a diverse set of connections with
people from vastly different geographical origins. Using 1% samples from the U.S.
population census from 1940 to 2000, Bailey (2011) concludes that both Black and White
veterans are significantly more likely to have recently moved than nonveterans.
Sandefur (1985) used a life cycle model with three typical stages: a first stage
of never-married and divorced men, a second stage of married men with no children and
a third stage of married men with children. He found that the probability of migration
decreases more rapidly over time for married males with children than for single
males, which could be a direct result of ties with local communities strengthening
over the years after having children. Speare and Goldscheider (1987) attempted to
quantify the effect of life cycle changes, namely changes in marital status, on the
probability of residential mobility. Running logistic regressions on panel data of
Rhode Island adults from 1967 and 1979, they found that mobility rates are highest for
the newly married and for the separated and divorced but lowest for the widowed.
However, many people do not move immediately during the year of the marital status
change, and the increased probability of moving lasts for several years after the
change in marital status happens.
It is important to take into account individuals’ metropolitan status and the
type of migration destination, as those variables affect the decision to move. It is
generally accepted that people from rural areas, especially younger ones, often
migrate to bigger metropolitan areas due to better economic prospects. However, Mills
and Hazarika (2001) found that non-metropolitan counties are attractive alternative
destinations for educated non-metropolitan youth due to high returns to schooling.
These findings could be explained by lower costs of living and area-specific amenity
values.
The type of career one is in and the industry one is employed in are generally
believed to determine one's mobility. For example, someone who works in the tourism
industry may have to move a lot more than someone who works in manufacturing due to
the nature of the work. Employment status also affects migration patterns: someone who
cannot find a job in the current community might need to migrate to another state in
order to find one.
Based on the literature review, I believe all the variables mentioned in the studies
above should be incorporated into the model, namely age, sex, education, race,
employment status, veteran status, marital status, career, industry, occupation and
metropolitan status. My econometric regression models are described by the equation
below:
$$Y_i = \beta X_i + \varepsilon$$
where Y is the dichotomous dependent variable moved (1 indicates that the person has
moved since last year and 0 indicates no change in living place) and the Xs are the
independent variables in the regressions. ε is the error term, which contains omitted
variables that are not included in the data and represents the results of a chance
process. This model adheres to the classical econometric model, in which the error
terms are randomly drawn from a normally distributed box of tickets with an average of
0 and an unknown, constant SD. The errors are also independent of each other and
independent of the Xs. The Xs include age, sex, education, race, employment status,
veteran status, marital status, career, industry, occupation and metropolitan status
and are fixed in repeated sampling.
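To make the interpretation of such a model concrete, here is a minimal sketch of a linear probability model with a single dummy regressor. The ten observations are made up for illustration and are not taken from the survey; `fit_lpm` is a hypothetical helper name. With a 0/1 outcome and a 0/1 regressor, the OLS intercept is the share of movers in the reference group and the slope is the difference in shares between the two groups.

```python
def fit_lpm(y, x):
    """Ordinary least squares fit of y = a + b*x for one regressor.

    Returns (intercept, slope). Both y and x are lists of 0/1 values here,
    but the formula is the usual simple-regression formula.
    """
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up illustration data (NOT from the survey): 6 nonveterans, 4 veterans.
veteran = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
moved = [0, 0, 0, 0, 0, 1, 0, 0, 1, 1]

intercept, slope = fit_lpm(moved, veteran)
# intercept: share of nonveterans who moved (1 of 6)
# slope: veterans' share (2 of 4) minus nonveterans' share (1 of 6)
print(round(intercept, 4), round(slope, 4))
```

In this toy sample the slope says that veterans are about 33 percentage points more likely to have moved, which is the same way the LPM coefficients in Section IV are read.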
III. Data and Measurement
The data are cross-sectional and were collected from the IPUMS USA current
population survey in 2011. By generating new variables, I transformed most of the
continuous variables into dummy variables for ease of interpretation. I also dropped
observations coded “Not Available”, which reduced the total size of the data set.
Descriptions and summary statistics of the variables are given in the tables below.
Variable        Description
age             Age of the person in the survey
incwage         Wage and salary income of the person in the survey
age16           1 if the person is 16 years old, 0 otherwise
age60           1 if the person is 60 years old, 0 otherwise
age75           1 if the person is 75 years old, 0 otherwise
single          1 if the person is single, 0 otherwise
married         1 if the person is married, 0 otherwise
white           1 if the person is white, 0 otherwise
black           1 if the person is black, 0 otherwise
employed        1 if the person is employed, 0 otherwise
unemployed      1 if the person is unemployed, 0 otherwise
recreational    1 if the person works in the recreational industry, 0 otherwise
manufacturing   1 if the person works in the manufacturing industry, 0 otherwise
metrostatus     1 if the person lived in a metropolitan area 1 year ago, 0 otherwise
movedin9        1 if the person has lived in the current residence for fewer than 10 years, 0 otherwise
veteran         1 if the person is a veteran, 0 otherwise
male            1 if the person is male, 0 otherwise
moved           Dependent variable: 1 if the person has moved out of state since last year, 0 otherwise
Table 1: Description of variables
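As a rough illustration of how dummy variables like those in Table 1 can be generated from raw survey fields, here is a minimal sketch. The raw field names (`age`, `is_veteran`, `state_now`, `state_last_year`) and the helper `make_dummies` are hypothetical and chosen only for this example; they are not the IPUMS variable names, and this is not the code used to build the data set.

```python
def make_dummies(person):
    """Turn one raw record (a dict) into 0/1 indicator variables.

    The keys read from `person` are hypothetical field names used only
    for this illustration.
    """
    return {
        "age60": int(person["age"] == 60),
        "age75": int(person["age"] == 75),
        "veteran": int(bool(person["is_veteran"])),
        # moved = 1 when the state of residence differs from last year's
        "moved": int(person["state_now"] != person["state_last_year"]),
    }


# Made-up example record.
example = make_dummies(
    {"age": 60, "is_veteran": True, "state_now": "PA", "state_last_year": "OH"}
)
print(example)
```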
Variable            Obs        Mean   Std. Dev.   Min       Max
age              173815    63.87702    15.47354    18        95
incwage          173815    23698.49    44224.33     0    607000
age16            173815           0           0     0         0
age60            173815    .0226735    .1488609     0         1
age75            173815    .0207002     .142379     0         1
single           173815     .071484    .2576324     0         1
married          173815    .6749187    .4684066     0         1
white            173815    .8724736    .3335627     0         1
black            173815    .0848028    .2785888     0         1
employed         173815    .4242844    .4942353     0         1
unemployed       173815    .5757156    .4942353     0         1
recreational     173815     .002589    .0508161     0         1
manufacturing    173815    .0810747    .2729506     0         1
metrostatus      173815     .041222    .1988038     0         1
movedin9         173815    .2432011    .4290168     0         1
veteran          173815    .0943647    .2923363     0         1
moved            173815    .0041078    .0639607     0         1
male             173815    .9467192    .2245935     0         1
Table 2: Variables summary statistics
IV. Results and Interpretation
The tables below display linear probability model (LPM) and dprobit regression
results for different empirical models:
                                m1          m2          m3          m4
                              b/se        b/se        b/se        b/se
Age                      -0.000***   -0.000***   -0.000***   -0.000***
                            (0.00)      (0.00)      (0.00)      (0.00)
Wage and salary income     -0.000*     -0.000*       0.000      -0.000
                            (0.00)      (0.00)      (0.00)      (0.00)
age16                        0.000       0.000       0.000       0.000
                               (.)         (.)         (.)         (.)
age60                     -0.003**    -0.003**     -0.003*       0.000
                            (0.00)      (0.00)      (0.00)      (0.00)
age75                        0.001       0.001       0.000      -0.000
                            (0.00)      (0.00)      (0.00)      (0.00)
single                       0.000       0.000       0.000      -0.000
                            (0.00)      (0.00)      (0.00)      (0.00)
married                     -0.001      -0.000      -0.000      -0.000
                            (0.00)      (0.00)      (0.00)      (0.00)
white                                -0.003***   -0.003***   -0.003***
                                        (0.00)      (0.00)      (0.00)
black                                -0.003***   -0.003***     -0.002*
                                        (0.00)      (0.00)      (0.00)
male                                    -0.000      -0.000       0.001
                                        (0.00)      (0.00)      (0.00)
employed                                             0.000       0.000
                                                       (.)         (.)
unemployed                                         0.001**       0.000
                                                    (0.00)      (0.00)
recreational                                        -0.002      -0.001
                                                    (0.00)      (0.00)
manufacturing                                    -0.005***   -0.003***
                                                    (0.00)      (0.00)
metrostatus                                                  -0.016***
                                                                (0.00)
movedin9                                                      0.007***
                                                                (0.00)
veteran                                                       0.023***
                                                                (0.00)
Constant                  0.029***    0.032***    0.033***    0.010***
                            (0.00)      (0.00)      (0.00)      (0.00)
* p<0.05, ** p<0.01, *** p<0.001
Table 3: LPM regression output
                                m5          m6          m7          m8
                              b/se        b/se        b/se        b/se
Age                      -0.028***   -0.027***   -0.028***   -0.010***
                            (0.00)      (0.00)      (0.00)      (0.00)
Wage and salary income       0.000       0.000      0.000*       0.000
                            (0.00)      (0.00)      (0.00)      (0.00)
age60                       -0.007      -0.006       0.012       0.117
                            (0.10)      (0.10)      (0.10)      (0.11)
age75                       -0.036      -0.032      -0.060      -0.160
                            (0.20)      (0.20)      (0.20)      (0.21)
single                      -0.081      -0.084      -0.087      -0.076
                            (0.05)      (0.05)      (0.05)      (0.05)
married                     -0.056      -0.053      -0.053      -0.068
                            (0.03)      (0.03)      (0.03)      (0.04)
white                                -0.175***   -0.169***   -0.176***
                                        (0.05)      (0.05)      (0.05)
black                                   -0.109      -0.106      -0.100
                                        (0.06)      (0.06)      (0.07)
male                                     0.021       0.039       0.047
                                        (0.05)      (0.05)      (0.05)
unemployed                                           0.061
                                                    (0.04)
recreational                                        -0.165      -0.076
                                                    (0.35)      (0.35)
manufacturing                                    -0.415***   -0.321***
                                                    (0.07)      (0.07)
employed                                                        -0.041
                                                                (0.04)
movedin9                                                      0.401***
                                                                (0.04)
veteran                                                       0.661***
                                                                (0.04)
Constant                 -1.114***   -0.993***   -0.986***   -2.250***
                            (0.06)      (0.08)      (0.08)      (0.12)
* p<0.05, ** p<0.01, *** p<0.001
Table 4: dprobit regression output
We can see from the output tables that the coefficients obtained from the LPM
regressions are quite different from those obtained from the dprobit regressions.
A possible reason for this inconsistency is that the LPM suffers from
heteroskedasticity. To check, I ran a Breusch-Pagan test after running the LPM
regression for model 4 (the most comprehensive model).
. reg moved age incwage age16 age60 age75 single married white black male employed unemployed recreational manufacturing metrostatus movedin9 veteran

      Source |       SS         df       MS         Number of obs =  173815
-------------+--------------------------------      F( 15,173799) =  230.31
       Model |  13.8586227      15  .923908181      Prob > F      =  0.0000
    Residual |  697.208397  173799  .004011579      R-squared     =  0.0195
-------------+--------------------------------      Adj R-squared =  0.0194
       Total |   711.06702  173814  .004090965      Root MSE      =  .06334

       moved |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0001056    .000015    -7.03   0.000    -.0001351   -.0000762
     incwage |  -6.64e-10   4.23e-09    -0.16   0.875    -8.95e-09    7.63e-09
       age16 |  (dropped)
       age60 |   .0000869   .0010257     0.08   0.932    -.0019234    .0020972
       age75 |  -.0000567   .0010739    -0.05   0.958    -.0021616    .0020481
      single |  -.0002733   .0006588    -0.41   0.678    -.0015645     .001018
     married |  -.0004475   .0003644    -1.23   0.219    -.0011617    .0002666
       white |  -.0029439   .0007578    -3.88   0.000    -.0044293   -.0014586
       black |  -.0022841   .0009031    -2.53   0.011     -.004054   -.0005141
        male |   .0007102   .0007029     1.01   0.312    -.0006674    .0020878
    employed |  (dropped)
  unemployed |   .0004402   .0004296     1.02   0.306    -.0004018    .0012822
recreational |  -.0008403   .0029913    -0.28   0.779    -.0067033    .0050226
manufactur~g |  -.0028938   .0005695    -5.08   0.000      -.00401   -.0017776
 metrostatus |  -.0162115   .0008089   -20.04   0.000    -.0177969   -.0146261
    movedin9 |   .0065748   .0004093    16.06   0.000     .0057726     .007377
     veteran |   .0228416   .0006482    35.24   0.000     .0215711     .024112
       _cons |   .0101794   .0013276     7.67   0.000     .0075774    .0127815

. estat hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of moved

         chi2(1)      = 397340.90
         Prob > chi2  =   0.0000
Table 5: Breusch-Pagan test for LPM model
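For readers who want to see what a statistic of this kind measures, here is a minimal sketch of the classic Breusch-Pagan score statistic against the fitted values, for a regression with one regressor. It is my assumption that this matches the form of the test reported above. The data are made up and `ols_fitted` and `breusch_pagan` are hypothetical helper names. The statistic is half the explained sum of squares from regressing the scaled squared residuals on the fitted values, and it is compared with a chi-squared distribution with 1 degree of freedom.

```python
def ols_fitted(y, x):
    """Fitted values from a simple OLS regression of y on x (with intercept)."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    return [intercept + slope * xi for xi in x]


def breusch_pagan(y, x):
    """Score statistic for heteroskedasticity against the fitted values."""
    fitted = ols_fitted(y, x)
    resid_sq = [(yi - fi) ** 2 for yi, fi in zip(y, fitted)]
    sigma_sq = sum(resid_sq) / len(y)          # ML estimate of the error variance
    scaled = [r / sigma_sq for r in resid_sq]  # scaled squared residuals
    aux_fitted = ols_fitted(scaled, fitted)    # auxiliary regression
    mean_scaled = sum(scaled) / len(scaled)
    explained_ss = sum((a - mean_scaled) ** 2 for a in aux_fitted)
    return explained_ss / 2


# Made-up data: the spread of y is the same in both groups ...
same_spread = breusch_pagan([1, -1, 2, 0], [0, 0, 1, 1])
# ... versus three times larger in the second group.
unequal_spread = breusch_pagan([1, -1, 4, -2], [0, 0, 1, 1])
print(same_spread, unequal_spread)
```

The statistic is zero when the squared residuals do not vary with the fitted values and grows as they become more strongly related, which is why a very large value such as the one in Table 5 points to heteroskedasticity.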
The result is extremely statistically significant, which means that
heteroskedasticity is present in the data. I corrected for this by using robust SEs
and obtained the new output table for the LPM regressions.
                                m1          m2          m3          m4
                              b/se        b/se        b/se        b/se
Age                      -0.000***   -0.000***   -0.000***   -0.000***
                            (0.00)      (0.00)      (0.00)      (0.00)
Wage and salary income      -0.000      -0.000       0.000      -0.000
                            (0.00)      (0.00)      (0.00)      (0.00)
age16                        0.000       0.000       0.000       0.000
                               (.)         (.)         (.)         (.)
age60                    -0.003***   -0.003***    -0.003**       0.000
                            (0.00)      (0.00)      (0.00)      (0.00)
age75                        0.001       0.001       0.000      -0.000
                            (0.00)      (0.00)      (0.00)      (0.00)
single                       0.000       0.000       0.000      -0.000
                            (0.00)      (0.00)      (0.00)      (0.00)
married                     -0.001      -0.000      -0.000      -0.000
                            (0.00)      (0.00)      (0.00)      (0.00)
white                                 -0.003**    -0.003**     -0.003*
                                        (0.00)      (0.00)      (0.00)
black                                  -0.003*     -0.003*      -0.002
                                        (0.00)      (0.00)      (0.00)
male                                    -0.000      -0.000       0.001
                                        (0.00)      (0.00)      (0.00)
employed                                             0.000       0.000
                                                       (.)         (.)
unemployed                                         0.001**       0.000
                                                    (0.00)      (0.00)
recreational                                        -0.002      -0.001
                                                    (0.00)      (0.00)
manufacturing                                    -0.005***   -0.003***
                                                    (0.00)      (0.00)
metrostatus                                                  -0.016***
                                                                (0.00)
movedin9                                                      0.007***
                                                                (0.00)
veteran                                                       0.023***
                                                                (0.00)
Constant                  0.029***    0.032***    0.033***    0.010***
                            (0.00)      (0.00)      (0.00)      (0.00)
* p<0.05, ** p<0.01, *** p<0.001
Table 6: Robust SE of LPM regressions
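As a minimal sketch of what a robust SE changes, the snippet below computes the slope of a one-regressor LPM together with its conventional SE and a heteroskedasticity-robust SE. It uses the White sandwich formula with the n/(n-k) scaling often labelled HC1. Whether the software behind Table 6 uses exactly this scaling is an assumption, the six observations are made up, and `slope_with_ses` is a hypothetical helper name.

```python
import math


def slope_with_ses(y, x):
    """Return (slope, conventional SE, robust SE) for y = a + b*x."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dev_x = [xi - mean_x for xi in x]
    sxx = sum(d * d for d in dev_x)
    slope = sum(d * (yi - mean_y) for d, yi in zip(dev_x, y)) / sxx
    intercept = mean_y - slope * mean_x
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Conventional variance: one common error variance for everyone.
    conventional_var = (sum(e * e for e in resid) / (n - 2)) / sxx
    # Sandwich variance: each observation keeps its own squared residual.
    sandwich_var = sum(d * d * e * e for d, e in zip(dev_x, resid)) / sxx ** 2
    robust_var = sandwich_var * n / (n - 2)  # small-sample scaling, k = 2
    return slope, math.sqrt(conventional_var), math.sqrt(robust_var)


# Made-up data (NOT from the survey): 4 nonveterans, 2 veterans.
veteran = [0, 0, 0, 0, 1, 1]
moved = [0, 0, 0, 0, 0, 1]
slope, conventional_se, robust_se = slope_with_ses(moved, veteran)
print(slope, conventional_se, robust_se)
```

The slope is identical under both approaches; only the SE, and therefore the significance stars, can change. That is the pattern seen when Table 6 is compared with Table 3.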
Using robust SEs leaves the coefficient estimates unchanged but changes the
significance levels of several variables. We can see that being 60 years old is
associated with a 0.3 percentage point decrease in the probability of migration in
the first three models (Table 6). However, the same variable does not produce a
statistically significant result in the last model. Being black also leads to a 0.3
percentage point decrease in the probability of migration in the second and third
models but not in the last model. Being unemployed leads to a 0.1 percentage point
increase in the probability of migration in the third model but not in the last
model. We generally see fewer statistically significant results for the same
variables as we incorporate more variables into the model. This could be due to
multicollinearity, in which the variable age60 is correlated with other variables in
model 4. Therefore, I ran regressions of age60, black and unemployed on the new
variables that are only present in model 4 and got the following results:
      Source |       SS         df       MS         Number of obs =  173815
-------------+--------------------------------      F(  3,173811) =   98.13
       Model |  6.51283811       3  2.17094604      Prob > F      =  0.0000
    Residual |  3845.13077  173811  .022122482      R-squared     =  0.0017
-------------+--------------------------------      Adj R-squared =  0.0017
       Total |  3851.64361  173814  .022159571      Root MSE      =  .14874

       age60 |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 metrostatus |  -.0018493   .0018959    -0.98   0.329    -.0055652    .0018667
    movedin9 |   .0012249   .0009233     1.33   0.185    -.0005847    .0030345
     veteran |  -.0212708   .0013059   -16.29   0.000    -.0238303   -.0187114
       _cons |   .0244591   .0004129    59.24   0.000     .0236498    .0252684
Table 7: regression of age60 on metrostatus, movedin9 and veteran
There is a statistically significant negative relationship between being 60 years
old and being a veteran (Table 7): being a veteran is associated with a 2.1 percentage
point lower probability of being 60 years old.
      Source |       SS         df       MS         Number of obs =  173815
-------------+--------------------------------      F(  3,173811) =  306.22
       Model |  70.9253441       3  23.6417814      Prob > F      =  0.0000
    Residual |  13419.0813  173811  .077205017      R-squared     =  0.0053
-------------+--------------------------------      Adj R-squared =  0.0052
       Total |  13490.0066  173814  .077611738      Root MSE      =  .27786

       black |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 metrostatus |   .0256438   .0035418     7.24   0.000     .0187019    .0325856
    movedin9 |   .0335476   .0017248    19.45   0.000     .0301671    .0369281
     veteran |   .0235208   .0024395     9.64   0.000     .0187395    .0283022
       _cons |   .0733674   .0007713    95.12   0.000     .0718555    .0748792

      Source |       SS         df       MS         Number of obs =  173815
-------------+--------------------------------      F(  3,173811) = 5640.13
       Model |  3766.52443       3  1255.50814      Prob > F      =  0.0000
    Residual |   38690.771  173811  .222602545      R-squared     =  0.0887
-------------+--------------------------------      Adj R-squared =  0.0887
       Total |  42457.2954  173814  .244268559      Root MSE      =  .47181

  unemployed |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 metrostatus |   .0009567    .006014     0.16   0.874    -.0108306    .0127441
    movedin9 |  -.0963819   .0029287   -32.91   0.000    -.1021221   -.0906417
     veteran |  -.4365441   .0041423  -105.39   0.000    -.4446629   -.4284252
       _cons |   .6403107   .0013098   488.87   0.000     .6377436    .6428778
Table 8: regression of black and unemployed on metrostatus, movedin9 and veteran
We can see that the variables black and unemployed suffer from the same
multicollinearity problem. They are correlated with variables that are omitted from
the smaller models, which violates the assumption of the classical econometric model
that the errors are independent of the Xs.
Dprobit regressions (Table 4) probably do a better job of producing correct
estimates, and therefore I interpret the results using the dprobit output. In model
8, the marginal effect of a one-year increase in age is a decrease of about 1
percentage point in the probability of migration (this estimate is precise because
the SE is very small compared to the statistically significant estimated
coefficient). Being white, working in a manufacturing industry and having lived in a
metropolitan area last year reduce the probability of moving, while being a veteran
and having lived in the current residence for fewer than 10 years increase the
chance of moving to another state.
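For reference, the marginal effect in a probit model is usually defined as follows. This is the standard textbook form, written with the Y, X and β of Section II; Φ and φ (the standard normal CDF and density) are notation introduced here and do not appear elsewhere in the paper.

```latex
\Pr(Y_i = 1 \mid X_i) = \Phi(X_i \beta),
\qquad
\frac{\partial \Pr(Y_i = 1 \mid X_i)}{\partial X_{ij}} = \phi(X_i \beta)\,\beta_j
```

For a dummy regressor the analogous quantity is the discrete change in Φ(X_i β) when the dummy switches from 0 to 1 with the other Xs held fixed. Because φ(X_i β) depends on where it is evaluated, a marginal effect is always reported at a chosen point, commonly the sample means of the Xs, and it is generally smaller in magnitude than the underlying coefficient β_j.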
V. Conclusion
The results obtained from my empirical analysis largely agree with common sense
and with the results of past studies. White, older people who live in a metropolitan
area and work in a manufacturing industry tend to move less, and veterans tend to
move more. Although my data and analysis are far from complete, the conclusions drawn
from this study could potentially help create better migration policies that benefit
the economy as a whole. Future research could be improved by using a panel or time
series data set and including more variables such as education, occupation, distance
and the economic conditions of different states in order to draw clearer conclusions
about the determinants of migration.
References
Bailey, Amy Kate. 2011. "Race, Place, and Veteran Status: Migration among Black and White Men, 1940-2000." Population Research and Policy Review 701-728.
McInnis, Marvin. 1971. "Age, Education and Occupation Differentials in Interregional Migration: Some Evidence for Canada." Demography 195-204.
Mills, Bradford, and Gautam Hazarika. 2001. "The Migration of Young Adults from Non-Metropolitan Counties." American Journal of Agricultural Economics 329-340.
Rogers, Andrei. 1988. "Age Patterns of Elderly Migration: An International Comparison." Demography 355-370.
Sandefur, Gary D. 1985. "Variations in Interstate Migration of Men Across the Early Stages of the Life Cycle." Demography 353-366.
Speare, Alden, Jr., and Frances Kobrin Goldscheider. 1987. "Effects of Marital Status Change on Residential Mobility." Journal of Marriage and Family 455-464.
Wilson, Beth A., E. Helen Berry, Michael B. Toney, Young-Taek Kim and John B. Cromartie. 2009. "A Panel Based Analysis of the Effects of Race/Ethnicity and Other Individual Level Characteristics at Leaving on Returning." Population Research and Policy Review 405-428.