ps4_fall2015
7/21/2019 ps4_fall2015
http://slidepdf.com/reader/full/ps4fall2015 1/8
Department of Economics, Columbia University
W3412, Fall 2015
Problem Set 4
Introduction to Econometrics
Profs. Seyhan Erden and Miikka Rokkanen, for all sections.
Part I.
True, False, Uncertain with Explanation:
(a) One can still use a linear regression framework even if the relation between a regressor and the dependent variable is not linear.
(b) Including an interaction term between two independent variables, $X_1$ and $X_2$, allows for the measurement of the effect of a unit increase in $X_1$ and $X_2$, above and beyond the sum of the individual effects of a unit increase in the two variables alone.
(c) To decide whether $Y = \beta_0 + \beta_1 X + u$ or $\ln(Y) = \beta_0 + \beta_1 X + u$ fits the data better, you should examine the regression $R^2$.
Part II.
1. Consider the following multiple regression model:

$$Y_i = \beta_0 + \beta_1 X_{1,i} + \beta_2 X_{2,i} + u_i, \qquad E[u_i \mid X_{1,i}, X_{2,i}] = 0$$

(a) Suppose $X_{2,i} = 2X_{1,i}$. Can you compute the OLS coefficients? Explain.
(b) Assume again that $X_{2,i} = 2X_{1,i}$. Can you write a single variable model $Y_i = \gamma_0 + \gamma_1 X_{1,i} + u_i$ which is equivalent to the multiple regression model above? Can you compute the OLS coefficients of this single variable model? What is the intuition here?
(c) Consider the alternative model: $Y_i = \beta_1 X_{1,i} + \beta_2 X_{2,i} + u_i$, where again $X_{2,i} = 2X_{1,i}$. Can you compute the OLS coefficients in this model? Explain.
(d) Assume again $X_{2,i} = 2X_{1,i}$. Can you write a single variable model $Y_i = \gamma_0 + \gamma_1 X_{1,i} + u_i$ equivalent to the multiple regression model in (c)? Can you compute the OLS coefficients of this single variable model? What is the intuition here?
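As a numerical illustration of the situation in question 1, the sketch below uses made-up data in which the second regressor is exactly twice the first. It is only a sketch under stated assumptions: the data values and the helper `dot` are hypothetical, and the no-intercept case is used because its $X'X$ matrix is 2x2 and its determinant can be written out by hand.

```python
# Hypothetical data, invented for illustration only.
x1 = [1, 2, 3, 4, 5]
x2 = [2 * v for v in x1]   # exact linear dependence: X2 = 2 * X1
y = [3, 5, 8, 9, 12]


def dot(a, b):
    """Sum of element-wise products of two equal-length lists."""
    return sum(p * q for p, q in zip(a, b))


# X'X for the no-intercept model Y = b1*X1 + b2*X2 + u is
# [[s11, s12], [s12, s22]]; OLS needs this matrix to be invertible.
s11, s12, s22 = dot(x1, x1), dot(x1, x2), dot(x2, x2)
det_xtx = s11 * s22 - s12 * s12
print("det(X'X) =", det_xtx)  # zero means X'X cannot be inverted

# Dropping X2 leaves a one-regressor model through the origin,
# whose slope is well defined.
slope = dot(x1, y) / s11
print("slope on X1 alone =", slope)
```

With a zero determinant the two-regressor coefficients are not separately determined, while the one-regressor slope is.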
2. Use Table 2 to answer the following questions. Table 2 presents the results of four regressions, one in each column. Estimate the indicated regressions and fill in the values (you may either handwrite or type the entries in; if you choose to type up the table, an electronic copy of Table 2 in .doc format is available on the course Web site). For example, to fill in column (2), estimate the regression with colGPA as the dependent variable and hsGPA and skipped as the independent variables, using the “robust” option, and fill in the estimated coefficients.
(a) Fill out the table with the necessary numbers; some will be in the STATA output, and some you will need to calculate yourself.
(b) Common sense predicts that your high school GPA (hsGPA) and the number of classes you skipped (skipped) are determinants of your college GPA (colGPA). Use regression (2) to test the hypothesis (at the 5% significance level) that the coefficients on these two economic variables are all zero, against the alternative that at least one coefficient is nonzero.
(c) Find the F-statistic for regression (3) and explain what it is testing.
(d) Find the F-statistic for regression (4) and explain what it is testing.
(e) Are bgfriend (whether you have a boy/girlfriend) and campus (whether you live on campus) jointly significant determinants of college GPA? Use regressions (2) and (4) to test your hypothesis (i.e., use the homoskedasticity-only F-statistic formula, eq. 7.14 in the book, instead of directly testing with STATA).
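The homoskedasticity-only F-statistic mentioned in part (e) can be computed by hand from two $R^2$ values. The sketch below assumes eq. 7.14 refers to the $R^2$ form of that statistic, $F = \frac{(R^2_u - R^2_r)/q}{(1 - R^2_u)/(n - k_u - 1)}$. The function name and the two $R^2$ inputs are hypothetical placeholders, not results from GPA4.dta; only n = 141 comes from Table 1.

```python
def homoskedastic_f(r2_unrestricted, r2_restricted, q, n, k_unrestricted):
    """Homoskedasticity-only F-statistic in its R-squared form.

    q is the number of restrictions, n the sample size, and
    k_unrestricted the number of regressors (excluding the intercept)
    in the unrestricted regression.
    """
    numerator = (r2_unrestricted - r2_restricted) / q
    denominator = (1.0 - r2_unrestricted) / (n - k_unrestricted - 1)
    return numerator / denominator


# Placeholder R-squared values (NOT taken from the data): the unrestricted
# model has 5 regressors, and 2 of them are dropped in the restricted model.
f_stat = homoskedastic_f(0.25, 0.20, q=2, n=141, k_unrestricted=5)
print(round(f_stat, 2))
```

The resulting value would then be compared with the critical value of the $F_{q,\infty}$ distribution at the chosen significance level.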
Table 1
Definitions of Variables in GPA4.dta (data is from the Wooldridge textbook)

| Variable | Definition |
|---|---|
| colGPA | Cumulative college grade point average of a sample of 141 students at Michigan State University in 1994. |
| hsGPA | High school GPA of students. |
| skipped | Average number of classes skipped per week. |
| PC | = 1 if the student owns a personal computer; = 0 otherwise. |
| bgfriend | = 1 if the student answered “yes” to the question about having a boy/girlfriend; = 0 otherwise. |
| campus | = 1 if the student lives on campus; = 0 otherwise. |
Table 2
College GPA Results
Dependent variable: colGPA

| Regressor | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| hsGPA | | | | |
| | ( ) | ( ) | ( ) | ( ) |
| skipped | | | | |
| | ( ) | ( ) | ( ) | ( ) |
| PC | __ | | | |
| | | ( ) | ( ) | ( ) |
| bgfriend | __ | __ | | |
| | | | ( ) | ( ) |
| campus | __ | __ | __ | |
| | | | | ( ) |
| Intercept | | | | |
| | ( ) | ( ) | ( ) | ( ) |
| F-statistics testing the hypothesis that the population coefficients on the indicated regressors are all zero: | | | | |
| hsGPA, skipped | | | | |
| | ( ) | ( ) | ( ) | ( ) |
| hsGPA, skipped, PC | __ | | | |
| | | ( ) | ( ) | ( ) |
| hsGPA, skipped, PC, bgfriend | __ | __ | | |
| | | | ( ) | ( ) |
| bgfriend, campus | __ | __ | __ | |
| | | | | ( ) |
| Regression summary statistics | | | | |
| $R^2$ | | | | |
| $\bar{R}^2$ | | | | |
| Regression RMSE | | | | |
| n | | | | |

Notes: Heteroskedasticity-robust standard errors are given in parentheses under estimated coefficients, and p-values are given in parentheses under F-statistics. The F-statistics are heteroskedasticity-robust.
3. The TeachingRatings data set contains data on course evaluations, course characteristics, and professor characteristics for 463 courses for the academic years 2000-2002 at the University of Texas at Austin. These data were provided by Professor Daniel Hamermesh of the University of Texas at Austin and were used in his paper with Amy Parker, “Beauty in the Classroom: Instructors’ Pulchritude and Putative Pedagogical Productivity,” Economics of Education Review, August 2005, Vol. 24, No. 4, pp. 369-376.

Course_eval: “Course overall” teaching evaluation score, on a scale of 1 (very unsatisfactory) to 5 (excellent)
Beauty: Rating of instructor physical appearance by a panel of six students, averaged across the six panelists, shifted to have mean zero
Female = 1 if the instructor is female, 0 if the instructor is male
Minority = 1 if the instructor is non-White, 0 if the instructor is White
NNenglish = 1 if the instructor is not a native English speaker, 0 if the instructor is a native English speaker
Intro = 1 if the course is introductory (mainly large Freshman and Sophomore courses), 0 if the course is not introductory
Onecredit = 1 if the course is a single-credit elective (yoga, aerobics, dance, etc.), 0 otherwise
Age: Professor’s age

(a) Regress Course_eval on Beauty and female, and test whether all population coefficients are jointly significant at the 5% significance level.
(b) Regress Course_eval on Beauty, female, minority and age, and test whether all population coefficients are jointly significant at the 5% significance level.
(c) Now test whether minority and age are jointly significant at the 1% significance level using the results from part (a) and part (b).
(d) Consider the various control variables in the data set. Which do you think should be included in the regression? Using a table like Table 3, examine the effect of Beauty on Course_eval.
(Hint: Stata does not list adjusted $R^2$ under the robust option. The command to see adjusted $R^2$ is
ereturn list r2_a)
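Adjusted $R^2$ can also be recovered by hand from the ordinary $R^2$, using the standard definition $\bar{R}^2 = 1 - \frac{n-1}{n-k-1}(1 - R^2)$. The sketch below is a minimal illustration of that arithmetic; the function name and the example inputs are hypothetical and are not results from the TeachingRatings data.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k regressors.

    k counts slope coefficients only; the intercept is not included.
    """
    return 1.0 - (n - 1) / (n - k - 1) * (1.0 - r2)


# Placeholder inputs: R-squared of 0.5 with 101 observations and 4 regressors.
example = adjusted_r2(0.5, 101, 4)
print(round(example, 4))  # 0.4792
```

Because the factor $(n-1)/(n-k-1)$ exceeds one whenever $k \ge 1$, the adjusted value is always below the unadjusted $R^2$.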
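Table 3
Teaching Ratings
Dependent variable: Course_eval

| Regressor (standard error below) | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| beauty | | | | |
| | ( ) | ( ) | ( ) | ( ) |
| female | | | | |
| | ( ) | ( ) | ( ) | ( ) |
| minority | __ | | | |
| | | ( ) | ( ) | ( ) |
| nnenglish | __ | __ | | |
| | | | ( ) | ( ) |
| intro | __ | __ | __ | |
| | | | | ( ) |
| onecredit | __ | __ | __ | |
| | | | | ( ) |
| age | __ | __ | __ | |
| | | | | ( ) |
| intercept | | | | |
| | ( ) | ( ) | ( ) | ( ) |
| F-statistics testing the null hypothesis: population coefficients on the following regressors are all zero (p-value below) | | | | |
| beauty, female | | | | |
| | ( ) | ( ) | ( ) | ( ) |
| beauty, female, minority | __ | | | |
| | | ( ) | ( ) | ( ) |
| beauty, female, minority, nnenglish | __ | __ | | |
| | | | ( ) | ( ) |
| intro, onecredit | __ | __ | __ | |
| | | | | ( ) |
| minority, age | __ | __ | __ | |
| | | | | ( ) |
| intro, age | __ | __ | __ | |
| | | | | ( ) |
| Regression summary statistics | | | | |
| $R^2$ | | | | |
| $\bar{R}^2$ | | | | |
| Regression RMSE | | | | |
| n | | | | |

Notes: Heteroskedasticity-robust standard errors are given in parentheses under estimated coefficients, and p-values are given in parentheses under F-statistics. The F-statistics are heteroskedasticity-robust.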
4. The Lawsch85 data set was collected by Kelly Barnett, an MSU economics student, for use in a term project. The data come from two sources: The Official Guide to U.S. Law Schools, 1986, Law School Admission Services, and The Gourman Report: A Ranking of Graduate and Professional Programs in American and International Universities, 1995, Washington, D.C.
(a) Regress salary on north, south, east and west to analyze the effects of region on the salary of law school graduates. What is wrong with this regression? Why can you not do this?
(b) How would you correct the problem in part (a)?
(c) Interpret the coefficient of east under your correction strategy in part (b).
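One way to explore the regression in question 4 is to look at how a full set of region dummies relates to the intercept column. The sketch below uses invented region labels for six schools and assumes the four regions are mutually exclusive and exhaustive, which is an assumption about the data rather than something stated above.

```python
# Hypothetical region assignments for six schools (invented for illustration).
regions = ["north", "south", "east", "west", "east", "north"]
names = ["north", "south", "east", "west"]

# One 0/1 dummy column per region.
dummies = {name: [1 if r == name else 0 for r in regions] for name in names}

# Add the four dummy columns together, observation by observation.
row_sums = [sum(dummies[name][i] for name in names) for i in range(len(regions))]

# The column of ones that an intercept adds to the regressor matrix.
intercept_column = [1] * len(regions)

print("sum of dummies:  ", row_sums)
print("intercept column:", intercept_column)
print("identical:", row_sums == intercept_column)
```

When the summed dummies reproduce the intercept column exactly, the regressors are perfectly collinear.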
5. Does the separation of corporate control from corporate ownership lead to inflated executive salaries and worse firm performance? George Stigler and Claire Friedland have addressed these questions empirically using a sample of firms.¹ A subset of their data is in the file execcomp.dta. The variables in the file are described in Table 4.
Table 4
Definitions of Variables in execcomp.dta

| Variable | Definition |
|---|---|
| ecomp | Average total amount of compensation, in thousands of dollars, for a firm’s top three executives. |
| assets | Firm’s assets in millions of dollars. |
| profits | Firm’s annual profits in millions of dollars. |
| mcontrol | A dummy variable indicating management control of the firm: = 1 for management-controlled firms; = 0 for ownership-controlled firms. |
(a) Regress executives’ compensation on the firm’s assets and profits, the control dummy, and an intercept term. What proportion of the variation in top executives’ compensation in this sample is accounted for by these variables?
(b) If the firm’s profits rise by one million dollars, by how much do you estimate the top executives’ average compensation will change, if assets and the form of control remain fixed?
(c) What is the estimated difference between the expected average compensations of top executives in management-controlled firms and those in ownership-controlled firms, if assets and profits remain fixed?
(d) Regress firm profits on firm assets and the management-control dummy. How much of the variation in the firm’s profits in this sample can be accounted for by the variation in the firm’s assets and the form of control?
(e) Are the empirical results in (a) and (d) consistent with the claim that management control hurts firm performance and leads to higher pay for executives?
1 George J. Stigler and Claire Friedland, “The Literature of Economics: The Case of Berle and Means,” Journal of Law and Economics 26, no. 2 (June 1983): 237-268.
6. Consider the following STATA output on college distances. This dataset contains data from a random sample of high school seniors interviewed in 1980 and re-interviewed in 1986. In this exercise you will use these data to investigate the relationship between the number of completed years of education for young adults and the distance from each student's high school to the nearest four-year college. The variable ed corresponds to years of education, and dist is the distance to the nearest college, measured in tens of miles (for example, dist = 3 means that the high school of the senior is 30 miles from the nearest college).
. reg ed dist, robust

Linear regression                               Number of obs     =      3796
                                                F(  1,  3794)     =     29.83
                                                Prob > F          =    0.0000
                                                R-squared         =    0.0074
                                                Root MSE          =    1.8074

------------------------------------------------------------------------------
             |               Robust
          ed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        dist |  -.0733727   .0134334    -5.46   0.000    -.0997101   -.0470353
       _cons |   13.95586   .0378112   369.09   0.000     13.88172    14.02999
------------------------------------------------------------------------------
(a) A student’s high school was 18 miles from the nearest college. Estimate the number of years of schooling completed.
(b) Compute the 99% confidence interval for the difference in the predicted years of education between a high school senior who is 93 miles from the nearest college and another student who attends a high school that shares a campus with a college. Explain what your solution means in one sentence.
(c) Does distance to the nearest college explain a lot of the variation in educational attainment? Explain.
(d) Suppose distance was measured in kilometers such that 10 miles = 16 kilometers. Replicate the entire STATA output.
(e) Interpret the coefficient of tuition below, where the dependent variable, led, is the natural logarithm of years of education. Give one good explanation for your answer. (Note that tuition is given in $1000.)

Linear regression                               Number of obs     =      3796
                                                F(  3,  3792)     =    151.91
                                                Prob > F          =    0.0000
                                                R-squared         =    0.1001
                                                Root MSE          =    .12236

------------------------------------------------------------------------------
             |               Robust
         led |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     tuition |   .0158511   .0069175     2.29   0.022     .0022887    .0294135
     momcoll |   .0474716   .0063938     7.42   0.000      .034936    .0600071
     dadcoll |   .0749874   .0055234    13.58   0.000     .0641583    .0858164
       _cons |   2.582142   .0065834   392.22   0.000     2.569234    2.595049
------------------------------------------------------------------------------
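Part (d) of question 6 turns on what happens to OLS estimates when a regressor is re-expressed in different units. The sketch below illustrates the general mechanics on invented data, not the college-distance sample. The helper `ols`, the data values, and the choice to multiply the regressor by 1.6 (treating the new unit as tens of kilometers) are all assumptions made for illustration.

```python
def ols(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Hypothetical data, invented for illustration only.
dist_old_units = [0.5, 1.0, 2.0, 3.5, 6.0]
ed = [15, 14, 14, 13, 12]

factor = 1.6  # assumed conversion: new units = factor * old units
dist_new_units = [factor * d for d in dist_old_units]

a_old, b_old = ols(dist_old_units, ed)
a_new, b_new = ols(dist_new_units, ed)

print("old units: intercept", a_old, "slope", b_old)
print("new units: intercept", a_new, "slope", b_new)
print("old slope / factor:", b_old / factor)
```

In this sketch the intercept is unchanged and the slope is divided by the conversion factor, so the fitted values are identical in either unit.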
The following questions will not be graded; they are for you to practice and will be discussed at the recitation:
7. SW Empirical Exercise 6.3
8. SW Exercise 7.1
9. SW Exercise 7.4
10. SW Empirical Exercise 7.1