Multiple Linear Regression
NADZIRAH HANIS ZAINORDIN P75182
MARWAN OMAR JALAMBO P75376
ASHOK SIVAJI P77800
DWI BUDININGSARI P75375
HAMZAH WALI P74918
OOI THENG CHOON P75129
HARUNA EMMANUEL P73270
SURESH MANI P77104
Multiple Linear Regression
NNPD 6014
Introduction
In developing countries, high blood pressure (BP) is one of the risk factors for cardiovascular disease (CVD), and an estimated 7.1 million deaths, especially among middle-aged and older adults, are attributed to high BP (Mungreiphy et al. 2011).
Overweight and obesity increase the risk of elevated blood pressure (Drøyvold et al. 2005).
A positive association between BMI and BP has also been reported among Asian populations.
Several studies indicate that high BP is associated with age (Mungreiphy et al. 2011).
Research Question
How well do BMI and age predict systolic blood pressure?
Which is the better predictor of systolic blood pressure: BMI or age?
Research hypothesis
Systolic blood pressure can be predicted by BMI and age in the population.
Statistical hypothesis
H0: β_BMI = 0 and β_age = 0
Ha: at least one of β_BMI, β_age ≠ 0
Normality test
Multiple Linear Regression - Assumptions
Sample size
Multicollinearity
Outliers
Normality, linearity, homoscedasticity, independence of residuals
Assumptions - sample size
Use the formula:
N > 50 + 8m (where m = number of independent variables)
In our case study:
N > 50 + 8(2)
N > 66
Our sample size is 96, so the requirement is met.
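The sample-size rule above can be checked in a few lines. This is only a minimal sketch; the function name `min_sample_size` is made up for illustration, and the values of m and N are the ones reported in the slides.

```python
def min_sample_size(num_predictors: int) -> int:
    """Rule-of-thumb threshold: N must be greater than 50 + 8m."""
    return 50 + 8 * num_predictors


m = 2    # independent variables: BMI and age
n = 96   # sample size reported in the slides

threshold = min_sample_size(m)
sample_ok = n > threshold  # strict inequality, as in N > 50 + 8m

print(f"threshold = {threshold}, N = {n}, adequate = {sample_ok}")
```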
Assumptions - Multicollinearity
To test for multicollinearity between the independent variables, we compute Pearson's correlation coefficient between them.
The correlation between independent variables should not exceed 0.9.
Result: Pearson's correlation coefficient = -0.122, p-value = .236 (not significant).
No meaningful correlation between the independent variables is assumed.
To rule out multicollinearity between the independent variables in MLR, we also check:
1- Variance inflation factor (VIF) < 10
2- Tolerance lies between 0 and 1; a value close to zero indicates multicollinearity
Multicollinearity indicates that a variable is almost a linear combination of the other independent variables.
Results:
Age: Tolerance = 0.985 (far from zero) and VIF = 1.015
BMI: Tolerance = 0.985 (far from zero) and VIF = 1.015
Result: No multicollinearity between the independent variables.
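As a cross-check, with exactly two independent variables the Tolerance of each one is 1 - r², where r is the Pearson correlation between them, and VIF is the reciprocal of Tolerance. The sketch below assumes only the reported correlation of -0.122; it does not use the raw data.

```python
# Pearson correlation between BMI and age, as reported in the slides
r = -0.122

# With two predictors, the R-squared of one predictor regressed on the
# other is simply r squared, so Tolerance = 1 - r**2.
tolerance = 1 - r ** 2
vif = 1 / tolerance

print(f"Tolerance = {tolerance:.3f}, VIF = {vif:.3f}")
```

The rounded values agree with the Tolerance and VIF reported for both age and BMI.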
Assumptions - Outliers
In the Statistics options of Linear Regression, choose Casewise diagnostics with outliers outside 3 standard deviations.
From the table: Std. Residual (Minimum to Maximum) should lie between -3.3 and +3.3.
Our results range from -2.482 to +3.217.
Result: No outliers
Mahalanobis distance identifies multivariate outliers among the independent variables; every case should have a value below the critical value of 13.82.
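The critical value 13.82 can be reproduced by hand. For a chi-square distribution with 2 degrees of freedom the upper-tail probability is exp(-x/2), so the critical value is -2 ln(α). The sketch below relies on that closed form, which holds only for df = 2; the function name is made up for illustration.

```python
import math


def chi2_critical_df2(alpha: float) -> float:
    """Upper-tail critical value of chi-square with df = 2.

    For df = 2 the survival function is exp(-x / 2), so solving
    exp(-x / 2) = alpha gives x = -2 * ln(alpha).
    """
    return -2.0 * math.log(alpha)


for alpha in (0.05, 0.001):
    print(f"alpha = {alpha}: critical value = {chi2_critical_df2(alpha):.2f}")
```

Under this formula, 13.82 corresponds to α = .001, the level commonly used when screening Mahalanobis distances.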
Assumptions - Normality
In the normal probability plot, the points lie along a reasonably straight line from bottom left to top right, which indicates normally distributed residuals.
Assumptions - Linearity
The observations are distributed similarly above and below the fitted line, which indicates that the linearity assumption is met.
Assumptions - Homoscedasticity
For homoscedasticity, we check that the variance of the observations is equal above and below the line in the scatter plot.
In our results, equal variance is assumed.
Test of Independence
Independence of residuals is checked with the Durbin-Watson test.
The Durbin-Watson statistic ranges from zero to four.
Values close to two indicate that the residuals are independent.
Values near zero indicate strong positive autocorrelation, and values near four indicate strong negative autocorrelation.
Here, the independence assumption is satisfied, as the Durbin-Watson value is 1.668.
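For reference, the Durbin-Watson statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below shows that computation; the residual lists are made-up illustrations, not the study data.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum((e_t - e_{t-1})^2) / sum(e_t^2)."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals, for illustration only
same_sign = [1.0, 1.0, 1.0, 1.0]       # strong positive autocorrelation
alternating = [1.0, -1.0, 1.0, -1.0]   # strong negative autocorrelation

print(durbin_watson(same_sign))    # 0.0
print(durbin_watson(alternating))  # 3.0 (approaches 4 as the series grows)
```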
Assumption check
Sample size: met
Multicollinearity: met
Outliers: met
Normality: met
Linearity: met
Homoscedasticity: met
Independence of residuals: met
Multiple linear regression
Using standard multiple regression:
- All independent variables (BMI and age) are entered into the equation simultaneously.
- We want to know how much variance in the dependent variable they explain as a group or block.
Evaluating the model
Adjusted R² = 0.09 x 100 = 9%
Age and BMI together explain 9% of the variance in systolic blood pressure.
ANOVA
df = 2 (regression), 93 (residual)
F-ratio (F) = 5.699
Sig. = 0.005 (p < 0.05): the model significantly predicts the dependent variable
Report as: F(2, 93) = 5.699, p < 0.05
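The reported F-ratio and adjusted R² can be checked against each other, because R² can be recovered from F and its degrees of freedom. This is a rough consistency check using only the figures in the slides, not a re-analysis of the data.

```python
f_stat = 5.699   # reported F-ratio
k = 2            # regression df (number of predictors)
df_resid = 93    # residual df

# F = (R2 / k) / ((1 - R2) / df_resid)  =>  R2 = F*k / (F*k + df_resid)
r_squared = (f_stat * k) / (f_stat * k + df_resid)

n = k + df_resid + 1  # total sample size implied by the df
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"N = {n}, R2 = {r_squared:.3f}, adjusted R2 = {adj_r_squared:.3f}")
```

The implied sample size is 96 and the adjusted R² rounds to 0.09, matching the values reported above.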
Evaluating each of the independent variables
To compare the contribution of each independent variable, we use the standardized Beta values.
The larger Beta for BMI (0.258, p < 0.05) shows that BMI contributes more to explaining systolic blood pressure than age does.
Regression equations
Systolic = 84.769 + (0.983*BMI)+(0.368*age)
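The fitted equation can be applied directly to obtain a predicted value. In the sketch below the function name and the example inputs (BMI 25, age 40) are hypothetical, chosen only to illustrate the arithmetic; the coefficients are those reported in the slides.

```python
def predict_systolic(bmi: float, age: float) -> float:
    """Predicted systolic BP from the fitted regression equation."""
    intercept = 84.769
    b_bmi = 0.983
    b_age = 0.368
    return intercept + b_bmi * bmi + b_age * age


# Hypothetical person, for illustration only
example = predict_systolic(bmi=25.0, age=40.0)
print(f"Predicted systolic BP: {example:.3f}")
```

Since the model explains only about 9% of the variance, individual predictions like this one carry considerable uncertainty.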
Answering the research questions
How well do BMI and age predict systolic blood pressure?
- BMI and age together predict 9% (p < 0.05) of the variance in systolic blood pressure.
Which is the better predictor of systolic blood pressure: BMI or age?
- Both BMI (Beta 0.258) and age (Beta 0.241) are predictors of systolic blood pressure; BMI has the larger Beta.
Conclusion (APA Style)
To examine how well BMI and age predict systolic blood pressure, multiple linear regression (MLR) was performed.
Prior to interpreting the MLR results, several assumptions were evaluated.
First, the sample size was adequate (N = 96, above the threshold of 66).
Second, the absence of multicollinearity was confirmed by Pearson correlation, VIF and Tolerance.
Third, Mahalanobis distance did not exceed the critical χ² for df = 2 (at α = .001) of 13.82 for any case in the data file.
Fourth, inspection of the normal probability plot of standardized residuals and the scatterplot of standardized residuals against standardized predicted values indicated that the assumptions of normality, linearity and homoscedasticity of residuals were met.
Conclusion (continued)
Systolic = 84.769 + (0.983*BMI)+(0.368*age)
The multiple linear regression model predicts 9% of the variance in systolic blood pressure (adjusted R² = 0.09).
This is statistically significant using MLR: F(2, 93) = 5.699, p < 0.05.
The independent variables (age and BMI) are significant predictors of the dependent variable (systolic blood pressure) at alpha = 0.05.
References
Drøyvold, W. B., Midthjell, K., Nilsen, T. I. L. & Holmen, J. 2005. International Journal of Obesity 29: 650-655.
Mungreiphy, N. K., Kapoor, S. & Sinha, R. 2011. Association between BMI, blood pressure, and age: study among Tangkhul Naga tribal males of Northeast India. Journal of Anthropology 2011, Article ID 748147, 6 pages.
Allen, P. & Bennett, K. 2012. SPSS Statistics: A Practical Guide Version 20. Australia: Cengage Learning Australia Pty.
Coakes, S. J. 2013. SPSS Version 20.0 for Windows: Analysis without Anguish. Milton: John Wiley & Sons Ltd.
Morgan, G. A., Leech, N. L., Gloeckner, G. W. & Barrett, K. C. 2013. IBM SPSS for Introductory Statistics: Use and Interpretation (5th edn). New York: Routledge Taylor & Francis Group.
Piaw, C. Y. 2013. Mastering Research Statistics. Selangor: McGraw-Hill Education (Malaysia) Sdn. Bhd.
Chan, Y. H. 2004. Biostatistics 201: Linear regression analysis. Singapore Med J 45(2): 55-61.