Multiple Linear Regression

NADZIRAH HANIS ZAINORDIN P75182 MARWAN OMAR JALAMBO P75376 ASHOK SIVAJI P77800 DWI BUDININGSARI P75375 HAMZAH WALI P74918 OOI THENG CHOON P75129 HARUNA EMMANUEL P73270 SURESH MANI P77104 Multiple Linear Regression NNPD 6014

Transcript of Multiple Linear Regression

Page 1: Multiple Linear Regression

NADZIRAH HANIS ZAINORDIN P75182

MARWAN OMAR JALAMBO P75376

ASHOK SIVAJI P77800

DWI BUDININGSARI P75375

HAMZAH WALI P74918

OOI THENG CHOON P75129

HARUNA EMMANUEL P73270

SURESH MANI P77104

Multiple Linear Regression

NNPD 6014

Page 2: Multiple Linear Regression

Introduction

In developing countries, high blood pressure (BP) is one of the risk factors for cardiovascular disease (CVD), and an estimated 7.1 million deaths, especially among middle-aged and older adults, are attributed to high BP (Mungreiphy et al. 2011).

Overweight and obesity increase the risk of elevated blood pressure (Drøyvold et al. 2005).

Page 3: Multiple Linear Regression

Introduction

A positive association between BMI and BP has also been reported among Asian populations.

Several studies indicate that high BP is associated with age (Mungreiphy et al. 2011).

Page 4: Multiple Linear Regression

Research Question

How well do BMI and age predict systolic blood pressure?

Which is the better predictor of systolic blood pressure: BMI or age?

Page 5: Multiple Linear Regression

Research hypothesis

Systolic blood pressure can be predicted by BMI and age in the population.

Statistical hypothesis

H0: β_BMI = 0 and β_age = 0

Ha: at least one of β_BMI, β_age ≠ 0

Page 6: Multiple Linear Regression

Normality test

Page 7: Multiple Linear Regression

Multiple Linear Regression - Assumptions

Sample size

Multicollinearity

Outliers

Normality, linearity, homoscedasticity, independence of residuals

Page 8: Multiple Linear Regression

Assumptions- sample size

Use the formula:

N > 50 + 8m (where m = number of independent variables)

In our case study:

N > 50 + 8(2)

N > 66

Our sample size is 96, so the requirement is met.
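The rule of thumb above can be checked with a few lines of Python. This is a minimal sketch; the function name `min_sample_size` is an illustrative choice and does not come from the slides.

```python
def min_sample_size(m: int) -> int:
    """Threshold from the rule of thumb N > 50 + 8m (m = number of independent variables)."""
    return 50 + 8 * m


n_predictors = 2   # BMI and age
sample_size = 96   # sample size reported in the case study

threshold = min_sample_size(n_predictors)
print(threshold)                # 66
print(sample_size > threshold)  # True
```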

Page 9: Multiple Linear Regression

Assumptions- Multicollinearity

To test for multicollinearity between the independent variables, we compute Pearson's correlation coefficient between them.

Page 10: Multiple Linear Regression

Assumptions- Multicollinearity

Correlation between the independent variables:

The correlation should not exceed 0.9.

The result indicates Pearson's correlation coefficient = -0.122 (not significant; p = .236).

No correlation between the independent variables is assumed.

Page 11: Multiple Linear Regression

Assumptions- Multicollinearity

To rule out multicollinearity between the independent variables in MLR:

1- Variance inflation factor (VIF) should be < 10

2- Tolerance lies between 0 and 1; a value close to zero indicates multicollinearity

Multicollinearity means that a variable is almost a linear combination of the other independent variables.

Results:

Age: Tolerance is far from zero (= 0.985) and VIF = 1.015

BMI: Tolerance is far from zero (= 0.985) and VIF = 1.015

Result: No multicollinearity is assumed (the variables are independent)
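With only two independent variables, tolerance and VIF follow directly from the Pearson correlation between them, because the R² obtained by regressing one predictor on the other equals r². The short check below is a sketch under that two-predictor assumption, using the reported r = -0.122; the variable names are illustrative.

```python
r = -0.122  # Pearson correlation between BMI and age, as reported in the slides

# With exactly two predictors, R^2 of one predictor regressed on the other is r^2.
tolerance = 1 - r ** 2
vif = 1 / tolerance

print(round(tolerance, 3))  # 0.985
print(round(vif, 3))        # 1.015
```

Both values match the reported results for age and BMI.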

Page 12: Multiple Linear Regression

Assumptions- Outliers

In the Statistics options of Linear Regression, choose casewise diagnostics with outliers outside 3 standard deviations.

From the table: Std. Residual should lie between -3.3 and +3.3 (minimum to maximum).

Our results range from -2.482 to +3.217.

Result: No outliers

Page 13: Multiple Linear Regression

Assumptions- Outliers

Mahalanobis distance is used to detect multivariate outliers among the independent variables; each value should be less than the critical value of 13.82.
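The cut-off of 13.82 is the chi-square critical value for df = 2 (two independent variables) at α = .001. For df = 2 the chi-square upper-tail probability is exp(-x/2), so the critical value can be reproduced without statistical tables. The sketch below assumes that closed form; the distances in the example are hypothetical, not the study data.

```python
import math


def chi2_critical_df2(alpha: float) -> float:
    """Upper-tail chi-square critical value for df = 2.

    For df = 2 the survival function is exp(-x / 2),
    so solving exp(-x / 2) = alpha gives x = -2 * ln(alpha).
    """
    return -2.0 * math.log(alpha)


def flag_outliers(distances, critical):
    """Return the Mahalanobis distances that exceed the critical value."""
    return [d for d in distances if d > critical]


critical = chi2_critical_df2(0.001)
print(round(critical, 2))  # 13.82

# Hypothetical Mahalanobis distances, for illustration only.
example_distances = [1.2, 5.9, 14.3]
print(flag_outliers(example_distances, critical))  # [14.3]
```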

Page 14: Multiple Linear Regression

Assumptions- Normality

Normal distribution: the points lie in a straight line from bottom left to top right.

Page 15: Multiple Linear Regression

Assumptions- Linearity

The observations are distributed similarly above and below the fitted line, which indicates that the linearity assumption is met.

Page 16: Multiple Linear Regression

Assumptions-homoscedasticity

For homoscedasticity, we check that the observations in the scatter plot have equal variance above and below the line.

In our results, equal variance is assumed.

Page 17: Multiple Linear Regression

Assumptions-homoscedasticity

Page 18: Multiple Linear Regression

Test of Independence

By the Durbin-Watson test:

The Durbin-Watson statistic ranges from zero to four.

Values close to two indicate that the residuals are independent.

Values near zero indicate strong positive autocorrelation, and values near four indicate strong negative autocorrelation.

Page 19: Multiple Linear Regression

Test of Independence

Here, the independence assumption is satisfied, as the Durbin-Watson value is 1.668.
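For reference, the Durbin-Watson statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below implements that definition in plain Python; the residual series are hypothetical and only illustrate the two extremes of the scale, not the study data.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals, for illustration only.
alternating = [1.0, -1.0, 1.0, -1.0]  # sign flips each step: negative autocorrelation
constant = [1.0, 1.0, 1.0, 1.0]       # identical neighbours: positive autocorrelation

print(durbin_watson(alternating))  # 3.0 (approaches 4 for longer alternating series)
print(durbin_watson(constant))     # 0.0
```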

Page 20: Multiple Linear Regression

Assumption

Assumption Check

Sample size

Multicollinearity

Outliers

Normality

Linearity

Homoscedasticity

Independence of residuals

Multiple linear regression

Page 21: Multiple Linear Regression

Multiple linear regression

Using standard multiple regression:

- All independent variables (BMI and age) are entered into the equation simultaneously.

- We want to know how much variance in the dependent variable they explain as a group or block.

Page 22: Multiple Linear Regression

Evaluating the model

Adjusted R² = 0.09 × 100 = 9%

Age and BMI explain 9% of the variance in systolic blood pressure.

Page 23: Multiple Linear Regression

ANOVA

df = 2 (regression), 93 (residual)

F-ratio (F) = 5.699

Sig. = 0.005 (p < 0.05): the model significantly predicts the dependent variable

Report as: F(2, 93) = 5.699, p < 0.05
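The reported figures can be cross-checked against each other. R² follows from the F-ratio and its degrees of freedom, adjusted R² follows from R², and for exactly two numerator degrees of freedom the upper-tail F probability has a closed form. The sketch below relies on those standard identities and uses only the values reported in the slides; the variable names are illustrative.

```python
f_ratio = 5.699
df_model = 2    # regression degrees of freedom (BMI and age)
df_resid = 93   # residual degrees of freedom
n = df_model + df_resid + 1  # 96 cases

# R^2 recovered from F = (R^2 / df_model) / ((1 - R^2) / df_resid)
r_squared = (f_ratio * df_model) / (f_ratio * df_model + df_resid)

# Adjusted R^2 penalises R^2 for the number of predictors
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - df_model - 1)

# Closed form valid only when the numerator df is 2:
# P(F > f) = (1 + 2f / df_resid) ** (-df_resid / 2)
p_value = (1 + df_model * f_ratio / df_resid) ** (-df_resid / 2)

print(round(r_squared, 3))      # 0.109
print(round(adj_r_squared, 3))  # 0.09
print(round(p_value, 3))        # 0.005
```

The recovered adjusted R² and p-value agree with the reported 0.09 and 0.005.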

Page 24: Multiple Linear Regression

Evaluating each of the independent variable

To compare the contribution of each independent variable, we use the standardized Beta value.

The larger Beta for BMI (0.258, p < 0.05) shows that BMI contributes more to explaining systolic blood pressure than age (Beta 0.241).

Page 25: Multiple Linear Regression

Regression equations

Systolic = 84.769 + (0.983*BMI)+(0.368*age)
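The regression equation can be applied directly to obtain a predicted value. The sketch below wraps it in a function; the function name and the example inputs (BMI 25, age 40) are hypothetical, and the units are presumably kg/m² for BMI, years for age and mmHg for systolic pressure, which the slides do not state.

```python
def predict_systolic(bmi: float, age: float) -> float:
    """Predicted systolic blood pressure from the fitted equation
    Systolic = 84.769 + 0.983 * BMI + 0.368 * age."""
    return 84.769 + 0.983 * bmi + 0.368 * age


# Hypothetical person, for illustration only.
print(round(predict_systolic(bmi=25, age=40), 3))  # 124.064
```

Because adjusted R² is only 0.09, individual predictions from this equation carry considerable uncertainty.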

Page 26: Multiple Linear Regression

Answering research question

How well do BMI and age predict systolic blood pressure?

- BMI and age explain 9% (p < 0.05) of the variance in systolic blood pressure.

Which is the better predictor of systolic blood pressure: BMI or age?

- Both BMI (Beta 0.258) and age (Beta 0.241) are predictors of systolic blood pressure; BMI has the slightly larger Beta.

Page 27: Multiple Linear Regression

Conclusion (APA Style)

To assess how well BMI and age predict systolic blood pressure, multiple linear regression (MLR) was performed.

Prior to interpreting the MLR results, several assumptions were evaluated.

First, the sample size (N = 96) exceeded the required minimum (N > 66).

Second, the absence of multicollinearity was confirmed by Pearson correlation, VIF and tolerance.

Third, Mahalanobis distance did not exceed the critical χ² for df = 2 (at α = .001) of 13.82 for any cases in the data file.

Fourth, inspection of the normal probability plot of standardized residuals and the scatterplot of standardized residuals against standardized predicted values indicated that the assumptions of normality, linearity and homoscedasticity of residuals were met.

Page 28: Multiple Linear Regression

Conclusion (continued)

Systolic = 84.769 + (0.983*BMI) + (0.368*age)

The multiple linear regression model explains 9% of the variance in systolic blood pressure (adjusted R² = 0.09).

The model is statistically significant: F(2, 93) = 5.699, p < 0.05.

The independent variables (age and BMI) are significant predictors of the dependent variable (systolic blood pressure) at alpha = 0.05.

Page 29: Multiple Linear Regression

References

Drøyvold, W. B., Midthjell, K., Nilsen, T. I. L. & Holmen, J. 2005. International Journal of Obesity 29: 650–655.

Mungreiphy, N. K., Kapoor, S. & Sinha, R. 2011. Association between BMI, blood pressure, and age: study among Tangkhul Naga tribal males of Northeast India. Journal of Anthropology 2011: Article ID 748147.

Allen, P. & Bennett, K. 2012. SPSS Statistics: A Practical Guide Version 20. Australia: Cengage Learning Australia Pty.

Coakes, S. J. 2013. SPSS Version 20.0 for Windows: Analysis without Anguish. Milton: John Wiley & Sons Ltd.

Morgan, G. A., Leech, N. L., Gloeckner, G. W. & Barrett, K. C. 2013. IBM SPSS for Introductory Statistics: Use and Interpretation (5th edn). New York: Routledge Taylor & Francis Group.

Piaw, C. Y. 2013. Mastering Research Statistics. Selangor: McGraw-Hill Education (Malaysia) Sdn. Bhd.

Chan, Y. H. 2004. Biostatistics 201: Linear regression analysis. Singapore Med J. 45(2): 55-61.