Simple Linear Regression

Post on 29-Sep-2015

Transcript of Simple Linear Regression

  • Quantitative Business Analysis for Decision Making: Simple Linear Regression

    403.7

  • Lecture Outline

    Scatter Plots; Correlation Analysis; Simple Linear Regression Model; Estimation and Significance Testing; Coefficient of Determination; Confidence and Prediction Intervals; Analysis of Residuals

  • Regression Analysis

    Regression analysis models the mean of a response variable Y as a function of predictor variables X1, X2, ..., Xk. When k = 1, it is called simple regression analysis.

  • Random Sample

    Y: response variable; X: predictor variable.

    For each unit in a random sample of size n, the pair (X, Y) is observed, resulting in a random sample: (x1, y1), (x2, y2), ..., (xn, yn).

  • Scatter Plot

    A scatter plot is a graphical display of the sample (x1, y1), (x2, y2), ..., (xn, yn) as n points in two dimensions.

    It suggests whether there is a relationship between X and Y.

  • A Scatter Plot Showing Linear Trend

  • A Scatter Plot Showing No Linear Trend

  • Modeling Linear Trend

    A perfect linear relationship between Y and X exists if $Y = \beta_0 + \beta_1 x$. The coefficient $\beta_1$ is the slope: it quantifies the change in y corresponding to a one-unit change in x. There are no perfect linear relationships in practice.

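  • Simple Linear Regression Model

    Model: $Y = \beta_0 + \beta_1 x + \varepsilon$

    $\beta_0 + \beta_1 x$ is a linear (nonrandom) function of x, and $\varepsilon$ is the random error. The error is assumed to be normally distributed with mean 0 and standard deviation $\sigma$. So $\beta_0$, $\beta_1$, and $\sigma$ are the parameters of the model.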
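
    To make the model concrete, here is a minimal sketch that draws a sample from it. The parameter values (beta0 = 2.0, beta1 = 0.5, sigma = 1.0), the x values, and the helper name `simulate_sample` are all hypothetical and chosen only for illustration; they do not come from the lecture.

```python
import random


def simulate_sample(x_values, beta0, beta1, sigma, seed=0):
    """Draw y = beta0 + beta1 * x + error, with error ~ Normal(0, sigma)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [(x, beta0 + beta1 * x + rng.gauss(0.0, sigma)) for x in x_values]


# Hypothetical parameter values, for illustration only
sample = simulate_sample([1, 2, 3, 4, 5], beta0=2.0, beta1=0.5, sigma=1.0)
for x, y in sample:
    print(f"x = {x}, y = {y:.2f}")

# With sigma = 0 there is no random error, so every point lies on the line
exact = simulate_sample([1, 2, 3], beta0=2.0, beta1=0.5, sigma=0.0)
print(exact)
```

    Setting sigma to 0 reproduces the "perfect linear relationship" case; any positive sigma scatters the points around the line.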

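  • Estimation

    Simple linear regression analysis estimates the mean of Y (the linear trend) by the least-squares line

    $\hat{y} = b_0 + b_1 x$, where $b_1 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$ and $b_0 = \bar{y} - b_1 \bar{x}$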
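
    A minimal sketch of the estimation step, assuming the usual least-squares formulas (slope = Sxy / Sxx, intercept = y-bar minus slope times x-bar). The five data points and the function name `fit_line` are made up for illustration.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least-squares line y-hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)                         # spread of x
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))   # co-variation
    b1 = sxy / sxx            # estimated slope
    b0 = y_bar - b1 * x_bar   # estimated intercept
    return b0, b1


# Made-up illustrative data (not from the lecture)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = fit_line(x, y)
print(f"y-hat = {b0:.2f} + {b1:.2f} x")  # prints "y-hat = 2.20 + 0.60 x"
```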

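  • Standard Deviation

    The standard deviation (s) of the n sample points in the scatter plot around the estimated regression line is:

    $s = \sqrt{\dfrac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - 2}}$, where $\hat{y}_i$ is the value of the estimated line at $x_i$.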
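
    As a rough sketch, s can be computed from the residual sum of squares with n - 2 in the denominator. The data and the names `fit_line` and `residual_std` are illustrative assumptions, not part of the lecture.

```python
import math


def fit_line(x, y):
    """Least-squares intercept and slope (b0, b1)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def residual_std(x, y):
    """Standard deviation of the points around the fitted line."""
    b0, b1 = fit_line(x, y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (len(x) - 2))  # n - 2: two estimated coefficients


# Made-up illustrative data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
s = residual_std(x, y)
print(f"s = {s:.4f}")
```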

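  • Testing the Slope of Linear Trend

    For testing $H_0: \beta_1 = 0$ against $H_1: \beta_1 \neq 0$, compute the t-statistic and its p-value:

    $t = \dfrac{b_1}{s / \sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}}$, with $n - 2$ degrees of freedom, where $b_1$ is the estimated slope and $s$ is the standard deviation around the estimated line.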
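
    The test can be sketched as follows, assuming the standard form t = slope / (s / sqrt(Sxx)) with n - 2 degrees of freedom. To stay within the Python standard library, this sketch compares |t| with a critical value copied from a t table instead of computing a p-value; the data are made up.

```python
import math

# Made-up illustrative data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx              # estimated slope
b0 = y_bar - b1 * x_bar     # estimated intercept

# Standard deviation of the points around the fitted line
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

se_b1 = s / math.sqrt(sxx)  # standard error of the slope
t_stat = b1 / se_b1

# Two-sided 5% critical value for n - 2 = 3 degrees of freedom (from a t table)
t_crit = 3.182
reject_h0 = abs(t_stat) > t_crit
print(f"t = {t_stat:.3f}, reject H0 at the 5% level: {reject_h0}")
```

    With only five points, even a visible upward trend does not reach significance here; a statistics package would report the exact p-value.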

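  • Coefficient of Determination: $R^2$

    $R^2$ quantifies how well the estimated model fits: it is the proportion of the variation in Y that is explained by the linear trend.

    Rule of thumb used in this lecture: $R^2 > 85\%$ means the model is considered significant; $R^2 < 85\%$ means the model is perceived as inadequate.

    A low $R^2$ suggests a need for additional predictors for modeling the mean of Y.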

  • Correlation Coefficient: r

    The correlation coefficient r is the square root of $R^2$, taken with the sign of the estimated slope $b_1$. It is a number between -1 and 1.

    The closer r is to -1 or 1, the stronger the linear trend.

    Its sign is positive for an increasing trend (slope $b_1$ is positive) and negative for a decreasing trend (slope $b_1$ is negative).
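
    A short sketch relating the two quantities, assuming R^2 = 1 - SSE / SST and the usual Pearson formula r = Sxy / sqrt(Sxx * Syy). The data are made up; the point is only that the signed square root of R^2 matches r computed directly.

```python
import math

# Made-up illustrative data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)   # total variation in y (SST)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Unexplained variation around the fitted line
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
r_squared = 1 - sse / syy

# r carries the sign of the slope
r_from_r2 = math.copysign(math.sqrt(r_squared), b1)
r_direct = sxy / math.sqrt(sxx * syy)

print(f"R^2 = {r_squared:.3f}, r = {r_direct:.4f}")
```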

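  • Confidence and Prediction Intervals

    To estimate the mean of Y by a confidence interval, or to predict the response Y, at predictor value $x = x_0$:

    1. Compute the point estimate: $\hat{y}_0 = b_0 + b_1 x_0$

    2. Compute the interval: $\hat{y}_0 \pm t_{\alpha/2,\, n-2} \cdot SE$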

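  • What Is SE?

    SE is the standard error used in the interval at $x = x_0$. Here $s$ is the standard deviation around the estimated line.

    For estimating the mean of Y: $SE = s \sqrt{\dfrac{1}{n} + \dfrac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}$

    For predicting Y: $SE = s \sqrt{1 + \dfrac{1}{n} + \dfrac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}$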
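
    Both intervals can be sketched together, assuming the standard textbook standard errors for the mean response and for a new observation. The data, the choice x0 = 4, and the tabulated critical value are assumptions made for illustration.

```python
import math

# Made-up illustrative data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

x0 = 4            # predictor value of interest (arbitrary choice)
t_crit = 3.182    # two-sided 95% t critical value, n - 2 = 3 df (from a t table)

# Step 1: point estimate at x0
y0_hat = b0 + b1 * x0

# Step 2: standard errors, then intervals
leverage = 1 / n + (x0 - x_bar) ** 2 / sxx
se_mean = s * math.sqrt(leverage)        # for the mean of Y at x0
se_pred = s * math.sqrt(1 + leverage)    # for a single new Y at x0

ci = (y0_hat - t_crit * se_mean, y0_hat + t_crit * se_mean)
pi = (y0_hat - t_crit * se_pred, y0_hat + t_crit * se_pred)
print(f"95% confidence interval: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"95% prediction interval: ({pi[0]:.2f}, {pi[1]:.2f})")
```

    The prediction interval is always wider than the confidence interval because it also accounts for the random error of an individual observation.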

  • Analysis of Residuals

    Residuals are defined as $e_i = y_i - \hat{y}_i$, the observed response minus the value of the estimated line at $x_i$.

    Residual analysis is used to check the normality and homogeneity-of-variance assumptions on the random errors. A histogram or box plot of the residuals helps ascertain whether the errors are normally distributed.

  • Analysis of Residuals (cont.)

    A plot of the residuals against the observed predictor values xi helps ascertain the homogeneity assumption.

    Random appearance: the homogeneity-of-variance assumption is valid.

    Non-random appearance: the homogeneity assumption is not valid, and the variance depends on the predictor values.
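
    A minimal sketch of how the residuals used in these checks might be computed, assuming a least-squares fit on made-up data. It only lists the residuals against x; feeding the same values to any plotting tool would give the histogram, box plot, or residual plot described above.

```python
# Made-up illustrative data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

# Least-squares fit
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Residual = observed y minus fitted value
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Residuals against predictor values: look for patterns or changing spread
for xi, ei in zip(x, residuals):
    print(f"x = {xi}: residual = {ei:+.2f}")

# Least-squares residuals always sum to zero (up to rounding)
print(f"sum of residuals = {sum(residuals):.1e}")
```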
