Regression using lm lmRegression.R Basics Prediction World Bank CO2 Data.
Regression using lm (lmRegression.R)
• Basics
• Prediction
• World Bank CO2 Data
Simple linear regression
• Simple linear model: y = b1 + x b2 + error
• y: the dependent variable
• x: the independent variable
• b1, b2: intercept and slope coefficients
• error: random departures between the model and the response
• Coefficients are estimated by least squares
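As a minimal sketch of what "estimated by least squares" means for the simple model, the snippet below simulates data and checks that lm reproduces the closed-form estimates. The true coefficients (2 and 0.5), the seed, and the variable names are assumptions made for illustration; they do not come from the slides.

```r
# Simulated data; the true intercept (2) and slope (0.5) are illustrative assumptions
set.seed(1)
x <- seq(1, 20, length.out = 50)
y <- 2 + 0.5 * x + rnorm(50, sd = 1)

# Closed-form least-squares estimates for y = b1 + x b2 + error
b2 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)  # slope
b1 <- mean(y) - b2 * mean(x)                                     # intercept

# The same estimates from lm
fit <- lm(y ~ x)
coef(fit)
c(intercept = b1, slope = b2)
```

The two printed vectors should agree up to floating-point rounding.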
Multiple regression
• y = b0 + x1 b1 + x2 b2 + x3 b3 + … + error (here b0 is the intercept and b1, b2, b3, … are the slopes)
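The multiple regression model can be sketched the same way. The example below, on simulated data with made-up coefficients, solves the least-squares normal equations directly and compares the result with lm. The three predictors, their coefficients, and the noise level are all assumptions for illustration.

```r
# Simulated predictors and response; all numeric values are illustrative assumptions
set.seed(2)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)
y  <- 1 + 2 * x1 - 1 * x2 + 0.5 * x3 + rnorm(n, sd = 0.3)

# Design matrix with a column of ones for the intercept b0
X <- cbind(1, x1, x2, x3)

# Least-squares estimates from the normal equations (X'X) b = X'y
bHat <- solve(t(X) %*% X, t(X) %*% y)

# The same fit using lm
fit <- lm(y ~ x1 + x2 + x3)
cbind(normalEquations = as.vector(bHat), lm = unname(coef(fit)))
```

lm itself uses a QR decomposition rather than the normal equations, which is numerically safer; the direct solve here is only meant to show what is being estimated.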
Annual Boulder Temperatures
• Temperature is the dependent variable; Year is the independent variable
• Errors = ???? Linear = ???
CO2 Emissions by Country
• Independent: GDP/capita
• Dependent: CO2 emission
• Linear?? Errors??
The R lm function
• Takes a formula to describe the regression, where ~ plays the role of the equals sign (read it as "is modeled as")
• Works best when the data set is a data frame
• Returns a complicated list that can be used in summary, predict, print, and plot
lmFit <- lm(y ~ x1 + x2)
Or more generally using a data frame:
lmFit <- lm(y ~ x1 + x2, data = dataset)
where y, x1, and x2 are looked up as dataset$y, dataset$x1, dataset$x2
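A short, hedged illustration of this workflow follows. The data frame here is simulated (its column values and the coefficients used to generate y are assumptions, not the course data), and newData is a hypothetical set of predictor values chosen only to show predict.

```r
# Simulated data frame standing in for a real data set (values are assumptions)
set.seed(3)
dataset <- data.frame(x1 = runif(40), x2 = runif(40))
dataset$y <- 1 + 2 * dataset$x1 - 3 * dataset$x2 + rnorm(40, sd = 0.2)

# Fit the regression; variables in the formula are looked up in 'dataset'
lmFit <- lm(y ~ x1 + x2, data = dataset)

print(lmFit)     # the call and the estimated coefficients
summary(lmFit)   # standard errors, t-tests, R-squared
names(lmFit)     # components of the returned list

# Predictions at new predictor values, with 95% prediction intervals
newData <- data.frame(x1 = c(0.2, 0.8), x2 = c(0.5, 0.5))
pred <- predict(lmFit, newdata = newData, interval = "prediction")
pred

# plot(lmFit) would draw the standard diagnostic plots
```

The columns of newData must carry the same names as the predictors in the formula, otherwise predict cannot match them.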
Analysis of World Bank data set
• Best to work on a log scale; GDP has the strongest linear relationship with CO2 emissions
• Some additional pattern left over in the residuals
• Try other variables
• Try a more complex curve
• Check the predictions using cross-validation
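The log-scale fit and the "more complex curve" idea can be sketched as below. The World Bank data are not reproduced in this transcript, so the example uses a synthetic stand-in: the data frame wb, its columns GDP and CO2, and every number used to generate them are assumptions for illustration only.

```r
# Synthetic stand-in for the World Bank data (all values are assumptions)
set.seed(4)
n  <- 80
wb <- data.frame(GDP = exp(runif(n, log(500), log(50000))))
lg <- log(wb$GDP)
wb$CO2 <- exp(-9 + 1.2 * lg - 0.15 * (lg - 8.5)^2 + rnorm(n, sd = 0.4))

# Straight-line fit on the log scale
fitLinear <- lm(log(CO2) ~ log(GDP), data = wb)

# Residuals from the straight-line fit; a trend against log(GDP) suggests leftover pattern
res <- residuals(fitLinear)
# plot(log(wb$GDP), res) would show the pattern graphically

# A more complex curve: add a quadratic term in log(GDP)
fitCurve <- lm(log(CO2) ~ log(GDP) + I(log(GDP)^2), data = wb)

# F-test comparing the straight line with the quadratic curve
anova(fitLinear, fitCurve)
```

The I() wrapper is needed so that ^2 is treated as arithmetic squaring rather than as formula syntax.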
Leave-one-out Cross-validation
• Robust way to check a model's predictions and the uncertainty measure
• Four steps:
1. Sequentially leave out each observation
2. Refit the model with the remaining data
3. Predict the omitted observation
4. Compare the prediction and confidence interval to the actual observation
A check on the consistency of the statistical model, because the omitted observation is not used to make the prediction
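The four steps can be sketched as a simple loop. This is one possible implementation under stated assumptions, not the code from lmRegression.R: the data are simulated, the names dat, loo, cvError, and covered are hypothetical, and the interval is taken to be a 95% prediction interval (interval = "prediction"), on the assumption that the interval in step 4 is meant to cover a single new observation.

```r
# Simulated data (all values are assumptions for illustration)
set.seed(5)
n   <- 30
dat <- data.frame(x = runif(n, 0, 10))
dat$y <- 1 + 0.8 * dat$x + rnorm(n, sd = 1)

# One row per omitted observation: prediction and interval limits
loo <- matrix(NA_real_, nrow = n, ncol = 3,
              dimnames = list(NULL, c("fit", "lwr", "upr")))

for (i in seq_len(n)) {
  # Steps 1 and 2: leave out observation i and refit with the remaining data
  fitI <- lm(y ~ x, data = dat[-i, ])
  # Step 3: predict the omitted observation, with a 95% prediction interval
  loo[i, ] <- predict(fitI, newdata = dat[i, , drop = FALSE],
                      interval = "prediction", level = 0.95)
}

# Step 4: compare predictions and intervals to the actual observations
cvError <- dat$y - loo[, "fit"]
covered <- dat$y >= loo[, "lwr"] & dat$y <= loo[, "upr"]

sqrt(mean(cvError^2))  # cross-validated root mean squared error
mean(covered)          # fraction of observations inside their interval
```

If the model and its uncertainty measure are consistent, the fraction covered should be close to the nominal 0.95, allowing for sampling variability with only 30 points.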