Correlation / regression


Transcript of Correlation / regression

Page 1: Correlation / regression

• Correlation

• Regression

• Multiple Regression

• Curve fitting

Correlation / regression

Page 2: Correlation / regression

Correlation

• Represents the relationship between two measurements

– Examples : height and weight, education level and income, BMI and skin fold thickness, wealth and fertility

– Correlation does not mean that one measurement causes the other; it is often present because both measurements are influenced by a common factor

– The value is from -1 to 1

• 0 = no linear relationship

• 1 = perfect relationship

• -1 = perfect inverse relationship

Page 3: Correlation / regression
Page 4: Correlation / regression

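$$r = \frac{\mathrm{Sxy}}{\sqrt{\mathrm{SSqx} \cdot \mathrm{SSqy}}}$$

$$\mathrm{SSqx} = \sum x^2 - \frac{\left(\sum x\right)^2}{n}$$

$$\mathrm{SSqy} = \sum y^2 - \frac{\left(\sum y\right)^2}{n}$$

$$\mathrm{Sxy} = \sum xy - \frac{\sum x \sum y}{n}$$

Where sumx = $\sum x$, sumx2 = $\sum x^2$, sumy = $\sum y$, sumy2 = $\sum y^2$, sumxy = $\sum xy$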
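As a rough illustration, the correlation coefficient can be computed from the five running sums (sumx, sumx2, sumy, sumy2, sumxy). This is a minimal sketch; the function name `correlation` and the data values are made up for the example.

```python
import math

def correlation(xs, ys):
    """Pearson r computed from the running sums sumx, sumx2, sumy, sumy2, sumxy."""
    n = len(xs)
    sumx = sum(xs)
    sumy = sum(ys)
    sumx2 = sum(x * x for x in xs)
    sumy2 = sum(y * y for y in ys)
    sumxy = sum(x * y for x, y in zip(xs, ys))
    ssqx = sumx2 - sumx ** 2 / n      # sum of squares of x about its mean
    ssqy = sumy2 - sumy ** 2 / n      # sum of squares of y about its mean
    sxy = sumxy - sumx * sumy / n     # sum of products about the means
    return sxy / math.sqrt(ssqx * ssqy)

# Made-up illustrative data (not from the slides)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation(xs, ys)
print(round(r, 4))
```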

Page 5: Correlation / regression

Confidence interval for r

• Correlation is not a linear measurement

– It stretches near 0 and compresses near 1 or -1

– It has to be

• Transformed into a normally distributed linear measurement

• Have Standard Error estimated

• Have CI estimated

• Transformed back to the original format

Page 6: Correlation / regression

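$$z = 0.5 \cdot \mathrm{Log}\left(\frac{1 + r}{1 - r}\right)$$

$$SE = \frac{1}{\sqrt{n - 3}}$$

$$F = z - 1.96 \cdot SE \qquad G = z + 1.96 \cdot SE$$

$$95\%\,CI = \frac{\exp(2F) - 1}{\exp(2F) + 1} \ \text{to} \ \frac{\exp(2G) - 1}{\exp(2G) + 1}$$

Where Log is the natural logarithm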
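The four steps (transform, estimate SE, estimate CI, transform back) might be sketched as follows. The function name `correlation_ci` and the example values r = 0.5, n = 28 are assumptions chosen for illustration; 1.96 is the z value for a 95% interval.

```python
import math

def correlation_ci(r, n, z_crit=1.96):
    """Confidence interval for r via the z transformation."""
    z = 0.5 * math.log((1 + r) / (1 - r))   # transform r to z
    se = 1 / math.sqrt(n - 3)               # standard error of z
    f = z - z_crit * se                     # lower limit on the z scale
    g = z + z_crit * se                     # upper limit on the z scale
    # transform both limits back to the correlation scale
    lower = (math.exp(2 * f) - 1) / (math.exp(2 * f) + 1)
    upper = (math.exp(2 * g) - 1) / (math.exp(2 * g) + 1)
    return lower, upper

# Hypothetical example: r = 0.5 observed in 28 pairs
lower, upper = correlation_ci(0.5, 28)
print(round(lower, 3), round(upper, 3))
```

Note that the interval is not symmetric around r, which is the reason for working on the transformed scale.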

Page 7: Correlation / regression

Sample size

• Iterative procedure that satisfies two equations

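$$n = \left(\frac{z_a + z_b}{m}\right)^2 + 3$$

$$m = \frac{1}{2}\,\mathrm{Log}\left(\frac{1 + r}{1 - r}\right) + \frac{r}{2(n - 1)}$$

Where
z_a = z value for Type I error
z_b = z value for Type II error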
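Because n appears in both equations, one way to satisfy them is to alternate between the two until n stops changing. The sketch below assumes the pair n = ((z_a + z_b)/m)^2 + 3 and m = 0.5·Log((1 + r)/(1 - r)) + r/(2(n - 1)), a positive r, and default z values of 1.96 (two-sided Type I error of 0.05) and 0.8416 (power of 0.8); the function name, starting guess, and tolerance are all assumptions for illustration.

```python
import math

def sample_size_for_r(r, z_a=1.96, z_b=0.8416, tol=1e-6, max_iter=100):
    """Iterate between the two equations until n converges (r must be > 0)."""
    base = 0.5 * math.log((1 + r) / (1 - r))
    n = ((z_a + z_b) / base) ** 2 + 3        # first guess, ignoring the r/(2(n-1)) term
    for _ in range(max_iter):
        m = base + r / (2 * (n - 1))         # second equation, using the current n
        new_n = ((z_a + z_b) / m) ** 2 + 3   # first equation, using the updated m
        converged = abs(new_n - n) < tol
        n = new_n
        if converged:
            break
    return math.ceil(n)                      # round up to whole subjects

# Hypothetical example: sample size needed to detect r = 0.5
n_needed = sample_size_for_r(0.5)
print(n_needed)
```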

Page 8: Correlation / regression

• Correlation

• Regression

• Multiple Regression

• Curve fitting

Correlation / regression

Page 9: Correlation / regression

Regression

• Draw a line which best fits the relationship between x and y

– The line takes the form y = a + bx

• Where a is the y value when x = 0

• Where b is the slope of the line, or how much y changes for one unit of change in x

– It assumes that y is dependent on x

– It explains how changes in y values are governed by changes in x values

– It allows x to predict y

• Note: x = a + by is not the mirror image of y = a + bx, because the best fit is calculated differently (it minimises horizontal rather than vertical deviations from the line)

Page 10: Correlation / regression

Regression - example

Page 11: Correlation / regression

Regression: best fit y = a + bx

Page 12: Correlation / regression

Regression: best fit x = a + by

Page 13: Correlation / regression

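$$b = \frac{\mathrm{Sxy}}{\mathrm{SSqx}}$$

$$a = \frac{\sum y}{n} - b \cdot \frac{\sum x}{n}$$

$$\mathrm{SSqx} = \sum x^2 - \frac{\left(\sum x\right)^2}{n}$$

$$\mathrm{SSqy} = \sum y^2 - \frac{\left(\sum y\right)^2}{n}$$

$$\mathrm{Sxy} = \sum xy - \frac{\sum x \sum y}{n}$$

Where sumx = $\sum x$, sumx2 = $\sum x^2$, sumy = $\sum y$, sumy2 = $\sum y^2$, sumxy = $\sum xy$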
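A minimal sketch of the slope and constant calculation, using made-up data and a hypothetical function name `linear_regression`. Swapping the arguments fits x = a + by instead, which shows that the two lines are not mirror images of each other.

```python
def linear_regression(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    sumx = sum(xs)
    sumy = sum(ys)
    ssqx = sum(x * x for x in xs) - sumx ** 2 / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - sumx * sumy / n
    b = sxy / ssqx                 # slope
    a = sumy / n - b * sumx / n    # constant: mean of y minus b times mean of x
    return a, b

# Made-up illustrative data (not from the slides)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = linear_regression(xs, ys)      # y = a + b*x
a2, b2 = linear_regression(ys, xs)    # x = a2 + b2*y
print(a, b)
print(a2, b2)
```

With these data the slope of y on x is 0.6, while the slope of x on y is 1.0 rather than 1/0.6.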

Page 14: Correlation / regression

Confidence interval for b

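$$SE_b = \sqrt{\frac{\mathrm{SSqy} - \mathrm{Sxy}^2 / \mathrm{SSqx}}{(n - 2) \cdot \mathrm{SSqx}}}$$

$$CI_b = b - t \cdot SE_b \ \text{to} \ b + t \cdot SE_b$$

t = Student’s t for the sample size (n - 2 degrees of freedom) and Type I Error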
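The sketch below works through the interval for b on made-up data, assuming the usual least-squares form SE_b = sqrt(residual variance / SSqx) with residual variance (SSqy - Sxy²/SSqx)/(n - 2). The t value is passed in by the caller; 3.182 is the two-sided 5% value for 3 degrees of freedom. The function name `slope_ci` is hypothetical.

```python
import math

def slope_ci(xs, ys, t_value):
    """Return (b, se_b, lower, upper) for the regression slope."""
    n = len(xs)
    sumx = sum(xs)
    sumy = sum(ys)
    ssqx = sum(x * x for x in xs) - sumx ** 2 / n
    ssqy = sum(y * y for y in ys) - sumy ** 2 / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - sumx * sumy / n
    b = sxy / ssqx
    resid_var = (ssqy - sxy ** 2 / ssqx) / (n - 2)   # residual variance about the line
    se_b = math.sqrt(resid_var / ssqx)
    return b, se_b, b - t_value * se_b, b + t_value * se_b

# Made-up data; n = 5 gives 3 degrees of freedom, so t = 3.182 for a 95% CI
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b, se_b, lower, upper = slope_ci(xs, ys, t_value=3.182)
print(round(b, 3), round(se_b, 4), round(lower, 3), round(upper, 3))
```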

Page 15: Correlation / regression

Confidence interval for predicted y

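• The SE has 2 components and changes with the x value

– SE of regression slope b

– SE of departure from residual variation

$$SE_{y.x} = \sqrt{\frac{\mathrm{SSqy} - \mathrm{Sxy}^2 / \mathrm{SSqx}}{n - 2}\left(\frac{1}{n} + \frac{(x - \mathrm{mean}_x)^2}{\mathrm{SSqx}}\right)}$$

$$CI_{y.x} = (a + bx) - t \cdot SE_{y.x} \ \text{to} \ (a + bx) + t \cdot SE_{y.x}$$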
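To show how the interval widens as x moves away from its mean, here is a small sketch. It assumes the standard error of the mean predicted y, sqrt(residual variance × (1/n + (x - mean x)²/SSqx)), with residual variance (SSqy - Sxy²/SSqx)/(n - 2). The data, the function name `predicted_y_ci`, and the t value (3.182 for 3 degrees of freedom) are illustrative assumptions.

```python
import math

def predicted_y_ci(xs, ys, x_new, t_value):
    """Return (predicted y, SE, lower, upper) at x_new."""
    n = len(xs)
    sumx = sum(xs)
    sumy = sum(ys)
    ssqx = sum(x * x for x in xs) - sumx ** 2 / n
    ssqy = sum(y * y for y in ys) - sumy ** 2 / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - sumx * sumy / n
    b = sxy / ssqx
    a = sumy / n - b * sumx / n
    resid_var = (ssqy - sxy ** 2 / ssqx) / (n - 2)
    mean_x = sumx / n
    # SE grows with the squared distance of x_new from the mean of x
    se = math.sqrt(resid_var * (1 / n + (x_new - mean_x) ** 2 / ssqx))
    y_hat = a + b * x_new
    return y_hat, se, y_hat - t_value * se, y_hat + t_value * se

# Made-up illustrative data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
at_mean = predicted_y_ci(xs, ys, 3, t_value=3.182)   # x at its mean
at_edge = predicted_y_ci(xs, ys, 5, t_value=3.182)   # x at the edge of the data
print(at_mean)
print(at_edge)
```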

Page 16: Correlation / regression

Confidence interval for predicted y

Page 17: Correlation / regression

• Correlation

• Regression

• Multiple Regression

• Curve fitting

Correlation / regression

Page 18: Correlation / regression

Multiple Regression

• Outcomes, particularly clinical outcomes

– Are subject to multiple influences

– These influences are often related to each other

• A multiple regression model is therefore commonly needed

Page 19: Correlation / regression

• BMI is influenced by mother and grandparents, but

• People who marry tend to have comparable BMI

• Parents’ BMI tends to be influenced by grandparents’ BMI

• Multiple regression: y = a + b1x1 + b2x2 + b3x3 … + bixi

Page 20: Correlation / regression

Multiple Regression

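• Starts with a k × k matrix of sums of squares and products [S] from k measurements (the predictors and y)

– Where S(i,j) is the Sxy between any pair i and j

– Where S(i,i) is the SSqx of variable i

• This matrix is inverted: $[V] = [S]^{-1}$

• The Partial Regression Coefficient b_i

$$b_i = -\frac{V_{i,y}}{V_{y,y}}$$

– The same elements give the partial correlation between y and variable i: $-V_{i,y} / \sqrt{V_{i,i} \cdot V_{y,y}}$

• The constant a

$$a = \frac{\sum y}{n} - \sum_{i} b_i \cdot \frac{\sum x_i}{n}$$

– Summed over all predictor variables i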
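The matrix steps can be sketched in plain Python as below. It assumes y is the last variable in the sum-of-products matrix and uses the identity b_i = -V(i,y) / V(y,y) for the inverse matrix V; the Gauss-Jordan inversion helper, the function names, and the data are all illustrative assumptions rather than the presenter's own procedure.

```python
def sum_products(columns):
    """k x k matrix of sums of squares and products about the means."""
    n = len(columns[0])
    return [[sum(u * v for u, v in zip(ci, cj)) - sum(ci) * sum(cj) / n
             for cj in columns] for ci in columns]

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def multiple_regression(x_columns, y):
    """Return (a, [b1, b2, ...]) for y = a + b1*x1 + b2*x2 + ..."""
    n = len(y)
    v = invert(sum_products(list(x_columns) + [y]))   # y is the last variable
    yi = len(x_columns)
    bs = [-v[i][yi] / v[yi][yi] for i in range(yi)]   # partial regression coefficients
    a = sum(y) / n - sum(b * sum(x) / n for b, x in zip(bs, x_columns))
    return a, bs

# Made-up illustrative data with two predictors
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [3.1, 3.9, 7.2, 6.8, 11.1, 10.0]
a, bs = multiple_regression([x1, x2], y)
print(a, bs)
```

Note that y must not be an exact linear combination of the predictors, otherwise the matrix [S] is singular and cannot be inverted.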

Page 21: Correlation / regression

• Correlation

• Regression

• Multiple Regression

• Curve fitting

Correlation / regression

Page 22: Correlation / regression

Curve fit

• Used in cases where the relationship between x and y is not linear

• y = function(x)

– y = Log(x)

– y = sine(x)

• Polynomial curve fit

– A special case of multiple regression

– Will fit into any shape where y increases with x

– $y = a + b_1 x + b_2 x^2 + b_3 x^3 + \dots + b_k x^k$

– In most biological systems fitting to the power of 3 is sufficient
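Treating x, x², and x³ as separate predictors turns the polynomial fit into a multiple regression. The sketch below solves the least-squares normal equations by Gaussian elimination, which is one of several equivalent ways to do this; the function names and the test data (generated from a known cubic) are assumptions for illustration.

```python
def solve(a, b):
    """Solve the linear system a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def polynomial_fit(xs, ys, degree=3):
    """Return [a, b1, ..., bk] for y = a + b1*x + ... + bk*x^k."""
    # One column per power of x: 1, x, x^2, ..., x^degree
    cols = [[x ** p for x in xs] for p in range(degree + 1)]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * y for u, y in zip(ci, ys)) for ci in cols]
    return solve(xtx, xty)

# Illustrative data generated from y = 1 + 2x - x^2 + 0.5x^3 (no noise)
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [1 + 2 * x - x ** 2 + 0.5 * x ** 3 for x in xs]
coefficients = polynomial_fit(xs, ys, degree=3)
print([round(c, 6) for c in coefficients])
```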

Page 23: Correlation / regression

Polynomial curve fit

[Figure: data points with three fitted curves, y = a + bx, y = a + b1x + b2x^2, and y = a + b1x + b2x^2 + b3x^3; x axis 0 to 6, y axis 0 to 10]

Page 24: Correlation / regression

CI of polynomial fit

• Complexity of calculating Standard Error

– Summing of each individual coefficient

– Residual

• Solution – 2 stage procedure

– Do polynomial curve fit

– Calculate error (distance of each datapoint from the regression line)

– Curve fit error

Page 25: Correlation / regression

[Figure: plot of the fitted curve; x axis 1 to 6, y axis -2 to 12]

$$y = 14.45 - 16.66x + 5.83x^2 - 0.45x^3$$

$$SD = 0.29 + 0.18x$$
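As a worked example, the two fitted equations can be evaluated at a chosen x to give the expected y and its SD. The 95% range of y ± 1.96 SD shown here is an assumption (it presumes roughly normal scatter around the curve), and the function names are hypothetical.

```python
def fitted_y(x):
    """Cubic fit from the slide: y = 14.45 - 16.66x + 5.83x^2 - 0.45x^3."""
    return 14.45 - 16.66 * x + 5.83 * x ** 2 - 0.45 * x ** 3

def fitted_sd(x):
    """Fitted error from the slide: SD = 0.29 + 0.18x."""
    return 0.29 + 0.18 * x

def range_95(x, z=1.96):
    """Assumed 95% range: fitted y plus or minus 1.96 SD."""
    y = fitted_y(x)
    sd = fitted_sd(x)
    return y - z * sd, y + z * sd

x = 3
y_at_3 = fitted_y(x)     # 14.45 - 49.98 + 52.47 - 12.15 = 4.79
sd_at_3 = fitted_sd(x)   # 0.29 + 0.54 = 0.83
low, high = range_95(x)
print(round(y_at_3, 2), round(sd_at_3, 2), round(low, 2), round(high, 2))
```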

Page 26: Correlation / regression

Curve fit: Femur length according to gestational age

[Figure: femur length (cms) plotted against gestation (days)]