Computing in Archaeology
Session 11. Correlation and regression analysis
© Richard Haddlesey www.medievalarchitecture.net
Lecture aims
To introduce correlation and regression techniques
The scattergram
In correlation, we are always dealing with paired scores, and so values of the two variables taken together will be used to make a scattergram
Example
Quantities of New Forest pottery recovered from sites at varying distances from the kilns

Site   Distance (km)   Quantity
1      4               98
2      20              60
3      32              41
4      34              47
5      24              62
Negative correlation
Here we can see that the quantity of pottery decreases as distance from the source increases.
Positive correlation
Here we see that the taller a pot, the wider the rim.
Curvilinear monotonic relation
Again, the further from the source, the smaller the quantity of artefacts.
Arched relationship (non-monotonic)
Here we see that the first molar grows with age and is then worn down as the animal gets older.
Scattergram
This shows us that scattergrams are the most important means of studying relationships between two variables
REGRESSION
Regression differs from other techniques we have looked at so far in that it is concerned not just with whether or not a relationship exists, or the strength of that relationship, but with its nature
In regression analysis we use an independent variable to estimate (or predict) the values of a dependent variable
Regression equation
y = f(x)
y = y axis (in this case the dependent variable)
f = function (of x)
x = x axis
For example: y = x, y = 2x, y = x²
General linear equations
y = a + bx
Where y is the dependent variable, x is the independent variable, and the coefficients a and b are constants, i.e. they are fixed for a given data set
Therefore: if x = 0 then the equation reduces to y = a, so a represents the point where the regression line crosses the y axis (the intercept)
The b constant defines the slope or gradient of the regression line
Thus for the pottery quantity in relation to distance from source, b represents the amount by which pottery quantity decreases for each kilometre of distance from the source
y = a + bx
least-squares
y = 102.64 – 1.8x
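As a rough illustration of how a least-squares line of the form y = a + bx can be obtained, the sketch below fits one to the five New Forest pottery sites from the example (distance as x, quantity as y). The function name `least_squares_line` is an assumption made for this sketch and does not come from the lecture. At full precision the intercept comes out near 102.69; the 102.64 quoted above is what results if the slope is rounded to -1.8 before the intercept is calculated.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: covariance of x and y divided by the variance of x
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: the line passes through the point (mean x, mean y)
    a = sum_y / n - b * sum_x / n
    return a, b


# Data from the New Forest pottery example
distance_km = [4, 20, 32, 34, 24]
quantity = [98, 60, 41, 47, 62]

a, b = least_squares_line(distance_km, quantity)
print(f"y = {a:.2f} + ({b:.2f})x")
```

The negative slope matches the negative correlation seen in the scattergram: each extra kilometre from the kilns corresponds to roughly 1.8 fewer units of pottery.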
CORRELATION
1 correlation coefficient
• r
• -1 to +1
2 significance
Levels of measurement:
• nominal – in name only
• ordinal – forming a sequence
• interval – a sequence with fixed distances
• ratio – fixed distances with a datum point
Product-Moment Correlation Coefficient: interval, ratio
Spearman’s Rank Correlation Coefficient: ordinal, interval, ratio
The Product-Moment Correlation Coefficient
sample – 20 bronze spearheads (n = 20)
variables: length (cm), width (cm)

$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2]\,[n\sum y^2 - (\sum y)^2]}} = +0.67$$
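The product-moment formula can be sketched in a few lines of code. The spearhead measurements are not reproduced in these notes, so the example below reuses the five New Forest pottery sites instead; it therefore gives a strong negative r rather than the +0.67 quoted for the spearheads. The function name `pearson_r` is an assumption made for this sketch.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation coefficient from the raw sums."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Pottery example data (distance in km, quantity), not the spearheads
distance_km = [4, 20, 32, 34, 24]
quantity = [98, 60, 41, 47, 62]

r = pearson_r(distance_km, quantity)
print(f"r = {r:.3f}")
```

Because r always lies between -1 and +1, a value this close to -1 indicates a strong negative linear relationship.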
Test of product moment correlation coefficient
H0 : true correlation coefficient = 0
H1 : true correlation coefficient ≠ 0
Assumptions: both variables approximately random
Sample statistics needed: n and r
Test statistic: TS = r
Table: product moment correlation coefficient table.
n = 20, r = 0.67, p < 0.01
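The lecture looks r up directly in a product moment correlation coefficient table. Where no such table is to hand, a commonly used equivalent is to convert r to a t statistic with n - 2 degrees of freedom and compare it with a critical value of Student's t. The sketch below does this for the spearhead result (n = 20, r = 0.67). This is a swapped-in technique, not the procedure on the slides; the function name `r_to_t` is an assumption, and the critical value 2.878 (two-tailed, 1% level, 18 degrees of freedom) is hard-coded from a standard t table.

```python
import math


def r_to_t(r, n):
    """Convert a correlation coefficient to a t statistic with n - 2 df."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Assumption: two-tailed 1% critical value of t for 18 degrees of freedom,
# taken from a standard t table rather than from the lecture
T_CRIT_1PCT_DF18 = 2.878

n = 20
r = 0.67
t = r_to_t(r, n)
significant = abs(t) > T_CRIT_1PCT_DF18

print(f"t = {t:.2f}, reject H0 at the 1% level: {significant}")
```

The t value exceeds the critical value, which agrees with the p < 0.01 reported for the spearheads.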
Spearman’s Rank Correlation Coefficient (rs)
H0 : true correlation coefficient = 0
H1 : true correlation coefficient ≠ 0
Assumptions: both variables at least ordinal
Sample statistics needed: n and rs
Test statistic: TS = rs
Table: Spearman’s rank correlation coefficient table
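The notes do not give a formula for rs, so the sketch below uses the standard textbook form for data without tied values, rs = 1 - 6Σd² / (n(n² - 1)), where d is the difference between the two ranks of each pair. It is applied to the five New Forest pottery sites purely for illustration. The function names `ranks` and `spearman_rs` are assumptions made for this sketch, and it does not handle ties.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rs(xs, ys):
    """Spearman's rank correlation coefficient for untied data."""
    n = len(xs)
    rank_x = ranks(xs)
    rank_y = ranks(ys)
    # Sum of squared differences between paired ranks
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


# Pottery example data (distance in km, quantity)
distance_km = [4, 20, 32, 34, 24]
quantity = [98, 60, 41, 47, 62]

rs = spearman_rs(distance_km, quantity)
print(f"rs = {rs:.2f}")
```

Since only the ranks are used, the coefficient needs the variables to be at least ordinal, which is the assumption stated for the test above.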