7 Nov 2012 COMP80131-SEEDSM2 1

Scientific Methods 1

Barry & Goran

‘Scientific evaluation, experimental design & statistical methods’

COMP80131

Lecture 2: Statistical Methods-Basics

www.cs.man.ac.uk/~barry/mydocs/myCOMP80131

Scientific Methods 1

• Scientific evaluation:
– derivation of useful & reliable statements about some new or existing scientific idea,
– based on an accumulation of evidence, often in the form of numerical tables.

• Experimental design:
– how to generate the quantifiable outputs,
– the systematic observation & measurement of these outputs,
– the recording of the resulting data,
– design of experiments to test some theoretical prediction of what the researcher expects to happen: a ‘research hypothesis’.

• Statistical methods:
– the means of deriving useful & reliable statements from numerical evidence.

Scientific Enquiry

• It may be argued that:
– ‘Scientific researchers propose hypotheses as explanations of phenomena
– design experimental studies to test these hypotheses’.
• It may also be argued otherwise.
• Wider domains of inquiry may combine many independently derived hypotheses.
• Or not have hypotheses at all, other than contrived ones such as:
– ‘This idea can (not) be implemented’

Philosophy of Science

• Concerns:
– underpinning logic of the scientific method,
– what separates science from non-science,
– the ethics implicit in science.

• Assumes:
– reality is objective and consistent,
– humans have the capacity to perceive reality accurately,
– rational explanations exist for elements of the real world.

• Logical Positivism & other theories claim to have defined the logic of science.
• All have been challenged.
• Ludwig Wittgenstein (1889-1951) was a research student in aeronautical engineering in Manchester (his PhD came later, from Cambridge)

Ludwig Wittgenstein

He could ‘think’ you under the table.

(He was just as schloshed as Schlegel)

Objectivity, repeatability & full disclosure

• Scientific inquiry is intended to be as objective as possible, to reduce biased interpretations of results.

• Procedures must be reproducible (i.e. repeatable).
• Researchers should:
– document, archive & share all data and methodology,
– make this available for careful scrutiny by other scientists,
– enable them to verify results by attempting to reproduce them.
• This practice is called ‘full disclosure’.
• Allows methodology & statistical reliability of the data to be verified.

References on Statistics

1. DJ Hand, ‘Statistics: a very short introduction’, Oxford UP, 2008

2. Schaum’s Outlines, ‘Prob & Stats’, 2009

3. WG Hopkins, ‘A New View of Statistics’ (Google it)

4. Allen B. Downey, ‘Think Stats’, published by O’Reilly, but also available online

Tables of Results

 Engli Maths  Phys  Chem  Hist  Fren Music   Art  Avge
    81    67    60   104    89    97    72    30  75.0
    91    32    42    34    24    65    81    61  53.8
    13   123    45    22    92    61   114    11  60.1
    91    65    80    23    95    47   101    33  66.9
    63    58    44     6    38    58    36    21  40.5
    10    28    69    24    84    91    20   102  53.5
    28    20    60    18    46    38    -3    79  35.8
    55     0    44    85    35    23    11   112  45.6
    96    38    49    17    11    42    45    48  43.3
    96    21    48    83    80    27     8   101  58.0
    16    68    55    35    69    44    40    55  47.8
    97    41    64    13    91    63   -13    33  48.6
    96   100    34    19    34    53    81   -10  50.9
    49    92    70    17    13    39    63   -19  40.5
    80    55    58     3    58    87    68    28  54.6
    14    42    45    95    63    30    64    46  49.9
    42    82    49    19    88    40    42    16  47.3
    92    18    53    80     0    52   -17   108  48.3
    79    69    53    29     0     6    59    31  40.8
    96    31    62    40    77    23    50    65  55.5

A fictitious set of exam results.

A sample of 20 students out of a population of 1000.

Complete file is ExamData.xls or ExamData.dat, at www.cs.man.ac.uk/~barry

A bit of MATLAB

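[Marks,Headings] = xlsread('ExamData.xls');   % numbers go to Marks, text cells to Headings
[nRows,nCols] = size(Marks);
disp(Headings(1,1:nCols))                     % display the column headings
Marks                                         % no ';' so the array is displayed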

Reads in marks from Excel spreadsheet into an array Marks

Headings read in separately.

Miss out ‘;’ to display.

‘%’ is comment.

A bit more MATLAB

% Row with mean of each column:
Me = mean(Marks)

% Row with standard deviations of cols:
St_devs = std(Marks)

% Row with variances of cols:
Variances = var(Marks)

Statistics printed out:

            Engli Maths  Phys  Chem  Hist  Fren Music   Art  Avge
Means:       52.2  49.2  49.7  49.6  55.7  51.0  48.4  50.7  50.8
Std_devs:    28.2  27.2  10.5  31.5  33.3  28.6  33.4  34.1   8.7
Variances:    795   741   110   990  1109   819  1115  1165  75.5
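As a quick check of how these three functions treat an array, here is a minimal sketch on a made-up 3-by-2 matrix (not the exam data; the name M is only for illustration). Each function works down the columns, and std and var divide by N-1 by default.

```matlab
% Made-up 3-by-2 array standing in for Marks (illustration only)
M = [1 2; 3 4; 5 6];

colMeans = mean(M)   % row vector [3 4]
colVars  = var(M)    % row vector [4 4]: sum of squared deviations 8, divided by N-1 = 2
colStds  = std(M)    % row vector [2 2]: square roots of the variances
```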

Definitions: mean

Marks: 46, 8, 50, 6, 99, -42, 30, 23, 16, 38, 60, -3, 45, 0, 30

• Here is a column of marks, say for French.
• Mean is the average. It is about 27.
• A ‘statistic’ which summarizes the column of data.
• Alternatives exist: e.g. median & mode.
• It allows comparisons to be made.

• If average is 31 next year we can hypothesise that:
– the students are better,
– better taught,
– or the exam was easier,
– or maybe the exam room was warmer.

• (Is the increase of 4 statistically significant?)
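The mean and the two alternatives mentioned above can be compared on this column directly. A minimal sketch, where the variable names are only illustrative:

```matlab
% Column of marks from the slide
x = [46 8 50 6 99 -42 30 23 16 38 60 -3 45 0 30];

m   = mean(x)     % 406/15, about 27.07
med = median(x)   % middle value of the 15 sorted marks: 30
mo  = mode(x)     % most frequent mark: 30 (it occurs twice)
```

Here the median and mode happen to agree and sit a little above the mean, which is pulled down by the single mark of -42.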

Definitions: variance

Left column: 46, 8, 50, 6, 99, -42, 30, 23, 16, 38, 60, -3, 45, 0, 30

Right column: 28, 26, 29, 25, 30, 24, 27, 26, 28, 27, 28, 26, 25, 29, 27

The right-hand column is another set of marks. Mean is also 27.

But it is much less ‘spread out’: variance is less.

All students are getting close to the same mark.

Maybe exam is not well designed to test ability.

$$\mathrm{Variance} = \frac{1}{N-1}\sum_{n=1}^{N}\left(x_n - \mathrm{mean}\right)^2$$

Another ‘statistic’: 1068 (left) & 2.86 (right)

Measure of ‘spread’
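The two quoted variances can be reproduced from the definition. The sketch below writes the sum out explicitly and then calls the built-in var, which uses the same N-1 divisor by default; the variable names are only illustrative.

```matlab
left  = [46 8 50 6 99 -42 30 23 16 38 60 -3 45 0 30];
right = [28 26 29 25 30 24 27 26 28 27 28 26 25 29 27];
N = numel(left);                 % both columns have 15 marks

% Variance from the definition: squared deviations from the mean, divided by N-1
varLeft  = sum((left  - mean(left)).^2)  / (N - 1)   % about 1068
varRight = sum((right - mean(right)).^2) / (N - 1)   % 40/14, about 2.86

% The built-in function gives the same values
builtinLeft  = var(left);
builtinRight = var(right);
```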

Definitions: std_deviation

Left column: 46, 8, 50, 6, 99, -42, 30, 23, 16, 38, 60, -3, 45, 0, 30

Right column: 28, 26, 29, 25, 30, 24, 27, 26, 28, 27, 28, 26, 25, 29, 27

This is the square root of the variance.

Also a measure of ‘spread’.

Yet another ‘statistic’: 32.7 (left) & 1.69 (right)

Many alternatives exist.

Population-mean & sample-mean

• Simplest statistic is mean or average.
• For table of 20 marks, average is easily found & understood.
• Consider this batch of students to be a ‘sample’ of a much larger ‘population’ of say 1000 students taking exams.
• How representative is this ‘sample-mean’ likely to be of the mean for the whole population, i.e. the ‘population mean’?
• Question arises all the time in statistical methods.

• Second example:
– Assume a population of 50 million people in the UK,
– We take a ‘sample’ of 1000 people,
– measure their heights & compute the average,
– how close will this ‘sample mean’ be to true mean for whole population?

• Same question can be asked about variance.

Population & sample variance

$$\mathrm{popVariance} = \frac{1}{P}\sum_{n=1}^{P}\left(x_n - \mathrm{popmean}\right)^2$$

where P is total number of measurements in the population.

There are two ways to estimate popVariance by a sample variance:

$$\mathrm{sampleVariance} = \frac{1}{N-1}\sum_{n=1}^{N}\left(x_n - \mathrm{samplemean}\right)^2$$

$$\mathrm{sampleVariance} = \frac{1}{N}\sum_{n=1}^{N}\left(x_n - \mathrm{popmean}\right)^2$$

In practice we usually do not know popmean.

Equations are normally close when N is large.
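To see the two estimates side by side, here is a minimal sketch on a made-up population of eight values (not the exam data). The choice of which four values form the sample is arbitrary, and all variable names are only illustrative.

```matlab
% Made-up population of P = 8 values (illustration only)
pop     = [2 4 4 4 5 5 7 9];
P       = numel(pop);
popmean = mean(pop);                           % 5
popVariance = sum((pop - popmean).^2) / P      % 32/8 = 4

% An arbitrary sample of N = 4 values from that population
x = pop([1 2 7 8]);                            % [2 4 7 9]
N = numel(x);

% Estimate 1: use the sample mean and divide by N-1 (this is what var(x) computes)
est1 = sum((x - mean(x)).^2) / (N - 1)         % 29/3, about 9.67

% Estimate 2: use the known population mean and divide by N
est2 = sum((x - popmean).^2) / N               % 30/4 = 7.5
```

With only four values both estimates are rough, and they differ noticeably from each other; the gap between them shrinks as N grows.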

Back to MATLAB

• Divide the 1000 marks into batches & compute the sample mean for each batch.

True Means: 52.2 49.2 49.7 49.6 55.7 51.0 48.4 50.7 50.8
------------------------------------------------------------------
Means: 50.0 58.7 51.0 46.7 43.7 62.3 61.1 36.9 51.3 52.7
Means: 48.5 51.8 57.8 47.2 45.6 47.7 53.7 50.6 48.0 44.5
Means: 49.5 48.6 30.9 53.9 43.7 53.6 46.6 50.4 56.9 48.4
Means: 44.5 68.2 48.1 55.9 48.0 52.5 54.0 42.2 50.3 56.8
Means: 52.2 39.9 38.1 69.9 50.4 61.9 57.2 50.6 49.5 59.8
Means: 59.0 61.5 39.5 54.9 42.6 44.0 50.6 41.0 62.1 48.9
Means: 44.6 56.1 48.7 49.9 44.3 48.4 39.1 52.4 56.6 43.5
Means: 62.8 49.6 55.7 42.9 48.8 42.1 60.7 66.5 41.8 55.2
Means: 51.7 52.3 53.2 48.2 48.1 69.1 49.8 57.0 50.1 53.4
Means: 49.9 47.4 54.1 50.4 67.2 51.6 42.9 56.1 52.5 44.9
Means: 55.8 46.1 48.5 55.8 54.7 54.5 39.3 49.9 43.8 53.1
Means: 50.4 44.1 55.5 46.6 47.8 41.7 47.9 57.5 53.7 51.5
Means: 52.8 67.2 47.8 46.7 53.3 53.8 46.9 51.3 48.5 58.6
Means: 47.0 48.6 56.4 50.3 50.9 56.4 50.0 52.1 42.5 50.5
Means: 54.2 50.0 52.3 51.0 52.3 50.9 50.8 63.5 48.6 58.6
Means: 56.3 51.1 54.0 53.9 64.0 48.8 50.8 44.3 62.2 61.8
Means: 40.9 53.3 52.8 56.9 51.2 61.1 57.6 56.8 50.1 37.6
Means: 53.0 55.9 38.8 47.2 49.0 62.2 49.1 39.4 54.6 49.5
Means: 47.8 51.4 48.2 45.9 48.2 53.6 54.0 43.6 49.1 48.3
Means: 38.9 51.9 52.0 60.7 44.1 44.2 70.8 51.3 49.9 46.8
Means: 52.6 54.9 54.9 50.8 43.8 53.5 50.9 58.3 40.1 48.9
Means: 52.5 68.1 53.3 46.1 60.1 53.4 52.0 48.3 51.5 55.5
Means: 60.0 45.7 45.5 45.7 50.5 51.8 44.8 50.1 54.2 65.9

Sample means for 50 batches of 20.

Look at col 1 (Engl).

50 batches of 20 (column 1)

[Figure: sample mean (y-axis, 0 to 100) against batch number (x-axis, 1 to 50)]

Look at spread over all batches for column 1

Remember pop-mean 52.2

Mean (of sample-means) = 52.2

Variance = 32

20 batches of 50 (column 1)

[Figure: sample mean (y-axis, 0 to 100) against batch number (x-axis, 1 to 20)]

Variance has reduced.

Mean of sample-means = 52.2

Variance = 18.2

10 batches of 100

[Figure: sample mean (y-axis, 0 to 100) against batch number (x-axis, 1 to 10)]

Mean of sample-means = 52.2

Variance = 7.28
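The batching experiment can be sketched as follows. ExamData.xls is not assumed to be at hand, so the code uses a synthetic column of 1000 uniform random marks as a stand-in; the seed and all variable names are assumptions made for illustration. For independent samples the variance of the sample mean is the population variance divided by the batch size N, and the figures quoted above are roughly in line with that (795/20 ≈ 39.8, 795/50 = 15.9, 795/100 = 7.95).

```matlab
% Synthetic stand-in for column 1: 1000 uniform marks between 0 and 100
rng(1);                              % fix the seed so the run is repeatable
marks = 100 * rand(1000, 1);

batchSizes = [20 50 100];
varOfMeans = zeros(size(batchSizes));
for k = 1:numel(batchSizes)
    N = batchSizes(k);
    batches = reshape(marks, N, []);     % one batch of N marks per column
    sampleMeans = mean(batches);         % one sample mean per batch
    varOfMeans(k) = var(sampleMeans);    % spread of the sample means
end

popVar = var(marks, 1);              % population variance (divide by P)

% Rows: batch size, observed variance of sample means, popVar/N for comparison
disp([batchSizes; varOfMeans; popVar ./ batchSizes]);
```

The observed variances will not match popVar/N exactly, particularly for 10 batches of 100, where the variance is itself estimated from only ten numbers.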

Distributions

• Histogram divides domain (x-axis) into say 10 or 20 regions.
• Plots the number of marks that fall in each region.
• In MATLAB:

figure(1); hist(Marks(:,1),20);
figure(2); hist(Marks(:,2),20);
figure(3); hist(Marks(:,3),20);   % etc. for the remaining columns
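What hist counts can be seen on a tiny made-up vector (not the exam data). Called with output arguments, hist returns the bin counts and bin centres instead of drawing the plot; the variable names are only illustrative.

```matlab
% Made-up marks (illustration only)
x = [1 2 2 3 3 3];

% Three equal-width bins spanning min(x) to max(x)
[counts, centres] = hist(x, 3)
% counts  = [1 2 3]
% centres = [1.3333 2.0000 2.6667]
```

Newer MATLAB releases also provide histogram for the same job; hist is used here to match the slides.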

Histogram for col 1 (English)

Fairly evenly distributed across the 20 ‘bins’.

Looks like a ‘uniform’ distribution.

Mean 50

[Figure: histogram of col 1; x-axis marks 0 to 100, y-axis counts 0 to 70]

Histogram for col 2 (Maths)

• Looks a bit ‘Gaussian’ or ‘normal’.

• Biased towards bins close to the mean.

• Tails away on either side.

• Mean 50

[Figure: histogram of col 2; x-axis marks -40 to 140, y-axis counts 0 to 150]

Histogram for col 3 (Phys)

[Figure: histogram of col 3; x-axis marks 10 to 90, y-axis counts 0 to 140]

Also looks ‘Gaussian’

Mean 50 with smaller variance

Histogram for col 4 (Chem)

[Figure: histogram of col 4; x-axis marks -20 to 120, y-axis counts 0 to 140]

Bi-modal distribution

Column 5 (Hist)

A bit strange

[Figure: histogram of col 5; x-axis marks 0 to 100, y-axis counts 0 to 160]

Col 6 (French)

Uniform again?

[Figure: histogram of col 6; x-axis marks 0 to 100, y-axis counts 0 to 70]

Column 7 (Music)

[Figure: histogram of col 7; x-axis marks -50 to 200, y-axis counts 0 to 120]

Gaussian again?

Col 8 (Art)

[Figure: histogram of col 8; x-axis marks -100 to 200, y-axis counts 0 to 140]

Gaussian again?

Col 9 (Average)

[Figure: histogram of col 9; x-axis marks 20 to 80, y-axis counts 0 to 120]

Gaussian?

Note smaller variance.

Some questions for you

• Analyse the fictitious exam results & comment on features.
• Compute means, stds & vars for each subject & histograms for the distributions.
• Make observations about performance in each subject & overall.
• Do marks support the hypothesis that people good at Music are also good at Maths?
• Do they support the hypothesis that people good at English are also good at French?
• Do they support the hypothesis that people good at Art are also good at Maths?
• If you have access to only 50 rows of this data, investigate the same hypotheses.
• What conclusions could you draw, and with what degree of certainty?

Sample covariance

• Measure of how two columns are related.
• Let cols be x and y, each with N entries:

$$\mathrm{covariance}(x,y) = \frac{1}{N-1}\sum_{n=1}^{N}\left(x_n - \mathrm{mean}_x\right)\left(y_n - \mathrm{mean}_y\right)$$

• Note that mean_x & mean_y are ‘sample means’.

• If x = y, this is the variance of x (= variance of y).

• Useful but difficult to interpret sometimes.
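A minimal sketch of the sample covariance on two made-up columns (not the exam data), first from the definition and then with the built-in cov. The variable names are only illustrative.

```matlab
% Made-up columns (illustration only)
x = [1 2 3 4];
y = [2 4 6 8];
N = numel(x);

% Sample covariance from the definition: sample means, divide by N-1
covxy = sum((x - mean(x)) .* (y - mean(y))) / (N - 1)   % 10/3, about 3.33

% For two vectors, cov returns the 2-by-2 matrix [var(x) cov(x,y); cov(x,y) var(y)]
C = cov(x, y);
```

The value 3.33 by itself says little about how strongly x and y are related, since it depends on the scale of the marks; that is the sense in which covariance is hard to interpret.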

Another sample covariance

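• If we know population means, formula changes to:

$$\frac{1}{N}\sum_{n=1}^{N}\left(x_n - \mathrm{popmean}_x\right)\left(y_n - \mathrm{popmean}_y\right)$$

• It can be shown that this equals (exactly so when the sample means of x and y coincide with the population means):

$$\frac{1}{N}\sum_{n=1}^{N}\left(x_n y_n\right) - \mathrm{popmean}_x\,\mathrm{popmean}_y$$

• First term sometimes called correlation between x and y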
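The step behind ‘it can be shown’ is a direct expansion of the product. In the sketch below, \bar{x} and \bar{y} denote the sample means of the N entries; these two symbols are introduced here for illustration and are not on the slide.

```latex
\frac{1}{N}\sum_{n=1}^{N}\left(x_n - \mathrm{popmean}_x\right)\left(y_n - \mathrm{popmean}_y\right)
  = \frac{1}{N}\sum_{n=1}^{N} x_n y_n
    \;-\; \mathrm{popmean}_y\,\bar{x}
    \;-\; \mathrm{popmean}_x\,\bar{y}
    \;+\; \mathrm{popmean}_x\,\mathrm{popmean}_y
```

If \bar{x} = popmean_x and \bar{y} = popmean_y, the last three terms collapse to a single -popmean_x popmean_y, which leaves the ‘correlation’ term minus the product of the means. For a sample whose means differ from the population means, the two expressions agree only approximately.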

Pearson Correlation Coeff

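• Another measure of how two columns x & y are correlated.

$$\frac{\dfrac{1}{N-1}\sum_{n=1}^{N}\left(x_n - \mathrm{mean}_x\right)\left(y_n - \mathrm{mean}_y\right)}{\sqrt{\mathrm{var}_x\,\mathrm{var}_y}} = \frac{\mathrm{covariance}(x,y)}{\mathrm{std}_x\,\mathrm{std}_y}$$

• Lies between -1 and +1. Much easier to interpret.

• If we know pop-means, replace (N-1) by N as usual.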
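A minimal sketch of the coefficient on two made-up columns (not the exam data), computed as covariance divided by the product of the standard deviations and then checked against the built-in corrcoef. The variable names are only illustrative. Note that corrcoef is part of base MATLAB, whereas corr is supplied by the Statistics Toolbox.

```matlab
% Made-up columns (illustration only)
x = [1 2 3 4 5];
y = [2 1 4 3 5];
N = numel(x);

% Sample covariance, then divide by the two standard deviations
covxy = sum((x - mean(x)) .* (y - mean(y))) / (N - 1);   % 8/4 = 2
r = covxy / (std(x) * std(y))                            % 2/2.5 = 0.8

% corrcoef returns a 2-by-2 matrix; the off-diagonal entry is the coefficient
R = corrcoef(x, y);
```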

Scatter plot col 1 against col 1

Corr coeff = 1

Positive correlation

[Figure: scatter plot; x-axis col 1 from 0 to 100, y-axis col 1 from 0 to 100]

Scatter plot col 1 against -col 1

Corr-coeff = -1

Negative correlation

[Figure: scatter plot; x-axis col 1 from 0 to 100, y-axis -col 1 from -100 to 0]

Scatter plot col 1(Eng) against col 2(Maths)

Corr coeff = 0.04

(close to zero)

Very weak or no correlation

[Figure: scatter plot; x-axis col 1 from 0 to 100, y-axis col 2 from -40 to 140]

Scatter plot col 2(Maths) against col 7(Mus)

Corr coeff = 0.8

(strong +ve corr)

[Figure: scatter plot; x-axis col 7 from -50 to 200, y-axis col 2 from -40 to 140]

Scatter plot col 8(Art) against col 2 (Maths)

[Figure: scatter plot; x-axis col 2 from -40 to 140, y-axis col 8 from -100 to 200]

Corr coeff = -0.8

Strong –ve correlation

Correlation

In MATLAB: corr(Marks)

         Engli    Maths    Phys     Chem     Hist     Fren     Music    Art      Avge
Engli    1.00     -0.037   -0.029   -0.068   -0.04    0.012    -0.015   0.013    0.34
Maths    -0.037   1.00     -0.0014  0.051    -0.033   0.003    0.79     -0.82    0.365
Phys     -0.029   -0.0014  1.00     -0.042   0.03     0.009    0.017    0.011    0.15
Chem     -0.068   0.051    -0.042   1.00     -0.013   -0.055   0.048    -0.031   0.42
Hist     -0.04    -0.033   0.03     -0.013   1.00     -0.053   0.002    -0.006   0.43
Fren     0.012    0.003    0.009    -0.055   -0.053   1.00     -0.004   -0.009   0.363
Music    -0.015   0.79     0.017    0.0476   0.0021   -0.004   1.00     -0.66    0.48
Art      0.013    -0.82    0.011    -0.031   -0.0061  -0.009   -0.66    1.00     -0.16
Avge     0.34     0.37     0.15     0.42     0.43     0.363    0.48     -0.16    1.00
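One way to pick the strong relationships out of such a matrix is to threshold the absolute values above the diagonal. The sketch below does this for the Maths, Music and Art entries copied from the matrix above; the cut-off of 0.5 and the variable names are assumptions made for illustration.

```matlab
% Sub-matrix of the correlation matrix for Maths, Music and Art
names = {'Maths', 'Music', 'Art'};
R = [ 1.00  0.79 -0.82;
      0.79  1.00 -0.66;
     -0.82 -0.66  1.00];

threshold = 0.5;                                 % assumed cut-off for 'strong'
[i, j] = find(triu(abs(R) > threshold, 1));      % upper triangle, diagonal excluded
for k = 1:numel(i)
    fprintf('%s vs %s: r = %.2f\n', names{i(k)}, names{j(k)}, R(i(k), j(k)));
end
% Prints:
% Maths vs Music: r = 0.79
% Maths vs Art: r = -0.82
% Music vs Art: r = -0.66
```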