Multivariate analysis of community structure data

Post on 14-Jan-2016

Colin Bates UBC Bamfield Marine Sciences Centre

[Figure: samples labelled by sampling date (May, Aug and Nov 2000; April, July and Oct 2001), with a Similarity axis running from 20 to 100]

Goals

1) To understand the ideas behind multivariate community structure analysis.

2) To understand how to perform these analyses in PRIMER.

3) To be prepared to analyse and interpret your class data later today.

What are multivariate statistics?

Statistics that allow us to look at how multiple variables change together:

E.g., how do 50 species in a community react to an environmental perturbation?

50 ANOVAs? No: that tests each species in isolation and inflates the chance of a false positive.

Multivariate stats allow us to “condense” this information into a simpler summary
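The problem with running one test per species can be put in numbers. The short sketch below assumes the 50 tests are independent and each uses a significance level of 0.05; both values are assumptions chosen for illustration, not figures from the slides.

```python
# Chance of at least one false positive across many independent tests.
# Assumptions for illustration: 50 tests, alpha = 0.05, tests independent.
n_tests = 50
alpha = 0.05

family_wise_error = 1 - (1 - alpha) ** n_tests
print(f"P(at least one false positive) = {family_wise_error:.3f}")
```

Under these assumptions the chance of at least one spurious "significant" species is above 90%, which is one reason to summarise the whole community in a single analysis instead.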

When might I use this type of analysis?

For a multi-species community, you may wish to:

- pull order from complex systems

- visualize these patterns

- compare communities over time and space

- test hypotheses

The vehicle:

Example: Seaweed Communities at Cape Beale

- Does the flora differ between two nearby sites exposed to different wave intensities?

Data collection:

2. Data Analysis

Step 1: Entering your data into PRIMER

How to analyze this type of data?

1. Diversity indices

Yet, most diversity indices do not consider species identity…

2. Multivariate community structure analyses
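A small example may make the limitation of diversity indices concrete. The sketch below uses the Shannon index as one common diversity index (the slides do not name a specific one) and two hypothetical communities with invented species lists and counts.

```python
import math

def shannon(counts):
    """Shannon diversity H' = -sum(p * ln p) over species with non-zero counts."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)

# Hypothetical communities: same abundance pattern, no species in common.
site_a = {"Fucus": 10, "Ulva": 5, "Alaria": 1}
site_b = {"Mazzaella": 10, "Egregia": 5, "Codium": 1}

shared = set(site_a) & set(site_b)
print(shannon(site_a), shannon(site_b))  # the two values are identical
print(shared)                            # empty set: no shared species
```

The two sites get the same index value even though they share no species, which is the gap that community structure analyses are meant to fill.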

Analysis flow

[Flow diagram: species-by-samples data matrix → sample similarities → ordination (samples labelled a, b, c) → are sites different? → how?]

Sample similarities

Calculate Bray-Curtis similarity between every pair of samples

This gives a triangular similarity matrix

[Figure: triangular similarity matrix with within-site and between-site comparisons marked]
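The Bray-Curtis calculation can be sketched in a few lines. This is a minimal illustration on untransformed abundances, expressed as a percentage; the sample names and counts are hypothetical, and any transformation or standardisation applied inside PRIMER is not reproduced here.

```python
def bray_curtis_similarity(x, y):
    """Percent Bray-Curtis similarity between two samples of species abundances."""
    total = sum(x) + sum(y)
    if total == 0:
        raise ValueError("both samples are empty")
    difference = sum(abs(a - b) for a, b in zip(x, y))
    return 100.0 * (1.0 - difference / total)

# Hypothetical abundances; each list is one sample, each position one species.
samples = {
    "exposed1":   [10, 0, 2],
    "exposed2":   [8, 1, 2],
    "sheltered1": [0, 9, 2],
    "sheltered2": [1, 7, 3],
}

names = list(samples)
# Lower triangle only: similarity is symmetric, and a sample is 100% similar to itself.
for i in range(1, len(names)):
    row = [bray_curtis_similarity(samples[names[i]], samples[names[j]]) for j in range(i)]
    print(names[i].ljust(11), " ".join(f"{s:6.1f}" for s in row))
```

Each printed row holds the similarities between one sample and all samples listed before it, which is the triangular layout described above.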

Visualizing similarities

Ordination “maps” similarity relationships between samples

[Figure: ordination plot with samples labelled a, b and c]

nMDS ordination example

Distance between points reflects relative similarity!

Nonmetric multidimensional scaling (nMDS)

Nonmetric: uses only the rank order of similarities, so the axes have no units or meaning of their own

Multidimensional: represents relationships between multiple variables in two or three dimensions

Scaling: the ratio between reality and representation

“the future of ordination is in nonmetric multidimensional scaling” – McCune & Grace, 2002

How does nMDS work?

nMDS uses the RANK ORDER of similarity relationships between samples:

Sample   Sample   % similarity   Rank
A1       A2       99%            1
A1       A3       96%            2
A2       A3       95%            3

A1 is closer to A2 than it is to A3
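The ranking step can be written out directly. The sketch below uses the three similarities from the slide and gives rank 1 to the most similar pair; ties are not handled, which is a simplification for illustration.

```python
# Pairwise similarities from the slide (percent).
similarities = {("A1", "A2"): 99, ("A1", "A3"): 96, ("A2", "A3"): 95}

# Rank 1 goes to the most similar pair; ties are not handled in this sketch.
ordered = sorted(similarities, key=similarities.get, reverse=True)
ranks = {pair: position for position, pair in enumerate(ordered, start=1)}

for pair in ordered:
    print(pair, similarities[pair], ranks[pair])
```

Only the ranks are carried forward: nMDS would treat similarities of 99, 96 and 95 the same as, say, 99, 50 and 10.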

Then, nMDS tries to place points in 2 (or 3) dimensional space to represent this ranked order:

[Figure: points A1, A2 and A3 placed on the map]

A1 is closer to A2 than it is to A3
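What "representing the ranked order" means can be checked for any candidate map. The sketch below does not run nMDS itself; it takes a hypothetical set of 2-D coordinates (invented for illustration) and tests whether the order of map distances matches the order of similarities.

```python
import math
from itertools import combinations

# Similarities from the slide, and a hypothetical 2-D placement of the samples.
similarities = {("A1", "A2"): 99, ("A1", "A3"): 96, ("A2", "A3"): 95}
coordinates = {"A1": (0.0, 0.0), "A2": (1.0, 0.0), "A3": (0.2, 2.0)}

distances = {
    (a, b): math.dist(coordinates[a], coordinates[b])
    for a, b in combinations(sorted(coordinates), 2)
}

# Most similar pair first vs. closest pair first: a faithful map gives the same order.
by_similarity = sorted(similarities, key=similarities.get, reverse=True)
by_distance = sorted(distances, key=distances.get)
print(by_similarity == by_distance)
```

An nMDS routine searches for coordinates that make this match as good as possible; with many samples a perfect match is often not achievable in two dimensions.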

How accurate is the nMDS map?

- Sometimes the nMDS can’t represent all relationships accurately

- this is reflected by a high STRESS value

[Figure: scatter plot of distance on the nMDS against similarity in the similarity matrix]

If Stress Value =

0.0 : perfect map

0.1 : decent map

0.2 : ok map

0.3 : don’t bother
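For readers who want to see where a stress value comes from, here is a minimal sketch of one common definition, Kruskal's stress formula 1. The slides do not say which formula PRIMER reports, so treat the choice as an assumption; the function names and the map distances are hypothetical, and tied similarities are not handled.

```python
import math

def monotone_fit(values):
    """Best non-decreasing fit to a sequence (pool-adjacent-violators)."""
    blocks = []  # each block holds [sum of values, number of values]
    for value in values:
        blocks.append([value, 1])
        # Merge backwards while a block's mean exceeds the mean of the block after it.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

def stress(similarities, map_distances):
    """Kruskal stress-1 for pairs listed in the same order in both arguments.

    Assumes no tied similarities.
    """
    # Visit pairs from most to least similar; map distances should then increase.
    order = sorted(range(len(similarities)), key=lambda i: -similarities[i])
    d = [map_distances[i] for i in order]
    d_hat = monotone_fit(d)
    numerator = sum((a - b) ** 2 for a, b in zip(d, d_hat))
    denominator = sum(a ** 2 for a in d)
    return math.sqrt(numerator / denominator)

# Similarities from the slides, with two hypothetical sets of map distances.
sims = [99, 96, 95]
print(stress(sims, [1.0, 2.0, 3.0]))  # rank order preserved
print(stress(sims, [1.0, 3.0, 2.0]))  # two pairs out of order
```

The first call returns zero because the distances increase as similarity decreases; the second returns a positive value because two pairs are placed in the wrong order.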

Main points about ordination!

- Ordination is a way to visualize how similar your samples are

- nMDS tries to visually represent the rank order within the underlying similarity matrix

- all that matters is the relative distance between points

- the stress value allows you to estimate the ‘quality’ of the nMDS

Obviously distinct groups

Less obvious! Are they really different?

Analysis of Similarities – a statistical approach

Are groups different?

Ho: sites are the same

Ha: sites are different

[Figure: samples grouped as exposed and sheltered]

If Ho (sites the same) is true:

Similarity within = Similarity between

If Ha (sites different) is true:

Similarity within > Similarity between

$$R = \frac{r_{between} - r_{within}}{\text{standardizing factor}}$$

The standardizing factor rescales the difference so that R is at most 1.

If Ho (sites the same) is true:

Similarity within = Similarity between

$r_{between} - r_{within} \approx 0$, so $R \approx 0$

If Ha (sites different) is true:

Similarity within > Similarity between

$r_{between} - r_{within}$ approaches the standardizing factor, so $R \approx 1$
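The R statistic can be computed by hand for a small case. The sketch below assumes the usual ANOSIM definition, in which r is the mean rank of the pairwise similarities (rank 1 = most similar pair) and the standardizing factor is M/2, where M is the number of sample pairs; the slides do not spell this factor out. Function names and the similarity values are hypothetical.

```python
from itertools import combinations

def average_ranks(values):
    """Rank values from 1 (largest) upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of positions start+1 .. end+1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks

def anosim_r(similarity, groups):
    """R = (mean between-group rank - mean within-group rank) / (M / 2)."""
    pairs = list(combinations(range(len(groups)), 2))
    ranks = average_ranks([similarity[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (len(pairs) / 2)

# Hypothetical percent similarities for two exposed and two sheltered samples.
groups = ["exposed", "exposed", "sheltered", "sheltered"]
similarity = [
    [100, 90, 40, 35],
    [90, 100, 30, 25],
    [40, 30, 100, 85],
    [35, 25, 85, 100],
]
print(anosim_r(similarity, groups))
```

In this invented example every within-site pair is more similar than every between-site pair, so R reaches its maximum; mixing the labels pulls R towards zero.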

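To simulate the null distribution:

- assume Ho is true: Similarity within = Similarity between, so the site labels are interchangeable

- randomly reassign the site labels and calculate R

- repeat many times to build up the distribution of R under Ho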
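The simulation can be sketched as a label-shuffling loop. This is an illustration under stated assumptions rather than PRIMER's implementation: R uses the usual ANOSIM standardizing factor M/2 (M = number of sample pairs), tied similarities are not handled, and the similarity matrix, function names, seed and the choice of 999 shuffles are all hypothetical.

```python
import random
from itertools import combinations

def anosim_r(similarity, groups):
    """ANOSIM R from a square similarity matrix (assumes no tied similarities)."""
    pairs = list(combinations(range(len(groups)), 2))
    # Rank 1 = most similar pair.
    ordered = sorted(pairs, key=lambda p: -similarity[p[0]][p[1]])
    rank = {pair: position for position, pair in enumerate(ordered, start=1)}
    within = [rank[p] for p in pairs if groups[p[0]] == groups[p[1]]]
    between = [rank[p] for p in pairs if groups[p[0]] != groups[p[1]]]
    difference = sum(between) / len(between) - sum(within) / len(within)
    return difference / (len(pairs) / 2)

def permutation_p(similarity, groups, n_permutations=999, seed=0):
    """Share of labellings (shuffles plus the observed one) with R >= observed R."""
    rng = random.Random(seed)
    observed = anosim_r(similarity, groups)
    shuffled = list(groups)
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # reassign site labels at random
        if anosim_r(similarity, shuffled) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (n_permutations + 1)

# Hypothetical percent similarities: three exposed, then three sheltered samples.
groups = ["exposed"] * 3 + ["sheltered"] * 3
similarity = [
    [100, 90, 88, 40, 38, 36],
    [90, 100, 86, 34, 32, 30],
    [88, 86, 100, 28, 26, 24],
    [40, 34, 28, 100, 84, 82],
    [38, 32, 26, 84, 100, 80],
    [36, 30, 24, 82, 80, 100],
]
observed, p_value = permutation_p(similarity, groups)
print(observed, p_value)
```

With only six samples there are few distinct ways to relabel them, so the P value cannot become very small however strong the pattern is; larger data sets give the test more resolution.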

Phyc 2003 practice data set

[Figure: histogram of simulated R values; x-axis: R, from -0.20 to 0.40; y-axis: Frequency; bar counts: 6, 88, 243, 232, 189, 109, 58, 35, 19, 10, 9, 1; observed R = 0.477 (0.48)]

P = 1/999 ≈ 0.001

Sites are different – why?

- We will use the SIMPER (Similarity Percentages) routine

- SIMPER indicates which species contribute most to the patterns that we see
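The idea behind SIMPER can be sketched as a breakdown of Bray-Curtis dissimilarity by species. The code below is an illustration of that idea, not PRIMER's routine: it averages each species' share of the dissimilarity over all exposed-sheltered sample pairs, and the function name, species names and abundances are hypothetical.

```python
def simper(group_a, group_b, species):
    """Average per-species contribution to Bray-Curtis dissimilarity between groups.

    Returns rows of (species, contribution, percent of total), largest first.
    Assumes no pair of samples is entirely empty and the groups are not identical.
    """
    contribution = [0.0] * len(species)
    n_pairs = len(group_a) * len(group_b)
    for x in group_a:
        for y in group_b:
            total = sum(x) + sum(y)
            for k in range(len(species)):
                # Species k's share of this pair's percent dissimilarity, averaged over pairs.
                contribution[k] += 100.0 * abs(x[k] - y[k]) / total / n_pairs
    overall = sum(contribution)
    table = [(name, c, 100.0 * c / overall) for name, c in zip(species, contribution)]
    return sorted(table, key=lambda row: row[1], reverse=True)

# Hypothetical abundances; each list is one sample, each position one species.
species = ["Alaria", "Fucus", "Ulva"]
exposed = [[10, 0, 2], [8, 1, 2]]
sheltered = [[0, 9, 2], [1, 7, 3]]

for name, contribution, percent in simper(exposed, sheltered, species):
    print(f"{name:8s} {contribution:6.2f} {percent:6.1f}%")
```

Species near the top of the table differ most consistently between the two groups; a species with similar abundance at both sites ends up near the bottom.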

Data analysis summary

[Flow diagram: species-by-samples data matrix → sample similarities → nMDS → are sites different? (ANOSIM) → how? (SIMPER)]