Multivariate analysis of community structure data
Colin Bates UBC Bamfield Marine Sciences Centre
[Figure: samples labelled by sampling date (2000 May to 2001 Oct), plotted against a Similarity axis running from 20 to 100]
Goals
1) To understand the ideas behind multivariate community structure analysis.
2) To understand how to perform these analyses in PRIMER.
3) To be prepared to analyse and interpret your class data later today.
What are multivariate statistics?
Statistics that allow us to look at how multiple variables change together:
E.g., how do 50 species in a community react to an environmental perturbation?
50 ANOVAs? No…
Multivariate stats allow us to “condense” information for simplicity
When might I use this type of analysis?
For a multi-species community, you may wish to:
- pull order from complex systems
- visualize these patterns
- compare patterns over time and space
- test hypotheses
The vehicle:
Example: Seaweed Communities at Cape Beale
- Is the flora different at two nearby sites that are exposed to different wave intensities?
Data collection:
2. Data Analysis
Step 1: Entering your data into PRIMER
How to analyze this type of data?
1. Diversity indices
Yet, most diversity indices do not consider species identity…
Multivariate community structure analyses
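To illustrate why a diversity index alone can hide differences in species identity, here is a minimal Python sketch using the Shannon index as one example of a diversity index. The species names and abundances are hypothetical, chosen only so that the two sites share no species at all.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over species that are present."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)


# Hypothetical abundances: the two sites have no species in common,
# but the same pattern of relative abundances.
site_a = {"Fucus": 10, "Ulva": 30, "Mazzaella": 60}
site_b = {"Alaria": 10, "Hedophyllum": 30, "Corallina": 60}

h_a = shannon_index(site_a)
h_b = shannon_index(site_b)

# Same diversity value, completely different flora.
print(round(h_a, 3), round(h_b, 3))
```

The index cannot tell these two sites apart, which is the gap that multivariate community structure analyses are meant to fill.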
Analysis flow
[Diagram: samples x species matrix -> sample similarities -> ordination -> are sites different? -> How?]
Analysis flow
[Diagram: samples x species matrix -> sample similarities -> ordination]
Calculate Bray-Curtis similarity
- gives a triangular similarity matrix
[Figure: triangular similarity matrix with within-site and between-site blocks labelled]
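The similarity step can be sketched in Python as follows. This assumes the standard Bray-Curtis formula, S = 100 * (1 - sum|y1 - y2| / sum(y1 + y2)), applied to raw abundances with no transformation. The sample names and counts are hypothetical, and this is an illustration rather than PRIMER's own implementation.

```python
def bray_curtis_similarity(sample_1, sample_2):
    """Percent similarity between two samples given as equal-length abundance lists."""
    numerator = sum(abs(a - b) for a, b in zip(sample_1, sample_2))
    denominator = sum(a + b for a, b in zip(sample_1, sample_2))
    if denominator == 0:
        raise ValueError("both samples are empty")
    return 100.0 * (1.0 - numerator / denominator)


def similarity_matrix(samples):
    """Lower triangle of the similarity matrix as {(later_sample, earlier_sample): value}."""
    names = list(samples)
    return {
        (names[i], names[j]): bray_curtis_similarity(samples[names[i]], samples[names[j]])
        for i in range(len(names))
        for j in range(i)
    }


# Hypothetical abundances of four species in four samples.
samples = {
    "exposed_1": [10, 0, 5, 3],
    "exposed_2": [8, 1, 6, 2],
    "sheltered_1": [0, 12, 1, 7],
    "sheltered_2": [1, 10, 0, 9],
}

matrix = similarity_matrix(samples)
for pair, value in matrix.items():
    print(pair, round(value, 1))
```

Only the lower triangle is stored because the similarity of A to B equals the similarity of B to A, which is why the matrix is triangular.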
Visualizing similarities
Ordination “maps” similarity relationships between samples
[Diagram: sample similarities -> ordination]
nMDS ordination example
Distance between points reflects relative similarity!
Nonmetric multidimensional scaling (nMDS)
Nonmetric: based on rank order rather than measured distances, so the axes have no units
Multidimensional: represents relationships between multiple variables in two or three dimensions
Scaling: the ratio between reality and representation
“the future of ordination is in nonmetric multidimensional scaling” – McCune & Grace, 2002
How does nMDS work?
nMDS uses the RANK ORDER of similarity relationships between samples:
Sample   Sample   % similarity   Rank
A1       A2       99%            1
A1       A3       96%            2
A2       A3       95%            3
A1 is closer to A2 than it is to A3
How does nMDS work?
Then, nMDS tries to place points in 2 (or 3) dimensional space to represent this ranked order:
[Figure: A1 and A2 plotted close together, with A3 farther away]
A1 is closer to A2 than it is to A3
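The rank-order idea can be checked with a small Python sketch. It uses the three similarities from the table above; the 2-D coordinates are hypothetical, hand-picked values. nMDS itself searches iteratively for a good configuration, so this sketch only tests whether one given map preserves the ranked order.

```python
import math
from itertools import combinations


def rank_pairs(values, reverse):
    """Return sample pairs ordered by value; reverse=True puts the largest first."""
    ordered = sorted(values.items(), key=lambda item: item[1], reverse=reverse)
    return [pair for pair, _ in ordered]


# Percent similarities from the slide.
similarity = {("A1", "A2"): 99.0, ("A1", "A3"): 96.0, ("A2", "A3"): 95.0}

# Hypothetical map coordinates (not produced by an nMDS run).
coords = {"A1": (0.0, 0.0), "A2": (1.0, 0.0), "A3": (0.2, 2.5)}

distance = {
    (a, b): math.dist(coords[a], coords[b])
    for a, b in combinations(sorted(coords), 2)
}

# Most similar pair should be the closest pair, and so on down the ranking.
by_similarity = rank_pairs(similarity, reverse=True)
by_distance = rank_pairs(distance, reverse=False)
print(by_similarity == by_distance)
```

Any map whose distance ranking matches the similarity ranking is an equally good nonmetric representation, which is why only relative distances matter.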
How accurate is the nMDS map?
- Sometimes the nMDS can’t represent all relationships accurately
- This is reflected by a high STRESS value
[Figure: scatter plot of distance on nMDS against similarity in sim. matrix]
If Stress Value =
0.0 : perfect map
0.1 : decent map
0.2 : ok map
0.3 : don’t bother
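As a rough illustration of where a stress value comes from, the sketch below computes Kruskal's stress formula 1, a common definition of stress. Whether PRIMER computes exactly this quantity is an assumption here. Map distances are compared with the closest non-decreasing fit (monotone regression) to the ranked dissimilarities; the input numbers are hypothetical, and ties in dissimilarity are not given special treatment.

```python
import math


def isotonic_fit(values):
    """Pool-adjacent-violators: least-squares non-decreasing fit to a sequence."""
    blocks = []  # each block holds [sum of values, number of values]
    for value in values:
        blocks.append([value, 1])
        # Merge backwards while an earlier block's mean exceeds the later block's mean.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


def kruskal_stress(dissimilarities, distances):
    """Stress formula 1 for paired lists of dissimilarities and map distances."""
    order = sorted(range(len(distances)), key=lambda i: dissimilarities[i])
    ordered = [distances[i] for i in order]
    fitted = isotonic_fit(ordered)
    residual = sum((d - f) ** 2 for d, f in zip(ordered, fitted))
    return math.sqrt(residual / sum(d * d for d in ordered))


# Hypothetical dissimilarities (100 - % similarity) for three sample pairs.
dissimilarities = [1.0, 4.0, 5.0]

# A map whose distances keep the rank order, and one that swaps two pairs.
perfect = kruskal_stress(dissimilarities, [1.0, 2.5, 2.6])
distorted = kruskal_stress(dissimilarities, [1.0, 3.0, 2.0])
print(perfect, round(distorted, 3))
```

Stress is zero whenever the distance ranking matches the dissimilarity ranking exactly, and it grows as more pairs are placed out of order.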
Main points about ordination!
- Ordination is a way to visualize how similar your samples are
- nMDS tries to represent visually the rank order within the underlying similarity matrix
- All that matters is the relative distance between points
- The stress value allows you to estimate the ‘quality’ of the nMDS
[Diagram: sample similarities -> ordination]
Obviously distinct groups
Less obvious! Are they really different?
Analysis of Similarities – a statistical approach
Are groups different?
Ho = sites are the same
Ha = sites are different
[Figure: samples labelled exposed and sheltered]
If Ho (sites the same) = true
Similarity within = Similarity between
If Ha (sites different) = true
Similarity within > Similarity between
The ANOSIM test statistic:
R = (r_between - r_within) / standardizing factor
where r_between and r_within are the average rank similarities between and within sites.
If Ho (sites the same) = true: r_between - r_within = ~0, so R = ~0
If Ha (sites different) = true: R approaches ~1
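For reference, the usual definition of the ANOSIM statistic spells out the standardizing factor. The symbols n (total number of samples) and M (number of sample pairs) are introduced here for illustration and do not appear on the slides; ranks are assigned so that the most similar pair has rank 1.

```latex
R = \frac{\bar{r}_{\mathrm{between}} - \bar{r}_{\mathrm{within}}}{M/2},
\qquad M = \frac{n(n-1)}{2}
```

Dividing by M/2 keeps R between -1 and 1, with R = 1 only when every within-site pair is more similar than every between-site pair.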
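To simulate the null distribution:
- If Ho is true, similarity within = similarity between, so the site labels can be shuffled among samples at random
- Calculate R for each shuffle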
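The permutation procedure can be sketched in Python as below. This is an illustration under stated assumptions, not PRIMER's implementation: the dissimilarities are hypothetical, tied values receive their mean rank, and the p-value convention (count + 1) / (permutations + 1) is one common choice.

```python
import random
from itertools import combinations


def average_ranks(values):
    """Map each value to its 1-based rank, giving tied values their mean rank."""
    first, count = {}, {}
    for position, value in enumerate(sorted(values), start=1):
        first.setdefault(value, position)
        count[value] = count.get(value, 0) + 1
    return {value: first[value] + (count[value] - 1) / 2 for value in first}


def anosim_r(dissimilarity, groups):
    """R = (mean between-group rank - mean within-group rank) / (M / 2)."""
    rank_of = average_ranks(dissimilarity.values())
    within, between = [], []
    for (a, b), value in dissimilarity.items():
        (within if groups[a] == groups[b] else between).append(rank_of[value])
    m = len(dissimilarity)
    return (sum(between) / len(between) - sum(within) / len(within)) / (m / 2)


def anosim_test(dissimilarity, groups, permutations=999, seed=0):
    """Observed R plus a permutation p-value from shuffled group labels."""
    rng = random.Random(seed)
    observed = anosim_r(dissimilarity, groups)
    names = list(groups)
    labels = [groups[name] for name in names]
    at_least_as_large = 0
    for _ in range(permutations):
        rng.shuffle(labels)
        if anosim_r(dissimilarity, dict(zip(names, labels))) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (permutations + 1)


# Hypothetical data: each sample gets a position on a line, and the
# dissimilarity between two samples is the gap between their positions.
positions = {
    "exposed_1": 0, "exposed_2": 1, "exposed_3": 2, "exposed_4": 3,
    "sheltered_1": 10, "sheltered_2": 11, "sheltered_3": 12, "sheltered_4": 13,
}
groups = {name: name.split("_")[0] for name in positions}
dissimilarity = {
    (a, b): abs(positions[a] - positions[b]) for a, b in combinations(positions, 2)
}

observed_r, p_value = anosim_test(dissimilarity, groups)
print(observed_r, p_value)
```

Ranking dissimilarities from smallest to largest is equivalent to ranking similarities from largest to smallest, so the most similar pair still has rank 1.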
Phyc 2003 practice data set
[Figure: histogram of permuted R values; x axis: R, from -0.20 to 0.40; y axis: frequency; bar counts: 6, 88, 243, 232, 189, 109, 58, 35, 19, 10, 9, 1; observed R = 0.477 (0.48)]
P = 1/999 ≈ 0.001
Sites are different – why?
• We will use the SIMPER routine:
- Similarity Percentages
SIMPER indicates which species contribute most to the patterns that we see.
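The idea behind similarity percentages can be sketched in Python. This assumes the usual approach of splitting the Bray-Curtis dissimilarity of each between-site sample pair into per-species terms and averaging them; it omits the other columns a full SIMPER table reports. The species names and abundances are hypothetical.

```python
from itertools import product


def simper(group_a, group_b, species):
    """Rank species by their average contribution to between-group dissimilarity.

    Returns rows of (species, average contribution, percent of total), largest first.
    Assumes no pair of samples is empty in both groups.
    """
    totals = [0.0] * len(species)
    pairs = list(product(group_a, group_b))
    for sample_a, sample_b in pairs:
        denominator = sum(a + b for a, b in zip(sample_a, sample_b))
        for i, (a, b) in enumerate(zip(sample_a, sample_b)):
            # Share of this pair's Bray-Curtis dissimilarity due to species i.
            totals[i] += 100.0 * abs(a - b) / denominator
    averages = [total / len(pairs) for total in totals]
    overall = sum(averages)
    rows = [(name, avg, 100.0 * avg / overall) for name, avg in zip(species, averages)]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical abundances of four species at two sites, two samples each.
species = ["species_1", "species_2", "species_3", "species_4"]
exposed = [[10, 0, 5, 3], [8, 1, 6, 2]]
sheltered = [[0, 12, 1, 7], [1, 10, 0, 9]]

table = simper(exposed, sheltered, species)
for name, average, percent in table:
    print(name, round(average, 1), round(percent, 1))
```

Species at the top of the table differ most consistently between the two sites, so they are the first place to look when asking why the sites differ.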
Data analysis summary
[Diagram: samples x species matrix -> sample similarities -> nMDS -> are sites different? (ANOSIM) -> How? (SIMPER)]