Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 ›...

82
Analysis of Multivariate Ecological Data School on Recent Advances in Analysis of Multivariate Ecological Data 24-28 October 2016 Prof. Pierre Legendre Dr. Daniel Borcard Département de sciences biologiques Université de Montréal C.P. 6128, succursale Centre Ville Montréal QC H3C 3J7 Canada [email protected] [email protected]

Transcript of Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 ›...

Page 1: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

Analysis of Multivariate� Ecological Data

School on Recent Advances in Analysis of Multivariate Ecological Data24-28 October 2016

Prof. Pierre LegendreDr. Daniel Borcard

Département de sciences biologiquesUniversité de Montréal

C.P. 6128, succursale Centre VilleMontréal QC

H3C 3J7 [email protected]@umontreal.ca

Page 2: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

Brain storming in Pierre Legendre's lab

Daniel Borcard Pierre Legendre Pedro Peres-Neto Stéphane Dray

Page 3: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

3

Day 1

Page 4: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

4

Introduction to data analysis

Day 1

Page 5: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

5

Main references

•  Legendre, P. & L. Legendre. 2012. Numerical ecology. 3rd English Edition. Elsevier, Amsterdam.

•  Borcard, D., F. Gillet & P. Legendre. 2011. Numerical Ecology with R. Springer, New York.

•  http://www.numericalecology.com/

Page 6: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

6

Introduction to data analysis

0 1 2 3 4 5

0.0

0.2

0.4

0.6

0.8

1.0

Distribution de F, 15 et 30 d.l.

F

Densité

# Import data into R spe <- read.csv("DoubsSpe.csv", row.names=1) env <- read.csv("DoubsEnv.csv", row.names=1) # Hellinger-transform the species dataset spe.hel <- decostand(spe, "hellinger") # Redundancy analysis (RDA) # ************************* # RDA of the Hellinger-transformed fish species data, constrained # by all the environmental variables contained in env2 (spe.rda <- rda(spe.hel ~ ., env)) # Observe the shortcut # formula summary(spe.rda) # Scaling 2 (default) # Canonical coefficients from the rda object coef(spe.rda) # R^2 retrieved from the rda object RsquareAdj(spe.rda)

−0.5 0.0 0.5 1.0

−0.5

0.0

0.5

RDA triplot − Scaling 1 − lc

RDA 1

RDA

2 ●

●● ●

●●

●●

● ●

●●

12

3

4

56

7

9

10

11 12 1314

15

16

17181920

21 22

23

24

25

2627

28

2930SatrPhphChna

Baba

AlbiRham

Legi

CycaAbbr

Gyce

Ruru

Blbj

Alal

Anan

alt

flopH

har

pho

nit

amm

oxy

bdo

● ●

●slolowslomoderate

slosteep

slovery_steep

Page 7: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

7

Software

R 3.3.1 for Windows, Mac OS or Linux

Page 8: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

8

Definitions

Numerical ecology “The field of quantitative ecology devoted to the numerical analysis of ecological data sets. (...) The purpose of numerical ecology is to describe and interpret the structure of data sets by combining a variety of numerical approaches. Numerical ecology differs from descriptive or inferential ecological statistics in that it combines relevant multidimensional statistical methods with heuristic techniques (e.g. cluster analysis) (...) ” (Legendre & Legendre, 2012).

Page 9: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

9

Definitions

Numerical ecologyLet us add that a great number of the methods used in numerical ecology, especially the new approaches developed since the 1980’s, have been developed by ecologists (and not pure statisticians!), in response to specific ecological problems.

Page 10: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

10

Definitions

Multivariate, multidimensional analysis: methods of numerical analysis addressing whole data tables where every observation, every sampling unit is characterised by several variables:

- species abundances

- climatic measures- and so on...

Page 11: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

11

The data

Table I - Structure of an ecological data table DescriptorsObjects Variable 1 Variable 2 Variable j Variable pObject 1 y11 y12 ... y1j ... y1pObject 2 y21 y22 ... y2j ... y2p....Object i yi1 yi2 ... yij ... yip...Object n yn1 yn2 ... ynj ... ynpThe objects are the observations (sites, relevés...).

Page 12: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

12

The data

.

Species

Objects

1

n

p 1

n

m

Environnementalvariables

Spatialvariables1

n

q

Descriptors

Page 13: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

13

Aims of multivariate analysis

v  Measurement of resemblance among objects or variables of a data table;

v  Clustering of the objects or variables according to these resemblances;

v  Ordination in a reduced space allowing to emphasise their main structures (especially gradients);v  Modelling of the relationships between response data tables and explanatory variables;

v  Tests of these relationships for statistical significance.

Page 14: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

14

Ordination in reduced space

Day 1

Page 15: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

15

Ordination in reduced space

1. Generalities

.

00

1

2

3

4

5

6

1 2 3 4 5

Object 1

Object 2

Object 3Object 4

Object 5 Object 6

Variable 1

Var

iabl

e 2

Var.1 Var.2

Obj.1Obj.2Obj.3Obj.4Obj.5Obj.6

1 54 43 31 32 24 1

Page 16: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

16

Ordination in reduced space

2. Principal component analysis (PCA)

.

00

1

2

3

4

5

6

1 2 3 4 5

Object 1

Object 2Object 3

Object 4

Object 5

Object 6

Variable 1

Var

iabl

e 2

Axis PCA 1

Axis PC

A 2Object 1

Object 4 Object 5Object 6

Object 3

Object 2

Axis PCA 1

Axi

s PCA

2

Page 17: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

17

Ordination in reduced space

2. Principal component analysis (PCA)

..

Object 1

Object 4 Object 5

Object 3

Object 2

Axis PCA 1Object 6

Variable 1

Variable 2A

xis P

CA 2

λ1 = 0.705

λ 2 =

0.2

95

Page 18: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

18

Ordination in reduced space

2. Principal component analysis (PCA)

One can conduct a PCA and display its results in different ways:Covariance or correlation?

PCA on the raw(centred) variables (= PCA on a covariance matrix) is only appropriate when the variables are dimensionally homogeneous.

PCA on a correlation matrix: when the variables are not dimensionally homogeneous or when one wants all variables to have equal weights in the analysis.

Page 19: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

19

Ordination in reduced space

2. Principal component analysis (PCA)

One can conduct a PCA and display its results in different ways:

Scaling:

PCA Scaling 1 = Distance biplot: the eigenvectors are scaled to unit length; the main properties of the biplot are the following:

(1)  Distances among objects in the biplot are approximations of their Euclidean distances in multidimensional space.

(2)   Projecting an object at right angle on a descriptor approximates the position of the object along that descriptor.

Page 20: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

20

Ordination in reduced space

2. Principal component analysis (PCA)

One can conduct a PCA and display its results in different ways:

Scaling:

PCA Scaling 1 = Distance biplot: the eigenvectors are scaled to unit length; the main properties of the biplot are the following:

(3)  Since descriptors have length 1 in the full-dimensional space, the length of the projection of a descriptor in the reduced space indicates how much it contributes to that space.

(4) The angles among descriptor vectors do not reflect their correlations.

Page 21: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

PCA Scaling 1 = Distance biplot: the eigenvectors are scaled to unit length; the main properties of the biplot are the following: (1)   Distances among objects in the biplot are approximations of their Euclidean distances in multidimensional space. (2)  Projecting an object at right angle on a descriptor approximates the position of the object along that descriptor. (3)  Since descriptors have length 1 in the full-dimensional space, the length of the projection of a descriptor in the reduced space

indicates how much it contributes to that space. (4)  The angles among descriptor vectors do not reflect their correlations.

Page 22: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

22

Ordination in reduced space

2. Principal component analysis (PCA)

PCA Scaling 2 = correlation biplot: the eigenvectors are scaled to the square root of their eigenvalue.

(1) Distances among objects in the biplot are not approximations

of their Euclidean distances in multidimensional space. (2)   Projecting an object at right angle on a descriptor

approximates the position of the object along that descriptor.

Page 23: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

23

Ordination in reduced space

2. Principal component analysis (PCA)

(3)   (3a) PCA on a covariance matrix: descriptors have length sj (= their standard deviation) in the full-dimensional space; therefore, the length of the projection of a descriptor in the reduced space is an approximation of its standard deviation.

(4)   (3b) PCA on a correlation matrix: all the descriptors have unit variance (s = 1); the length of the projection of a descriptor in the reduced space reflects its contribution to that space.

(5)  The angles between descriptors in the biplot reflect their correlations.

Page 24: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

PCA: scaling 2

Observe thethree groups ofhighly correlatedspecies.

Page 25: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

25

Ordination in reduced space

2. Principal component analysis (PCA) - in summary

One can conduct a PCA and display its results in different ways:

Scaling:

If the main interest of the analysis is to interpret the relationships among objects, choose scaling 1.

If the main interest focuses on the relationships among descriptors, choose scaling 2.

Page 26: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

26

Ordination in reduced space

2. Principal component analysis (PCA)

Equilibrium contribution circle

•  In scaling 1.•  Equilibrium contribution: length that a descriptor-vector

would have if it contributed equally to all the dimensions (principal axes) of the PCA.

Page 27: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

27

Ordination in reduced space

2. Principal component analysis (PCA)

Equilibrium contribution circle

•  The circle has a radius equal to √(d/p), where d equals the number of dimensions of the reduced space considered (usually d=2) and p equals the total number of descriptors (and hence of principal components) in the analysis.

Page 28: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

PCA: Equilibrium contribution circle

-5 0 5

-6-4

-20

24

6

PCA biplot - Scaling 1

PCA 1

PC

A 2

1

234

56

7

910

1112131415

16 171819202122

23

24

25

26

27

282930

dfs

ele

slo

dis

pH

har

pho

nit

amm

oxy

bod

-2 0 2 4

-3-2

-10

12

3

PCA biplot - Scaling 2

PCA 1

PC

A 2

1

2

34

56

7

910

1112

131415

16 171819202122

23

24

25

26

27

28

29

30 dfs

eleslo

dis

pH

har

pho

nit

amm

oxy

bod

Page 29: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

29

Ordination in reduced space

2. Principal component analysis (PCA)

Interpretation of meaningful components

Not all ordination axes are meaningful. Which ones are?1.  Arbitrary decision (ex.: interpretation of axes representing

75% variance)2.  Interpretation of axes whose eigenvalues are larger than the

average of all eigenvalues (Kaiser-Guttman criterion)

Page 30: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

30

Ordination in reduced space

2. Principal component analysis (PCA)

Interpretation of meaningful components

Not all ordination axes are meaningful. Which ones are?3.  Broken stick model: comparison of eigenvalues to a model

predicting the (decreasing) sizes of the pieces of a stick broken in as many pieces as there are eigenvalues.1.  Comparison of individual eigenvalues;2.  Comparison of cumulative eigenvalues.

Page 31: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

31

Ordination in reduced space

2. Principal component analysis (PCA)

Uses and misuses of PCA

Main applications in ecology: ordination of sites on the basis of quantitative environmental variables (physical and chemical descriptors), not raw (untransformed) species abundances in most cases (see Chapter 2, double zero problem). However, PCA is now very often applied to species data that have undergone a pre-transformation following Legendre and Gallagher (2001). See slides 36 and following.

Legendre, P. & E. D. Gallagher. 2001. Ecologically meaningful transformations for ordination of species data. Oecologia 129: 271-280.

Page 32: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

32

Ordination in reduced space

2. Principal component analysis (PCA)

Uses and misuses of PCA

PCA has originally been defined for data with multinormal distributions. However, PCA is not very sensitive to departure from multinormality, as long as the distributions are not exaggerately skewed. If some are, then the first few PCA axes will display several objects with extreme values on these variables instead of main trends.

Page 33: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

33

Ordination in reduced space

2. Principal component analysis (PCA)

Uses and misuses of PCA

PCA is computed from a matrix of dispersion (linear covariances or correlations). This means that the method must be applied on a dispersion matrix among descriptors that are quantitative and for which valid estimations of covariances or correlations may be obtained.

Page 34: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

34

Ordination in reduced space2. Principal component analysis (PCA)

Uses and misuses of PCAThese conditions are violated in the following cases: (1) The data matrix must not be transposed, since covariances or correlations among objects are meaningless. (2) Covariances and correlations are defined for quantitative variables. However, PCA is very robust to variations in the precision of data.

Page 35: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

35

Ordination in reduced space

2. Principal component analysis (PCA)

Uses and misuses of PCAThese conditions are violated in the following cases: (3) With data sets with many zeros, PCA produce inadequate estimates of the site positions, unless the abundance data have been transformed in an ecologically appropriate way.(4) Avoid the (all too common) mistake of interpreting the relationships among variables with help of the proximities of the apices (points) of the vector-arrows instead of the angles (when appropriate) on the biplots!

Page 36: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

36

Transformation of species abundance data tables prior to

linear analyses

Day 1

Page 37: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

37

Data transformation

0. Preliminary remarks: general types of transformation for any kind of variables

Page 38: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

38

Data transformation

1. Make comparable descriptors that have been measured in different units:

Ranging: Standardization:

yi' =

yi − yminymax − ymin

yi' =

yiymax

yi' = zi =

yi − y sy

Example: model II regression, ranged major axis

Example: explanatory variables in RDA

Page 39: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

39

Data transformation

2. Normalize the data and stabilize their varianceSquare root transformation: y'i = √(yi+c):

.

0

10

20

30

0 2 4 6 8 10 12Nb. Syrphid Diptera

.5 1 1.5 2 2.5 3 3.5 4Square root(Nb. Syrphids)

Nb.

obs

erva

tion

hour

s

Example: species data with moderate asymmetryNote: constant c must be added only if there are negative values in the data, i.e., not in the case of species abundances.

Page 40: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

40

Data transformation

2. Normalize the data and stabilize their varianceLog transformation: y'i = ln(yi+c):

.

02468

10121416

0 .5 1 1.5 2 2.5 3 3.5 4 4.5ln(nb. ind.+1)

05

1015202530

0 10 20 30 40 50 60 70Nb. individuals

Nb.

of s

oil c

ores

Oppiella nova

Example: species data with more severe asymmetry

Page 41: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

41

Data transformation

3. Recode semi-quantitative variables as quantitativeExample: transformation of Braun-Blanquet's phytosociological scale into

quantitative values:

Page 42: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

42

Data transformation

4. Binary coding of nominal variables

1 qualitative descriptor 4 binary descriptors (dummy variables)

Modalities Classes (factors) Calcosol Brunisol Neoluvisol Calcisol

Calcosol 1 1 0 0 0

Brunisol 2 0 1 0 0

Neoluvisol 3 0 0 1 0

Calcisol 4 0 0 0 1 Translated from F. Gillet (Neuchâtel, Switzerland) course notes.

For k levels, (k – 1) dummy variables needed <-> (k – 1) d.f.

à later: see orthogonal Helmert contrasts

Page 43: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

43

Data transformation

5. Transformations to obtain ecologically meaningful relationships among sites while using linear techniques

Page 44: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

44

1. The double-zero problem

Community composition data sampled over variable environmental conditions, e.g. along long environmental gradients, typically contain many zero values because species are known to generally have unimodal distributions along environmental gradients and to be absent from sites far from their optimal living conditions.

The proportion of zeros is greater when the environmental conditions are more variable across the sampling sites.

Transformation of species data

Page 45: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

45

1. The double-zero problem

Transformation of species data

Brachysp Phthirsp Hoplcfpa Rhysardu Atrostri Protopsp Malacfeg Malacfpr Malacosp Eniominu 1 17 5 5 3 2 1 4 2 2 1 2 2 7 16 0 6 0 4 2 0 0 3 4 3 1 1 2 0 3 0 0 0 4 23 7 10 2 2 0 4 0 1 2 5 5 8 13 9 0 13 0 0 0 3 6 19 7 5 9 3 2 3 0 0 20 7 17 3 8 2 3 0 3 0 0 19 8 5 4 8 2 1 2 3 0 0 1 9 3 3 2 2 1 1 12 0 0 0 10 22 4 5 3 0 0 0 0 0 11

Page 46: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

46

1. The double-zero problem

The zero value in a matrix of species abundances (or presence-absence) is tricky to interpret:

The presence of a species at a given site (i.e., non-zero value) generally implies that this site provides a set of minimal conditions allowing the species to survive (the dimensions of its ecological niche).

Transformation of species data

Page 47: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

47

1. The double-zero problem

The zero value in a matrix of species abundances (or presence-absence) is tricky to interpret:

The absence of a species from a site may be due to a variety of causes: the species' niche may be occupied by a replacement species, or the absence of the species is due to adverse conditions on any of the important dimensions of its ecological niche, or the species has been missed because of a purely stochastic component of its spatial distribution.

Transformation of species data

Page 48: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

48

1. The double-zero problem

Double zero: absence of a given species from two sites.

A double zero cannot readily be interpreted as a resemblance between the two sites.

Transformation of species data

The methods of analysis of species data must take this problem into account.

Most methods of analysis of species data work by comparing the sites, explicitly or implicitly, on the basis of an association measure.

Page 49: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

49

1. The double-zero problem

• The association coefficients that consider the double zero as a resemblance (as any other value) are said to be symmetrical.

• The association coefficients that do not consider the double zero as a resemblance (as any other value) are said to be asymmetrical.

It is preferable to use asymmetrical coefficients when analysing species data

Transformation of species data

Page 50: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

50

2. Pre-transformation of species data

Since it is a linear method working in a Euclidean space, PCA is not adapted to raw species abundance data, since the zero is treated as any other value. The Euclidean distance imbedded in PCA is a symmetrical measure.

Therefore….

Transformation of species data

Page 51: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

51

2. Pre-transformation of species data

Legendre, P. & E. D. Gallagher. 2001. Ecologically meaningful transformations for ordination of species data. Oecologia 129: 271-280.

Pre-transform the species data in such a way that, after PCA, the distance preserved among objects is no more the Euclidean distance, but an ecologically meaningful one.

Transformation of species data

Page 52: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

52

Ordination in reduced space

2. Pre-transformation of species data

Chord distance (D3)

χ2 metric (D15)

χ2 distance (D16)

Distance between species profiles

Hellinger distance (D17)

yij' =

yij

yij2

j=1

p

yij' =

yijyi+ y+ j

yij' = y++

yijyi+ y+ j

yij' =

yijyi+

yij' =

yijyi+

Page 53: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

53

Ordination in reduced space

2. Pre-transformation of species data: illustrationThe species abundance paradox (Orlóci, 1978)

Transformations for community composition data 329

Figure 7.8 Species abundance paradox data, modified from Orlóci (1978). The paradox is that the

Euclidean distance between sites 2 and 3, which have no species in common, is smaller than that

between sites 1 and 2 which share species 2 and 3. This results in an incorrect assessment of the

ecological relationships among sites. With the other coefficients in this figure, which are

asymmetrical, the distance between sites 2 and 3 is larger than that between sites 1 and 2, and

the distance between sites 1 and 3 is the same as between sites 2 and 3, or very nearly so.

Distance matrix D15 (not shown) is equal to D16/ = D16/ .y++

15

In these data, the Euclidean distance between sites 2 and 3 (no species in common) is smaller than the distance between sites 1 and 2.

Transformations for community composition data 329

Figure 7.8 Species abundance paradox data, modified from Orlóci (1978). The paradox is that the

Euclidean distance between sites 2 and 3, which have no species in common, is smaller than that

between sites 1 and 2 which share species 2 and 3. This results in an incorrect assessment of the

ecological relationships among sites. With the other coefficients in this figure, which are

asymmetrical, the distance between sites 2 and 3 is larger than that between sites 1 and 2, and

the distance between sites 1 and 3 is the same as between sites 2 and 3, or very nearly so.

Distance matrix D15 (not shown) is equal to D16/ = D16/ .y++

15

Site 1 Site 2 Site 3Site 1Site 2Site 3

Page 54: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

54

Ordination in reduced space

3. Pre-transformation of species data: illustrationThe species abundance paradox (Orlóci, 1978)

Transformations for community composition data 329

Figure 7.8 Species abundance paradox data, modified from Orlóci (1978). The paradox is that the

Euclidean distance between sites 2 and 3, which have no species in common, is smaller than that

between sites 1 and 2 which share species 2 and 3. This results in an incorrect assessment of the

ecological relationships among sites. With the other coefficients in this figure, which are

asymmetrical, the distance between sites 2 and 3 is larger than that between sites 1 and 2, and

the distance between sites 1 and 3 is the same as between sites 2 and 3, or very nearly so.

Distance matrix D15 (not shown) is equal to D16/ = D16/ .y++

15

Chord

Species profiles

Hellinger

χ2 distance

Euclidean

Page 55: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

55

2. Pre-transformation of species data

The chord transformation consists in transforming the species abundances from each object into a vector of length 1.

Transformation of species data

y.chord <- decostand(y, method="normalize")

If one transforms a data table with this formula, any method that preserves the Euclidean distance among the objects (e.g., PCA, RDA, ANOVA) is now accessible for species abundance data.

Page 56: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

56

2. Pre-transformation of species data

The Hellinger transformation consists in taking the squareroot of the relative species abundances.

Transformation of species data

y.hel <- decostand(y, method="hellinger")

If one transforms a data table with this formula, any method that preserves the Euclidean distance among the objects (e.g., PCA, RDA, ANOVA) is now accessible for species abundance data.

Page 57: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

57

2. Pre-transformation of species data

The chord and Hellinger transformations appear to be the best for general use.

They can both be applied to presence-absence data as well, with identical results.

Transformation of species data

Page 58: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

58

Correspondence analysis (CA)

Day 1

Page 59: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

59

Ordination in reduced space

4. Correspondence analysis (CA)

CA is actually a PCA on a species data table that has been transformed into a table of Pearson's χ2 statistic (the one used in contingeny tables).

The χ2 distance (D16) is preserved among sites or species.

CA is a method adapted to the analysis of species abundance data.

Page 60: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

60

Ordination in reduced space

4. Correspondence analysis (CA)

The data must be dimensionally homogeneous and 0 or positive.CA produces one axis less than min[n,p].The variation of the data is measured by a quantity named inertia.Inertia is partitioned among ordination axes as in PCA. The weight

of each axis is also measured by an eigenvalue.

The objects and the species are represented as points on the same joint plot.

Page 61: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

61

Ordination in reduced space

4. Correspondence analysis (CA)

Scalings:

CA scaling type 1: rows (objects) are at the centroids of columns.

The most appropriate if one is primarily interested in the

ordination of objects (sites). In the multidimensional space, χ2 distance is preserved among

objects.

Page 62: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

62

Ordination in reduced space

4. Correspondence analysis (CA)

CA scaling type 1: interpretation:(1)  The distances among objects in the reduced space

approximate their χ2 distance. Thus, object points that are close to one another are likely to be relatively similar in their species relative frequencies.

(2)  Any object found near the point representing a species is likely to have a high contribution of that species. For presence-absence data, the object is more likely to posses the state "1" for that species.

Page 63: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

63

Ordination in reduced space

4. Correspondence analysis (CA)Scalings:

CA scaling type 1

..

Species 1 Species 3

Species 2

Object 1

Object 2Object 4

Object 3

Object 6 Object 5

Axis CA 1

Axi

s CA

2

λ1 = 0.2295

λ 2 =

0.0

857

Université Laval Analyse multivariable - mars-avril 2008 13

Dr. Daniel Borcard Université de Montréal

(2) Any object found near the point representing a species is likelyto have a high contribution of that species. For presence-absencedata, the object is more likely to possess the state "1" for thatspecies.

• CA scaling type 2: columns are at the centroids of rows. This scalingis the most appropriate if one is primarily interested in the ordination ofspecies. In the multidimensional space, χ2 distance is preserved amongspecies. Interpretation:

(1) The distances among species in the reduced space approximatetheir χ2 distance. Thus, species points that are close to one anotherare likely to have relatively similar relative frequencies in theobjects.(2) Any species that lies close to the point representing an object ismore likely to be found in that object, or to have a higherfrequency there than in objects that are further away in the jointplot.

The following example (Table VII) will be submitted to acorrespondence analysis:

Table VII - Artificial data for CA

Spec.1

Obj.1Obj.2Obj.3Obj.4Obj.5Obj.6

1 54 43 31 32 24 1

260540

Spec.3Spec.2

Page 64: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

64

4. Ordination in reduced space

-2 -1 0 1 2

-2-1

01

2

Oribatid mites, Lac Geai

CA1

CA2 Brachy

PHTH

HPAV

RARD

SSTR

ProtoplMEGR

MPRO

TVIE

HMIN

HMIN2

NPRA

TVEL

ONOV

SUCT

LCIL

Oribatl1

Ceratoz1

PWIL

Galumna1

Stgncrs2

HRUF

Trhypch1

PPEL

NCOR

SLAT

FSETLepidzts

Eupelops

Miniglmn

LRUGPLAG2Ceratoz3

Oppiminu

Trimalc2

123

45678

9

10 111213

14

15

161718

19

20

212223

24

252627

28

29 30 31

32

33

34

3536

37

38

39

40

41

42

4344

45

46

47

48

49

50

5152

53

5455

56

57

58

59

60

6162

63

64

65

66

67

6869

70

CA scaling type 1

library(vegan) data(mite) mite.ca <- cca(mite) plot(mite.ca, scaling=1, main="Oribatid mites, Lac Geai")

Page 65: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

65

Ordination in reduced space

4. Correspondence analysis (CA)

Scalings:

CA scaling type 2: columns (species) are at the centroids of rows.

The most appropriate if one is primarily interested in the

ordination of species. In the multidimensional space, χ2 distance is preserved among

variables (species).

Page 66: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

66

Ordination in reduced space

4. Correspondence analysis (CA)

CA scaling type 2: interpretation:(1)  The distances among species in the reduced space

approximate their χ2 distance. Thus, species points that are close to one another are likely to have relatively similar relative frequencies in the objects.

(2) Any species that lies close to the point representing an object is more likely to be found in that object, or to have a higher frequency there than in objects that are further away in the joint plot.

Page 67: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

67

Ordination in reduced space

4. Correspondence analysis (CA)

Scalings:

CA scaling type 2

..

Species 1

Species 2

Object 1

Object 2Object 4

Object 3

Object 6 Object 5

Axis CA 1

Axi

s CA

2

λ1 = 0.2295

λ 2 =

0.0

857

Species 3

Université Laval Analyse multivariable - mars-avril 2008 13

Dr. Daniel Borcard Université de Montréal

(2) Any object found near the point representing a species is likelyto have a high contribution of that species. For presence-absencedata, the object is more likely to possess the state "1" for thatspecies.

• CA scaling type 2: columns are at the centroids of rows. This scalingis the most appropriate if one is primarily interested in the ordination ofspecies. In the multidimensional space, χ2 distance is preserved amongspecies. Interpretation:

(1) The distances among species in the reduced space approximatetheir χ2 distance. Thus, species points that are close to one anotherare likely to have relatively similar relative frequencies in theobjects.(2) Any species that lies close to the point representing an object ismore likely to be found in that object, or to have a higherfrequency there than in objects that are further away in the jointplot.

The following example (Table VII) will be submitted to acorrespondence analysis:

Table VII - Artificial data for CA

Spec.1

Obj.1Obj.2Obj.3Obj.4Obj.5Obj.6

1 54 43 31 32 24 1

260540

Spec.3Spec.2

Page 68: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

68

4. Ordination in reduced space

-2 -1 0 1 2

-2-1

01

Oribatid mites, Lac Geai

CA1

CA2

Brachy

PHTH

HPAV

RARD

SSTR

Protopl MEGR

MPRO

TVIE

HMIN

HMIN2

NPRA

TVEL

ONOV

SUCT

LCIL

Oribatl1

Ceratoz1

PWIL

Galumna1

Stgncrs2

HRUFTrhypch1

PPEL

NCOR

SLATFSETLepidzts

Eupelops

Miniglmn

LRUGPLAG2Ceratoz3

Oppiminu

Trimalc2

12

3

4

5

6

7

8

9

10 11

1213

14

15

1617

18

19

20

2122

23

24

2526

27

28

2930 31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

6162

63

64

65

66

67

68

69

70

CA scaling type 2library(vegan) data(mite) mite.ca <- cca(mite) plot(mite.ca, scaling=2, main="Oribatid mites, Lac Geai")

Page 69: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

69

Ordination in reduced space

4. Correspondence analysis (CA)

Words of caution

•  Graphically, correspondence analysis seems very sensitive to rare species, which take extreme values in plots.

•  Rare species have a small influence on the extraction of the principal axes, however.

Page 70: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

70

Ordination in reduced space

4. Correspondence analysis (CA)

Words of caution

•  Strong gradients generate arch effects:Environmental gradients often support a succession of species that tend to have unimodal distributions.At the ends of the gradients, series are distorted.Distortion occurs also in PCA (horseshoe effects).

Page 71: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

71

Ordination in reduced space

4. Correspondence analysis (CA)

Words of cautionSpecies packing model:

CA approximates a Gaussian ordination.

Position of species along axes approximate their ecological optimum. 0

5

10

15

20

1 7 13 19 25 31 37 43 49 55 61 67 73 79 85 91 97

Page 72: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

72

Ordination in reduced space

Arch and horseshoe effects

-1.41 -1.34

-0.70

-0.66

0.00

0.02

0.70

0.71

1.41

1.39

-14.63-17.31

-7.31

-10.15

0.00

-2.99

7.32

4.18

14.63

11.34

CA: arch effect PCA: horseshoe effect

A B

Page 73: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

73

Ordination in reduced space

4. Correspondence analysis (CA)

Detrended correspondence analysis (DCA):

Detrending by segments:

Axis I is divided into an arbitrary number of segments.

Within each segment, the mean of the object scores along axis 2 is made equal to zero.

This methods has been strongly rejected by many authors.

The scores on the second axis are essentially meaningless.

Page 74: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

74

Principal coordinate analysis (PCoA)

Day 1

Page 75: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

75

Ordination in reduced space

5. Principal coordinate analysis (PCoA)

Allows to obtain a Euclidean representation of a set of objects whose relationships are measured by any similarity or dissimilarity coefficient chosen by the user.PCoA produces a set of orthogonal ordination axes whose importance is measured by eigenvalues (as in PCA and CA).

PCoA can only represent the relationships among objects (Q mode) or variables (R mode), but not both at the same time, unless scores for variables are computed after a Q-mode PCoA (using correlations or weighted averages).

Page 76: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

76

Ordination in reduced space

5. Principal coordinate analysis (PCoA)In the case of Euclidean association measures, PCoA will behave in a Euclidean manner.

If the association coefficient used is nonmetric, semimetric or has other problems of "non-Euclideanarity", then PCoA will react by producing several negative eigenvalues in addition to the positive ones.If one wants to retain all the variance of the original association matrix, there are technical solutions to this problem (e.g. taking the square root of the dissimilarities, or else apply Lingoes or Caillez correction).

Page 77: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

77

Ordination in reduced space

5. Principal coordinate analysis (PCoA)

..

-0.30 -0.15

-0.10

-0.05

0.10

0.05

0.30

0.15

0.50

0.25

Object 5

Object 1

Object 2

Object 4

Object 3

Object 6

Axi

s PCo

A 2

λ 2 =

0.0

95

Axis PCoA 1λ1 = 0.359

1 × 2 axis of a PCoA on a D14 (percentage difference, aka Bray-Curtis) matrix of the 6 × 3 data used in the first CA example. The two first axes have eigenvalues of 0.359 and 0.095; they represent 59.5% and 20.2% variance respectively. This PCoA gives 4 positive, one zero and one negative eigenvalue.

Page 78: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

78

-0.4 -0.2 0.0 0.2 0.4

-0.4

-0.2

0.0

0.2

0.4

Oribatid mites, D14, PCoA

Dim1

Dim2

1

234

5 6

78

9

10

11

1213

1415

161718

19

20

21

22

23

24

25

26

27

28

2930

31

32

33

34

35

36

37

38

39

40

41

42

43

44

4546

47

48

49

50

51

52

53

5455

56

57

58

59

60

61

62

6364

65

66

67

68 69

70

Brachy

PHTH

HPAV

RARD

SSTR

Protopl

MEGR

MPRO

TVIEHMIN

HMIN2

NPRA

TVELONOV

SUCT

LCIL

Oribatl1

Ceratoz1

PWILGalumna1

Stgncrs2

HRUFTrhypch1

PPEL

NCOR

SLATFSETLepidzts

Eupelops

Miniglmn

LRUG PLAG2Ceratoz3Oppiminu

Trimalc2

mite.bray <- vegdist(mite, "bray") mite.PCoA <- cmdscale(mite.bray, k=ncol(mite)-1, eig=TRUE) ordiplot(scores(mite.PCoA)[,c(1,2)], type="t", main="Oribatid mites, D14, PCoA") abline(h=0, lty=3) abline(v=0, lty=3) sp.wa <- wascores(mite.PCoA$points[,1:2], mite) text(sp.wa, rownames(sp.wa), cex=0.7, col="red")

Species scores have been computed after the PCoA analysis by weighted averages of site scores

Page 79: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

79

PCoA on the Doubs fish data. Scaling 1.Species scores have been computed after the PCoA analysis by correlation with the ordination axes, as in PCA.

-0.6 -0.4 -0.2 0.0 0.2 0.4 0.6

-0.6

-0.4

-0.2

0.0

0.2

0.4

0.6

Axis.1

Axis.2

1

2

3

45

6

7

8

9

10

1112

13

14

15

1617

18

19

20 2122

23

24

25

26

2728

29

30

-0.4 -0.2 0.0 0.2 0.4

-0.4

-0.2

0.0

0.2

0.4

Cogo

Satr

Phph

Babl

Thth

Teso

Chna

PatoLele

Sqce

Baba

Albi

Gogo

Eslu

Pefl

Rham

Legi

Scer

Cyca

Titi

Abbr

Icme

Gyce

Ruru

Blbj

Alal

Anan

PCoA biplotResponse variables projected

as in PCA with scaling 1

Page 80: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

80

Ordination in reduced space

5. Principal coordinate analysis (PCoA)

Sometimes used as an intermediate, technical step in more complex analyses.Examples:•  Raw variables neither quantitative nor orthogonal ==> PCoA

==> one obtains a set of quantitative and orthogonal variables (the PCoA axes)!

•  Spatial analysis using dbMEM (distance-based Moran's eigenvector maps à Day 5

Page 81: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

81

Ordination in reduced space

X =Explanatory

variables

Y = Raw data

(sites x species)

Short gradients: CCA or RDALong gradients: CCA

Constrained ordination of species data(d) Classical approach

(e) Transformation-based approach (tb-RDA)

(f) Distance-based approach (db-RDA)

Raw data

(sites x species)

X =Explanatory

variables

Y=Transformeddata

(sites x species)

RDA

Raw data

(sites x species)Distancematrix

X =Explanatory

variables

Y =(sites x

principal coord.)

PCoA

RDA

Representation of elements:Species = arrowsSites = symbolsExplanatory variables = lines

Canonicalordination triplot

Unconstrained ordination of species data(a) Classical approach

(b) Transformation-based approach (tb-PCA)

Y = Raw data

(sites x species)

Short gradients: CA or PCALong gradients: CA

Raw data

(sites x species)

Y=Transformeddata

(sites x species)

PCA

Ordination biplot

Representation of elements:Species = arrowsSites = symbols

Raw data

(sites x species)Distancematrix

PCoA

(c) Distance-based approach (PCoA)Ordination of sites

Representation of elements:Sites = symbols

Page 82: Analysis of Multivariate Ecological Databiol09.biol.umontreal.ca › Trieste16 › ICTP_Theory_DB_2016_Day1.pdf · Numerical ecology Let us add that a great number of the methods

82

Ordination in reduced spaceX =Explanatory

variables

Y = Raw data

(sites x species)

Short gradients: CCA or RDALong gradients: CCA

Constrained ordination of species data(d) Classical approach

(e) Transformation-based approach (tb-RDA)

(f) Distance-based approach (db-RDA)

Raw data

(sites x species)

X =Explanatory

variables

Y=Transformeddata

(sites x species)

RDA

Raw data

(sites x species)Distancematrix

X =Explanatory

variables

Y =(sites x

principal coord.)

PCoA

RDA

Representation of elements:Species = arrowsSites = symbolsExplanatory variables = lines

Canonicalordination triplot

Unconstrained ordination of species data(a) Classical approach

(b) Transformation-based approach (tb-PCA)

Y = Raw data

(sites x species)

Short gradients: CA or PCALong gradients: CA

Raw data

(sites x species)

Y=Transformeddata

(sites x species)

PCA

Ordination biplot

Representation of elements:Species = arrowsSites = symbols

Raw data

(sites x species)Distancematrix

PCoA

(c) Distance-based approach (PCoA)Ordination of sites

Representation of elements:Sites = symbols