Permalink: http://bit.ly/anadatr Author: David Zelený
Correspondence analysis (CA, previously also known as reciprocal averaging and by several other names) is a unimodal unconstrained ordination method. In the space of all ordination axes, it preserves chi-square distances among samples. The chi-square distance does not suffer from the double-zero problem, but some blame it for being too strongly influenced by rare species (which is perhaps not true, see below). The data must be non-negative abundances or presences-absences. Correspondence analysis often creates a strong arch artefact in ordination diagrams, caused by a nonlinear correlation between the first and higher axes. The arch can be removed by detrending, which is the basis of detrended correspondence analysis (DCA). The distribution of samples along the first (D)CA axis is used as the basis of the TWINSPAN classification algorithm.
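To make the chi-square distance concrete, here is a minimal sketch in Python with NumPy (the language, the function name `chi_square_distance` and the toy matrix are assumptions of this example, not part of the original text). It uses the common textbook form of the distance: differences between the relative species composition of two samples are squared and divided by the relative total abundance of each species.

```python
import numpy as np

def chi_square_distance(Y):
    """Pairwise chi-square distances among the rows (samples) of Y.

    Y is a samples x species matrix of non-negative abundances in which
    every row and every column has a non-zero sum.
    """
    Y = np.asarray(Y, dtype=float)
    # Row profiles: relative species composition of each sample.
    profiles = Y / Y.sum(axis=1, keepdims=True)
    # Relative total abundance of each species in the whole table.
    species_weights = Y.sum(axis=0) / Y.sum()
    # Differences between the profiles of every pair of samples.
    diff = profiles[:, None, :] - profiles[None, :, :]
    # Dividing by the species weight gives rare species more influence.
    return np.sqrt((diff ** 2 / species_weights).sum(axis=2))

# Hypothetical toy data: 3 samples x 3 species.
toy = np.array([[10, 5, 0],
                [20, 10, 0],
                [0, 3, 9]])
D = chi_square_distance(toy)
print(np.round(D, 3))
```

Samples 1 and 2 in the toy matrix have the same relative composition, so their distance is zero even though their total abundances differ. A species absent from both samples of a pair contributes a zero term to the sum for that pair.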
Although today's software uses matrix algebra to calculate CA, the original algorithm is based on reciprocal averaging of column and row scores. It starts from random values and, by iterative row and column averaging, converges to a unique solution, which represents the sample and species scores.
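It has the following five calculation steps:
1. Start with arbitrary initial sample scores.
2. Calculate species scores as averages of the sample scores, weighted by the abundances of the species in the samples.
3. Calculate new sample scores as averages of the species scores, weighted by the same abundances.
4. Rescale the sample scores to the original range.
5. Repeat from step 2 until the scores converge.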
After calculating the sample and species scores for the first axis, one can continue to the second and higher axes, while maintaining linear independence from all previously calculated axes.
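For comparison, the matrix-algebra route mentioned earlier yields all axes at once. The following is only a sketch under assumptions: it uses Python with NumPy and the common textbook formulation (singular value decomposition of the matrix of standardized residuals), which is not necessarily what any particular program implements. The small abundance matrix and the variable names are illustrative.

```python
import numpy as np

# Illustrative abundance matrix: 3 samples (rows) x 4 species (columns).
Y = np.array([[0, 5, 6, 8],
              [0, 2, 2, 1],
              [3, 1, 0, 0]], dtype=float)

P = Y / Y.sum()          # relative frequencies
r = P.sum(axis=1)        # sample (row) weights
c = P.sum(axis=0)        # species (column) weights

# Standardized residuals from the "no association" expectation r * c.
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

# SVD of the residual matrix; singular values come sorted, largest first.
U, sing, Vt = np.linalg.svd(S, full_matrices=False)

# Drop the trivial axis, whose singular value is numerically zero.
keep = sing > 1e-12
eigenvalues = sing[keep] ** 2

# Scores standardized by the row and column weights.
site_std = U[:, keep] / np.sqrt(r)[:, None]
species_std = Vt.T[:, keep] / np.sqrt(c)[:, None]

print("eigenvalues:", np.round(eigenvalues, 4))
print("sample scores, axis 1:", np.round(site_std[:, 0], 3))
```

In this formulation the axes are uncorrelated when the samples are weighted by their totals, which corresponds to the linear independence of successive axes described above.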
The following table (modified Table 45 from Šmilauer & Lepš 2014) shows a simple example of how to calculate sample and species scores:
           Cirsium  Glechoma  Rubus  Urtica  initial score  x.WA1  x.WA1resc  x.WA2
Sample 1   0        5         6      8       0              1.095  0          0.41
Sample 2   0        2         2      1       4              1.389  0.422      0.594
Sample 3   3        1         0      0       10             8.063  10         7.839
u.WA1      10       2.25      1      0.444
u.WA2      10       1.355     0.105  0.047
u.WA3      10       1.312     0.062  0.028
u.WA4      10       1.31      0.06   0.027
Calculation steps:
1. Initial scores (0, 4, and 10)
2. Species scores:
   u.WA1_{Cirsium} = (0*0 + 0*4 + 3*10)/(0 + 0 + 3) = 10
   u.WA1_{Glechoma} = (5*0 + 2*4 + 1*10)/(5 + 2 + 1) = 2.25
   u.WA1_{Rubus} = (6*0 + 2*4 + 0*10)/(6 + 2 + 0) = 1
   u.WA1_{Urtica} = (8*0 + 1*4 + 0*10)/(8 + 1 + 0) = 0.444
3. Sample scores:
   x.WA1_{Sample 1} = (0*10 + 5*2.25 + 6*1 + 8*0.444)/(0 + 5 + 6 + 8) = 1.095
   x.WA1_{Sample 2} = (0*10 + 2*2.25 + 2*1 + 1*0.444)/(0 + 2 + 2 + 1) = 1.389
   x.WA1_{Sample 3} = (3*10 + 1*2.25 + 0*1 + 0*0.444)/(3 + 1 + 0 + 0) = 8.063
4. Rescale to the original range (0-10 here)
5. Continue from step 2 until the values converge.
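The worked example above can be reproduced with a short script. This is a minimal sketch in Python with NumPy (an assumption of this example; the function names are hypothetical), using the abundances and initial scores from the table. It illustrates the iteration for the first axis only.

```python
import numpy as np

# Abundances from the table: rows = samples 1-3,
# columns = Cirsium, Glechoma, Rubus, Urtica.
Y = np.array([[0, 5, 6, 8],
              [0, 2, 2, 1],
              [3, 1, 0, 0]], dtype=float)

def species_scores(Y, x):
    """Step 2: species scores as abundance-weighted averages of sample scores."""
    return (Y.T @ x) / Y.sum(axis=0)

def sample_scores(Y, u):
    """Step 3: sample scores as abundance-weighted averages of species scores."""
    return (Y @ u) / Y.sum(axis=1)

def rescale(x, low=0.0, high=10.0):
    """Step 4: stretch the scores back to the range low..high."""
    return low + (x - x.min()) / (x.max() - x.min()) * (high - low)

def reciprocal_averaging(Y, x0, tol=1e-10, max_iter=1000):
    """Step 5: repeat steps 2-4 until the sample scores stop changing.

    The initial scores x0 must not all be equal.
    """
    x = np.asarray(x0, dtype=float)
    for iteration in range(1, max_iter + 1):
        u = species_scores(Y, x)
        x_new = rescale(sample_scores(Y, u))
        if np.max(np.abs(x_new - x)) < tol:
            return x_new, species_scores(Y, x_new), iteration
        x = x_new
    raise RuntimeError("reciprocal averaging did not converge")

x0 = np.array([0.0, 4.0, 10.0])   # step 1: initial scores from the table
u1 = species_scores(Y, x0)        # row u.WA1: 10, 2.25, 1, 0.444
x1 = sample_scores(Y, u1)         # column x.WA1: 1.095, 1.389, 8.063
x1_resc = rescale(x1)             # column x.WA1resc: 0, 0.422, 10

x_final, u_final, n_iter = reciprocal_averaging(Y, x0)
print("first pass:", np.round(u1, 3), np.round(x1, 3), np.round(x1_resc, 3))
print("converged after", n_iter, "iterations")
print("sample scores:", np.round(x_final, 3))
print("species scores:", np.round(u_final, 3))
```

The converged species scores should be close to the last row of the table (u.WA4), because the values in this example change very little after the third pass.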
An important property of this algorithm is that it does not depend on the arbitrary choice of initial scores, as can be seen in Fig. 1 (in the example table above, the initial scores were preselected so that convergence is faster; with random values, convergence still occurs but takes more iterations).
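This independence from the starting values can be checked numerically. The sketch below (Python with NumPy; the function name, the random seed and the number of starts are assumptions of this example) runs the iteration from several random initial score vectors on the abundances of the example table and compares the results. Because the rescaling here uses only the minimum and maximum, a start may converge to the mirror image of the axis; the sketch flips such runs before comparing, on the assumption that the direction of an ordination axis carries no meaning.

```python
import numpy as np

# Abundances of the example table: 3 samples (rows) x 4 species (columns).
Y = np.array([[0, 5, 6, 8],
              [0, 2, 2, 1],
              [3, 1, 0, 0]], dtype=float)

def ra_first_axis(Y, x0, tol=1e-12, max_iter=10_000):
    """Reciprocal averaging for the first axis, rescaled to 0-10 each pass."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        u = (Y.T @ x) / Y.sum(axis=0)        # species scores
        x_new = (Y @ u) / Y.sum(axis=1)      # sample scores
        x_new = 10 * (x_new - x_new.min()) / (x_new.max() - x_new.min())
        if np.max(np.abs(x_new - x)) < tol:
            break
        x = x_new
    return x_new

def orient(x):
    """Flip the axis if needed so that sample 1 gets the lower score."""
    return 10 - x if x[0] > x[-1] else x

reference = orient(ra_first_axis(Y, [0.0, 4.0, 10.0]))

rng = np.random.default_rng(42)
starts = rng.uniform(0, 10, size=(5, 3))     # five random initial score vectors
results = [orient(ra_first_axis(Y, s)) for s in starts]

for start, result in zip(starts, results):
    print(np.round(start, 2), "->", np.round(result, 3))
```

Each row of the output pairs one random start with the scores it converges to.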
The CA algorithm has, however, two unpleasant properties: it produces a more or less pronounced arch artefact, and it compresses the samples at the ends of the first axis relative to the middle (see the example in Fig. 2).
A detrended version of correspondence analysis (DCA) attempts to remove the arch effect from the ordination (Fig. 3). The method was (and still is) very popular, especially among vegetation ecologists, because it often gives a rather meaningful distribution of samples in ordination diagrams. Additionally, it has one interesting property: the length of the first axis (in SD units) reflects the heterogeneity or homogeneity of the dataset, and can be used to decide whether the data should be analysed by linear (axis shorter than 3 SD) or unimodal (axis longer than 4 SD) ordination methods. However, detrending (by segments) resembles using a hammer on the data: the arch is hammered flat by cutting the first axis into segments and moving the sample points up and down along the second axis. For this and other reasons, the method is criticized and not recommended by some researchers (see e.g. Legendre & Legendre 1998, Borcard et al. 2011, or Jari Oksanen), while it is defended by others (e.g. ter Braak & Šmilauer 2015).
An alternative solution, proposed by ter Braak (1986), is to remove the arch artefact by applying linear constraints (explanatory variables), that is, by calculating canonical correspondence analysis (CCA).
In CA, both objects and species are represented by points in the ordination diagram (compare to PCA, where species/descriptors are vectors and sites are points). Similarly to PCA, two types of scaling are available (Borcard et al. 2011):