Theory, R functions & Examples
The key unit in the analysis of community ecology data sets is the community sample (also called plot, sample, sampling unit or relevé), which records the presence/absence or quantity (count, cover or biomass) of each species. Such samples are handled via ecological resemblance, which can be quantified, for example, by the compositional dissimilarity between two community samples. Compositional dissimilarity describes the imaginary distance between two community samples in multidimensional compositional space: two samples with exactly the same species composition occupy exactly the same spot in this space, and the distance between samples increases as their species composition becomes more dissimilar. Ecological resemblance and multidimensional compositional space are the two main concepts you need to understand before you turn to learning multivariate methods. Most ordination and classification methods are based on some compositional dissimilarity measure, although in some of them the measure itself is not explicitly mentioned.
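To make the idea of compositional dissimilarity concrete, here is a minimal base R sketch that calculates Bray-Curtis dissimilarity (one of many possible measures, chosen here only for illustration) between all pairs of samples. The matrix comm and the helper bray_curtis are hypothetical names invented for this example, and the abundances are made up.

```r
# Hypothetical toy data: 3 samples (rows) x 4 species (columns), abundances as counts
comm <- matrix(c(10, 5, 0, 0,
                 10, 5, 0, 0,
                  0, 0, 8, 2),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("s1", "s2", "s3"),
                               c("spA", "spB", "spC", "spD")))

# Bray-Curtis dissimilarity between two abundance vectors:
# sum of absolute differences divided by the total abundance of both samples
bray_curtis <- function(x, y) sum(abs(x - y)) / sum(x + y)

# Fill a samples x samples matrix with all pairwise dissimilarities
n <- nrow(comm)
d <- matrix(0, n, n, dimnames = list(rownames(comm), rownames(comm)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- bray_curtis(comm[i, ], comm[j, ])
  }
}

# s1 vs s2 = 0 (identical composition), s1 vs s3 = 1 (no shared species)
d_bc <- as.dist(d)
print(d_bc)
```

With the vegan package installed, vegdist(comm, method = "bray") should return the same values; the explicit loop is used here only to show what the measure does.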
Ordination is a way to bring order to a set of community samples and to reduce the multidimensional information stored in community data to a few imaginable, interpretable and printable dimensions. There are many ordination methods, and different fields (botany, zoology, microbiology) prefer different ones. Ordination focuses on finding interpretable trends in data, represented by changes in species composition that may reflect underlying environmental gradients. We may use it either to describe community patterns (usually the purpose of unconstrained = indirect ordination) or to explain and test changes in species composition by some (e.g. environmental, spatial, temporal) variables (constrained = direct ordination).
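As an illustration of how an unconstrained ordination reduces dimensionality, the following base R sketch runs a principal component analysis on Hellinger-transformed abundances. This is only one of many possible approaches; the toy matrix comm and the object names hel, pca, scores and explained are assumptions made for this example.

```r
# Hypothetical toy data: 5 samples along a gradient, 4 species
comm <- matrix(c(10, 5, 0, 0,
                  8, 6, 1, 0,
                  4, 6, 4, 1,
                  1, 3, 7, 5,
                  0, 1, 6, 9),
               nrow = 5, byrow = TRUE,
               dimnames = list(paste0("s", 1:5),
                               c("spA", "spB", "spC", "spD")))

# Hellinger transformation: square root of relative abundances per sample
hel <- sqrt(comm / rowSums(comm))

# PCA on the transformed data (centred, not scaled)
pca <- prcomp(hel)

# Sample scores on the first two ordination axes
scores <- pca$x[, 1:2]

# Proportion of total variance represented by each axis
explained <- pca$sdev^2 / sum(pca$sdev^2)

print(round(scores, 2))
print(round(explained[1:2], 2))
```

The four-dimensional species space is reduced to two axes that can be plotted, e.g. with plot(scores). In vegan, a comparable analysis could be run with rda(decostand(comm, "hellinger")).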
Numerical classification is conceptually the counterpart of ordination (Fig. 1): while ordination seeks the main gradients in the continuum of community samples, classification tries to divide this continuum into a finite number of groups (clusters), each containing more or less similar samples.
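A minimal sketch of numerical classification in base R, assuming Bray-Curtis dissimilarity and average-linkage hierarchical clustering (both are illustrative choices, not the only options). The toy matrix comm and all object names are hypothetical.

```r
# Hypothetical toy data: 5 samples, 4 species, two compositionally distinct groups
comm <- matrix(c(10, 5, 0, 0,
                  8, 6, 1, 0,
                  9, 4, 0, 1,
                  0, 1, 7, 9,
                  1, 0, 6, 8),
               nrow = 5, byrow = TRUE,
               dimnames = list(paste0("s", 1:5),
                               c("spA", "spB", "spC", "spD")))

# Bray-Curtis dissimilarity: Manhattan distance between two samples
# divided by the sum of their total abundances
d_man <- as.matrix(dist(comm, method = "manhattan"))
tot   <- rowSums(comm)
d_bc  <- as.dist(d_man / outer(tot, tot, "+"))

# Agglomerative hierarchical clustering with average linkage
hc <- hclust(d_bc, method = "average")

# Cut the dendrogram into two groups
groups <- cutree(hc, k = 2)
print(groups)
```

Calling plot(hc) draws the dendrogram; the number of groups (k = 2 here) is chosen by the analyst.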
Analysis of species attributes becomes relevant when, in addition to the matrix of species composition (L matrix, sample × species) and the matrix of sample attributes (R matrix, e.g. with environmental variables, sample × sample attributes), we also have a third matrix containing species attributes, such as species traits or species indicator values (Q matrix, species × species attributes). There are several methods for relating species attributes (e.g. traits) to sample attributes (e.g. environmental variables). One is to calculate a community-weighted mean (CWM) of species attributes for individual samples and relate it to environmental variables by regression or correlation (the community-weighted mean approach); another is the fourth corner, a three-matrix method that is numerically closely related to the CWM approach. Other options include ordination analyses like CWM-RDA or RLQ.
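The community-weighted mean can be sketched in a few lines of base R. The objects L, trait and env below are hypothetical toy versions of the L, Q and R matrices (with a single trait and a single environmental variable), and the values are invented for illustration.

```r
# Hypothetical L matrix: 4 samples x 3 species (abundances)
L <- matrix(c(8, 2, 0,
              5, 5, 0,
              2, 5, 3,
              0, 2, 8),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("s", 1:4), c("spA", "spB", "spC")))

# Hypothetical Q matrix reduced to one trait value per species
trait <- c(spA = 10, spB = 20, spC = 30)

# Hypothetical R matrix reduced to one environmental variable per sample
env <- c(s1 = 1, s2 = 2, s3 = 3, s4 = 4)

# Community-weighted mean: trait values averaged within each sample,
# weighted by the relative abundance of each species
rel_abund <- L / rowSums(L)
cwm <- drop(rel_abund %*% trait)
print(cwm)

# Relate CWM to the environmental variable by correlation
r <- cor(cwm, env)
print(round(r, 3))
```

For this toy data set, cwm is 12, 15, 21 and 28 for samples s1 to s4. The sketch covers only the calculation of CWM and its correlation with the environment, not a test of significance.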
Diversity analysis is, in a certain sense, also an analysis of the species composition matrix, in which the originally multidimensional information stored in the sample × species matrix is reduced to one-dimensional variables (such as the number of species in a sample - alpha diversity, differences in species composition among samples - beta diversity, or the number of all species in the matrix - gamma diversity). Diversity is not only about the number of species, however, but also about their relative abundances; we will also briefly review the concepts of true diversity and evenness and their representation by different diversity indices. Diversity is also influenced by the sampled area (species-area curve) and by sampling effort (bias due to undersampling).
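The following base R sketch shows how the species composition matrix can be reduced to simple diversity measures. Expressing beta diversity as the ratio of gamma to mean alpha (Whittaker's multiplicative beta) and using the Shannon index are illustrative assumptions, as are the toy matrix comm and the helper shannon.

```r
# Hypothetical toy data: 3 samples x 4 species (abundances)
comm <- matrix(c(10, 10, 0, 0,
                  5,  5, 5, 5,
                 18,  1, 1, 0),
               nrow = 3, byrow = TRUE,
               dimnames = list(paste0("s", 1:3),
                               c("spA", "spB", "spC", "spD")))

# Alpha diversity: number of species present in each sample
alpha <- rowSums(comm > 0)

# Gamma diversity: number of species present in the whole matrix
gamma <- sum(colSums(comm) > 0)

# Beta diversity, here as Whittaker's multiplicative beta (gamma / mean alpha)
beta <- gamma / mean(alpha)

# Shannon index of one sample, based on relative abundances of present species
shannon <- function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
}
H <- apply(comm, 1, shannon)

# True diversity (effective number of species) and Pielou's evenness
true_div <- exp(H)
evenness <- H / log(alpha)

print(data.frame(alpha, H = round(H, 3),
                 true_div = round(true_div, 2),
                 evenness = round(evenness, 2)))
```

Samples s1 and s2 are perfectly even, so their effective number of species equals their species richness (2 and 4), while s3 has three species but is dominated by one, so its true diversity is well below 3. In vegan, specnumber(comm) and diversity(comm, index = "shannon") provide the richness and the Shannon index.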
This website is focused on the use of R for all analyses. For most ordination methods we will use the vegan package written by Jari Oksanen et al., with some functions also from ade4 by Stéphane Dray & Anne-Béatrice Dufour and labdsv by Dave Roberts. You may, however, also want to learn some other software options. For most ordination methods, an excellent and user-friendly solution is CANOCO 5 (Windows only), developed by Cajo ter Braak & Petr Šmilauer as an update of the popular CANOCO for Windows 4.5 (see the ad in Fig. 2, and check here if you want to know how CANOCO 5 compares with vegan from the point of view of the CANOCO authors). Although I prefer to do ordination analyses in R, I sometimes opt for CANOCO 5 because of its convenience, and also because it produces great ordination diagrams (R still lags behind in this respect). Cluster analysis (and also most of the ordination methods) is available in the PC-ORD 6 software by Bruce McCune et al. (see here for a comparison of CANOCO 5 with PC-ORD 6); I must admit I do not favour this software because of its somewhat unintuitive workflow, but it does offer a rather wide range of multivariate methods (some more elaborate than in CANOCO 5, e.g. NMDS with more options and more detailed reports). For analysis of diversity (e.g. diversity indices or rarefaction) you may consider using EstimateS 9 by Robert K. Colwell. These are just some examples with which I am familiar; you may find a range of other (free or paid) software.