Communities of Harvestmen

3: Comparing Communities

Brief overview of comparative measures
and multivariate approaches.


Similarity vs. distance measures:

Many of the studies that describe communities in terms of species diversity also use various comparative indices to describe the degree of similarity (or difference) between communities; some methods are described by Wainstein (1967). To simplify matters, I make no distinction here between “community”, “assemblage”, “collection” and such terms, i.e. I am presuming that the data have been obtained in such a manner that they give a good description of the communities from which the samples were drawn. Here are two examples of such indices.

One possible measure of similarity between communities is an index of overlap: $C_{jk} = 100\left(1 - \tfrac{1}{2}\sum_i |p_{ij} - p_{ik}|\right)$, where $p_{ij}$ and $p_{ik}$ are the proportional abundances of species $i$ in communities $j$ and $k$ respectively, the sign of the difference being ignored. This measure ranges from 0% (completely different, with no species in common) to 100% (not only the same species, but present in the same proportions).
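As a minimal sketch, the overlap index can be computed directly from raw counts by first converting them to proportions. The function name and the example counts are hypothetical, chosen only to illustrate the arithmetic.

```python
def overlap_index(counts_j, counts_k):
    """Percentage overlap: 100 * (1 - 0.5 * sum_i |p_ij - p_ik|).

    counts_j, counts_k: abundances of the same species, in the same
    order, in communities j and k.
    """
    total_j = sum(counts_j)
    total_k = sum(counts_k)
    if total_j == 0 or total_k == 0:
        raise ValueError("each community needs at least one individual")
    # Convert raw counts to proportional abundances p_ij and p_ik.
    p_j = [n / total_j for n in counts_j]
    p_k = [n / total_k for n in counts_k]
    return 100.0 * (1.0 - 0.5 * sum(abs(a - b) for a, b in zip(p_j, p_k)))


# Hypothetical counts of three species in two communities.
site_a = [10, 30, 60]
site_b = [20, 30, 50]
print(overlap_index(site_a, site_b))
```

Because the index works on proportions, two samples of very different size but identical relative composition score 100%.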

In contrast, a distance measure indicates how far apart communities are in terms of species composition; an example is the Euclidean distance measure, $D_{jk} = \sqrt{\sum_i (n_{ij} - n_{ik})^2}$, where $n_{ij}$ and $n_{ik}$ are the actual abundances of species $i$ in communities $j$ and $k$ respectively. Examples of the use of such measures can be seen in Bliss & Trietze (1984), Chemini (1979), Curtis (1978a), Freudenthaler (1989, 1994b), Schaefer (1980a) and Komposch (2000).
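The Euclidean distance is equally simple to sketch. The function name and the counts below are hypothetical.

```python
import math


def euclidean_distance(counts_j, counts_k):
    """Euclidean distance between two communities from actual abundances."""
    if len(counts_j) != len(counts_k):
        raise ValueError("both communities must list the same species")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(counts_j, counts_k)))


# Hypothetical counts of three species in two communities.
site_a = [10, 30, 60]
site_b = [20, 30, 50]
print(euclidean_distance(site_a, site_b))
```

Since this measure uses actual rather than proportional abundances, it is sensitive to sample size: counts of [1, 1] and [5, 5] have identical proportions yet are separated by a non-zero distance.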

Such measures can be used in cluster analysis to show relationships between communities. A good coverage of such techniques is provided by Sneath & Sokal (1973), but multivariate ordination and classification analyses are generally more powerful (see below).
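To illustrate how a distance measure feeds into cluster analysis, here is a small sketch of agglomerative clustering using the average distance between groups. This is only one of many clustering strategies, chosen here for brevity; the sample names and abundances are invented for the example.

```python
import math
from itertools import combinations


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(samples):
    """Agglomerative clustering; returns the merges in the order made.

    samples: dict mapping sample name -> list of species abundances.
    Each merge is (members_of_group_1, members_of_group_2, mean_distance).
    """
    clusters = [(name,) for name in samples]
    merges = []
    while len(clusters) > 1:
        best = None
        for c1, c2 in combinations(clusters, 2):
            # Mean distance over all between-group pairs of samples.
            d = sum(euclidean(samples[a], samples[b])
                    for a in c1 for b in c2) / (len(c1) * len(c2))
            if best is None or d < best[0]:
                best = (d, c1, c2)
        d, c1, c2 = best
        # Replace the two closest groups with their union.
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append(c1 + c2)
        merges.append((c1, c2, d))
    return merges


# Hypothetical abundances of three species in four samples.
example = {
    "wood1": [10, 5, 0],
    "wood2": [9, 6, 0],
    "grass1": [0, 2, 12],
    "grass2": [1, 1, 10],
}
for group_1, group_2, distance in average_linkage(example):
    print(group_1, "+", group_2, "at distance", round(distance, 2))
```

The sequence of merges, read from first to last, is what a dendrogram displays from its tips to its root.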

Communities as multivariate entities:

Communities, by definition, contain more than one species and so the most effective way to see how they compare with each other is by means of multivariate methods, analysing the distribution of all species over all communities at the same time. My brief summary emphasises the contrast between ordination (analysing continuous variation) and classification (allocating discrete categories); a good overview is provided by Gauch (1982).

Multivariate analysis and description of communities is easily achieved using computer programs such as DECORANA (detrended correspondence analysis or DCA), TWINSPAN (two-way indicator species analysis) and CANOCO (canonical community ordination). Two different approaches are involved: ordination and classification.

Ordination: principles

Ordination places the samples (and the species) in order in relation to continuous scales (axes) of variation, calculated to emphasise, for example, either maximal variation (principal components analysis, reciprocal averaging) or differences between groups (discriminant analysis). Ordination is the purpose of DECORANA (Hill, 1979a), which carries out reciprocal averaging and also the superior ordination process of detrended correspondence analysis (Hill & Gauch, 1980). There is, in fact, a wide range of ordination techniques with varying properties, each analysing the data in a slightly different way (see the reviews in Ter Braak, 1988 and Gauch, 1982). CANOCO (Ter Braak, 1988) is a powerful computer program for canonical community ordination by various ordination techniques, within which environmental variables can be included alongside the species data.

A standard run of DECORANA uses detrended correspondence analysis to arrange the species in the best order in terms of their occurrence in samples and, at the same time, to arrange the samples in terms of their species composition (hence the name ‘reciprocal averaging’ (RA) or ‘correspondence analysis’ (CA) for versions of this technique); “detrended” correspondence analysis (DCA) applies detrending and rescaling to reduce the arch distortion introduced by the mathematics of the method (see Hill, 1979a; Hill & Gauch, 1980). The process can then be repeated and another axis obtained against which samples/species can be ordered. Each sample’s score on an axis is calculated as a weighted average of the scores of all species present in the sample, so the axes represent combinations of species calculated so as to maximise the overall display of variation in a plot of one axis against another. The species present in samples will show a gradual change as one proceeds along one of these axes, and a particular feature of RA and DCA is that they are based on a curvilinear distribution of species, reflecting ecological relationships.
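The two-way averaging at the heart of RA can be sketched in a few lines. The code below is a plain reciprocal averaging iteration for the first axis only; it is not DECORANA and omits the detrending and rescaling steps of DCA. The function name and the example table are hypothetical, and the sketch assumes that no sample or species has a zero total.

```python
def reciprocal_averaging(table, iterations=200):
    """First non-trivial RA axis by repeated weighted averaging.

    table: list of rows (samples), each a list of species abundances.
    Returns (sample_scores, species_scores, eigenvalue).
    """
    n_samples, n_species = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(n_samples))
               for j in range(n_species)]
    grand = sum(row_tot)

    def sample_scores(species):
        # Each sample score is the abundance-weighted mean species score.
        return [sum(y * u for y, u in zip(row, species)) / r
                for row, r in zip(table, row_tot)]

    def standardise(scores):
        # Centre and scale species scores, weighting by species totals.
        mean = sum(c * v for c, v in zip(col_tot, scores)) / grand
        centred = [v - mean for v in scores]
        sd = (sum(c * v * v for c, v in zip(col_tot, centred)) / grand) ** 0.5
        return [v / sd for v in centred], sd

    # Arbitrary starting scores: 0, 1, 2, ... across the species.
    species, _ = standardise([float(j) for j in range(n_species)])
    eigenvalue = 0.0
    for _ in range(iterations):
        samples = sample_scores(species)
        # Each species score is the abundance-weighted mean sample score.
        new_species = [sum(table[i][j] * samples[i]
                           for i in range(n_samples)) / col_tot[j]
                       for j in range(n_species)]
        # The shrinkage in spread per cycle estimates the eigenvalue.
        species, eigenvalue = standardise(new_species)
    return sample_scores(species), species, eigenvalue


# Hypothetical table: four samples (rows) by four species (columns).
example = [
    [8, 4, 1, 0],
    [3, 6, 3, 0],
    [0, 3, 6, 2],
    [0, 1, 4, 9],
]
samples, species, eigenvalue = reciprocal_averaging(example)
print([round(s, 2) for s in samples])
print(round(eigenvalue, 3))
```

Centring the scores at each cycle removes the trivial solution in which every sample and species receives the same score. When two groups of samples share no species at all, the eigenvalue returned by this sketch approaches 1.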

Ideally, a species will appear in low density in unfavourable conditions, rise to peak abundance in optimal conditions, and then decline in abundance as conditions become less favourable. Thus each species is expected to show a Gaussian response to the environment, similar to the bell-shaped curves illustrated in Figure 3 (see also Fig. 5 of Hill & Gauch, 1980 and Fig. 5 of Hill, 1979a). The shape of such a Gaussian curve is described by the position of its peak (the mean) and its spread, measured as the standard deviation (sd); as described by Hill (1979a), a species will appear, rise to its peak and drop back to zero over a distance of about four sd. The length of the gradient represented by an axis (either RA or DCA) can be expressed in sd units, giving an impression of the amount of species turnover along it, with a full turnover of samples’ species composition occurring in about 4 sd and a 50% change, or half-change (Gauch, 1973), taking only about 1 sd. The quality of an axis is also expressed by the associated eigenvalue (λ), which is approximately proportional to the square of the length of the sample ordination axis (Hill, 1979a), with a maximal possible value of 1.0 indicating completely different species composition in samples at opposite ends of the axis.
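The Gaussian response model can be written out as a short sketch. The parameter names (optimum, tolerance, peak) and the numerical values are assumptions made for illustration only.

```python
import math


def gaussian_response(position, optimum, tolerance, peak=1.0):
    """Expected abundance at a position along a gradient.

    optimum:   gradient position of peak abundance (the mean of the curve)
    tolerance: spread of the curve (its standard deviation, sd)
    peak:      abundance at the optimum
    """
    return peak * math.exp(-((position - optimum) ** 2) / (2.0 * tolerance ** 2))


# Hypothetical species with its optimum at 5.0 on the gradient and sd = 1.0.
for step in range(11):
    x = float(step)
    print(x, round(gaussian_response(x, optimum=5.0, tolerance=1.0, peak=20.0), 2))
```

At two sd either side of the optimum the curve has fallen to exp(-2), roughly 14%, of its peak value, which is why a species occupies a span of about four sd in practice.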

Classification: principles

In contrast to ordination, classification places the samples (and/or the species) into discrete categories/classes, ideally in relation to a hierarchical classification that can be expressed in the form of a dendrogram. This is easily achieved (if the data are suitable) using two-way indicator species analysis (Hill, 1979b).

TWINSPAN is a divisive, polythetic method and, as such, maximises the information used to determine where best to identify groups in the data set and so is most likely to generate a satisfactory hierarchical classification of the samples (and of the species). Essentially the method picks out the best species to differentiate between groups of samples as picked out along an ordination axis (RA based), describing each of the divisions in terms of indicator species and resulting in a branching diagram (i.e. a dendrogram) showing the relationships between the sample-groups. Each positive indicator species contributes +1 to a sample’s indicator score, and each negative indicator species –1; an indicator threshold is also calculated and if the summed indicator score equals or exceeds this the sample is placed in the positive group, otherwise into the negative group. The quality of a division is reflected in its eigenvalue (λ), for which the maximum possible value of 1 indicates completely different species composition in the samples in the two groups. Good divisions have eigenvalues greater than about 0.20 - 0.25, values approximately corresponding to a 50% difference in species composition (M.O. Hill, pers. comm.). The ordination and splitting process is then repeated for the positive and negative groups and so on down to required levels in the dendrogram. More technical details can be obtained from Hill’s (1979b) description of TWINSPAN.
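The indicator-score rule described above can be illustrated with a small sketch. It covers only the final scoring and allocation step, not the ordination that selects the indicators or any other part of TWINSPAN; the species labels, indicator sets and threshold are hypothetical.

```python
def indicator_score(species_present, positive_indicators, negative_indicators):
    """+1 for each positive indicator present, -1 for each negative one."""
    present = set(species_present)
    return (len(present & set(positive_indicators))
            - len(present & set(negative_indicators)))


def assign_group(species_present, positive_indicators, negative_indicators,
                 threshold):
    """Positive group if the score equals or exceeds the threshold."""
    score = indicator_score(species_present, positive_indicators,
                            negative_indicators)
    return "positive" if score >= threshold else "negative"


# Hypothetical indicators and threshold for one division.
positive = {"sp_A", "sp_B"}
negative = {"sp_D"}
threshold = 1

# Hypothetical samples, each listed as the set of species present.
example = {
    "sample_1": {"sp_A", "sp_B", "sp_C"},
    "sample_2": {"sp_A", "sp_D"},
    "sample_3": {"sp_C", "sp_D"},
}
for name, species in example.items():
    print(name, indicator_score(species, positive, negative),
          assign_group(species, positive, negative, threshold))
```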

TWINSPAN also places the species into groups, based on the samples in which the species occur. This does not automatically give particular community descriptions, but it does group together species which tend to occur together in samples, and which possibly group together (in all or in part) to form communities. More informative is a profile indicating the constancy (i.e. percentage frequency of occurrence) of each species in each of the sample classes. This can be summarised by a table in which the species are ordered and classified according to the TWINSPAN process.
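A constancy profile is straightforward to compute once each sample has been allocated to a class. The sketch below assumes the classes are already known (for example from a classification run); the sample names, species labels and class labels are hypothetical.

```python
def constancy_table(samples, classes):
    """Percentage of samples in each class in which each species occurs.

    samples: dict mapping sample name -> set of species present
    classes: dict mapping sample name -> class label
    Returns {species: {class_label: percentage}}.
    """
    class_sizes = {}
    counts = {}
    for name, species in samples.items():
        label = classes[name]
        class_sizes[label] = class_sizes.get(label, 0) + 1
        for sp in species:
            per_class = counts.setdefault(sp, {})
            per_class[label] = per_class.get(label, 0) + 1
    return {
        sp: {label: 100.0 * per_class.get(label, 0) / size
             for label, size in class_sizes.items()}
        for sp, per_class in counts.items()
    }


# Hypothetical presence data and class allocation for four samples.
example_samples = {
    "s1": {"sp_A", "sp_B"},
    "s2": {"sp_A"},
    "s3": {"sp_B", "sp_C"},
    "s4": {"sp_C"},
}
example_classes = {"s1": "0", "s2": "0", "s3": "1", "s4": "1"}
for sp, row in sorted(constancy_table(example_samples, example_classes).items()):
    print(sp, row)
```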


Note: There is further consideration of these techniques in §4, where they are applied to our opilionid data. Generally, the main benefit gained by the use of these techniques is a description of the overall patterns to be seen in the data set, in terms of the distribution of all species across all samples.
