Two datasets are provided here: gentry197, composed of 197 plots distributed around the globe, and gentry.ea, a subset of five plots from East and South-East Asia. Each plot covers 0.1 ha and consists of 10 transects, each 2×50 m.
Alwyn Howard Gentry (1945-1993) was an American botanist and taxonomist with a wide range of activities. He developed a sampling design for quick inventories of diversity in species-rich tropical forests, later applied to other regions as well. In a relatively homogeneous forest stand, he placed ten contiguous transects of 2×50 m (100 m²) each; the long, narrow shape allows a fast census. Within each transect, he measured the DBH (diameter at breast height) of each individual of woody species and lianas above a certain threshold 1). The data thus contain the DBH of each individual in each transect.
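The plot size follows directly from the transect dimensions given above. A minimal sketch of the arithmetic in R; the helper ind.per.ha and the count of 25 individuals are hypothetical and only illustrate how a count per plot could be scaled to one hectare:

```r
# transect dimensions as described in the text
transect.width  <- 2     # m
transect.length <- 50    # m
n.transects     <- 10    # transects per plot

transect.area <- transect.width * transect.length   # m2 per transect
plot.area.m2  <- n.transects * transect.area        # m2 per plot
plot.area.ha  <- plot.area.m2 / 10000               # 1 ha = 10,000 m2

transect.area   # 100
plot.area.ha    # 0.1

# hypothetical helper (not part of the dataset or its script):
# scale a number of individuals counted in one plot to individuals per hectare
ind.per.ha <- function (n.ind, area.ha = plot.area.ha) n.ind / area.ha

ind.per.ha (25)   # made-up count of 25 individuals in one 0.1 ha plot
```

Because every plot has the same area, raw counts are directly comparable among plots; scaling to one hectare only matters when comparing with data sampled on plots of other sizes.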
After Gentry's tragic death (in a plane crash on the way to Ecuador), the dataset was made available on the website mentioned below and was also published in print (Phillips & Miller 2002).
Data from 225 forest plots are available at http://www.wlbcenter.org/gentry_data.htm. I prepared them for use in R (check this script if you want to know how). From the total of 225 plots, I removed those sampled by a different method (or containing fewer than ten transects), leaving 197 plots for further analysis. The data contain many errors that call for manual treatment; I haven't attempted this for the purpose of this exercise 2). The dataset of 197 localities prepared for use in R is available below, as is the subset of five localities from East and South-East Asia.
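The filtering by number of transects can be sketched as follows. This is not the original preparation script, only an illustration on a toy list; the object names plots.all and plots.ok and the toy plots are made up, and it assumes each plot is stored as a data frame with transects in rows:

```r
# toy stand-in for a list of plots (transects in rows, species in columns)
plots.all <- list (
  plotA = as.data.frame (matrix (0, nrow = 10, ncol = 3)),
  plotB = as.data.frame (matrix (0, nrow = 7,  ncol = 3)),  # incomplete plot
  plotC = as.data.frame (matrix (0, nrow = 10, ncol = 2))
)

# number of transects (rows) in each plot
n.transects <- sapply (plots.all, nrow)

# keep only plots with at least ten transects
plots.ok <- plots.all[n.transects >= 10]

names (plots.ok)
```

Plots sampled by a different method cannot be detected this way and would need to be excluded by name.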
The highest concentration of localities is in South and North America; however, Gentry also has localities scattered around the globe (link for coordinates is here).
Distribution of the 197 localities around the world (used in this dataset):
In addition to coordinates, the original data also contain altitude and mean annual precipitation; see the table available in Phillips & Miller (2002).
This table has been manually retyped into electronic form, available here as the data frame gentry.coord.txt, with the following variables:
Load all 197 plots:
# load the object gentry197 (list of 197 elements)
load (url ('https://raw.githubusercontent.com/zdealveindy/anadat-r/master/data/gentry197.r'))
# load data frame with plot coordinates
gentry.coord <- read.delim ('https://raw.githubusercontent.com/zdealveindy/anadat-r/master/data/gentry.coord.txt', row.names = 1)
Load the subset of five plots in East and South-East Asia:
# load the object gentry.ea (list of 5 elements, c('chiba', 'nanjensh', 'kenting', 'palanan', 'semengoh'))
load (url ('https://raw.githubusercontent.com/zdealveindy/anadat-r/master/data/gentry.ea.r'))
# load data frame with plot coordinates
gentry.coord.ea <- read.delim ('https://raw.githubusercontent.com/zdealveindy/anadat-r/master/data/gentry.coord.ea.txt', row.names = 1)
The script loads an R object (gentry197.r for the whole dataset, gentry.ea.r for the Asian subset) into R; it appears in the workspace as a list, gentry197 with 197 components or gentry.ea with five components, respectively. Each component holds the data from one locality (composed of 10 transects). A preview of the data frame for the first locality in gentry197 looks like this (each row = one transect; only five columns = species are shown):
    ANACARDIACEAE M1 M1  ANNONACEAE Monanthotaxis M1  ANNONACEAE Polyalthia henriceii  APOCYNACEAE Landolphia M1  APOCYNACEAE Oncinotis M1
 1                    0                            0                                1                          2                         1
 2                    0                            0                                0                          2                         0
 3                    0                            0                                0                          0                         0
 4                    0                            0                                0                          1                         0
 5                    1                            0                                0                          0                         0
 6                    0                            0                                0                          1                         0
 7                    0                            0                                0                          1                         0
 8                    0                            0                                0                          3                         0
 9                    0                            3                                0                          0                         0
10                    0                            0                                0                          1                         0
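Because each list component is a transect × species data frame, simple summaries can be obtained with rowSums, colSums and sapply. The sketch below retypes the five previewed columns into a toy object so that it runs without downloading anything; the names plot1 and toy are made up, and it assumes the cell values are numbers of individuals per species in each transect:

```r
# the five previewed species columns of the first locality, retyped by hand
plot1 <- data.frame (
  'ANACARDIACEAE M1 M1'             = c (0, 0, 0, 0, 1, 0, 0, 0, 0, 0),
  'ANNONACEAE Monanthotaxis M1'     = c (0, 0, 0, 0, 0, 0, 0, 0, 3, 0),
  'ANNONACEAE Polyalthia henriceii' = c (1, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  'APOCYNACEAE Landolphia M1'       = c (2, 2, 0, 1, 0, 1, 1, 3, 0, 1),
  'APOCYNACEAE Oncinotis M1'        = c (1, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  check.names = FALSE   # keep the spaces in species names
)

# toy stand-in for the list gentry197 (here with a single locality)
toy <- list (plot1 = plot1)

# individuals and species per transect (rows)
ind.per.transect  <- rowSums (plot1)
rich.per.transect <- rowSums (plot1 > 0)

# individuals per species (columns), pooled over the ten transects
ind.per.species <- colSums (plot1)

# species per locality, computed for every component of the list
rich.per.plot <- sapply (toy, function (x) sum (colSums (x) > 0))

ind.per.transect
rich.per.plot
```

With the real data loaded, the same sapply call applied to gentry197 or gentry.ea should return one richness value per locality.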
SANSEBAS.XLS contains one empty column inserted among Family, which needs to be removed from the *.xls file before applying the script. There are several such problems, and you need to follow the error messages if you try to extract the data yourself.
For sierraro (Cuba), the latitude reported in the original table was corrected: it should be 22.83 N, not 22.83 S.