Simulated ecological data

Source of data

Zelený D. (unpubl.). The script is based on the simulation model written by Fridley et al. (2007; see Appendix S2 of their paper), which in turn was based on the work of Minchin (1987).

Description of the dataset

Figure: Simulated response of species along a virtual gradient (only the first 50 species are displayed).

These simulated community datasets represent a model of a community based entirely on ecological niche theory. Unimodal species response curves are randomly distributed along one (or two) virtual ecological gradients; each curve gives the probability that the species occurs at a given position on the gradient and is based on the Beta function. Each species is defined by its ecological optimum along the gradient, its niche width, its maximum probability of occurrence and a few other parameters. In the next step, random positions along the gradient are generated, and at each position (“plot”) individuals are collected as follows: first, a random number is generated that sets the number of individuals in the plot; then, each individual is randomly assigned to a species, with the probability of assignment to a given species weighted by that species' probability of occurrence at that part of the gradient. A single species can therefore be represented by several individuals in a plot if its probability of occurrence there is high. With two virtual gradients, the probability of occurrence of a species is the product of its probabilities along each gradient. For details, see the scripts below.
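
The procedure described above can be sketched in a few lines of R. This is only a minimal illustration, not the script used to generate the datasets: the function `beta_response`, the symmetric curve shape (`shape = 2`), the parameter ranges and the range of individuals per plot (50 to 150) are all assumptions made for the example.

```r
set.seed(1)  # reproducible illustration

# Symmetric beta-shaped response curve (assumed form): equals `amax` at the
# optimum and drops to zero at opt - width/2 and opt + width/2.
beta_response <- function(x, opt, width, amax, shape = 2) {
  u <- (x - (opt - width / 2)) / width   # position within the niche, 0..1
  u <- pmin(pmax(u, 0), 1)               # outside the niche -> probability 0
  amax * (4 * u * (1 - u))^shape
}

n_species <- 30      # hypothetical, much smaller than in the real datasets
n_plots   <- 20
grad_len  <- 5000    # gradient length in arbitrary units

# Each species gets a random optimum, niche width and maximum probability
species <- data.frame(
  opt   = runif(n_species, 0, grad_len),
  width = runif(n_species, 500, 3000),
  amax  = runif(n_species, 0.1, 1)
)

# Random plot positions along the gradient
plot_pos <- runif(n_plots, 0, grad_len)

comm <- matrix(0L, nrow = n_plots, ncol = n_species,
               dimnames = list(paste0("plot", seq_len(n_plots)),
                               paste0("spec", seq_len(n_species))))

for (i in seq_len(n_plots)) {
  # Probability of occurrence of every species at this plot position
  p <- beta_response(plot_pos[i], species$opt, species$width, species$amax)
  if (sum(p) == 0) next                  # no species can occur here
  n_ind <- sample(50:150, 1)             # random number of individuals
  # Assign each individual to a species, weighted by p
  ind <- sample.int(n_species, n_ind, replace = TRUE, prob = p)
  comm[i, ] <- tabulate(ind, nbins = n_species)
}

# Presence-absence version of the community matrix
pa <- (comm > 0) * 1L

# Two gradients: multiply the probabilities along each gradient
p_two <- beta_response(1000, opt = 1200, width = 800, amax = 0.8) *
         beta_response(400,  opt = 500,  width = 600, amax = 1)
```

Because assignment is weighted by `p`, a species with a high probability at a given position tends to receive several individuals in that plot, while species whose niche does not cover the position receive none.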

Parameters of the files

  • simul1 - 1 gradient (length 5000 units), 500 samples, 300 species 1)
  • simul2 - 2 gradients of different length (5000 and 2000 units), 500 samples, 300 species
  • simul3 - 2 gradients of the same length (5000 units), 500 samples, 300 species
  • simul.short - 2 gradients of different length, both rather short (1100 and 800 units), 70 samples, 300 species; samples are distributed evenly along the gradient (distance between samples is 100 units)
  • simul.long - 2 gradients of different length, both rather long (5500 and 4000 units), 70 samples, 300 species; samples are distributed evenly along the gradient (distance between samples is 100 units)

Note: the number of species (e.g. 300) is a parameter of the simulation model. The number of species in the resulting community matrix need not match the number of simulated species, because some species with low abundance were not “sampled”.
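
The effect described in the note can be reproduced with a toy example. This is a sketch under assumed settings, not the original simulation: the lognormal occurrence weights, the 50 plots and the 20 individuals per plot are hypothetical values chosen so that many species are rare.

```r
set.seed(42)  # reproducible illustration

n_species <- 300   # number of species in the simulated pool
n_plots   <- 50    # hypothetical number of plots
n_ind     <- 20    # hypothetical number of individuals per plot

# Assumed occurrence weights: most species are rare, a few are common
weights <- rlnorm(n_species, meanlog = 0, sdlog = 2)

# Draw individuals for each plot and count them per species
comm <- t(replicate(n_plots,
  tabulate(sample.int(n_species, n_ind, replace = TRUE, prob = weights),
           nbins = n_species)))
colnames(comm) <- paste0("spec", seq_len(n_species))

# Keep only species that were sampled at least once
realized <- comm[, colSums(comm) > 0, drop = FALSE]

c(simulated = n_species, realized = ncol(realized))
```

The second number can never exceed the first, and it is usually smaller whenever the species pool contains rare species.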

Data for download

Files with -spe contain the presence-absence matrix of species data, files with -env contain the position of each simulated sample along the virtual gradient (analogous to a measured environmental variable), and files with -specivalues contain the position of species optima along the gradient and the niche width (both in arbitrary gradient units).

Script for direct import of data to R

simul1.spe <- read.delim ('', row.names = 1)
simul1.env <- read.delim ('', row.names = 1)
simul1.specvalues <- read.delim ('', row.names = 1)
simul2.spe <- read.delim ('', row.names = 1)
simul2.env <- read.delim ('', row.names = 1)
simul2.specvalues <- read.delim ('', row.names = 1)
simul3.spe <- read.delim ('', row.names = 1)
simul3.env <- read.delim ('', row.names = 1)
simul3.specvalues <- read.delim ('', row.names = 1)
simul.short.spe <- read.delim ('', row.names = 1)
simul.short.env <- read.delim ('', row.names = 1)
simul.long.spe <- read.delim ('', row.names = 1)
simul.long.env <- read.delim ('', row.names = 1)
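
Once the tables are loaded, a few quick checks can confirm that the species and environmental tables belong together. The sketch below uses small mock data frames in place of simul1.spe and simul1.env; the column name `gradient1` and all values are hypothetical and may differ from the real files.

```r
# Mock stand-ins for simul1.spe and simul1.env (hypothetical values)
spe <- data.frame(spec1 = c(1, 0, 1),
                  spec2 = c(0, 0, 1),
                  spec3 = c(0, 0, 0),
                  row.names = c("s1", "s2", "s3"))
env <- data.frame(gradient1 = c(4800, 120, 2500),
                  row.names = c("s1", "s2", "s3"))

# Samples should appear in the same order in both tables
stopifnot(identical(rownames(spe), rownames(env)))

# Species richness per sample and number of occurrences per species
richness  <- rowSums(spe > 0)
frequency <- colSums(spe > 0)

# Sort samples by their position along the gradient
spe_sorted <- spe[order(env$gradient1), , drop = FALSE]

# Drop species that do not occur in any sample
spe_present <- spe[, frequency > 0, drop = FALSE]
```

Sorting the species table by gradient position makes it easier to see species turnover along the gradient when the matrix is inspected or plotted.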

Scripts for creating simulated datasets


  • The function compas in the package CommEcol, written by Adriano Sanches Melo, should be a direct implementation of Minchin's software COMPAS (Minchin 1987). The principles are similar (in fact, the Fridley et al. 2007 paper cites Minchin's paper on COMPAS), but compas allows a community matrix to be generated in more than two dimensions and quantitative and qualitative noise to be added.
  • Even more comprehensive is the package coenocliner, developed by Gavin Simpson: apart from Minchin's model, it can simulate a range of other types of community data along a coenocline.


  • Fridley, J.D., Vandermast, D.B., Kuppinger, D.M., Manthey, M., and Peet, R.K. (2007) Co-occurrence-based assessment of habitat generalists and specialists: a new approach for the measurement of niche width. Journal of Ecology 95: 707-722 (see also Appendix S2).
  • Minchin, P.R. (1987) Simulation of multidimensional community patterns: towards a comprehensive model. Vegetatio 71: 145-156.
1) The number of species in the resulting community matrix is slightly smaller, because 300 is the number of species used in the simulated community and not all of them made it through random sampling into the realized community matrix.
en/data/simul.1419087627.txt.gz · Last modified: 2017/10/11 20:36 (external edit)