multiDiv function

Calculating Diversity Curves Across Multiple Datasets

Calculates multiple diversity curves from a list of datasets of taxic ranges and/or phylogenetic trees, for the same intervals, for all the individual datasets. A median curve with 95 percent quantile bounds is also calculated and plotted for each interval.

multiDiv(
  data,
  int.length = 1,
  plot = TRUE,
  split.int = TRUE,
  drop.ZLB = TRUE,
  drop.cryptic = FALSE,
  extant.adjust = 0.01,
  plotLogRich = FALSE,
  yAxisLims = NULL,
  timelims = NULL,
  int.times = NULL,
  plotMultCurves = FALSE,
  multRainbow = TRUE,
  divPalette = NULL,
  divLineType = 1,
  main = NULL
)

plotMultiDiv(
  results,
  plotLogRich = FALSE,
  timelims = NULL,
  yAxisLims = NULL,
  plotMultCurves = FALSE,
  multRainbow = TRUE,
  divPalette = NULL,
  divLineType = 1,
  main = NULL
)

Arguments

  • data: A list where each element is a dataset, formatted to be input in one of the diversity curve functions listed in DiversityCurves.

  • int.length: The length of intervals used to make the diversity curve. Ignored if int.times is given.

  • plot: If TRUE, the median diversity curve is plotted.

  • split.int: For discrete time data, should calculated/input intervals be split at discrete time interval boundaries? If FALSE, this can create apparent artifacts in the calculated diversity curve. See details.

  • drop.ZLB: If TRUE, zero-length terminal branches are dropped from the input tree for phylogenetic datasets, before calculating standing diversity.

  • drop.cryptic: If TRUE, cryptic taxa are merged to form one taxon for estimating taxon curves. Only works for objects from simFossilRecord via fossilRecord2fossilTaxa.

  • extant.adjust: Amount of time to be added to extend the start time for (0,0) bins for extant taxa, so that the 'time interval' does not appear to have an infinitely small width.

  • plotLogRich: If TRUE, taxic diversity is plotted on log scale.

  • yAxisLims: Limits for the y (i.e. richness) axis on the plotted diversity curves. Only affects plotting. Given as either NULL (the default) or as a vector of length two, as for ylim in the basic R function plot. The y axis will be plotted exactly to these values. The minimum value must be more than 1 if plotLogRich = TRUE.

  • timelims: Limits for the x (time) axis for diversity curve plots. Only affects plotting. Given as either NULL (the default) or as a vector of length two as for xlim in the basic R function plot. Time axes will be plotted exactly to these values.

  • int.times: An optional two-column matrix of the interval start and end times for calculating the diversity curve. If NULL, calculated internally. If given, the arguments split.int and int.length are ignored.

  • plotMultCurves: If TRUE, each individual diversity curve is plotted rather than the median diversity curve and 95 percent quantiles. plotMultCurves = FALSE by default.

  • multRainbow: If TRUE and plotMultCurves = TRUE, each line is plotted as a different, randomized color using the function rainbow. If FALSE, each line is plotted as a black line. This argument is ignored if divPalette is supplied.

  • divPalette: An optional vector of color identifiers, one for each diversity curve in data, which takes precedence over multRainbow. Must be the same length as the number of diversity curves supplied.

  • divLineType: Used to determine the line type (lty) of the diversity curves plotted when plotMultCurves = TRUE. Default is lty = 1 for all curves. Must be either of length 1 or of the same length as the number of diversity curves.

  • main: The main label for the figure.

  • results: The output of a previous run of multiDiv for replotting.
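As a minimal sketch of the int.times argument described above, a two-column matrix of interval start and end times can be built in base R. The breakpoints and the column names used here are illustrative assumptions, not values required by multiDiv; the only structure taken from this page is "two columns, start then end", with time decreasing toward the present.

```r
# Illustrative breakpoints: 30 time units before present down to 0,
# in steps of 5 (arbitrary values chosen for this sketch).
breaks <- seq(30, 0, by = -5)

# Column 1 holds interval start times, column 2 holds end times.
# Because the present day is zero, each start is larger than its end.
# The column names are hypothetical labels for readability.
intTimes <- cbind(
  int.start = breaks[-length(breaks)],
  int.end   = breaks[-1]
)
intTimes

# Assuming paleotree is installed and 'taxa' is a list of datasets
# like the one built in the Examples section, the matrix could be
# passed as follows (int.length and split.int would then be ignored):
# multiDiv(taxa, int.times = intTimes)
```

Supplying the same matrix for every call is one way to keep interval boundaries identical across separate analyses.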

Returns

A list composed of three elements will be invisibly returned:

  • int.times: A two column matrix giving interval start and end times

  • div: A matrix of measured diversities in particular intervals by rows, with each column representing a different dataset included in the input

  • median.curve: A three column matrix, where the first column is the calculated median curve and the second and third columns are the 95 percent quantile upper and lower bounds
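Because the list is returned invisibly, it has to be assigned to be inspected. The following is a sketch, assuming the paleotree package is installed (the block does nothing otherwise); the simulation settings are copied from the Examples section, and the object name res is just a placeholder.

```r
# Runs only if paleotree is available; otherwise the block is skipped.
if (requireNamespace("paleotree", quietly = TRUE)) {
  library(paleotree)

  set.seed(444)
  records <- simFossilRecord(
    p = 0.1, q = 0.1, nruns = 10,
    totalTime = 30, plot = FALSE
  )
  taxa <- lapply(records, fossilRecord2fossilTaxa)

  # plot = FALSE computes the curves without drawing them;
  # assignment captures the invisibly returned list.
  res <- multiDiv(taxa, plot = FALSE)

  names(res)               # element names of the returned list
  head(res$int.times)      # interval start and end times
  dim(res$div)             # intervals (rows) by datasets (columns)
  head(res$median.curve)   # median and 95 percent quantile bounds

  # The stored results can be replotted later without recomputing,
  # here showing each individual curve instead of the median.
  plotMultiDiv(res, plotMultCurves = TRUE)
}
```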

Details

This function is essentially a wrapper for the individual diversity curve functions included in paleotree. multiDiv will automatically determine whether input datasets are continuous-time taxic ranges, discrete-time (binned interval) taxic ranges or phylogenetic trees, as long as they are formatted as required by the respective diversity curve functions. A list that contains a mix of data types is entirely acceptable. A list of matrices output from fossilRecord2fossilTaxa, obtained via simulation with simFossilRecord, is also allowed and is treated as input for taxicDivCont. Data of an unknown type produces an error.

The argument split.int splits intervals if and only if discrete interval time data is included among the datasets. See the help file for taxicDivDisc for an explanation of why split.int = TRUE by default is probably a good thing.

As with many functions in the paleotree package, time is measured backward from the present: absolute time always decreases toward the present day, which is zero.

The 'averaged' curve is actually the median rather than the mean as diversity counts are often highly skewed (in this author's experience).

The shaded certainty region around the median curve is bounded by the two-tailed 95 percent lower and upper quantiles, calculated from the observed data. It is not a true probabilistic confidence interval, as it has no relationship to the standard error.
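The summary described above can be sketched in base R. This is an illustration of the idea (a per-interval median with 2.5 and 97.5 percent quantiles), not a copy of the package's internal code; the matrix div is made-up data with intervals as rows and datasets as columns, and the column labels are hypothetical.

```r
set.seed(1)

# Hypothetical diversity counts: 6 intervals (rows) by 20 datasets (columns).
div <- matrix(rpois(6 * 20, lambda = 8), nrow = 6, ncol = 20)

# For each interval (row), take the median across datasets and the
# two-tailed 95 percent quantile bounds of the observed values.
medianCurve <- cbind(
  median = apply(div, 1, median),
  lower  = apply(div, 1, quantile, probs = 0.025),
  upper  = apply(div, 1, quantile, probs = 0.975)
)
medianCurve
```

The bounds are empirical quantiles of the curves supplied, so they describe the spread among the input datasets rather than sampling error in any one of them.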

Examples

# let's look at this function
# with some birth-death simulations
set.seed(444)

# multiDiv can take output from simFossilRecord
# via fossilRecord2fossilTaxa

# what do many simulations run under some set of
# conditions 'look' like on average?
set.seed(444)
records <- simFossilRecord(
    p = 0.1, q = 0.1,
    nruns = 10,
    totalTime = 30,
    plot = TRUE
    )
taxa <- lapply(records, fossilRecord2fossilTaxa)
multiDiv(taxa)
# increasing cone of diversity!

# It's even better on a log scale:
multiDiv(taxa, plotLogRich = TRUE)

#######################################
# pure-birth example with simFossilRecord
# note that conditioning is tricky

set.seed(444)
recordsPB <- simFossilRecord(
    p = 0.1, q = 0,
    nruns = 10,
    totalTime = 30,
    plot = TRUE
    )
taxaPB <- lapply(recordsPB, fossilRecord2fossilTaxa)
multiDiv(taxaPB, plotLogRich = TRUE)

# compare many discrete diversity curves
discreteRanges <- lapply(taxaPB, function(x)
    binTimeData(
        sampleRanges(x, r = 0.5, min.taxa = 1),
        int.length = 7)
    )
multiDiv(discreteRanges)

#########################################
# plotting a multi-diversity curve for
# a sample of stochastic dated trees

record <- simFossilRecord(
    p = 0.1, q = 0.1,
    nruns = 1,
    nTotalTaxa = c(30,40),
    nExtant = 0)
taxa <- fossilRecord2fossilTaxa(record)
rangesCont <- sampleRanges(taxa, r = 0.5)
rangesDisc <- binTimeData(rangesCont, int.length = 1)

# get the cladogram
cladogram <- taxa2cladogram(taxa, plot = TRUE)

# using multiDiv with samples of trees
ttrees <- timePaleoPhy(
    cladogram, rangesCont,
    type = "basic",
    randres = TRUE,
    ntrees = 10,
    add.term = TRUE
    )
multiDiv(ttrees)
# uncertainty in diversity history is solely due to
# the random resolution of polytomies

#########################################################
# using multiDiv to compare very different data types:
# continuous ranges, discrete ranges, dated tree

# get a single dated tree
ttree <- timePaleoPhy(
    cladogram, rangesCont,
    type = "basic",
    add.term = TRUE,
    plot = FALSE
    )

# put them altogether in a list
input <- list(rangesCont, rangesDisc, ttree)
multiDiv(input, plot = TRUE)

# what happens if we use fixed interval times?
multiDiv(input, int.times = rangesDisc[[1]], plot = TRUE)

layout(1)

See Also

The diversity curve functions used include: phyloDiv, taxicDivCont and taxicDivDisc.

Also see the function LTT.average.root in the package TreeSim, which calculates an average LTT curve for multiple phylogenies, as well as the functions mltt.plot in ape and ltt in phytools.

  • Maintainer: David W. Bapst
  • License: CC0
  • Last published: 2024-07-06