phylo: An object of class 'phylo' (see ape documentation)
data: A named vector of phenotypic trait values.
env_data: Environmental data, given as a time continuous function (see, e.g. splinefun) or a data frame with two columns. The first column is time, the second column is the environmental data (temperature for instance).
error: A named vector with standard errors (SE) of trait values for each species (with names matching "phylo$tip.label"). The default is NULL, in this case potential error is ignored in the fit. If set to NA, the SE is estimated from the data (to be used when there are no error measurements, a nuisance parameter is estimated). Note: When standard errors are provided, a nuisance parameter is also estimated.
model: The model describing the functional form of variation of the evolutionary rate σ2 with time and the environmental variable. Default models are "EnvExp" and "EnvLin" (see details). An user defined function of any functional form may be used (forward in time). This function has three arguments: the first argument is time; the second argument is the environmental variable; the third argument is a numeric vector of the parameters controlling the time and environmental variation (to be estimated). See the example below.
method: Methods used by the optimization routine (see ?optim for details).
control: Max. bound for the number of iteration of the optimizer; other options can be fixed on the list (see ?optim).
...: Arguments to be passed to the function. See details.
Returns
a list with the following components
LH: the maximum log-likelihood value
aic: the Akaike's Information Criterion
aicc: the second order Akaike’s Information Criterion
free.parameters: the number of estimated parameters
param: a numeric vector of estimated parameters, sigma and beta respectively for the defaults models. In the same order as defined by the user if a customized model is provided
root: the estimated root value
convergence: convergence status of the optimizing function; "0" indicates convergence (See ?optim for details)
hess.value: reliability of the likelihood estimates calculated through the eigen-decomposition of the hessian matrix. "0" means that a reliable estimate has been reached
env_func: the environmental function
tot_time: the root age of the tree
model: the fitted model (default models or user specified)
nuisance: maximum-likelihood estimate of nuisance, the unknown, nuisance contribution to measurement error when error argument is used (i.e., NA or a vector provided by the user)
Details
fit_t_env allows fitting environmental models of trait evolution. The default models EnvExp and EnvLin represents models for which the evolutionary rates are changing as a function of environmental changes though times as defined below.
EnvExp:
σ2(t)=σ02e(βT(t))sigma2(t)=sigma2∗e(beta∗T(t))
EnvLin:
σ2(t)=σ02+βT(t)sigma2(t)=sigma2+beta∗T(t)
Users defined models should have the following form (see also examples below):
fun <- function(t, env, param){ param*env(t)}
t: is the time parameter.
env: is a time function of an environmental variable. See for instance object created by splinefun when interpolating coordinate of points.
param: is a vector of parameters to estimate.
For instance, the EnvExp function can be coded as:
fun <- function(t, env, param){ param[1]*exp(param[2]*env(t))}
where param[1] is the σ2 parameter and param[2] is the β parameter. Note that in this later case, two starting values should be provided in the param
-param: The starting values used for the model. Must match the total number of parameters of the specified models. If "error=NA", a starting value for the SE to be estimated must be provided with user-defined models.
-scale: scale the amplitude of the environmental curve between 0 and 1. This may improve the parameters search in some situations.
-df: the degree of freedom to use for defining the spline. As a default, smooth.spline(env_data[,1], env_data[,2])$df is used. See sm.spline for details.
-upper: the upper bound for the parameter search when the "L-BFGS-B" method is used. See optim for details.
-lower: the lower bound for the parameter search when the "L-BFGS-B" method is used. See optim for details.
-sig2: can be used instead of param to define the starting sigma value only
-beta: can be used instead of param to define the beta starting value only
-maxdiff: difference in time between tips and present day for phylogenetic trees with no contemporaneous species (default is 0)
Note
The users defined function is evaluated forward in time i.e.: from the root to the tips (time = 0 at the (present) tips). The speed of convergence of the fit might depend on the degree of freedom chosen to define the spline.
References
Clavel, J. & Morlon, H., 2017. Accelerated body size evolution during cold climatic periods in the Cenozoic. Proceedings of the National Academy of Sciences, 114(16): 4183-4188.
Author(s)
J. Clavel
See Also
plot.fit_t.env, likelihood_t_env
Examples
if(test){data(Cetacea)data(InfTemp)# Simulate a trait with temperature dependence on the Cetacean treeset.seed(123)trait <- sim_t_env(Cetacea, param=c(0.1,-0.2), env_data=InfTemp, model="EnvExp", root.value=0, step=0.001, plot=TRUE)## Fit the Environmental-exponential model# Fit the environmental model result1=fit_t_env(Cetacea, trait, env_data=InfTemp, scale=TRUE) plot(result1)# Add to the plot the results from different smoothing of the temperature curve result2=fit_t_env(Cetacea, trait, env_data=InfTemp, df=10, scale=TRUE) lines(result2, col="red") result3=fit_t_env(Cetacea, trait, env_data=InfTemp, df=50, scale=TRUE) lines(result3, col="blue")## Fit the environmental linear model fit_t_env(Cetacea, trait, env_data=InfTemp, model="EnvLin", df=50, scale=TRUE)## Fit user defined model (note that several other environmental variables ## can be simultaneously encapsulated in this function through the env argument)# We define the function for the model my_fun<-function(t, env_cont, param){ param[1]*exp(param[2]*env_cont(t))} res<-fit_t_env(Cetacea, trait, env_data=InfTemp, model=my_fun, param=c(0.1,0), scale=TRUE)# Retrieve the parameters and compare to 'result1' res
plot(res, col="red")## Fit user defined environmental functionif(require(pspline)){ spline_result <- sm.spline(x=InfTemp[,1],y=InfTemp[,2], df=50) env_func <-function(t){predict(spline_result,t)} t<-unique(InfTemp[,1])# We build the interpolated smoothing spline function env_data<-splinefun(t,env_func(t))# We then fit the model fit_t_env(Cetacea, trait, env_data=env_data)}## Various parameterization (box constraints, df, scaling of the curve...) example fit_t_env(Cetacea, trait, env_data=InfTemp, model="EnvLin", method="L-BFGS-B", scale=TRUE, lower=-30, upper=20, df=10)## A very general model...# We define the function for the Early-Burst/AC model:maxtime = max(branching.times(Cetacea))# sigma^2*e^(r*t)my_fun_ebac <-function(t, env_cont, param){ time =(maxtime - t) param[1]*exp(param[2]*time)}res<-fit_t_env(Cetacea, trait, env_data=InfTemp, model=my_fun_ebac, param=c(0.1,0), scale=TRUE)res # note that "r" is positive: it's the AC model (~OU model on ultrametric tree)}