Maximum likehood estimation for (SecSSE) with parameter as complex functions.
Maximum likehood estimation for (SecSSE) with parameter as complex functions.
Maximum likehood estimation under Several examined and concealed States-dependent Speciation and Extinction (SecSSE) where some paramaters are functions of other parameters and/or factors.
phy: phylogenetic tree of class phylo, rooted and with branch lengths.
traits: vector with trait states for each tip in the phylogeny. The order of the states must be the same as the tree tips. For help, see vignette("starting_secsse", package = "secsse").
num_concealed_states: number of concealed states, generally equivalent to the number of examined states in the dataset.
idparslist: overview of parameters and their values.
idparsopt: a numeric vector with the ID of parameters to be estimated.
initparsopt: a numeric vector with the initial guess of the parameters to be estimated.
idfactorsopt: id of the factors that will be optimized. There are not fixed factors, so use a constant within functions_defining_params.
initfactors: the initial guess for a factor (it should be set to NULL
when no factors).
idparsfix: a numeric vector with the ID of the fixed parameters.
parsfix: a numeric vector with the value of the fixed parameters.
idparsfuncdefpar: id of the parameters which will be a function of optimized and/or fixed parameters. The order of id should match functions_defining_params.
functions_defining_params: a list of functions. Each element will be a function which defines a parameter e.g. id_3 <- (id_1 + id_2) / 2. See example.
cond: condition on the existence of a node root: "maddison_cond", "proper_cond" (default). For details, see vignette.
root_state_weight: the method to weigh the states: "maddison_weights", "proper_weights" (default) or "equal_weights". It can also be specified for the root state: the vector c(1, 0, 0)
indicates state 1 was the root state.
sampling_fraction: vector that states the sampling proportion per trait state. It must have as many elements as there are trait states.
tol: A numeric vector with the maximum tolerance of the optimization algorithm. Default is c(1e-04, 1e-05, 1e-05).
maxiter: max number of iterations. Default is 1000 * round((1.25) ^ length(idparsopt)).
optimmethod: A string with method used for optimization. Default is "subplex". Alternative is "simplex" and it shouldn't be used in normal conditions (only for debugging). Both are called from DDD::optimizer(), simplex is implemented natively in DDD , while subplex is ultimately called from subplex::subplex().
num_cycles: Number of cycles of the optimization. When set to Inf, the optimization will be repeated until the result is, within the tolerance, equal to the starting values, with a maximum of 10 cycles.
loglik_penalty: the size of the penalty for all parameters; default is 0 (no penalty).
is_complete_tree: logical specifying whether or not a tree with all its extinct species is provided. If set to TRUE, it also assumes that all all extinct lineages are present on the tree. Defaults to FALSE.
num_threads: number of threads to be used. Default is one thread.
atol: A numeric specifying the absolute tolerance of integration.
rtol: A numeric specifying the relative tolerance of integration.
method: integration method used, available are: "odeint::runge_kutta_cash_karp54", "odeint::runge_kutta_fehlberg78", "odeint::runge_kutta_dopri5", "odeint::bulirsch_stoer" and "odeint::runge_kutta4". Default method is: "odeint::bulirsch_stoer".
Returns
Parameter estimated and maximum likelihood
Examples
# Example of how to set the arguments for a ML search.rm(list=ls(all=TRUE))library(secsse)library(DDD)set.seed(16)phylotree <- ape::rbdtree(0.07,0.001,Tmax=50)startingpoint<-bd_ML(brts = ape::branching.times(phylotree))intGuessLamba <- startingpoint$lambda0
intGuessMu <- startingpoint$mu0
traits <- sample(c(0,1,2), ape::Ntip(phylotree),replace=TRUE)#get some traitsnum_concealed_states<-3idparslist<-id_paramPos(traits, num_concealed_states)idparslist[[1]][c(1,4,7)]<-1idparslist[[1]][c(2,5,8)]<-2idparslist[[1]][c(3,6,9)]<-3idparslist[[2]][]<-4masterBlock <- matrix(c(5,6,5,6,5,6,5,6,5),ncol =3,nrow =3,byrow =TRUE)diag(masterBlock)<-NAdiff.conceal <-FALSEidparslist[[3]]<- q_doubletrans(traits,masterBlock,diff.conceal)idparsfuncdefpar <- c(3,5,6)idparsopt <- c(1,2)idparsfix <- c(0,4)initparsopt <- c(rep(intGuessLamba,2))parsfix <- c(0,0)idfactorsopt <-1initfactors <-4# functions_defining_params is a list of functions. Each function has no# arguments and to refer# to parameters ids should be indicated as "par_" i.e. par_3 refers to# parameter 3. When a function is defined, be sure that all the parameters# involved are either estimated, fixed or# defined by previous functions (i.e, a function that defines parameter in# 'functions_defining_params'). The user is responsible for this. In this# exampl3, par_3 (i.e., parameter 3) is needed to calculate par_6. This is# correct because par_3 is defined in# the first function of 'functions_defining_params'. Notice that factor_1# indicates a value that will be estimated to satisfy the equation. The same# factor can be shared to define several parameters.functions_defining_params <- list()functions_defining_params[[1]]<-function(){ par_3 <- par_1 + par_2
}functions_defining_params[[2]]<-function(){ par_5 <- par_1 * factor_1
}functions_defining_params[[3]]<-function(){ par_6 <- par_3 * factor_1
}tol = c(1e-02,1e-03,1e-04)maxiter =1000* round((1.25)^length(idparsopt))optimmethod ="subplex"cond<-"proper_cond"root_state_weight <-"proper_weights"sampling_fraction <- c(1,1,1)model <- secsse_ml_func_def_pars(phylotree,traits,num_concealed_states,idparslist,idparsopt,initparsopt,idfactorsopt,initfactors,idparsfix,parsfix,idparsfuncdefpar,functions_defining_params,cond,root_state_weight,sampling_fraction,tol,maxiter,optimmethod,num_cycles =1)# ML -136.5796