Splicing of mixed Erlang and Pareto for interval censored data
Fit spliced distribution of a mixed Erlang distribution and a Pareto distribution adapted for interval censoring and truncation.
SpliceFiticPareto(L, U, censored, tsplice, M = 3, s = 1:10,
                  trunclower = 0, truncupper = Inf, ncores = NULL,
                  criterium = c("BIC", "AIC"), reduceM = TRUE,
                  eps = 10^(-3), beta_tol = 10^(-5), maxiter = Inf,
                  cpp = FALSE)
Arguments
L: Vector of length n with the lower boundaries of the intervals for interval censored data or the observed data for right censored data.
U: Vector of length n with the upper boundaries of the intervals.
censored: A logical vector of length n indicating if an observation is censored.
tsplice: The splicing point t.
M: Initial number of Erlang mixtures, default is 3. This number can change when determining an optimal mixed Erlang fit using an information criterion.
s: Vector of spread factors for the EM algorithm, default is 1:10. We loop over these factors when determining an optimal mixed Erlang fit using an information criterion, see Verbelen et al. (2016).
trunclower: Lower truncation point. Default is 0.
truncupper: Upper truncation point. Default is Inf (no upper truncation).
ncores: Number of cores to use when determining an optimal mixed Erlang fit using an information criterion. When NULL (default), max(nc-1,1) cores are used where nc is the number of cores as determined by detectCores.
criterium: Information criterion used to select the number of components of the ME fit and s. One of "AIC" and "BIC" (default).
reduceM: Logical indicating if M should be reduced based on the information criterion, default is TRUE.
eps: Convergence threshold used in the EM algorithm. Default is 10^(-3).
beta_tol: Threshold for the mixing weights below which the corresponding shape parameter vector is considered negligible (ME part). Default is 10^(-5).
maxiter: Maximum number of iterations in a single EM algorithm execution. Default is Inf meaning no maximum number of iterations.
cpp: Use C++ implementation (cpp=TRUE) or R implementation (cpp=FALSE) of the algorithm. Default is FALSE meaning the plain R implementation is used.
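The default for ncores described above can be reproduced by hand when you want to pass the value explicitly. The following is a minimal sketch using only the parallel package; the fallback for a missing core count is an assumption added for illustration and is not taken from this documentation.

```r
library(parallel)

# Number of cores reported by the system; detectCores() can return NA
# on platforms where the count cannot be determined.
nc <- detectCores()

# Assumption (not from the documentation): fall back to one core if nc is NA.
if (is.na(nc)) nc <- 1

# Same rule as the documented default: leave one core free, use at least one.
ncores <- max(nc - 1, 1)
```

The resulting value could then be supplied as ncores=ncores in the call to SpliceFiticPareto.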
Details
See Reynkens et al. (2017), Section 4.3.2 of Albrecher et al. (2017) and Verbelen et al. (2015) for details. The code follows the notation of the latter. Initial values follow from Verbelen et al. (2016).
Right censored data should be entered as L=l and U=truncupper, and left censored data should be entered as L=trunclower and U=u.
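The encoding rules above can be sketched with a small hand-made data set. The values below are hypothetical and only base R is used. The uncensored observation is entered with L and U equal, which mirrors the construction in the Examples section (U is set to the observed value unless the observation is censored); the interval censored observation uses its own interval boundaries.

```r
# Hypothetical toy data: base R only, no fitting is done here.
trunclower <- 0
truncupper <- Inf

# Four observations, in order:
#   1. uncensored at 2.5          -> L = U = 2.5
#   2. right censored at 4        -> L = 4, U = truncupper
#   3. left censored at 3         -> L = trunclower, U = 3
#   4. interval censored in [1,6] -> L = 1, U = 6
L <- c(2.5, 4, trunclower, 1)
U <- c(2.5, truncupper, 3, 6)
censored <- c(FALSE, TRUE, TRUE, TRUE)

# Basic consistency checks before passing the vectors to a fitting function
stopifnot(length(L) == length(U),
          length(L) == length(censored),
          all(L <= U),
          all(L[!censored] == U[!censored]))
```

Vectors built this way have the shape expected by the L, U and censored arguments; the splicing point tsplice still has to be chosen separately.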
Returns
A SpliceFit object.
References
Albrecher, H., Beirlant, J. and Teugels, J. (2017). Reinsurance: Actuarial and Statistical Aspects, Wiley, Chichester.
Reynkens, T., Verbelen, R., Beirlant, J. and Antonio, K. (2017). "Modelling Censored Losses Using Splicing: a Global Fit Strategy With Mixed Erlang and Extreme Value Distributions". Insurance: Mathematics and Economics, 77, 65--77.
Verbelen, R., Gong, L., Antonio, K., Badescu, A. and Lin, S. (2015). "Fitting Mixtures of Erlangs to Censored and Truncated Data Using the EM Algorithm." Astin Bulletin, 45, 729--758.
Verbelen, R., Antonio, K. and Claeskens, G. (2016). "Multivariate Mixtures of Erlangs for Density Estimation Under Censoring." Lifetime Data Analysis, 22, 429--455.
Author(s)
Tom Reynkens based on R code from Roel Verbelen for fitting the mixed Erlang distribution (without splicing).
See Also
SpliceFitPareto, SpliceFitGPD, Splice
Examples
## Not run:

# Pareto random sample
X <- rpareto(500, shape=2)

# Censoring variable
Y <- rpareto(500, shape=1)

# Observed sample
Z <- pmin(X, Y)

# Censoring indicator
censored <- (X>Y)

# Right boundary
U <- Z
U[censored] <- Inf

# Splice ME and Pareto
splicefit <- SpliceFiticPareto(L=Z, U=U, censored=censored, tsplice=quantile(Z,0.9))

x <- seq(0, 20, 0.1)

# Plot of spliced CDF
plot(x, pSplice(x, splicefit), type="l", xlab="x", ylab="F(x)")

# Plot of spliced PDF
plot(x, dSplice(x, splicefit), type="l", xlab="x", ylab="f(x)")

# Fitted survival function and Turnbull survival function
SpliceTB(x, L=Z, U=U, censored=censored, splicefit=splicefit)

# Log-log plot with Turnbull survival function and fitted survival function
SpliceLL_TB(x, L=Z, U=U, censored=censored, splicefit=splicefit)

# PP-plot of Turnbull survival function and fitted survival function
SplicePP_TB(L=Z, U=U, censored=censored, splicefit=splicefit)

# PP-plot of Turnbull survival function and
# fitted survival function with log-scales
SplicePP_TB(L=Z, U=U, censored=censored, splicefit=splicefit, log=TRUE)

# QQ-plot using Turnbull survival function and fitted survival function
SpliceQQ_TB(L=Z, U=U, censored=censored, splicefit=splicefit)

## End(Not run)