densityLPS function

Constrained density estimation from censored data with given mean and variance using Laplace P-splines

Constrained density estimation from censored data with given mean and variance using Laplace P-splines

P-spline estimation of the density (pdf), cumulative distribution (cdf), hazard and cumulative hazard functions from interval- or right-censored data under possible marginal mean and/or variance constraints. The penalty parameter τ\tau tuning the smoothness of the log-hazard can be selected using the Laplace P-splines (LPS) method maximizing an approximation to the marginal posterior of τ\tau (also named the 'evidence') or using Schall's method.

densityLPS(obj.data, is.density=TRUE, Mean0=NULL, Var0=NULL, fixed.penalty=FALSE, method=c("LPS","Schall"), fixed.phi=FALSE,phi.ref=NULL, phi0=NULL,tau0=exp(5),tau.min=.1, verbose=FALSE)

Arguments

  • obj.data: a list created from potentially right- or interval-censored data using Dens1d. It includes summary statistics, the assumed density support, the knots for the B-spline basis, etc.
  • is.density: (optional) logical indicating whether the estimated density should integrate to 1.0 over the range of the knots in obj.data$knots (default: TRUE).
  • Mean0: (optional) constrained value for the mean of the fitted density (defaut: NULL).
  • Var0: (optional) constrained value for the variance of the fitted density (defaut: NULL).
  • fixed.penalty: (optional) logical indicating whether the penalty parameter should be selected from the data (fixed.penalty=FALSE) or fixed (fixed.penalty=TRUE) at its initial value τ0\tau_0.
  • method: method used for the selection of the penalty parameter: "LPS" (by maximizing the marginal posterior for τ\tau, cf. "Laplace P-splines") or "Schall" (Schall's method).
  • fixed.phi: (optional) logical indicating whether the spline parameters are fixed (fixed.phi=TRUE) or estimated from the data (default: fixed.phi=FALSE).
  • phi.ref: (optional) reference value for the spline parameters with respect to which deviations are penalized (default: zero vector).
  • phi0: starting value for the spline parameters (default: spline parameters corresponding to a Student density with 5 DF).
  • tau0: (optional) initial value for the penalty parameter τ\tau (default: exp(5)).
  • tau.min: (optional) minimal value for the penalty parameter τ\tau (default: .1).
  • verbose: (optional) logical indicating whether estimation step details should be displayed (default: FALSE).

Returns

a densLPS.object containing the density estimation results.

Examples

library(DALSM) ## Example 1: estimation of the error density in a DALSM model require(DALSM) data(DALSM_IncomeData) resp = DALSM_IncomeData[,1:2] fit = DALSM(y=resp, formula1 = ~twoincomes+s(age)+s(eduyrs), formula2 = ~twoincomes+s(age)+s(eduyrs), data = DALSM_IncomeData) plot(fit$derr) ## Plot the estimated error density print(fit$derr) ## ... and provide summary statistics for it ## Example 2: density estimation from censored income data data(DALSM_IncomeData) resp = DALSM_IncomeData[,1:2] head(resp,n=20) temp = Dens1d(y=resp,ymin=0) ## Create Dens1d object from positive censored data obj = densityLPS(temp) ## Density estimation from IC & RC data print(obj) ## Summary information on the estimated density plot(obj,hist=TRUE) ## Visualize the estimated density legend("topright",col=c("black","grey"),lwd=c(2,20), legend=c("Fitted density","Pseudo-data"),bty="n") ## Example 3: density estimation from simulated RC and IC data ## Data generation set.seed(123) n = 500 ## Sample size x = rgamma(n,10,2) ## Exact (unobserved) data width = runif(n,1,3) ## Width of the IC data (mean width = 2) w = runif(n) ## Positioning of the exact data within the interval xmat = cbind(pmax(0,x-w*width),x+(1-w)*width) ## Generated IC data t.cens = rexp(n,1/15) ## Right-censoring values idx.RC = (1:n)[t.cens<x] ## Id's of the right-censored units xmat[idx.RC,] = cbind(t.cens[idx.RC],Inf) ## Data for RC units: (t.cens,Inf) head(xmat,15) ## Density estimation with mean and variance constraints obj.data = Dens1d(xmat,ymin=0) ## Prepare the data for estimation obj = densityLPS(obj.data,Mean0=10/2,Var0=10/4) ## Density estimation print(obj) plot(obj) ## Plot the estimated density curve(dgamma(x,10,2), ## ... and compare it to the true density (in red) add=TRUE,col="red",lwd=2,lty=2) legend("topright",col=c("black","red"),lwd=c(2,2),lty=c(1,2), legend=c("Estimated density","True density"),bty="n") ## Same story for the cdf with(obj, curve(pdist(x),ymin,ymax,lwd=2,xlab="",ylab="F(x)")) curve(pgamma(x,10,2),add=TRUE,col="red",lwd=2,lty=2) legend("right",col=c("black","red"),lwd=c(2,2),lty=c(1,2), legend=c("Estimated cdf","True cdf"),bty="n")

References

Lambert, P. (2021). Fast Bayesian inference using Laplace approximations in nonparametric double additive location-scale models with right- and interval-censored data. Computational Statistics and Data Analysis, 161: 107250. doi:10.1016/j.csda.2021.107250

See Also

densLPS.object, print.densLPS, plot.densLPS, Dens1d.object, Dens1d.

Author(s)

Philippe Lambert p.lambert@uliege.be

  • Maintainer: Philippe Lambert
  • License: GPL-3
  • Last published: 2023-10-02