densityLPS() R function from [DALSM]

Constrained density estimation from censored data with given mean and variance using Laplace P-splines

P-spline estimation of the density (pdf), cumulative distribution (cdf), hazard and cumulative hazard functions from interval- or right-censored data under possible marginal mean and/or variance constraints. The penalty parameter $\tau$ tuning the smoothness of the log-hazard can be selected using the Laplace P-splines (LPS) method maximizing an approximation to the marginal posterior of $\tau$ (also named the 'evidence') or using Schall's method.


densityLPS(obj.data,
       is.density=TRUE, Mean0=NULL, Var0=NULL,
       fixed.penalty=FALSE, method=c("LPS","Schall"),
       fixed.phi=FALSE,phi.ref=NULL, phi0=NULL,tau0=exp(5),tau.min=.1,
       verbose=FALSE)

Arguments

obj.data: a list created from potentially right- or interval-censored data using Dens1d. It includes summary statistics, the assumed density support, the knots for the B-spline basis, etc.
is.density: (optional) logical indicating whether the estimated density should integrate to 1.0 over the range of the knots in obj.data$knots (default: TRUE).
Mean0: (optional) constrained value for the mean of the fitted density (defaut: NULL).
Var0: (optional) constrained value for the variance of the fitted density (defaut: NULL).
fixed.penalty: (optional) logical indicating whether the penalty parameter should be selected from the data (fixed.penalty=FALSE) or fixed (fixed.penalty=TRUE) at its initial value $\tau_0$ .
method: method used for the selection of the penalty parameter: "LPS" (by maximizing the marginal posterior for $\tau$ , cf. "Laplace P-splines") or "Schall" (Schall's method).
fixed.phi: (optional) logical indicating whether the spline parameters are fixed (fixed.phi=TRUE) or estimated from the data (default: fixed.phi=FALSE).
phi.ref: (optional) reference value for the spline parameters with respect to which deviations are penalized (default: zero vector).
phi0: starting value for the spline parameters (default: spline parameters corresponding to a Student density with 5 DF).
tau0: (optional) initial value for the penalty parameter $\tau$ (default: exp(5)).
tau.min: (optional) minimal value for the penalty parameter $\tau$ (default: .1).
verbose: (optional) logical indicating whether estimation step details should be displayed (default: FALSE).

Returns

a densLPS.object containing the density estimation results.

Examples


library(DALSM)

## Example 1: estimation of the error density in a DALSM model
require(DALSM)
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
fit = DALSM(y=resp,
            formula1 = ~twoincomes+s(age)+s(eduyrs),
            formula2 = ~twoincomes+s(age)+s(eduyrs),
            data = DALSM_IncomeData)
plot(fit$derr)  ## Plot the estimated error density
print(fit$derr) ## ... and provide summary statistics for it

## Example 2: density estimation from censored income data
data(DALSM_IncomeData)
resp = DALSM_IncomeData[,1:2]
head(resp,n=20)
temp = Dens1d(y=resp,ymin=0) ## Create Dens1d object from positive censored data
obj = densityLPS(temp) ## Density estimation from IC & RC data
print(obj) ## Summary information on the estimated density
plot(obj,hist=TRUE) ## Visualize the estimated density
legend("topright",col=c("black","grey"),lwd=c(2,20),
       legend=c("Fitted density","Pseudo-data"),bty="n")

## Example 3: density estimation from simulated RC and IC data
## Data generation
set.seed(123)
n = 500 ## Sample size
x = rgamma(n,10,2) ## Exact (unobserved) data
width = runif(n,1,3) ## Width of the IC data (mean width = 2)
w = runif(n) ## Positioning of the exact data within the interval
xmat = cbind(pmax(0,x-w*width),x+(1-w)*width) ## Generated IC data
t.cens = rexp(n,1/15) ## Right-censoring values
idx.RC = (1:n)[t.cens<x] ## Id's of the right-censored units
xmat[idx.RC,] = cbind(t.cens[idx.RC],Inf) ## Data for RC units: (t.cens,Inf)
head(xmat,15)
## Density estimation with mean and variance constraints
obj.data = Dens1d(xmat,ymin=0) ## Prepare the data for estimation
obj = densityLPS(obj.data,Mean0=10/2,Var0=10/4) ## Density estimation
print(obj)
plot(obj) ## Plot the estimated density
curve(dgamma(x,10,2), ## ... and compare it to the true density (in red)
      add=TRUE,col="red",lwd=2,lty=2)
legend("topright",col=c("black","red"),lwd=c(2,2),lty=c(1,2),
       legend=c("Estimated density","True density"),bty="n")
## Same story for the cdf
with(obj, curve(pdist(x),ymin,ymax,lwd=2,xlab="",ylab="F(x)"))
curve(pgamma(x,10,2),add=TRUE,col="red",lwd=2,lty=2)
legend("right",col=c("black","red"),lwd=c(2,2),lty=c(1,2),
       legend=c("Estimated cdf","True cdf"),bty="n")

References

Lambert, P. (2021). Fast Bayesian inference using Laplace approximations in nonparametric double additive location-scale models with right- and interval-censored data. Computational Statistics and Data Analysis, 161: 107250. doi:10.1016/j.csda.2021.107250

Author(s)

Philippe Lambert p.lambert@uliege.be

DALSM package Read PDF manual

Maintainer: Philippe Lambert
License: GPL-3
Last published: 2023-10-02

Useful links

densityLPS function