data: Observed data; the required format depends on the estimation method (see the sketch after this argument list).
Data with hist method - list with 2 components:
data$Time: [0, data$Time] is the total time interval of observation
data$Nijk: Data matrix with counts per process N_{ij} and sub-intervals; matrix of size N×Dmax, where N = n(n-1) or n(n-1)/2 is the number of possible node pairs in the graph and Dmax = 2^{dmax} is the size of the finest partition in the histogram approach
The counts are pre-computed, obtained through the function 'statistics' (auxiliary.R) applied to data in the second (kernel) format
Data with kernel method - list with 3 components:
data$time.seq: Sequence of observed time points; the m-th entry is the time of the m-th event (M-vector)
data$type.seq: Sequence of observed values convertNodePair(i,j,n,directed) (auxiliary.R) identifying the process that produced the m-th event (M-vector)
data$Time: [0, data$Time] is the total time interval of observation
n: Total number of nodes
Qmin: Minimum number of groups
Qmax: Maximum number of groups
directed: Boolean for directed (TRUE) or undirected (FALSE) case
sparse: Boolean for sparse (TRUE) or not sparse (FALSE) case
method: Character string (or list of strings): "hist" for the histogram method or "kernel" for the kernel method
init.tau: List of initial values of τ; each tau is a matrix of size Q×n (possibly with different values of Q)
cores: Number of cores for parallel execution
If set to 1, execution is sequential
Beware: parallelization relies on forking (multicore) and therefore does not work on Windows
d_part: Maximal level for finest partition of time interval [0,T] used for k-means initializations.
The algorithm takes partitions up to depth 2^d with d = 1, ..., d_part
n_perturb: Number of different perturbations on k-means result
When Qmin<Qmax, number of perturbations on the result with Q−1 or Q+1 groups
perc_perturb: Percentage of labels that are to be perturbed (= randomly switched)
n_random: Number of completely random initial points. The total number of initializations for the VEM is d_part*(1+n_perturb)+n_random
nb.iter: Number of iterations of the VEM algorithm
fix.iter: Maximum number of iterations of the fixed point into the VE step
epsilon: Threshold for the stopping criterion of VEM and fixed point iterations
filename: Name of the file where the results are saved during the computation (at the increasing steps over Q, which are the longest).
The file will contain a list of 'best' results.
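For illustration, here is a minimal sketch of the two accepted formats for the data argument. This is not taken from the package examples: it assumes the package providing mainVEM is attached and reuses the synthetic dataset generated_Q3_n20 and the helper functions statistics and convertNodePair mentioned above.

# kernel method: raw event sequences (list with 3 components)
data.kernel <- generated_Q3_n20$data
head(data.kernel$time.seq)   # times of the events, in [0, data$Time]
head(data.kernel$type.seq)   # node-pair indices, as returned by convertNodePair(i, j, n, directed)
data.kernel$Time             # end of the observation interval

# hist method: pre-computed counts on the finest partition with Dmax = 2^dmax cells
n <- 20
Dmax <- 2^3
Nijk <- statistics(data.kernel, n, Dmax, directed = FALSE)   # N x Dmax count matrix
data.hist <- list(Nijk = Nijk, Time = data.kernel$Time)      # list with 2 components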
Details
The sparse version works only for the histogram approach.
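As a hedged illustration of this restriction (a minimal sketch, not from the original examples; it reuses the synthetic dataset from the Examples section below and the argument values are only indicative):

# sparse variant: only valid together with method = 'hist'
n <- 20
Dmax <- 2^3
Nijk <- statistics(generated_Q3_n20$data, n, Dmax, directed = FALSE)
sol.sparse <- mainVEM(list(Nijk = Nijk, Time = generated_Q3_n20$data$Time), n,
                      Qmin = 3, Qmax = 3, directed = FALSE, sparse = TRUE,
                      method = 'hist')[[1]]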
Examples
# load data of a synthetic graph with 20 individuals and 3 clusters
n <- 20
Q <- 3
Time <- generated_Q3_n20$data$Time
data <- generated_Q3_n20$data
z <- generated_Q3_n20$z
step <- .001
x0 <- seq(0, Time, by = step)
intens <- generated_Q3_n20$intens

# VEM-algo kernel
sol.kernel <- mainVEM(data, n, Q, directed = FALSE, method = 'kernel',
                      d_part = 0, n_perturb = 0)[[1]]
# compute smooth intensity estimators
sol.kernel.intensities <- kernelIntensities(data, sol.kernel$tau, Q, n, directed = FALSE)
# eliminate label switching
intensities.kernel <- sortIntensities(sol.kernel.intensities, z, sol.kernel$tau,
                                      directed = FALSE)

# VEM-algo hist
# compute data matrix with precision d_max=3
Dmax <- 2^3
Nijk <- statistics(data, n, Dmax, directed = FALSE)
sol.hist <- mainVEM(list(Nijk = Nijk, Time = Time), n, Q, directed = FALSE,
                    method = 'hist', d_part = 0, n_perturb = 0, n_random = 0)[[1]]
log.intensities.hist <- sortIntensities(sol.hist$logintensities.ql, z, sol.hist$tau,
                                        directed = FALSE)

# plot estimators
par(mfrow = c(2, 3))
ind.ql <- 0
for (q in 1:Q) {
  for (l in q:Q) {
    ind.ql <- ind.ql + 1
    true.val <- intens[[ind.ql]]$intens(x0)
    values <- c(intensities.kernel[ind.ql, ], exp(log.intensities.hist[ind.ql, ]), true.val)
    plot(x0, true.val, type = 'l', xlab = paste0("(q,l)=(", q, ",", l, ")"), ylab = '',
         ylim = c(0, max(values) + .1))
    lines(seq(0, 1, by = 1/Dmax), c(exp(log.intensities.hist[ind.ql, ]),
          exp(log.intensities.hist[ind.ql, Dmax])), type = 's', col = 2, lty = 2)
    lines(seq(0, 1, by = .001), intensities.kernel[ind.ql, ], col = 4, lty = 3)
  }
}
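A hedged follow-on sketch (not part of the original example) showing a run over a range of group numbers with results saved along the way; it reuses Nijk, Time and n built above, the file name is hypothetical, and the interpretation of the returned list follows the filename description in the argument list.

# explore Q = 2,...,4 with the histogram method, saving intermediate results
sol.range <- mainVEM(list(Nijk = Nijk, Time = Time), n, Qmin = 2, Qmax = 4,
                     directed = FALSE, method = 'hist',
                     filename = 'mainVEM_results.Rdata')   # hypothetical file name
length(sol.range)   # presumably one 'best' result per value of Q in Qmin:Qmax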
References
DAUDIN, J.-J., PICARD, F. & ROBIN, S. (2008). A mixture model for random graphs. Statist. Comput. 18, 173–183.
DEMPSTER, A. P., LAIRD, N. M. & RUBIN, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. Ser. B 39, 1–38.
JORDAN, M., GHAHRAMANI, Z., JAAKKOLA, T. & SAUL, L. (1999). An introduction to variational methods for graphical models. Mach. Learn. 37, 183–233.
MATIAS, C., REBAFKA, T. & VILLERS, F. (2018). A semiparametric extension of the stochastic block model for longitudinal networks. Biometrika.
MATIAS, C. & ROBIN, S. (2014). Modeling heterogeneity in random graphs through latent space models: a selective review. ESAIM Proc. & Surveys 47, 55–74.