nReplicates: An integer controlling how many dyad orderings to sample.
dyadInclusionRate: The proportion of non-edges in each ordering to retain; the remainder are dropped.
edgeInclusionRate: The proportion of edges in each ordering to retain; the remainder are dropped.
targetFrameSize: Sets dyadInclusionRate so that the model frame for the logistic regression has, on average, this many observations.
Returns
An object of class c('lologVariationalFit','lolog','list') consisting of the following items:
formula: The model formula
method: "variational"
theta: The fit parameter values
vcov: The asymptotic covariance matrix for the parameter values.
nReplicates: The number of replicates
dyadInclusionRate: The rate at which non-edges are included
edgeInclusionRate: The rate at which edges are included
allDyadIndependent: Logical indicating model dyad independence
likelihoodModel: An object of class *LatentOrderLikelihood at the fit parameters
outcome: The outcome vector for the logistic regression
predictors: The change statistic predictor matrix for the logistic regression
Details
This function approximates the maximum likelihood solution by variational inference on the graph (y) over the latent edge variable inclusion order (s). Specifically, it replaces the conditional probability p(s | y) with the marginal p(s). If the LOLOG model contains only dyad-independent terms, these two probabilities are identical, so variational inference coincides exactly with maximum likelihood inference. The objective function is
$$E_{p(s)}\left(\log p(y \mid s, \theta)\right)$$
This expectation can be approximated by averaging over samples drawn from p(s). The number of samples is controlled by the nReplicates parameter. The memory required is on the order of nReplicates * (number of dyads). For large networks this can be impractical, so dyadInclusionRate and edgeInclusionRate can be adjusted to down-sample the dyads in each replicate. By default, these rates are chosen to make the numbers of edges and non-edges as equal as possible while targeting a model frame with targetFrameSize rows.
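The balancing described above can be illustrated with a small sketch. This is written in Python purely for illustration and is not the package's code: the function names `default_inclusion_rates` and `expected_frame_rows` are hypothetical, and the rule shown (split the per-replicate budget in half, then hand any unused share to the other class) is only one plausible reading of "as equal as possible". It assumes both counts are positive.

```python
def default_inclusion_rates(n_edges, n_non_edges, n_replicates, target_frame_size):
    """Hypothetical heuristic, NOT the lolog implementation.

    Picks an edge and a non-edge inclusion rate (each at most 1) so that the
    stacked frame has about target_frame_size rows in expectation, split as
    evenly as possible between edges and non-edges.
    """
    per_replicate = target_frame_size / n_replicates  # row budget per ordering
    half = per_replicate / 2

    # Give each class half the budget; a class with fewer rows than its share
    # is kept whole and the leftover budget goes to the other class.
    edges_kept = min(n_edges, half)
    non_edges_kept = min(n_non_edges, per_replicate - edges_kept)
    edges_kept = min(n_edges, per_replicate - non_edges_kept)

    return edges_kept / n_edges, non_edges_kept / n_non_edges


def expected_frame_rows(n_edges, n_non_edges, n_replicates, edge_rate, dyad_rate):
    """Expected number of rows in the stacked model frame."""
    return n_replicates * (edge_rate * n_edges + dyad_rate * n_non_edges)


# A sparse network: 1,000 edges among 100,000 dyads, 5 replicates.
edge_rate, dyad_rate = default_inclusion_rates(1000, 99000, 5, 10000)
rows = expected_frame_rows(1000, 99000, 5, edge_rate, dyad_rate)
print(edge_rate, dyad_rate, rows)
```

In this toy case every edge is kept and only about 1% of non-edges are, which is the typical pattern for sparse networks: edges are the scarce class, so the down-sampling falls almost entirely on non-edges.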
If the model is dyad independent, replicates are redundant, and so nReplicates is set to 1 with a note.
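Written out, the sampling approximation is a plain Monte Carlo average. The notation below is introduced here for illustration and is not taken from the package: R stands for nReplicates and s^{(r)} for the r-th sampled ordering.

$$E_{p(s)}\left(\log p(y \mid s, \theta)\right) \approx \frac{1}{R} \sum_{r=1}^{R} \log p\left(y \mid s^{(r)}, \theta\right), \qquad s^{(r)} \sim p(s)$$

When every term in the model is dyad independent, \log p(y \mid s, \theta) takes the same value for every ordering s, so each summand is identical and a single replicate (R = 1) already gives the exact expectation.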
The functional form of the objective function is equivalent to that of logistic regression, so the glm function is used to maximize it. The asymptotic covariance of the parameter estimates is calculated using the methods of Westling and McCormick (2015).
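To make the logistic-regression connection concrete, here is a minimal, self-contained sketch of maximizing a Bernoulli log-likelihood with a logit link by Newton-Raphson. It is written in Python for illustration only; the package itself calls R's glm. The helper `fit_logistic`, the toy data, and the two-column layout (a constant column standing in for an edges term, plus one binary change statistic) are all assumptions made for this example. The sketch ignores down-sampling and does not compute the covariance, which the package obtains by a different method.

```python
import math


def fit_logistic(predictors, outcome, n_iter=25):
    """Newton-Raphson for a two-parameter logistic regression (illustrative)."""
    theta = [0.0, 0.0]
    for _ in range(n_iter):
        grad = [0.0, 0.0]
        hess = [[0.0, 0.0], [0.0, 0.0]]  # negative Hessian of the log-likelihood
        for x, y in zip(predictors, outcome):
            eta = theta[0] * x[0] + theta[1] * x[1]
            p = 1.0 / (1.0 + math.exp(-eta))  # fitted edge probability
            w = p * (1.0 - p)
            for a in range(2):
                grad[a] += (y - p) * x[a]
                for b in range(2):
                    hess[a][b] += w * x[a] * x[b]
        # Solve the 2x2 system hess * step = grad explicitly.
        det = hess[0][0] * hess[1][1] - hess[0][1] * hess[1][0]
        step0 = (hess[1][1] * grad[0] - hess[0][1] * grad[1]) / det
        step1 = (hess[0][0] * grad[1] - hess[1][0] * grad[0]) / det
        theta = [theta[0] + step0, theta[1] + step1]
    return theta


# Toy model frame: each row is one dyad's change statistics (1, match).
# Among 10 non-matching dyads 2 are edges; among 10 matching dyads 6 are.
predictors = [(1, 0)] * 10 + [(1, 1)] * 10
outcome = [1] * 2 + [0] * 8 + [1] * 6 + [0] * 4

theta = fit_logistic(predictors, outcome)
print(theta)
```

Because this toy design is saturated, the maximizer has a closed form that can be used as a check: the first coefficient is the log-odds of an edge among non-matching dyads, log(0.2 / 0.8), and the second is the log odds ratio, log(6).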
References
Westling, T., & McCormick, T. H. (2015). Beyond prediction: A framework for inference with variational approximations in mixture models. arXiv preprint arXiv:1510.08151.