Given a hyper2 object and a point in probability space, function gradient() returns the gradient of the log-likelihood; function hessian() returns the bordered Hessian matrix. By default, both functions are evaluated at the maximum likelihood estimate for p, as given by maxp().
border: Boolean, with default TRUE meaning to return the bordered Hessian and FALSE meaning to return the Hessian (warning: this option does not respect the unit sum constraint)
M: A bordered Hessian matrix, understood to have a single constraint (the unit sum) at the last row and column; the output of hessian(border=TRUE)
give: Boolean with default FALSE meaning for function is_ok_hessian() to return whether or not M
corresponds to a negative-definite matrix, and TRUE meaning to return more details
Details
Function gradient() returns the gradient of the log-likelihood function. If the hyper2 object is of size n, then argument probs may be a vector of length n−1 or n; in the former case it is interpreted as indep(p). In both cases, the returned gradient is a vector of length n−1. The function returns the derivative of the loglikelihood with respect to the n−1 independent components of (p1,...,pn), namely (p1,...,pn−1). The fillup value pn is calculated as 1−(p1+...+pn−1).
Function gradientn() returns the gradient of the loglikelihood function but ignores the unit sum constraint. If the hyper2
object is of size n, then argument probs must be a vector of length n, and the function returns a named vector of length n. The last element of the vector is not treated differently from the others; all n elements are treated as independent. The sum need not equal one.
Function hessian() returns the bordered Hessian , a matrix of size (n+1)∗(n+1), which is useful when using Lagrange's method of undetermined multipliers. The first row and column correspond to the unit sum constraint, p1+...+pn=1. Row and column names of the matrix are the pnames() of the hyper2 object, plus ‘usc’ for Unit Sum Constraint .
The unit sum constraint borders could have been added with idiom magic::adiag(0,pad=1,hess), which might be preferable.
Function is_ok_hessian() returns the result of the second derivative test for the maximum likelihood estimate being a local maximum on the constraint hypersurface. This is a generalization of the usual unconstrained problem, for which the test is the Hessian's being negative-definite.
Function hessian_lowlevel() is a low-level helper function that calls the routine.
Further examples and discussion is given in file inst/gradient.Rmd. See also the discussion at maxp on the different optimization routines available.
Returns
Function gradient() returns a vector of length n−1 with entries being the gradient of the log-likelihood with respect to the n−1 independent components of (p1,...,pn), namely (p1,...,pn−1). The fillup value pn is calculated as 1−(p1,...,pn−1).
If argument border is TRUE, function hessian()
returns an n-by-n matrix of second derivatives; the borders are as returned by gradient(). If border is FALSE, ignore the fillup value and return an n−1-by-n−1 matrix.
Calling hessian() at the evaluate will not return exact zeros for the constraint on the fillup value; gradient() is used and this does not return exactly zeros at the evaluate.
Author(s)
Robin K. S. Hankin
Examples
data(chess)p <- c(1/2,1/3)delta <- rnorm(2)/1e5# delta needs to be quite smalldeltaL <- loglik(p+delta,chess)- loglik(p,chess)deltaLn <- sum(delta*gradient(chess,p + delta/2))# numericdeltaL - deltaLn # should be small [zero to first order]H <- hessian(icons)is_ok_hessian(H)