gradient function

Differential calculus

Differential calculus

Given a hyper2 object and a point in probability space, function gradient() returns the gradient of the log-likelihood; function hessian() returns the bordered Hessian matrix. By default, both functions are evaluated at the maximum likelihood estimate for pp, as given by maxp().

gradient(H, probs=indep(maxp(H))) hessian(H,probs=indep(maxp(H)),border=TRUE) hessian_lowlevel(L, powers, probs, pnames,n) is_ok_hessian(M, give=TRUE)

Arguments

  • H: A hyper2 object

  • L,powers,n: Components of a hyper2 object

  • probs: A vector of probabilities

  • pnames: Character vector of names

  • border: Boolean, with default TRUE meaning to return the bordered Hessian and FALSE meaning to return the Hessian (warning: this option does not respect the unit sum constraint)

  • M: A bordered Hessian matrix, understood to have a single constraint (the unit sum) at the last row and column; the output of hessian(border=TRUE)

  • give: Boolean with default FALSE meaning for function is_ok_hessian() to return whether or not M

    corresponds to a negative-definite matrix, and TRUE meaning to return more details

Details

Function gradient() returns the gradient of the log-likelihood function. If the hyper2 object is of size nn, then argument probs may be a vector of length n1n-1 or nn; in the former case it is interpreted as indep(p). In both cases, the returned gradient is a vector of length n1n-1. The function returns the derivative of the loglikelihood with respect to the n1n-1 independent components of (p1,...,pn)(p_1,...,p_n), namely (p1,...,pn1)(p_1,...,p_n-1). The fillup value pnp_n is calculated as 1(p1+...+pn1)1-(p_1+...+p_n-1).

Function gradientn() returns the gradient of the loglikelihood function but ignores the unit sum constraint. If the hyper2

object is of size nn, then argument probs must be a vector of length nn, and the function returns a named vector of length nn. The last element of the vector is not treated differently from the others; all nn elements are treated as independent. The sum need not equal one.

Function hessian() returns the bordered Hessian , a matrix of size (n+1)(n+1)(n+1)*(n+1), which is useful when using Lagrange's method of undetermined multipliers. The first row and column correspond to the unit sum constraint, p1+...+pn=1p_1+...+p_n=1. Row and column names of the matrix are the pnames() of the hyper2 object, plus ‘usc’ for Unit Sum Constraint .

The unit sum constraint borders could have been added with idiom magic::adiag(0,pad=1,hess), which might be preferable.

Function is_ok_hessian() returns the result of the second derivative test for the maximum likelihood estimate being a local maximum on the constraint hypersurface. This is a generalization of the usual unconstrained problem, for which the test is the Hessian's being negative-definite.

Function hessian_lowlevel() is a low-level helper function that calls the routine.

Further examples and discussion is given in file inst/gradient.Rmd. See also the discussion at maxp on the different optimization routines available.

Returns

Function gradient() returns a vector of length n1n-1 with entries being the gradient of the log-likelihood with respect to the n1n-1 independent components of (p1,...,pn)(p_1,...,p_n), namely (p1,...,pn1)(p_1,...,p_n-1). The fillup value pnp_n is calculated as 1(p1,...,pn1)1-(p_1,...,p_n-1).

If argument border is TRUE, function hessian()

returns an nn-by-nn matrix of second derivatives; the borders are as returned by gradient(). If border is FALSE, ignore the fillup value and return an n1n-1-by-n1n-1 matrix.

Calling hessian() at the evaluate will not return exact zeros for the constraint on the fillup value; gradient() is used and this does not return exactly zeros at the evaluate.

Author(s)

Robin K. S. Hankin

Examples

data(chess) p <- c(1/2,1/3) delta <- rnorm(2)/1e5 # delta needs to be quite small deltaL <- loglik(p+delta,chess) - loglik(p,chess) deltaLn <- sum(delta*gradient(chess,p + delta/2)) # numeric deltaL - deltaLn # should be small [zero to first order] H <- hessian(icons) is_ok_hessian(H)
  • Maintainer: Robin K. S. Hankin
  • License: GPL (>= 2)
  • Last published: 2024-05-31