rankMatrix function

Rank of a Matrix

Rank of a Matrix

Compute the matrix rank, a well-defined functional in theory(*), somewhat ambiguous in practice. We provide several methods, the default corresponding to Matlab's definition.

(*) The rank of a nxmn x m matrix AA, rk(A)rk(A), is the maximal number of linearly independent columns (or rows); hence rk(A)<=min(n,m)rk(A) <= min(n,m).

rankMatrix(x, tol = NULL, method = c("tolNorm2", "qr.R", "qrLINPACK", "qr", "useGrad", "maybeGrad"), sval = svd(x, 0, 0)$d, warn.t = TRUE, warn.qr = TRUE) qr2rankMatrix(qr, tol = NULL, isBqr = is.qr(qr), do.warn = TRUE)

Arguments

  • x: numeric matrix, of dimension nxmn x m, say.

  • tol: nonnegative number specifying a (relative, scalefree ) tolerance for testing of practically zero with specific meaning depending on method; by default, max(dim(x)) * .Machine$double.eps

    is according to Matlab's default (for its only method which is our method="tolNorm2").

  • method: a character string specifying the computational method for the rank, can be abbreviated:

    • "tolNorm2":: the number of singular values >= tol * max(sval);

    • "qrLINPACK":: for a dense matrix, this is the rank of qr(x, tol, LAPACK=FALSE) (which is qr(...)$rank);

       This ("qr*", dense) version used to be **the** recommended way to compute a matrix rank for a while in the past.
       
       For sparse `x`, this is equivalent to `"qr.R"`.
      
    • "qr.R":: this is the rank of triangular matrix RR, where qr() uses LAPACK or a "sparseQR" method (see qr-methods) to compute the decomposition QRQR. The rank of RR is then defined as the number of non-zero diagonal entries did_i of RR, and non-zero s fulfill di>=tolmax(di)|d_i| >= tol * max(|d_i|).

    • "qr":: is for back compatibility; for dense x, it corresponds to "qrLINPACK", whereas for sparse x, it uses "qr.R".

       For all the "qr*" methods, singular values `sval` are not used, which may be crucially important for a large sparse matrix `x`, as in that case, when `sval` is not specified, the default, computing `svd()` currently coerces `x` to a dense matrix.
      
    • "useGrad":: considering the gradient of the (decreasing) singular values, the index of the smallest gap.

    • "maybeGrad":: choosing method "useGrad" only when that seems reasonable; otherwise using "tolNorm2".

  • sval: numeric vector of non-increasing singular values of x; typically unspecified and computed from x when needed, i.e., unless method = "qr".

  • warn.t: logical indicating if rankMatrix() should warn when it needs t(x) instead of x. Currently, for method = "qr" only, gives a warning by default because the caller often could have passed t(x) directly, more efficiently.

  • warn.qr: in the QRQR cases (i.e., if method starts with "qr"), rankMatrix() calls qr2rankMarix(.., do.warn = warn.qr), see below.

  • qr: an object resulting from qr(x,..), i.e., typically inheriting from class "qr" or "sparseQR".

  • isBqr: logical indicating if qr is resulting from base qr(). (Otherwise, it is typically from Matrix package sparse qr.)

  • do.warn: logical; if true, warn about non-finite diagonal entries in the RR matrix of the QRQR decomposition. Do not change lightly!

Details

qr2rankMatrix() is typically called from rankMatrix() for the "qr"* methods, but can be used directly - much more efficiently in case the qr-decomposition is available anyway.

Note

For large sparse matrices x, unless you can specify sval yourself, currently method = "qr" may be the only feasible one, as the others need sval and call svd() which currently coerces x to a denseMatrix which may be very slow or impossible, depending on the matrix dimensions.

Note that in the case of sparse x, method = "qr", all non-strictly zero diagonal entries did_i where counted, up to including Matrix version 1.1-0, i.e., that method implicitly used tol = 0, see also the set.seed(42) example below.

Returns

If x is a matrix of all 0 (or of zero dimension), the rank is zero; otherwise, typically a positive integer in 1:min(dim(x))

with attributes detailing the method used.

There are rare cases where the sparse QRQR decomposition fails in so far as the diagonal entries of RR, the did_i (see above), end with non-finite, typically NaN

entries. Then, a warning is signalled (unless warn.qr / do.warn is not true) and NA (specifically, NA_integer_) is returned.

Author(s)

Martin Maechler; for the "*Grad" methods building on suggestions by Ravi Varadhan.

See Also

qr, svd.

Examples

rankMatrix(cbind(1, 0, 1:3)) # 2 (meths <- eval(formals(rankMatrix)$method)) ## a "border" case: H12 <- Hilbert(12) rankMatrix(H12, tol = 1e-20) # 12; but 11 with default method & tol. sapply(meths, function(.m.) rankMatrix(H12, method = .m.)) ## tolNorm2 qr.R qrLINPACK qr useGrad maybeGrad ## 11 11 12 12 11 11 ## The meaning of 'tol' for method="qrLINPACK" and *dense* x is not entirely "scale free" rMQL <- function(ex, M) rankMatrix(M, method="qrLINPACK",tol = 10^-ex) rMQR <- function(ex, M) rankMatrix(M, method="qr.R", tol = 10^-ex) sapply(5:15, rMQL, M = H12) # result is platform dependent ## 7 7 8 10 10 11 11 11 12 12 12 {x86_64} sapply(5:15, rMQL, M = 1000 * H12) # not identical unfortunately ## 7 7 8 10 11 11 12 12 12 12 12 sapply(5:15, rMQR, M = H12) ## 5 6 7 8 8 9 9 10 10 11 11 sapply(5:15, rMQR, M = 1000 * H12) # the *same* ## "sparse" case: M15 <- kronecker(diag(x=c(100,1,10)), Hilbert(5)) sapply(meths, function(.m.) rankMatrix(M15, method = .m.)) #--> all 15, but 'useGrad' has 14. sapply(meths, function(.m.) rankMatrix(M15, method = .m., tol = 1e-7)) # all 14 ## "large" sparse n <- 250000; p <- 33; nnz <- 10000 L <- sparseMatrix(i = sample.int(n, nnz, replace=TRUE), j = sample.int(p, nnz, replace=TRUE), x = rnorm(nnz)) (st1 <- system.time(r1 <- rankMatrix(L))) # warning+ ~1.5 sec (2013) (st2 <- system.time(r2 <- rankMatrix(L, method = "qr"))) # considerably faster! r1[[1]] == print(r2[[1]]) ## --> ( 33 TRUE ) ## another sparse-"qr" one, which ``failed'' till 2013-11-23: set.seed(42) f1 <- factor(sample(50, 1000, replace=TRUE)) f2 <- factor(sample(50, 1000, replace=TRUE)) f3 <- factor(sample(50, 1000, replace=TRUE)) D <- t(do.call(rbind, lapply(list(f1,f2,f3), as, 'sparseMatrix'))) dim(D); nnzero(D) ## 1000 x 150 // 3000 non-zeros (= 2%) stopifnot(rankMatrix(D, method='qr') == 148, rankMatrix(crossprod(D),method='qr') == 148) ## zero matrix has rank 0 : stopifnot(sapply(meths, function(.m.) rankMatrix(matrix(0, 2, 2), method = .m.)) == 0)
  • Maintainer: Martin Maechler
  • License: GPL (>= 2) | file LICENCE
  • Last published: 2025-03-11