Y: The data. An m by n matrix, where m is the number of samples and n is the number of features.
X: The factor(s) of interest. An m by p matrix, where m is the number of samples and p is the number of factors of interest. Very often p = 1. Factors and data frames are also permissible, and are converted to a matrix by design.matrix.
ctl: An index vector to specify the negative controls. Either a logical vector of length n or a vector of integers.
Z: Any additional covariates to include in the model, typically an m by q matrix. Factors and data frames are also permissible, and are converted to a matrix by design.matrix. Alternatively, may simply be 1 (the default) for an intercept term. May also be NULL.
eta: Gene-wise (as opposed to sample-wise) covariates. These covariates are adjusted for by RUV-1 before any further analysis proceeds. Can be either (1) a matrix with n columns, (2) a matrix with n rows, (3) a data frame with n rows, (4) a vector or factor of length n, or (5) simply 1, for an intercept term.
include.intercept: Applies to both Z and eta. When Z or eta (or both) is specified (not NULL) but does not already include an intercept term, setting this to TRUE automatically adds one. If only one of Z or eta should include an intercept, set this to FALSE and include the intercept term manually where desired.
fullW0: Can be supplied to speed up execution. It is returned by previous calls of RUV4, RUVinv, or RUVrinv (see below).
invsvd: Can be supplied to speed up execution. Generally used when calling RUV(r)inv many times with different values of lambda. It is returned by previous calls of RUV(r)inv (see below).
lambda: Ridge parameter. If specified, the ridged inverse method will be used.
randomization: Whether the inverse-method variances should be computed using randomly generated factors of interest (as opposed to a numerical integral).
iterN: The number of random "factors of interest" to generate (used only when randomization=TRUE).
inputcheck: Perform a basic sanity check on the inputs, and issue a warning if there is a problem.
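As an illustration of the argument shapes described above, the following is a minimal sketch that builds Y, X, and ctl from simulated noise. The dimensions, the number of negative controls, and the object names (n_ctl, ctl_logical, ctl_integer, fit) are assumptions made for this example only. The call to RUVinv is guarded so that the rest of the sketch runs even when the ruv package is not installed.

```r
set.seed(1)

m <- 20       # number of samples
n <- 200      # number of features
n_ctl <- 100  # number of negative controls (hypothetical choice)

# Y: m by n data matrix (pure noise here, for illustration only)
Y <- matrix(rnorm(m * n), nrow = m, ncol = n)

# X: m by p matrix with p = 1, coding two groups of samples as 0 and 1
X <- matrix(rep(c(0, 1), each = m / 2), ncol = 1)

# ctl as a logical vector of length n ...
ctl_logical <- rep(c(TRUE, FALSE), times = c(n_ctl, n - n_ctl))
# ... or as a vector of integers indexing the same features
ctl_integer <- which(ctl_logical)

# Guarded call: runs only if the ruv package is available
if (requireNamespace("ruv", quietly = TRUE)) {
  fit <- ruv::RUVinv(Y, X, ctl_logical)
  str(fit$betahat)  # expected to be a p by n matrix
}
```

Here Z is left at its default of 1, so the model includes an intercept term.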
Details
Implements the RUV-inv algorithm as described in Gagnon-Bartsch, Jacob, and Speed (2013).
Returns
A list containing:
betahat: The estimated coefficients of the factor(s) of interest. A p by n matrix.
sigma2: Estimates of the features' variances. A vector of length n.
t: t statistics for the factor(s) of interest. A p by n matrix.
p: P-values for the factor(s) of interest. A p by n matrix.
Fstats: F statistics for testing all of the factors in X simultaneously.
Fpvals: P-values for testing all of the factors in X simultaneously.
multiplier: The constant by which sigma2 must be multiplied in order to get an estimate of the variance of betahat.
df: The number of residual degrees of freedom.
W: The estimated unwanted factors.
alpha: The estimated coefficients of W.
byx: The coefficients in a regression of Y on X (after both Y and X have been "adjusted" for Z). Useful for projection plots.
bwx: The coefficients in a regression of W on X (after X has been "adjusted" for Z). Useful for projection plots.
X: X. Included for reference.
k: k. Included for reference.
ctl: ctl. Included for reference.
Z: Z. Included for reference.
eta: eta. Included for reference.
fullW0: Can be used to speed up future calls of RUV4.
lambda: lambda. Included for reference.
invsvd: Can be used to speed up future calls of RUV(r)inv.
include.intercept: include.intercept. Included for reference.
method: Character variable with value "RUVinv". Included for reference.
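The relationship between sigma2, multiplier, and betahat can be sketched as follows. The list below is not output from RUVinv; it is a hand-built stand-in with made-up values for p = 1 and n = 3, used only to show the arithmetic. The names var_betahat, se_betahat, and t_ratio are introduced for this example.

```r
# Hypothetical values standing in for a fitted object (p = 1, n = 3)
fit <- list(
  betahat    = matrix(c(0.5, -1.2, 0.1), nrow = 1),  # p by n
  sigma2     = c(0.04, 0.09, 0.01),                  # length n
  multiplier = 0.25
)

# Estimated variance of betahat: sigma2 times the multiplier
var_betahat <- fit$sigma2 * fit$multiplier

# Standard errors are the square roots of those variances
se_betahat <- sqrt(var_betahat)

# Ratio of each estimate to its standard error
t_ratio <- as.vector(fit$betahat) / se_betahat
```

With more than one factor of interest (p > 1), the shape of multiplier may differ, so the elementwise arithmetic above should be checked against the dimensions of the returned object.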