gep_by_nera() R function from [disprofas]

Get points on confidence region bounds by Newton-Raphson search

The function gep_by_nera() is a function for finding points that ideally sit on specific confidence region bounds ( $\textit{CRB}$ ) by aid of the Method of Lagrange Multipliers (MLM) and by Newton-Raphson (nera) optimisation. The multivariate confidence interval for profiles with four time points, e.g., is an ellipse

in four dimensions.


gep_by_nera(n_p, kk, mean_diff, m_vc, ff_crit, y, max_trial, tol)

Arguments

n_p: A positive integer that specifies the number of (time) points $n_p$ .
kk: A non-negative numeric value that specifies the scaling factor $kk$ for the calculation of the Hotelling's $T^2$ statistic.
mean_diff: A vector of the mean differences between the dissolution profiles or model parameters of the reference and the test batch(es) or the averages of the model parameters of a specific group of batch(es) (reference or test). It must have the length specified by the parameter $n_p$ .
m_vc: The pooled variance-covariance matrix of the dissolution profiles or model parameters of the reference and the test batch(es) or the variance-covariance matrix of the model parameters of a specific group of batch(es) (reference or test). It must have the dimension $n_p \times n_p$ .
ff_crit: The critical $F$ value (i.e. a non-negative numeric).
y: A numeric vector of $y$ values that serve as starting points for the Newton-Raphson search, i.e. values supposed to lie on or close to the confidence interval bounds. It must have a length of $n_p + 1$ .
max_trial: A positive integer that specifies the maximum number of Newton-Raphson search rounds to be performed.
tol: A non-negative numeric that specifies the accepted minimal difference between two consecutive search rounds.

Returns

A list with the following elements is returned: - points: A matrix with one column and $n_p + 1$ rows is returned, where rows $1$ to $n_p$ represent, for each time point or model parameter, the points on the $\textit{CRB}$ . For symmetry reasons, the points on the opposite side are obtained by addition/subtraction. The last row in the matrix, with index $n_p + 1$ , represents the $\lambda$ parameter of the MLM, also known as lambda multiplier method, that is used to optimise under constraint(s). The variable $\lambda$ is thus called the Lagrange multiplier.

converged: A logical indicating whether the NR algorithm converged or not.
points.on.crb: A logical indicating whether the points found by the NR algorithm sit on the sit on the confidence region bounds (TRUE) or not (FALSE). Since it is not know a priori it is NA by default. The parameter is set by the check_point_location()

function.
n.trial: Number of trials until convergence.
max.trial: Maximal number of trials.
tol: A non-negative numeric value that specifies the accepted minimal difference between two consecutive search rounds, i.e. the tolerance.

Details

The function gep_by_nera() determines the points on the $\textit{CRB}$ for each of the $n_p$ time points. It does so by aid of the Method of Lagrange Multipliers (MLM) and by Newton-Raphson (nera) optimisation, as proposed by Margaret Connolly (Connolly 2000).

For more information, see the sections Comparison of highly variable dissolution profiles and Similarity limits in terms of MSD below.

Comparison of highly variable dissolution profiles

When comparing the dissolution data of a post-approval change product and a reference approval product, the goal is to assess the similarity between the mean dissolution values at the observed sample time points. A widely used method is the $f_2$ method that was introduced by Moore & Flanner (1996). Similarity testing criteria based on $f_2$ can be found in several FDA guidelines and in the guideline of the European Medicines Agency (EMA) On the investigation of bioequivalence (EMA 2010).

In situations where within-batch variation is greater than 15%, FDA guidelines recommend use of a multivariate confidence interval as an alternative to the $f_2$ method. This can be done using the following stepwise procedure:

Establish a similarity limit in terms of Multivariate Statistical Distance (MSD) based on inter-batch differences in % drug release from reference (standard approved) formulations, i.e. the so-called Equivalence Margin (EM).
Calculate the MSD between test and reference mean dissolutions.
Estimate the 90% confidence interval (CI) of the true MSD as determined in step 2.
Compare the upper limit of the 90% CI with the similarity limit determined in step 1. The test formulation is declared to be similar to the reference formulation if the upper limit of the 90% CI is less than or equal to the similarity limit.

Similarity limits in terms of MSD

For the calculation of the Multivariate Statistical Distance (MSD), the procedure proposed by Tsong et al. (1996) can be considered as well-accepted method that is actually recommended by the FDA. According to this method, a multivariate statistical distance, called Mahalanobis distance, is used to measure the difference between two multivariate means. This distance measure is calculated as

D_M = \sqrt{ \left( \bm{x}_T - \bm{x}_R \right)^{\top}\bm{S}_{pooled}^{-1} \left( \bm{x}_T - \bm{x}_R \right)} ,%D_M = sqrt((x_T - x_R)^{\top} S_{pooled}^{-1} (x_T - x_R)) ,

where $\bm{S}_{pooled}$ is the sample variance-covariance matrix pooled across the comparative groups, $x_T$ and $x_R$

are the vectors of the sample means for the test ( $T$ ) and reference ( $R$ ) profiles, and $S_T$ and $x_R$ are the variance-covariance matrices of the test and reference profiles. The pooled variance-covariance matrix $S_{pooled}$ is calculated by

\bm{S}_{pooled} = \frac{(n_R - 1) \bm{S}_R + (n_T - 1) \bm{S}_T}{%n_R + n_T - 2} .S_{pooled} = ((n_R - 1) S_R + (n_T - 1) S_T) /(n_R + n_T - 2) .

In order to determine the similarity limits in terms of the MSD, i.e. the Mahalanobis distance between the two multivariate means of the dissolution profiles of the formulations to be compared, Tsong et al. (1996) proposed using the equation

D_M^{max} = \sqrt{ \bm{d}_g^{\top} \bm{S}_{pooled}^{-1} \bm{d}_g} ,%D_M^{max} = sqrt(d_g^{\top} S_{pooled}^{-1} d_g) ,

where $d_g$ is a $1 x p$ vector with all $p$ elements equal to an empirically defined limit $d_g$ , e.g., $15$ %, for the maximum tolerable difference at all time points, and $p$ is the number of sampling points. By assuming that the data follow a multivariate normal distribution, the 90% confidence region ( $\textit{CR}$ ) bounds for the true difference between the mean vectors, $\mu_T - \mu_R$ , can be computed for the resultant vector $\mu$ to satisfy the following condition:

\bm{\textit{CR}} = K \left( \bm{\mu} - \left( \bm{x}_T -\bm{x}_R \right) \right)^{\top} \bm{S}_{pooled}^{-1} \left( \bm{\mu} -\left( \bm{x}_T - \bm{x}_R \right) \right) \leqF_{p, n_T + n_R - p - 1, 0.9} ,%CR = sqrt((\mu - (x_T - x_R))^{\top} S_{pooled}^{-1} (\mu - (x_T - x_R)))\leq F_{p, n_T + n_R - p - 1, 0.9} ,

where $K$ is the scaling factor that is calculated as

K = \frac{n_T n_R}{n_T + n_R} \; \frac{n_T + n_R - p - 1}{\left( n_T + n_R - 2 \right) p} ,%(n_T n_R) / (n_T + n_R) (n_T + n_R - p - 1) / ((n_T + n_R - 2) p) ,

and $F_{p, n_T + n_R - p - 1, 0.9}$ is the $90^{th}$ percentile of the $F$ distribution with degrees of freedom $p$ and $n_T + n_R - p - 1$ , where $n_T$ and $n_R$ are the number of observations of the reference and the test group, respectively, and $p$

is the number of sampling or time points, as mentioned already. It is obvious that $(n_T + n_R)$ must be greater than $(p + 1)$ . The formula for $\textit{CR}$ gives a $p$ -variate 90% confidence region for the possible true differences.

Examples


# Collecting the required information
time_points <- suppressWarnings(as.numeric(gsub("([^0-9])", "",
                                                colnames(dip1))))
tcol <- which(!is.na(time_points))
b1 <- dip1$type == "R"

# Hotelling's T2 statistics
l_hs <- get_T2_two(m1 = as.matrix(dip1[b1, tcol]),
                   m2 = as.matrix(dip1[!b1, tcol]),
                   signif = 0.05)

# Calling gep_by_nera()
res <- gep_by_nera(n_p = as.numeric(l_hs[["Parameters"]]["df1"]),
                   kk = as.numeric(l_hs[["Parameters"]]["K"]),
                   mean_diff = l_hs[["means"]][["mean.diff"]],
                   m_vc = l_hs[["S.pool"]],
                   ff_crit = as.numeric(l_hs[["Parameters"]]["F.crit"]),
                   y = rep(1, times = l_hs[["Parameters"]]["df1"] + 1),
                   max_trial = 100, tol = 1e-9)

# Expected result in res[["points"]]
#              [,1]
# t.5   -15.7600077
# t.10  -13.6501734
# t.15  -11.6689469
# t.20   -9.8429369
# t.30   -6.6632182
# t.60   -0.4634318
# t.90    2.2528551
# t.120   3.3249569
#       -17.6619995

# Rows t.5 to t.120 represent the points on the CR bounds.The unnamed last row
# represents the Lagrange multiplier lambda.

# If 'max_trial' is too small, the Newton-Raphson search may not converge.
## Not run:

  tryCatch(
    gep_by_nera(n_p = as.numeric(l_hs[["Parameters"]]["df1"]),
                kk = as.numeric(l_hs[["Parameters"]]["K"]),
                mean_diff = l_hs[["means"]][["mean.diff"]],
                m_vc = l_hs[["S.pool"]],
                ff_crit = as.numeric(l_hs[["Parameters"]]["F.crit"]),
                y = rep(1, times = l_hs[["Parameters"]]["df1"] + 1),
                max_trial = 5, tol = 1e-9),
    warning = function(w) message(w),
    finally = message("\nMaybe increasing the number of max_trial could help."))
## End(Not run)

References

United States Food and Drug Administration (FDA). Guidance for industry: dissolution testing of immediate release solid oral dosage forms. 1997.

https://www.fda.gov/media/70936/download

United States Food and Drug Administration (FDA). Guidance for industry: immediate release solid oral dosage form: scale-up and post-approval changes, chemistry, manufacturing and controls, in vitro dissolution testing, and in vivo bioequivalence documentation (SUPAC-IR). 1995.

https://www.fda.gov/media/70949/download

European Medicines Agency (EMA), Committee for Medicinal Products for Human Use (CHMP). Guideline on the Investigation of Bioequivalence. 2010; c("\n", "CPMP/EWP/QWP/1401/98 Rev. 1")..

Moore, J.W., and Flanner, H.H. Mathematical comparison of curves with an emphasis on in-vitro dissolution profiles. Pharm Tech. 1996; 20 (6): 64-74.

Tsong, Y., Hammerstrom, T., Sathe, P.M., and Shah, V.P. Statistical assessment of mean differences between two dissolution data sets. Drug Inf J. 1996; 30 : 1105-1112.

tools:::Rd_expr_doi("10.1177/009286159603000427")

Connolly, M. SAS(R) IML Code to calculate an upper confidence limit for multivariate statistical distance; 2000; Wyeth Lederle Vaccines, Pearl River, NY.

https://analytics.ncsu.edu/sesug/2000/p-902.pdf

gep_by_nera function