Computes the Events per Variable (EPV) metric for a dataset: the ratio of observed events to the number of explanatory variables.
getEPV(X, Y)
Arguments
X: Numeric matrix or data.frame. Explanatory variables. Qualitative variables must be transformed into binary variables.
Y: Numeric matrix or data.frame. Response variables. The object must have two columns named "time" and "event". For the event column, accepted values are 0/1 or FALSE/TRUE for censored and event observations, respectively.
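The following is a minimal sketch of one way to prepare inputs in the shape described above. It is an illustration under stated assumptions, not part of the package: the toy data and the names raw, X_prep and Y_prep are hypothetical, and base R's model.matrix() is used here as one possible way to turn a qualitative variable into binary columns (with the default treatment contrasts it omits the reference level, which may or may not be the encoding you want).

```r
# Hypothetical toy data: one numeric and one qualitative explanatory variable.
raw <- data.frame(
  age   = c(61, 54, 70, 48),
  stage = factor(c("I", "II", "III", "II"))
)

# model.matrix() expands the factor into 0/1 indicator columns.
# The first column is the intercept, which is dropped here.
X_prep <- model.matrix(~ ., data = raw)[, -1, drop = FALSE]

# Response with the two required column names, "time" and "event".
# FALSE marks a censored observation, TRUE marks an event.
Y_prep <- data.frame(
  time  = c(12, 30, 7, 25),
  event = c(TRUE, FALSE, TRUE, FALSE)
)

colnames(X_prep)  # "age" "stageII" "stageIII"
```

With this encoding, X_prep has three columns, so the qualitative variable contributes two of the variables counted in the EPV denominator.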
Returns
Returns the EPV value for the given X (explanatory variables) and Y (time and event variables) data.
Details
In survival analysis, the balance between the number of observed events and the number of explanatory variables affects the robustness and interpretability of the models fitted afterwards. getEPV computes the EPV metric as the ratio of events in the Y matrix to variables in the X matrix. The Y matrix must contain two columns, "time" and "event". The "event" column must hold binary values only: 0 or FALSE for censored observations, and 1 or TRUE for event observations. The function raises an error if the "event" column is not found.
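The calculation described above can be sketched in a few lines of base R. This is an illustrative sketch, not the package's implementation: epv_sketch, X_toy and Y_toy are hypothetical names, and the sketch assumes that the number of variables is simply the number of columns of X.

```r
# Hypothetical helper mirroring the description above; not the package's code.
epv_sketch <- function(X, Y) {
  # Stop early if the required "event" column is missing.
  if (!"event" %in% colnames(Y)) {
    stop("Y must contain a column named 'event'.")
  }
  # Count events: 1/TRUE is an event, 0/FALSE is censored.
  n_events <- sum(as.logical(Y[, "event"]))
  # Assumption: each column of X counts as one explanatory variable.
  n_events / ncol(X)
}

# Toy data: 10 observations, 4 explanatory variables, 6 events.
X_toy <- matrix(rnorm(40), nrow = 10, ncol = 4)
Y_toy <- data.frame(
  time  = c(5, 8, 12, 3, 9, 7, 15, 2, 11, 6),
  event = c(1, 0, 1, 1, 0, 1, 0, 1, 1, 0)
)

epv_sketch(X_toy, Y_toy)  # 6 events / 4 variables = 1.5
```

In this toy case there are 1.5 events per variable; adding columns to X without adding events lowers the value.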
Examples
data("X_proteomic")
data("Y_proteomic")
X <- X_proteomic
Y <- Y_proteomic
getEPV(X, Y)