RankData-class

RankData Class

An S4 class to represent ranking data

The ranking representation and the ordering representation of ranking data are easily confused. I therefore use an S4 class to store all the information about the ranking data, which avoids unnecessary confusion.
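To make the distinction concrete, here is a minimal base R sketch that converts a single complete ranking between the two representations. It does not call the package, and the variable names are illustrative only. It assumes the usual convention that an ordering lists the objects from first to last, while a ranking gives the rank assigned to each object.

```r
# Ordering representation: position i holds the object placed i-th.
# Here object 3 comes first, then object 1, then object 4, then object 2.
ordering <- c(3, 1, 4, 2)

# Ranking representation: position i holds the rank given to object i.
# For a complete ranking the two are inverse permutations of each other,
# so order() converts one into the other.
ranking <- order(ordering)
print(ranking)  # 2 4 1 3

# Applying order() again recovers the ordering representation.
back <- order(ranking)
print(back)     # 3 1 4 2
```

Because both vectors are permutations of 1..nobj, nothing in the data itself reveals which representation is in use, which is why it helps to fix one representation in the class definition.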

Details

It is possible to store both complete and top-q rankings in the same RankData object. Three slots, topq, subobs and q_ind, are introduced for this purpose. There is generally no need to specify these slots if your data set contains only a single "q" level (for example, all data are top-10 rankings). The "q" level for a complete ranking should be nobj-1. Moreover, if the rankings are organized in chunks of increasing "q" levels (for example, top-2 rankings followed by top-3 rankings followed by top-5 rankings, etc.), then the slots subobs and q_ind can be inferred correctly by the initializer. It is therefore highly recommended that you organise the ranking matrix in this way and use the initializer.
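The chunk layout described above can be sketched in base R as follows. This is only one reading of the slot descriptions, not the package's initializer code: it assumes that unobserved objects carry ranking q+1 (so the "q" level of a row is its maximum minus one) and that q_ind holds the first row of each chunk followed by ndistinct+1. The toy data are made up for illustration.

```r
# Toy ranking matrix for nobj = 5 objects, one distinct ranking per row,
# organised in chunks of increasing "q" level.
ranking <- rbind(
  c(1, 2, 3, 3, 3),  # top-2: objects 1 and 2 observed, the rest get 3
  c(3, 1, 3, 2, 3),  # top-2
  c(1, 4, 2, 3, 4),  # top-3: unobserved objects get 4
  c(2, 1, 3, 5, 4)   # complete ranking, so q = nobj - 1 = 4
)
count <- c(4, 1, 2, 3)  # observations per distinct ranking

# Under the q+1 convention the "q" level of each row is its maximum minus 1.
q_row <- apply(ranking, 1, max) - 1
stopifnot(!is.unsorted(q_row))  # chunks must come in increasing "q" order

# Distinct "q" levels, in order of appearance.
topq <- unique(q_row)

# Number of observations in each chunk.
subobs <- as.vector(tapply(count, factor(q_row, levels = topq), sum))

# First row of each chunk, followed by ndistinct + 1.
q_ind <- c(match(topq, q_row), nrow(ranking) + 1)

print(topq)    # 2 3 4
print(subobs)  # 5 2 3
print(q_ind)   # 1 3 4 5
```

If the rows were not grouped by increasing "q" level, the chunk boundaries could not be read off this simply, which is the practical reason for the recommended layout.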

Slots

  • nobj: the number of ranked objects. If not provided, it is inferred as the maximum ranking in the data set. In a data set of top-q rankings the maximum ranking is q+1 rather than the number of objects, so nobj must be provided for top-q data.
  • nobs: the number of observations. It need not be provided during initialization, since it must equal the sum of slot count.
  • ndistinct: the number of distinct rankings. It need not be provided during initialization, since it must equal the number of rows of slot ranking.
  • ranking: a matrix that stores the ranking representation of the distinct rankings. Each row contains one ranking. For a top-q ranking, all unobserved objects have ranking q+1.
  • count: the number of observations of each distinct ranking, one entry per row of ranking.
  • topq: a numeric vector that stores the top-q ranking information. See the Details section for more information.
  • subobs: a numeric vector that stores the number of observations in each chunk of top-q rankings.
  • q_ind: a numeric vector that stores the beginning and ending of each chunk of top-q rankings. The last element has to be ndistinct+1.
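The relationships between these slots can be illustrated with a short base R sketch. It is not package code: the helper encode_topq and the toy data are hypothetical, and the sketch only mirrors what the slot descriptions state (nobs is the sum of count, ndistinct is the number of rows of ranking, and unobserved objects in a top-q ranking receive q+1).

```r
# Raw observations, one ranking per row (3 objects, 5 observations).
raw <- rbind(
  c(1, 2, 3),
  c(2, 1, 3),
  c(1, 2, 3),
  c(3, 1, 2),
  c(1, 2, 3)
)

# Collapse to distinct rankings and tally how often each one occurs.
key <- apply(raw, 1, paste, collapse = ",")
first <- !duplicated(key)
ranking <- raw[first, , drop = FALSE]
count <- as.vector(table(factor(key, levels = key[first])))

# The derived slots follow directly from ranking and count.
nobs <- sum(count)          # 5
ndistinct <- nrow(ranking)  # 3

# Hypothetical helper: turn the observed top-q objects (listed from first
# to q-th place) into the ranking representation, where every unobserved
# object gets ranking q + 1.
encode_topq <- function(top_objects, nobj) {
  q <- length(top_objects)
  r <- rep(q + 1, nobj)
  r[top_objects] <- seq_len(q)
  r
}

# Top-3 ranking of 6 objects: object 5 first, object 2 second, object 6 third.
top3 <- encode_topq(c(5, 2, 6), nobj = 6)
print(top3)  # 4 2 4 4 1 3
```

Note that max(top3) is 4, not 6, which is why nobj cannot be inferred from top-q data alone.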

Examples

# creating a data set with only complete rankings
rankmat <- replicate(10,sample(1:52,52), simplify = "array")
countvec <- sample(1:52,52,replace=TRUE)
rankdat <- new("RankData",ranking=rankmat,count=countvec)

# creating a data set with both complete and top-10 rankings
rankmat_in <- replicate(10,sample(1:52,52), simplify = "array")
rankmat_in[rankmat_in>11] <- 11
rankmat_total <- cbind(rankmat_in, rankmat)
countvec_total <- c(countvec,countvec)
rankdat2 <- new("RankData",ranking=rankmat_total,count=countvec_total, nobj=52, topq=c(10,51))

References

Qian Z, Yu PLH (2019). "Weighted Distance-Based Models for Ranking Data Using the R Package rankdist." Journal of Statistical Software, 90(5), 1-31. doi:10.18637/jss.v090.i05

See Also

RankInit, RankControl

  • Maintainer: Zhaozhi Qian
  • License: GPL (>= 2)
  • Last published: 2019-07-27
