Find the Monte Carlo (MC) p-value by generating N replications of a statistic.
mc(y, statistic, ..., dgp = function(y) sample(y, replace = TRUE), N = 99, type = c("geq", "leq", "absolute", "two-tailed"))
Arguments
y: A vector or data frame.
statistic: A function or a character string that specifies how the statistic is computed. The function must take y as its first argument and return a scalar.
...: Other named arguments for statistic, which are passed unchanged each time it is called.
dgp: A function. It takes y as its first argument and returns a simulated y. It should represent the data generating process under the null. The default is function(y) sample(y, replace = TRUE), i.e. bootstrap resampling of y.
N: An atomic vector. Number of replications of the test statistic.
type: A character string. It specifies the type of test the p-value function produces. The possible values are geq, leq, absolute and two-tailed. Default is geq.
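The contracts described above for statistic, ... and dgp can be sketched in base R. The names trimmed_mean, null_dgp and eval_statistic below are hypothetical, and resolving a character string with match.fun is an assumption about how such an argument could be handled, not a description of what mc does internally.

```r
# Hypothetical statistic: takes y first and returns a scalar. `trim` is an
# extra named argument of the kind that would be supplied through `...`.
trimmed_mean <- function(y, trim = 0) mean(y, trim = trim)

# Hypothetical dgp: takes y and returns a simulated sample of the same length,
# drawn under the null (here, standard normal data).
null_dgp <- function(y) rnorm(length(y))

# Hypothetical helper: accept `statistic` as a function or a character string
# (assumed to be resolved like match.fun) and forward `...` unchanged.
eval_statistic <- function(y, statistic, ...) {
  f <- match.fun(statistic)
  f(y, ...)
}

set.seed(1)
y <- rnorm(10)
s_fun <- eval_statistic(y, trimmed_mean, trim = 0.1)    # statistic as a function
s_chr <- eval_statistic(y, "trimmed_mean", trim = 0.1)  # statistic as a string
y_sim <- null_dgp(y)                                    # one simulated sample
```

Both calls return the same scalar, and y_sim has the same length as y, which is what the statistic and dgp arguments are documented to require.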
Returns
The returned value is an object of class mc containing the following components:
S0: Observed value of statistic.
pval: Monte Carlo p-value of statistic.
y: Data specified in call.
statistic: statistic function specified in call.
dgp: dgp function specified in call.
N: Number of replications specified in call.
type: type of p-value specified in call.
call: Original call to mc.
seed: Value of .Random.seed at the start of mc call.
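A stored value of .Random.seed can be used to replay the same random draws. The base-R sketch below does not call mc; saved_seed is a hypothetical stand-in for the seed component of a returned object.

```r
set.seed(999)
saved_seed <- .Random.seed   # snapshot of the RNG state before any draws
first_draw <- runif(3)

# Restore the saved RNG state in the global environment and draw again
assign(".Random.seed", saved_seed, envir = globalenv())
second_draw <- runif(3)

identical(first_draw, second_draw)
```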
Details
The dgp function defined by the user is used to generate new observations in order to compute the simulated statistics.
Then pvalue is applied to the statistic and its simulated values. pvalue computes the p-value by ranking the statistic compared to its simulated values. Ties in the ranking are broken according to a uniform distribution.
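The simulate-then-rank procedure described above can be sketched in base R as follows. The helper mc_pvalue_geq is hypothetical and is not the package's pvalue function. The (count + 1) / (N + 1) form and the way ties are broken (one uniform draw attached to each value) are assumptions that follow the usual Monte Carlo test convention.

```r
# Hypothetical helper: "geq" Monte Carlo p-value with uniform tie-breaking.
mc_pvalue_geq <- function(S0, S) {
  N  <- length(S)
  u0 <- runif(1)   # tie-breaker attached to the observed statistic
  u  <- runif(N)   # tie-breakers attached to the simulated statistics
  # A simulated value outranks S0 if it is strictly larger, or if it is tied
  # with S0 and its uniform draw is larger.
  n_above <- sum(S > S0 | (S == S0 & u > u0))
  (n_above + 1) / (N + 1)
}

set.seed(42)
y         <- rpois(20, lambda = 3)                     # discrete data, so ties are likely
dgp       <- function(y) rpois(length(y), lambda = 3)  # null: Poisson with mean 3
statistic <- function(y) mean(y)
N         <- 99

S0 <- statistic(y)                    # observed statistic
S  <- replicate(N, statistic(dgp(y))) # N simulated statistics under the null
p  <- mc_pvalue_geq(S0, S)
```

With this convention the p-value always lies on the grid 1/(N + 1), 2/(N + 1), ..., 1, which is why N = 99 or N = 999 are convenient choices.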
Four types of p-value are available: leq, geq, absolute and two-tailed. For a one-tailed test, leq returns the proportion of simulated values smaller than the statistic, while geq returns the proportion of simulated values greater than the statistic. For a two-tailed test with a symmetric statistic, one can use the absolute value of the statistic and its simulated values (i.e. type = absolute). If the statistic is not symmetric, one can specify the p-value type as two-tailed, which is equivalent to twice the minimum of leq and geq.
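To make the four types concrete, here is a minimal base-R sketch. The helper mc_pvalue_types is hypothetical and ignores tie-breaking (it assumes a continuous statistic). The (count + 1) / (N + 1) form and the cap of the two-tailed value at 1 are assumptions, not taken from the package.

```r
# Hypothetical helper: the four p-value types from an observed statistic S0
# and a vector S of simulated statistics.
mc_pvalue_types <- function(S0, S) {
  N   <- length(S)
  geq <- (sum(S >= S0) + 1) / (N + 1)                 # upper-tail test
  leq <- (sum(S <= S0) + 1) / (N + 1)                 # lower-tail test
  absolute   <- (sum(abs(S) >= abs(S0)) + 1) / (N + 1) # symmetric statistic
  two_tailed <- min(1, 2 * min(leq, geq))             # cap at 1 is an assumption
  c(geq = geq, leq = leq, absolute = absolute, "two-tailed" = two_tailed)
}

S <- c(-2, -1, 0.5, 1, 3)   # toy simulated values
p <- mc_pvalue_types(2, S)  # observed statistic S0 = 2
p
```

In this toy case only one simulated value (3) is at least as large as the observed statistic, so geq is 2/6, leq is 5/6, absolute is 3/6 and two-tailed is 2 * 2/6 = 2/3.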
Examples
## Example 1
## Kolmogorov-Smirnov Test using Monte Carlo

# Set seed
set.seed(999)

# Generate sample data
y <- rgamma(8, shape = 2, rate = 1)

# Set data generating process function
dgp <- function(y) rgamma(length(y), shape = 2, rate = 1)

# Set the statistic function to the Kolmogorov-Smirnov test for gamma distribution
statistic <- function(y) {
  out <- ks.test(y, "pgamma", shape = 2, rate = 1)
  return(out$statistic)
}

# Apply the Monte Carlo test with tie-breaker
mc(y, statistic = statistic, dgp = dgp, N = 999, type = "two-tailed")
References
Dufour, J.-M. (2006), Monte Carlo Tests with nuisance parameters: A general approach to finite sample inference and nonstandard asymptotics in econometrics. Journal of Econometrics, 133(2), 443-477.
Dufour, J.-M. and Khalaf, L. (2003), Monte Carlo Test Methods in Econometrics. In Badi H. Baltagi, ed., A Companion to Theoretical Econometrics, Blackwell Publishing Ltd, 494-519.