Paired Subsampling to enable inference on the generalization error. One should not directly call $aggregate() with a non-CI measure on a resample result using paired subsampling, as most of the resampling iterations are only intended to estimate the standard error.
Details
The first repeats_in iterations are a standard ResamplingSubsampling and should be used to obtain a point estimate of the generalization error. The remaining iterations should be used to estimate the standard error. Here, the data is divided repeats_out times into two equally sized disjoint subsets, to each of which a subsampling with repeats_in repetitions is applied. See the $unflatten(iter) method to map the iterations to this nested structure.
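The structure described above implies a total of repeats_in + 2 * repeats_out * repeats_in iterations. The following base R sketch splits the iteration indices by their role; the helper name iteration_roles is hypothetical and not part of the package.

```r
# Hypothetical helper (not part of the package): split the iteration
# indices into those used for the point estimate and those used for
# the standard error, following the description in Details.
iteration_roles = function(repeats_in, repeats_out) {
  # the first repeats_in iterations are a standard subsampling
  n_point = repeats_in
  # each outer repetition yields two halves, each subsampled repeats_in times
  n_paired = 2L * repeats_out * repeats_in
  list(
    point_estimate = seq_len(n_point),
    standard_error = n_point + seq_len(n_paired),
    total = n_point + n_paired
  )
}

roles = iteration_roles(repeats_in = 3L, repeats_out = 2L)
roles$total                    # 3 + 2 * 2 * 3 = 15
length(roles$standard_error)   # 12
```

With these small values, only 3 of the 15 iterations contribute to the point estimate, which is why aggregating over all iterations with a non-CI measure would be misleading.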
Parameters
repeats_in :: integer(1)
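The number of subsampling repetitions, used both for the initial standard subsampling and within each of the two partitions.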
repeats_out :: integer(1)
The number of times the data is divided into two equally sized disjoint subsets.
ratio :: numeric(1)
The proportion of data to use for training.
Examples
pw_subs = rsmp("paired_subsampling")
pw_subs
References
Nadeau, Claude, Bengio, Yoshua (1999). Inference for the generalization error.
Advances in Neural Information Processing Systems, 12.
Method unflatten()
Unflatten the resampling iteration into a more informative representation:
inner: The subsampling iteration
outer: NA for the first repeats_in iterations. Otherwise it indicates the outer iteration of the paired subsamplings.
partition: NA for the first repeats_in iterations. Otherwise it indicates whether the subsampling is applied to the first or second partition of the two disjoint halves.
Usage
ResamplingPairedSubsampling$unflatten(iter)
Arguments
iter: (integer(1))
Resampling iteration.
Returns
list(outer, partition, inner)
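To illustrate the kind of mapping this method performs, here is a minimal base R sketch. It is not the package's implementation: the function name unflatten_sketch, the ordering of the paired iterations (outer varies slowest, then partition, then inner), and the encoding of partition as 1 or 2 are all assumptions made for illustration.

```r
# Hypothetical re-implementation of the mapping; not the package's code.
# Assumed ordering after the first repeats_in iterations:
# outer varies slowest, then partition (1 or 2), then inner.
unflatten_sketch = function(iter, repeats_in, repeats_out) {
  stopifnot(iter >= 1L, iter <= repeats_in + 2L * repeats_out * repeats_in)
  if (iter <= repeats_in) {
    # standard subsampling part: no outer iteration, no partition
    return(list(outer = NA_integer_, partition = NA_integer_, inner = as.integer(iter)))
  }
  k = iter - repeats_in - 1L   # zero-based index within the paired part
  block = k %/% repeats_in     # zero-based index of the (outer, partition) block
  list(
    outer = as.integer(block %/% 2L + 1L),
    partition = as.integer(block %% 2L + 1L),
    inner = as.integer(k %% repeats_in + 1L)
  )
}

first = unflatten_sketch(2L, repeats_in = 3L, repeats_out = 2L)
paired = unflatten_sketch(7L, repeats_in = 3L, repeats_out = 2L)
```

Under these assumptions, iteration 2 belongs to the standard subsampling (outer and partition are NA), while iteration 7 is the first inner repetition on the second partition of the first outer split. On an actual ResamplingPairedSubsampling object, call $unflatten(iter) rather than relying on this sketch.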
Method clone()
The objects of this class are cloneable with this method.