design: an object of class survey.design; see svydesign.
LB: [double] lower bound of winsorization such that 0≤LB<UB≤1.
UB: [double] upper bound of winsorization such that 0≤LB<UB≤1.
na.rm: [logical] indicating whether NA values should be removed before the computation proceeds (default: FALSE).
trim_var: [logical] indicating whether the variance should be approximated by the variance estimator of the trimmed mean/total (default: FALSE).
k: [integer] number of observations to be winsorized at the top of the distribution.
...: additional arguments (currently not used).
Details
Package survey must be attached to the search path in order to use the functions (see library or require).
Characteristic.: Population mean or total. Let $\mu$ denote the estimated winsorized population mean; then, the estimated winsorized total is given by $\hat{N} \mu$ with $\hat{N} = \sum_i w_i$, where summation is over all observations in the sample.
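The relation between the mean and the total can be sketched in a few lines of base R. The weights and the value of the winsorized mean below are made-up numbers for illustration only; they do not come from the package or its data.

```r
# Hypothetical sampling weights and a hypothetical winsorized mean estimate
w <- c(12.5, 12.5, 20, 20, 35)
mu_hat <- 41.3

# Estimated population size: sum of the sampling weights
N_hat <- sum(w)

# Estimated winsorized total: estimated population size times the mean
total_hat <- N_hat * mu_hat
total_hat
```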
Modes of winsorization.: The amount of winsorization can be specified in relative or absolute terms:
* **Relative:** By specifying `LB` and `UB`, the method winsorizes the `LB`$\cdot 100\%$ of the smallest observations and the (1 - `UB`)$\cdot 100\%$ of the largest observations from the data.
* **Absolute:** By specifying argument `k` in the functions with the "infix" `_k_` in their name (e.g., `svymean_k_winsorized`), the largest $k$ observations are winsorized, $0 < k < n$, where $n$ denotes the sample size. E.g., `k = 2` implies that the largest and the second largest observation are winsorized.
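The two modes can be illustrated with a minimal base R sketch. The helpers `weighted_quantile`, `winsorize_relative` and `winsorize_top_k` are hypothetical names introduced here; they are not the package's implementation, which may define the weighted quantiles and handle ties differently. The sketch assumes that winsorizing means replacing values beyond a threshold by that threshold.

```r
# Hypothetical helpers for illustration only; not the robsurvey implementation.

# Weighted quantile: smallest x whose cumulative weight share reaches p
weighted_quantile <- function(x, w, p) {
  ord <- order(x)
  cum_w <- cumsum(w[ord])
  cum_share <- cum_w / cum_w[length(cum_w)]  # last share is exactly 1
  x[ord][which(cum_share >= p)[1]]
}

# Relative mode: clamp values below the LB-quantile and above the UB-quantile
winsorize_relative <- function(x, w, LB = 0.05, UB = 1 - LB) {
  stopifnot(0 <= LB, LB < UB, UB <= 1)
  lower <- weighted_quantile(x, w, LB)
  upper <- weighted_quantile(x, w, UB)
  pmin(pmax(x, lower), upper)
}

# Absolute mode: replace the k largest values by the (k + 1)-th largest value
winsorize_top_k <- function(x, k) {
  n <- length(x)
  stopifnot(k > 0, k < n)
  pmin(x, sort(x)[n - k])
}

# Toy data with two large values and equal (made-up) weights
x <- c(2, 4, 5, 7, 9, 11, 12, 15, 50, 120)
w <- rep(10, length(x))

x_rel <- winsorize_relative(x, w, LB = 0.1, UB = 0.9)
x_abs <- winsorize_top_k(x, k = 2)

# Weighted means of the winsorized values
weighted.mean(x_rel, w)
weighted.mean(x_abs, w)
```

In the absolute mode the number of modified observations is fixed in advance, whereas in the relative mode it depends on how the weights are distributed over the sample.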
Variance estimation.: Large-sample approximation based on the influence function; see Huber and Ronchetti (2009, Chap. 3.3) and Shao (1994). Two estimators are available:
- **`trim_var = FALSE`**: Variance estimator of the winsorized mean/total. The estimator depends on the estimated probability density function evaluated at the winsorization thresholds, which can be numerically unstable, depending on the context. As a remedy, a simplified variance estimator is available by setting `trim_var = TRUE`.
- **`trim_var = TRUE`**: Variance is approximated using the variance estimator of the trimmed mean/total.
Utility functions.: summary, coef, SE, vcov, residuals, fitted and robweights.
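As a hedged usage sketch, the utility functions can be applied to a fitted estimate as shown below. It assumes that package robsurvey is installed and provides the `workplace` data and `svymean_winsorized`, and that the variance switch is named `trim_var` as in the argument list; the design is specified as in the legacy mode of the Examples section.

```r
library(survey)
library(robsurvey)  # assumed to provide workplace and svymean_winsorized

# Stratified simple random sampling without replacement (legacy mode)
dn <- svydesign(ids = ~ID, strata = ~strat, fpc = ~fpc, weights = ~weight,
    data = workplace)

# Winsorized mean with the simplified (trimmed-mean) variance approximation
m <- svymean_winsorized(~employment, dn, LB = 0.05, trim_var = TRUE)

coef(m)     # point estimate
SE(m)       # estimated standard error
vcov(m)     # estimated variance
summary(m)  # summary of the estimate
```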
Huber, P. J. and Ronchetti, E. (2009). Robust Statistics, New York: John Wiley and Sons, 2nd edition. doi:10.1002/9780470434697
Shao, J. (1994). L-Statistics in Complex Survey Problems. The Annals of Statistics 22, 946--967. doi:10.1214/aos/1176325505
See Also
Overview (of all implemented functions)
weighted_mean_winsorized, weighted_mean_k_winsorized, weighted_total_winsorized and weighted_total_k_winsorized
Examples
head(workplace)

library(survey)
# Survey design for stratified simple random sampling without replacement
dn <- if (packageVersion("survey") >= "4.2") {
    # survey design with pre-calibrated weights
    svydesign(ids = ~ID, strata = ~strat, fpc = ~fpc, weights = ~weight,
        data = workplace, calibrate.formula = ~-1 + strat)
} else {
    # legacy mode
    svydesign(ids = ~ID, strata = ~strat, fpc = ~fpc, weights = ~weight,
        data = workplace)
}

# Estimated winsorized population mean (5% symmetric winsorization)
svymean_winsorized(~employment, dn, LB = 0.05)

# Estimated one-sided k winsorized population total (2 observations are
# winsorized at the top of the distribution)
svytotal_k_winsorized(~employment, dn, k = 2)