weighted_mean_winsorized function

Weighted Winsorized Mean and Total (bare-bone functions)

Weighted winsorized mean and total (bare-bone functions with limited functionality; see svymean_winsorized and svytotal_winsorized for more capable methods)

    weighted_mean_winsorized(x, w, LB = 0.05, UB = 1 - LB, info = FALSE, na.rm = FALSE)
    weighted_mean_k_winsorized(x, w, k, info = FALSE, na.rm = FALSE)
    weighted_total_winsorized(x, w, LB = 0.05, UB = 1 - LB, info = FALSE, na.rm = FALSE)
    weighted_total_k_winsorized(x, w, k, info = FALSE, na.rm = FALSE)

Arguments

  • x: [numeric vector] data.
  • w: [numeric vector] weights (same length as x).
  • LB: [double] lower bound of winsorization such that $0 \leq$ `LB` $<$ `UB` $\leq 1$.
  • UB: [double] upper bound of winsorization such that $0 \leq$ `LB` $<$ `UB` $\leq 1$.
  • info: [logical] indicating whether additional information should be returned (default: FALSE).
  • na.rm: [logical] indicating whether NA values should be removed before the computation proceeds (default: FALSE).
  • k: [integer] number of observations to be winsorized at the top of the distribution.

Details

  • Characteristic: Population mean or total. Let $\mu$ denote the estimated winsorized population mean; then, the estimated population total is given by $\hat{N} \mu$ with $\hat{N} = \sum_i w_i$, where summation is over all observations in the sample.
    
  • Modes of winsorization: The amount of winsorization can be specified in relative or absolute terms:

      * **Relative:** By specifying `LB` and `UB`, the methods winsorize the `LB`$~\cdot 100\%$ of the smallest observations and the (1 - `UB`)$~\cdot 100\%$ of the largest observations in the data.
      * **Absolute:** By specifying the argument `k` in the functions with the "infix" `_k_` in their name, the largest $k$ observations are winsorized, $0 < k < n$, where $n$ denotes the sample size. E.g., `k = 2` implies that the largest and the second largest observations are winsorized.
    
  • Variance estimation: See the survey methods:

      * `svymean_winsorized`,
      * `svytotal_winsorized`,
      * `svymean_k_winsorized`,
      * `svytotal_k_winsorized`.
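
The absolute mode and the relation between mean and total can be illustrated with a minimal base-R sketch. It assumes that one-sided k winsorization replaces the `k` largest values by the (n - k)-th order statistic; the package's own implementation may differ in detail. The toy vectors `x` and `w` and the names `cutoff`, `x_wins`, `N_hat`, `mu_hat` and `total_hat` are hypothetical and chosen only for illustration.

```r
# Toy data (hypothetical values, for illustration only)
x <- c(2, 4, 5, 7, 9, 30, 120)
w <- c(10, 10, 20, 20, 10, 5, 5)
k <- 2

# Assumed winsorization rule: the k largest values are replaced by the
# (n - k)-th order statistic
n <- length(x)
cutoff <- sort(x)[n - k]
x_wins <- pmin(x, cutoff)

# Weighted mean of the winsorized data and the corresponding total,
# using N_hat = sum of the weights
N_hat <- sum(w)
mu_hat <- sum(w * x_wins) / N_hat
total_hat <- N_hat * mu_hat
```

With these toy values, the two largest observations (30 and 120) are pulled down to the cutoff before the weighted mean is computed, and the total is simply the mean scaled by the sum of the weights.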
    

Returns

The return value depends on info:

  • `info = FALSE`: estimate of mean or total `[double]`

  • `info = TRUE`: a `[list]` with items:

      * `characteristic` `[character]`,
      * `estimator` `[character]`,
      * `estimate` `[double]`,
      * `variance` (default: `NA`),
      * `robust` `[list]`,
      * `residuals` `[numeric vector]`,
      * `model` `[list]`,
      * `design` (default: `NA`),
      * `[call]`
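
As a usage sketch of the `info = TRUE` case, the items of the returned list can be accessed by name. The sketch assumes that the robsurvey package (which provides these functions and the `workplace` data) is installed; `res` is a hypothetical variable name.

```r
library(robsurvey)  # assumed to be installed; provides the workplace data

# Request the full list instead of the bare estimate
res <- weighted_mean_winsorized(workplace$employment, workplace$weight,
                                LB = 0.05, info = TRUE)

names(res)     # item names of the returned list
res$estimate   # point estimate [double]
res$variance   # NA by default, as stated in the list of items
```

For variance estimates, the survey methods listed under Details are the ones to use.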
    

See Also

Overview (of all implemented functions)

svymean_winsorized, svymean_k_winsorized, svytotal_winsorized and svytotal_k_winsorized

Examples

    library(robsurvey)  # package providing the functions and the workplace data

    head(workplace)

    # Estimated winsorized population mean (5% symmetric winsorization)
    weighted_mean_winsorized(workplace$employment, workplace$weight, LB = 0.05)

    # Estimated one-sided k winsorized population total (2 observations are
    # winsorized at the top of the distribution)
    weighted_total_k_winsorized(workplace$employment, workplace$weight, k = 2)

  • Maintainer: Tobias Schoch
  • License: GPL (>= 2)
  • Last published: 2024-08-22