ridge_muhat_lfo_pai() R function from [banditsCI]

Leave-future-out ridge-based estimates for arm expected rewards.

Computes leave-future-out ridge-based estimates of arm expected rewards based on provided data.
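To make the idea concrete, the following is a minimal sketch of a leave-future-out ridge fit, not the package's implementation. It assumes that "leave-future-out" means each batch is predicted from a model fitted only on strictly earlier batches, that one ridge model is fitted per arm, and that the intercept is penalized like any other coefficient. The helper names `ridge_fit` and `lfo_ridge_sketch` are hypothetical, and the sketch returns a simpler [A, K] matrix rather than the [A, A, K] array the real function returns.

```r
# Closed-form ridge fit: beta = (X'X + alpha * I)^(-1) X'y.
# Assumption: an intercept column is added and penalized like the rest.
ridge_fit <- function(x, y, alpha = 1) {
  xb <- cbind(1, x)
  solve(crossprod(xb) + alpha * diag(ncol(xb)), crossprod(xb, y))
}

# Hypothetical helper: for each batch after the first, fit one ridge model
# per arm on observations from earlier batches only, then predict rewards
# for the contexts in the current batch.
lfo_ridge_sketch <- function(xs, ws, yobs, K, batch_sizes, alpha = 1) {
  A <- nrow(xs)
  ends <- cumsum(batch_sizes)
  starts <- c(1, head(ends, -1) + 1)
  preds <- matrix(NA_real_, nrow = A, ncol = K)  # first batch stays NA
  for (b in seq_along(batch_sizes)[-1]) {
    past <- seq_len(ends[b - 1])
    current <- starts[b]:ends[b]
    for (k in seq_len(K)) {
      idx <- past[ws[past] == k]
      if (length(idx) == 0) next  # arm k not yet observed: leave NA
      beta <- ridge_fit(xs[idx, , drop = FALSE], yobs[idx], alpha)
      preds[current, k] <- cbind(1, xs[current, , drop = FALSE]) %*% beta
    }
  }
  preds
}

set.seed(1)
xs_demo <- matrix(runif(40 * 2), nrow = 40)
ws_demo <- sample(1:3, 40, replace = TRUE)
yobs_demo <- runif(40)
preds_demo <- lfo_ridge_sketch(xs_demo, ws_demo, yobs_demo, K = 3,
                               batch_sizes = c(10, 10, 20))
dim(preds_demo)
```

Because no earlier data exist for the first batch, the sketch leaves those rows as NA; how the package handles that case is not stated here.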


Usage

ridge_muhat_lfo_pai(xs, ws, yobs, K, batch_sizes, alpha = 1)

Arguments

xs: Matrix. Covariates of shape [A, p], where A is the number of observations and p is the number of features. Must not contain NA values.
ws: Integer vector. Indicates which arm was chosen for observations at each time t. Length A. Must not contain NA values.
yobs: Numeric vector. Observed outcomes, length A. Must not contain NA values.
K: Integer. Number of arms. Must be a positive integer.
batch_sizes: Integer vector. Sizes of batches in which data is processed. Must be positive integers.
alpha: Numeric. Ridge regression regularization parameter. Default is 1.
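The constraints listed above can be checked before calling the function. The sketch below is a hypothetical pre-flight helper, `check_lfo_inputs`, written only from the argument descriptions; it is not part of banditsCI, and the package may perform its own, different validation.

```r
# Hypothetical helper: checks the documented constraints on each argument
# and stops with an error on the first one that fails.
check_lfo_inputs <- function(xs, ws, yobs, K, batch_sizes, alpha = 1) {
  stopifnot(
    is.matrix(xs), !anyNA(xs),                        # [A, p] matrix, no NA
    length(ws) == nrow(xs), !anyNA(ws),               # one arm per observation
    is.numeric(yobs), length(yobs) == nrow(xs), !anyNA(yobs),
    length(K) == 1, K >= 1, K == round(K),            # positive integer
    all(batch_sizes >= 1),
    all(batch_sizes == round(batch_sizes)),           # positive integers
    is.numeric(alpha), length(alpha) == 1
  )
  invisible(TRUE)
}

xs_ok <- matrix(runif(12), nrow = 6, ncol = 2)
ws_ok <- c(1L, 2L, 1L, 2L, 1L, 2L)
yobs_ok <- runif(6)
check_lfo_inputs(xs_ok, ws_ok, yobs_ok, K = 2, batch_sizes = c(3, 3))
```

Passing a `yobs` that contains NA, or a `ws` whose length differs from `nrow(xs)`, makes the helper stop with an error.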

Returns

A 3D array containing the expected reward estimates for each arm and each time t, of shape [A, A, K].
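As a rough illustration of working with an array of this shape, the sketch below uses a small stand-in array, `muhat_demo`, rather than real output. The only thing it relies on is the documented [A, A, K] shape, with the third dimension indexing arms; what the first two dimensions mean individually is not spelled out here, so the sketch does not assume it.

```r
# Stand-in array with the documented [A, A, K] shape (not real estimates).
A <- 4
K <- 2
muhat_demo <- array(seq_len(A * A * K), dim = c(A, A, K))

dim(muhat_demo)             # 4 4 2

# Fixing the third index selects the A x A slice for one arm.
arm1 <- muhat_demo[, , 1]
dim(arm1)                   # 4 4

# A single estimate is addressed by all three indices.
muhat_demo[2, 3, 1]
```

For the real return value, `dim(muhat)` is a cheaper sanity check than printing the whole array, which has A * A * K entries.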

Examples


library(banditsCI)
set.seed(123)
p <- 3
K <- 5
A <- 100
xs <- matrix(runif(A * p), nrow = A, ncol = p)
ws <- sample(1:K, A, replace = TRUE)
yobs <- runif(A)
batch_sizes <- c(25, 25, 25, 25)
muhat <- ridge_muhat_lfo_pai(xs, ws, yobs, K, batch_sizes)
print(muhat)

banditsCI package

Maintainer: Molly Offer-Westort
License: GPL (>= 3)
Last published: 2024-11-29
