Convenience function for generating functional data
This model generates shape outliers with a different covariance structure from that of the main model. The main model is of the form:
$$X_i(t) = \mu t + e_i(t),$$
and the contamination model is of the form:
$$X_i(t) = \mu t + \tilde{e}_i(t),$$
where $t \in [0,1]$, and $e_i(t)$ and $\tilde{e}_i(t)$ are Gaussian processes with zero mean and covariance function of the form:
$$\gamma(s,t) = \alpha \exp(-\beta |t-s|^{\nu}).$$
Please see the simulation models vignette with vignette("simulation_models", package = "fdaoutlier") for more details.
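The two models above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: exp_cov() and draw_curves() are hypothetical helper names, the curves are drawn through a Cholesky factor of the covariance matrix, and the parameter values are the defaults listed on this page.

```r
# Sketch only: base-R sampling from the main and contamination models.
# exp_cov() and draw_curves() are hypothetical helpers, not fdaoutlier functions.
exp_cov <- function(t, alpha, beta, nu) {
  # gamma(s, t) = alpha * exp(-beta * |t - s|^nu) on all pairs of grid points
  d <- abs(outer(t, t, "-"))
  alpha * exp(-beta * d^nu)
}

draw_curves <- function(n, t, mu, alpha, beta, nu) {
  p <- length(t)
  sigma <- exp_cov(t, alpha, beta, nu)
  # chol() returns the upper-triangular factor R with t(R) %*% R == sigma
  R <- chol(sigma)
  z <- matrix(rnorm(n * p), nrow = n, ncol = p)
  # Rows of z %*% R have covariance sigma; add the mean mu * t to each row
  sweep(z %*% R, 2, mu * t, "+")
}

set.seed(1)
t_grid <- seq(0, 1, length.out = 50)
# Main model: alpha = 1, beta = 1, nu = 1
main_curves <- draw_curves(95, t_grid, mu = 4, alpha = 1, beta = 1, nu = 1)
# Contamination model: alpha = 5, beta = 2, nu = 0.5
outlier_curves <- draw_curves(5, t_grid, mu = 4, alpha = 5, beta = 2, nu = 0.5)
```

Both sets of curves share the mean μt. Only the covariance parameters differ, which is why the contaminated curves differ in shape rather than in level.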
n: The number of curves to generate. Set to 100 by default.
p: The number of evaluation points of the curves. Curves are usually generated over the interval [0,1]. Set to 50 by default.
outlier_rate: A value in [0,1] indicating the proportion of outliers. A value of 0.06 means that about 6% of the observations will be outliers; the exact number depends on whether the parameter deterministic is TRUE or FALSE. Set to 0.05 by default.
mu: The mean value of the functions, i.e., the μ in the mean term μt of the model. Set to 4 by default.
cov_alpha, cov_alpha2: A value indicating the coefficient of the exponential function of the covariance matrix, i.e., the α in the covariance function. cov_alpha is for the main model while cov_alpha2 is for the covariance function of the contamination model. cov_alpha is set to 1 by default while cov_alpha2 is set to 5 by default.
cov_beta, cov_beta2: A value indicating the coefficient of the terms inside the exponential function of the covariance matrix, i.e., the β in the covariance function. cov_beta is for the main model while cov_beta2 is for the covariance function of the contamination model. cov_beta is set to 1 by default while cov_beta2 is set to 2 by default.
cov_nu, cov_nu2: A value indicating the power to which to raise the terms inside the exponential function of the covariance matrix, i.e., the ν in the covariance function. cov_nu is for the main model while cov_nu2 is for the covariance function of the contamination model. cov_nu is set to 1 by default while cov_nu2 is set to 0.5 by default.
deterministic: A logical value. If TRUE, the function always returns round(n*outlier_rate) outliers, so the number of outliers is constant. If FALSE, the number of outliers is determined by n Bernoulli trials with probability outlier_rate, so the number of outliers returned is random. TRUE by default.
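The two ways of fixing the number of outliers can be illustrated in base R. This is a minimal sketch of the rule described for deterministic, not the package's code. The use of sample() to choose which rows are outliers in the fixed-count case is an assumption.

```r
n <- 100
outlier_rate <- 0.05
set.seed(2)

# deterministic = TRUE: the count is fixed at round(n * outlier_rate)
n_fixed <- round(n * outlier_rate)
# Assumption: the outlying rows are picked uniformly at random without replacement
idx_fixed <- sort(sample(n, n_fixed))

# deterministic = FALSE: one Bernoulli trial per curve, so the count varies by run
is_outlier <- rbinom(n, size = 1, prob = outlier_rate) == 1
idx_random <- which(is_outlier)

length(idx_fixed)   # always n_fixed
length(idx_random)  # random, with expected value n * outlier_rate
```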
seed: A seed to set for reproducibility. NULL by default in which case a seed is not set.
plot: A logical value indicating whether to plot data.
plot_title: Title of the plot if plot is TRUE.
title_cex: Numerical value indicating the size of the plot title relative to the device default. Set to 1.5 by default. Ignored if plot = FALSE.
show_legend: A logical value indicating whether to add a legend to the plot if plot = TRUE.
ylabel: The label of the y-axis. Set to "" by default.
xlabel: The label of the x-axis if plot = TRUE. Set to "gridpoints" by default.
Returns
A list containing:
- data: a matrix of size n by p containing the simulated data set.
- true_outliers: a vector of integers indicating the row indices of the outliers in the generated data.
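A short usage sketch shows how the returned list might be used. It assumes the fdaoutlier package is installed and that this page documents its simulation_model5() function; the function name is not stated on this page, so treat it as an assumption. The argument names are the ones listed above.

```r
# Assumptions: fdaoutlier is installed, and the documented function is
# fdaoutlier::simulation_model5() (name not stated on this page).
if (requireNamespace("fdaoutlier", quietly = TRUE)) {
  sim <- fdaoutlier::simulation_model5(n = 100, p = 50, outlier_rate = 0.05,
                                       seed = 50, plot = FALSE)

  dim(sim$data)        # n by p matrix, one curve per row
  sim$true_outliers    # row indices of the contaminated curves

  # Split the curves by row index; setdiff() also handles an empty outlier set
  regular_rows <- setdiff(seq_len(nrow(sim$data)), sim$true_outliers)
  outliers <- sim$data[sim$true_outliers, , drop = FALSE]
  regular  <- sim$data[regular_rows, , drop = FALSE]
}
```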