Test sampling design of multiple surveys using a stratified analysis
This function allows a series of sampling design settings to be set and tested on the simulated population. True population values are compared to stratified estimates of abundance.
surveys: A data.frame or data.table with a sequence of surveys and their settings with a format like the data.table returned by expand_surveys.
keep_details: Survey and stratified analysis details are dropped here to minimize object size. This argument allows the user to keep the details of one survey by specifying the survey number in the data.frame supplied to surveys.
n_sims: Number of times to simulate a survey over the simulated population. Requesting a large number of simulations here may max out your RAM.
n_loops: Number of times to run the sim_survey function. Total simulations run will be the product of the n_sims and n_loops arguments. Low numbers of n_sims and high numbers of n_loops will be easier on RAM, but may be slower.
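As a rough illustration of the trade-off between n_sims and n_loops, the sketch below lists several ways to split the same total number of simulations. The target of 100 and the candidate values are hypothetical and chosen only for illustration; the sketch does not call any SimSurvey function.

```r
# Hypothetical target: 100 simulations in total (illustrative value only)
target_total <- 100

# Candidate n_sims values; each must divide the target evenly
n_sims <- c(1, 5, 10, 25, 50, 100)

# n_loops needed to reach the target for each choice of n_sims
splits <- data.frame(n_sims = n_sims, n_loops = target_total / n_sims)

# Total simulations run is the product of n_sims and n_loops
splits$total <- splits$n_sims * splits$n_loops

print(splits)
```

Rows near the top (low n_sims, high n_loops) correspond to the settings described above as easier on RAM but possibly slower.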
cores: Number of cores to use in parallel. More cores should speed up the process.
export_dir: Directory for exporting results as they are generated. The main use of the export is to allow this process to pick up where test_surveys left off by calling resume_test. If NULL, nothing is exported.
length_group: Size of the length frequency bins for both abundance at length calculations and age-length-key construction. By default this value is inherited from the value defined in sim_abundance from the closure supplied to sim_length ("inherit"). A numeric value can also be supplied; however, a mismatch in length groupings will cause issues with strat_error, as true vs. estimated length groupings will be mismatched.
alk_scale: Spatial scale at which to construct and apply age-length-keys: "division" or "strat".
progress: Display progress bar and messages?
...: Arguments passed on to sim_survey
q: Closure, such as sim_logistic, for simulating catchability at age (returned values must be between 0 and 1)
trawl_dim: Trawl width and distance (same units as grid)
resample_cells: Allow resampling of sampling units (grid cells)? Setting to TRUE may introduce bias because depletion is imposed at the cell level.
binom_error: Impose binomial error? Setting to FALSE may introduce bias in stratified estimates at older ages because of more frequent rounding to zero.
min_sets: Minimum number of sets per strat
age_sampling: Should age sampling be "stratified" (default) or "random"?
age_length_group: Numeric value indicating the size of the length bins for stratified age sampling. Ignored if age_sampling = "random".
age_space_group: Should age sampling occur at the "division" (default), "strat" or "set" spatial scale? That is, age sampling can be spread across each "division", "strat" or "set" in each year to a maximum number within each length bin (cap is defined using the age_cap argument). Ignored if age_sampling = "random".
custom_sets: Supply an object of the same structure as returned by sim_sets which specifies a custom series of set locations to be sampled. Set locations are automated if custom_sets = NULL.
Returns
Adds a table of survey designs tested. Also adds details and summary stats of stratified estimate error to the sim list, with names ending in "_strat_error" or "_strat_error_stats". Error statistics include mean error ("ME"), mean absolute error ("MAE"), mean squared error ("MSE"), and root mean squared error ("RMSE"). Also adds a sample size summary table ("samp_totals") to the list. Survey and stratified analysis details are not kept to minimize object size.
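The four error statistics named above can be sketched in base R as follows. This is a minimal illustration, not the package's internal code: the input values are made up, and the sign convention (estimate minus true value) is an assumption.

```r
# Hypothetical true and estimated index values (not SimSurvey output)
I_true <- c(100, 120, 90, 110)
I_hat  <- c(95, 130, 92, 100)

# Assumed sign convention: error = estimate - true value
error <- I_hat - I_true

# Summary statistics of the error
stats <- c(
  ME   = mean(error),          # mean error (bias)
  MAE  = mean(abs(error)),     # mean absolute error
  MSE  = mean(error^2),        # mean squared error
  RMSE = sqrt(mean(error^2))   # root mean squared error
)

print(stats)
```

ME retains the sign of the error and so indicates bias, whereas MAE, MSE and RMSE measure its magnitude only.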
Details
Depending on the settings, test_surveys may take a long time to run. The resume_test function is for resuming partial runs of test_surveys. Note that progress bar time estimates will be biased here by previous completions. test_loop is a helper function used in both test_surveys and resume_test. CAUTION: while the dots construct is available in the resume_test function, be careful adding arguments, as doing so will change the simulation settings if the arguments added were not specified in the initial test_surveys run.
Examples
pop <- sim_abundance(ages = 1:20, years = 1:5) %>%
  sim_distribution(grid = make_grid(res = c(10, 10)))

surveys <- expand_surveys(set_den = c(1, 2) / 1000,
                          lengths_cap = c(100, 500),
                          ages_cap = c(5, 20))

## This call runs 25 simulations of 8 different surveys over the same
## population, and then runs a stratified analysis and compares true vs
## estimated values. (Note: total number of simulations are low to decrease
## computation time for the example)
tests <- test_surveys(pop, surveys = surveys, keep_details = 1,
                      n_sims = 5, n_loops = 5, cores = 1)

library(plotly)
tests$total_strat_error %>%
  filter(survey == 8, sim %in% 1:50) %>%
  group_by(sim) %>%
  plot_ly(x = ~year) %>%
  add_lines(y = ~I_hat, alpha = 0.5, name = "estimated") %>%
  add_lines(y = ~I, color = I("black"), name = "true") %>%
  layout(xaxis = list(title = "Year"),
         yaxis = list(title = "Abundance index"))

plot_total_strat_fan(tests, surveys = 1:8)
plot_length_strat_fan(tests, surveys = 1:8)
plot_age_strat_fan(tests, surveys = 1:8)
plot_age_strat_fan(tests, surveys = 1:8, select_by = "age")

plot_error_surface(tests, plot_by = "rule")
plot_error_surface(tests, plot_by = "samples")

plot_survey_rank(tests, which_strat = "length")
plot_survey_rank(tests, which_strat = "age")