autoplot.ResamplingSptCVCstf function

Visualization Functions for SptCV Cstf Methods.

Generic S3 plot() and autoplot() (ggplot2) methods to visualize mlr3 spatiotemporal resampling objects.

## S3 method for class 'ResamplingSptCVCstf'
autoplot(
  object,
  task,
  fold_id = NULL,
  plot_as_grid = TRUE,
  train_color = "#0072B5",
  test_color = "#E18727",
  repeats_id = NULL,
  tickformat_date = "%Y-%m",
  nticks_x = 3,
  nticks_y = 3,
  point_size = 3,
  axis_label_fontsize = 11,
  static_image = FALSE,
  show_omitted = FALSE,
  plot3D = NULL,
  plot_time_var = NULL,
  sample_fold_n = NULL,
  ...
)

## S3 method for class 'ResamplingRepeatedSptCVCstf'
autoplot(
  object,
  task,
  fold_id = NULL,
  repeats_id = 1,
  plot_as_grid = TRUE,
  train_color = "#0072B5",
  test_color = "#E18727",
  tickformat_date = "%Y-%m",
  nticks_x = 3,
  nticks_y = 3,
  point_size = 3,
  axis_label_fontsize = 11,
  plot3D = NULL,
  plot_time_var = NULL,
  ...
)

## S3 method for class 'ResamplingSptCVCstf'
plot(x, ...)

## S3 method for class 'ResamplingRepeatedSptCVCstf'
plot(x, ...)

Arguments

  • object: [Resampling]

    mlr3 spatial resampling object of class ResamplingSptCVCstf or ResamplingRepeatedSptCVCstf.

  • task: [TaskClassifST]/[TaskRegrST]

    mlr3 task object.

  • fold_id: [numeric]

    Fold IDs to plot.

  • plot_as_grid: [logical(1)]

    Should a gridded plot be created via patchwork? If FALSE, a list of ggplot2 objects is returned. Only applies if a numeric vector is passed to argument fold_id.

  • train_color: [character(1)]

    The color to use for the training set observations.

  • test_color: [character(1)]

    The color to use for the test set observations.

  • repeats_id: [numeric]

    Repetition ID to plot.

  • tickformat_date: [character]

    Date format for z-axis.

  • nticks_x: [integer]

    Number of x axis breaks.

  • nticks_y: [integer]

    Number of y axis breaks.

  • point_size: [numeric]

    Point size of markers.

  • axis_label_fontsize: [integer]

    Font size of axis labels.

  • static_image: [logical]

    Whether to create a static image from the plotly plot via plotly::orca(). This requires the orca utility to be available. See https://github.com/plotly/orca for more information. When used, by default a file named plot.png is created in the current working directory.

  • show_omitted: [logical]

    Whether to show points not used in train or test set for the current fold.

  • plot3D: [logical]

    Whether to create a 2D image via ggplot2 or a 3D plot via plotly. Set to TRUE for the 3D plot.

  • plot_time_var: [character]

    The variable to use for the z-axis (time). Remove the column role feature for this variable to only use it for plotting.

  • sample_fold_n: [integer]

    Number of points in a random sample stratified over partitions. This argument aims to keep file sizes of resulting plots reasonable and reduce overplotting in dense datasets.

  • ...: Passed down to plotly::orca(). Only effective when static_image = TRUE.

  • x: [Resampling]

    mlr3 spatial resampling object of class ResamplingSptCVCstf or ResamplingRepeatedSptCVCstf.
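
To illustrate the tickformat_date default, the following base R sketch shows what the pattern "%Y-%m" yields for a single date. It assumes the format string follows the usual strftime-style conventions. format() and the example date are used here only for illustration and are not what the plotting method calls internally.

```r
# Hypothetical example date, used only for illustration
example_date = as.Date("2020-03-15")

# "%Y-%m" keeps the four-digit year and the two-digit month
label = format(example_date, "%Y-%m")
print(label)
```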

Details

This method requires the fold_id argument to be set. A single plot showing all folds cannot be created, because the leave-location-and-time-out (LLTO) method uses only a subset of the observations in each fold (many observations are omitted). Hence, unlike in other methods, the train and test sets of one fold are not reused in other folds, and plotting them without a train/test indicator would be misleading.
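
The omission described above can be checked directly on an instantiated resampling. The following is a minimal sketch, assuming the same cookfarm_mlr3 setup as in the Examples section. The variable names n_train, n_test and n_omitted are introduced here for illustration only.

```r
if (requireNamespace("mlr3spatiotempcv", quietly = TRUE) &&
  requireNamespace("sf", quietly = TRUE)) {
  library(mlr3)
  library(mlr3spatiotempcv)

  task_st = tsk("cookfarm_mlr3")
  task_st$set_col_roles("SOURCEID", "space")
  task_st$set_col_roles("Date", "time")

  resampling = rsmp("sptcv_cstf", folds = 5)
  resampling$instantiate(task_st)

  # Count train, test and omitted observations for the first fold
  n_train = length(resampling$train_set(1))
  n_test = length(resampling$test_set(1))
  n_omitted = task_st$nrow - n_train - n_test

  print(c(train = n_train, test = n_test, omitted = n_omitted))
}
```

With both column roles set, the omitted count should be greater than zero. These are the points that show_omitted = TRUE adds to the plot.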

2D vs 3D plotting

This method has both a 2D and a 3D plotting method. The 2D method returns a ggplot with x and y axes representing the spatial coordinates. The 3D method uses plotly to create an interactive 3D plot. Set plot3D = TRUE to use the 3D method.

Note that spatiotemporal datasets usually suffer from overplotting in 2D mode.
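
A 3D plot for a single fold could be requested as sketched below. This is a hedged example rather than a definitive recipe: it assumes the "time" column role set on the task is used for the z-axis, and the sample_fold_n value of 3000L is an arbitrary illustrative choice to limit the number of plotted points.

```r
if (requireNamespace("mlr3spatiotempcv", quietly = TRUE) &&
  requireNamespace("sf", quietly = TRUE) &&
  requireNamespace("plotly", quietly = TRUE)) {
  library(mlr3)
  library(mlr3spatiotempcv)

  task_st = tsk("cookfarm_mlr3")
  task_st$set_col_roles("SOURCEID", "space")
  task_st$set_col_roles("Date", "time")

  resampling = rsmp("sptcv_cstf", folds = 5)
  resampling$instantiate(task_st)

  # 3D plot of fold 1; sample_fold_n draws a random subset of points
  # (3000L is an assumed value, not a recommendation from the package)
  p3d = autoplot(resampling, task_st,
    fold_id = 1, plot3D = TRUE, sample_fold_n = 3000L)
  p3d
}
```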

Examples

if (mlr3misc::require_namespaces(c("sf", "plotly"), quietly = TRUE)) {
  library(mlr3)
  library(mlr3spatiotempcv)

  task_st = tsk("cookfarm_mlr3")
  task_st$set_col_roles("SOURCEID", "space")
  task_st$set_col_roles("Date", "time")

  resampling = rsmp("sptcv_cstf", folds = 5)
  resampling$instantiate(task_st)

  # with both `"space"` and `"time"` column roles set (LLTO), the omitted
  # observations per fold can be shown by setting `show_omitted = TRUE`
  autoplot(resampling, task_st, fold_id = 1, show_omitted = TRUE)
}

See Also