Takes a sits tibble with samples of different labels and returns a new tibble with reduced class imbalance. Oversampling uses the synthetic minority oversampling technique (SMOTE). Undersampling uses the self-organizing map (SOM) methods available in the sits package.
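To make the oversampling step concrete, the following is a minimal base R sketch of the SMOTE idea: each synthetic sample is a random interpolation between a minority-class sample and one of its nearest minority-class neighbours. It is an illustration only, not the code used inside sits; the function `smote_sketch()`, its arguments, and the random input data are all hypothetical.

```r
# Minimal SMOTE-style interpolation in base R.
# Illustrative sketch only; `smote_sketch()` is a hypothetical helper,
# not a function of the sits package.
smote_sketch <- function(x, n_new, k = 5) {
  x <- as.matrix(x)
  n <- nrow(x)
  stopifnot(n >= 2)
  k <- min(k, n - 1)
  # pairwise Euclidean distances between the minority-class samples
  d <- as.matrix(dist(x))
  diag(d) <- Inf # a sample is never its own neighbour
  synthetic <- matrix(NA_real_, nrow = n_new, ncol = ncol(x))
  for (i in seq_len(n_new)) {
    base <- sample.int(n, 1) # pick a minority sample at random
    neighbours <- order(d[base, ])[seq_len(k)] # its k nearest neighbours
    nn <- neighbours[sample.int(k, 1)] # pick one neighbour at random
    gap <- runif(1) # interpolation weight in (0, 1)
    synthetic[i, ] <- x[base, ] + gap * (x[nn, ] - x[base, ])
  }
  synthetic
}

set.seed(42)
# hypothetical minority class: 10 samples with 2 features each
minority <- matrix(rnorm(20), nrow = 10, ncol = 2)
new_points <- smote_sketch(minority, n_new = 15)
dim(new_points)
```

Because every synthetic point lies on a segment between two existing samples, the new samples stay within the region already covered by the minority class.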
n_samples_over: Number of samples to oversample for classes with fewer samples than this number.
n_samples_under: Number of samples to undersample for classes with more samples than this number.
method: Method for oversampling (default = "smote").
multicores: Number of cores to process the data (default = 2).
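The effect of the two thresholds can be sketched in base R as follows. The label counts are hypothetical and not taken from any sits data set, and the strict comparisons are an assumption based on the wording "fewer than" and "more than" in the parameter descriptions, not a statement about the internal implementation.

```r
# Hypothetical label counts (assumed for illustration only)
label_counts <- c(Forest = 450, Pasture = 120, Soy = 200, Water = 35)
n_samples_over <- 200
n_samples_under <- 200

# classes with fewer samples than n_samples_over are candidates for oversampling
to_oversample <- names(label_counts)[label_counts < n_samples_over]
# classes with more samples than n_samples_under are candidates for undersampling
to_undersample <- names(label_counts)[label_counts > n_samples_under]
# classes that match neither condition are left as they are
unchanged <- setdiff(names(label_counts), c(to_oversample, to_undersample))

to_oversample
to_undersample
unchanged
```

With both thresholds set to the same value, as in the package example, every class either moves toward that value or already sits on it.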
Returns
A sits tibble with reduced sample imbalance.
Examples
if (sits_run_examples()) {
    # print the labels summary for a sample set
    summary(samples_modis_ndvi)
    # reduce the sample imbalance
    new_samples <- sits_reduce_imbalance(samples_modis_ndvi,
        n_samples_over = 200, n_samples_under = 200,
        multicores = 1
    )
    # print the labels summary for the rebalanced set
    summary(new_samples)
}
References
The reference paper on SMOTE is N. V. Chawla, K. W. Bowyer, L. O. Hall, W. P. Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of Artificial Intelligence Research, vol. 16, pp. 321-357, 2002.
Undersampling uses the SOM map developed by Lorena Santos and co-workers and used in the sits_som_map() function. The SOM map technique is described in the paper: Lorena Santos, Karine Ferreira, Gilberto Camara, Michelle Picoli, Rolf Simoes, “Quality control and class noise reduction of satellite image time series,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 177, pp. 75-88, 2021. https://doi.org/10.1016/j.isprsjprs.2021.04.014.