azp_greedy function

A greedy algorithm to solve the AZP problem

The automatic zoning procedure (AZP) was first outlined in Openshaw (1977) as a way to address some of the consequences of the modifiable areal unit problem (MAUP). In essence, it is a heuristic that searches for the best grouping of contiguous spatial units into p regions, using minimization of the within-cluster sum of squares as the criterion of homogeneity. The number of regions, p, must be specified beforehand.
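The homogeneity criterion can be illustrated in base R. The sketch below is not rgeoda's implementation; it only computes the total within-cluster sum of squares for two hand-made assignments of six observations to p = 2 regions. The helper name `total_within_ss` and the toy data are hypothetical, and contiguity is not checked here.

```r
# Hypothetical helper: total within-cluster sum of squares for a partition.
# x: numeric matrix (rows = spatial units), labels: region id for each row.
total_within_ss <- function(x, labels) {
  x <- as.matrix(x)
  per_region <- vapply(split(seq_len(nrow(x)), labels), function(rows) {
    block <- x[rows, , drop = FALSE]
    centered <- sweep(block, 2, colMeans(block))  # deviations from region mean
    sum(centered^2)
  }, numeric(1))
  sum(per_region)
}

# Toy data: six units, one variable, two candidate partitions into p = 2 regions.
x <- matrix(c(1, 2, 3, 10, 11, 12), ncol = 1)
good <- c(1, 1, 1, 2, 2, 2)  # groups similar values together
bad  <- c(1, 2, 1, 2, 1, 2)  # mixes low and high values

wss_good <- total_within_ss(x, good)
wss_bad  <- total_within_ss(x, bad)
```

A heuristic such as AZP would prefer the first partition, because it yields the smaller value of the criterion.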

azp_greedy(
  p,
  w,
  df,
  bound_variable = data.frame(),
  min_bound = 0,
  inits = 0,
  initial_regions = vector("numeric"),
  scale_method = "standardize",
  distance_method = "euclidean",
  random_seed = 123456789,
  rdist = numeric()
)

Arguments

  • p: The number of spatially constrained clusters
  • w: An instance of the Weight class
  • df: A data frame with selected variables only. E.g. guerry[c("Crm_prs", "Crm_prp", "Litercy")]
  • bound_variable: (optional) A data frame with the selected bound variable
  • min_bound: (optional) A minimum bound value that applies to all clusters
  • inits: (optional) The number of construction re-runs, as used in ARiSeL ("automatic regionalization with initial seed location")
  • initial_regions: (optional) The initial regions that the local search starts with. Default is empty, which means the local search starts with a random process to "grow" clusters
  • scale_method: (optional) One of the scaling methods ('raw', 'standardize', 'demean', 'mad', 'range_standardize', 'range_adjust') to apply on input data. Default is 'standardize' (Z-score normalization).
  • distance_method: (optional) The distance method used to compute the distance between observations i and j. Options are "euclidean" and "manhattan". Defaults to "euclidean".
  • random_seed: (optional) The seed for random number generator. Defaults to 123456789.
  • rdist: (optional) The distance matrix (lower triangular matrix, column wise storage)
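For the rdist argument, one way to obtain a lower-triangular distance vector in column-wise order is base R's dist(), which stores its result in that layout. Whether azp_greedy() expects exactly this vector (for example, distances computed on already-scaled data) is an assumption here, so treat the following as a sketch of the storage order only.

```r
# Toy data: four observations, two variables.
x <- matrix(c(0, 0,
              3, 4,
              6, 8,
              0, 1), ncol = 2, byrow = TRUE)

d <- dist(x, method = "euclidean")  # lower triangle, stored column by column
rdist_vec <- as.numeric(d)          # plain numeric vector of length n*(n-1)/2

# Column-wise order for n = 4: (2,1), (3,1), (4,1), (3,2), (4,2), (4,3)
full <- as.matrix(d)
expected <- full[lower.tri(full)]
```

Here rdist_vec and expected hold the same values in the same order, which is the column-wise lower-triangular layout described for rdist.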

Returns

A named list with names "Clusters", "Total sum of squares", "Within-cluster sum of squares", "Total within-cluster sum of squares", and "The ratio of between to total sum of squares".
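How these statistics relate to one another can be sketched in base R under the usual definitions. The toy data, the cluster assignment, and the helper `ss` are hypothetical, and it is an assumption that the function computes the values on the scaled data in exactly this way.

```r
# Toy standardized data and a hypothetical cluster assignment.
x <- scale(matrix(c(1, 2, 3, 10, 11, 12,
                    5, 6, 5, 1, 2, 1), ncol = 2))
clusters <- c(1, 1, 1, 2, 2, 2)

# Sum of squared deviations from the column means of a block of rows.
ss <- function(block) sum(sweep(block, 2, colMeans(block))^2)

total_ss <- ss(x)                                  # total sum of squares
within_ss <- vapply(split(seq_len(nrow(x)), clusters),
                    function(rows) ss(x[rows, , drop = FALSE]),
                    numeric(1))                    # one value per cluster
total_within_ss <- sum(within_ss)                  # total within-cluster SS
ratio <- (total_ss - total_within_ss) / total_ss   # between SS / total SS
```

A ratio close to 1 indicates that most of the variation lies between clusters rather than within them.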

Examples

## Not run:
library(sf)
library(rgeoda)
guerry_path <- system.file("extdata", "Guerry.shp", package = "rgeoda")
guerry <- st_read(guerry_path)
# Queen contiguity weights define which units count as neighbors
queen_w <- queen_weights(guerry)
data <- guerry[c('Crm_prs','Crm_prp','Litercy','Donatns','Infants','Suicids')]
# Partition the observations into p = 5 contiguous regions
azp_clusters <- azp_greedy(5, queen_w, data)
azp_clusters
## End(Not run)