Cluster spatial data with REDCAP (REgionalization with Dynamically Constrained Agglomerative clustering and Partitioning) routines.

scl_redcap(
  xy,
  dmat,
  ncl,
  full_order = TRUE,
  linkage = "single",
  shortest = TRUE,
  nnbs = 6L,
  iterate_ncl = FALSE,
  quiet = FALSE
)

Arguments

xy

Rectangular structure (matrix, data.frame, tibble), containing coordinates of points to be clustered.

dmat

Square structure (matrix, data.frame, tibble) containing distances or equivalent metrics between all points in xy. If xy has n rows, then dmat must have n rows and n columns.

ncl

Desired number of clusters. See description of `iterate_ncl` parameter for conditions under which actual number may be less than this value.

full_order

If FALSE, build spanning trees from first-order relationships only, otherwise build from full-order relationships (see Note).

linkage

One of "single", "average", or "complete"; see Note.

shortest

If TRUE, dmat is interpreted as distances, such that lower values are preferentially selected; if FALSE, higher values of dmat are interpreted as indicating stronger relationships, as is the case, for example, with covariances.

nnbs

Number of nearest neighbours to be used in calculating clustering trees. Triangulation will be used if nnbs <= 0.

iterate_ncl

The actual number of clusters found may be less than the specified value of `ncl`, because clusters formed from < 3 edges are removed. If `iterate_ncl = FALSE` (the default), the result is returned with however many clusters are actually found. Setting this parameter to `TRUE` forces the algorithm to iterate until the exact number of clusters has been found. For large data sets, this may result in considerably longer calculation times.

quiet

If `FALSE` (default), display progress information on screen.

Value

An object of class scl with tree containing the clustering scheme, and xy the original coordinate data of the clustered points. An additional component, tree_rest, enables the tree to be re-cut to a different number of clusters via scl_recluster, rather than calculating clusters anew.
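
The re-cutting workflow described above might look like the following minimal sketch. It assumes that the function is provided by the spatialcluster package and that scl_recluster accepts the scl object followed by an `ncl` argument; both are assumptions not stated on this page, so check ?scl_recluster before relying on them.

```r
set.seed (1)
n <- 100
xy <- matrix (runif (2 * n), ncol = 2)
dmat <- matrix (runif (n ^ 2), ncol = n)

# Guarded so that the sketch still runs where the package is not installed
# (the package name 'spatialcluster' is an assumption)
if (requireNamespace ("spatialcluster", quietly = TRUE)) {
    scl <- spatialcluster::scl_redcap (xy, dmat, ncl = 4, quiet = TRUE)
    # Components described in the Value section
    print (names (scl))
    # Re-cut the existing tree to 6 clusters instead of recomputing it;
    # the 'ncl' argument name for scl_recluster is an assumption
    scl6 <- spatialcluster::scl_recluster (scl, ncl = 6)
}
```

Re-cutting reuses the spanning tree that has already been built, so it should generally be cheaper than a second call to scl_redcap on the same data.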

Note

Please refer to the original REDCAP paper ('Regionalization with dynamically constrained agglomerative clustering and partitioning (REDCAP)', by D. Guo (2008), Int. J. Geogr. Inf. Sci. 22:801-823) for details of the full_order and linkage parameters. This paper clearly demonstrates the general inferiority of spanning trees constructed from first-order relationships. It is therefore strongly recommended that the default full_order = TRUE be used at all times.

See also

Other clustering_fns: scl_full(), scl_recluster()

Examples

n <- 100
xy <- matrix (runif (2 * n), ncol = 2)
dmat <- matrix (runif (n ^ 2), ncol = n)
scl <- scl_redcap (xy, dmat, ncl = 4)
# Those clusters will by default be constructed by connecting edges with the
# lowest (`shortest`) values of `dmat`, and will differ from
scl <- scl_redcap (xy, dmat, ncl = 4, shortest = FALSE)
# using 'full_order = FALSE' constructs clusters from first-order
# relationships only; not recommended, but possible nevertheless:
scl <- scl_redcap (xy, dmat, ncl = 4, full_order = FALSE)
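
The examples above use random values for dmat. As a hedged illustration of the `dmat` and `shortest` arguments with more meaningful input, the sketch below derives a Euclidean distance matrix from the coordinates using base R, plus a similarity matrix in which larger values indicate stronger relationships. The names `smat`, `scl_d`, and `scl_s` are hypothetical, and the package name spatialcluster is an assumption.

```r
set.seed (1)
n <- 100
xy <- matrix (runif (2 * n), ncol = 2)

# Euclidean distances between all pairs of points; as.matrix() expands the
# 'dist' object into a full n-by-n matrix, as required for dmat
dmat <- as.matrix (dist (xy))

# A similarity measure (hypothetical choice) that decreases with distance,
# so that higher values indicate stronger relationships
smat <- exp (-dmat)

# Guarded so that the sketch still runs where the package is not installed
if (requireNamespace ("spatialcluster", quietly = TRUE)) {
    # Distances: lower values are preferentially connected (the default)
    scl_d <- spatialcluster::scl_redcap (xy, dmat, ncl = 4, quiet = TRUE)
    # Similarities: higher values are preferentially connected
    scl_s <- spatialcluster::scl_redcap (xy, smat, ncl = 4,
                                         shortest = FALSE, quiet = TRUE)
}
```

Passing a similarity matrix with the default `shortest = TRUE` would preferentially connect the weakest relationships, so the flag should match the kind of values held in dmat.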