5.1 SKATER

Spatial C(K)luster Analysis by Tree Edge Removal (SKATER) is an optimized algorithm to prune the minimum spanning tree into several clusters that their values of selected variables are as similar as possible while retaining the contiguity structure. For more information, please read https://geodacenter.github.io/workbook/9c_spatial3/lab9c.html#skater

skater()

skater() is a PostgreSQL WINDOW function. Please call it with an OVER clause.

Synopsis

Short version 1:

integer skater(integer k, anyarray vals,  bytea weights)

Short version 2:

integer skater(integer k, anyarray vals,  bytea weights, integer min_region_size)

Short version 3:

integer skater(integer k, anyarray vals,  bytea weights, numeric bound_vas, float min_bound)

Full version 1:

integer skater(integer k, anyarray vals,  bytea weights, integer min_region_size,
    character varying scale_method,
    character varying distance_type,
    integer seed,
    integer cpu_threads)

Full version 2:

integer skater(integer k, anyarray vals,  bytea weights, 
    numeric bound_val, float min_bound,
    character varying scale_method,
    character varying distance_type,
    integer seed,
    integer cpu_threads)

Arguments

Name

Type

Description

k

integer

the number of clusters

vals

anyarray

an array of the numeric columns that contains the values for skater

weights

bytea

a bytea column that stores the spatial weights information

bound_val

numeric

the numeric column of bound variable

min_bound

float

the minimum bound value that applies to all clusters

scale_method

character varying

the scaling method applies on vals. Options are {'raw', 'standardize', 'demean', 'mad', 'range_standardize', 'range_adjust'}. Default: 'standardize'

distance_method

character varying

the distance metric used to measure the distance in attribute space of input vals. Options are {'euclidean', 'manhattan'. Default: 'euclidean'.

seed

integer

the seed for random number generator used in LISA statistics. Default: 123456789.

cpu_threads

integer

the number of cpu threads used for parallel LISA computation. Default: 6.

Return

Type

Description

integer

the cluster indicator

Examples

Apply skater to create 10 spatially constrained clusters using variable ["hr60", "ue60", "dv60"] (homicide, unemployment, divorce rate 1960 in natregimes dataset) and queen contiguity weights "queen_w":

Please see chapter 'Contiguity Based Weights' for how to create a Queen contiguity weights.

SELECT skater(10, ARRAY[hr60, ue60, dv60], queen_w) OVER() FROM natregimes;

 skater 
--------
      4
      1
      1
      3
      3
 ...

Use k=0 if you don't know the number of regions. The algorithm will use the bound information (e.g. minimum size of each region, or bound variable, e.g. population, with minimum bound value) to find the best solution.

-- each region has at least 100 counties
SELECT skater(0, ARRAY[hr60, ue60, dv60], queen_w, 100) OVER() FROM natregimes;

 skater 
--------
     10
      2
     16
      2
      9
      4
      1
      1
 ...

Last updated

Was this helpful?