5.1 SKATER
Spatial C(K)luster Analysis by Tree Edge Removal (SKATER) is an optimized algorithm that prunes the minimum spanning tree into several clusters whose values for the selected variables are as similar as possible, while retaining the contiguity structure. For more information, see https://geodacenter.github.io/workbook/9c_spatial3/lab9c.html#skater
skater()
Synopsis
Short version 1:
integer skater(integer k, anyarray vals, bytea weights)
Short version 2:
integer skater(integer k, anyarray vals, bytea weights, integer min_region_size)
Short version 3:
integer skater(integer k, anyarray vals, bytea weights, numeric bound_val, float min_bound)
Full version 1:
integer skater(integer k, anyarray vals, bytea weights, integer min_region_size,
character varying scale_method,
character varying distance_type,
integer seed,
integer cpu_threads)
Full version 2:
integer skater(integer k, anyarray vals, bytea weights,
numeric bound_val, float min_bound,
character varying scale_method,
character varying distance_type,
integer seed,
integer cpu_threads)
Arguments
| Name | Type | Description |
| --- | --- | --- |
| k | integer | the number of clusters |
| vals | anyarray | an array of the numeric columns that contain the values used by skater |
| weights | bytea | a bytea column that stores the spatial weights information |
| min_region_size | integer | the minimum number of observations in each cluster |
| bound_val | numeric | the numeric column of the bound variable |
| min_bound | float | the minimum bound value that applies to all clusters |
| scale_method | character varying | the scaling method applied to vals. Options are {'raw', 'standardize', 'demean', 'mad', 'range_standardize', 'range_adjust'}. Default: 'standardize'. |
| distance_type | character varying | the distance metric used to measure distance in the attribute space of the input vals. Options are {'euclidean', 'manhattan'}. Default: 'euclidean'. |
| seed | integer | the seed for the random number generator. Default: 123456789. |
| cpu_threads | integer | the number of CPU threads used for parallel computation. Default: 6. |
Return
| Type | Description |
| --- | --- |
| integer | the cluster indicator |
Examples
Apply skater to create 10 spatially constrained clusters using the variables hr60, ue60 and dv60 (homicide, unemployment and divorce rates for 1960 in the natregimes dataset) and the queen contiguity weights queen_w:
SELECT skater(10, ARRAY[hr60, ue60, dv60], queen_w) OVER() FROM natregimes;
skater
--------
4
1
1
3
3
...
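Because skater returns one cluster indicator per row, the call can be wrapped in a subquery to check how many observations fall into each cluster. The following is a minimal sketch that uses only standard SQL around the call shown above; the aliases cluster_id, n_counties and clusters are chosen here for illustration and are not defined by the extension.

```sql
-- Count how many counties were assigned to each of the 10 clusters.
-- cluster_id, n_counties and clusters are illustrative aliases.
SELECT cluster_id, count(*) AS n_counties
FROM (
    SELECT skater(10, ARRAY[hr60, ue60, dv60], queen_w) OVER() AS cluster_id
    FROM natregimes
) AS clusters
GROUP BY cluster_id
ORDER BY cluster_id;
```

A very uneven distribution of counts may suggest adding a minimum region size or a bound variable.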
Use k=0 if you don't know the number of regions. The algorithm then uses the bound information, either a minimum size for each region or a bound variable (e.g. population) with a minimum bound value, to find the best solution.
-- each region has at least 100 counties
SELECT skater(0, ARRAY[hr60, ue60, dv60], queen_w, 100) OVER() FROM natregimes;
skater
--------
10
2
16
2
9
4
1
1
...
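The bound-variable signatures (Short version 3 and Full version 2 in the Synopsis) can be called in the same way. The sketch below assumes that natregimes has a 1960 population column named po60, and it uses an arbitrary threshold of 1,000,000. The column name, the threshold, and the non-default option values in the second query are illustrative assumptions, not recommended settings. Depending on the column types, explicit casts to numeric or float may be needed.

```sql
-- Short version 3: each region must contain at least 1,000,000 people.
-- po60 and the threshold are assumptions made for this sketch.
SELECT skater(0, ARRAY[hr60, ue60, dv60], queen_w, po60, 1000000) OVER() FROM natregimes;

-- Full version 2: the same bound, with scale_method, distance_type, seed and
-- cpu_threads passed positionally in the order given in the Synopsis.
SELECT skater(0, ARRAY[hr60, ue60, dv60], queen_w, po60, 1000000,
              'range_standardize', 'manhattan', 123456789, 4) OVER() FROM natregimes;
```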