ST_ClusterDBSCAN — Windowing function that returns integer id for the cluster each input geometry is in based on 2D implementation of Density-based spatial clustering of applications with noise (DBSCAN) algorithm.
integer ST_ClusterDBSCAN(
geometry winset geom, float8 eps, integer minpoints)
;
Returns cluster number for each input geometry, based on a 2D implementation of the Density-based spatial clustering of applications with noise (DBSCAN) algorithm. Unlike ST_ClusterKMeans, it does not require the number of clusters to be specified, but instead uses the desired distance (eps
) and density(minpoints
) parameters to construct each cluster.
An input geometry will be added to a cluster if it is either:
A "core" geometry, that is within eps
distance of at least minpoints
input geometries (including itself) or
A "border" geometry, that is within eps
distance of a core geometry.
Note that border geometries may be within eps
distance of core geometries in more than one cluster; in this case, either assignment would be correct, and the border geometry will be arbitrarily asssigned to one of the available clusters. In these cases, it is possible for a correct cluster to be generated with fewer than minpoints
geometries. When assignment of a border geometry is ambiguous, repeated calls to ST_ClusterDBSCAN will produce identical results if an ORDER BY clause is included in the window definition, but cluster assignments may differ from other implementations of the same algorithm.
Input geometries that do not meet the criteria to join any other cluster will be assigned a cluster number of NULL. |
Availability: 2.3.0 - requires GEOS
Assigning a cluster number to each parcel point:
SELECT parcel_id, ST_ClusterDBSCAN(geom, eps := 0.5, minpoints := 5) over () AS cid FROM parcels;
Combining parcels with the same cluster number into a single geometry. This uses named argument calling
SELECT cid, ST_Collect(geom) AS cluster_geom, array_agg(parcel_id) AS ids_in_cluster FROM ( SELECT parcel_id, ST_ClusterDBSCAN(geom, eps := 0.5, minpoints := 5) over () AS cid, geom FROM parcels) sq GROUP BY cid;