bundles / scipy 1.17.1 / scipy / cluster / hierarchy / fclusterdata
function
scipy.cluster.hierarchy:fclusterdata
Signature
def fclusterdata(X, t, criterion='inconsistent', metric='euclidean', depth=2, method='single', R=None)
Summary
Cluster observation data using a given metric.
Extended Summary
Clusters the original observations in the n-by-m data matrix X (n observations in m dimensions). With the default arguments, it computes distances between original observations using the Euclidean metric, performs hierarchical clustering using the single linkage algorithm, and forms flat clusters using the inconsistency method with t as the cut-off threshold.
A 1-D array T of length n is returned. T[i] is the index of the flat cluster to which the original observation i belongs.
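The three stages described above can be sketched by chaining `pdist`, `linkage`, `inconsistent`, and `fcluster` by hand. This is an illustrative equivalent under the default arguments, not a statement of how the library implements the function internally; the sample points are borrowed from the Examples section.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import fcluster, fclusterdata, inconsistent, linkage

# Twelve 2-D points forming four well-separated groups (from the Examples section).
X = np.array([[0, 0], [0, 1], [1, 0],
              [0, 4], [0, 3], [1, 4],
              [4, 0], [3, 0], [4, 1],
              [4, 4], [3, 4], [4, 3]], dtype=float)

# Stage 1: condensed pairwise distance vector.
Y = pdist(X, metric="euclidean")
# Stage 2: hierarchical clustering (linkage matrix).
Z = linkage(Y, method="single")
# Stage 3: inconsistency statistics, then flat clusters cut at t.
R = inconsistent(Z, d=2)
T_manual = fcluster(Z, t=1.0, criterion="inconsistent", depth=2, R=R)

# The one-call form with default arguments.
T_direct = fclusterdata(X, t=1.0)

print(np.array_equal(T_manual, T_direct))
```

Chaining the stages by hand is mainly useful when the linkage matrix `Z` is needed again, for example to try several thresholds without recomputing distances.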
Parameters
X: (N, M) ndarray
    N by M data matrix with N observations in M dimensions.
t: scalar
    For criteria 'inconsistent', 'distance' or 'monocrit', this is the threshold to apply when forming flat clusters. For 'maxclust' or 'maxclust_monocrit' criteria, this is the maximum number of clusters requested.
criterion: str, optional
    Specifies the criterion for forming flat clusters. Valid values are 'inconsistent' (default), 'distance', or 'maxclust'. See fcluster for descriptions.
metric: str or function, optional
    The distance metric for calculating pairwise distances. See distance.pdist for descriptions and linkage to verify compatibility with the linkage method.
depth: int, optional
    The maximum depth for the inconsistency calculation. See inconsistent for more information.
method: str, optional
    The linkage method to use (single, complete, average, weighted, median, centroid, ward). See linkage for more information. Default is "single".
R: ndarray, optional
    The inconsistency matrix. It is computed if needed when not passed.
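How `t` is interpreted depends on `criterion`. The sketch below varies the parameters on the sample points from the Examples section; the variable names and threshold values are chosen here for illustration only.

```python
import numpy as np
from scipy.cluster.hierarchy import fclusterdata

X = np.array([[0, 0], [0, 1], [1, 0],
              [0, 4], [0, 3], [1, 4],
              [4, 0], [3, 0], [4, 1],
              [4, 4], [3, 4], [4, 3]], dtype=float)

# 'maxclust': t is the maximum number of flat clusters requested.
labels_maxclust = fclusterdata(X, t=4, criterion="maxclust")

# 'distance': t is a cophenetic distance threshold for cutting the tree.
labels_distance = fclusterdata(X, t=1.5, criterion="distance")

# A non-default linkage method combined with 'maxclust'.
labels_complete = fclusterdata(X, t=2, criterion="maxclust", method="complete")

print(len(set(labels_maxclust)), len(set(labels_distance)), len(set(labels_complete)))
```

For this data set the points within each group are 1 apart under single linkage and the groups are 2 apart, so a distance threshold of 1.5 separates the four groups.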
Returns
fclusterdata: ndarray
    A vector of length n. T[i] is the flat cluster number to which original observation i belongs.
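Because the result is a label per observation, a common follow-up step is to collect the observation indices belonging to each flat cluster. A minimal sketch, where `members` is a helper name introduced here for illustration:

```python
import numpy as np
from scipy.cluster.hierarchy import fclusterdata

X = np.array([[0, 0], [0, 1], [1, 0],
              [0, 4], [0, 3], [1, 4],
              [4, 0], [3, 0], [4, 1],
              [4, 4], [3, 4], [4, 3]], dtype=float)

T = fclusterdata(X, t=1)

# Flat cluster labels start at 1; map each label to its observation indices.
members = {int(label): np.flatnonzero(T == label).tolist()
           for label in np.unique(T)}

for label, indices in members.items():
    print(f"cluster {label}: observations {indices}")
```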
Notes
This function is similar to the MATLAB function clusterdata.
Array API Standard Support
fclusterdata has experimental support for Python Array API Standard compatible backends in addition to NumPy. Please consider testing these features by setting an environment variable SCIPY_ARRAY_API=1 and providing CuPy, PyTorch, JAX, or Dask arrays as array arguments. The following combinations of backend and device (or other capability) are supported.
====================  ====================  ====================
Library               CPU                   GPU
====================  ====================  ====================
NumPy                 ✅                     n/a
CuPy                  n/a                   ⛔
PyTorch               ✅                     ⛔
JAX                   ⚠️ no JIT             ⛔
Dask                  ⚠️ computes graph     n/a
====================  ====================  ====================
See dev-arrayapi for more information.
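A minimal sketch of opting in to the experimental behaviour is shown below. It assumes the environment variable has to be set before SciPy is first imported, and it uses a NumPy array so that it runs without extra dependencies; substituting a PyTorch CPU tensor would additionally assume that PyTorch is installed.

```python
import os

# Assumption: the flag is read when SciPy is imported, so set it first.
os.environ["SCIPY_ARRAY_API"] = "1"

import numpy as np
from scipy.cluster.hierarchy import fclusterdata

X = np.asarray([[0, 0], [0, 1], [1, 0],
                [0, 4], [0, 3], [1, 4],
                [4, 0], [3, 0], [4, 1],
                [4, 4], [3, 4], [4, 3]], dtype=np.float64)

T = fclusterdata(X, t=1)

# Inspect the returned array type alongside the labels.
print(type(T), T)
```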
Examples
from scipy.cluster.hierarchy import fclusterdata
X = [[0, 0], [0, 1], [1, 0], [0, 4], [0, 3], [1, 4], [4, 0], [3, 0], [4, 1], [4, 4], [3, 4], [4, 3]]
fclusterdata(X, t=1)
See also
- scipy.spatial.distance.pdist: pairwise distance metrics
Aliases
- scipy.cluster.hierarchy.fclusterdata