
bundles / scipy 1.17.1 / scipy / cluster / hierarchy / leaders

function

scipy.cluster.hierarchy:leaders

source: /scipy/cluster/hierarchy.py :4202

Signature

def leaders(Z, T)

Summary

Return the root nodes in a hierarchical clustering.

Extended Summary

Returns the root nodes in a hierarchical clustering corresponding to a cut defined by a flat cluster assignment vector T. See the fcluster function for more information on the format of T.

For each flat cluster j of the k flat clusters represented in the n-sized flat cluster assignment vector T, this function finds the lowest cluster node i in the linkage tree Z such that:

  • its leaf descendants belong only to flat cluster j (i.e., T[p]==j for all p in S(i), where S(i) is the set of leaf ids of the leaf nodes descending from cluster node i)

  • there does not exist a leaf that is not a descendant of i and that also belongs to cluster j (i.e., T[q]!=j for all q not in S(i)). If this condition is violated, T is not a valid cluster assignment vector, and an exception is raised.
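The two conditions above can be checked directly on a small dataset. The following is a minimal sketch, not part of the function itself: the helper name `conditions_hold` and the sample data are illustrative assumptions, and the check covers only the two listed conditions, not the "lowest node" requirement. It also shows an assignment vector that no single subtree can represent; the text above does not name the exception type, so the sketch catches `Exception` broadly.

```python
import numpy as np
from scipy.cluster.hierarchy import ward, fcluster, leaders, to_tree
from scipy.spatial.distance import pdist

# Illustrative data: four well-separated groups of three points each.
X = [[0, 0], [0, 1], [1, 0], [0, 4], [0, 3], [1, 4],
     [4, 0], [3, 0], [4, 1], [4, 4], [3, 4], [4, 3]]
Z = ward(pdist(X))
T = fcluster(Z, 3, criterion='distance')
L, M = leaders(Z, T)

# nodes[i] is the ClusterNode with id i (leaves first, then merged clusters).
_, nodes = to_tree(Z, rd=True)

def conditions_hold(i, j):
    """Hypothetical helper: test both leader conditions for node i and cluster j."""
    in_S = np.zeros(len(T), dtype=bool)
    in_S[nodes[i].pre_order()] = True       # S(i): leaf ids below node i
    only_j_inside = np.all(T[in_S] == j)    # T[p] == j for all p in S(i)
    no_j_outside = np.all(T[~in_S] != j)    # T[q] != j for all q not in S(i)
    return bool(only_j_inside and no_j_outside)

checks = [conditions_hold(int(i), int(j)) for i, j in zip(L, M)]
print(checks)

# An invalid assignment: leaf 0 is moved into the flat cluster of leaf 3,
# so that flat cluster no longer corresponds to a single subtree.
T_bad = T.copy()
T_bad[0] = T[3]
try:
    leaders(Z, T_bad)
    raised = False
except Exception as exc:  # exception type not stated above, so catch broadly
    raised = True
    print(type(exc).__name__)
```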

Parameters

Z : ndarray

The hierarchical clustering encoded as a matrix. See linkage for more information.

T : ndarray

The flat cluster assignment vector.

Returns

L : ndarray

The leader linkage node ids stored as a k-element 1-D array, where k is the number of flat clusters found in T.

L[j]=i is the linkage cluster node id that is the leader of the flat cluster with id M[j]. If i < n, where n is the number of original observations, i corresponds to an original observation; otherwise it corresponds to a non-singleton cluster.

M : ndarray

The flat cluster ids stored as a k-element 1-D array, where k is the number of flat clusters found in T. M[j] is the id of the flat cluster whose leader is L[j]. This allows the set of flat cluster ids to be any arbitrary set of k integers.

For example: if L[3]=2 and M[3]=8, the flat cluster with id 8's leader is linkage node 2.
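As a sketch of how the two returned arrays pair up, the snippet below relabels the flat clusters with arbitrary ids and builds a lookup from flat cluster id to leader node. The variable names (`T_custom`, `leader_of`, `is_observation`) and the data are illustrative, and the cast to int32 assumes `leaders` expects the same label dtype that `fcluster` produces.

```python
import numpy as np
from scipy.cluster.hierarchy import ward, fcluster, leaders
from scipy.spatial.distance import pdist

# Illustrative data: four well-separated groups of three points each.
X = [[0, 0], [0, 1], [1, 0], [0, 4], [0, 3], [1, 4],
     [4, 0], [3, 0], [4, 1], [4, 4], [3, 4], [4, 3]]
Z = ward(pdist(X))
T = fcluster(Z, 3, criterion='distance')

# Relabel the flat clusters with arbitrary ids (10, 20, 30, 40).
# Assumption: leaders expects int32 labels, as returned by fcluster.
T_custom = (T * 10).astype(np.int32)
L, M = leaders(Z, T_custom)

# Pair entries position by position: M[j] is the flat cluster id, L[j] its leader.
leader_of = {int(m): int(l) for l, m in zip(L, M)}
print(leader_of)

# Leaders with id < n are original observations (singleton flat clusters).
n = Z.shape[0] + 1
is_observation = L < n
print(is_observation)
```

Keeping the mapping in a dictionary avoids relying on any particular ordering of the entries in L and M.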

Notes

Array API support (experimental): This function returns arrays with data-dependent shape. In JAX, at the time of writing, this makes it impossible to call the function inside @jax.jit.

Array API Standard Support

leaders has experimental support for Python Array API Standard compatible backends in addition to NumPy. Please consider testing these features by setting an environment variable SCIPY_ARRAY_API=1 and providing CuPy, PyTorch, JAX, or Dask arrays as array arguments. The following combinations of backend and device (or other capability) are supported.

====================  ====================  ====================
Library               CPU                   GPU
====================  ====================  ====================
NumPy                 ✅                     n/a                 
CuPy                  n/a                   ⛔                   
PyTorch               ✅                     ⛔                   
JAX                   ⚠️ no JIT
Dask                  ⚠️ merges chunks      n/a                 
====================  ====================  ====================

See dev-arrayapi for more information.

Examples

>>> from scipy.cluster.hierarchy import ward, fcluster, leaders
>>> from scipy.spatial.distance import pdist

Given a linkage matrix ``Z``, obtained after applying a clustering method to a dataset ``X``, and a flat cluster assignment array ``T``:

>>> X = [[0, 0], [0, 1], [1, 0],
...      [0, 4], [0, 3], [1, 4],
...      [4, 0], [3, 0], [4, 1],
...      [4, 4], [3, 4], [4, 3]]
>>> Z = ward(pdist(X))
>>> Z
>>> T = fcluster(Z, 3, criterion='distance')
>>> T

`scipy.cluster.hierarchy.leaders` returns the indices of the nodes in the dendrogram that are the leaders of each flat cluster:

>>> L, M = leaders(Z, T)
>>> L

(Remember that indices 0-11 point to the 12 data points in ``X``, whereas indices 12-22 point to the 11 rows of ``Z``.)

`scipy.cluster.hierarchy.leaders` also returns the indices of the flat clusters in ``T``:

>>> M
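Because node ids of n and above refer to rows of the linkage matrix, a leader id can be turned back into the merge that created it. The sketch below rebuilds the example data and looks up row i - n for each non-singleton leader; the names `merged`, `rows`, and `sizes` are illustrative. It relies on the linkage format, in which the fourth column holds the number of original observations in the newly formed cluster.

```python
import numpy as np
from scipy.cluster.hierarchy import ward, fcluster, leaders
from scipy.spatial.distance import pdist

X = [[0, 0], [0, 1], [1, 0],
     [0, 4], [0, 3], [1, 4],
     [4, 0], [3, 0], [4, 1],
     [4, 4], [3, 4], [4, 3]]
Z = ward(pdist(X))
T = fcluster(Z, 3, criterion='distance')
L, M = leaders(Z, T)

n = Z.shape[0] + 1        # number of original observations (12 here)

# Only leaders with id >= n have a row in Z; node id i maps to row i - n.
merged = L[L >= n]
rows = Z[merged - n]

# Fourth column of a linkage row: number of observations in that cluster.
sizes = rows[:, 3].astype(int)
print(dict(zip(merged.tolist(), sizes.tolist())))
```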

See also

fcluster

for the creation of flat cluster assignments.

Aliases

  • scipy.cluster.hierarchy.leaders