8.1.2.3. sklearn.cluster.ward_tree

sklearn.cluster.ward_tree(X, connectivity=None, n_components=None, copy=True, n_clusters=None)

Ward clustering based on a feature matrix.

The inertia matrix uses a heapq-based representation.

This is the structured version, which takes into account topological structure between samples.

Parameters:

X : array of shape (n_samples, n_features)

Feature matrix representing the n_samples samples to be clustered.

connectivity : sparse matrix.

Connectivity matrix. Defines for each sample the neighboring samples following a given structure of the data. The matrix is assumed to be symmetric and only the upper triangular half is used. Default is None, i.e., the Ward algorithm is unstructured.

n_components : int (optional)

Number of connected components. If None the number of connected components is estimated from the connectivity matrix.

copy : bool (optional)

Whether to make a copy of connectivity or work in place. If connectivity is not in LIL format, a copy is made in any case.

n_clusters : int (optional)

Stop the construction of the tree early at n_clusters. This is useful to decrease computation time if the number of clusters is not small compared to the number of samples. In this case, the complete tree is not computed, so the ‘children’ output is of limited use and the ‘parents’ output should be used instead. This option is valid only when a connectivity matrix is specified.
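As an illustration of the parameters above, here is a minimal sketch of a structured call. The data and the chain-shaped connectivity matrix are made up for this example, and only X and connectivity are passed, so all other parameters keep their defaults. The four unpacked values correspond to the entries listed under Returns.

```python
import numpy as np
from scipy import sparse
from sklearn.cluster import ward_tree

# Six one-dimensional samples forming two well-separated groups (made-up data).
X = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
n_samples = X.shape[0]

# Hypothetical chain structure: sample i is a neighbor of sample i + 1.
# The matrix is filled symmetrically and stored in LIL format.
connectivity = sparse.lil_matrix((n_samples, n_samples))
for i in range(n_samples - 1):
    connectivity[i, i + 1] = 1
    connectivity[i + 1, i] = 1

children, n_components, n_leaves, parents = ward_tree(X, connectivity=connectivity)

print(n_components, n_leaves)
print(children)
print(parents)
```

Because every sample is linked to its successor, the graph forms a single connected component, and merges can only join clusters that are adjacent along the chain.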

Returns:

children : 2D array, shape (n_nodes, 2)

List of the children of each node. Leaves of the tree have an empty list of children.

n_components : int

The number of connected components in the graph.

n_leaves : int

The number of leaves in the tree.

parents : 1D array, shape (n_nodes, ) or None

The parent of each node. Only returned when a connectivity matrix is specified; otherwise ‘None’ is returned.
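To show how the ‘children’ output might be turned into flat cluster labels, here is a minimal pure-Python sketch. It assumes that row k of children holds the two nodes merged to create node n_leaves + k, and that rows are ordered from the first merge to the last; this convention is an assumption and should be checked against the installed version. The helper cut_tree and the hand-written children list are hypothetical and not part of the library.

```python
def cut_tree(children, n_leaves, n_clusters):
    """Label each leaf by replaying only the earliest merges.

    Assumes row k of `children` creates node n_leaves + k (an assumption,
    not something this page guarantees).
    """
    # Map every currently live node to the leaves it contains.
    members = {leaf: [leaf] for leaf in range(n_leaves)}

    # Replaying n_leaves - n_clusters merges leaves exactly n_clusters nodes.
    n_merges = n_leaves - n_clusters
    for k, (left, right) in enumerate(children[:n_merges]):
        members[n_leaves + k] = members.pop(left) + members.pop(right)

    # Number the clusters by their smallest leaf index for stable labels.
    labels = [None] * n_leaves
    for label, leaves in enumerate(sorted(members.values(), key=min)):
        for leaf in leaves:
            labels[leaf] = label
    return labels


# Hand-written merge list for six leaves (illustrative, not library output).
children = [[0, 1], [3, 4], [2, 6], [5, 7], [8, 9]]

print(cut_tree(children, n_leaves=6, n_clusters=2))
print(cut_tree(children, n_leaves=6, n_clusters=3))
```

This only works when the complete tree was built. If construction was stopped early with n_clusters, the ‘parents’ output is the one to follow instead, as noted in the parameter description.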
