Clustering: Hierarchical, BIRCH, and Spectral

Introduction

Not every clustering problem fits K-means. Hierarchical clustering, BIRCH, and spectral clustering offer different tradeoffs.

Use these methods when the data has nested structure, when the dataset is too large to cluster point by point in memory, or when the clusters are not well represented as spherical blobs.

Hierarchical clustering builds a tree of clusters.

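Two main styles: agglomerative (bottom-up), which starts with every point as its own cluster and repeatedly merges the closest pair, and divisive (top-down), which starts with one cluster and recursively splits it.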

The result is often visualized as a dendrogram.
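As a minimal sketch of how such a tree is built and cut, the example below uses SciPy's `linkage`, `fcluster`, and `dendrogram` on synthetic points. The data, the `average` linkage, and the choice of two flat clusters are assumptions made for illustration only.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage

rng = np.random.default_rng(0)
# Two tight, well-separated groups of 2-D points (synthetic, for illustration).
points = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.2, size=(10, 2)),
    rng.normal(loc=(5.0, 5.0), scale=0.2, size=(10, 2)),
])

# Agglomerative clustering: each row of `merges` records one merge as
# (cluster_a, cluster_b, merge_distance, size_of_new_cluster).
merges = linkage(points, method="average")

# Cut the tree so that at most two flat clusters remain.
labels = fcluster(merges, t=2, criterion="maxclust")

# no_plot=True returns the dendrogram layout without drawing anything.
tree = dendrogram(merges, no_plot=True)

print(merges.shape)
print(labels)
print(tree["leaves"])
```

Dropping `no_plot=True` draws the dendrogram, provided matplotlib is installed. The height at which two branches join is the merge distance, so a long vertical gap suggests a natural place to cut.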

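Common linkage choices: single (distance between the closest pair of members), complete (distance between the farthest pair), average (mean pairwise distance), and Ward (the merge that least increases within-cluster variance).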
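The linkage choice can change the result substantially. The sketch below runs four SciPy linkage methods on two concentric rings, a synthetic dataset chosen here because it favors chain-like merging. The `agreement` helper is a hypothetical name introduced for this example.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)
angles = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
unit_ring = np.column_stack([np.cos(angles), np.sin(angles)])
# Two concentric rings (radius 1 and radius 3) with slight jitter.
points = np.vstack([1.0 * unit_ring, 3.0 * unit_ring])
points += rng.normal(scale=0.02, size=points.shape)
truth = np.repeat([0, 1], 100)


def agreement(labels, truth):
    """Fraction of points matching `truth`, allowing the two labels to be swapped."""
    match = np.mean((labels - 1) == truth)  # fcluster labels start at 1
    return max(match, 1.0 - match)


scores = {}
for method in ("single", "complete", "average", "ward"):
    merges = linkage(points, method=method)
    labels = fcluster(merges, t=2, criterion="maxclust")
    scores[method] = agreement(labels, truth)
    print(f"{method:>8}: {scores[method]:.2f}")
```

Single linkage follows chains of nearby points, so it can trace each ring. Ward favors compact groups and is not expected to recover this shape. On noisy blob-like data the preference often reverses, because single linkage is prone to chaining through stray points.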

Use hierarchical clustering when:

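Watch out for computational cost on large datasets: standard agglomerative algorithms need memory that grows quadratically with the number of points to hold the pairwise distances, and their running time grows at least quadratically.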
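To make the cost concrete, here is a back-of-the-envelope calculation of the memory needed to hold one distance per unordered pair of points. It assumes 8-byte floating-point distances and an implementation that materializes all pairwise distances. The function name is illustrative.

```python
def condensed_distance_bytes(n_points: int, bytes_per_value: int = 8) -> int:
    """Bytes needed to hold one distance per unordered pair of points."""
    n_pairs = n_points * (n_points - 1) // 2
    return n_pairs * bytes_per_value


for n in (1_000, 10_000, 100_000):
    gib = condensed_distance_bytes(n) / 2**30
    print(f"{n:>7} points -> {gib:10.3f} GiB")
```

At 100,000 points the pairwise distances alone take about 37 GiB. This is why hierarchical clustering is usually applied to samples or to pre-summarized data at that scale.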

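BIRCH builds a compact summary of the data, a tree of clustering features (a CF tree), and then clusters the summaries instead of the raw points. Each clustering feature stores only the count, the linear sum, and the squared sum of the points it represents.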

It is useful for large datasets because it avoids storing every point in memory during clustering.
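The summary works because clustering features are additive. The sketch below is a simplified illustration, not scikit-learn's internal implementation. It stores a count, a linear sum, and a sum of squared norms, and shows that merging two summaries gives the same centroid and radius as summarizing all the points at once. The class and method names are hypothetical.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class ClusteringFeature:
    """Summary of a group of points: count, linear sum, and sum of squared norms."""
    n: int
    linear_sum: np.ndarray
    squared_sum: float

    @classmethod
    def from_points(cls, points):
        points = np.asarray(points, dtype=float)
        return cls(len(points), points.sum(axis=0), float((points ** 2).sum()))

    def merge(self, other):
        # Clustering features are additive, so merging never revisits raw points.
        return ClusteringFeature(
            self.n + other.n,
            self.linear_sum + other.linear_sum,
            self.squared_sum + other.squared_sum,
        )

    @property
    def centroid(self):
        return self.linear_sum / self.n

    @property
    def radius(self):
        # Root-mean-square distance of the summarized points from the centroid.
        variance = self.squared_sum / self.n - float(self.centroid @ self.centroid)
        return float(np.sqrt(max(variance, 0.0)))


rng = np.random.default_rng(0)
chunk_a = rng.normal(size=(50, 2))
chunk_b = rng.normal(size=(70, 2))

merged = ClusteringFeature.from_points(chunk_a).merge(ClusteringFeature.from_points(chunk_b))
direct = ClusteringFeature.from_points(np.vstack([chunk_a, chunk_b]))
print(merged.n, merged.centroid, merged.radius)
print(direct.n, direct.centroid, direct.radius)
```

Once a chunk has been folded into its summary, the raw points can be discarded. That is what keeps memory use bounded.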

Use BIRCH when:

It works best when clusters are reasonably compact.
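A minimal usage sketch with scikit-learn's `Birch` follows, feeding the data in chunks through `partial_fit` as if it were arriving from disk or a stream. The synthetic blobs and the `threshold`, `branching_factor`, and chunk count are assumptions chosen for illustration, not recommended defaults.

```python
import numpy as np
from sklearn.cluster import Birch

rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
truth = rng.integers(0, 3, size=3000)
points = centers[truth] + rng.normal(scale=0.5, size=(3000, 2))

# threshold bounds the radius of each leaf subcluster; n_clusters sets the
# number of final clusters formed from the subcluster summaries.
model = Birch(threshold=0.5, branching_factor=50, n_clusters=3)

# Feed the data in chunks; each call updates the CF tree incrementally.
for chunk in np.array_split(points, 6):
    model.partial_fit(chunk)

labels = model.predict(points)
print("subclusters kept:", len(model.subcluster_centers_))
print("final clusters:", len(np.unique(labels)))
```

A smaller `threshold` keeps more, finer subclusters at the cost of memory. A larger one compresses harder but can blur clusters that sit close together.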

Spectral clustering builds a similarity graph, uses eigenvectors of a graph Laplacian, and clusters in the transformed space.
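The three steps can be sketched directly in NumPy. This is a simplified illustration under stated assumptions: a dense Gaussian similarity graph with a hand-picked width `sigma`, the symmetric normalized Laplacian, and, because there are only two clusters, the sign of the second eigenvector in place of the usual k-means step.

```python
import numpy as np

rng = np.random.default_rng(0)
angles = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
unit_ring = np.column_stack([np.cos(angles), np.sin(angles)])
# Two concentric rings (radius 1 and radius 3) with slight jitter.
points = np.vstack([1.0 * unit_ring, 3.0 * unit_ring])
points += rng.normal(scale=0.02, size=points.shape)
truth = np.repeat([0, 1], 100)

# 1. Similarity graph: Gaussian (RBF) weights on pairwise distances.
sigma = 0.4  # kernel width, chosen by hand for this toy data
sq_dists = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
weights = np.exp(-sq_dists / (2.0 * sigma ** 2))
np.fill_diagonal(weights, 0.0)

# 2. Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
degree = weights.sum(axis=1)
inv_sqrt_degree = 1.0 / np.sqrt(degree)
laplacian = np.eye(len(points)) - inv_sqrt_degree[:, None] * weights * inv_sqrt_degree[None, :]

# 3. Eigenvectors, returned by eigh in order of ascending eigenvalue.
eigenvalues, eigenvectors = np.linalg.eigh(laplacian)

# 4. Cluster in the spectral space. With two clusters, the sign of the
#    second eigenvector is used here in place of k-means.
second_vector = eigenvectors[:, 1]
labels = (second_vector > 0).astype(int)

match = np.mean(labels == truth)
print("smallest eigenvalues:", eigenvalues[:3])
print("agreement with the true rings:", max(match, 1.0 - match))
```

For more than two clusters, the usual approach is to take the first k eigenvectors as new coordinates and run k-means on them. The dense n-by-n similarity matrix used here is also only practical for small datasets.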

It is useful for non-convex cluster shapes where distance-to-centroid methods fail.
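A short comparison on the two-moons toy dataset illustrates the point. It uses scikit-learn's `KMeans` and `SpectralClustering` with a nearest-neighbor affinity. The dataset, the noise level, and `n_neighbors=10` are assumptions picked for this sketch, and scikit-learn may warn that the neighbor graph is not fully connected.

```python
from sklearn.cluster import KMeans, SpectralClustering
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score

# Two interleaving half-circles: non-convex clusters.
points, truth = make_moons(n_samples=300, noise=0.05, random_state=0)

kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)
spectral_labels = SpectralClustering(
    n_clusters=2,
    affinity="nearest_neighbors",
    n_neighbors=10,
    random_state=0,
).fit_predict(points)

# Adjusted Rand index: 1.0 means perfect agreement with the true grouping.
kmeans_ari = adjusted_rand_score(truth, kmeans_labels)
spectral_ari = adjusted_rand_score(truth, spectral_labels)
print(f"k-means  ARI: {kmeans_ari:.2f}")
print(f"spectral ARI: {spectral_ari:.2f}")
```

K-means splits the plane with a straight boundary, so it cuts across both moons. The neighbor graph follows each moon along its length.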

Use spectral clustering when:

Watch out:

Use:

Always inspect cluster stability and usefulness. A clustering algorithm returns clusters even when the data has little or no real structure.
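One way to check this, sketched below under stated assumptions, is to compare a dataset with clear structure against uniform noise. Both receive the requested three clusters, but the silhouette score and the agreement between clusterings of random subsamples differ. The `subsample_stability` helper, the 80% subsample fraction, and the number of rounds are illustrative choices, not a standard procedure.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import adjusted_rand_score, silhouette_score

rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [8.0, 8.0], [-8.0, 8.0]])
blobs = centers[rng.integers(0, 3, size=300)] + rng.normal(scale=0.5, size=(300, 2))
noise = rng.uniform(-10.0, 10.0, size=(300, 2))  # no cluster structure by construction


def cluster(points):
    return AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(points)


def subsample_stability(points, n_rounds=10, fraction=0.8):
    """Mean ARI between clusterings of two random subsamples, on their shared points."""
    n = len(points)
    scores = []
    for _ in range(n_rounds):
        idx_a = rng.choice(n, size=int(fraction * n), replace=False)
        idx_b = rng.choice(n, size=int(fraction * n), replace=False)
        labels_a = dict(zip(idx_a, cluster(points[idx_a])))
        labels_b = dict(zip(idx_b, cluster(points[idx_b])))
        shared = np.intersect1d(idx_a, idx_b)
        scores.append(adjusted_rand_score(
            [labels_a[i] for i in shared], [labels_b[i] for i in shared]
        ))
    return float(np.mean(scores))


results = {}
for name, data in (("blobs", blobs), ("noise", noise)):
    labels = cluster(data)
    results[name] = (
        len(np.unique(labels)),          # clusters returned
        silhouette_score(data, labels),  # separation quality
        subsample_stability(data),       # agreement across subsamples
    )
    print(name, results[name])
```

Low silhouette or low subsample agreement does not prove there is no structure, but it is a signal to question the clusters before acting on them.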