
Clustering: Hierarchical, BIRCH, and Spectral

A practical overview of hierarchical clustering, BIRCH, and spectral clustering, with guidance on when each method is useful.

2020.02.05 · 1 min read · by Zhenlin Wang

Introduction

Not every clustering problem fits K-means. Hierarchical clustering, BIRCH, and spectral clustering offer different tradeoffs.

Use these methods when the structure is nested, large-scale, or not well represented by spherical clusters.

Hierarchical Clustering

Hierarchical clustering builds a tree of clusters.

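Two main styles: agglomerative (bottom-up), which starts with every point as its own cluster and repeatedly merges the two closest clusters, and divisive (top-down), which starts with one cluster and recursively splits it.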

The result is often visualized as a dendrogram.

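Common linkage choices: single (distance between the closest pair of points), complete (distance between the farthest pair), average (mean pairwise distance), and Ward (the merge that least increases within-cluster variance).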

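Use hierarchical clustering when the data has nested structure, or when you want to inspect several levels of granularity before fixing the number of clusters.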

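Watch out for computational cost on large datasets: agglomerative methods typically work from all pairwise distances, so memory grows quadratically with the number of points and run time grows at least as fast.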
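As a minimal sketch of the agglomerative workflow described above, the following uses SciPy to build the merge tree with Ward linkage and then cut it into flat clusters. The synthetic blobs, the choice of Ward linkage, and the cut at three clusters are assumptions made for illustration, not settings from the original post.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)

# Three well-separated 2-D blobs (synthetic data, for illustration only).
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

# Agglomerative clustering with Ward linkage.
# Each row of Z records one merge: the two clusters joined, the merge
# distance, and the size of the new cluster.
Z = linkage(X, method="ward")

# Cut the tree so that at most 3 flat clusters remain.
labels = fcluster(Z, t=3, criterion="maxclust")

# scipy.cluster.hierarchy.dendrogram(Z) would draw the tree (needs matplotlib).
```

Swapping `method="ward"` for `"single"`, `"complete"`, or `"average"` is one way to see how the linkage choice changes the tree on the same data.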

BIRCH

BIRCH builds a compact summary of the data using clustering features, then clusters the summaries.
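To make the idea of a clustering feature concrete, here is a small sketch of the usual definition from the BIRCH literature: a triple (N, LS, SS) holding the point count, the per-dimension linear sum, and the sum of squared norms. The post does not spell this out, so treat the definition as an assumption; the helper names are hypothetical and chosen only for this example.

```python
import numpy as np

def clustering_feature(points):
    """Return the CF triple (N, LS, SS) for an array of shape (n, d)."""
    points = np.asarray(points, dtype=float)
    return len(points), points.sum(axis=0), float((points ** 2).sum())

def merge(cf_a, cf_b):
    """CFs are additive: merging two summaries is component-wise addition."""
    return cf_a[0] + cf_b[0], cf_a[1] + cf_b[1], cf_a[2] + cf_b[2]

def centroid_and_radius(cf):
    """Recover centroid and RMS radius from the summary alone."""
    n, ls, ss = cf
    centroid = ls / n
    # Radius: root-mean-square distance of the summarized points to the centroid.
    radius = np.sqrt(max(ss / n - centroid @ centroid, 0.0))
    return centroid, radius

a = np.array([[0.0, 0.0], [2.0, 0.0]])
b = np.array([[0.0, 2.0], [2.0, 2.0]])

merged = merge(clustering_feature(a), clustering_feature(b))
centroid, radius = centroid_and_radius(merged)
```

Because the triple is additive, a summary can absorb new points or other summaries without revisiting the raw data, which is what lets the method keep only summaries in memory.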

It is useful for large datasets because it avoids storing every point in memory during clustering.

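Use BIRCH when the dataset is too large to cluster directly, or when the data arrives in batches and must be summarized incrementally.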

It works best when clusters are reasonably compact.
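The chunked workflow can be sketched with scikit-learn's `Birch` estimator and its `partial_fit` method. The synthetic data, the chunk size, and the `threshold` and `branching_factor` values below are assumptions picked for illustration, not recommendations from the original post.

```python
import numpy as np
from sklearn.cluster import Birch

rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])

# threshold bounds the radius of each subcluster summary;
# n_clusters controls the final clustering of those summaries.
model = Birch(threshold=1.0, branching_factor=50, n_clusters=3)

# Feed the data in chunks, as if it did not fit in memory at once.
for _ in range(5):
    idx = rng.integers(0, 3, size=200)
    chunk = centers[idx] + rng.normal(scale=0.5, size=(200, 2))
    model.partial_fit(chunk)

# Number of summaries kept, versus the 1000 points seen.
n_summaries = len(model.subcluster_centers_)

# Assign new points to the final clusters.
labels = model.predict(centers)
```

A smaller `threshold` keeps more, tighter summaries at the cost of memory; a larger one compresses harder but can blur clusters that are not compact.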

Spectral Clustering

Spectral clustering builds a similarity graph, uses eigenvectors of a graph Laplacian, and clusters in the transformed space.
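The three stages named above can be sketched from scratch in a few lines. This is one common variant (dense RBF affinities, the symmetric normalized Laplacian, row-normalized eigenvectors, then K-means); the data, the kernel width, and the choice of this particular variant are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.3, size=(20, 2)),
    rng.normal(loc=[5.0, 5.0], scale=0.3, size=(20, 2)),
])

# 1. Similarity graph: dense RBF (Gaussian) affinities with an assumed gamma of 1.
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
W = np.exp(-1.0 * sq_dists)
np.fill_diagonal(W, 0.0)

# 2. Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

# 3. Embed each point with the eigenvectors of the k smallest eigenvalues.
k = 2
eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues come back in ascending order
embedding = eigvecs[:, :k]
embedding = embedding / np.linalg.norm(embedding, axis=1, keepdims=True)

# 4. Cluster in the transformed space.
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(embedding)
```

In practice a library implementation with a sparse graph and an iterative eigensolver is preferable; the dense version here is only meant to show the steps.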

It is useful for non-convex cluster shapes where distance-to-centroid methods fail.

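Use spectral clustering when clusters are non-convex, or when they are better described by connectivity between neighboring points than by distance to a centroid.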

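Watch out: results are sensitive to how the similarity graph is built (kernel width or number of neighbors), and the eigendecomposition becomes expensive on large datasets.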
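A quick way to see the non-convex case is the two-moons toy dataset. The sketch below compares scikit-learn's `SpectralClustering` with K-means by adjusted Rand index against the known labels; the dataset, the nearest-neighbors affinity, and `n_neighbors=10` are assumptions chosen for the example.

```python
from sklearn.cluster import KMeans, SpectralClustering
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score

# Two interleaving half-circles: a standard non-convex toy problem.
X, y_true = make_moons(n_samples=300, noise=0.05, random_state=0)

spectral = SpectralClustering(
    n_clusters=2,
    affinity="nearest_neighbors",  # build a k-nearest-neighbors similarity graph
    n_neighbors=10,
    assign_labels="kmeans",
    random_state=0,
)
spectral_labels = spectral.fit_predict(X)

kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Agreement with the true moon membership (1.0 means a perfect match).
ari_spectral = adjusted_rand_score(y_true, spectral_labels)
ari_kmeans = adjusted_rand_score(y_true, kmeans_labels)
```

Rerunning with different `n_neighbors` values is a simple check of how sensitive the result is to the graph construction.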

Choosing Among Them

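Use hierarchical clustering when the structure is nested, BIRCH when the dataset is large, and spectral clustering when clusters are not well represented by spherical shapes.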

Always inspect cluster stability and usefulness. The algorithm can produce clusters even when the structure is weak.
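One simple check, sketched below, is to compare a quality score on the real data against the same score on structureless data. Here agglomerative clustering returns three clusters for both inputs, and the silhouette score is what distinguishes them. The synthetic data, the choice of silhouette as the metric, and the helper name `silhouette_for` are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)

# Data with real structure: three separated blobs.
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
structured = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

# Data with no structure: uniform noise over a square.
unstructured = rng.uniform(0.0, 10.0, size=(150, 2))

def silhouette_for(X, n_clusters=3):
    """Cluster X and return the mean silhouette score of the result."""
    labels = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(X)
    return silhouette_score(X, labels)

# Both calls return a 3-cluster partition; only the score reveals
# whether the partition reflects actual structure.
score_structured = silhouette_for(structured)
score_unstructured = silhouette_for(unstructured)
```

A score is only one signal. Repeating the clustering on resampled data and checking whether the assignments agree is a complementary stability test.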