Unsupervised Learning: Measures About Clustering

A practical guide to evaluating clustering with silhouette score, Davies-Bouldin index, Calinski-Harabasz score, external labels, stability, and business usefulness.

2020.02.15 · 1 min read · by Zhenlin Wang

Introduction

Unsupervised learning finds structure without target labels. Clustering is one of the most common unsupervised tasks: grouping similar examples together.

The challenge is evaluation. Without labels, there is no obvious “accuracy.”

Internal Metrics

Internal metrics evaluate clusters using only the input data and cluster assignments.

Silhouette Score

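Silhouette score compares how close a point is to the other points in its own cluster (mean distance a) versus the points in the nearest other cluster (mean distance b). The per-point value is (b - a) / max(a, b), and the overall score is the mean over all points.

Values range from -1 to 1: values near 1 mean the point sits well inside its cluster, values near 0 mean it lies near the boundary between two clusters, and negative values suggest it may have been assigned to the wrong cluster.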
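As a minimal sketch of how this might be computed, the example below uses scikit-learn's `silhouette_score` and `silhouette_samples` on synthetic data. The blob centers, the choice of k-means, and `n_clusters=3` are assumptions made for illustration only.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_samples, silhouette_score

# Synthetic data: three well-separated 2D blobs (illustrative assumption).
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=1.0,
    random_state=0,
)

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Mean silhouette over all points: one number for the whole clustering.
score = silhouette_score(X, labels)

# Per-point silhouette values: useful for spotting borderline points.
per_point = silhouette_samples(X, labels)

print(f"mean silhouette: {score:.3f}")
print(f"lowest per-point value: {per_point.min():.3f}")
```

Looking at the per-point values, not just the mean, can show whether a decent average hides one cluster full of borderline points.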

Davies-Bouldin Index

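The Davies-Bouldin index averages, over all clusters, each cluster's similarity to its most similar neighbor, where similarity is the ratio of within-cluster scatter to the distance between centroids. Lower is better, and the minimum is 0.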

It rewards compact clusters that are far apart.
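The effect of compactness can be illustrated with a small experiment. This is a sketch under assumptions: the helper `db_for_spread`, the blob centers, and the two spread values are hypothetical and chosen only to make the contrast visible.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import davies_bouldin_score

# Fixed centers so that only the spread of each blob changes (illustrative).
CENTERS = [[0, 0], [8, 8], [-8, 8]]


def db_for_spread(cluster_std):
    """Cluster synthetic blobs with the given spread and return the index."""
    X, _ = make_blobs(
        n_samples=300, centers=CENTERS, cluster_std=cluster_std, random_state=0
    )
    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
    return davies_bouldin_score(X, labels)


db_tight = db_for_spread(0.5)  # compact blobs
db_loose = db_for_spread(3.0)  # diffuse, partly overlapping blobs

print(f"tight blobs: {db_tight:.3f}")
print(f"loose blobs: {db_loose:.3f}")
```

With the same centers, the compact blobs should score lower (better) than the diffuse ones.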

Calinski-Harabasz Score

Calinski-Harabasz compares between-cluster dispersion to within-cluster dispersion. Higher is better.

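It tends to favor dense, well-separated, roughly convex clusters, so it can underrate valid but irregularly shaped groupings such as those found by density-based methods.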
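To make the ratio concrete, the sketch below computes the score by hand as (between-cluster dispersion / (k - 1)) divided by (within-cluster dispersion / (n - k)) and compares it with scikit-learn's `calinski_harabasz_score`. The synthetic data and the use of k-means are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import calinski_harabasz_score

X, _ = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=0)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

n = X.shape[0]
cluster_ids = np.unique(labels)
k = len(cluster_ids)
overall_mean = X.mean(axis=0)

between = 0.0  # size-weighted squared distance of centroids to the overall mean
within = 0.0   # squared distance of points to their own centroid
for c in cluster_ids:
    members = X[labels == c]
    centroid = members.mean(axis=0)
    between += len(members) * np.sum((centroid - overall_mean) ** 2)
    within += np.sum((members - centroid) ** 2)

ch_manual = (between / (k - 1)) / (within / (n - k))
ch_sklearn = calinski_harabasz_score(X, labels)

print(f"manual:  {ch_manual:.2f}")
print(f"sklearn: {ch_sklearn:.2f}")
```

The score has no fixed upper bound, so it is mainly useful for comparing different clusterings of the same data.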

External Metrics

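If labels are available for evaluation, use external metrics such as the adjusted Rand index (ARI), normalized mutual information (NMI), or homogeneity, completeness, and V-measure.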

These labels do not need to be training labels. They can be human categories or downstream outcomes used only for evaluation.
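As a small sketch, two widely used external metrics are available in scikit-learn as `adjusted_rand_score` and `normalized_mutual_info_score`. The label lists below are made up for illustration; in practice the reference labels would come from human categories or downstream outcomes.

```python
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

# Hypothetical reference categories, used only for evaluation.
reference = [0, 0, 0, 1, 1, 1, 2, 2, 2]

# Same grouping as the reference, but with different cluster IDs.
relabeled = [2, 2, 2, 0, 0, 0, 1, 1, 1]

# A grouping that spreads every reference category across all clusters.
mixed = [0, 1, 2, 0, 1, 2, 0, 1, 2]

# Cluster IDs are arbitrary, so a pure renaming still scores 1.0.
ari_relabeled = adjusted_rand_score(reference, relabeled)
nmi_relabeled = normalized_mutual_info_score(reference, relabeled)

# ARI is corrected for chance; this grouping scores below zero.
ari_mixed = adjusted_rand_score(reference, mixed)

print(f"ARI (relabeled): {ari_relabeled:.3f}")
print(f"NMI (relabeled): {nmi_relabeled:.3f}")
print(f"ARI (mixed):     {ari_mixed:.3f}")
```

Both metrics ignore the actual cluster IDs and compare only which points are grouped together, which is why they suit clustering better than plain accuracy.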

Stability

A clustering result should be stable enough to trust.

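Check whether the clusters persist across different random seeds, resampled subsets of the data, and small perturbations of the inputs.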

If clusters change completely with tiny changes, the structure may be weak.
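One way to probe stability is to refit on random subsamples with different seeds and measure how well each refit agrees with a baseline. The sketch below is only one possible setup: the 80% subsample size, the 10 repetitions, and ARI as the agreement measure are all assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=1.0,
    random_state=0,
)

rng = np.random.default_rng(0)
baseline = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

scores = []
for seed in range(10):
    # Refit on a random 80% subsample, using a different seed each time.
    idx = rng.choice(len(X), size=int(0.8 * len(X)), replace=False)
    model = KMeans(n_clusters=3, n_init=10, random_state=seed).fit(X[idx])
    # Assign the full data set with the refit model so that both labelings
    # cover the same points, then measure agreement with the baseline.
    scores.append(adjusted_rand_score(baseline.labels_, model.predict(X)))

mean_ari = float(np.mean(scores))
print(f"mean ARI across refits: {mean_ari:.3f}")
print(f"worst ARI across refits: {min(scores):.3f}")
```

Agreement close to 1 across refits suggests the structure is robust; widely varying or low values suggest the clusters depend on the particular sample or initialization.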

Interpretability

A clustering result should be explainable: you should be able to describe what sets each cluster apart in terms of the input features.

Pretty 2D plots are not enough. Useful clusters should support decisions.
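A simple starting point for explanation is a per-cluster profile: the size of each cluster and the mean of each feature within it. In the sketch below, the feature names `feature_a` and `feature_b` are hypothetical placeholders and the data is synthetic.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Hypothetical feature names; real data would use its own column names.
feature_names = ["feature_a", "feature_b"]

X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=1.0,
    random_state=0,
)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Build a small profile per cluster: size plus the mean of each feature.
profile = {}
for c in np.unique(labels):
    members = X[labels == c]
    means = members.mean(axis=0)
    profile[int(c)] = {
        "size": len(members),
        **{name: float(m) for name, m in zip(feature_names, means)},
    }

for cluster_id, stats in profile.items():
    print(cluster_id, stats)
```

If two clusters have nearly identical profiles, or a cluster cannot be described in terms anyone would act on, that is a sign the segmentation may not be useful.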

Practical Checklist

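Before using clusters, check the internal metrics, compare against external labels where they exist, test stability, and confirm that the clusters are interpretable and useful for the business.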

Clustering is exploratory. Treat it as a way to generate structure and hypotheses, not automatic truth.