Blogs · Probability · Machine Learning

An Overview of Hidden Markov Models and Their Algorithms

A practical overview of hidden Markov models, including states, observations, transition probabilities, emission probabilities, forward-backward, Viterbi, and Baum-Welch.

2021.06.22 · 2 min read · by Zhenlin Wang

Introduction

A hidden Markov model (HMM) models a sequence where the observed data is generated by hidden states that evolve over time.

Examples:

The key assumption is Markovian: the next hidden state depends on the current hidden state, not the full history.

Components

An HMM has:

Notation:

Three Core Questions

Evaluation

Given model parameters and observations, what is the probability of the observed sequence?

This is solved with the forward algorithm.

Decoding

Given model parameters and observations, what is the most likely hidden state sequence?

This is solved with the Viterbi algorithm.

Learning

Given observations but unknown parameters, how do we estimate transition and emission probabilities?

This is solved with Baum-Welch, an expectation-maximization algorithm for HMMs.

Forward Algorithm

The forward algorithm computes the probability of observations up to time $t$ while ending in each hidden state.

It avoids enumerating every possible hidden state sequence, which would be exponential in sequence length.

At each step:

  1. Combine previous state probabilities with transition probabilities.
  2. Multiply by the emission probability of the current observation.
  3. Continue forward through the sequence.

Viterbi Algorithm

The Viterbi algorithm finds the most likely hidden state path.

It is similar to the forward algorithm, but instead of summing over possible previous states, it takes the maximum and stores backpointers.

Use Viterbi when you need a single best explanation of the observed sequence.

Baum-Welch

Baum-Welch estimates HMM parameters when the hidden states are unknown.

It alternates:

Like other EM methods, it can converge to local optima, so initialization matters.

Practical Considerations

HMMs are useful when:

They are less suitable when long-range context dominates. In those cases, recurrent networks, transformers, or state-space models may be better.

Closing

Hidden Markov models are a clean framework for sequence data with latent states. Their main value is the combination of interpretable structure and efficient dynamic programming algorithms.