Testing Machine Learning Systems

Draft notes on software tests, model evaluation, and model behavior tests.

2024.02.17 · 1 min read · by Zhenlin Wang

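A typical software testing suite will include:

  • unit tests, which operate on atomic pieces of the codebase and run quickly during development
  • regression tests, which replicate bugs that were previously encountered and fixed
  • integration tests, which are typically longer-running and observe higher-level behavior across multiple components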

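For machine learning systems, we should be running model evaluation and model tests in parallel. Model evaluation summarizes performance with metrics on a validation set; model tests check explicit behaviors we expect the model to follow.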

How do you write model tests?

  1. Pre-train test

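    • Early bug discovery and training short-circuiting (saves training cost)
    • Things to check:
      • output shape and value distribution (for example, probabilities stay within the expected range)
      • gradient behavior (for example, a single gradient step on a batch should lower the training loss)
      • data quality
      • label leakage between the training and validation sets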
  2. Post-train test

    • Post-mortem issue discovery and model behavior analysis
    • Things to check:
      • Invariance test (apply a set of perturbations to the input that should not affect the model’s output)
      • Directional expectation test (apply a set of perturbations to the input that should change the model’s output in a predictable direction)
      • Data unit test (similar to a regression test, built from previously failed model scenarios)
  3. Organizing tests

    • Structure the tests around the “skills” we expect the model to acquire while learning to perform a given task.
  4. Model Dev Pipeline
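
Going back to the pre-train checks in item 1, here is a minimal sketch of how they might look as plain functions that run before any real training starts. The toy logistic-regression model, the helper names (`predict_proba`, `sgd_step`, the `check_*` functions), the `(id, features, label)` row layout, and the synthetic data are all assumptions made for illustration; they are not from the original notes or from any particular library.

```python
import math
import random


def predict_proba(weights, bias, features):
    """Forward pass of a toy logistic-regression model for one example."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(weights, bias, rows):
    """Mean binary cross-entropy over rows of (id, features, label)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for _, features, label in rows:
        p = predict_proba(weights, bias, features)
        total -= label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps)
    return total / len(rows)


def sgd_step(weights, bias, rows, lr=0.1):
    """One full-batch gradient-descent step; returns updated parameters."""
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for _, features, label in rows:
        err = predict_proba(weights, bias, features) - label
        for i, x in enumerate(features):
            grad_w[i] += err * x
        grad_b += err
    n = len(rows)
    new_weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return new_weights, bias - lr * grad_b / n


def make_rows(rng, n, first_id):
    """Synthetic, linearly separable data (hypothetical stand-in for a dataset)."""
    rows = []
    for i in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        rows.append((first_id + i, x, 1 if x[0] + x[1] > 0 else 0))
    return rows


def check_output_range(weights, bias, rows):
    """Output check: every prediction is a valid probability."""
    return all(0.0 <= predict_proba(weights, bias, x) <= 1.0 for _, x, _ in rows)


def check_loss_decreases(weights, bias, rows):
    """Gradient check: one step on a batch should lower the loss on that batch."""
    before = log_loss(weights, bias, rows)
    new_weights, new_bias = sgd_step(weights, bias, rows)
    return log_loss(new_weights, new_bias, rows) < before


def check_labels_valid(rows):
    """Data-quality check: labels are restricted to the expected classes."""
    return all(label in (0, 1) for _, _, label in rows)


def check_no_leakage(train_rows, valid_rows):
    """Leakage check: no example id appears in both splits."""
    return {r[0] for r in train_rows}.isdisjoint(r[0] for r in valid_rows)


def run_pretrain_checks(weights, bias, train_rows, valid_rows):
    return {
        "output_range": check_output_range(weights, bias, train_rows),
        "loss_decreases": check_loss_decreases(weights, bias, train_rows),
        "labels_valid": check_labels_valid(train_rows + valid_rows),
        "no_leakage": check_no_leakage(train_rows, valid_rows),
    }


rng = random.Random(0)
train_rows = make_rows(rng, 32, first_id=0)
valid_rows = make_rows(rng, 8, first_id=1000)
results = run_pretrain_checks([0.0, 0.0], 0.0, train_rows, valid_rows)
```

In a real pipeline, a full training run would only be launched if every value in `results` is true, which is where the cost saving comes from.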

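The post-train checks in item 2, grouped by skill as item 3 suggests, could be sketched along the following lines. The `sentiment_score` function is a hypothetical keyword-based stand-in for a trained model, and the word lists, templates, and test cases are invented for illustration only; a real suite would call the actual model's prediction function instead.

```python
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}
NEGATORS = {"not", "never"}


def sentiment_score(text):
    """Hypothetical stand-in for a trained model: lexicon score with simple negation."""
    score = 0
    negate = False
    for token in text.lower().replace(".", " ").replace(",", " ").split():
        if token in NEGATORS:
            negate = True
            continue
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        if polarity != 0:
            score += -polarity if negate else polarity
            negate = False
    return score


def predict_label(text):
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def check_invariance(template, names):
    """Invariance test: swapping the name must not change the predicted label."""
    labels = {predict_label(template.format(name=name)) for name in names}
    return len(labels) == 1


def check_directional(base, added, direction):
    """Directional expectation test.

    direction=-1: appending `added` must not raise the score.
    direction=+1: appending `added` must not lower the score.
    """
    delta = sentiment_score(base + " " + added) - sentiment_score(base)
    return delta * direction >= 0


# Data unit tests, grouped by the skill each case is meant to exercise.
SKILL_CASES = {
    "vocabulary": [
        ("I love this phone", "positive"),
        ("What an awful movie", "negative"),
    ],
    "negation": [
        ("I do not love this phone", "negative"),
        ("The food was not bad", "positive"),
    ],
}


def run_skill_suite(cases_by_skill):
    """Return, per skill, the cases where the prediction differs from the expectation."""
    failures = {}
    for skill, cases in cases_by_skill.items():
        failures[skill] = [
            (text, expected, predict_label(text))
            for text, expected in cases
            if predict_label(text) != expected
        ]
    return failures


invariant = check_invariance("{name} said the service was great.", ["Mark", "Samantha", "Priya"])
worse = check_directional("The hotel was good.", "The staff were awful.", direction=-1)
better = check_directional("The hotel was good.", "The breakfast was great.", direction=1)
failures = run_skill_suite(SKILL_CASES)
```

Reporting failures per skill, rather than as one flat list, makes it easier to see which capability regressed when a new model version is compared against the previous one.
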
{source: https://www.jeremyjordan.me/testing-ml/}