
Some Tricks in Real-world Machine Learning Engineering

Draft notes on feature engineering, notebook conversion, missing values, scaling, and categorical encoding.

2024.03.02 · 1 min read · by Zhenlin Wang

Feature Engineering

Easy conversion of Jupyter notebooks

1. Jupyter Notebook-based development

Intuition: develop and test all logic in an .ipynb file, then run jupyter nbconvert --to script train_model.ipynb to convert the notebook directly into a .py file (train_model.py for a Python notebook). A very handy trick!
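To make the conversion less of a black box, here is a minimal stdlib-only sketch of roughly what it does: a notebook is a JSON document, and the script is essentially its code cells joined together. This is not nbconvert's implementation; the function name notebook_to_script and the in-memory demo notebook are made up for illustration.

```python
import json


def notebook_to_script(nb_json: str) -> str:
    """Concatenate the code cells of a notebook (given as JSON text) into one script."""
    nb = json.loads(nb_json)
    chunks = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # The notebook format stores source as a string or a list of lines
        if isinstance(source, list):
            source = "".join(source)
        if source.strip():
            chunks.append(source.rstrip("\n"))
    return "\n\n".join(chunks) + "\n"


# Tiny in-memory notebook, made up for illustration
demo_nb = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Train model"]},
        {"cell_type": "code", "source": ["x = 1\n", "y = x + 1"]},
        {"cell_type": "code", "source": "print(y)"},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
})
script = notebook_to_script(demo_nb)
print(script)
```

Cells that rely on IPython magics or shell escapes will not run as plain Python, so it is worth checking the generated script before scheduling it.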

Handling Missing Values

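  1. Causes of missingness
    1. The value itself (missing not at random, e.g. certain groups of people tend not to report it)
    2. Another variable (missing at random, i.e. explained by other observed features)
    3. No reason (missing completely at random)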
  2. Handling
    1. Deletion
      • By Row
      • By Column
    2. Imputation
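The handling options above can be sketched in plain Python as follows. The toy rows, the 0.5 column threshold, and the choice of median imputation are all assumptions for illustration; in practice the same steps are usually done with pandas (DataFrame.dropna and DataFrame.fillna).

```python
from statistics import median

# Hypothetical toy rows; None marks a missing value
rows = [
    {"age": 25, "income": 50000, "fax": None},
    {"age": None, "income": 62000, "fax": None},
    {"age": 40, "income": None, "fax": None},
    {"age": 31, "income": 58000, "fax": "555-0100"},
]
columns = ["age", "income", "fax"]


def drop_rows(rows):
    """Deletion by row: keep only rows with no missing value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def drop_columns(rows, columns, max_missing=0.5):
    """Deletion by column: drop columns whose missing fraction exceeds max_missing."""
    keep = [
        c for c in columns
        if sum(r[c] is None for r in rows) / len(rows) <= max_missing
    ]
    return [{c: r[c] for c in keep} for r in rows]


def impute_median(rows, column):
    """Imputation: replace missing values in one numeric column with its median."""
    fill = median(r[column] for r in rows if r[column] is not None)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


complete = drop_rows(rows)              # only the fully observed row survives
trimmed = drop_columns(rows, columns)   # "fax" is missing in 3 of 4 rows, so it is dropped
imputed = impute_median(rows, "age")    # the missing age becomes the median of 25, 40, 31
```

Row deletion is the simplest option but discards three of the four rows here, which is why column deletion or imputation is often preferred when missingness is concentrated in a few features.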

Scaling
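
The notes do not name a specific scaling method, so the following is only a sketch of two common choices, min-max scaling and standardization (z-score). The function names and sample values are made up for illustration.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and rescale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - mu) / sigma for v in values]


values = [10.0, 20.0, 30.0, 40.0]
scaled = min_max_scale(values)
z = standardize(values)
```

In a real pipeline the statistics (min, max, mean, standard deviation) would typically be computed on the training split only and then reused for validation and test data.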

Discretization
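
As with scaling, no particular method is given here, so this is just a sketch of one simple option, equal-width binning. The function name, the sample ages, and the bin count are assumptions for illustration.

```python
def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins equally wide intervals (indices 0..n_bins-1)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0 for _ in values]  # constant feature: everything in one bin
    # min() clamps the maximum value into the last bin
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


ages = [18, 22, 35, 47, 51, 64, 80]
bins = equal_width_bins(ages, 3)
```

Equal-width bins are sensitive to outliers; quantile-based bins are a common alternative when the distribution is skewed.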

Categorical Feature Encoding

[IMPT] Encoding trick widely adopted in industry: hashing
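
A minimal sketch of the hashing trick, assuming a fixed number of buckets and an MD5-based hash (both are illustrative choices, not something the notes specify). A stable hash is used instead of Python's built-in hash(), which is salted per process for strings. The names hash_bucket and one_hot_hashed are hypothetical; scikit-learn's FeatureHasher offers a production-grade version of the same idea.

```python
import hashlib


def hash_bucket(value: str, n_buckets: int) -> int:
    """Map a category string to a bucket index in [0, n_buckets)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets


def one_hot_hashed(value: str, n_buckets: int) -> list:
    """One-hot vector of fixed length, regardless of how many categories exist."""
    vec = [0] * n_buckets
    vec[hash_bucket(value, n_buckets)] = 1
    return vec


n_buckets = 16
seen = one_hot_hashed("city=Singapore", n_buckets)
# A category never seen during training still gets a valid encoding
unseen = one_hot_hashed("city=Reykjavik", n_buckets)
```

The appeal is that no vocabulary has to be stored or updated: new categories map to an existing bucket automatically. The cost is collisions, where different categories share a bucket, so the bucket count trades memory against collision rate.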