Feature Engineering
Easy conversion of Jupyter notebooks
1. Jupyter Notebook-based development
Intuition: develop and test all logic in the .ipynb file, then run jupyter nbconvert --to script train_model.ipynb to convert the .ipynb file directly to a .py file. This is a very handy trick!
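To get a feel for what the conversion does, here is a minimal stdlib-only sketch. It is not nbconvert itself: it relies only on the fact that a .ipynb file is JSON with a "cells" list, and it simply skips markdown cells. The function name notebook_code and the sample notebook are made up for illustration.

```python
import json


def notebook_code(nb_json: str) -> str:
    """Concatenate the source of all code cells in a notebook JSON string."""
    nb = json.loads(nb_json)
    chunks = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # this sketch drops markdown and raw cells
        src = cell.get("source", "")
        # "source" may be stored as one string or as a list of lines
        chunks.append(src if isinstance(src, str) else "".join(src))
    return "\n\n".join(chunks) + "\n"


# Hypothetical in-memory notebook, standing in for train_model.ipynb
sample = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["x = 1\n", "y = x + 1"]},
        {"cell_type": "code", "source": "print(y)"},
    ],
    "nbformat": 4,
    "nbformat_minor": 5,
})
script = notebook_code(sample)
```

For real work the nbconvert command above is the better choice, since it also handles details this sketch ignores.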
Handling Missing Values
- Cause of missing
  - The value itself (e.g. certain groups of people)
  - Another variable
  - No reason
- Handling
  - Deletion
    - By row
    - By column
  - Imputation
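The handling options above can be sketched in plain Python. This is a rough illustration under assumptions: the toy rows, the column names, the 0.5 threshold, and median as the fill value are all made up here, not taken from these notes.

```python
from statistics import median

# Hypothetical toy data; None marks a missing value.
rows = [
    {"age": 25, "income": 50000, "marital_status": None},
    {"age": None, "income": 62000, "marital_status": None},
    {"age": 31, "income": 58000, "marital_status": "single"},
    {"age": 40, "income": None, "marital_status": None},
]


def drop_rows(rows):
    """Deletion by row: keep only rows that have no missing value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def drop_columns(rows, max_missing_frac=0.5):
    """Deletion by column: drop columns whose missing fraction exceeds the threshold."""
    keep = [
        c for c in rows[0]
        if sum(r[c] is None for r in rows) / len(rows) <= max_missing_frac
    ]
    return [{c: r[c] for c in keep} for r in rows]


def impute_median(rows, column):
    """Imputation: fill missing values in a numeric column with the column median."""
    fill = median(r[column] for r in rows if r[column] is not None)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


complete_rows = drop_rows(rows)                 # only the third row survives
dense_columns = drop_columns(rows)              # marital_status is dropped (3 of 4 missing)
age_filled = impute_median(rows, "age")         # the missing age is filled with the median
```

Note how row deletion throws away three of the four rows here, which is why imputation or column deletion is often tried first when missing values are spread across many rows.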
Scaling
Discretization
Categorical Feature Encoding
[IMPT] Encoding trick widely adopted in industry: hashing
- Hash each category to an index in a fixed-size hash space
- A new incoming category gets hashed to an existing index
- Random collisions are not too bad in practice
- Largely resolves the “unknown category” problem
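A minimal sketch of the hashing trick, under assumptions: the bucket count of 16, the use of MD5 as a stable hash, and the function names are choices made here for illustration. Python's built-in hash() is avoided because it is salted per process for strings, so indices would not be reproducible across runs.

```python
import hashlib


def hash_bucket(category: str, n_buckets: int = 16) -> int:
    """Map a category string to a stable index in [0, n_buckets)."""
    digest = hashlib.md5(category.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets


def hash_one_hot(category: str, n_buckets: int = 16) -> list:
    """Encode a category as a fixed-length one-hot vector over hash buckets."""
    vec = [0] * n_buckets
    vec[hash_bucket(category, n_buckets)] = 1
    return vec


# Categories seen during training (hypothetical brand names)
seen = {c: hash_bucket(c) for c in ["nike", "adidas", "puma"]}

# A category never seen before still lands on a valid existing index,
# so no special "unknown" handling is needed at serving time.
new_index = hash_bucket("brand_never_seen_before")
new_vector = hash_one_hot("brand_never_seen_before")
```

The vector length is fixed by n_buckets no matter how many categories show up later. Two categories can share a bucket; a larger n_buckets makes such collisions less likely at the cost of a wider feature vector.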