Should I apply feature scaling and selection before or after one-hot encoding/label encoding?
Please correct me if I'm wrong. My planned order of steps is:
- Deal with Outliers
- Impute missing Values
- Label Encode/One Hot encode categorical values
- Apply Dimensionality Reduction
- Apply Feature Selection
Best Answer
The mentioned steps are correct.
Feature scaling (min-max normalization, or standardization by mean and standard deviation) applies to numerical features, so it does not matter whether you do it before or after label encoding. Keep in mind, though, that you SHOULD NOT scale the encoded categorical features.
Dimensionality reduction and feature selection need numerical values, so you should do them after label encoding (or one-hot encoding).
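As a rough illustration of that ordering, here is a minimal sketch using scikit-learn. The toy data, the column names (`age`, `income`, `city`), and the choice of `SelectKBest` with `k=3` are all assumptions made up for the example, not part of the question. Outlier handling is left out to keep it short. The numeric columns are imputed and scaled, the categorical column is one-hot encoded and left unscaled, and feature selection runs only after everything is numeric.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: two numeric columns and one categorical column.
df = pd.DataFrame({
    "age": [25, 32, np.nan, 51, 46, 38],
    "income": [30_000, 52_000, 41_000, np.nan, 88_000, 61_000],
    "city": ["A", "B", "A", "C", "B", "C"],
})
y = np.array([0, 0, 0, 1, 1, 1])

numeric_cols = ["age", "income"]
categorical_cols = ["city"]

# Numeric branch: impute missing values, then scale.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Each branch only sees its own columns, so the one-hot columns are never scaled.
preprocess = ColumnTransformer(
    [
        ("num", numeric_steps, numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    sparse_threshold=0,  # always return a dense array
)

# Feature selection comes last, once every column is numeric.
model_input = Pipeline([
    ("preprocess", preprocess),
    ("select", SelectKBest(score_func=f_classif, k=3)),
])

X_selected = model_input.fit_transform(df, y)
print(X_selected.shape)
```

Swapping `SelectKBest` for a dimensionality reduction step such as `PCA` would follow the same pattern: it goes after the encoding step in the pipeline.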