Solved – How do ADASYN and SMOTE handle categorical data, specifically binary features?

SMOTE oversamples the minority class by creating synthetic samples along the line segment connecting a minority-class sample to each of its K nearest neighbors (or to however many of them are chosen). In other words, x_new = x_old + lambda * (x_neighbor - x_old), where lambda is a random number between 0 and 1. How should this approach be modified when binary features are present?
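To make the problem concrete, here is a minimal sketch of the interpolation step in plain Python. The function name `smote_interpolate` and the two sample vectors are made up for illustration; they do not come from any library. The first feature is treated as continuous and the second as binary.

```python
import random

def smote_interpolate(x_old, x_neighbor, rng):
    """Return one synthetic point on the segment between x_old and x_neighbor."""
    lam = rng.random()  # lambda drawn uniformly from [0, 1)
    return [a + lam * (b - a) for a, b in zip(x_old, x_neighbor)]

rng = random.Random(0)
# Hypothetical samples: first feature continuous, second feature binary.
x_old = [2.0, 0.0]
x_neighbor = [4.0, 1.0]
x_new = smote_interpolate(x_old, x_neighbor, rng)
print(x_new)  # the second entry is generally fractional, not a valid 0/1 value
```

Whenever the two samples disagree on the binary feature, the interpolated value lands strictly between 0 and 1 for almost every lambda, which is why plain SMOTE does not apply directly.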

OK, so I found the answer. In case someone else is interested, here it is: the answer lies within the SMOTE paper itself. Section 6.1 of the paper presents the SMOTE-NC technique, which describes how mixed data types (nominal and continuous) can be handled.
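As I understand section 6.1, SMOTE-NC works in three steps: compute the median of the standard deviations of the continuous features over the minority class; use that median as the penalty for each nominal mismatch when computing Euclidean distances to find the k nearest neighbors; then interpolate the continuous features as in regular SMOTE and give each nominal feature the value that occurs most often among the k neighbors. Below is a rough sketch of that idea in plain Python, not a reference implementation. The function `smote_nc_sample`, its parameters, and the toy data are hypothetical, and the use of the population standard deviation and the tie-breaking rule are my own assumptions.

```python
import random
import statistics
from collections import Counter

def smote_nc_sample(minority, cont_idx, nom_idx, i, k, rng):
    """Generate one synthetic sample from minority[i] (SMOTE-NC style sketch)."""
    # Penalty for a nominal mismatch: median of the per-feature standard
    # deviations of the continuous features (population std is an assumption).
    stds = [statistics.pstdev([row[j] for row in minority]) for j in cont_idx]
    med = statistics.median(stds)

    def dist(a, b):
        # Squared differences for continuous features, plus med^2 for
        # every nominal feature on which the two samples disagree.
        d2 = sum((a[j] - b[j]) ** 2 for j in cont_idx)
        d2 += sum(med ** 2 for j in nom_idx if a[j] != b[j])
        return d2 ** 0.5

    x = minority[i]
    others = [row for n, row in enumerate(minority) if n != i]
    neighbors = sorted(others, key=lambda row: dist(x, row))[:k]

    nb = rng.choice(neighbors)  # neighbor used for the interpolation
    lam = rng.random()          # lambda in [0, 1)
    new = list(x)
    for j in cont_idx:
        new[j] = x[j] + lam * (nb[j] - x[j])
    for j in nom_idx:
        # Most frequent value among the k neighbors; ties are broken
        # arbitrarily here (first value encountered).
        new[j] = Counter(row[j] for row in neighbors).most_common(1)[0][0]
    return new

# Toy minority class: two continuous features followed by one binary feature.
minority = [
    [1.0, 2.0, 0],
    [1.2, 1.9, 0],
    [0.9, 2.2, 1],
    [1.1, 2.1, 0],
    [3.0, 0.5, 1],
]
synthetic = smote_nc_sample(minority, cont_idx=[0, 1], nom_idx=[2],
                            i=0, k=3, rng=random.Random(0))
print(synthetic)
```

The binary feature in the synthetic sample is always one of the values actually seen among the neighbors, so it stays a valid 0/1 value, while the continuous features are interpolated as before.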
