Solved – Quadratic terms in regression with categorical variables

For the first time, I am predicting a continuous dependent variable in a problem where all independent variables are categorical, using Python's statsmodels. I would like to add all possible quadratic terms to the model 'y ~ C(x1) + C(x2) + C(x3)'. What is the right notation for that?

EDIT: One of my categorical variables is age, which I binned into four bins. All the others are converted into dummy variables. So the idea was to square age.

Quadratic terms are undefined for categorical variables, because you cannot square a category. (Squaring a 0/1 dummy column just returns the same column, so it adds nothing to the model.) On the other hand, if you have a continuous variable, such as age, with a nonlinear relationship to your outcome (dependent) variable, binning it can help: each bin gets its own fitted level, which plays a role similar to introducing quadratic terms, with the added benefit of a model that is simpler to build and explain.
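To make the point concrete, here is a minimal sketch that compares the two ways of handling a nonlinear age effect in statsmodels. The data are simulated, and the column names (`age`, `age_bin`, `y`), the bin labels, and the U-shaped relationship are all assumptions made for illustration, not the asker's actual data. It fits a linear model, a model with an explicit squared term on continuous age, and a model on age binned into four categories, and it also checks that squaring a dummy column changes nothing.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data (an assumption for illustration): a U-shaped effect of age on y.
rng = np.random.default_rng(0)
n = 500
age = rng.uniform(20, 80, size=n)
y = 0.05 * (age - 50) ** 2 + rng.normal(0, 2, size=n)
df = pd.DataFrame({"y": y, "age": age})

# Bin continuous age into four equal-width categories (labels are hypothetical).
df["age_bin"] = pd.cut(df["age"], bins=4, labels=["b1", "b2", "b3", "b4"])

# Linear term only: cannot capture the U shape.
linear = smf.ols("y ~ age", data=df).fit()

# Quadratic term on the continuous variable; I(...) protects the arithmetic
# so that ** means "square", not a formula operator.
quadratic = smf.ols("y ~ age + I(age**2)", data=df).fit()

# Binned age treated as categorical: one intercept plus three dummy coefficients.
binned = smf.ols("y ~ C(age_bin)", data=df).fit()

# Squaring a 0/1 dummy returns the same column, so a "squared dummy" adds nothing.
dummy = (df["age_bin"] == "b1").astype(int)
same_after_squaring = bool((dummy ** 2 == dummy).all())

print("R^2 linear:   ", round(linear.rsquared, 3))
print("R^2 quadratic:", round(quadratic.rsquared, 3))
print("R^2 binned:   ", round(binned.rsquared, 3))
print("dummy**2 == dummy:", same_after_squaring)
```

On data like this, the binned model should recover much of the curvature that the linear model misses, although a true quadratic term on the continuous variable fits a smooth curve rather than a step function. If the original continuous age is still available, it may be worth comparing both.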
