I am working for the first time on predicting a continuous dependent variable in a problem where all independent variables are categorical, using Python statsmodels. I would like to add all possible quadratic terms to the model `'y ~ C(x1) + C(x2) + C(x3)'`. What is the right notation for that?

EDIT: one of my categorical variables is age, which I binned into four bins. All the others are transformed into dummy variables. So the idea was to square age.

#### Best Answer

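Quadratic terms for categorical variables are undefined, because you cannot square a categorical variable. Even after dummy coding, squaring changes nothing: a 0/1 dummy squared is the same dummy. On the other hand, if you have a continuous variable (such as age) with a nonlinear relationship to your outcome/dependent variable, binning it already lets the model capture that nonlinearity, since each bin gets its own coefficient. In that sense categorization plays a role similar to introducing quadratic terms, with the added benefit of a model that is simpler to build and explain.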
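To make this concrete, here is a minimal sketch on simulated data. The data frame, the column names (`age`, `age_bin`, `x2`), the bin edges, and the curved age effect are all assumptions made up for illustration; they are not from the question. It checks that squaring dummy columns leaves them unchanged, then fits one model with binned age and, for comparison, one where the raw continuous age is kept and squared with `I(age**2)`, which is an option only if the unbinned age is still available.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data (hypothetical): a continuous age and one other categorical predictor.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "age": rng.uniform(18, 80, size=n),
    "x2": rng.choice(["a", "b", "c"], size=n),
})

# Hypothetical outcome with a curved (quadratic) age effect plus noise.
df["y"] = (
    0.05 * (df["age"] - 50) ** 2
    + 3.0 * (df["x2"] == "b")
    + rng.normal(0, 2, size=n)
)

# Bin age into four groups, as described in the question (bin edges are assumed).
df["age_bin"] = pd.cut(
    df["age"],
    bins=[18, 30, 45, 60, 80],
    include_lowest=True,
    labels=["18-30", "31-45", "46-60", "61-80"],
)

# Squaring a 0/1 dummy gives back the same dummy, so it adds no information.
dummies = pd.get_dummies(df["age_bin"], dtype=float)
dummy_squared_is_same = bool(np.allclose(dummies ** 2, dummies))

# Option 1: binned age as a categorical; each bin gets its own coefficient.
binned_fit = smf.ols("y ~ C(age_bin) + C(x2)", data=df).fit()

# Option 2: keep age continuous and add a squared term with I(...).
quadratic_fit = smf.ols("y ~ age + I(age**2) + C(x2)", data=df).fit()

print(dummy_squared_is_same)
print(binned_fit.params)
print(quadratic_fit.params)
```

The `I(...)` wrapper tells the formula parser to treat `**` as ordinary arithmetic rather than as a formula operator. Which of the two fits is preferable depends on the data; the binned version trades some resolution for coefficients that are easy to read off per age group.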
