Because my dependent variables in a multiple linear regression model are in different units, some coefficients are > 10,000 and others less than 0. I need to standardize them using the scale() function in R, but because some of my data is negative, it's impossible to take the log of some of the variables, which I need to do before standardizing. Is there any other step I can take? Can I standardize without log transformation first?
Best Answer
Sometimes people take the natural log (or better yet, the base-2 log!) of a dependent variable because they feel the distribution of the log-transformed variable has better properties than that of the untransformed variable, relative to whatever model they are estimating. One drawback is that your regression parameters cannot be interpreted in terms of the original variable, only in terms of the log of the DV. And don't be tricked into "back-transforming" regression parameters with an inverse log (exponential) function and reading the result as an effect in the original units; that is NOT valid.
The explanation you linked to is saying that IF you are going to log-transform a variable, then you ought to do that before standardization. It is not saying you should always log-transform. In my opinion, people sometimes get into the habit of log-transforming when it isn't even strictly necessary. So yes, you can standardize without a log transformation first.
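To see why the order matters, here is a minimal sketch in plain Python rather than R. The data values and the helper name `z_scores` are made up for illustration. Standardizing first produces negative values, for which the log is undefined, whereas taking the log of strictly positive data and then standardizing is defined at every step.

```python
import math
import statistics

def z_scores(values):
    """Subtract the sample mean, then divide by the sample SD (n - 1 denominator)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Hypothetical, strictly positive measurements.
raw = [12.0, 150.0, 1800.0, 25000.0]

# Log first, then standardize: every step is defined.
log_then_z = z_scores([math.log(v) for v in raw])

# Standardize first: values below the mean become negative,
# so a subsequent math.log() call on them would raise ValueError.
z_first = z_scores(raw)
has_negative = any(z < 0 for z in z_first)
```

If the data already contains negative values, as in the question, the log step is simply unavailable and only the standardization step applies.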
Whether applied to a plain old dependent variable or a log-transformed one, the R function scale() lets you alter the centering or the scale of a variable. And if you use estimates of the variable's mean and variability (typically its standard deviation) to do the rescaling, then you're said to have standardized the variable (i.e. made it into a de facto Z-score).
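The arithmetic behind standardization can be sketched in a few lines. This is a plain Python illustration, not R's scale() itself, and the values are hypothetical. It uses the sample standard deviation (n - 1 denominator) and deliberately includes negative numbers to show that no log step is required.

```python
import statistics

# Hypothetical variable that includes negative values.
x = [-3.5, -1.0, 0.0, 2.5, 7.0]

mean_x = statistics.mean(x)
sd_x = statistics.stdev(x)  # sample SD, n - 1 denominator

# Center on the mean, then divide by the SD to get z-scores.
z = [(v - mean_x) / sd_x for v in x]

# After standardizing, the mean is 0 and the SD is 1 (up to rounding error).
z_mean = statistics.mean(z)
z_sd = statistics.stdev(z)
```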
Sometimes you only center a variable. Subtract the variable's overall mean from each value and get a centered version that has mean zero. If you center only, the standard deviation will be unaffected. This is not "standardized".
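As a quick check of the centering-only case described above, the following Python sketch (hypothetical values) subtracts the mean and confirms that the spread does not change.

```python
import statistics

# Hypothetical raw values.
x = [4.0, 8.0, 15.0, 16.0, 23.0, 42.0]

mean_x = statistics.mean(x)

# Centering: subtract the overall mean from every value.
centered = [v - mean_x for v in x]

# The mean moves to zero, but the standard deviation is the same as before.
sd_before = statistics.stdev(x)
sd_after = statistics.stdev(centered)
```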
Other times you may or may not center a variable, but you don't like the units it is expressed in for some reason. Maybe your DV values are like 15,000 g or 12,800 g because the variable is expressed in grams and you'd rather work in kilograms. So you use the scale() function to divide each value by 1,000 (e.g. scale(x, center = FALSE, scale = 1000)), which gives you numbers like 15.0 kg or 12.8 kg. Again, this is not standardization. It is just rescaling.
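The unit change above amounts to dividing by a fixed constant. A small Python sketch makes the contrast with standardization concrete. The first two values come from the example in the text, and the third is a hypothetical extra data point.

```python
import statistics

# DV values in grams; the last value is a made-up extra observation.
grams = [15000.0, 12800.0, 14250.0]

# Rescale by a fixed constant chosen for convenience, not estimated from the data.
kilograms = [g / 1000 for g in grams]

# The mean and SD both shrink by the same factor of 1,000,
# so the result is not a z-score (mean 0, SD 1).
mean_kg = statistics.mean(kilograms)
sd_kg = statistics.stdev(kilograms)
```

The divisor here is a known unit conversion, whereas standardization divides by a quantity estimated from the data itself.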
So you can mix and match centering (or not) and rescaling (or not), and you can do it with or without converting to a standardized scale. And you can make all of those choices with or without a log-transform. It all depends on what you're trying to accomplish and why.