Is it necessary to have data that looks normal if you want to apply a dynamic correlation coefficient (DCC)?
Should the explanatory variables in the DCC method also be normalized?
My dataset has a high kurtosis and bad skewness. I used the log10 to try to make it more normal, but remarkably the kurtosis is much higher after taking the log… Very strange.
What other options do I have to make my data more normal and how do I apply that in Stata?
Best Answer
I would imagine the DCC suffers the same limitations as the regular correlation with non-normal data. That is, there isn't an assumption of normality, but non-normal data can cause odd findings; see the Anscombe quartet, for example.
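To make the Anscombe point concrete, here is a minimal sketch using the `anscombe` data frame that ships with base R (in the `datasets` package). The variable name `r` is only for illustration.

```r
# Anscombe's quartet is available in base R as the data frame `anscombe`
# (columns x1..x4 and y1..y4).
r <- sapply(1:4, function(i) {
  cor(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]])
})
names(r) <- paste0("set", 1:4)

# The four Pearson correlations are nearly identical (all about 0.816),
# even though the four data sets have very different shapes.
print(r)
```

Plotting each pair, for example with `plot(anscombe$x2, anscombe$y2)`, shows how differently shaped data can produce the same coefficient.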
As for kurtosis, taking the log can certainly make it worse. Take this example of the uniform distribution:
    library(moments)  # provides skewness() and kurtosis()
    set.seed(2810101)
    x <- runif(100)
    logx <- log(x)
    kurtosis(x)
    kurtosis(logx)
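For reference, a normally distributed variable has a kurtosis of 3. A uniform variable has a theoretical kurtosis of 1.8, while the log of a Uniform(0, 1) variable is the negative of an exponential variable, which has a kurtosis of 9, so here the log pushes the kurtosis well above 3.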
On the other hand, in this example, taking the log has almost no effect on kurtosis:
    set.seed(2829101)
    z <- c(rnorm(1000, 10, 1), rnorm(1000, 10, 0.01))
    kurtosis(z)
    kurtosis(log(z))
However, you mention skewed data with high kurtosis. Were your data right-skewed or left-skewed? Since the former is more common, I'll guess that.
    set.seed(1919110)
    x <- c(rnorm(1000, 10, 1), rnorm(300, 30, 2), runif(10, 500, 600))
    skewness(x)
    kurtosis(x)
    skewness(log(x))
    kurtosis(log(x))
Here, taking the log improves both kurtosis and skewness.
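If the `moments` package is not installed, the same kind of comparison can be sketched in base R. The helpers `skew_m` and `kurt_m` below are hypothetical names, not part of the original answer; they assume the plain moment definitions (central moments, no small-sample correction), under which a normal variable has kurtosis of about 3. They are applied to the right-skewed example above.

```r
# Hypothetical helpers: moment-based skewness and kurtosis.
# kurt_m() returns plain (not excess) kurtosis.
skew_m <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5
}
kurt_m <- function(v) {
  d <- v - mean(v)
  mean(d^4) / mean(d^2)^2
}

# Same right-skewed mixture as in the example above
set.seed(1919110)
x <- c(rnorm(1000, 10, 1), rnorm(300, 30, 2), runif(10, 500, 600))

# Side-by-side summary before and after the log transform
res <- rbind(
  raw = c(skewness = skew_m(x), kurtosis = kurt_m(x)),
  log = c(skewness = skew_m(log(x)), kurtosis = kurt_m(log(x)))
)
print(round(res, 2))
```

Running the same table on your own variable, before and after any candidate transform, shows quickly whether the transform helps or hurts.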
As always, try plotting the data to see what is going on in your correlation.
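As one possible starting point for such plots, here is a minimal base R sketch that draws a histogram and a normal Q-Q plot before and after the log transform. It reuses the simulated right-skewed `x` from the earlier example as a stand-in; with real data you would substitute your own variable.

```r
# Simulated stand-in for real data (same mixture as the earlier example)
set.seed(1919110)
x <- c(rnorm(1000, 10, 1), rnorm(300, 30, 2), runif(10, 500, 600))

# 2 x 2 grid: raw data on top, log-transformed data below
op <- par(mfrow = c(2, 2))
hist(x, breaks = 50, main = "Raw", xlab = "x")
qqnorm(x, main = "Raw: normal Q-Q")
qqline(x)
hist(log(x), breaks = 50, main = "Log", xlab = "log(x)")
qqnorm(log(x), main = "Log: normal Q-Q")
qqline(log(x))
par(op)  # restore the previous plotting settings
```

Points that bend away from the reference line in the Q-Q plot indicate heavy tails or skew that summary statistics alone can hide.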