# Solved – Does it make sense to use PCA when the determinant of the correlation matrix is (almost) zero?

I'm running a PCA on a data set of size \$N \times p\$ (\$N \approx 1000\$ being the number of measurements and \$p \approx 200\$ the number of dimensions/predictors).

I expect many of the predictors to be correlated and that the dimensions can consequently be reduced. I can even drop some columns that are linearly dependent with respect to the others.

When I run the PCA I find that \$\sim 50\%\$ of the variance can be explained by the first 5 PCs, suggesting that the predictors can actually be grouped.

But I am concerned about how small the determinant of the correlation matrix (\$R\$) is: \$\det(R) \approx 10^{-100}\$, or a ridiculous number like that.
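For context on the size of that number: the determinant is the product of the eigenvalues of \$R\$, and those eigenvalues sum to \$p\$, so with \$p \approx 200\$ a few large eigenvalues force many small ones and the product shrinks very quickly. The sketch below is not my real data. It assumes NumPy and a synthetic data set driven by 5 hypothetical latent factors, and it reports the log-determinant (via `np.linalg.slogdet`, which avoids underflow) next to the eigenvalue spectrum and the condition number.

```python
import numpy as np

# Synthetic stand-in for the real data (assumption): 5 latent factors plus noise.
rng = np.random.default_rng(0)
N, p, n_factors = 1000, 200, 5
F = rng.normal(size=(N, n_factors))            # latent factor scores
W = rng.normal(size=(n_factors, p))            # factor loadings
X = F @ W + 0.5 * rng.normal(size=(N, p))      # observed predictors

# Correlation matrix of the columns (predictors).
R = np.corrcoef(X, rowvar=False)

# Eigenvalues of the symmetric matrix R, returned in ascending order.
eigvals = np.linalg.eigvalsh(R)

# Sign and natural log of |det(R)|, computed without forming det(R) itself.
sign, logdet = np.linalg.slogdet(R)
log10_det = logdet / np.log(10.0)

# Share of total variance carried by the 5 largest eigenvalues.
top5_share = eigvals[-5:].sum() / eigvals.sum()

# Ratio of largest to smallest eigenvalue (2-norm condition number of R).
cond_number = eigvals[-1] / eigvals[0]

print(f"log10 det(R)     = {log10_det:.1f}")
print(f"top-5 share      = {top5_share:.3f}")
print(f"condition number = {cond_number:.3e}")
```

The eigenvalue spectrum and the condition number are usually more informative than the determinant alone, since the determinant mixes the effect of dimension with the effect of near-collinearity.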

Do the results make sense with such a small number?

Moreover, I see that the PCA results change (a lot!) if I round the input numbers to drop non-relevant digits, like the 10th digit or so. I think this is linked to the fact that we are working with such a small determinant.
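One way to check which part of the result is changing is to compare the subspace spanned by the leading PCs before and after rounding, rather than the individual loading vectors (those can flip sign or rotate among components with nearly equal eigenvalues). The following is a minimal sketch under stated assumptions: NumPy only, synthetic data with 5 hypothetical latent factors instead of my real data, and a helper `leading_components` introduced just for this illustration.

```python
import numpy as np

def leading_components(X, k):
    """Return the first k principal directions (columns) of standardized X."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # PCA on the correlation matrix
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)   # rows of Vt are the PC directions
    return Vt[:k].T

def principal_angles(A, B):
    """Principal angles (radians) between the column spaces of orthonormal A and B."""
    s = np.linalg.svd(A.T @ B, compute_uv=False)       # cosines of the angles
    return np.arccos(np.clip(s, -1.0, 1.0))

# Synthetic stand-in for the real data (assumption): 5 latent factors plus noise.
rng = np.random.default_rng(1)
N, p, k = 500, 50, 5
X = rng.normal(size=(N, k)) @ rng.normal(size=(k, p)) + 0.5 * rng.normal(size=(N, p))

# Same data with the inputs rounded to 6 decimal places.
X_rounded = np.round(X, 6)

V_a = leading_components(X, k)
V_b = leading_components(X_rounded, k)

# Largest principal angle: close to 0 means the two leading subspaces agree.
max_angle = principal_angles(V_a, V_b).max()
print(f"largest principal angle between leading subspaces: {max_angle:.2e} rad")
```

If the largest angle stays tiny on the real data, the rounding is only moving the trailing components, which correspond to the near-zero eigenvalues, and not the leading PCs.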

Since a small determinant of \$R\$ indicates that there are redundant dimensions, I would say that PCA is the way to go to reduce them. Nevertheless, does it make sense to run a PCA with such a small determinant? If not, what is the best way to reduce the dimensionality of the problem?
