How does principal component analysis (PCA) model data of admittedly higher dimensionality with just a few principal components?
Best Answer
I believe your question is something like:
"I have 10000 features and thus very high dimensionality, so why does PCA with only 3 principal components work?"
There is a misunderstanding here. We don't represent the original data set exactly with just a few PCs; we approximate it, which is why PCA is a data reduction technique. You will almost certainly lose some information, but if you keep that loss small, you should be fine.
PCA works by forming a new set of variables, the principal components, each of which is a linear combination of the original features. The components are chosen one after another so that each accounts for as much of the remaining variance as possible while being uncorrelated with the earlier ones. You can think of it as an approximation technique: keeping only the first few components gives you an approximation of what you have, and that approximation is not perfect. In practice, you decide how many principal components to keep. The more you keep, the better the approximation.
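To make the "more components, better approximation" point concrete, here is a minimal sketch using NumPy. The data set is synthetic and the helper name `pca_reconstruct` is made up for this illustration; it assumes the data are driven by 3 hidden factors plus a little noise, which is the kind of situation where a few PCs work well.

```python
import numpy as np

def pca_reconstruct(X, k):
    """Approximate X using only its first k principal components.

    Hypothetical helper for illustration, not a library function.
    Returns the approximation and the fraction of variance it retains.
    """
    mean = X.mean(axis=0)
    Xc = X - mean                      # PCA operates on centered data
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:k].T             # the k new variables (PC scores)
    X_hat = scores @ Vt[:k] + mean     # map back to the original feature space
    explained = (S[:k] ** 2).sum() / (S ** 2).sum()
    return X_hat, explained

# Synthetic data (an assumption for this sketch): 200 samples, 50 features,
# generated from 3 latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 50))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 50))

errors = {}
for k in (1, 2, 3, 10, 50):
    X_hat, explained = pca_reconstruct(X, k)
    errors[k] = np.mean((X - X_hat) ** 2)   # mean squared reconstruction error
    print(f"k={k:2d}  variance retained={explained:.4f}  error={errors[k]:.6f}")
```

The reconstruction error shrinks as k grows and is essentially zero once all 50 components are kept. With data like this, most of the variance is already captured by the first 3 components, which is why so few PCs can stand in for many features. Real data will not always be this well behaved.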