# Solved – For calculating the distance between different points, does it make sense to use all Principal Components?

I have a data frame with about 500 observations and 8 variables that I'd like to run through PCA in order to reduce the dimensionality, keeping only the components that capture the most variance.

From there, I want to compute the Euclidean distance between each pair of observations.

Here's my question: should I use every Principal Component to calculate the distances? Or should I just use (by the general rule of thumb) the Principal Components that describe, in total, about 90% of the variance (here, the first 6)?
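One way to frame the two options, as a sketch rather than a definitive answer: assume the PCA was run on centered (and here apparently standardized) data, write $z_i$ for the vector of observation $i$ in that space, $V$ for the orthogonal matrix of loadings, and $t_i = V^\top z_i$ for its scores. These symbols are my own notation, not something from the R output. With all $p = 8$ components the rotation leaves distances unchanged, while keeping only the first $k$ drops the trailing terms:

$$
\lVert t_i - t_j \rVert^2 = (z_i - z_j)^\top V V^\top (z_i - z_j) = \lVert z_i - z_j \rVert^2,
\qquad
d_k^2(i,j) = \sum_{m=1}^{k} (t_{im} - t_{jm})^2 \le \sum_{m=1}^{p} (t_{im} - t_{jm})^2 .
$$

So, under these assumptions, using every component should give the same distances as the scaled original variables, and using the first 6 gives an approximation that ignores whatever separation lies along PC7 and PC8.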

Here's the importance of components (from R) if you're curious:

```
Importance of components:
                          PC1    PC2    PC3    PC4    PC5     PC6     PC7     PC8
Standard deviation     1.4652 1.1997 1.0477 0.9630 0.9103 0.87524 0.75321 0.47645
Proportion of Variance 0.2683 0.1799 0.1372 0.1159 0.1036 0.09576 0.07092 0.02838
Cumulative Proportion  0.2683 0.4482 0.5855 0.7014 0.8050 0.90071 0.97162 1.00000
```
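For anyone who wants to check where the 90% cutoff falls, here is a minimal sketch in plain Python that recomputes the proportions from the standard deviations in the summary above. The variable names (`sdev`, `threshold`, `k`) are my own, and the 0.90 threshold is just the rule of thumb mentioned in the question, not a recommendation.

```python
from itertools import accumulate

# Standard deviations of PC1..PC8, copied from the summary above.
sdev = [1.4652, 1.1997, 1.0477, 0.9630, 0.9103, 0.87524, 0.75321, 0.47645]

# Each component's variance is its standard deviation squared.
variances = [s ** 2 for s in sdev]
total = sum(variances)  # close to 8, consistent with 8 standardized variables

# Share of total variance per component, and the running total.
proportion = [v / total for v in variances]
cumulative = list(accumulate(proportion))

# Smallest number of leading components whose cumulative share reaches the threshold.
threshold = 0.90  # assumed rule-of-thumb cutoff from the question
k = next(i + 1 for i, c in enumerate(cumulative) if c >= threshold)

for i, (p, c) in enumerate(zip(proportion, cumulative), start=1):
    print(f"PC{i}: proportion={p:.4f} cumulative={c:.4f}")
print("components needed:", k)
```

The variances sum to roughly 8, which suggests the PCA was run on scaled variables, and the first component at or above the threshold is PC6, matching the "first 6" in the question.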

Any ideas? I'd appreciate any insight.
