Solved – How to export and use results of PCA from R

My ultimate goal is to run a cluster analysis on a data set with more than 1 million records. The inputs to the clustering will be the results of a Principal Component Analysis plus some other variables that were not part of the PCA, for a total of about 10 variables. The variables I put into the PCA were all very highly correlated with one another; the other variables are not, so I left them out of the PCA.

    # read data
    mydata <- read.csv('mydata.csv')

    # import library for robust methods because my data contained outliers
    library(rrcov)

    # run robust PCA method called PcaCov
    pcaR <- PcaCov(~., mydata, na.action=na.omit, center=TRUE, scale = TRUE, k=8)

    # look at results
    summary(pcaR)
    screeplot(pcaR)
    pcaR@loadings

From the results, I have decided I would like to retain the first three components, which capture ~87% of the total variance in the dataset.

Now I want to extract/save/export these first three components for use in the cluster analysis with my other variables. How do I do this?

Each new variable obtained by PCA comes with a loading vector, for example $v=(1,-2,5,5)$. This vector defines the new variable as a linear combination of the original ones: $z_1=x_1-2x_2+5x_3+5x_4$. You can therefore build a new matrix whose columns are the linear combinations defined by the PCA loadings, one column per retained component. Because the PCA in the question was run with centering and scaling, the $x_i$ here are the centered and scaled variables, not the raw ones.
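The linear-combination step can be sketched in R as follows. This is only an illustration under stated assumptions: it uses base R's `prcomp()` on the built-in `USArrests` data as a stand-in for `PcaCov()` and `mydata`, and `other_vars` and `out_file` are hypothetical names introduced for the example. It computes the component scores by hand from the loadings, checks them against the scores the fit already stores, and writes the first three components next to the other variables.

```r
# Illustration with base R's prcomp() on a built-in data set (USArrests),
# standing in for PcaCov() and mydata from the question.
x <- as.matrix(USArrests)
pca <- prcomp(x, center = TRUE, scale. = TRUE)

k <- 3                                   # number of components to keep
loadings <- pca$rotation[, 1:k]          # one loading vector per column

# Center and scale the data the same way the PCA did, then form the
# linear combinations z_j = v_j1 * x_1 + ... + v_jp * x_p.
x_std <- scale(x, center = pca$center, scale = pca$scale)
scores <- x_std %*% loadings

# The hand-computed scores agree with the ones prcomp() stores in pca$x.
stopifnot(isTRUE(all.equal(scores, pca$x[, 1:k], check.attributes = FALSE)))

# Combine the retained components with the variables kept out of the PCA.
# `other_vars` is a hypothetical placeholder for those variables.
other_vars <- data.frame(id = rownames(x))
cluster_input <- cbind(other_vars, scores)

# Export for the clustering step (`out_file` is a hypothetical path).
out_file <- file.path(tempdir(), "cluster_input.csv")
write.csv(cluster_input, out_file, row.names = FALSE)

# With the rrcov fit from the question, the analogous accessors
# (not run here) would be:
#   getLoadings(pcaR)[, 1:3]
#   getScores(pcaR)[, 1:3]
```

In practice there is usually no need to redo the multiplication: the fitted object already holds the scores, so taking its first three columns is enough. One thing to watch in the question's setup is `na.action=na.omit`, which drops incomplete rows, so the score matrix may have fewer rows than `mydata` and has to be matched to the other variables by row before clustering.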
