I saw this interesting question, "How to reverse PCA and reconstruct original variables from several principal components?", and a nice answer with a very useful example using the Iris data in Matlab. I would like to do the same using factor analysis instead of PCA. I tried to do it with Matlab's 'factoran', with the help of @ttnphns and @amoeba, but I do not obtain a good correlation between my reconstructed data and the original data.
input_data (the data are EMG measurements from 6 arm muscles, recorded in order to identify synergies)
PCA method:
    X = input_data;
    mu = mean(X);
    [eigenvectors, scores] = pca(X);
    nComp = 2;
    Xpca = scores(:,1:nComp) * eigenvectors(:,1:nComp)';
    Xpca = bsxfun(@plus, Xpca, mu);

I obtain a good correlation between the reconstructed and the original data.
FA method:
    X = input_data;
    mu = mean(X);
    [LoadingsPM,specVarPM,rotationPM,stats, scores] = ...
        factoran(X,2,'rotate','promax');
    Xfa = scores*LoadingsPM';
    Xfa = bsxfun(@plus, Xfa, mu);

But in this case the correlations are bad. Am I forgetting something? (I divided the FA reconstruction by 3 so that the three curves are easier to see.)
@ttnphns note: the word "reverse" in the title should be taken in the technical sense of computing the variables as they are returned by the computed factors (their scores), not in the theoretical sense (in which the FA model is nothing but predicting variables by factors, so that there is no "reverse" direction). In PCA, this prediction/direction could indeed be called "reverse" in a theoretical sense, too.
Best Answer
@amoeba and @ttnphns solved my problem in the comments. I am posting the solution in case someone is interested.
@amoeba:
Turns out, `factoran` implicitly standardizes all input variables and hence conducts FA on the correlation matrix (it's written in Help: "`factoran` standardizes the observed data X to zero mean and unit variance"). I could not find any input option that would turn off this behaviour. Hence, to do the "reconstruction", you need to compute `stds = std(X);` in the beginning and then to do `Xfa = bsxfun(@times, Xfa, stds);` after you multiplied scores by loadings and before adding the mean.
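To spell out why the rescaling in the quoted comment is needed, here is the reconstruction written as equations. The notation is my own and is only an assumption for illustration, not taken from the question or from the Matlab help: x_ij is sample i of variable j, mu_j and sigma_j are the mean and standard deviation of column j, f_ik are the factor scores, lambda_jk are the loadings, and m is the number of factors.

```latex
\begin{align*}
z_{ij} &= \frac{x_{ij} - \mu_j}{\sigma_j}
  && \text{(standardization applied by factoran)} \\
\hat{z}_{ij} &= \sum_{k=1}^{m} f_{ik}\,\lambda_{jk}
  && \text{(scores times loadings, still in standardized units)} \\
\hat{x}_{ij} &= \sigma_j\,\hat{z}_{ij} + \mu_j
  && \text{(rescale by the standard deviation, then add the mean)}
\end{align*}
```

Adding the mean without the sigma_j factor leaves every reconstructed variable in standardized units rather than in the units of the original data.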
So the corrected FA method is:

    X = input_data;
    [LoadingsPM,specVarPM,rotationPM,stats, scores] = ...
        factoran(X,2,'rotate','promax');
    Xfa = scores*LoadingsPM';
    Xfa = bsxfun(@times, Xfa, std(X));   % undo the unit-variance scaling
    Xfa = bsxfun(@plus, Xfa, mean(X));   % restore the mean
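As a quick sanity check of the two rescaling lines, the sketch below standardizes a small made-up matrix by hand and then applies the same bsxfun calls to undo it. The matrix X and the names Z and Xback are hypothetical and used only for illustration. The check relies on base Matlab functions and does not call factoran, so it says nothing about the quality of the factor model itself.

```matlab
% Hypothetical toy data: 4 samples, 2 variables on very different scales.
X = [1 10; 2 20; 3 60; 4 30];

mu    = mean(X);   % column means
sigma = std(X);    % column standard deviations (N-1 normalization)

% Standardize each column to zero mean and unit variance.
Z = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);

% Undo it in the same order as in the corrected FA code:
% multiply by the standard deviations first, then add the means.
Xback = bsxfun(@plus, bsxfun(@times, Z, sigma), mu);

% Largest absolute difference; should be at rounding-error level.
maxErr = max(abs(Xback(:) - X(:)));
disp(maxErr);
```

In the FA case the standardized matrix Z is replaced by scores*LoadingsPM', so the result is an approximation of X rather than an exact copy.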
To complete this post, I recommend this nice explanation by @ttnphns: What are the differences between Factor Analysis and Principal Component Analysis?