I'm using PCA to reduce dimensionality before feeding the data into a classifier. My bootstrap/cross-validation shows a significant reduction in test error when I apply PCA and keep the PCs whose standard deviation is at least some fraction (say, 0.05) of the standard deviation of the first PC. My features are actually histograms (i.e., vector-valued), so instead of applying PCA once globally to the whole dataset, I applied it locally to *some* features, which I preselected manually by size (picking the ones with the most columns). I've tried adjusting the aforementioned tolerance, and I've tried applying PCA to larger and smaller subsets of these histogram features.

My question is: is there a more *principled* way of finding the degree of dimensionality reduction via PCA, as applied above, that leads to the highest test accuracy of my classifier? Does it come down to looping over a sequence of tolerances and different PCA-treated feature subsets and computing the test error for each setting? That would be computationally expensive.
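For concreteness, here is a minimal sketch of the brute-force approach the question describes: sweep the tolerance and estimate test error by cross-validation, fitting PCA on the training fold only. The data, the nearest-centroid classifier, and all function names are illustrative assumptions, not part of the original setup.

```python
import numpy as np

def pca_fit(X):
    """Fit PCA via SVD; return mean, principal axes (rows), and PC std devs."""
    mu = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt, s / np.sqrt(len(X) - 1)  # singular values -> std devs

def pca_reduce(X, mu, Vt, stds, tol):
    """Keep PCs whose std dev is at least `tol` times that of the first PC."""
    k = max(int(np.sum(stds >= tol * stds[0])), 1)
    return (X - mu) @ Vt[:k].T

def cv_error(X, y, tol, n_folds=5, seed=0):
    """Mean CV error of a nearest-centroid classifier after PCA at `tol`."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    classes, errs = np.unique(y), []
    for f in folds:
        mask = np.ones(len(X), dtype=bool)
        mask[f] = False
        mu, Vt, stds = pca_fit(X[mask])               # fit on train fold only
        Ztr = pca_reduce(X[mask], mu, Vt, stds, tol)
        Zte = pca_reduce(X[f], mu, Vt, stds, tol)
        cents = np.stack([Ztr[y[mask] == c].mean(axis=0) for c in classes])
        pred = classes[np.argmin(((Zte[:, None] - cents) ** 2).sum(-1), axis=1)]
        errs.append(np.mean(pred != y[f]))
    return float(np.mean(errs))

# Toy stand-in data: two Gaussian blobs instead of real histogram features.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 20)), rng.normal(1.5, 1, (50, 20))])
y = np.repeat([0, 1], 50)
for tol in [0.01, 0.05, 0.2, 0.5]:
    print(tol, cv_error(X, y, tol))
```

The key point, whatever classifier you use, is that PCA must be refit inside each fold; fitting it on the full data first leaks test information into the tolerance selection.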


#### Best Answer

Note that ridge penalisation/regularisation is essentially doing model selection via PCA, except that it does it smoothly, by shrinking along each principal component axis, rather than discretely, by dropping small-variance PCs. Because you are doing PCA on different subsets of variables, your procedure roughly corresponds to having a separate regularisation parameter for each group rather than one for all the betas. *The Elements of Statistical Learning* explains ridge regression well and provides some comparisons.
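The equivalence the answer refers to can be verified numerically: writing the design matrix as $X = UDV^\top$, the ridge solution shrinks the coefficients along the $j$-th principal axis by the factor $d_j^2/(d_j^2+\lambda)$ (ESL, Section 3.4.1). A short check on synthetic data (all values below are illustrative):

```python
import numpy as np

# Ridge shrinks along each principal axis by d_j^2 / (d_j^2 + lam), where
# the d_j are singular values of the centered design matrix (ESL, Sec. 3.4.1).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))
X -= X.mean(axis=0)
y = rng.normal(size=60)
y -= y.mean()
lam = 2.0

# Direct ridge solution: beta = (X'X + lam*I)^{-1} X'y
beta = np.linalg.solve(X.T @ X + lam * np.eye(8), X.T @ y)

# Equivalent SVD form: smooth shrinkage along every PC, none dropped outright
U, d, Vt = np.linalg.svd(X, full_matrices=False)
beta_svd = Vt.T @ (d / (d**2 + lam) * (U.T @ y))

print(np.allclose(beta, beta_svd))  # True: two views of the same solution
```

Dropping PCs below a tolerance is the limit where the shrinkage factor is rounded to 0 or 1, which is why tuning a single ridge penalty $\lambda$ by cross-validation can replace a discrete search over PCA tolerances.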

### Similar Posts:

- Solved – Which one should be applied first: data sampling or dimensionality reduction
- Solved – How exactly do neural network and support vector machine methods reduce dimensionality
- Solved – What should you do if you have too many features in your dataset, dimensionality reduction or regularization
- Solved – Feature selection for MLP in sklearn: Is using PCA or LDA advisable